1
Zhang J, Shen B, Zhou Z, Cai M, Wu X, Han L, Wen Y. An Extended Application of the Fast Multi-Locus Ridge Regression Algorithm in Genome-Wide Association Studies of Categorical Phenotypes. Plants (Basel, Switzerland) 2024; 13:2520. [PMID: 39274004 PMCID: PMC11397509 DOI: 10.3390/plants13172520] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 07/27/2024] [Revised: 09/02/2024] [Accepted: 09/05/2024] [Indexed: 09/16/2024]
Abstract
Categorical (binary or ordinal) quantitative traits are widely used in plants to record counts and resistance levels. Compared with continuous traits, categorical traits often provide less detailed insight into genetic variation and have a more complex underlying genetic architecture, which poses additional challenges for genome-wide association studies. Moreover, ordinal traits are commonly analyzed with methods designed for binary or continuous phenotypes, which discards original phenotype information and reduces the power to detect quantitative trait nucleotides (QTNs). To address these issues, this study applies fast multi-locus ridge regression (FastRR), originally designed for continuous traits, directly to binary and ordinal traits. FastRR comprises three stages (continuous transformation, variable reduction, and parameter estimation) and handles categorical phenotype data computationally, without introducing link functions or resorting to ill-suited methods. A series of simulation studies demonstrates that, compared with four other approaches for continuous, binary, or ordinal traits (logistic regression, FarmCPU, FaST-LMM, and POLMM), FastRR performs better in detecting small-effect QTNs, in the accuracy of estimated effects, and in computation speed. We applied FastRR to 14 binary or ordinal phenotypes in an Arabidopsis dataset and identified 479 significant loci and 76 known genes, at least seven times as many as detected by the other algorithms. These findings underscore the potential of FastRR as a useful tool for genome-wide association studies and novel gene mining of binary and ordinal traits.
Affiliation(s)
- Jin Zhang
  - College of Science, Nanjing Agricultural University, Nanjing 210095, China
- Bolin Shen
  - College of Science, Nanjing Agricultural University, Nanjing 210095, China
- Ziyang Zhou
  - College of Science, Nanjing Agricultural University, Nanjing 210095, China
- Mingzhi Cai
  - College of Science, Nanjing Agricultural University, Nanjing 210095, China
- Xinyi Wu
  - College of Science, Nanjing Agricultural University, Nanjing 210095, China
- Le Han
  - College of Science, Nanjing Agricultural University, Nanjing 210095, China
- Yangjun Wen
  - College of Science, Nanjing Agricultural University, Nanjing 210095, China
2
Azevedo CF, Ferrão LFV, Benevenuto J, de Resende MDV, Nascimento M, Nascimento ACC, Munoz PR. Using visual scores for genomic prediction of complex traits in breeding programs. TAG. Theoretical and Applied Genetics. Theoretische und Angewandte Genetik 2023; 137:9. [PMID: 38102495 DOI: 10.1007/s00122-023-04512-w] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 07/09/2023] [Accepted: 11/21/2023] [Indexed: 12/17/2023]
Abstract
Key message: An approach for handling visual scores with potential errors and subjectivity was evaluated in simulated and blueberry recurrent selection breeding schemes to assist breeders in their decision-making. Most genomic prediction methods are based on assumptions of normality due to their simplicity and ease of implementation. However, in plant and animal breeding, continuous traits are often visually scored as categorical traits and analyzed as Gaussian variables, thus violating the normality assumption, which could affect the prediction of breeding values and the estimation of genetic parameters. In this study, we examined the main challenges of visual scores for genomic prediction and genetic parameter estimation using mixed models, Bayesian, and machine learning methods. We evaluated these approaches using simulated and real breeding data sets. Our contribution in this study is a five-fold demonstration: (i) collecting data using an intermediate number of categories (1-3 and 1-5) is the best strategy, even considering errors associated with visual scores; (ii) Linear Mixed Models and Bayesian Linear Regression are robust to the normality violation, but marginal gains can be achieved when using Bayesian Ordinal Regression Models (BORM) and Random Forest Classification; (iii) genetic parameters are better estimated using BORM; (iv) our conclusions using simulated data are also applicable to real data in autotetraploid blueberry; and (v) a comparison of continuous and categorical phenotypes found that investing in the evaluation of 600-1000 categorical data points with low error, when it is not feasible to collect continuous phenotypes, is a strategy for improving predictive abilities. Our findings suggest the best approaches for effectively using visual score traits to explore genetic information in breeding programs and highlight the importance of investing in the training of evaluator teams and in high-quality phenotyping.
Affiliation(s)
- Camila Ferreira Azevedo
  - Statistics Department, Federal University of Viçosa, Viçosa, Minas Gerais, Brazil
  - Horticultural Sciences Department, Blueberry Breeding and Genomics Lab, University of Florida, Gainesville, FL, USA
- Luis Felipe Ventorim Ferrão
  - Horticultural Sciences Department, Blueberry Breeding and Genomics Lab, University of Florida, Gainesville, FL, USA
- Juliana Benevenuto
  - Horticultural Sciences Department, Blueberry Breeding and Genomics Lab, University of Florida, Gainesville, FL, USA
- Marcos Deon Vilela de Resende
  - Statistics Department, Federal University of Viçosa, Viçosa, Minas Gerais, Brazil
  - Department of Forestry Engineering, Federal University of Viçosa, Viçosa, Minas Gerais, Brazil
  - Embrapa Café, Brasília, Distrito Federal, Brazil
- Moyses Nascimento
  - Statistics Department, Federal University of Viçosa, Viçosa, Minas Gerais, Brazil
- Patricio R Munoz
  - Horticultural Sciences Department, Blueberry Breeding and Genomics Lab, University of Florida, Gainesville, FL, USA
3
Manthena V, Jarquín D, Howard R. Integrating and optimizing genomic, weather, and secondary trait data for multiclass classification. Front Genet 2023; 13:1032691. [PMID: 37065625 PMCID: PMC10090538 DOI: 10.3389/fgene.2022.1032691] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 08/31/2022] [Accepted: 12/22/2022] [Indexed: 04/18/2023] Open
Abstract
Modern plant breeding programs collect several data types such as weather, images, and secondary or associated traits besides the main trait (e.g., grain yield). Genomic data are high-dimensional and often crowd out smaller data types when naively combined to explain the response variable. There is a need to develop methods able to effectively combine different data types of differing sizes to improve predictions. Additionally, in the face of changing climate conditions, there is a need to develop methods able to effectively combine weather information with genotype data to better predict the performance of lines. In this work, we develop a novel three-stage classifier to predict multi-class traits by combining three data types: genomic, weather, and secondary trait data. The method addresses various challenges in this problem, such as confounding, differing sizes of data types, and threshold optimization. The method was examined in different settings, including binary and multi-class responses, various penalization schemes, and class balances. Then, our method was compared to standard machine learning methods such as random forests and support vector machines using various classification accuracy metrics and using model size to evaluate the sparsity of the model. The results showed that our method performed similarly to or better than machine learning methods across various settings. More importantly, the classifiers obtained were highly sparse, allowing for a straightforward interpretation of relationships between the response and the selected predictors.
Affiliation(s)
- Vamsi Manthena
  - Department of Statistics, University of Nebraska-Lincoln, Lincoln, NE, United States
- Diego Jarquín
  - Agronomy Department, University of Florida, Gainesville, FL, United States
- Reka Howard
  - Department of Statistics, University of Nebraska-Lincoln, Lincoln, NE, United States
4
Wang F, Wang Y, Wang Y, Jia T, Chang L, Ding J, Zhou L. Urinary polycyclic aromatic hydrocarbon metabolites were associated with hypertension in US adults: data from NHANES 2009-2016. Environmental Science and Pollution Research International 2022; 29:80491-80501. [PMID: 35716300 DOI: 10.1007/s11356-022-21391-8] [Citation(s) in RCA: 11] [Impact Index Per Article: 5.5] [Received: 04/13/2022] [Accepted: 06/06/2022] [Indexed: 06/15/2023]
Abstract
Polycyclic aromatic hydrocarbons (PAHs) are widespread organic pollutants, and their persistence in the environment calls for continued attention to their health effects. However, since the American Heart Association updated its definition of hypertension in 2017, few studies have explored the relationship between PAH exposure and hypertension. This study aimed to investigate that relationship under the updated definition of hypertension and to explore whether body mass index (BMI) moderates it. A total of 6332 adult participants from the 2009-2016 National Health and Nutrition Examination Survey (NHANES) were examined. Multiple logistic regression and restricted cubic splines were used to analyze the association between urinary polycyclic aromatic hydrocarbon metabolites and hypertension, including the dose-response relationship. Weighted quantile sum (WQS) regression was applied to blood pressure to reveal multiple exposure effects and the relative weights of each PAH. The prevalence of hypertension in the study population was 48.52%. There was a positive dose-response relationship between high exposure to 1-hydroxynaphthalene and 2&3-hydroxyphenanthrene and the risk of hypertension. Naphthalene metabolites accounted for the largest proportion of the effect on systolic blood pressure, and phenanthrene metabolites accounted for the largest proportion of the effect on diastolic blood pressure. Obese individuals with high PAH exposure were at greater risk for hypertension than individuals with low PAH exposure and normal BMI. A higher prevalence rate and stronger associations of metabolites with outcomes were obtained in the general population of the USA under the new guideline. High levels of exposure to PAHs were positively associated with the risk of hypertension, and these effects were modified by BMI.
Affiliation(s)
- Fang Wang
  - Department of Epidemiology and Health Statistics, School of Public Health, Shanxi Medical University, No. 56, Xinjian South Road, Yingze District, Taiyuan, China
- Yuying Wang
  - Department of Epidemiology and Health Statistics, School of Public Health, Shanxi Medical University, No. 56, Xinjian South Road, Yingze District, Taiyuan, China
- Yu Wang
  - Department of Epidemiology and Health Statistics, School of Public Health, Shanxi Medical University, No. 56, Xinjian South Road, Yingze District, Taiyuan, China
- Teng Jia
  - Department of Epidemiology and Health Statistics, School of Public Health, Shanxi Medical University, No. 56, Xinjian South Road, Yingze District, Taiyuan, China
- Li Chang
  - Department of Epidemiology and Health Statistics, School of Public Health, Shanxi Medical University, No. 56, Xinjian South Road, Yingze District, Taiyuan, China
- Jie Ding
  - Department of Epidemiology and Health Statistics, School of Public Health, Shanxi Medical University, No. 56, Xinjian South Road, Yingze District, Taiyuan, China
- Li Zhou
  - Department of Epidemiology and Health Statistics, School of Public Health, Shanxi Medical University, No. 56, Xinjian South Road, Yingze District, Taiyuan, China
5
McAllister CH, Cullingham CI, Peery RM, Mbenoun M, McPeak E, Feau N, Hamelin RC, Ramsfield TD, Myrholm CL, Cooke JEK. Evidence of Coevolution Between Cronartium harknessii Lineages and Their Corresponding Hosts, Lodgepole Pine and Jack Pine. Phytopathology 2022; 112:1795-1807. [PMID: 35166574 DOI: 10.1094/phyto-09-21-0370-r] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 06/14/2023]
Abstract
Variation in rate of infection and susceptibility of Pinus spp. to the fungus Cronartium harknessii (syn. Endocronartium harknessii), the causative agent of western gall rust, has been well documented. To test the hypothesis that there is a coevolutionary relationship between C. harknessii and its hosts, we examined genetic structure and virulence of C. harknessii associated with lodgepole pine (P. contorta var. latifolia), jack pine (P. banksiana), and their hybrids. A secondary objective was to improve assessment and diagnosis of infection in hosts. Using 18 microsatellites, we assessed genetic structure of C. harknessii from 90 sites within the ranges of lodgepole pine and jack pine. We identified two lineages (East and West, FST = 0.677) associated with host genetic structure (r = 0.81, P = 0.001), with East comprising three sublineages. In parallel, we conducted a factorial experiment in which lodgepole pine, jack pine, and hybrid seedlings were inoculated with spores from the two primary genetic lineages. With this experiment, we refined the phenotypic categories associated with infection and demonstrated that stem width can be used as a quantitative measure of host response to infection. Overall, each host responded differentially to the fungal lineages, with jack pine exhibiting more resiliency to infection than lodgepole pine and hybrids exhibiting intermediate resiliency. Taken together, the shared genetic structure between fungus and host species, and the differential interaction of the fungal species with the hosts, support a coevolutionary relationship between host and pathogen. Copyright © 2022 The Author(s). This is an open access article distributed under the CC BY-NC-ND 4.0 International license.
Affiliation(s)
- Chandra H McAllister
  - Department of Biological Sciences, University of Alberta, Edmonton, Alberta, Canada
- Rhiannon M Peery
  - Department of Biological Sciences, University of Alberta, Edmonton, Alberta, Canada
- Michael Mbenoun
  - Department of Biological Sciences, University of Alberta, Edmonton, Alberta, Canada
- Eden McPeak
  - Department of Biological Sciences, University of Alberta, Edmonton, Alberta, Canada
- Nicolas Feau
  - Department of Forest Science, University of British Columbia, Vancouver, British Columbia, Canada
  - Pacific Forestry Centre, Canadian Forest Service, Natural Resources Canada, Victoria, British Columbia, Canada
- Richard C Hamelin
  - Department of Forest Science, University of British Columbia, Vancouver, British Columbia, Canada
- Tod D Ramsfield
  - Northern Forestry Centre, Canadian Forest Service, Natural Resources Canada, Edmonton, Alberta, Canada
- Colin L Myrholm
  - Northern Forestry Centre, Canadian Forest Service, Natural Resources Canada, Edmonton, Alberta, Canada
- Janice E K Cooke
  - Department of Biological Sciences, University of Alberta, Edmonton, Alberta, Canada
6
Stephen MA, Cheng H, Pryce JE, Burke CR, Steele NM, Phyn CVC, Garrick DJ. Estimating Heritabilities and Breeding Values From Censored Phenotypes Using a Data Augmentation Approach. Front Genet 2022; 13:867152. [PMID: 35957692 PMCID: PMC9358037 DOI: 10.3389/fgene.2022.867152] [Citation(s) in RCA: 3] [Impact Index Per Article: 1.5] [Received: 01/31/2022] [Accepted: 05/24/2022] [Indexed: 11/13/2022] Open
Abstract
Time-dependent traits are often subject to censorship, where instead of precise phenotypes, only a lower and/or upper bound can be established for some of the individuals. Censorship reduces the precision of phenotypes but can represent a compromise between measurement cost and animal ethics considerations. This compromise is particularly relevant for genetic evaluation because phenotyping initiatives often involve thousands of individuals. This research aimed to: 1) demonstrate a data augmentation approach for analysing censored phenotypes, and 2) quantify the implications of phenotype censorship on estimation of heritabilities and predictions of breeding values. First, we simulated uncensored phenotypes, representing fine-scale “age at puberty” for each individual in a population of some 5,000 animals across 50 herds. Analysis of these uncensored phenotypes provided a gold-standard control. We then produced seven “test” phenotypes by superimposing varying degrees of left, interval, and/or right censorship, as if herds were measured on only one, two or three occasions, with a binary measure categorized for animals at each visit (either pre or post pubertal). We demonstrated that our estimates of heritabilities and predictions of breeding values obtained using a data augmentation approach were remarkably robust to phenotype censorship. Our results have important practical implications for measuring time-dependent traits for genetic evaluation. More specifically, we suggest that data collection can be designed with relatively infrequent repeated measures, thereby reducing costs and increasing feasibility across large numbers of animals.
Affiliation(s)
- Melissa A. Stephen
  - DairyNZ Ltd., Hamilton, New Zealand
  - AL Rae Centre for Genetics and Breeding—Massey University, Hamilton, New Zealand
  - *Correspondence: Melissa A. Stephen
- Hao Cheng
  - Department of Animal Science, University of California, Davis, Davis, CA, United States
- Jennie E. Pryce
  - Centre for AgriBioscience, Agriculture Victoria Research, AgriBio, Bundoora, VIC, Australia
  - School of Applied Systems Biology, La Trobe University, Bundoora, VIC, Australia
- Dorian J. Garrick
  - AL Rae Centre for Genetics and Breeding—Massey University, Hamilton, New Zealand
7
Crossa J, Montesinos-López OA, Pérez-Rodríguez P, Costa-Neto G, Fritsche-Neto R, Ortiz R, Martini JWR, Lillemo M, Montesinos-López A, Jarquin D, Breseghello F, Cuevas J, Rincent R. Genome and Environment Based Prediction Models and Methods of Complex Traits Incorporating Genotype × Environment Interaction. Methods Mol Biol 2022; 2467:245-283. [PMID: 35451779 DOI: 10.1007/978-1-0716-2205-6_9] [Citation(s) in RCA: 15] [Impact Index Per Article: 7.5] [Indexed: 06/14/2023]
Abstract
Genomic-enabled prediction models are of paramount importance for the successful implementation of genomic selection (GS) based on breeding values. As opposed to animal breeding, plant breeding includes extensive multienvironment and multiyear field trial data. Hence, genomic-enabled prediction models should include genotype × environment (G × E) interaction, which most of the time increases prediction performance when the response of lines differs from environment to environment. In this chapter, we describe a historical timeline since 2012 of advances in GS models that take G × E interaction into account. We describe theoretical and practical aspects of those GS models, including the gains in prediction performance when including G × E structures for both complex continuous and categorical scale traits. Then, we detail and explain the main G × E genomic prediction models for complex traits measured on continuous and noncontinuous (categorical) scales. In relation to G × E interaction models, this review also examines the analysis of information generated from high-throughput phenotype data (phenomics) and the joint analysis of multitrait and multienvironment field trial data, which is also employed in the general assessment of multitrait G × E interaction. The role of nongenomic data in increasing the accuracy and biological reliability of the G × E approach is also outlined. We show the recent advances in large-scale envirotyping (enviromics), and how mechanistic computational modeling can derive the crop growth and development aspects useful for predicting phenotypes and explaining G × E.
Affiliation(s)
- José Crossa
  - International Maize and Wheat Improvement Center (CIMMYT), Carretera México-Veracruz, Mexico
  - Colegio de Postgraduados, Montecillos, Mexico
- Germano Costa-Neto
  - Departamento de Genética, Escola Superior de Agricultura "Luiz de Queiroz" (ESALQ/USP), São Paulo, Brazil
- Roberto Fritsche-Neto
  - Departamento de Genética, Escola Superior de Agricultura "Luiz de Queiroz" (ESALQ/USP), São Paulo, Brazil
- Rodomiro Ortiz
  - Department of Plant Breeding, Swedish University of Agricultural Sciences (SLU), Alnarp, Sweden
- Johannes W R Martini
  - International Maize and Wheat Improvement Center (CIMMYT), Carretera México-Veracruz, Mexico
- Morten Lillemo
  - Department of Plant Sciences, Norwegian University of Life Sciences, IHA/CIGENE, Ås, Norway
- Abelardo Montesinos-López
  - Departamento de Matemáticas, Centro Universitario de Ciencias Exactas e Ingenierías (CUCEI), Universidad de Guadalajara, Guadalajara, Jalisco, Mexico
- Jaime Cuevas
  - Universidad de Quintana Roo, Chetumal, Quintana Roo, Mexico
- Renaud Rincent
  - Université Paris-Saclay, INRAE, CNRS, AgroParisTech, Génétique Quantitative et Evolution - Le Moulon, Gif-sur-Yvette, France
8
Zhou S, Zhu Q, Liu H, Jiang S, Zhang X, Peng C, Yang G, Li J, Cheng L, Zhong R, Zeng Q, Miao X, Lu Q. Associations of polycyclic aromatic hydrocarbons exposure and its interaction with XRCC1 genetic polymorphism with lung cancer: A case-control study. Environmental Pollution (Barking, Essex : 1987) 2021; 290:118077. [PMID: 34523522 DOI: 10.1016/j.envpol.2021.118077] [Citation(s) in RCA: 2] [Impact Index Per Article: 0.7] [Received: 05/12/2021] [Revised: 08/27/2021] [Accepted: 08/29/2021] [Indexed: 06/13/2023]
Abstract
Humans are extensively exposed to polycyclic aromatic hydrocarbons (PAHs) daily via multiple pathways. Epidemiological studies have demonstrated that occupational exposure to PAHs increases the risk of lung cancer, but related studies in the general population are limited. Hence, we conducted a case-control study among the Chinese general population to investigate the associations between PAHs exposure and lung cancer risk and analyze the modifications of genetic polymorphisms in DNA repair genes. In this study, we enrolled 122 lung cancer cases and 244 healthy controls in Wuhan, China. Urinary PAHs metabolites were determined by gas chromatography-mass spectrometry, and rs25487 in X-ray repair cross-complementation 1 (XRCC1) gene was genotyped by the Agena Bioscience MassARRAY System. Then, multivariable logistic regression models were performed to estimate the potential associations. We found that urinary hydroxynaphthalene (OH-Nap), hydroxyphenanthrene (OH-Phe) and the sum of hydroxy PAHs (∑OH-PAHs) levels were significantly higher in lung cancer cases than those in controls. After adjusting for gender, age, BMI, smoking status, smoking pack-years, drinking status and family history, urinary ∑OH-Nap and ∑OH-Phe levels were positively associated with lung cancer risk, with dose-response relationships. Compared with those in the lowest tertiles, individuals in the highest tertiles of ∑OH-Nap and ∑OH-Phe had a 2.13-fold (95% CI: 1.10, 4.09) and 2.45-fold (95% CI: 1.23, 4.87) increased risk of lung cancer, respectively. Effects of gender, age, smoking status and smoking pack-years on the associations of PAHs exposure with lung cancer risk were shown in the subgroup analysis. Furthermore, associations of urinary ∑OH-Nap and ∑OH-PAHs levels with lung cancer risk were modified by XRCC1 rs25487 (Pinteraction ≤ 0.025), and were more pronounced in wild-types of rs25487. 
These findings suggest that environmental exposure to naphthalene and phenanthrene is associated with increased lung cancer risk, and polymorphism of XRCC1 rs25487 might modify the naphthalene exposure-related lung cancer effect.
Affiliation(s)
- Shuang Zhou
  - State Key Laboratory of Environment Health (Incubation), Key Laboratory of Environment and Health, Ministry of Education, Key Laboratory of Environment and Health (Wuhan), Ministry of Environmental Protection, School of Public Health, Tongji Medical College, Huazhong University of Science and Technology, 13 Hangkong Road, Wuhan, Hubei, 430030, China
- Qiuqi Zhu
  - State Key Laboratory of Environment Health (Incubation), Key Laboratory of Environment and Health, Ministry of Education, Key Laboratory of Environment and Health (Wuhan), Ministry of Environmental Protection, School of Public Health, Tongji Medical College, Huazhong University of Science and Technology, 13 Hangkong Road, Wuhan, Hubei, 430030, China
- Huimin Liu
  - State Key Laboratory of Environment Health (Incubation), Key Laboratory of Environment and Health, Ministry of Education, Key Laboratory of Environment and Health (Wuhan), Ministry of Environmental Protection, School of Public Health, Tongji Medical College, Huazhong University of Science and Technology, 13 Hangkong Road, Wuhan, Hubei, 430030, China
- Shunli Jiang
  - State Key Laboratory of Environment Health (Incubation), Key Laboratory of Environment and Health, Ministry of Education, Key Laboratory of Environment and Health (Wuhan), Ministry of Environmental Protection, School of Public Health, Tongji Medical College, Huazhong University of Science and Technology, 13 Hangkong Road, Wuhan, Hubei, 430030, China; Key Laboratory of Occupational Health and Environmental Medicine, Department of Public Health, Jining Medical University, 133 Hehua Road, Jining, Shandong, 272067, China
- Xu Zhang
  - State Key Laboratory of Environment Health (Incubation), Key Laboratory of Environment and Health, Ministry of Education, Key Laboratory of Environment and Health (Wuhan), Ministry of Environmental Protection, School of Public Health, Tongji Medical College, Huazhong University of Science and Technology, 13 Hangkong Road, Wuhan, Hubei, 430030, China
- Cheng Peng
  - State Key Laboratory of Environment Health (Incubation), Key Laboratory of Environment and Health, Ministry of Education, Key Laboratory of Environment and Health (Wuhan), Ministry of Environmental Protection, School of Public Health, Tongji Medical College, Huazhong University of Science and Technology, 13 Hangkong Road, Wuhan, Hubei, 430030, China
- Guanlin Yang
  - State Key Laboratory of Environment Health (Incubation), Key Laboratory of Environment and Health, Ministry of Education, Key Laboratory of Environment and Health (Wuhan), Ministry of Environmental Protection, School of Public Health, Tongji Medical College, Huazhong University of Science and Technology, 13 Hangkong Road, Wuhan, Hubei, 430030, China
- Jiaoyuan Li
  - Department of Laboratory Medicine, Tongji Hospital, Tongji Medical College, Huazhong University of Science and Technology, 1095 Jiefang Road, Wuhan, Hubei, 430030, China
- Liming Cheng
  - Department of Laboratory Medicine, Tongji Hospital, Tongji Medical College, Huazhong University of Science and Technology, 1095 Jiefang Road, Wuhan, Hubei, 430030, China
- Rong Zhong
  - State Key Laboratory of Environment Health (Incubation), Key Laboratory of Environment and Health, Ministry of Education, Key Laboratory of Environment and Health (Wuhan), Ministry of Environmental Protection, School of Public Health, Tongji Medical College, Huazhong University of Science and Technology, 13 Hangkong Road, Wuhan, Hubei, 430030, China
- Qiang Zeng
  - State Key Laboratory of Environment Health (Incubation), Key Laboratory of Environment and Health, Ministry of Education, Key Laboratory of Environment and Health (Wuhan), Ministry of Environmental Protection, School of Public Health, Tongji Medical College, Huazhong University of Science and Technology, 13 Hangkong Road, Wuhan, Hubei, 430030, China
- Xiaoping Miao
  - State Key Laboratory of Environment Health (Incubation), Key Laboratory of Environment and Health, Ministry of Education, Key Laboratory of Environment and Health (Wuhan), Ministry of Environmental Protection, School of Public Health, Tongji Medical College, Huazhong University of Science and Technology, 13 Hangkong Road, Wuhan, Hubei, 430030, China
- Qing Lu
  - State Key Laboratory of Environment Health (Incubation), Key Laboratory of Environment and Health, Ministry of Education, Key Laboratory of Environment and Health (Wuhan), Ministry of Environmental Protection, School of Public Health, Tongji Medical College, Huazhong University of Science and Technology, 13 Hangkong Road, Wuhan, Hubei, 430030, China
9
Mehrban H, Lee DH, Moradi MH, IlCho C, Naserkheil M, Ibáñez-Escriche N. Predictive performance of genomic selection methods for carcass traits in Hanwoo beef cattle: impacts of the genetic architecture. Genet Sel Evol 2017; 49:1. [PMID: 28093066 PMCID: PMC5240470 DOI: 10.1186/s12711-016-0283-0] [Citation(s) in RCA: 51] [Impact Index Per Article: 7.3] [Received: 12/17/2015] [Accepted: 12/22/2016] [Indexed: 12/15/2022] Open
Abstract
Background: Hanwoo beef is known for its marbled fat, tenderness, juiciness and characteristic flavor, as well as for its low cholesterol and high omega 3 fatty acid contents. As yet, there has been no comprehensive investigation to estimate genomic selection accuracy for carcass traits in Hanwoo cattle using dense markers. This study aimed at evaluating the accuracy of alternative statistical methods that differed in assumptions about the underlying genetic model for various carcass traits: backfat thickness (BT), carcass weight (CW), eye muscle area (EMA), and marbling score (MS). Methods: Accuracies of direct genomic breeding values (DGV) for carcass traits were estimated by applying fivefold cross-validation to a dataset including 1183 animals and approximately 34,000 single nucleotide polymorphisms (SNPs). Results: Accuracies of BayesC, Bayesian LASSO (BayesL) and genomic best linear unbiased prediction (GBLUP) methods were similar for BT, EMA and MS. However, for CW, DGV accuracy was 7% higher with BayesC than with BayesL and GBLUP. The increased accuracy of BayesC, compared to GBLUP and BayesL, was maintained for CW, regardless of the training sample size, but not for BT, EMA, and MS. Genome-wide association studies detected consistent large effects for SNPs on chromosomes 6 and 14 for CW. Conclusions: The predictive performance of the models depended on the trait analyzed. For CW, the results showed a clear superiority of BayesC compared to GBLUP and BayesL. These findings indicate the importance of using a proper variable selection method for genomic selection of traits and also suggest that the genetic architecture that underlies CW differs from that of the other carcass traits analyzed. Thus, our study provides significant new insights into the carcass traits of Hanwoo cattle.
Affiliation(s)
- Hossein Mehrban
  - Department of Animal Science, Shahrekord University, P.O. Box 115, Shahrekord, 88186-34141, Iran
- Deuk Hwan Lee
  - Department of Animal Life and Environment Science, Hankyong National University, Jungang-ro 327, Anseong-si, Gyeonggi-do, 456-749, Korea
- Mohammad Hossein Moradi
  - Department of Animal Science, Faculty of Agriculture and Natural Resources, Arak University, Arāk, 38156-8-8349, Iran
- Chung IlCho
  - Hanwoo Improvement Center, National Agricultural Cooperative Federation, Haeun-ro 691, Unsan-myeon, Seosan-si, Chungnam-do, 356-831, Korea
- Masoumeh Naserkheil
  - Department of Animal Science, University College of Agriculture and Natural Resources, University of Tehran, P.O. Box 4111, Karaj, 31587-11167, Iran
- Noelia Ibáñez-Escriche
  - The Roslin Institute, Royal (Dick) School of Veterinary Studies, University of Edinburgh, Roslin, UK
|
10
|
Genomic Prediction Models for Count Data. JOURNAL OF AGRICULTURAL BIOLOGICAL AND ENVIRONMENTAL STATISTICS 2015. [DOI: 10.1007/s13253-015-0223-4] [Citation(s) in RCA: 2] [Impact Index Per Article: 0.2] [Indexed: 10/23/2022]
|
11
|
Khojastehkey M, Aslaminejad AA, Shariati MM, Dianat R. Body size estimation of new born lambs using image processing and its effect on the genetic gain of a simulated population. JOURNAL OF APPLIED ANIMAL RESEARCH 2015. [DOI: 10.1080/09712119.2015.1031789] [Citation(s) in RCA: 3] [Impact Index Per Article: 0.3] [Indexed: 10/23/2022]
|
12
|
Rolf MM, Garrick DJ, Fountain T, Ramey HR, Weaber RL, Decker JE, Pollak EJ, Schnabel RD, Taylor JF. Comparison of Bayesian models to estimate direct genomic values in multi-breed commercial beef cattle. Genet Sel Evol 2015; 47:23. [PMID: 25884158 PMCID: PMC4433095 DOI: 10.1186/s12711-015-0106-8] [Citation(s) in RCA: 29] [Impact Index Per Article: 3.2] [Received: 08/12/2013] [Accepted: 02/04/2015] [Indexed: 11/24/2022]
Abstract
Background: While several studies have examined the accuracy of direct genomic breeding values (DGV) within and across purebred cattle populations, the accuracy of DGV in crossbred or multi-breed cattle populations has been less well examined. Interest in the use of genomic tools for both selection and management has increased within the hybrid seedstock and commercial cattle sectors and research is needed to determine their efficacy. We predicted DGV for six traits using training populations of various sizes and alternative Bayesian models for a population of 3240 crossbred animals. Our objective was to compare alternate models with different assumptions regarding the distributions of single nucleotide polymorphism (SNP) effects to determine the optimal model for enhancing feasibility of multi-breed DGV prediction for the commercial beef industry.
Results: Realized accuracies ranged from 0.40 to 0.78. Randomly assigning 60 to 70% of animals to training (n ≈ 2000 records) yielded DGV accuracies with the smallest coefficients of variation. Mixture models (BayesB95, BayesCπ) and models that allow SNP effects to be sampled from distributions with unequal variances (BayesA, BayesB95) were advantageous for traits that appear or are known to be influenced by large-effect genes. For other traits, models differed little in prediction accuracy (~0.3 to 0.6%), suggesting that they are mainly controlled by small-effect loci.
Conclusions: The proportion (60 to 70%) of data allocated to training that optimized DGV accuracy and minimized the coefficient of variation of accuracy was similar to large dairy populations. Larger effects were estimated for some SNPs using BayesA and BayesB95 models because they allow unequal SNP variances. This substantially increased DGV accuracy for Warner-Bratzler Shear Force, for which large-effect quantitative trait loci (QTL) are known, while no loss in accuracy was observed for traits that appear to follow the infinitesimal model. Large decreases in accuracy (up to 0.07) occurred when SNPs that presumably tag large-effect QTL were over-regressed towards the mean in BayesC0 analyses. The DGV accuracies achieved here indicate that genomic selection has predictive utility in the commercial beef industry and that using models that reflect the genomic architecture of the trait can have predictive advantages in multi-breed populations.
Affiliation(s)
- Megan M Rolf
  - Department of Animal Sciences, Oklahoma State University, Stillwater, OK, 74078, USA.
- Dorian J Garrick
  - Department of Animal Science, Iowa State University, Ames, IA, 50011, USA.
- Tara Fountain
  - Department of Animal Science, Kansas State University, Manhattan, KS, 66502, USA.
- Holly R Ramey
  - Division of Animal Sciences, University of Missouri, Columbia, MO, 65211, USA.
- Robert L Weaber
  - Department of Animal Science, Kansas State University, Manhattan, KS, 66502, USA.
- Jared E Decker
  - Division of Animal Sciences, University of Missouri, Columbia, MO, 65211, USA.
- E John Pollak
  - USDA, ARS, US Meat Animal Research Center, PO Box 166, Clay Center, NE, 68933, USA.
- Robert D Schnabel
  - Division of Animal Sciences, University of Missouri, Columbia, MO, 65211, USA.
- Jeremy F Taylor
  - Division of Animal Sciences, University of Missouri, Columbia, MO, 65211, USA.
|
13
|
Muranty H, Troggio M, Sadok IB, Rifaï MA, Auwerkerken A, Banchi E, Velasco R, Stevanato P, van de Weg WE, Di Guardo M, Kumar S, Laurens F, Bink MCAM. Accuracy and responses of genomic selection on key traits in apple breeding. HORTICULTURE RESEARCH 2015; 2:15060. [PMID: 26744627 PMCID: PMC4688998 DOI: 10.1038/hortres.2015.60] [Citation(s) in RCA: 76] [Impact Index Per Article: 8.4] [Indexed: 05/17/2023]
Abstract
The application of genomic selection in fruit tree crops is expected to enhance breeding efficiency by increasing prediction accuracy, increasing selection intensity and decreasing generation interval. The objectives of this study were to assess the accuracy of prediction and selection response in commercial apple breeding programmes for key traits. The training population comprised 977 individuals derived from 20 pedigreed full-sib families. Historic phenotypic data were available on 10 traits related to productivity and fruit external appearance and genotypic data for 7829 SNPs obtained with an Illumina 20K SNP array. From these data, a genome-wide prediction model was built and subsequently used to calculate genomic breeding values of five application full-sib families. The application families had genotypes at 364 SNPs from a dedicated 512 SNP array, and these genotypic data were extended to the high-density level by imputation. These five families were phenotyped for 1 year and their phenotypes were compared to the predicted breeding values. Accuracy of genomic prediction across the 10 traits reached a maximum value of 0.5 and had a median value of 0.19. The accuracies were strongly affected by the phenotypic distribution and heritability of traits. In the largest family, significant selection response was observed for traits with high heritability and symmetric phenotypic distribution. Traits that showed non-significant response often had reduced and skewed phenotypic variation or low heritability. Among the five application families the accuracies were uncorrelated to the degree of relatedness to the training population. The results underline the potential of genomic prediction to accelerate breeding progress in outbred fruit tree crops that still need to overcome long generation intervals and extensive phenotyping costs.
Affiliation(s)
- Hélène Muranty
  - Institut de Recherche en Horticulture et Semences UMR1345, INRA, SFR 4207 QUASAV, F-49071 Beaucouze, France
- Michela Troggio
  - Research and Innovation Center, Fondazione Edmund Mach, San Michele all’Adige, Trento, Italy
- Inès Ben Sadok
  - Institut de Recherche en Horticulture et Semences UMR1345, INRA, SFR 4207 QUASAV, F-49071 Beaucouze, France
- Mehdi Al Rifaï
  - Institut de Recherche en Horticulture et Semences UMR1345, INRA, SFR 4207 QUASAV, F-49071 Beaucouze, France
- Elisa Banchi
  - Research and Innovation Center, Fondazione Edmund Mach, San Michele all’Adige, Trento, Italy
- Riccardo Velasco
  - Research and Innovation Center, Fondazione Edmund Mach, San Michele all’Adige, Trento, Italy
- Piergiorgio Stevanato
  - DAFNAE, Dipartimento di Agronomia Animali Alimenti Risorse Naturali e Ambiente, viale Università 16, 35020 Legnaro (PD), Università degli Studi di Padova, Italy
- W Eric van de Weg
  - Wageningen UR Plant Breeding, Wageningen University and Research Center, Wageningen, The Netherlands
- Mario Di Guardo
  - Research and Innovation Center, Fondazione Edmund Mach, San Michele all’Adige, Trento, Italy
  - Wageningen UR Plant Breeding, Wageningen University and Research Center, Wageningen, The Netherlands
- Satish Kumar
  - The New Zealand Institute for Plant & Food Research Limited, Private Bag 1401, Havelock North 4157, New Zealand
- François Laurens
  - Institut de Recherche en Horticulture et Semences UMR1345, INRA, SFR 4207 QUASAV, F-49071 Beaucouze, France
- Marco C A M Bink
  - Biometris, Wageningen University and Research Center, Wageningen, The Netherlands
|
14
|
Abstract
Statistical methodology has played a key role in scientific animal breeding. Approximately one hundred years of statistical developments in animal breeding are reviewed. Some of the scientific foundations of the field are discussed, and many milestones are examined from historical and critical perspectives. The review concludes with a discussion of some future challenges and opportunities arising from the massive amount of data generated by livestock, plant, and human genome projects.
|