Reference Citation Analysis: Find an Article, Find a Category, Find a Journal, Find a Scholar

For: Long N, Gianola D, Rosa GJM, Weigel KA. Application of support vector regression to genome-assisted prediction of quantitative traits. Theor Appl Genet 2011;123:1065-1074. [PMID: 21739137 DOI: 10.1007/s00122-011-1648-y] [Citation(s) in RCA: 46] [Impact Index Per Article: 3.5] [Received: 03/14/2011] [Accepted: 06/22/2011] [Indexed: 05/31/2023]

Cited by Other Article(s)

Ren Y, Wu C, Zhou H, Hu X, Miao Z. Dual-extraction modeling: A multi-modal deep-learning architecture for phenotypic prediction and functional gene mining of complex traits. Plant Communications 2024;5:101002. [PMID: 38872306 DOI: 10.1016/j.xplc.2024.101002] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 01/25/2024] [Revised: 05/27/2024] [Accepted: 06/11/2024] [Indexed: 06/15/2024]

Farooq MA, Gao S, Hassan MA, Huang Z, Rasheed A, Hearne S, Prasanna B, Li X, Li H. Artificial intelligence in plant breeding. Trends Genet 2024:S0168-9525(24)00167-7. [PMID: 39117482 DOI: 10.1016/j.tig.2024.07.001] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 04/30/2024] [Revised: 07/06/2024] [Accepted: 07/12/2024] [Indexed: 08/10/2024]

Affiliation(s)

Muhammad Amjad Farooq State Key Laboratory of Crop Gene Resources and Breeding, Institute of Crop Sciences, Chinese Academy of Agricultural Sciences (CAAS), International Maize and Wheat Improvement Center (CIMMYT) China office, Beijing 100081, China; Nanfan Research Institute, CAAS, Sanya, Hainan 572024, China
Shang Gao State Key Laboratory of Crop Gene Resources and Breeding, Institute of Crop Sciences, Chinese Academy of Agricultural Sciences (CAAS), International Maize and Wheat Improvement Center (CIMMYT) China office, Beijing 100081, China; Nanfan Research Institute, CAAS, Sanya, Hainan 572024, China
Muhammad Adeel Hassan Adaptive Cropping Systems Laboratory, Beltsville Agricultural Research Center, US Department of Agriculture, Beltsville, MD 20705, USA; Oak Ridge Institute for Science and Education, Oak Ridge, TN 37830, USA
Zhangping Huang State Key Laboratory of Crop Gene Resources and Breeding, Institute of Crop Sciences, Chinese Academy of Agricultural Sciences (CAAS), International Maize and Wheat Improvement Center (CIMMYT) China office, Beijing 100081, China; Nanfan Research Institute, CAAS, Sanya, Hainan 572024, China
Awais Rasheed Department of Plant Sciences, Quaid-i-Azam University, Islamabad 45320, Pakistan
Sarah Hearne CIMMYT, KM 45 Carretera Mexico-Veracruz, El Batan, Texcoco 56237, Mexico
Boddupalli Prasanna CIMMYT, International Centre for Research in Agroforestry (ICRAF) House, Nairobi 00100, Kenya
Xinhai Li State Key Laboratory of Crop Gene Resources and Breeding, Institute of Crop Sciences, Chinese Academy of Agricultural Sciences (CAAS), International Maize and Wheat Improvement Center (CIMMYT) China office, Beijing 100081, China
Huihui Li State Key Laboratory of Crop Gene Resources and Breeding, Institute of Crop Sciences, Chinese Academy of Agricultural Sciences (CAAS), International Maize and Wheat Improvement Center (CIMMYT) China office, Beijing 100081, China; Nanfan Research Institute, CAAS, Sanya, Hainan 572024, China.

Nascimento M, Nascimento ACC, Azevedo CF, de Oliveira ACB, Caixeta ET, Jarquin D. Enhancing genomic prediction with Stacking Ensemble Learning in Arabica Coffee. Frontiers in Plant Science 2024;15:1373318. [PMID: 39086911 PMCID: PMC11288849 DOI: 10.3389/fpls.2024.1373318] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 01/19/2024] [Accepted: 06/12/2024] [Indexed: 08/02/2024]

Wang X, Zhang Z, Du H, Pfeiffer C, Mészáros G, Ding X. Predictive ability of multi-population genomic prediction methods of phenotypes for reproduction traits in Chinese and Austrian pigs. Genet Sel Evol 2024;56:49. [PMID: 38926647 PMCID: PMC11201905 DOI: 10.1186/s12711-024-00915-5] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 12/22/2022] [Accepted: 05/30/2024] [Indexed: 06/28/2024]

Abstract

BACKGROUND

Multi-population genomic prediction can rapidly expand the size of the reference population and improve genomic prediction ability. Machine learning (ML) algorithms have shown advantages in single-population genomic prediction of phenotypes. However, few studies have explored the effectiveness of ML methods for multi-population genomic prediction.

RESULTS

In this study, 3720 Yorkshire pigs from Austria and four breeding farms in China were used, and single-trait genomic best linear unbiased prediction (ST-GBLUP), multitrait GBLUP (MT-GBLUP), Bayesian Horseshoe (BayesHE), and three ML methods (support vector regression (SVR), kernel ridge regression (KRR) and AdaBoost.R2) were compared to explore the optimal method for joint genomic prediction of phenotypes of Chinese and Austrian pigs through 10 replicates of fivefold cross-validation. We tested the performance of the methods in two scenarios: (i) including only one Austrian population and one Chinese pig population that were genetically linked based on principal component analysis (PCA) (designated as the "two-population scenario") and (ii) adding reference populations that are unrelated based on PCA to the above two populations (designated as the "multi-population scenario"). Our results show that the use of MT-GBLUP in the two-population scenario resulted in an improvement of 7.1% in predictive ability compared to ST-GBLUP, while the use of SVR and KRR yielded improvements in predictive ability of 4.5 and 5.3%, respectively, compared to MT-GBLUP. SVR and KRR also yielded lower mean square errors (MSE) in most population and trait combinations. In the multi-population scenario, improvements in predictive ability of 29.7, 24.4 and 11.1% were obtained compared to ST-GBLUP when using, respectively, SVR, KRR, and AdaBoost.R2. However, compared to MT-GBLUP, the potential of ML methods to improve predictive ability was not demonstrated.

CONCLUSIONS

Our study demonstrates that ML algorithms can achieve better prediction performance than multitrait GBLUP models in multi-population genomic prediction of phenotypes when the populations have similar genetic backgrounds; however, when reference populations that are unrelated based on PCA are added, the ML methods did not show a benefit. When the number of populations increased, only MT-GBLUP improved predictive ability in both validation populations, while the other methods showed improvement in only one population.

Li X, Chen X, Wang Q, Yang N, Sun C. Integrating Bioinformatics and Machine Learning for Genomic Prediction in Chickens. Genes (Basel) 2024;15:690. [PMID: 38927626 PMCID: PMC11202573 DOI: 10.3390/genes15060690] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 04/09/2024] [Revised: 05/12/2024] [Accepted: 05/23/2024] [Indexed: 06/28/2024]

Bose S, Banerjee S, Kumar S, Saha A, Nandy D, Hazra S. Review of applications of artificial intelligence (AI) methods in crop research. J Appl Genet 2024;65:225-240. [PMID: 38216788 DOI: 10.1007/s13353-023-00826-z] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 08/13/2023] [Revised: 12/23/2023] [Accepted: 12/26/2023] [Indexed: 01/14/2024]

Mota LFM, Arikawa LM, Santos SWB, Fernandes Júnior GA, Alves AAC, Rosa GJM, Mercadante MEZ, Cyrillo JNSG, Carvalheiro R, Albuquerque LG. Benchmarking machine learning and parametric methods for genomic prediction of feed efficiency-related traits in Nellore cattle. Sci Rep 2024;14:6404. [PMID: 38493207 PMCID: PMC10944497 DOI: 10.1038/s41598-024-57234-4] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 07/03/2023] [Accepted: 03/15/2024] [Indexed: 03/18/2024]

Hoque A, Anderson JV, Rahman M. Genomic prediction for agronomic traits in a diverse Flax (Linum usitatissimum L.) germplasm collection. Sci Rep 2024;14:3196. [PMID: 38326469 PMCID: PMC10850546 DOI: 10.1038/s41598-024-53462-w] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 07/28/2023] [Accepted: 01/31/2024] [Indexed: 02/09/2024]

Abstract

Breeding programs require exhaustive phenotyping of germplasms, which is time-demanding and expensive. Genomic prediction helps breeders harness the diversity of any collection to bypass phenotyping. Here, we examined the potential of genomic prediction for seed yield and nine agronomic traits using 26,171 single nucleotide polymorphism (SNP) markers in a set of 337 flax (Linum usitatissimum L.) germplasm, phenotyped in five environments. We evaluated 14 prediction models and several factors affecting predictive ability based on cross-validation schemes. Models yielded significant variation among predictive ability values across traits for the whole marker set. The ridge regression (RR) model covering additive gene action yielded better predictive ability for most of the traits, whereas models capturing epistatic gene action gave higher predictive ability for low-heritability traits. Marker subsets based on linkage disequilibrium decay distance gave significantly higher predictive abilities than the whole marker set, but for randomly selected markers, predictive ability reached a plateau above 3000 markers. Markers having significant association with traits improved predictive abilities compared to the whole marker set when marker selection was made on the whole population instead of the training set, indicating clear overfitting. The correction for population structure did not increase predictive abilities compared to the whole collection. However, stratified sampling by picking representative genotypes from each cluster improved predictive abilities. The indirect predictive ability for a trait was proportionate to its correlation with other traits. These results will help breeders to select the best models, optimum marker set, and suitable genotype set to perform an indirect selection for quantitative traits in this diverse flax germplasm collection.

Wu C, Zhang Y, Ying Z, Li L, Wang J, Yu H, Zhang M, Feng X, Wei X, Xu X. A transformer-based genomic prediction method fused with knowledge-guided module. Brief Bioinform 2023;25:bbad438. [PMID: 38058185 PMCID: PMC10701102 DOI: 10.1093/bib/bbad438] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 09/07/2023] [Revised: 10/15/2023] [Accepted: 11/03/2023] [Indexed: 12/08/2023]

Hamadani A, Ganai NA. Artificial intelligence algorithm comparison and ranking for weight prediction in sheep. Sci Rep 2023;13:13242. [PMID: 37582936 PMCID: PMC10427635 DOI: 10.1038/s41598-023-40528-4] [Citation(s) in RCA: 2] [Impact Index Per Article: 2.0] [Received: 03/02/2023] [Accepted: 08/11/2023] [Indexed: 08/17/2023]

Abstract

In a rapidly transforming world, farm data is growing exponentially. Realizing the importance of this data, researchers are looking for new solutions to analyse this data and make farming predictions. Artificial Intelligence, with its capacity to handle big data, is rapidly becoming popular. In addition, it can also handle non-linear, noisy data and is not limited by the conditions required for conventional data analysis. This study was therefore undertaken to compare the most popular machine learning (ML) algorithms and rank them as per their ability to make predictions on sheep farm data spanning 11 years. Data cleaning and preparation were done before analysis. Winsorization was done for outlier removal. Principal component analysis (PCA) and feature selection (FS) were done and, based on that, three datasets were created for bodyweight prediction, viz. PCA (wherein only PCA was used), PCA+FS (both techniques used for dimensionality reduction), and FS (only feature selection used). Among the 11 ML algorithms that were evaluated, the correlations between true and predicted values for MARS algorithm, Bayesian ridge regression, Ridge regression, Support Vector Machines, Gradient boosting algorithm, Random forests, XGBoost algorithm, Artificial neural networks, Classification and regression trees, Polynomial regression, K nearest neighbours and Genetic Algorithms were 0.993, 0.992, 0.991, 0.991, 0.991, 0.99, 0.99, 0.984, 0.984, 0.957, 0.949, 0.734 respectively for bodyweights. The top five algorithms for the prediction of bodyweights were MARS, Bayesian ridge regression, Ridge regression, Support Vector Machines and Gradient boosting algorithm. A total of 12 machine learning models were developed for the prediction of bodyweights in sheep in the present study. It may be said that machine learning techniques can perform predictions with reasonable accuracies and can thus help in drawing inferences and making futuristic predictions on farms for their economic prosperity, performance improvement and subsequently food security.
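
The Winsorization step mentioned in the abstract above can be illustrated with a minimal Python sketch. The abstract does not state which limits were used, so the 5th and 95th percentile cut-offs, the nearest-rank percentile rule, and the function name `winsorize` are all assumptions made here for illustration only.

```python
def winsorize(values, lower_pct=5, upper_pct=95):
    """Clamp values outside the chosen percentiles to the percentile values.

    The 5/95 limits and the nearest-rank (floor) percentile rule are
    illustrative assumptions, not settings reported in the abstract.
    """
    if not values:
        return []
    ordered = sorted(values)
    n = len(ordered)
    # Integer arithmetic avoids floating-point surprises in the index.
    low = ordered[(lower_pct * (n - 1)) // 100]
    high = ordered[(upper_pct * (n - 1)) // 100]
    # Values below `low` are raised to it; values above `high` are lowered.
    return [min(max(v, low), high) for v in values]


# Hypothetical bodyweights (kg) with one implausibly low and one high record.
weights = [1] + list(range(30, 49)) + [200]
cleaned = winsorize(weights)
```

Unlike trimming, this keeps every record in the dataset and only pulls the extreme values in to the chosen limits, so the sample size is unchanged.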

Alves AAC, Fernandes AFA, Lopes FB, Breen V, Hawken R, Gianola D, Rosa GJDM. (Quasi) multitask support vector regression with heuristic hyperparameter optimization for whole-genome prediction of complex traits: a case study with carcass traits in broilers. G3 (Bethesda, Md.) 2023;13:jkad109. [PMID: 37216670 PMCID: PMC10411556 DOI: 10.1093/g3journal/jkad109] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 03/13/2023] [Revised: 03/13/2023] [Accepted: 04/24/2023] [Indexed: 05/24/2023]

Abstract

This study investigates nonlinear kernels for multitrait (MT) genomic prediction using support vector regression (SVR) models. We assessed the predictive ability delivered by single-trait (ST) and MT models for 2 carcass traits (CT1 and CT2) measured in purebred broiler chickens. The MT models also included information on indicator traits measured in vivo [Growth and feed efficiency trait (FE)]. We proposed an approach termed (quasi) multitask SVR (QMTSVR), with hyperparameter optimization performed via genetic algorithm. ST and MT Bayesian shrinkage and variable selection models [genomic best linear unbiased predictor (GBLUP), BayesC (BC), and reproducing kernel Hilbert space (RKHS) regression] were employed as benchmarks. MT models were trained using 2 validation designs (CV1 and CV2), which differ in whether information on secondary traits is available in the testing set. Models' predictive ability was assessed with prediction accuracy (ACC; i.e. the correlation between predicted and observed values, divided by the square root of phenotype accuracy), standardized root-mean-squared error (RMSE*), and inflation factor (b). To account for potential bias in CV2-style predictions, we also computed a parametric estimate of accuracy (ACCpar). Predictive ability metrics varied according to trait, model, and validation design (CV1 or CV2), ranging from 0.71 to 0.84 for ACC, 0.78 to 0.92 for RMSE*, and between 0.82 and 1.34 for b. The highest ACC and smallest RMSE* were achieved with QMTSVR-CV2 in both traits. We observed that for CT1, model/validation design selection was sensitive to the choice of accuracy metric (ACC or ACCpar). Nonetheless, the higher predictive accuracy of QMTSVR over MTGBLUP and MTBC was replicated across accuracy metrics, as was the similar performance between the proposed method and the MTRKHS model. Results showed that the proposed approach is competitive with conventional MT Bayesian regression models using either Gaussian or spike-slab multivariate priors.

Ruperao P, Rangan P, Shah T, Thakur V, Kalia S, Mayes S, Rathore A. The Progression in Developing Genomic Resources for Crop Improvement. Life (Basel) 2023;13:1668. [PMID: 37629524 PMCID: PMC10455509 DOI: 10.3390/life13081668] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 06/15/2023] [Revised: 07/21/2023] [Accepted: 07/25/2023] [Indexed: 08/27/2023]

Zhao L, Walkowiak S, Fernando WGD. Artificial Intelligence: A Promising Tool in Exploring the Phytomicrobiome in Managing Disease and Promoting Plant Health. Plants (Basel, Switzerland) 2023;12:plants12091852. [PMID: 37176910 PMCID: PMC10180744 DOI: 10.3390/plants12091852] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 03/06/2023] [Revised: 04/25/2023] [Accepted: 04/27/2023] [Indexed: 05/15/2023]

Jeon D, Kang Y, Lee S, Choi S, Sung Y, Lee TH, Kim C. Digitalizing breeding in plants: A new trend of next-generation breeding based on genomic prediction. Frontiers in Plant Science 2023;14:1092584. [PMID: 36743488 PMCID: PMC9892199 DOI: 10.3389/fpls.2023.1092584] [Citation(s) in RCA: 7] [Impact Index Per Article: 7.0] [Received: 11/08/2022] [Accepted: 01/05/2023] [Indexed: 06/18/2023]

Comparison of artificial intelligence algorithms and their ranking for the prediction of genetic merit in sheep. Sci Rep 2022;12:18726. [PMID: 36333409 PMCID: PMC9636184 DOI: 10.1038/s41598-022-23499-w] [Citation(s) in RCA: 1] [Impact Index Per Article: 0.5] [Received: 03/25/2022] [Accepted: 11/01/2022] [Indexed: 11/06/2022]

Manthena V, Jarquín D, Varshney RK, Roorkiwal M, Dixit GP, Bharadwaj C, Howard R. Evaluating dimensionality reduction for genomic prediction. Front Genet 2022;13:958780. [PMID: 36313472 PMCID: PMC9614092 DOI: 10.3389/fgene.2022.958780] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 06/01/2022] [Accepted: 09/05/2022] [Indexed: 11/13/2022]

Zhang F, Weigel K, Cabrera V. Predicting daily milk yield for primiparous cows using data of within-herd relatives to capture genotype-by-environment interactions. J Dairy Sci 2022;105:6739-6748. [DOI: 10.3168/jds.2021-21559] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 11/10/2021] [Accepted: 03/29/2022] [Indexed: 11/19/2022]

Gabur I, Simioniuc DP, Snowdon RJ, Cristea D. Machine Learning Applied to the Search for Nonlinear Features in Breeding Populations. Front Artif Intell 2022;5:876578. [PMID: 35669178 PMCID: PMC9164111 DOI: 10.3389/frai.2022.876578] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 02/15/2022] [Accepted: 04/19/2022] [Indexed: 11/13/2022]

Wang X, Shi S, Wang G, Luo W, Wei X, Qiu A, Luo F, Ding X. Using machine learning to improve the accuracy of genomic prediction of reproduction traits in pigs. J Anim Sci Biotechnol 2022;13:60. [PMID: 35578371 PMCID: PMC9112588 DOI: 10.1186/s40104-022-00708-0] [Citation(s) in RCA: 5] [Impact Index Per Article: 2.5] [Received: 11/15/2021] [Accepted: 03/13/2022] [Indexed: 12/02/2022]

Abstract

Background

Recently, machine learning (ML) has become attractive in genomic prediction, but its superiority in genomic prediction over conventional (ss) GBLUP methods and the choice of optimal ML methods need to be investigated.

Results

In this study, 2566 Chinese Yorkshire pigs with reproduction trait records were genotyped with the GenoBaits Porcine SNP 50 K and PorcineSNP50 panels. Four ML methods, including support vector regression (SVR), kernel ridge regression (KRR), random forest (RF) and Adaboost.R2 were implemented. Through 20 replicates of fivefold cross-validation (CV) and one prediction for younger individuals, the utility of ML methods in genomic prediction was explored. In CV, compared with genomic BLUP (GBLUP), single-step GBLUP (ssGBLUP) and the Bayesian method BayesHE, ML methods significantly outperformed these conventional methods. ML methods improved the genomic prediction accuracy of GBLUP, ssGBLUP, and BayesHE by 19.3%, 15.0% and 20.8%, respectively. In addition, ML methods yielded smaller mean squared error (MSE) and mean absolute error (MAE) in all scenarios. ssGBLUP yielded an improvement of 3.8% on average in accuracy compared to that of GBLUP, and the accuracy of BayesHE was close to that of GBLUP. In genomic prediction of younger individuals, RF and Adaboost.R2_KRR performed better than GBLUP and BayesHE, while ssGBLUP performed comparably with RF, and ssGBLUP yielded slightly higher accuracy and lower MSE than Adaboost.R2_KRR in the prediction of total number of piglets born, while for number of piglets born alive, Adaboost.R2_KRR performed significantly better than ssGBLUP. Among ML methods, Adaboost.R2_KRR consistently performed well in our study. Our findings also demonstrated that optimal hyperparameters are useful for ML methods. After tuning hyperparameters in CV and in predicting genomic outcomes of younger individuals, the average improvement was 14.3% and 21.8% over those using default hyperparameters, respectively.

Conclusion

Our findings demonstrated that ML methods had better overall prediction performance than conventional genomic selection methods, and could be new options for genomic prediction. Among ML methods, Adaboost.R2_KRR consistently performed well in our study, and tuning hyperparameters is necessary for ML methods. The optimal hyperparameters depend on the character of traits, datasets etc.

Supplementary Information

The online version contains supplementary material available at 10.1186/s40104-022-00708-0.

Genome-Enabled Prediction Methods Based on Machine Learning. Methods in Molecular Biology (Clifton, N.J.) 2022;2467:189-218. [PMID: 35451777 DOI: 10.1007/978-1-0716-2205-6_7] [Citation(s) in RCA: 2] [Impact Index Per Article: 1.0] [Indexed: 10/18/2022]

Hamla S, Sacré PY, Derenne A, Derfoufi KM, Cowper B, Butré CI, Delobel A, Goormaghtigh E, Hubert P, Ziemons E. A new alternative tool to analyse glycosylation in pharmaceutical proteins based on infrared spectroscopy combined with nonlinear support vector regression. Analyst 2022;147:1086-1098. [PMID: 35174378 DOI: 10.1039/d1an00697e] [Citation(s) in RCA: 1] [Impact Index Per Article: 0.5] [Indexed: 01/07/2023]

Budhlakoti N, Kushwaha AK, Rai A, Chaturvedi KK, Kumar A, Pradhan AK, Kumar U, Kumar RR, Juliana P, Mishra DC, Kumar S. Genomic Selection: A Tool for Accelerating the Efficiency of Molecular Breeding for Development of Climate-Resilient Crops. Front Genet 2022;13:832153. [PMID: 35222548 PMCID: PMC8864149 DOI: 10.3389/fgene.2022.832153] [Citation(s) in RCA: 39] [Impact Index Per Article: 19.5] [Received: 12/09/2021] [Accepted: 01/10/2022] [Indexed: 12/17/2022]

Gardiner LJ, Krishna R. Bluster or Lustre: Can AI Improve Crops and Plant Health? Plants (Basel, Switzerland) 2021;10:plants10122707. [PMID: 34961177 PMCID: PMC8707749 DOI: 10.3390/plants10122707] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 11/03/2021] [Revised: 11/24/2021] [Accepted: 12/06/2021] [Indexed: 06/14/2023]

Predicting Heritability of Oil Palm Breeding Using Phenotypic Traits and Machine Learning. Sustainability 2021. [DOI: 10.3390/su132212613] [Citation(s) in RCA: 1] [Impact Index Per Article: 0.3] [Indexed: 01/01/2023]

Improving Biomass and Grain Yield Prediction of Wheat Genotypes on Sodic Soil Using Integrated High-Resolution Multispectral, Hyperspectral, 3D Point Cloud, and Machine Learning Techniques. Remote Sensing 2021. [DOI: 10.3390/rs13173482] [Citation(s) in RCA: 6] [Impact Index Per Article: 2.0] [Indexed: 12/19/2022]

Abstract

Sodic soils adversely affect crop production over extensive areas of rain-fed cropping worldwide, with particularly large areas in Australia. Crop phenotyping may assist in identifying cultivars tolerant to soil sodicity. However, studies to identify the most appropriate traits and reliable tools to assist crop phenotyping on sodic soil are limited. Hence, this study evaluated the ability of multispectral, hyperspectral, 3D point cloud, and machine learning techniques to improve estimation of biomass and grain yield of wheat genotypes grown on moderately sodic (MS) and highly sodic (HS) soil sites in northeastern Australia. While a number of studies have reported using different remote sensing approaches and crop traits to quantify crop growth, stress, and yield variation, few studies have combined these techniques with machine learning to improve estimation of genotypic biomass and yield, especially in constrained sodic soil environments. At close to flowering, unmanned aerial vehicle (UAV) and ground-based proximal sensing were used to obtain remote and/or proximal sensing data, while biomass yield and crop heights were also manually measured in the field. Grain yield was machine-harvested at maturity. UAV remote and/or proximal sensing-derived spectral vegetation indices (VIs), such as normalized difference vegetation index, optimized soil adjusted vegetation index, and enhanced vegetation index, and crop height corresponded closely to wheat genotypic biomass and grain yields. UAV multispectral VIs were more closely associated with biomass and grain yields than proximal sensing data. The red-green-blue (RGB) 3D point cloud technique was effective in determining crop height, which was slightly better correlated with genotypic biomass and grain yield than ground-measured crop height data. These remote sensing-derived crop traits (VIs and crop height) and wheat biomass and grain yields were further simulated using machine learning algorithms (multitarget linear regression, support vector machine regression, Gaussian process regression, and artificial neural network) with different kernels to improve estimation of biomass and grain yield. The artificial neural network predicted biomass yield (R² = 0.89; RMSE = 34.8 g/m² for the MS and R² = 0.82; RMSE = 26.4 g/m² for the HS site) and grain yield (R² = 0.88; RMSE = 11.8 g/m² for the MS and R² = 0.74; RMSE = 16.1 g/m² for the HS site) with slightly less error than the others. Wheat genotypes Mitch, Corack, Mace, Trojan, Lancer, and Bremer were identified as more tolerant to sodic soil constraints than Emu Rock, Janz, Flanker, and Gladius. The study improves our ability to select appropriate traits and techniques in accurate estimation of wheat genotypic biomass and grain yields on sodic soils. This will also assist farmers in identifying cultivars tolerant to sodic soil constraints.

Liu S, Xue P, Lu J, Lu W. Fitting analysis and research of measured data of SAW yarn tension sensor based on PSO-SVR model. Ultrasonics 2021;116:106511. [PMID: 34237494 DOI: 10.1016/j.ultras.2021.106511] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 10/06/2020] [Revised: 06/14/2021] [Accepted: 06/28/2021] [Indexed: 06/13/2023]

Srivastava S, Lopez BI, Kumar H, Jang M, Chai HH, Park W, Park JE, Lim D. Prediction of Hanwoo Cattle Phenotypes from Genotypes Using Machine Learning Methods. Animals (Basel) 2021;11:ani11072066. [PMID: 34359194 PMCID: PMC8300336 DOI: 10.3390/ani11072066] [Citation(s) in RCA: 3] [Impact Index Per Article: 1.0] [Received: 06/10/2021] [Revised: 07/06/2021] [Accepted: 07/09/2021] [Indexed: 11/16/2022]

Liang M, Chang T, An B, Duan X, Du L, Wang X, Miao J, Xu L, Gao X, Zhang L, Li J, Gao H. A Stacking Ensemble Learning Framework for Genomic Prediction. Front Genet 2021;12:600040. [PMID: 33747037 PMCID: PMC7969712 DOI: 10.3389/fgene.2021.600040] [Citation(s) in RCA: 16] [Impact Index Per Article: 5.3] [Received: 08/28/2020] [Accepted: 01/12/2021] [Indexed: 11/22/2022]

Piles M, Bergsma R, Gianola D, Gilbert H, Tusell L. Feature Selection Stability and Accuracy of Prediction Models for Genomic Prediction of Residual Feed Intake in Pigs Using Machine Learning. Front Genet 2021;12:611506. [PMID: 33692825 PMCID: PMC7938892 DOI: 10.3389/fgene.2021.611506] [Citation(s) in RCA: 12] [Impact Index Per Article: 4.0] [Received: 09/29/2020] [Accepted: 01/20/2021] [Indexed: 11/25/2022]

Abstract

Feature selection (FS, i.e., selection of a subset of predictor variables) is essential in high-dimensional datasets to prevent overfitting of prediction/classification models and reduce computation time and resources. In genomics, FS allows identifying relevant markers and designing low-density SNP chips to evaluate selection candidates. In this research, several univariate and multivariate FS algorithms combined with various parametric and non-parametric learners were applied to the prediction of feed efficiency in growing pigs from high-dimensional genomic data. The objective was to find the best combination of feature selector, SNP subset size, and learner leading to accurate and stable (i.e., less sensitive to changes in the training data) prediction models. Genomic best linear unbiased prediction (GBLUP) without SNP pre-selection was the benchmark. Three types of FS methods were implemented: (i) filter methods: univariate (univ.dtree, spearcor) or multivariate (cforest, mrmr), with random selection as benchmark; (ii) embedded methods: elastic net and least absolute shrinkage and selection operator (LASSO) regression; (iii) combination of filter and embedded methods. Ridge regression, support vector machine (SVM), and gradient boosting (GB) were applied after pre-selection performed with the filter methods. Data represented 5,708 individual records of residual feed intake to be predicted from the animal’s own genotype. Accuracy (stability of results) was measured as the median (interquartile range) of the Spearman correlation between observed and predicted data in a 10-fold cross-validation. The best prediction in terms of accuracy and stability was obtained with SVM and GB using 500 or more SNPs [0.28 (0.02) and 0.27 (0.04) for SVM and GB with 1,000 SNPs, respectively]. With larger subset sizes (1,000–1,500 SNPs), the filter method had no influence on prediction quality, which was similar to that attained with a random selection. 
With 50–250 SNPs, the FS method had a huge impact on prediction quality: it was very poor for tree-based methods combined with any learner, but good and similar to what was obtained with larger SNP subsets when spearcor or mrmr were implemented with or without embedded methods. Those filters also led to very stable results, suggesting their potential use for designing low-density SNP chips for genome-based evaluation of feed efficiency.
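The accuracy and stability measure described in this abstract (median and interquartile range of the Spearman correlation across cross-validation folds) can be sketched as follows. This is a minimal illustration, not the authors' code: the function names `ranks`, `spearman`, and `accuracy_and_stability` are hypothetical, and the quartile convention (the Python standard library default) is an assumption, since the abstract does not state one.

```python
from statistics import median, quantiles


def ranks(values):
    """Return 1-based ranks, giving tied values their average rank."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    result = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        # Extend j over the run of values tied with position i.
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        avg = (i + j) / 2 + 1
        for k in range(i, j + 1):
            result[order[k]] = avg
        i = j + 1
    return result


def spearman(x, y):
    """Spearman correlation, computed as the Pearson correlation of the ranks."""
    rx, ry = ranks(x), ranks(y)
    mx, my = sum(rx) / len(rx), sum(ry) / len(ry)
    cov = sum((a - mx) * (b - my) for a, b in zip(rx, ry))
    sx = sum((a - mx) ** 2 for a in rx) ** 0.5
    sy = sum((b - my) ** 2 for b in ry) ** 0.5
    return cov / (sx * sy)


def accuracy_and_stability(fold_correlations):
    """Median (accuracy) and interquartile range (stability) across folds."""
    q1, _, q3 = quantiles(fold_correlations, n=4)
    return median(fold_correlations), q3 - q1
```

In use, `spearman(observed, predicted)` would be computed once per test fold and the resulting list passed to `accuracy_and_stability`; a smaller interquartile range indicates results that are less sensitive to changes in the training data.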

Tong H, Nikoloski Z. Machine learning approaches for crop improvement: Leveraging phenotypic and genotypic big data. JOURNAL OF PLANT PHYSIOLOGY 2021;257:153354. [PMID: 33385619 DOI: 10.1016/j.jplph.2020.153354] [Citation(s) in RCA: 47] [Impact Index Per Article: 15.7] [Reference Citation Analysis] [Abstract] [Key Words] [MESH Headings] [Track Full Text] [Subscribe] [Scholar Register] [Received: 10/30/2020] [Revised: 12/14/2020] [Accepted: 12/15/2020] [Indexed: 05/07/2023]

van Dijk ADJ, Kootstra G, Kruijer W, de Ridder D. Machine learning in plant science and plant breeding. iScience 2021;24:101890. [PMID: 33364579 PMCID: PMC7750553 DOI: 10.1016/j.isci.2020.101890] [Citation(s) in RCA: 78] [Impact Index Per Article: 26.0] [Reference Citation Analysis] [Abstract] [Key Words] [Track Full Text] [Download PDF] [Figures] [Journal Information] [Subscribe] [Scholar Register] [Indexed: 12/16/2022] Open

Tusell L, Bergsma R, Gilbert H, Gianola D, Piles M. Machine Learning Prediction of Crossbred Pig Feed Efficiency and Growth Rate From Single Nucleotide Polymorphisms. Front Genet 2020;11:567818. [PMID: 33391339 PMCID: PMC7775539 DOI: 10.3389/fgene.2020.567818] [Citation(s) in RCA: 6] [Impact Index Per Article: 1.5] [Reference Citation Analysis] [Abstract] [Key Words] [Track Full Text] [Download PDF] [Figures] [Journal Information] [Subscribe] [Scholar Register] [Received: 05/30/2020] [Accepted: 11/17/2020] [Indexed: 11/24/2022] Open

Abstract

This research assessed the ability of a Support Vector Machine (SVM) regression model to predict pig crossbred (CB) performance from various sources of phenotypic and genotypic information for improving crossbreeding performance at reduced genotyping cost. Data consisted of average daily gain (ADG) and residual feed intake (RFI) records and genotypes of 5,708 purebred (PB) boars and 5,007 CB pigs. Prediction models were fitted using individual PB genotypes and phenotypes (trn.1); genotypes of PB sires and average of CB records per PB sire (trn.2); and individual CB genotypes and phenotypes (trn.3). The average of CB offspring records was the trait to be predicted from PB sire’s genotype using cross-validation. Single nucleotide polymorphisms (SNPs) were ranked based on the Spearman Rank correlation with the trait. Subsets with an increasing number (from 50 to 2,000) of the most informative SNPs were used as predictor variables in SVM. Prediction performance was the median of the Spearman correlation (SC, interquartile range in brackets) between observed and predicted phenotypes in the testing set. The best predictive performances were obtained when sire phenotypic information was included in trn.1 (0.22 [0.03] for RFI with SVM and 250 SNPs, and 0.12 [0.05] for ADG with SVM and 500–1,000 SNPs) or when trn.3 was used (0.29 [0.16] with Genomic best linear unbiased prediction (GBLUP) for RFI, and 0.15 [0.09] for ADG with just 50 SNPs). Animals from the last two generations were assigned to the testing set and remaining animals to the training set. Individual’s PB own phenotype and genotype improved the prediction ability of CB offspring of young animals for ADG but not for RFI. The highest SC was 0.34 [0.21] and 0.36 [0.22] for RFI and ADG, respectively, with SVM and 50 SNPs. Predictive performance using CB data for training leads to a SC of 0.34 [0.19] with GBLUP and 0.28 [0.18] with SVM and 250 SNPs for RFI and 0.34 [0.15] with SVM and 500 SNPs for ADG. 
Results suggest that PB candidates could be evaluated for CB performance with SVM and low-density SNP chip panels after collecting their own RFI or ADG performances or even earlier, after being genotyped using a reference population of CB animals.

Alves AAC, Espigolan R, Bresolin T, Costa RM, Fernandes Júnior GA, Ventura RV, Carvalheiro R, Albuquerque LG. Genome-enabled prediction of reproductive traits in Nellore cattle using parametric models and machine learning methods. Anim Genet 2020;52:32-46. [PMID: 33191532 DOI: 10.1111/age.13021] [Citation(s) in RCA: 6] [Impact Index Per Article: 1.5] [Reference Citation Analysis] [Abstract] [Key Words] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Accepted: 10/13/2020] [Indexed: 12/31/2022]

Abstract

This study aimed to assess the predictive ability of different machine learning (ML) methods for genomic prediction of reproductive traits in Nellore cattle. The studied traits were age at first calving (AFC), scrotal circumference (SC), early pregnancy (EP) and stayability (STAY). The numbers of genotyped animals and SNP markers available were 2342 and 321 419 (AFC), 4671 and 309 486 (SC), 2681 and 319 619 (STAY) and 3356 and 319 108 (EP). The predictive abilities of support vector regression (SVR), Bayesian regularized artificial neural network (BRANN) and random forest (RF) were compared with results obtained using parametric models (genomic best linear unbiased predictor, GBLUP, and Bayesian least absolute shrinkage and selection operator, BLASSO). A 5-fold cross-validation strategy was performed and the average prediction accuracy (ACC) and mean squared errors (MSE) were computed. The ACC was defined as the linear correlation between predicted and observed breeding values for categorical traits (EP and STAY) and as the correlation between predicted and observed adjusted phenotypes divided by the square root of the estimated heritability for continuous traits (AFC and SC). The average ACC varied from low to moderate depending on the trait and model under consideration, ranging between 0.56 and 0.63 (AFC), 0.27 and 0.36 (SC), 0.57 and 0.67 (EP), and 0.52 and 0.62 (STAY). SVR provided slightly better accuracies than the parametric models for all traits, increasing the prediction accuracy for AFC by around 6.3 and 4.8% compared with GBLUP and BLASSO, respectively. Likewise, there was an increase of 8.3% for SC, 4.5% for EP and 4.8% for STAY, comparing SVR with both GBLUP and BLASSO. In contrast, the RF and BRANN did not present competitive predictive ability compared with the parametric models. The results indicate that SVR is a suitable method for genome-enabled prediction of reproductive traits in Nellore cattle. Further, the optimal kernel bandwidth parameter in the SVR model was trait-dependent; thus, fine-tuning this hyper-parameter in the training phase is crucial.
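The two accuracy definitions given in this abstract can be written compactly as below. The notation is an assumption introduced here for illustration and does not come from the abstract: $r(\cdot,\cdot)$ is the linear correlation, $\hat{y}$ and $y^{*}$ are the predicted and observed adjusted phenotypes, $\hat{h}^{2}$ is the estimated heritability, and $\widehat{\mathrm{BV}}$ and $\mathrm{BV}$ are the predicted and observed breeding values.

```latex
\mathrm{ACC}_{\text{continuous (AFC, SC)}} = \frac{r(\hat{y},\, y^{*})}{\sqrt{\hat{h}^{2}}},
\qquad
\mathrm{ACC}_{\text{categorical (EP, STAY)}} = r\bigl(\widehat{\mathrm{BV}},\, \mathrm{BV}\bigr)
```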

Liang M, Miao J, Wang X, Chang T, An B, Duan X, Xu L, Gao X, Zhang L, Li J, Gao H. Application of ensemble learning to genomic selection in chinese simmental beef cattle. J Anim Breed Genet 2020;138:291-299. [PMID: 33089920 DOI: 10.1111/jbg.12514] [Citation(s) in RCA: 5] [Impact Index Per Article: 1.3] [Reference Citation Analysis] [Abstract] [Key Words] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Received: 06/29/2020] [Revised: 09/03/2020] [Accepted: 10/01/2020] [Indexed: 11/30/2022]

Azodi CB, Bolger E, McCarren A, Roantree M, de Los Campos G, Shiu SH. Benchmarking Parametric and Machine Learning Models for Genomic Prediction of Complex Traits. G3 (BETHESDA, MD.) 2019;9:3691-3702. [PMID: 31533955 PMCID: PMC6829122 DOI: 10.1534/g3.119.400498] [Citation(s) in RCA: 75] [Impact Index Per Article: 15.0] [Reference Citation Analysis] [Abstract] [Key Words] [MESH Headings] [Grants] [Track Full Text] [Download PDF] [Figures] [Subscribe] [Scholar Register] [Received: 07/03/2019] [Accepted: 09/09/2019] [Indexed: 12/21/2022]

Olatoye MO, Hu Z, Aikpokpodion PO. Epistasis Detection and Modeling for Genomic Selection in Cowpea (Vigna unguiculata L. Walp.). Front Genet 2019;10:677. [PMID: 31417604 PMCID: PMC6682672 DOI: 10.3389/fgene.2019.00677] [Citation(s) in RCA: 8] [Impact Index Per Article: 1.6] [Reference Citation Analysis] [Abstract] [Key Words] [Track Full Text] [Download PDF] [Figures] [Journal Information] [Subscribe] [Scholar Register] [Received: 03/22/2019] [Accepted: 06/27/2019] [Indexed: 12/24/2022] Open

Cherlin S, Plant D, Taylor JC, Colombo M, Spiliopoulou A, Tzanis E, Morgan AW, Barnes MR, McKeigue P, Barrett JH, Pitzalis C, Barton A, Consortium MATURA, Cordell HJ. Prediction of treatment response in rheumatoid arthritis patients using genome-wide SNP data. Genet Epidemiol 2018;42:754-771. [PMID: 30311271 PMCID: PMC6334178 DOI: 10.1002/gepi.22159] [Citation(s) in RCA: 16] [Impact Index Per Article: 2.7] [Reference Citation Analysis] [Abstract] [Key Words] [MESH Headings] [Grants] [Track Full Text] [Download PDF] [Figures] [Journal Information] [Subscribe] [Scholar Register] [Received: 03/27/2018] [Revised: 07/06/2018] [Accepted: 07/28/2018] [Indexed: 01/13/2023]

Affiliation(s)

Svetlana Cherlin Institute of Genetic Medicine, Newcastle University, Newcastle upon Tyne, UK
Darren Plant NIHR Manchester Biomedical Research Centre, Manchester University NHS Foundation Trust, Manchester Academic Health Science Centre, Manchester, UK
John C. Taylor Leeds Institute of Cancer and Pathology, University of Leeds, Leeds, UK; NIHR Leeds Biomedical Research Centre, Leeds Teaching Hospitals NHS Trust, Leeds, UK
Marco Colombo Centre for Population Health Sciences, Usher Institute of Population Health Sciences and Informatics, University of Edinburgh, Edinburgh, UK
Athina Spiliopoulou Centre for Population Health Sciences, Usher Institute of Population Health Sciences and Informatics, University of Edinburgh, Edinburgh, UK
Evan Tzanis Centre for Experimental Medicine and Rheumatology, William Harvey Research Institute, Barts and the London School of Medicine and Dentistry, Queen Mary University of London and Barts Health NHS Trust, London, UK
Ann W. Morgan NIHR Leeds Biomedical Research Centre, Leeds Teaching Hospitals NHS Trust, Leeds, UK; Leeds Institute of Rheumatic and Musculoskeletal Medicine, University of Leeds, Leeds, UK
Michael R. Barnes Centre for Experimental Medicine and Rheumatology, William Harvey Research Institute, Barts and the London School of Medicine and Dentistry, Queen Mary University of London and Barts Health NHS Trust, London, UK
Paul McKeigue Centre for Population Health Sciences, Usher Institute of Population Health Sciences and Informatics, University of Edinburgh, Edinburgh, UK
Jennifer H. Barrett Leeds Institute of Cancer and Pathology, University of Leeds, Leeds, UK; NIHR Leeds Biomedical Research Centre, Leeds Teaching Hospitals NHS Trust, Leeds, UK
Costantino Pitzalis Centre for Experimental Medicine and Rheumatology, William Harvey Research Institute, Barts and the London School of Medicine and Dentistry, Queen Mary University of London and Barts Health NHS Trust, London, UK
Anne Barton NIHR Manchester Biomedical Research Centre, Manchester University NHS Foundation Trust, Manchester Academic Health Science Centre, Manchester, UK; Arthritis Research UK Centre for Genetics and Genomics, Centre for Musculoskeletal Research, The University of Manchester, Manchester, UK
MATURA Consortium Institute of Genetic Medicine, Newcastle University, Newcastle upon Tyne, UK; NIHR Manchester Biomedical Research Centre, Manchester University NHS Foundation Trust, Manchester Academic Health Science Centre, Manchester, UK; Leeds Institute of Cancer and Pathology, University of Leeds, Leeds, UK; NIHR Leeds Biomedical Research Centre, Leeds Teaching Hospitals NHS Trust, Leeds, UK; Centre for Population Health Sciences, Usher Institute of Population Health Sciences and Informatics, University of Edinburgh, Edinburgh, UK; Centre for Experimental Medicine and Rheumatology, William Harvey Research Institute, Barts and the London School of Medicine and Dentistry, Queen Mary University of London and Barts Health NHS Trust, London, UK; Leeds Institute of Rheumatic and Musculoskeletal Medicine, University of Leeds, Leeds, UK; Arthritis Research UK Centre for Genetics and Genomics, Centre for Musculoskeletal Research, The University of Manchester, Manchester, UK
Heather J. Cordell NIHR Manchester Biomedical Research Centre, Manchester University NHS Foundation Trust, Manchester Academic Health Science Centre, Manchester, UK

Gianola D, Cecchinato A, Naya H, Schön CC. Prediction of Complex Traits: Robust Alternatives to Best Linear Unbiased Prediction. Front Genet 2018;9:195. [PMID: 29951082 PMCID: PMC6008589 DOI: 10.3389/fgene.2018.00195] [Citation(s) in RCA: 12] [Impact Index Per Article: 2.0] [Reference Citation Analysis] [Abstract] [Key Words] [Track Full Text] [Download PDF] [Figures] [Journal Information] [Subscribe] [Scholar Register] [Received: 02/03/2018] [Accepted: 05/14/2018] [Indexed: 12/05/2022] Open

Watanabe T, Otowa T, Abe O, Kuwabara H, Aoki Y, Natsubori T, Takao H, Kakiuchi C, Kondo K, Ikeda M, Iwata N, Kasai K, Sasaki T, Yamasue H. Oxytocin receptor gene variations predict neural and behavioral response to oxytocin in autism. Soc Cogn Affect Neurosci 2017;12:496-506. [PMID: 27798253 PMCID: PMC5390696 DOI: 10.1093/scan/nsw150] [Citation(s) in RCA: 30] [Impact Index Per Article: 4.3] [Reference Citation Analysis] [Abstract] [Key Words] [Track Full Text] [Download PDF] [Figures] [Journal Information] [Subscribe] [Scholar Register] [Received: 04/28/2016] [Accepted: 10/04/2016] [Indexed: 12/27/2022] Open

Affiliation(s)

Takamitsu Watanabe Institute of Cognitive Neuroscience, University College London, 17 Queen Square, London WC1N 3AZ, UK
Takeshi Otowa Department of Neuropsychiatry Graduate School of Medicine, The University of Tokyo, 7-3-1 Hongo, Bunkyo-ku, Tokyo 113-8655, Japan
Osamu Abe Department of Radiology, Nihon University School of Medicine, 30-1 Oyaguchikami-cho, Itabashi-ku, Tokyo 173-8610, Japan
Hitoshi Kuwabara Disability Services Office
Yuta Aoki Department of Neuropsychiatry Graduate School of Medicine, The University of Tokyo, 7-3-1 Hongo, Bunkyo-ku, Tokyo 113-8655, Japan
Tatsunobu Natsubori Department of Neuropsychiatry Graduate School of Medicine, The University of Tokyo, 7-3-1 Hongo, Bunkyo-ku, Tokyo 113-8655, Japan
Hidemasa Takao Department of Radiology Graduate School of Medicine, The University of Tokyo, 7-3-1 Hongo, Bunkyo-ku, Tokyo 113-8655, Japan
Chihiro Kakiuchi Department of Neuropsychiatry Graduate School of Medicine, The University of Tokyo, 7-3-1 Hongo, Bunkyo-ku, Tokyo 113-8655, Japan
Kenji Kondo Department of Psychiatry Fujita Health University School of Medicine, Aichi 470-1192, Japan
Masashi Ikeda Department of Psychiatry Fujita Health University School of Medicine, Aichi 470-1192, Japan
Nakao Iwata Department of Psychiatry Fujita Health University School of Medicine, Aichi 470-1192, Japan
Kiyoto Kasai Department of Neuropsychiatry Graduate School of Medicine, The University of Tokyo, 7-3-1 Hongo, Bunkyo-ku, Tokyo 113-8655, Japan
Tsukasa Sasaki Department of Physical and Health Education Graduate School of Education, The University of Tokyo, 7-3-1 Hongo, Bunkyo-ku, Tokyo 113-8655, Japan
Hidenori Yamasue Department of Neuropsychiatry Graduate School of Medicine, The University of Tokyo, 7-3-1 Hongo, Bunkyo-ku, Tokyo 113-8655, Japan; Department of Psychiatry Hamamatsu University School of Medicine, 1-20-1 Handayama, Higashiku, Hamamatsu City 431-3192, Japan

Genome-Wide Association Studies with a Genomic Relationship Matrix: A Case Study with Wheat and Arabidopsis. G3-GENES GENOMES GENETICS 2016;6:3241-3256. [PMID: 27520956 PMCID: PMC5068945 DOI: 10.1534/g3.116.034256] [Citation(s) in RCA: 18] [Impact Index Per Article: 2.3] [Reference Citation Analysis] [Abstract] [Key Words] [Track Full Text] [Download PDF] [Figures] [Subscribe] [Scholar Register] [Indexed: 12/02/2022]

Gong P, Nan X, Barker ND, Boyd RE, Chen Y, Wilkins DE, Johnson DR, Suedel BC, Perkins EJ. Predicting chemical bioavailability using microarray gene expression data and regression modeling: A tale of three explosive compounds. BMC Genomics 2016;17:205. [PMID: 26956490 PMCID: PMC4784335 DOI: 10.1186/s12864-016-2541-5] [Citation(s) in RCA: 1] [Impact Index Per Article: 0.1] [Reference Citation Analysis] [Abstract] [Track Full Text] [Download PDF] [Figures] [Journal Information] [Subscribe] [Scholar Register] [Received: 03/23/2015] [Accepted: 02/25/2016] [Indexed: 11/10/2022] Open

Zhao W, Tao T, Zio E. System reliability prediction by support vector regression with analytic selection and genetic algorithm parameters selection. Appl Soft Comput 2015. [DOI: 10.1016/j.asoc.2015.02.026] [Citation(s) in RCA: 24] [Impact Index Per Article: 2.7] [Reference Citation Analysis] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Indexed: 10/23/2022]

Li L, Long Y, Zhang L, Dalton-Morgan J, Batley J, Yu L, Meng J, Li M. Genome wide analysis of flowering time trait in multiple environments via high-throughput genotyping technique in Brassica napus L. PLoS One 2015;10:e0119425. [PMID: 25790019 PMCID: PMC4366152 DOI: 10.1371/journal.pone.0119425] [Citation(s) in RCA: 23] [Impact Index Per Article: 2.6] [Reference Citation Analysis] [Abstract] [MESH Headings] [Track Full Text] [Download PDF] [Figures] [Journal Information] [Subscribe] [Scholar Register] [Received: 08/14/2014] [Accepted: 01/13/2015] [Indexed: 11/19/2022] Open

Onogi A, Ideta O, Inoshita Y, Ebana K, Yoshioka T, Yamasaki M, Iwata H. Exploring the areas of applicability of whole-genome prediction methods for Asian rice (Oryza sativa L.). TAG. THEORETICAL AND APPLIED GENETICS. THEORETISCHE UND ANGEWANDTE GENETIK 2015;128:41-53. [PMID: 25341369 DOI: 10.1007/s00122-014-2411-y] [Citation(s) in RCA: 33] [Impact Index Per Article: 3.7] [Reference Citation Analysis] [Abstract] [MESH Headings] [Track Full Text] [Subscribe] [Scholar Register] [Received: 07/09/2014] [Accepted: 10/03/2014] [Indexed: 05/25/2023]

Morota G, Gianola D. Kernel-based whole-genome prediction of complex traits: a review. Front Genet 2014;5:363. [PMID: 25360145 PMCID: PMC4199321 DOI: 10.3389/fgene.2014.00363] [Citation(s) in RCA: 99] [Impact Index Per Article: 9.9] [Reference Citation Analysis] [Abstract] [Key Words] [Track Full Text] [Download PDF] [Figures] [Journal Information] [Subscribe] [Scholar Register] [Received: 08/09/2014] [Accepted: 09/29/2014] [Indexed: 01/18/2023] Open

González-Recio O, Rosa GJ, Gianola D. Machine learning methods and predictive ability metrics for genome-wide prediction of complex traits. Livest Sci 2014. [DOI: 10.1016/j.livsci.2014.05.036] [Citation(s) in RCA: 75] [Impact Index Per Article: 7.5] [Reference Citation Analysis] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Indexed: 12/01/2022]

Howard R, Carriquiry AL, Beavis WD. Parametric and nonparametric statistical methods for genomic selection of traits with additive and epistatic genetic architectures. G3 (BETHESDA, MD.) 2014;4:1027-46. [PMID: 24727289 PMCID: PMC4065247 DOI: 10.1534/g3.114.010298] [Citation(s) in RCA: 89] [Impact Index Per Article: 8.9] [Reference Citation Analysis] [Abstract] [Key Words] [MESH Headings] [Track Full Text] [Download PDF] [Figures] [Subscribe] [Scholar Register] [Received: 01/08/2014] [Accepted: 03/18/2014] [Indexed: 01/12/2023]

Morota G, Boddhireddy P, Vukasinovic N, Gianola D, Denise S. Kernel-based variance component estimation and whole-genome prediction of pre-corrected phenotypes and progeny tests for dairy cow health traits. Front Genet 2014;5:56. [PMID: 24715901 PMCID: PMC3970026 DOI: 10.3389/fgene.2014.00056] [Citation(s) in RCA: 21] [Impact Index Per Article: 2.1] [Reference Citation Analysis] [Abstract] [Key Words] [Track Full Text] [Download PDF] [Figures] [Journal Information] [Subscribe] [Scholar Register] [Received: 11/14/2013] [Accepted: 03/04/2014] [Indexed: 11/13/2022] Open

Abstract

Prediction of complex trait phenotypes in the presence of unknown gene action is an ongoing challenge in animals, plants, and humans. Development of flexible predictive models that perform well irrespective of genetic and environmental architectures is desirable. Methods that can address non-additive variation in a non-explicit manner are gaining attention for this purpose and, in particular, semi-parametric kernel-based methods have been applied to diverse datasets, mostly providing encouraging results. On the other hand, the gains obtained from these methods have been smaller when smoothed values such as estimated breeding value (EBV) have been used as response variables. However, less emphasis has been placed on the choice of phenotypes to be used in kernel-based whole-genome prediction. This study aimed to evaluate differences between semi-parametric and parametric approaches using two types of response variables and molecular markers as inputs. Pre-corrected phenotypes (PCP) and EBV obtained for dairy cow health traits were used for this comparison. We observed that non-additive genetic variances were major contributors to total genetic variances in PCP, whereas additivity was the largest contributor to variability of EBV, as expected. Within the kernels evaluated, non-parametric methods yielded slightly better predictive performance across traits relative to their additive counterparts regardless of the type of response variable used. This reinforces the view that non-parametric kernels aiming to capture non-linear relationships between a panel of SNPs and phenotypes are appealing for complex trait prediction. However, like past studies, the gain in predictive correlation was not large for either PCP or EBV. We conclude that capturing non-additive genetic variation, especially epistatic variation, in a cross-validation framework remains a significant challenge even when it is important, as seems to be the case for health traits in dairy cows.

Morota G, Koyama M, Rosa GJM, Weigel KA, Gianola D. Predicting complex traits using a diffusion kernel on genetic markers with an application to dairy cattle and wheat data. Genet Sel Evol 2013;45:17. [PMID: 23763755 PMCID: PMC3706293 DOI: 10.1186/1297-9686-45-17] [Citation(s) in RCA: 28] [Impact Index Per Article: 2.5] [Reference Citation Analysis] [Abstract] [Track Full Text] [Download PDF] [Figures] [Journal Information] [Subscribe] [Scholar Register] [Received: 12/06/2012] [Accepted: 05/31/2013] [Indexed: 01/09/2023] Open

Abstract

Background

Arguably, genotypes and phenotypes may be linked in functional forms that are not well addressed by the linear additive models that are standard in quantitative genetics. Therefore, developing statistical learning models for predicting phenotypic values from all available molecular information that are capable of capturing complex genetic network architectures is of great importance. Bayesian kernel ridge regression is a non-parametric prediction model proposed for this purpose. Its essence is to create a spatial distance-based relationship matrix called a kernel. Although the set of all single nucleotide polymorphism genotype configurations on which a model is built is finite, past research has mainly used a Gaussian kernel.

Results

We sought to investigate the performance of a diffusion kernel, which was specifically developed to model discrete marker inputs, using Holstein cattle and wheat data. This kernel can be viewed as a discretization of the Gaussian kernel. The predictive ability of the diffusion kernel was similar to that of non-spatial distance-based additive genomic relationship kernels in the Holstein data, but outperformed the latter in the wheat data. However, the difference in performance between the diffusion and Gaussian kernels was negligible.

Conclusions

It is concluded that the ability of a diffusion kernel to capture the total genetic variance is not better than that of a Gaussian kernel, at least for these data. Although the diffusion kernel as a choice of basis function may have potential for use in whole-genome prediction, our results imply that embedding genetic markers into a non-Euclidean metric space has very small impact on prediction. Our results suggest that use of the black box Gaussian kernel is justified, given its connection to the diffusion kernel and its similar predictive performance.
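As a rough illustration of the distance-based Gaussian kernel discussed in this abstract, the sketch below builds a kernel matrix from marker genotypes. It is written under stated assumptions and is not the authors' implementation: the function name `gaussian_kernel`, the 0/1/2 allele-count coding, the unscaled squared Euclidean distance, and the bandwidth parameter `theta` are all hypothetical choices made for the example.

```python
import math


def gaussian_kernel(genotypes, theta):
    """Return K with K[i][j] = exp(-theta * d_ij^2).

    genotypes: list of equal-length marker vectors (assumed coded 0/1/2).
    theta: positive bandwidth parameter (hypothetical name; must be tuned).
    d_ij^2: squared Euclidean distance between individuals i and j.
    """
    n = len(genotypes)
    kernel = [[0.0] * n for _ in range(n)]
    for i in range(n):
        for j in range(i, n):
            d2 = sum((a - b) ** 2 for a, b in zip(genotypes[i], genotypes[j]))
            # The kernel is symmetric, so fill both triangles at once.
            kernel[i][j] = kernel[j][i] = math.exp(-theta * d2)
    return kernel
```

Each diagonal entry equals 1 because every individual is at distance zero from itself, and off-diagonal entries shrink toward 0 as genotypes diverge or as `theta` grows.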

de Los Campos G, Hickey JM, Pong-Wong R, Daetwyler HD, Calus MPL. Whole-genome regression and prediction methods applied to plant and animal breeding. Genetics 2013;193:327-45. [PMID: 22745228 PMCID: PMC3567727 DOI: 10.1534/genetics.112.143313] [Citation(s) in RCA: 489] [Impact Index Per Article: 44.5] [Reference Citation Analysis] [Abstract] [Key Words] [MESH Headings] [Grants] [Track Full Text] [Download PDF] [Figures] [Journal Information] [Subscribe] [Scholar Register] [Received: 05/18/2012] [Accepted: 06/11/2012] [Indexed: 11/18/2022] Open