1
Campler MR, Cheng TY, Lee CW, Hofacre CL, Lossie G, Silva GS, El-Gazzar MM, Arruda AG. Investigating the uses of machine learning algorithms to inform risk factor analyses: The example of avian infectious bronchitis virus (IBV) in broiler chickens. Res Vet Sci 2024; 171:105201. [PMID: 38442531 DOI: 10.1016/j.rvsc.2024.105201] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 03/21/2023] [Revised: 11/16/2023] [Accepted: 02/24/2024] [Indexed: 03/07/2024]
Abstract
Infectious bronchitis virus (IBV) is a contagious coronavirus causing respiratory and urogenital disease in chickens and is responsible for significant economic losses for both the broiler and table egg layer industries. Despite IBV being regularly monitored using standard epidemiologic surveillance practices, knowledge and evidence of risk factors associated with IBV transmission remain limited. The study objective was to compare risk factor modeling outcomes between a traditional stepwise variable selection approach and a machine learning-based random forest Boruta algorithm using routinely collected IBV antibody titer data from broiler flocks. IBV antibody sampling events (n = 1111) from 166 broiler sites between 2016 and 2021 were accessed. Ninety-two geospatial-related and poultry-density variables were obtained using a geographic information system and data sets from publicly available sources. Seventeen and 27 candidate variables were screened as potentially associated with elevated IBV antibody titers by the manual selection and the machine learning algorithm, respectively. Selected variables from both methods were further investigated by constructing multivariable generalized mixed logistic regression models. Six variables were shortlisted by both screening methods, which included year, distance to urban areas, main roads, landcover, density of layer sites and year; however, the final models from the two approaches shared only year as an important predictor. Despite limited significance of clinical outcomes, this work showcases the potential of a novel explorative modeling approach in combination with often unutilized resources such as publicly available geospatial data, surveillance health data and machine learning as potential supplementary tools to investigate risk factors related to infectious diseases.
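The screening idea behind Boruta can be sketched roughly as follows: each candidate variable is compared against permuted ("shadow") copies, and only variables that repeatedly beat the best shadow are kept. This is a minimal illustration, not the study's implementation. Boruta uses random forest importance and a statistical test over iterations, whereas this sketch swaps in absolute Pearson correlation as the importance score and a simple hit count. The function names, the toy data, and the parameters `n_iter` and `seed` are all assumptions made for the example.

```python
import random
from math import sqrt


def abs_corr(xs, ys):
    """Absolute Pearson correlation, used here as a stand-in importance score (assumption)."""
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    sxy = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    sxx = sum((x - mx) ** 2 for x in xs)
    syy = sum((y - my) ** 2 for y in ys)
    if sxx == 0 or syy == 0:
        return 0.0
    return abs(sxy) / sqrt(sxx * syy)


def shadow_screen(features, outcome, n_iter=20, seed=0):
    """Count how often each feature scores above the best permuted (shadow) copy."""
    rng = random.Random(seed)
    hits = {name: 0 for name in features}
    for _ in range(n_iter):
        shadow_scores = []
        for values in features.values():
            shadow = values[:]   # copy, then shuffle to break any link to the outcome
            rng.shuffle(shadow)
            shadow_scores.append(abs_corr(shadow, outcome))
        threshold = max(shadow_scores)
        for name, values in features.items():
            if abs_corr(values, outcome) > threshold:
                hits[name] += 1
    return hits


# Hypothetical toy data: one informative variable and one uninformative variable
outcome = [float(i) for i in range(50)]
features = {
    "signal": [float(i) for i in range(50)],                    # identical to the outcome
    "noise": [1.0 if i % 2 == 0 else -1.0 for i in range(50)],  # alternating, nearly uncorrelated
}
hits = shadow_screen(features, outcome)
```

In this toy setting the informative variable beats the shadows in every iteration, while the uninformative one does so only occasionally; a full implementation would turn the hit counts into an accept or reject decision with a statistical test.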
Affiliation(s)
- Magnus R Campler
  - Department of Veterinary Preventive Medicine, The Ohio State University, OH 43210, USA
- Ting-Yu Cheng
  - Department of Veterinary Preventive Medicine, The Ohio State University, OH 43210, USA
- Chang-Won Lee
  - Exotic and Emerging Avian Diseases, Southeast Poultry Research Laboratory, National Poultry Research Center, Agricultural Research Service, U.S. Department of Agriculture, Athens, GA 30605, USA
- Geoffrey Lossie
  - Department of Comparative Pathobiology and Animal Disease Diagnostic Laboratory, College of Veterinary Medicine, Purdue University, IN 47907, USA
- Gustavo S Silva
  - Department of Comparative Pathobiology and Animal Disease Diagnostic Laboratory, College of Veterinary Medicine, Purdue University, IN 47907, USA
- Mohamed M El-Gazzar
  - Department of Veterinary Diagnostic and Production Animal Medicine, College of Veterinary Medicine, Iowa State University, IA 50011, USA
- Andréia G Arruda
  - Department of Veterinary Preventive Medicine, The Ohio State University, OH 43210, USA
2
Bavykina M, Kostina N, Lee CR, Schafleitner R, Bishop-von Wettberg E, Nuzhdin SV, Samsonova M, Gursky V, Kozlov K. Modeling of Flowering Time in Vigna radiata with Artificial Image Objects, Convolutional Neural Network and Random Forest. PLANTS (BASEL, SWITZERLAND) 2022; 11:3327. [PMID: 36501364 PMCID: PMC9738219 DOI: 10.3390/plants11233327] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 11/01/2022] [Revised: 11/22/2022] [Accepted: 11/28/2022] [Indexed: 06/17/2023]
Abstract
Flowering time is an important target for breeders in developing new varieties adapted to changing conditions. In this work, a new approach is proposed in which the SNP markers influencing time to flowering in mung bean are selected as important features in a random forest model. The genotypic and weather data are encoded in artificial image objects, and a model for flowering time prediction is constructed as a convolutional neural network. The model uses weather data for only a limited time period of 5 days before and 20 days after planting and is capable of predicting the time to flowering with high accuracy. The most important factors for model solution were identified using saliency maps and a Score-CAM method. Our approach can help breeding programs harness genotypic and phenotypic diversity to more effectively produce varieties with a desired flowering time.
Affiliation(s)
- Maria Bavykina
  - Mathematical Biology and Bioinformatics Lab, Peter the Great St. Petersburg Polytechnic University, 195251 Saint Petersburg, Russia
- Nadezhda Kostina
  - Mathematical Biology and Bioinformatics Lab, Peter the Great St. Petersburg Polytechnic University, 195251 Saint Petersburg, Russia
- Cheng-Ruei Lee
  - Institute of Ecology and Evolutionary Biology, National Taiwan University, Taipei 106319, Taiwan
- Eric Bishop-von Wettberg
  - Department of Plant and Soil Science, Gund Institute for the Environment, University of Vermont, Burlington, VT 05405, USA
- Sergey V. Nuzhdin
  - Mathematical Biology and Bioinformatics Lab, Peter the Great St. Petersburg Polytechnic University, 195251 Saint Petersburg, Russia
  - Program Molecular and Computation Biology, University of California, Los Angeles, CA 90095, USA
- Maria Samsonova
  - Mathematical Biology and Bioinformatics Lab, Peter the Great St. Petersburg Polytechnic University, 195251 Saint Petersburg, Russia
- Vitaly Gursky
  - Theoretical Department, Ioffe Institute, 194021 Saint Petersburg, Russia
- Konstantin Kozlov
  - Mathematical Biology and Bioinformatics Lab, Peter the Great St. Petersburg Polytechnic University, 195251 Saint Petersburg, Russia
3
Rajavel A, Klees S, Hui Y, Schmitt AO, Gültas M. Deciphering the Molecular Mechanism Underlying African Animal Trypanosomiasis by Means of the 1000 Bull Genomes Project Genomic Dataset. BIOLOGY 2022; 11:742. [PMID: 35625470 PMCID: PMC9138820 DOI: 10.3390/biology11050742] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 02/16/2022] [Revised: 05/05/2022] [Accepted: 05/10/2022] [Indexed: 11/16/2022]
Abstract
Simple Summary: Climate change is increasing the risk of spreading vector-borne diseases such as African Animal Trypanosomiasis (AAT), which is causing major economic losses, especially in sub-Saharan African countries. Mainly considering this disease, we have investigated transcriptomic and genomic data from two cattle breeds, namely Boran and N’Dama, where the former is known for its susceptibility and the latter one for its tolerance to the AAT. Despite the rich literature on this disease, there is still a need to investigate underlying genetic mechanisms to decipher the complex interplay of regulatory SNPs (rSNPs), their corresponding gene expression profiles and the downstream effectors associated with the AAT disease. The findings of this study complement our previous results, which mainly involve the upstream events, including transcription factors (TFs) and their co-operations as well as master regulators. Moreover, our investigation of significant rSNPs and effectors found in the liver, spleen and lymph node tissues of both cattle breeds could enhance the understanding of distinct mechanisms leading to either resistance or susceptibility of cattle breeds.
Abstract: African Animal Trypanosomiasis (AAT) is a neglected tropical disease and spreads by the vector tsetse fly, which carries the infectious Trypanosoma sp. in their saliva. Particularly, this parasitic disease affects the health of livestock, thereby imposing economic constraints on farmers, costing billions of dollars every year, especially in sub-Saharan African countries. Mainly considering the AAT disease as a multistage progression process, we previously performed upstream analysis to identify transcription factors (TFs), their co-operations, over-represented pathways and master regulators. However, downstream analysis, including effectors, corresponding gene expression profiles and their association with the regulatory SNPs (rSNPs), has not yet been established. Therefore, in this study, we aim to investigate the complex interplay of rSNPs, corresponding gene expression and downstream effectors with regard to the AAT disease progression based on two cattle breeds: trypanosusceptible Boran and trypanotolerant N’Dama. Our findings provide mechanistic insights into the effectors involved in the regulation of several signal transduction pathways, thereby differentiating the molecular mechanism with regard to the immune responses of the cattle breeds. The effectors and their associated genes (especially MAPKAPK5, CSK, DOK2, RAC1 and DNMT1) could be promising drug candidates as they orchestrate various downstream regulatory cascades in both cattle breeds.
Affiliation(s)
- Abirami Rajavel
  - Breeding Informatics Group, Department of Animal Sciences, Georg-August University, Margarethe von Wrangell-Weg 7, 37075 Göttingen, Germany
  - Center for Integrated Breeding Research (CiBreed), Georg-August University, Carl-Sprengel-Weg 1, 37075 Göttingen, Germany
  - Correspondence: (A.R.); (M.G.)
- Selina Klees
  - Breeding Informatics Group, Department of Animal Sciences, Georg-August University, Margarethe von Wrangell-Weg 7, 37075 Göttingen, Germany
  - Center for Integrated Breeding Research (CiBreed), Georg-August University, Carl-Sprengel-Weg 1, 37075 Göttingen, Germany
- Yuehan Hui
  - Breeding Informatics Group, Department of Animal Sciences, Georg-August University, Margarethe von Wrangell-Weg 7, 37075 Göttingen, Germany
- Armin Otto Schmitt
  - Breeding Informatics Group, Department of Animal Sciences, Georg-August University, Margarethe von Wrangell-Weg 7, 37075 Göttingen, Germany
  - Center for Integrated Breeding Research (CiBreed), Georg-August University, Carl-Sprengel-Weg 1, 37075 Göttingen, Germany
- Mehmet Gültas
  - Center for Integrated Breeding Research (CiBreed), Georg-August University, Carl-Sprengel-Weg 1, 37075 Göttingen, Germany
  - Faculty of Agriculture, South Westphalia University of Applied Sciences, Lübecker Ring 2, 59494 Soest, Germany
  - Correspondence: (A.R.); (M.G.)
4
Deciphering Pleiotropic Signatures of Regulatory SNPs in Zea mays L. Using Multi-Omics Data and Machine Learning Algorithms. Int J Mol Sci 2022; 23:5121. [PMID: 35563516 PMCID: PMC9100765 DOI: 10.3390/ijms23095121] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 03/30/2022] [Revised: 04/28/2022] [Accepted: 05/02/2022] [Indexed: 01/25/2023] Open
Abstract
Maize is one of the most widely grown cereals in the world. However, to address the challenges in maize breeding arising from climatic anomalies, there is a need for developing novel strategies to harness the power of multi-omics technologies. In this regard, pleiotropy is an important genetic phenomenon that can be utilized to simultaneously enhance multiple agronomic phenotypes in maize. In addition to pleiotropy, another aspect is the consideration of the regulatory SNPs (rSNPs) that are likely to have causal effects in phenotypic development. By incorporating both aspects in our study, we performed a systematic analysis based on multi-omics data to reveal the novel pleiotropic signatures of rSNPs in a global maize population. For this purpose, we first applied Random Forests and then Markov clustering algorithms to decipher the pleiotropic signatures of rSNPs, based on which hierarchical network models are constructed to elucidate the complex interplay among transcription factors, rSNPs, and phenotypes. The results obtained in our study could help to understand the genetic programs orchestrating multiple phenotypes and thus could provide novel breeding targets for the simultaneous improvement of several agronomic traits.
5
Comparative Investigation of Gene Regulatory Processes Underlying Avian Influenza Viruses in Chicken and Duck. BIOLOGY 2022; 11:219. [PMID: 35205087 PMCID: PMC8868632 DOI: 10.3390/biology11020219] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 11/01/2021] [Revised: 01/07/2022] [Accepted: 01/25/2022] [Indexed: 11/30/2022]
Abstract
Simple Summary: Avian influenza poses a great risk to gallinaceous poultry, while mallard ducks can withstand most virus strains. To date, the mechanisms underlying the susceptibility of chickens and the effective immune response of ducks have not been completely understood. In this study, our aim is to investigate the transcriptional gene regulation governing the expression of important avian-influenza-induced genes and to reveal the master regulators that stimulate an effective immune response after virus infection in ducks but are dysfunctional in chickens.
Abstract: The avian influenza virus (AIV) mainly affects birds and not only causes animals’ deaths, but also poses a great risk of zoonotically infecting humans. While ducks and wild waterfowl are seen as a natural reservoir for AIVs and can withstand most virus strains, chickens mostly succumb to infection with highly pathogenic avian influenza (HPAI). To date, the mechanisms underlying the susceptibility of chickens and the effective immune response of ducks have not been completely unraveled. In this study, we investigate the transcriptional gene regulation underlying disease progression in chickens and ducks after AIV infection. For this purpose, we use a publicly available RNA-sequencing dataset from chickens and ducks infected with low-pathogenic avian influenza (LPAI) H5N2 and HPAI H5N1 (lung and ileum tissues, 1 and 3 days post-infection). Unlike previous studies, we perform a promoter analysis based on orthologous genes to detect important transcription factors (TFs) and their cooperation, based on which we apply a systems biology approach to identify common and species-specific master regulators. We found master regulators such as EGR1, FOS, and SP1 specifically for chicken, and ETS1 and SMAD3/4 specifically for duck, which could be responsible for the duck’s effective and the chicken’s ineffective immune response.
6
MIDESP: Mutual Information-Based Detection of Epistatic SNP Pairs for Qualitative and Quantitative Phenotypes. BIOLOGY 2021; 10:921. [PMID: 34571798 PMCID: PMC8469369 DOI: 10.3390/biology10090921] [Citation(s) in RCA: 1] [Impact Index Per Article: 0.3] [Received: 08/02/2021] [Revised: 09/09/2021] [Accepted: 09/13/2021] [Indexed: 11/17/2022]
Abstract
Simple Summary: The interactions between SNPs, which are known as epistasis, can strongly influence the phenotype. Their detection is still a challenge, which is made even more difficult by the existence of background associations that can hide correct epistatic interactions. To address the limitations of existing methods, we present in this study our novel method MIDESP for the detection of epistatic SNP pairs. It is the first mutual information-based method that can be applied to both qualitative and quantitative phenotypes and which explicitly accounts for background associations in the dataset.
Abstract: The interactions between SNPs result in a complex interplay with the phenotype, known as epistasis. The knowledge of epistasis is a crucial part of understanding genetic causes of complex traits. However, due to the enormous number of SNP pairs and their complex relationship to the phenotype, identification remains a challenging problem. Many approaches for the detection of epistasis have been developed using mutual information (MI) as an association measure. However, these methods have mainly been restricted to case–control phenotypes and are therefore of limited applicability for quantitative traits. To overcome this limitation of MI-based methods, here, we present an MI-based novel algorithm, MIDESP, to detect epistasis between SNPs for qualitative as well as quantitative phenotypes. Moreover, by incorporating a dataset-dependent correction technique, we deal with the effect of background associations in a genotypic dataset to separate correct epistatic interaction signals from those of false positive interactions resulting from the effect of single SNP×phenotype associations. To demonstrate the effectiveness of MIDESP, we apply it to two real datasets with qualitative and quantitative phenotypes, respectively. Our results suggest that by eliminating the background associations, MIDESP can identify important genes, which play essential roles for bovine tuberculosis or the egg weight of chickens.
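To make the role of mutual information as an association measure concrete, the sketch below computes the MI between a joint SNP-pair genotype and a qualitative phenotype, and compares it with the strongest single-SNP MI. It is not the MIDESP algorithm: the quantitative-phenotype handling and the dataset-dependent background correction described in the abstract are not reproduced, and the function names, the simple "pair minus best single" contrast, and the toy data are assumptions for illustration only.

```python
from collections import Counter
from math import log2


def mutual_information(xs, ys):
    """Mutual information I(X; Y) in bits between two equal-length discrete sequences."""
    n = len(xs)
    px = Counter(xs)
    py = Counter(ys)
    pxy = Counter(zip(xs, ys))
    mi = 0.0
    for (x, y), count in pxy.items():
        # p(x, y) * log2( p(x, y) / (p(x) * p(y)) ), written in terms of counts
        mi += (count / n) * log2(count * n / (px[x] * py[y]))
    return mi


def pair_information_gain(snp_a, snp_b, phenotype):
    """MI of the joint genotype with the phenotype minus the larger single-SNP MI."""
    joint = list(zip(snp_a, snp_b))
    mi_pair = mutual_information(joint, phenotype)
    mi_single = max(mutual_information(snp_a, phenotype),
                    mutual_information(snp_b, phenotype))
    return mi_pair - mi_single


# Hypothetical toy data: genotypes coded 0/1, phenotype is the XOR of the two SNPs,
# so neither SNP alone is informative but the pair determines the phenotype.
snp_a = [0, 0, 1, 1, 0, 0, 1, 1]
snp_b = [0, 1, 0, 1, 0, 1, 0, 1]
phenotype = [a ^ b for a, b in zip(snp_a, snp_b)]

gain = pair_information_gain(snp_a, snp_b, phenotype)
```

In this toy case each SNP on its own carries zero bits about the phenotype, while the pair carries one full bit, which is the kind of purely interactive signal that single-SNP scans miss.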
7
Identification and Functional Annotation of Genes Related to Bone Stability in Laying Hens Using Random Forests. Genes (Basel) 2021; 12:702. [PMID: 34066823 PMCID: PMC8151682 DOI: 10.3390/genes12050702] [Citation(s) in RCA: 4] [Impact Index Per Article: 1.3] [Received: 04/19/2021] [Revised: 05/05/2021] [Accepted: 05/06/2021] [Indexed: 12/20/2022] Open
Abstract
Skeletal disorders, including fractures and osteoporosis, in laying hens cause major welfare and economic problems. Although genetics has been shown to play a key role in bone integrity, little is yet known about the underlying genetic architecture of the traits. This study aimed to identify genes associated with bone breaking strength and bone mineral density of the tibiotarsus and the humerus in laying hens. Potentially informative single nucleotide polymorphisms (SNPs) were identified using Random Forests classification. We then searched for genes known to be related to bone stability in close proximity to the SNPs and identified 16 potential candidates. Some of them had human orthologues. Based on our findings, we can support the assumption that multiple genes determine bone strength, with each of them having a rather small effect, as illustrated by our SNP effect estimates. Furthermore, the enrichment analysis showed that some of these candidates are involved in metabolic pathways critical for bone integrity. In conclusion, the identified candidates represent genes that may play a role in the bone integrity of chickens. Although further studies are needed to determine causality, the genes reported here are promising in terms of alleviating bone disorders in laying hens.
8
Klees S, Lange TM, Bertram H, Rajavel A, Schlüter JS, Lu K, Schmitt AO, Gültas M. In Silico Identification of the Complex Interplay between Regulatory SNPs, Transcription Factors, and Their Related Genes in Brassica napus L. Using Multi-Omics Data. Int J Mol Sci 2021; 22:E789. [PMID: 33466789 PMCID: PMC7830561 DOI: 10.3390/ijms22020789] [Citation(s) in RCA: 9] [Impact Index Per Article: 3.0] [Received: 12/09/2020] [Revised: 01/08/2021] [Accepted: 01/11/2021] [Indexed: 01/07/2023] Open
Abstract
Regulatory SNPs (rSNPs) are a special class of SNPs which have a high potential to affect the phenotype due to their impact on DNA-binding of transcription factors (TFs). Thus, the knowledge about such rSNPs and TFs could provide essential information regarding different genetic programs, such as tissue development or environmental stress responses. In this study, we use a multi-omics approach by combining genomics, transcriptomics, and proteomics data of two different Brassica napus L. cultivars, namely Zhongshuang11 (ZS11) and Zhongyou821 (ZY821), with high and low oil content, respectively, to monitor the regulatory interplay between rSNPs, TFs and their corresponding genes in the tissues flower, leaf, stem, and root. By predicting the effect of rSNPs on TF-binding and by measuring their association with the cultivars, we identified a total of 41,117 rSNPs, of which 1141 are significantly associated with oil content. We revealed several enriched members of the TF families DOF, MYB, NAC, or TCP, which are important for directing transcriptional programs regulating differential expression of genes within the tissues. In this work, we provide the first genome-wide collection of rSNPs for B. napus and their impact on the regulation of gene expression in vegetative and floral tissues, which will be highly valuable for future studies on rSNPs and gene regulation.
Affiliation(s)
- Selina Klees
  - Breeding Informatics Group, Department of Animal Sciences, Georg-August University, Margarethe von Wrangell-Weg 7, 37075 Göttingen, Germany
- Thomas Martin Lange
  - Breeding Informatics Group, Department of Animal Sciences, Georg-August University, Margarethe von Wrangell-Weg 7, 37075 Göttingen, Germany
- Hendrik Bertram
  - Breeding Informatics Group, Department of Animal Sciences, Georg-August University, Margarethe von Wrangell-Weg 7, 37075 Göttingen, Germany
- Abirami Rajavel
  - Breeding Informatics Group, Department of Animal Sciences, Georg-August University, Margarethe von Wrangell-Weg 7, 37075 Göttingen, Germany
- Johanna-Sophie Schlüter
  - Breeding Informatics Group, Department of Animal Sciences, Georg-August University, Margarethe von Wrangell-Weg 7, 37075 Göttingen, Germany
- Kun Lu
  - College of Agronomy and Biotechnology, Southwest University, Chongqing 400715, China
  - Academy of Agricultural Sciences, Southwest University, Chongqing 400715, China
  - State Cultivation Base of Crop Stress Biology for Southern Mountainous Land of Southwest University, Chongqing 400715, China
- Armin Otto Schmitt
  - Breeding Informatics Group, Department of Animal Sciences, Georg-August University, Margarethe von Wrangell-Weg 7, 37075 Göttingen, Germany
  - Center for Integrated Breeding Research (CiBreed), Albrecht-Thaer-Weg 3, Georg-August University, 37075 Göttingen, Germany
- Mehmet Gültas
  - Breeding Informatics Group, Department of Animal Sciences, Georg-August University, Margarethe von Wrangell-Weg 7, 37075 Göttingen, Germany
  - Center for Integrated Breeding Research (CiBreed), Albrecht-Thaer-Weg 3, Georg-August University, 37075 Göttingen, Germany
9
Rajavel A, Schmitt AO, Gültas M. Computational Identification of Master Regulators Influencing Trypanotolerance in Cattle. Int J Mol Sci 2021; 22:562. [PMID: 33429951 PMCID: PMC7827104 DOI: 10.3390/ijms22020562] [Citation(s) in RCA: 2] [Impact Index Per Article: 0.7] [Received: 11/20/2020] [Revised: 12/31/2020] [Accepted: 01/05/2021] [Indexed: 12/15/2022] Open
Abstract
African Animal Trypanosomiasis (AAT) is transmitted by the tsetse fly, which carries pathogenic trypanosomes in its saliva, thus causing debilitating infection in livestock. As the disease advances, a multistage progression process is observed based on the progressive clinical signs displayed in the host’s body. Investigation of genes expressed with regular monotonic patterns (known as Monotonically Expressed Genes (MEGs)) and of their master regulators can provide important clues for understanding the molecular mechanisms underlying AAT. For this purpose, we analysed MEGs for three tissues (liver, spleen and lymph node) of two cattle breeds, namely trypanosusceptible Boran and trypanotolerant N’Dama. Our analysis revealed cattle breed-specific master regulators that help distinguish the genetic programs of the two breeds. In particular, the master regulators MYC and DBP found in this study seem to strongly influence the immune responses and thereby the susceptibility of Boran and the trypanotolerance of N’Dama, respectively. Furthermore, our pathway analysis also bolsters the crucial roles of these master regulators. Taken together, our findings provide novel insights into breed-specific master regulators which orchestrate the regulatory cascades influencing the level of trypanotolerance in cattle breeds and thus could be promising drug targets for future therapeutic interventions.
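The notion of a monotonic expression pattern can be illustrated with a very small check that labels a gene as increasing, decreasing, or neither across ordered time points. This is only a sketch of the concept under strong simplifying assumptions: it uses strict monotonicity on a single value per time point and ignores replicates and noise, which a real MEG analysis would need to handle. The function name, gene names, and values are hypothetical.

```python
def monotonic_direction(values):
    """Return 'increasing', 'decreasing', or None for a sequence of ordered measurements."""
    pairs = list(zip(values, values[1:]))   # consecutive (earlier, later) pairs
    if pairs and all(a < b for a, b in pairs):
        return "increasing"
    if pairs and all(a > b for a, b in pairs):
        return "decreasing"
    return None


# Hypothetical expression values at successive time points after infection
expression = {
    "gene_up": [1.0, 1.4, 2.1, 3.0],
    "gene_down": [5.2, 4.0, 3.1, 2.9],
    "gene_mixed": [2.0, 3.5, 1.2, 2.8],
}
patterns = {gene: monotonic_direction(vals) for gene, vals in expression.items()}
```

Genes labelled as increasing or decreasing would be the candidates carried forward; the mixed pattern is discarded because its direction changes between time points.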
Affiliation(s)
- Abirami Rajavel
  - Breeding Informatics Group, Department of Animal Sciences, Georg-August University, Margarethe von Wrangell-Weg 7, 37075 Göttingen, Germany
- Armin Otto Schmitt
  - Breeding Informatics Group, Department of Animal Sciences, Georg-August University, Margarethe von Wrangell-Weg 7, 37075 Göttingen, Germany
  - Center for Integrated Breeding Research (CiBreed), Albrecht-Thaer-Weg 3, Georg-August University, 37075 Göttingen, Germany
- Mehmet Gültas
  - Breeding Informatics Group, Department of Animal Sciences, Georg-August University, Margarethe von Wrangell-Weg 7, 37075 Göttingen, Germany
  - Center for Integrated Breeding Research (CiBreed), Albrecht-Thaer-Weg 3, Georg-August University, 37075 Göttingen, Germany
  - Correspondence:
10
Ramzan F, Gültas M, Bertram H, Cavero D, Schmitt AO. Combining Random Forests and a Signal Detection Method Leads to the Robust Detection of Genotype-Phenotype Associations. Genes (Basel) 2020; 11:E892. [PMID: 32764260 PMCID: PMC7465705 DOI: 10.3390/genes11080892] [Citation(s) in RCA: 10] [Impact Index Per Article: 2.5] [Received: 07/09/2020] [Revised: 07/28/2020] [Accepted: 08/03/2020] [Indexed: 12/21/2022] Open
Abstract
Genome-wide association studies (GWAS) are a well-established methodology to identify genomic variants and genes that are responsible for traits of interest in all branches of the life sciences. Despite the long time this methodology has had to mature, the reliable detection of genotype-phenotype associations is still a challenge for many quantitative traits, mainly because of the large number of genomic loci with weak individual effects on the trait under investigation. Thus, it can be hypothesized that many genomic variants that have a small but real effect remain unnoticed in many GWAS approaches. Here, we propose a two-step procedure to address this problem. In the first step, cubic splines are fitted to the test statistic values, and genomic regions with spline peaks that are higher than expected by chance are considered quantitative trait loci (QTLs). Then the SNPs in these QTLs are prioritized with respect to the strength of their association with the phenotype using a Random Forests approach. As a case study, we apply our procedure to real data sets and find trustworthy numbers of partially novel genomic variants and genes involved in various egg quality traits.
Affiliation(s)
- Faisal Ramzan
  - Breeding Informatics Group, Department of Animal Sciences, Georg-August University, Margarethe von Wrangell-Weg 7, 37075 Göttingen, Germany
  - Department of Animal Breeding and Genetics, University of Agriculture Faisalabad, 38000 Faisalabad, Pakistan
- Mehmet Gültas
  - Breeding Informatics Group, Department of Animal Sciences, Georg-August University, Margarethe von Wrangell-Weg 7, 37075 Göttingen, Germany
  - Center for Integrated Breeding Research (CiBreed), Albrecht-Thaer-Weg 3, Georg-August University, 37075 Göttingen, Germany
- Hendrik Bertram
  - Breeding Informatics Group, Department of Animal Sciences, Georg-August University, Margarethe von Wrangell-Weg 7, 37075 Göttingen, Germany
- Armin Otto Schmitt
  - Breeding Informatics Group, Department of Animal Sciences, Georg-August University, Margarethe von Wrangell-Weg 7, 37075 Göttingen, Germany
  - Center for Integrated Breeding Research (CiBreed), Albrecht-Thaer-Weg 3, Georg-August University, 37075 Göttingen, Germany