1
Kumar H, Panigrahi M, Seo D, Cho S, Bhushan B, Dutt T. Machine Learning-Aided Ultra-Low-Density Single Nucleotide Polymorphism Panel Helps to Identify the Tharparkar Cattle Breed: Lessons for Digital Transformation in Livestock Genomics. OMICS: A Journal of Integrative Biology 2024; 28:514-525. [PMID: 39302202 DOI: 10.1089/omi.2024.0153] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 09/22/2024]
Abstract
Cattle breed identification is crucial for livestock research and sustainable food systems, and advances in genomics and artificial intelligence present new opportunities to address these challenges. This study investigates the identification of the Tharparkar cattle breed using genomics tools combined with machine learning (ML) techniques. By leveraging data from the Bovine SNP 50K chip, we developed a breed-specific panel of single nucleotide polymorphisms (SNPs) for Tharparkar cattle and integrated data from seven other Indian cattle populations to enhance panel robustness. Genome-wide association studies (GWAS) and principal component analysis were employed to identify 500 SNPs, which were then refined using ML models (AdaBoost, bagging tree, gradient boosting machines, and random forest) to determine the minimal number of SNPs needed for accurate breed identification. Panels of 23 and 48 SNPs achieved accuracy rates of 95.2-98.4%. Importantly, the identified SNPs were associated with key productive and adaptive traits, thus attesting to the value and potential of digital transformation in livestock genomics. The ML-aided ultra-low-density SNP panel approach reported here not only facilitates breed identification but also contributes to preserving genetic diversity and guiding future breeding programs.
Affiliation(s)
- Harshit Kumar
  - Division of Animal Genetics, Indian Veterinary Research Institute, Izatnagar, India
  - ICAR-National Research Centre on Mithun, Medziphema, India
- Manjit Panigrahi
  - Division of Animal Genetics, Indian Veterinary Research Institute, Izatnagar, India
- Dongwon Seo
  - Research and Development Center, TNT research Co., Jeonju-si, South Korea
- Sunghyun Cho
  - Research and Development Center, Insilicogen Inc., Yongin-si, South Korea
- Bharat Bhushan
  - Division of Animal Genetics, Indian Veterinary Research Institute, Izatnagar, India
- Triveni Dutt
  - Animal Genetics & Breeding Section, Indian Veterinary Research Institute, Izatnagar, India
2
Jasielczuk I, Gurgul A, Szmatoła T, Radko A, Majewska A, Sosin E, Litwińczuk Z, Rubiś D, Ząbek T. The use of SNP markers for cattle breed identification. J Appl Genet 2024; 65:575-589. [PMID: 38568414 DOI: 10.1007/s13353-024-00857-0] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 04/13/2023] [Revised: 11/10/2023] [Accepted: 03/12/2024] [Indexed: 08/09/2024]
Abstract
A potential application of single nucleotide polymorphisms (SNPs) in animal husbandry and production is identification of the animal breed. In this study, using chosen marker selection methods and genotypic data obtained with the use of Illumina Bovine SNP50 BeadChip for individuals belonging to ten cattle breeds, the reduced panels containing the most informative SNP markers were developed. The suitability of selected SNP panels for the effective and reliable assignment of the studied individuals to the breed of origin was checked by three allocation algorithms implemented in GeneClass 2. The studied breeds set included both Polish-native breeds under the genetic resources conservation programs and highly productive breeds with a global range. For all of the tested marker selection methods ("delta" and two FST-based variants), two separate methodological approaches of marker assortment were used and three marker panels were created with 96, 192, and 288 SNPs respectively, to determine the minimum number of markers required for effective differentiation of the studied breeds. Moreover, the usefulness of the most effective panels of markers to assess the population structure and genetic diversity of the analyzed breeds was examined. The conducted analyses showed the possibility of using SNP subsets from medium-density genotypic microarrays to distinguish breeds of cattle kept in Poland and to analyze their genetic structure.
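The "delta" and FST criteria named in this abstract can be illustrated with a minimal sketch. It assumes two breeds, biallelic SNPs coded as 0/1/2 copies of one allele, and a simple heterozygosity-based FST with equal population weights. The abstract does not specify the two FST variants the authors used, so the `fst` function, the helper names, and the toy genotypes below are illustrative assumptions only.

```python
def allele_freq(genotypes):
    """Frequency of the counted allele; genotypes are coded 0/1/2 copies."""
    return sum(genotypes) / (2 * len(genotypes))


def delta(p_a, p_b):
    """Delta: absolute allele-frequency difference between two breeds."""
    return abs(p_a - p_b)


def fst(p_a, p_b):
    """Two-population FST as (H_T - H_S) / H_T with equal weights (assumed form)."""
    p_bar = (p_a + p_b) / 2
    h_t = 2 * p_bar * (1 - p_bar)                           # total expected heterozygosity
    h_s = (2 * p_a * (1 - p_a) + 2 * p_b * (1 - p_b)) / 2   # mean within-breed value
    return 0.0 if h_t == 0 else (h_t - h_s) / h_t


def rank_snps(breed_a, breed_b, score=delta):
    """Rank SNP ids from most to least informative under the chosen score."""
    scores = {
        snp: score(allele_freq(breed_a[snp]), allele_freq(breed_b[snp]))
        for snp in breed_a
    }
    return sorted(scores, key=scores.get, reverse=True)


# Hypothetical toy genotypes: four animals per breed, three SNPs.
breed_a = {"snp1": [2, 2, 2, 1], "snp2": [1, 1, 0, 2], "snp3": [0, 0, 1, 0]}
breed_b = {"snp1": [0, 0, 1, 0], "snp2": [1, 0, 2, 1], "snp3": [1, 1, 0, 0]}

print(rank_snps(breed_a, breed_b))             # ranking by delta
print(rank_snps(breed_a, breed_b, score=fst))  # ranking by FST
```

A reduced panel of a given size (for example 96 SNPs) would then simply be the first entries of such a ranking; extending the score to ten breeds requires a multi-population or pairwise-averaged variant that is not shown here.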
Affiliation(s)
- Igor Jasielczuk
  - Center for Experimental and Innovative Medicine, University of Agriculture in Krakow, Rędzina 1C, 30-248, Kraków, Poland
- Artur Gurgul
  - Center for Experimental and Innovative Medicine, University of Agriculture in Krakow, Rędzina 1C, 30-248, Kraków, Poland
- Tomasz Szmatoła
  - Center for Experimental and Innovative Medicine, University of Agriculture in Krakow, Rędzina 1C, 30-248, Kraków, Poland
- Anna Radko
  - Department of Animal Molecular Biology, National Research Institute of Animal Production, Krakowska 1, 32-083, Balice, Poland
- Anna Majewska
  - Department of Cattle Breeding, National Research Institute of Animal Production, Krakowska 1, 32-083, Balice, Poland
- Ewa Sosin
  - Department of Animal Nutrition and Feed Science, National Research Institute of Animal Production, Krakowska 1, 32-083, Balice, Poland
- Zygmunt Litwińczuk
  - Sub-Department of Cattle Breeding and Genetic Resources Conservation, University of Life Sciences in Lublin, Akademicka 13, 20-950, Lublin, Poland
- Dominika Rubiś
  - Department of Animal Molecular Biology, National Research Institute of Animal Production, Krakowska 1, 32-083, Balice, Poland
- Tomasz Ząbek
  - Department of Animal Molecular Biology, National Research Institute of Animal Production, Krakowska 1, 32-083, Balice, Poland
3
Saini T, Chauhan A, Ahmad SF, Kumar A, Vaishnav S, Singh S, Mehrotra A, Bhushan B, Gaur GK, Dutt T. Elucidation of population stratifying markers and selective sweeps in crossbred Landlly pig population using genome-wide SNP data. Mamm Genome 2024; 35:170-185. [PMID: 38485788 DOI: 10.1007/s00335-024-10029-4] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 11/01/2023] [Accepted: 01/23/2024] [Indexed: 05/29/2024]
Abstract
The present study aimed to identify population stratifying markers from the commercial porcine SNP 60K array and to elucidate the genome-wide selective sweeps in the crossbred Landlly pig population. Original genotyping data, generated on Landlly pigs, were merged in various combinations with global suid breeds that were grouped as exotic (global pig breeds excluding Indian and Chinese), Chinese (Chinese pig breeds only), and outgroup pig populations. After quality control, the genome-wide SNPs were ranked for their stratifying power within each dataset in the TRES (using three different criteria) and FIFS programs, and top-ranked SNPs (0.5K, 1K, 2K, 3K, and 4K densities) were selected. PCA plots were used to assess the stratification power of low-density panels. Selective sweeps were elucidated in the Landlly population using intra- and inter-population haplotype statistics. Additionally, Tajima's D-statistics were calculated to determine the status of balancing selection in the Landlly population. PCA plots showed 0.5K marker density to effectively stratify Landlly from other pig populations. The A-score in the DAPC program revealed the Delta statistic of marker selection to outperform other methods (informativeness and FST methods) and that a 3000-marker density was suitable for stratification of Landlly animals from exotic pig populations. The results from selective sweep analysis revealed the Landlly population to be under selection for mammary (NAV2), reproductive efficiency (JMY, SERGEF, and MAP3K20), body conformation (FHIT, WNT2, ASRB, DMGDH, and BHMT), feed efficiency (CSRNP1 and ADRA1A), and immunity (U6, MYO3B, RBMS3, and FAM78B) traits. More than two methods suggested sweeps for immunity and feed efficiency traits, thus giving a strong indication for selection in this direction. The study is the first of its kind in Indian pig breeds with a comparison against global breeds. In conclusion, 500 markers were able to effectively stratify the breeds. Different traits under selective sweeps (natural or artificial selection) can be exploited for further improvement.
Affiliation(s)
- Tapendra Saini
  - Animal Genetics Division, ICAR-Indian Veterinary Research Institute, Izatnagar, 243122, India
- Anuj Chauhan
  - Swine Production Farm, LPM Section, ICAR-Indian Veterinary Research Institute, Izatnagar, 243122, India
- Sheikh Firdous Ahmad
  - Animal Genetics Division, ICAR-Indian Veterinary Research Institute, Izatnagar, 243122, India
- Amit Kumar
  - Animal Genetics Division, ICAR-Indian Veterinary Research Institute, Izatnagar, 243122, India
- Sakshi Vaishnav
  - Animal Genetics Division, ICAR-Indian Veterinary Research Institute, Izatnagar, 243122, India
- Shivani Singh
  - Swine Production Farm, LPM Section, ICAR-Indian Veterinary Research Institute, Izatnagar, 243122, India
- Bharat Bhushan
  - Animal Genetics Division, ICAR-Indian Veterinary Research Institute, Izatnagar, 243122, India
- G K Gaur
  - Swine Production Farm, LPM Section, ICAR-Indian Veterinary Research Institute, Izatnagar, 243122, India
  - ADG Animal Production & Breeding, ICAR, New Delhi, 110001, India
- Triveni Dutt
  - Indian Veterinary Research Institute, Izatnagar, 243122, India
4
Ulmo‐Diaz G, Engman A, McLarney WO, Lasso Alcalá CA, Hendrickson D, Bezault E, Feunteun E, Prats‐Léon FL, Wiener J, Maxwell R, Mohammed RS, Kwak TJ, Benchetrit J, Bougas B, Babin C, Normandeau E, Djambazian HHV, Chen S, Reiling SJ, Ragoussis J, Bernatchez L. Panmixia in the American eel extends to its tropical range of distribution: Biological implications and policymaking challenges. Evol Appl 2023; 16:1872-1888. [PMID: 38143897 PMCID: PMC10739100 DOI: 10.1111/eva.13599] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 05/23/2023] [Revised: 08/25/2023] [Accepted: 09/06/2023] [Indexed: 12/26/2023] Open
Abstract
The American eel (Anguilla rostrata) has long been regarded as a panmictic fish and has been confirmed as such in the northern part of its range. In this paper, we tested for the first time whether panmixia extends to the tropical range of the species. To do so, we first assembled a reference genome (975 Mbp, 19 chromosomes) combining long-read (PacBio and Nanopore) and short-read (Illumina paired-end) technologies to support both this study and future research. To test for population structure, we estimated genotype likelihoods from low-coverage whole-genome sequencing of 460 American eels, collected at 21 sampling sites (in seven geographic regions) ranging from Canada to Trinidad and Tobago. We estimated genetic distance between regions, performed ADMIXTURE-like clustering analysis and multivariate analysis, and found no evidence of population structure, thus confirming that panmixia extends to the tropical range of the species. In addition, two genomic regions with putative inversions were observed, both geographically widespread and present at similar frequencies in all regions. We discuss the implications of the lack of genetic population structure for the species. Our results are key for future genomic research in the American eel and the implementation of conservation measures throughout its geographic range. Additionally, our results can be applied to fisheries management and aquaculture of the species.
Affiliation(s)
- Gabriela Ulmo‐Diaz
  - Département de Biologie, Institut de Biologie Intégrative et des Systèmes (IBIS), Université Laval, Québec, Canada
- Augustin Engman
  - University of Tennessee Institute of Agriculture, School of Natural Resources, Knoxville, Tennessee, USA
- Dean Hendrickson
  - Department of Integrative Biology and Biodiversity Collections, University of Texas at Austin, Austin, Texas, USA
- Etienne Bezault
  - UMR 8067 BOREA, Biologie Organismes Écosystèmes Aquatiques (MNHN, CNRS, SU, IRD, UCN, UA), Université des Antilles, Pointe‐à‐Pitre, Guadeloupe
  - Caribaea Initiative, Département de Biologie, Université Des Antilles‐Campus de Fouillole, Pointe‐à‐Pitre, Guadeloupe, France
- Eric Feunteun
  - UMR 7208 BOREA, Biologie Organismes Écosystèmes Aquatiques (MNHN, CNRS, SU, IRD, UCN, UA), Station Marine de Dinard, Rennes, France
  - EPHE‐PSL, CGEL (Centre de Géoécologie Littorale), Dinard, France
- Jean Wiener
  - Fondation pour la Protection de la Biodiversité Marine (FoProBiM), Caracol, Haiti
- Robert Maxwell
  - Inland Fisheries Section, Louisiana Department of Wildlife and Fisheries, Louisiana, USA
- Ryan S. Mohammed
  - The University of the West Indies (UWI), St. Augustine, Trinidad and Tobago
  - Present address: Department of Biological Sciences, Auburn University, Auburn, Alabama, USA
- Thomas J. Kwak
  - US Geological Survey, North Carolina Cooperative Fish and Wildlife Research Unit, Department of Applied Ecology, North Carolina State University, Raleigh, North Carolina, USA
- Bérénice Bougas
  - Département de Biologie, Institut de Biologie Intégrative et des Systèmes (IBIS), Université Laval, Québec, Canada
- Charles Babin
  - Département de Biologie, Institut de Biologie Intégrative et des Systèmes (IBIS), Université Laval, Québec, Canada
- Eric Normandeau
  - Département de Biologie, Institut de Biologie Intégrative et des Systèmes (IBIS), Université Laval, Québec, Canada
- Haig H. V. Djambazian
  - McGill Genome Centre, Department of Human Genetics, Victor Phillip Dahdaleh Institute of Genomic Medicine, McGill University, Montreal, Quebec, Canada
- Shu‐Huang Chen
  - McGill Genome Centre, Department of Human Genetics, Victor Phillip Dahdaleh Institute of Genomic Medicine, McGill University, Montreal, Quebec, Canada
- Sarah J. Reiling
  - McGill Genome Centre, Department of Human Genetics, Victor Phillip Dahdaleh Institute of Genomic Medicine, McGill University, Montreal, Quebec, Canada
- Jiannis Ragoussis
  - McGill Genome Centre, Department of Human Genetics, Victor Phillip Dahdaleh Institute of Genomic Medicine, McGill University, Montreal, Quebec, Canada
- Louis Bernatchez
  - Département de Biologie, Institut de Biologie Intégrative et des Systèmes (IBIS), Université Laval, Québec, Canada
5
Hayah I, Talbi C, Chafai N, Houaga I, Botti S, Badaoui B. Genetic diversity and breed-informative SNPs identification in domestic pig populations using coding SNPs. Front Genet 2023; 14:1229741. [PMID: 38034497 PMCID: PMC10687199 DOI: 10.3389/fgene.2023.1229741] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 05/26/2023] [Accepted: 10/31/2023] [Indexed: 12/02/2023] Open
Abstract
Background: The use of breed-informative genetic markers, specifically coding Single Nucleotide Polymorphisms (SNPs), is crucial for breed traceability, authentication of meat and dairy products, and the preservation and improvement of pig breeds. By identifying breed informative markers, we aimed to gain insights into the genetic mechanisms that influence production traits, enabling informed decisions in animal management and promoting sustainable pig production to meet the growing demand for animal products. Methods: Our dataset consists of 300 coding SNPs genotyped from three Italian commercial pig populations: Landrace, Yorkshire, and Duroc. Firstly, we analyzed the genetic diversity among the populations. Then, we applied a discriminant analysis of principal components to identify the most informative SNPs for discriminating between these populations. Lastly, we conducted a functional enrichment analysis to identify the most enriched pathways related to the genetic variation observed in the pig populations. Results: The alpha diversity indexes revealed a high genetic diversity within the three breeds. The higher proportion of observed heterozygosity than expected revealed an excess of heterozygotes in the populations that was supported by negative values of the fixation index (FIS) and deviations from the Hardy-Weinberg equilibrium. The Euclidean distance, the pairwise FST, and the pairwise Nei's GST genetic distances revealed that Yorkshire and Landrace breeds are genetically the closest, with distance values of 2.242, 0.029, and 0.033, respectively. Conversely, Landrace and Duroc breeds showed the highest genetic divergence, with distance values of 2.815, 0.048, and 0.052, respectively. We identified 28 significant SNPs that are related to phenotypic traits and these SNPs were able to differentiate between the pig breeds with high accuracy. The Functional Enrichment Analysis of the informative SNPs highlighted biological functions related to DNA packaging, chromatin integrity, and the preparation of DNA into higher-order structures. Conclusion: Our study sheds light on the genetic underpinnings of phenotypic variation among three Italian pig breeds, offering potential insights into the mechanisms driving breed differentiation. By prioritizing breed-specific coding SNPs, our approach enables a more focused analysis of specific genomic regions relevant to the research question compared to analyzing the entire genome.
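The link this abstract draws between excess heterozygosity and a negative fixation index can be shown with a short worked sketch. It assumes a single biallelic SNP with genotypes coded as 0/1/2 copies of one allele and the textbook definition FIS = 1 - Ho/He; the function name and the toy genotypes are hypothetical and are not taken from the study.

```python
def heterozygosity_stats(genotypes):
    """Return (Ho, He, FIS) for one biallelic SNP coded as 0/1/2 allele copies."""
    n = len(genotypes)
    p = sum(genotypes) / (2 * n)                      # allele frequency
    h_obs = sum(1 for g in genotypes if g == 1) / n   # observed heterozygosity
    h_exp = 2 * p * (1 - p)                           # Hardy-Weinberg expectation
    f_is = float("nan") if h_exp == 0 else 1 - h_obs / h_exp
    return h_obs, h_exp, f_is


# Hypothetical sample of eight animals with six heterozygotes.
h_obs, h_exp, f_is = heterozygosity_stats([1, 1, 1, 1, 0, 2, 1, 1])
print(h_obs, h_exp, f_is)
```

In this toy sample the observed heterozygosity (0.75) exceeds the Hardy-Weinberg expectation (0.5), so FIS is negative (-0.5), which is the pattern the abstract describes.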
Affiliation(s)
- Ichrak Hayah
  - Laboratory of Biodiversity, Ecology, and Genome, Department of Biology, Faculty of Sciences, Mohammed V University in Rabat, Rabat, Morocco
- Chouhra Talbi
  - Plant and Microbial Biotechnologies, Biodiversity, and Environment (BioBio), Mohammed V University in Rabat, Rabat, Morocco
- Narjice Chafai
  - Laboratory of Biodiversity, Ecology, and Genome, Department of Biology, Faculty of Sciences, Mohammed V University in Rabat, Rabat, Morocco
- Isidore Houaga
  - Centre for Tropical Livestock Genetics and Health, The Roslin Institute, Royal (Dick) School of Veterinary Medicine, The University of Edinburgh, Edinburgh, United Kingdom
  - The Roslin Institute, Royal (Dick) School of Veterinary Studies, University of Edinburgh, Edinburgh, United Kingdom
- Bouabid Badaoui
  - Laboratory of Biodiversity, Ecology, and Genome, Department of Biology, Faculty of Sciences, Mohammed V University in Rabat, Rabat, Morocco
  - African Sustainable Agriculture Research Institute (ASARI), Mohammed VI Polytechnic University (UM6P), Laâyoune, Morocco
6
Anas M, Farooq M, Asif M, Ali WR, Mansoor S. A Novel Insight into the Identification of Potential SNP Markers for the Genomic Characterization of Buffalo Breeds in Pakistan. Animals (Basel) 2023; 13:2543. [PMID: 37570351 PMCID: PMC10416883 DOI: 10.3390/ani13152543] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 06/07/2023] [Revised: 07/11/2023] [Accepted: 07/26/2023] [Indexed: 08/13/2023] Open
Abstract
Domestic buffaloes (Bubalus bubalis), known as water buffaloes, play a key role as versatile multipurpose agricultural animals in the Asiatic region. Pakistan, with the second-largest buffalo population in the world, holds a rich domestication history of buffaloes. The overall trends in buffalo production demand the genomic characterization of Pakistani buffalo breeds. To this end, the resequencing data of Pakistani breeds, along with buffalo breeds from 13 other countries, were retrieved from our previous study. This dataset, which contained 34,671,886 single-nucleotide polymorphisms (SNPs), was analyzed through a pipeline that was developed to compare possible allele differences among breeds at each SNP position. In contrast, other available tools only check for positional SNP differences for breed-specific markers. In total, 1918, 1549, 404, and 341 breed-specific markers were identified to characterize the Nili, Nili-Ravi, Azakheli, and Kundi breeds of Pakistani buffalo, respectively. Sufficient evidence in the form of phenotypic data, principal component analysis, admixture analysis, and linkage analysis showed that the Nili breed has maintained its distinct breed status despite sharing a close evolutionary relationship with the Nili-Ravi breed of buffalo. In this era of genome science, the conservation of these breeds and the further validation of the given selection markers in larger populations is a pressing need.
Affiliation(s)
- Muhammad Anas
  - National Institute for Biotechnology and Genetic Engineering (NIBGE), Faisalabad 38000, Punjab, Pakistan
  - Department of Animal Sciences and Center for Nutrition and Pregnancy, North Dakota State University, Fargo, ND 58105, USA
- Muhammad Farooq
  - National Institute for Biotechnology and Genetic Engineering (NIBGE), Faisalabad 38000, Punjab, Pakistan
- Muhammad Asif
  - National Institute for Biotechnology and Genetic Engineering (NIBGE), Faisalabad 38000, Punjab, Pakistan
- Waqas Rafique Ali
  - National Institute for Biotechnology and Genetic Engineering (NIBGE), Faisalabad 38000, Punjab, Pakistan
- Shahid Mansoor
  - National Institute for Biotechnology and Genetic Engineering (NIBGE), Faisalabad 38000, Punjab, Pakistan
7
Manzoori S, Farahani AHK, Moradi MH, Kazemi-Bonchenari M. Detecting SNP markers discriminating horse breeds by deep learning. Sci Rep 2023; 13:11592. [PMID: 37464049 DOI: 10.1038/s41598-023-38601-z] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 10/13/2022] [Accepted: 07/11/2023] [Indexed: 07/20/2023] Open
Abstract
The assignment of an individual to the true population of origin using a low-density panel of discriminant SNP markers is one of the most important practical applications of genomic data. The aim of this study was to evaluate the potential of different Artificial Neural Network (ANN) approaches, comprising Deep Neural Networks (DNN) and the Garson and Olden methods, for feature selection of informative SNP markers from high-throughput genotyping data that would be able to trace the true breed of unknown samples. A total of 795 animals from 37 breeds, genotyped using the Illumina SNP 50k BeadChip, were used in the current study, and principal component analysis (PCA), log-likelihood ratios (LLR) and Neighbor-Joining (NJ) were applied to assess the performance of the different assignment methods. The results revealed that the DNN, Garson, and Olden methods are able to assign individuals to true populations with 4270, 4937, and 7999 SNP markers, respectively. PCA was used to determine how the animals were allocated to the groups using all genotyped markers available on the 50k BeadChip and the subsets of SNP markers identified with the different methods. The results indicated that all SNP panels are able to assign individuals to their true breeds. The success percentage of genetic assignment for the different methods, assessed at different levels of LLR, showed that a success rate of 70% was obtained by the three methods with 110, 208, and 178 markers for the DNN, Garson, and Olden methods, respectively. The results also showed that DNN performed better than the other two approaches, achieving 93% accuracy at the most stringent threshold. Finally, the identified SNPs were successfully used in independent out-group breeds comprising 120 individuals from eight breeds, and the results indicated that these markers are able to correctly allocate all unknown samples to the true population of origin. Furthermore, the NJ tree of allele-sharing distances on the validation dataset showed that the DNN has a high potential for feature selection. In general, the results of this study indicated that the DNN technique represents an efficient strategy for selecting a reduced pool of highly discriminant markers for assigning individuals to the true population of origin.
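A log-likelihood ratio assignment of the kind mentioned in this abstract can be sketched as follows. The abstract does not give the exact LLR definition used, so this sketch assumes independent biallelic loci in Hardy-Weinberg proportions and defines the LLR as the base-10 log-likelihood of the best breed minus that of the runner-up; the function names, breed labels, and allele frequencies are hypothetical.

```python
import math


def genotype_prob(g, p):
    """Hardy-Weinberg probability of genotype g (0/1/2 copies) at allele frequency p."""
    return [(1 - p) ** 2, 2 * p * (1 - p), p ** 2][g]


def log_likelihood(genotypes, freqs, eps=1e-6):
    """Sum of log10 genotype probabilities across loci, assuming independence."""
    total = 0.0
    for g, p in zip(genotypes, freqs):
        p = min(max(p, eps), 1 - eps)  # clip so fixed alleles never give log(0)
        total += math.log10(genotype_prob(g, p))
    return total


def assign(genotypes, breed_freqs):
    """Return (best breed, LLR against the second-best breed); needs >= 2 breeds."""
    scored = sorted(
        ((log_likelihood(genotypes, f), b) for b, f in breed_freqs.items()),
        reverse=True,
    )
    (best_ll, best), (second_ll, _) = scored[0], scored[1]
    return best, best_ll - second_ll


# Hypothetical reference allele frequencies at three SNPs for two breeds.
breed_freqs = {"A": [0.9, 0.8, 0.1], "B": [0.2, 0.3, 0.7]}
sample = [2, 2, 0]

breed, llr = assign(sample, breed_freqs)
print(breed, round(llr, 2))
```

Under this definition an LLR threshold acts as a stringency filter: an assignment is accepted only when the best breed is favoured over the runner-up by at least that many orders of magnitude.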
Affiliation(s)
- Siavash Manzoori
  - Department of Animal Science, Faculty of Agriculture and Natural Resources, Arak University, Arak, Iran
- Mohammad Hossein Moradi
  - Department of Animal Science, Faculty of Agriculture and Natural Resources, Arak University, Arak, Iran
- Mehdi Kazemi-Bonchenari
  - Department of Animal Science, Faculty of Agriculture and Natural Resources, Arak University, Arak, Iran
8
Zhao C, Wang D, Teng J, Yang C, Zhang X, Wei X, Zhang Q. Breed identification using breed-informative SNPs and machine learning based on whole genome sequence data and SNP chip data. J Anim Sci Biotechnol 2023; 14:85. [PMID: 37259083 DOI: 10.1186/s40104-023-00880-x] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 12/19/2022] [Accepted: 04/05/2023] [Indexed: 06/02/2023] Open
Abstract
BACKGROUND Breed identification is useful in a variety of biological contexts. Breed identification usually involves two stages, i.e., detection of breed-informative SNPs and breed assignment. Several methods have been proposed for both stages. However, which combination of these methods is optimal remains unclear. In this study, using the whole genome sequence data available for 13 cattle breeds from Run 8 of the 1,000 Bull Genomes Project, we compared the combinations of three methods (Delta, FST, and In) for breed-informative SNP detection and five machine learning methods (KNN, SVM, RF, NB, and ANN) for breed assignment with respect to different reference population sizes and different numbers of the most breed-informative SNPs. In addition, we evaluated the accuracy of breed identification using SNP chip data of different densities. RESULTS We found that all combinations performed quite well, with identification accuracies over 95% in all scenarios. However, no single combination performed best and robustly across all scenarios. We proposed to integrate the three breed-informative detection methods, named DFI, and to integrate three of the machine learning methods, KNN, SVM, and RF, named KSR. We found that the combination of these two integrated methods outperformed the other combinations, with accuracies over 99% in most cases, and was very robust in all scenarios. The accuracies from using SNP chip data were only slightly lower than those from using sequence data in most cases. CONCLUSIONS The current study showed that the combination of DFI and KSR was the optimal strategy. Using sequence data resulted in higher accuracies than using chip data in most cases. However, the differences were generally small. In view of the cost of genotyping, using chip data is also a good option for breed identification.
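The idea of integrating several classifiers into one predictor (KSR in this abstract) can be illustrated with a minimal sketch. The abstract does not state how the KNN, SVM, and RF outputs are combined, so a per-animal majority vote is assumed here; the prediction lists and breed labels are hypothetical placeholders rather than outputs of trained models.

```python
from collections import Counter


def majority_vote(labels):
    """Most frequent label; ties go to the label seen first."""
    return Counter(labels).most_common(1)[0][0]


def ensemble_predict(per_classifier_predictions):
    """Combine equal-length prediction lists (one per classifier) animal by animal."""
    return [majority_vote(votes) for votes in zip(*per_classifier_predictions)]


# Hypothetical predictions for three animals from three classifiers.
knn = ["Angus", "Holstein", "Jersey"]
svm = ["Angus", "Jersey", "Jersey"]
rf = ["Holstein", "Holstein", "Jersey"]

print(ensemble_predict([knn, svm, rf]))
```

With three voters and a single winning label per animal, a two-way tie cannot occur unless all three classifiers disagree, in which case this sketch falls back to the first classifier's label.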
Affiliation(s)
- Changheng Zhao
  - Shandong Provincial Key Laboratory of Animal Biotechnology and Disease Control and Prevention, College of Animal Science and Veterinary Medicine, Shandong Agricultural University, Tai'an, 271018, China
- Dan Wang
  - Shandong Provincial Key Laboratory of Animal Biotechnology and Disease Control and Prevention, College of Animal Science and Veterinary Medicine, Shandong Agricultural University, Tai'an, 271018, China
- Jun Teng
  - Shandong Provincial Key Laboratory of Animal Biotechnology and Disease Control and Prevention, College of Animal Science and Veterinary Medicine, Shandong Agricultural University, Tai'an, 271018, China
- Cheng Yang
  - Shandong Provincial Key Laboratory of Animal Biotechnology and Disease Control and Prevention, College of Animal Science and Veterinary Medicine, Shandong Agricultural University, Tai'an, 271018, China
- Xinyi Zhang
  - Shandong Provincial Key Laboratory of Animal Biotechnology and Disease Control and Prevention, College of Animal Science and Veterinary Medicine, Shandong Agricultural University, Tai'an, 271018, China
- Xianming Wei
  - Shandong Provincial Key Laboratory of Animal Biotechnology and Disease Control and Prevention, College of Animal Science and Veterinary Medicine, Shandong Agricultural University, Tai'an, 271018, China
- Qin Zhang
  - Shandong Provincial Key Laboratory of Animal Biotechnology and Disease Control and Prevention, College of Animal Science and Veterinary Medicine, Shandong Agricultural University, Tai'an, 271018, China
9
Ryan CA, Berry DP, O’Brien A, Pabiou T, Purfield DC. Evaluating the use of statistical and machine learning methods for estimating breed composition of purebred and crossbred animals in thirteen cattle breeds using genomic information. Front Genet 2023; 14:1120312. [PMID: 37274789 PMCID: PMC10237237 DOI: 10.3389/fgene.2023.1120312] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 12/09/2022] [Accepted: 05/03/2023] [Indexed: 06/07/2023] Open
Abstract
Introduction: The ability to accurately predict breed composition using genomic information has many potential uses including increasing the accuracy of genetic evaluations, optimising mating plans and as a parameter for genotype quality control. The objective of the present study was to use a database of genotyped purebred and crossbred cattle to compare breed composition predictions using a freely available software, Admixture, with those from a single nucleotide polymorphism Best Linear Unbiased Prediction (SNP-BLUP) approach; a supplementary objective was to determine the accuracy and general robustness of low-density genotype panels for predicting breed composition. Methods: All animals had genotype information on 49,213 autosomal single nucleotide polymorphisms (SNPs). Thirteen breeds were included in the analysis and 500 purebred animals per breed were used to establish the breed training populations. Accuracy of breed composition prediction was determined using a separate validation population of 3,146 verified purebred and 4,330 two and three-way crossbred cattle. Results: When all 49,213 autosomal SNPs were used for breed prediction, a minimal absolute mean difference of 0.04 between Admixture vs. SNP-BLUP breed predictions was evident. For crossbreds, the average absolute difference in breed prediction estimates generated using SNP-BLUP and Admixture was 0.068 with a root mean square error of 0.08. Breed predictions from low-density SNP panels were generated using both SNP-BLUP and Admixture and compared to breed prediction estimates using all 49,213 SNPs (representing the gold standard). Breed composition estimates of crossbreds required more SNPs than predicting the breed composition of purebreds. SNP-BLUP required ≥3,000 SNPs to predict crossbred breed composition, but only 2,000 SNPs were required to predict purebred breed status. The absolute mean (standard deviation) difference across all panels <2,000 SNPs was 0.091 (0.054) and 0.315 (0.316) when predicting the breed composition of all animals using Admixture and SNP-BLUP, respectively, compared to the gold standard prediction. Discussion: Nevertheless, a negligible absolute mean (standard deviation) difference of 0.009 (0.123) in breed prediction existed between SNP-BLUP and Admixture once ≥3,000 SNPs were considered, indicating that the prediction of breed composition could be readily integrated into SNP-BLUP pipelines used for genomic evaluations thereby avoiding the necessity for a stand-alone software.
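The agreement statistics quoted in this abstract can be made concrete with a short sketch. It interprets "absolute mean difference" as the mean of absolute differences between paired breed-proportion estimates, which is an assumption because the abstract does not define the term; the two proportion vectors are hypothetical values for one crossbred animal, not data from the study.

```python
import math


def mean_abs_diff(a, b):
    """Mean of absolute differences between paired estimates."""
    return sum(abs(x - y) for x, y in zip(a, b)) / len(a)


def rmse(a, b):
    """Root mean square error between paired estimates."""
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)) / len(a))


# Hypothetical breed proportions for one animal across three breeds.
admixture = [0.50, 0.25, 0.25]
snp_blup = [0.55, 0.20, 0.25]

print(round(mean_abs_diff(admixture, snp_blup), 4))
print(round(rmse(admixture, snp_blup), 4))
```

Because squaring weights large disagreements more heavily, the RMSE is never smaller than the mean absolute difference, which matches the ordering of the crossbred values reported above (0.068 versus 0.08).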
Affiliation(s)
- C. A. Ryan
- Teagasc, Co. Cork, Ireland
- Munster Technological University, Cork, Ireland
- T. Pabiou
- Irish Cattle Breeding Federation, Cork, Ireland
10
Miao J, Chen Z, Zhang Z, Wang Z, Wang Q, Zhang Z, Pan Y. A web tool for the global identification of pig breeds. Genet Sel Evol 2023; 55:18. [PMID: 36944938 PMCID: PMC10029154 DOI: 10.1186/s12711-023-00788-0] [Citation(s) in RCA: 1] [Impact Index Per Article: 1.0] [Received: 06/27/2022] [Accepted: 02/14/2023] [Indexed: 03/23/2023]
Abstract
BACKGROUND: Natural and artificial selection for more than 9000 years have led to a variety of domestic pig breeds. Accurate identification of pig breeds is important for breed conservation, sustainable breeding, pork traceability, and local resource registration. RESULTS: We evaluated the performance of four selectors and six classifiers for breed identification using a wide range of pig breeds (N = 91). The internal cross-validation and external independent testing showed that partial least squares regression (PLSR) was the most effective selector and partial least squares-discriminant analysis (PLS-DA) was the most powerful classifier for breed identification among many breeds. Five-fold cross-validation indicated that using PLSR as the selector and PLS-DA as the classifier to discriminate 91 pig breeds yielded 98.4% accuracy with only 3K single nucleotide polymorphisms (SNPs). We also constructed a reference dataset with 124 pig breeds and used it to develop the web tool iDIGs (http://alphaindex.zju.edu.cn/iDIGs_en/) as a comprehensive application for global pig breed identification. iDIGs allows users to (1) identify pig breeds without a reference population and (2) design small panels to discriminate several specific pig breeds. CONCLUSIONS: In this study, we proved that breed identification among a wide range of pig breeds is feasible and we developed a web tool for such pig breed identification.
Affiliation(s)
- Jian Miao
- College of Animal Sciences, Zhejiang University, Hangzhou, 310058, Zhejiang, China
- Zitao Chen
- College of Animal Sciences, Zhejiang University, Hangzhou, 310058, Zhejiang, China
- Zhenyang Zhang
- College of Animal Sciences, Zhejiang University, Hangzhou, 310058, Zhejiang, China
- Zhen Wang
- College of Animal Sciences, Zhejiang University, Hangzhou, 310058, Zhejiang, China
- Qishan Wang
- College of Animal Sciences, Zhejiang University, Hangzhou, 310058, Zhejiang, China
- Hainan Institute of Zhejiang University, Building 11, Yongyou Industrial Park, Yazhou Bay Science and Technology City, Yazhou District, Sanya, 572025, Hainan, China
- Zhe Zhang
- College of Animal Sciences, Zhejiang University, Hangzhou, 310058, Zhejiang, China.
- Yuchun Pan
- College of Animal Sciences, Zhejiang University, Hangzhou, 310058, Zhejiang, China.
- Hainan Institute of Zhejiang University, Building 11, Yongyou Industrial Park, Yazhou Bay Science and Technology City, Yazhou District, Sanya, 572025, Hainan, China.
11
Classification of cattle breeds based on the random forest approach. Livest Sci 2023. [DOI: 10.1016/j.livsci.2022.105143] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 12/23/2022]
12
Gao J, Sun L, Zhang S, Xu J, He M, Zhang D, Wu C, Dai J. Screening Discriminating SNPs for Chinese Indigenous Pig Breeds Identification Using a Random Forests Algorithm. Genes (Basel) 2022; 13:2207. [PMID: 36553474 PMCID: PMC9778029 DOI: 10.3390/genes13122207] [Citation(s) in RCA: 2] [Impact Index Per Article: 1.0] [Received: 10/08/2022] [Revised: 11/19/2022] [Accepted: 11/23/2022] [Indexed: 11/27/2022]
Abstract
Chinese indigenous pig breeds have unique genetic characteristics and a rich diversity; however, effective breed identification methods have not yet been well established. In this study, a genotype file of 62,822 single-nucleotide polymorphisms (SNPs), obtained from 1059 individuals of 18 Chinese indigenous pig breeds and 5 cosmopolitan breeds, was used to screen discriminating SNPs for pig breed identification. After linkage disequilibrium (LD) pruning, this study excluded 396 SNPs on non-constant chromosomes and retained 20.92-27.84% of SNPs for each of the 18 autosomes, leaving a total of 14,823 SNPs. The principal component analysis (PCA) showed the largest differences between cosmopolitan and Chinese pig breeds (PC1 = 10.452%), while relatively small differences were found among the 18 indigenous pig breeds from the Yangtze River Delta region of China. Next, a random forest (RF) algorithm was used to filter these SNPs and obtain the optimal number of decision trees (ntree = 1000) using the corresponding out-of-bag (OOB) error rates. Two SNP ranking methods in the RF analysis, the mean decrease in accuracy (MDA) and the mean decrease in Gini index (MDG), were compared in terms of the effect of panels with different numbers of SNPs on assignment accuracy and the distribution of panel SNPs across chromosomes. On this basis, a panel of the 1000 most breed-discriminative tagged SNPs was finally selected using the MDA screening method. A high accuracy (>99.3%) was obtained in the breed prediction of 318 samples in the RF test set; thus, a machine learning classification method was established for the multi-breed identification of Chinese indigenous pigs based on a low-density panel of SNPs.
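The idea behind ranking SNPs by mean decrease in accuracy can be sketched with permutation importance: shuffle one SNP at a time and record how much classification accuracy drops. The sketch below is an illustration only. It swaps in a simple nearest-centroid classifier for the random forest, scores accuracy on the training data rather than on out-of-bag samples, and uses hypothetical function names and toy genotypes coded 0/1/2.

```python
import random


def centroids(X, y):
    """Mean genotype vector per breed label."""
    sums, counts = {}, {}
    for row, label in zip(X, y):
        acc = sums.setdefault(label, [0.0] * len(row))
        for j, g in enumerate(row):
            acc[j] += g
        counts[label] = counts.get(label, 0) + 1
    return {lab: [s / counts[lab] for s in acc] for lab, acc in sums.items()}


def predict(cents, row):
    """Assign a genotype vector to the breed with the closest centroid."""
    return min(cents, key=lambda lab: sum((g - c) ** 2 for g, c in zip(row, cents[lab])))


def accuracy(cents, X, y):
    return sum(predict(cents, r) == lab for r, lab in zip(X, y)) / len(y)


def mean_decrease_accuracy(X, y, n_repeats=20, seed=0):
    """Average accuracy drop after shuffling each SNP column in turn."""
    rng = random.Random(seed)
    cents = centroids(X, y)
    base = accuracy(cents, X, y)
    scores = []
    for j in range(len(X[0])):
        drops = []
        for _ in range(n_repeats):
            col = [row[j] for row in X]
            rng.shuffle(col)
            shuffled = [row[:j] + [v] + row[j + 1:] for row, v in zip(X, col)]
            drops.append(base - accuracy(cents, shuffled, y))
        scores.append(sum(drops) / n_repeats)
    return scores


# Toy data: SNP 0 separates the two breeds, SNP 1 carries no breed signal.
X = [[0, 1], [0, 2], [0, 0], [0, 1], [2, 1], [2, 2], [2, 0], [2, 1]]
y = ["A", "A", "A", "A", "B", "B", "B", "B"]

scores = mean_decrease_accuracy(X, y)
ranking = sorted(range(len(scores)), key=lambda j: scores[j], reverse=True)
print(scores, ranking)
```

Keeping the top-ranked columns from such a ranking is the general pattern for building a reduced panel; the panel size would be chosen by checking assignment accuracy at several cut-offs.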
Affiliation(s)
- Jun Gao
- Institute of Animal Husbandry and Veterinary Science, Shanghai Academy of Agricultural Sciences, Shanghai 201106, China
- Key Laboratory of Livestock and Poultry Resources (Pig) Evaluation and Utilization, Ministry of Agriculture and Rural Affairs, Shanghai 201106, China
- Lingwei Sun
- Institute of Animal Husbandry and Veterinary Science, Shanghai Academy of Agricultural Sciences, Shanghai 201106, China
- Shanghai Municipal Key Laboratory of Agri-Genetics and Breeding, Shanghai 201106, China
- Shushan Zhang
- Institute of Animal Husbandry and Veterinary Science, Shanghai Academy of Agricultural Sciences, Shanghai 201106, China
- Key Laboratory of Livestock and Poultry Resources (Pig) Evaluation and Utilization, Ministry of Agriculture and Rural Affairs, Shanghai 201106, China
- Shanghai Municipal Key Laboratory of Agri-Genetics and Breeding, Shanghai 201106, China
- Shanghai Engineering Research Center of Pig Breeding, Shanghai 201106, China
- Jiehuan Xu
- Institute of Animal Husbandry and Veterinary Science, Shanghai Academy of Agricultural Sciences, Shanghai 201106, China
- Shanghai Municipal Key Laboratory of Agri-Genetics and Breeding, Shanghai 201106, China
- Shanghai Engineering Research Center of Pig Breeding, Shanghai 201106, China
- Mengqian He
- Institute of Animal Husbandry and Veterinary Science, Shanghai Academy of Agricultural Sciences, Shanghai 201106, China
- Shanghai Municipal Key Laboratory of Agri-Genetics and Breeding, Shanghai 201106, China
- Defu Zhang
- Institute of Animal Husbandry and Veterinary Science, Shanghai Academy of Agricultural Sciences, Shanghai 201106, China
- Key Laboratory of Livestock and Poultry Resources (Pig) Evaluation and Utilization, Ministry of Agriculture and Rural Affairs, Shanghai 201106, China
- Shanghai Municipal Key Laboratory of Agri-Genetics and Breeding, Shanghai 201106, China
- Shanghai Engineering Research Center of Pig Breeding, Shanghai 201106, China
- Caifeng Wu
- Institute of Animal Husbandry and Veterinary Science, Shanghai Academy of Agricultural Sciences, Shanghai 201106, China
- Key Laboratory of Livestock and Poultry Resources (Pig) Evaluation and Utilization, Ministry of Agriculture and Rural Affairs, Shanghai 201106, China
- Jianjun Dai
- Institute of Animal Husbandry and Veterinary Science, Shanghai Academy of Agricultural Sciences, Shanghai 201106, China
- Key Laboratory of Livestock and Poultry Resources (Pig) Evaluation and Utilization, Ministry of Agriculture and Rural Affairs, Shanghai 201106, China
- Shanghai Municipal Key Laboratory of Agri-Genetics and Breeding, Shanghai 201106, China
- Shanghai Engineering Research Center of Pig Breeding, Shanghai 201106, China
13
Varga L, Edviné EM, Hudák P, Anton I, Pálinkás-Bodzsár N, Zsolnai A. Balancing at the Borderline of a Breed: A Case Study of the Hungarian Short-Haired Vizsla Dog Breed, Definition of the Breed Profile Using Simple SNP-Based Methods. Genes (Basel) 2022; 13:2022. [PMID: 36360261 PMCID: PMC9690546 DOI: 10.3390/genes13112022] [Citation(s) in RCA: 1] [Impact Index Per Article: 0.5] [Received: 08/24/2022] [Revised: 10/30/2022] [Accepted: 10/31/2022] [Indexed: 09/16/2023]
Abstract
The aim of this study was to determine the breed boundary of the Hungarian Short-haired Vizsla (HSV) dog breed. Seventy registered purebred HSV dogs were genotyped on approximately 145,000 SNPs. Principal Component Analysis (PCA) and Admixture analysis certified that they belong to the same population. The outer point of the breed demarcation was a single Hungarian Wire-haired Vizsla (HWV) individual, which was the closest animal genetically to the HSV population in the PCA analysis. Three programs were used for the breed assignment calculations, including the widely used GeneClass2.0 software and two additional approaches developed here: the 'PCA-distance' and 'IBS-central' methods. Both new methods calculate a single number that represents how closely a dog fits into the actual reference population. The former approach calculates this number based on the PCA distances from the median of HSV animals. The latter calculates it from identity by state (IBS) data, measuring the distance from a central animal that is the best representative of the breed. Having no mixed-breed dogs with known HSV genome proportion, admixture animals were simulated by using data of HSV and HWV individuals to calibrate the inclusion/exclusion probabilities for the assignment. The numbers generated from these relatively simple calculations can be used by breeders and clubs to keep their populations under genetic supervision.
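The "IBS-central" idea described above (one number measuring distance from a representative animal) can be sketched as follows. This is a minimal illustration under stated assumptions, not the authors' implementation: genotypes are coded as 0/1/2 allele counts, IBS distance is taken as one minus the average proportion of shared alleles, and the central animal is assumed to be the one with the smallest mean distance to all other reference animals. All names and genotypes are hypothetical.

```python
def ibs_distance(g1, g2):
    """1 - proportion of alleles shared identical by state across loci."""
    shared = sum(2 - abs(a - b) for a, b in zip(g1, g2))
    return 1.0 - shared / (2 * len(g1))


def central_animal(genotypes):
    """Index of the animal with the smallest mean IBS distance to the others."""
    def mean_dist(i):
        others = [g for k, g in enumerate(genotypes) if k != i]
        return sum(ibs_distance(genotypes[i], g) for g in others) / len(others)
    return min(range(len(genotypes)), key=mean_dist)


# Hypothetical reference population (four animals, four SNPs).
reference = [
    [0, 1, 2, 1],
    [0, 1, 2, 2],
    [0, 2, 2, 1],
    [1, 1, 2, 1],
]
candidate = [2, 0, 0, 1]  # hypothetical animal to be assessed

center = central_animal(reference)
fit_score = ibs_distance(reference[center], candidate)
print(center, fit_score)
```

An inclusion threshold on `fit_score` would then have to be calibrated, for instance against simulated admixed animals as the abstract describes.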
Affiliation(s)
- László Varga
- Institute of Genetics and Biotechnology, Hungarian University of Agriculture and Life Sciences, Szent István Campus, 2100 Gödöllő, Hungary
- Institute for Farm Animal Gene Conservation, National Centre for Biodiversity and Gene Conservation, 2100 Gödöllő, Hungary
- Erika Meleg Edviné
- Institute for Farm Animal Gene Conservation, National Centre for Biodiversity and Gene Conservation, 2100 Gödöllő, Hungary
- Péter Hudák
- Institute for Farm Animal Gene Conservation, National Centre for Biodiversity and Gene Conservation, 2100 Gödöllő, Hungary
- István Anton
- Department of Animal Breeding, Institute of Animal Science, Hungarian University of Agriculture and Life Sciences, Kaposvár Campus, 2053 Herceghalom, Hungary
- Nóra Pálinkás-Bodzsár
- Institute for Farm Animal Gene Conservation, National Centre for Biodiversity and Gene Conservation, 2100 Gödöllő, Hungary
- Attila Zsolnai
- Institute for Farm Animal Gene Conservation, National Centre for Biodiversity and Gene Conservation, 2100 Gödöllő, Hungary
- Department of Animal Breeding, Institute of Animal Science, Hungarian University of Agriculture and Life Sciences, Kaposvár Campus, 2053 Herceghalom, Hungary
14
Makombu JG, Cheruiyot EK, Stomeo F, Thuo DN, Oben PM, Oben BO, Zango P, Mialhe E, Ngueguim JR, Mujibi FDN. Species-informative SNP markers for characterising freshwater prawns of genus Macrobrachium in Cameroon. PLoS One 2022; 17:e0263540. [PMID: 36190939 PMCID: PMC9529149 DOI: 10.1371/journal.pone.0263540] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 01/16/2022] [Accepted: 08/16/2022] [Indexed: 11/07/2022]
Abstract
Single Nucleotide Polymorphisms (SNPs) are now popular for a myriad of applications in animal and plant species, including ancestry assignment, conservation genetics, breeding, and traceability of animal products. The objective of this study was to develop a customized, cost-effective SNP panel for genetic characterisation of Macrobrachium species in Cameroon. The SNPs identified in a previous characterization study were screened as viable candidates for the reduced panel. Starting from a full set of 1,814 SNPs, a total of 72 core SNPs were chosen using conventional approaches: allele frequency differentials, minor allele frequency profiles, and Wright’s Fst statistics. The discriminatory power of the reduced set of informative SNPs was then tested using admixture analysis, principal component analysis, and discriminant analysis of principal components. The panel of prioritised SNP markers (i.e., N = 72 SNPs) distinguished Macrobrachium species with 100% accuracy. However, a larger sample size is needed to identify more informative SNPs for discriminating genetically closely related species, including M. macrobrachion versus M. vollenhovenii and M. sollaudii versus M. dux. Overall, the findings in this study show that Macrobrachium can be accurately characterised using a small set of core SNPs, which could be useful for this economically important genus in Cameroon. Given the results obtained in this study, a larger independent validation sample set will be needed to confirm the discriminative capacity of this SNP panel for wider commercial and research applications.
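Two of the conventional selection criteria named above, the allele frequency differential and Wright's Fst, can be computed per SNP from genotype counts. The sketch below assumes two populations of equal weight, genotypes coded as 0/1/2 copies of the counted allele, and Fst defined as (H_T - H_S) / H_T with expected heterozygosities; the SNP names and genotypes are hypothetical, and the study's exact estimator is not stated in the abstract.

```python
def allele_freq(genotypes):
    """Frequency of the counted allele from 0/1/2 genotype codes."""
    return sum(genotypes) / (2 * len(genotypes))


def freq_differential(p1, p2):
    """Absolute allele frequency difference (delta) between two populations."""
    return abs(p1 - p2)


def fst(p1, p2):
    """Wright's Fst for one biallelic SNP, two equally weighted populations."""
    p_bar = (p1 + p2) / 2
    h_t = 2 * p_bar * (1 - p_bar)  # expected heterozygosity, pooled
    if h_t == 0:
        return 0.0  # monomorphic SNP carries no information
    h_s = (2 * p1 * (1 - p1) + 2 * p2 * (1 - p2)) / 2  # mean within populations
    return (h_t - h_s) / h_t


# Hypothetical genotypes: {SNP name: (population A, population B)}.
snps = {
    "snp1": ([0, 0, 0, 0], [2, 2, 2, 2]),
    "snp2": ([1, 1, 0, 2], [1, 0, 2, 1]),
    "snp3": ([0, 0, 1, 1], [1, 1, 2, 2]),
}

fst_by_snp = {
    name: fst(allele_freq(a), allele_freq(b)) for name, (a, b) in snps.items()
}
ranked = sorted(fst_by_snp, key=fst_by_snp.get, reverse=True)
print(fst_by_snp, ranked)
```

Taking the first entries of `ranked` gives a candidate core panel, which would still need to be validated on independent samples.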
Affiliation(s)
- Judith G. Makombu
- Department of Fisheries and Aquatic Resources Management, Faculty of Agriculture and Veterinary Medicine, University of Buea, Buea, Cameroon
- Francesca Stomeo
- Biosciences Eastern and Central Africa—International Livestock Research Institute (BecA-ILRI) Hub, Nairobi, Kenya
- David N. Thuo
- Australian National Wildlife Collection, National Research Collections Australia, CSIRO, Canberra, Australia
- Pius M. Oben
- Department of Fisheries and Aquatic Resources Management, Faculty of Agriculture and Veterinary Medicine, University of Buea, Buea, Cameroon
- Benedicta O. Oben
- Department of Fisheries and Aquatic Resources Management, Faculty of Agriculture and Veterinary Medicine, University of Buea, Buea, Cameroon
- Paul Zango
- Institute of Fisheries and Aquatic Sciences, University of Douala, Yabassi, Cameroon
- Eric Mialhe
- Concepto Azul, Cdlavernaza Norte, Guayaquil, Ecuador
- Jules R. Ngueguim
- Institute of Agriculture Research for Development (IRAD), Kribi, Cameroon
15
Wilmot H, Glorieux G, Hubin X, Gengler N. A genomic breed assignment test for traceability of meat of Dual-Purpose Blue. Livest Sci 2022. [DOI: 10.1016/j.livsci.2022.104996] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 10/18/2022]
16
Cho E, Cho S, Kim M, Ediriweera TK, Seo D, Lee SS, Cha J, Jin D, Kim YK, Lee JH. Single nucleotide polymorphism marker combinations for classifying Yeonsan Ogye chicken using a machine learning approach. JOURNAL OF ANIMAL SCIENCE AND TECHNOLOGY 2022; 64:830-841. [PMID: 36287747 PMCID: PMC9574617 DOI: 10.5187/jast.2022.e64] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 04/13/2022] [Revised: 07/15/2022] [Accepted: 08/01/2022] [Indexed: 11/27/2022]
Abstract
Genetic analysis has great potential as a tool to differentiate between different species and breeds of livestock. In this study, the optimal combinations of single nucleotide polymorphism (SNP) markers for discriminating the Yeonsan Ogye chicken (Gallus gallus domesticus) breed were identified using high-density 600K SNP array data. In 3,904 individuals from 198 chicken breeds, SNP markers specific to the target population were discovered through a case-control genome-wide association study (GWAS) and filtered out based on the linkage disequilibrium blocks. Significant SNP markers were selected by feature selection applying two machine learning algorithms: Random Forest (RF) and AdaBoost (AB). Using a machine learning approach, the 38 (RF) and 43 (AB) optimal SNP marker combinations for the Yeonsan Ogye chicken population demonstrated 100% accuracy. Hence, the GWAS and machine learning models used in this study can be efficiently utilized to identify the optimal combination of markers for discriminating target populations using multiple SNP markers.
Affiliation(s)
- Eunjin Cho
- Department of Bio-AI Convergence, Chungnam National University, Daejeon 34134, Korea
- Sunghyun Cho
- Research and Development Center, Insilicogen Inc., Yongin 19654, Korea
- Minjun Kim
- Division of Animal and Dairy Science, Chungnam National University, Daejeon 34134, Korea
- Dongwon Seo
- Department of Bio-AI Convergence, Chungnam National University, Daejeon 34134, Korea
- Research Institute TNT Research Company, Jeonju 54810, Korea
- Jihye Cha
- Animal Genome & Bioinformatics, National Institute of Animal Science, Rural Development Administration, Wanju 55365, Korea
- Daehyeok Jin
- Animal Genetic Resources Research Center, National Institute of Animal Science, Rural Development Administration, Hamyang 50000, Korea
- Young-Kuk Kim
- Department of Bio-AI Convergence, Chungnam National University, Daejeon 34134, Korea
- Jun Heon Lee
- Department of Bio-AI Convergence, Chungnam National University, Daejeon 34134, Korea
- Division of Animal and Dairy Science, Chungnam National University, Daejeon 34134, Korea
- Corresponding author: Jun Heon Lee, Department of Bio-AI Convergence, Chungnam National University, Daejeon 34134, Korea. Tel: +82-42-821-5779, E-mail:
17
Identification of Ancestry Informative Markers in Mediterranean Trout Populations of Molise (Italy): A Multi-Methodological Approach with Machine Learning. Genes (Basel) 2022; 13:1351. [PMID: 36011262 PMCID: PMC9407066 DOI: 10.3390/genes13081351] [Citation(s) in RCA: 1] [Impact Index Per Article: 0.5] [Received: 06/27/2022] [Revised: 07/22/2022] [Accepted: 07/26/2022] [Indexed: 01/27/2023]
Abstract
Brown trout (Salmo trutta), like many other freshwater species, is threatened by the release of alien species into its natural environment and by restocking with allochthonous conspecific stocks. Many conservation projects are ongoing, and several morphological and genetic tools have been proposed to support activities aimed at restoring the genetic integrity of native populations. Nevertheless, because of the complex degree of introgression reached after many generations of crossing, dichotomous keys and molecular markers such as mtDNA, LDH-C1* and microsatellites are often not sufficient to discriminate native from admixed specimens at the individual level. Here we propose a reduced panel of ancestry-informative SNP markers (AIMs) to support field activities for Mediterranean trout management and conservation. Starting from genotype data obtained from specimens sampled in the two main rivers of Molise (Central-Southern Italy), a 47-AIM panel was identified and validated on simulated and real hybrid population datasets, mainly through a machine learning approach based on a Random Forest classifier. The proposed AIM panel may be a cost-effective tool for monitoring the level of introgression between native and allochthonous trout populations for conservation purposes, and this methodology could also be applied to other species.
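Validation on simulated hybrids, as mentioned above, generally relies on drawing one gamete from each parent under Mendelian segregation. The following sketch shows one simple way to do that for unlinked SNPs coded as 0/1/2 allele counts. It is an assumption-laden illustration (independent loci, hypothetical function names and genotypes) and does not reproduce the simulation procedure used in the study.

```python
import random


def gamete(genotype, rng):
    """Draw one allele per locus: the counted allele is passed on with probability g/2."""
    return [1 if rng.random() < g / 2 else 0 for g in genotype]


def simulate_offspring(parent1, parent2, rng):
    """Offspring genotype is the sum of one gamete from each parent."""
    return [a + b for a, b in zip(gamete(parent1, rng), gamete(parent2, rng))]


rng = random.Random(42)

# Hypothetical parents fixed for alternative alleles at five diagnostic SNPs.
native = [0, 0, 0, 0, 0]
allochthonous = [2, 2, 2, 2, 2]

f1 = simulate_offspring(native, allochthonous, rng)  # first-generation hybrid
backcross = simulate_offspring(f1, native, rng)      # F1 x native backcross
print(f1, backcross)
```

Repeating the backcross step over several generations would produce individuals with progressively lower introgression, which is the kind of dataset a classifier can then be tested on.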
18
A 20-SNP Panel as a Tool for Genetic Authentication and Traceability of Pig Breeds. Animals (Basel) 2022; 12:1335. [PMID: 35681800 PMCID: PMC9179885 DOI: 10.3390/ani12111335] [Citation(s) in RCA: 1] [Impact Index Per Article: 0.5] [Received: 03/23/2022] [Revised: 05/11/2022] [Accepted: 05/18/2022] [Indexed: 11/16/2022]
Abstract
Simple Summary: Given the high economic and qualitative values of local-breed meat products, it is not uncommon that substitution or mislabeling (either fraudulent or accidental) occurs at the market level. Therefore, to protect the interests of both producers and consumers, a reliable traceability tool should be developed. Nowadays, traceability usually relies on physical labeling systems (e.g., ear tags, tattoos, or electronic transponders). These systems do not, however, have good performances when dealing with carcasses or processed meat products. Molecular markers (i.e., based on the DNA sequence) can be a solution, since DNA is easily extracted from a wide variety of animal products and parts, and is not degraded during processing, even at the high temperatures involved. The aim of this study was to identify a small number of DNA mutations for breed-traceability purposes, in particular of the Italian Nero Siciliano pig and its derived products. A small panel of 12 DNA mutations was enough to discriminate Nero Siciliano pigs from other pig breeds and from wild boars.
Abstract: Food authentication in local breeds has important implications from both an economic and a qualitative point of view. Meat products from autochthonous breeds are of premium value, but can easily incur fraudulent or accidental substitution or mislabeling. The aim of this study was to identify a small number of SNPs using the Illumina PorcineSNP60 BeadChip for breed traceability, in particular of the Italian Nero Siciliano pig and its derived products. A panel of 12 SNPs was sufficient to discriminate Nero Siciliano pig from cosmopolitan breeds and wild boars. After adding 8 SNPs, the final panel of 20 SNPs allowed us to discriminate all the breeds involved in the study, to correctly assign each individual to its breed, and, moreover, to discriminate Nero Siciliano from first-generation hybrids. Almost all livestock breeds are being genotyped with medium- or high-density SNP panels, providing a large amount of information for many applications. Here, we proposed a method to select a reduced SNP panel to be used for the traceability of pig breeds.
19
Admixture and breed traceability in European indigenous pig breeds and wild boar using genome-wide SNP data. Sci Rep 2022; 12:7346. [PMID: 35513520 PMCID: PMC9072372 DOI: 10.1038/s41598-022-10698-8] [Citation(s) in RCA: 9] [Impact Index Per Article: 4.5] [Received: 10/15/2021] [Accepted: 04/11/2022] [Indexed: 11/16/2022]
Abstract
Preserving the diversity of indigenous pig (Sus scrofa) breeds is a key factor to (i) sustain the pork chain (at both local and global scales), including the production of high-quality branded products, (ii) enrich animal biobanking and (iii) advance conservation policies. Single nucleotide polymorphism (SNP) chips offer the opportunity for whole-genome comparisons among individuals and breeds. Animals from twenty European local pig breeds, reared in nine countries (Croatia: Black Slavonian, Turopolje; France: Basque, Gascon; Germany: Schwabisch-Hällisches Schwein; Italy: Apulo Calabrese, Casertana, Cinta Senese, Mora Romagnola, Nero Siciliano, Sarda; Lithuania: Indigenous Wattle, White Old Type; Portugal: Alentejana, Bísara; Serbia: Moravka, Swallow-Bellied Mangalitsa; Slovenia: Krškopolje pig; Spain: Iberian, Majorcan Black), and three commercial breeds (Duroc, Landrace and Large White) were sampled and genotyped with the GeneSeek Genomic Profiler (GGP) 70 K HD porcine genotyping chip. A dataset of 51 wild boars from nine countries was also added, giving a total of 1186 pigs (~49 pigs/breed). The aims were to (i) investigate individual admixture ancestries and (ii) assess breed traceability via discriminant analysis of principal components (DAPC). Despite the mosaic of shared ancestries found for Nero Siciliano, Sarda and Moravka, admixture analysis indicated independent evolution for the rest of the breeds. The high prediction accuracy of DAPC marks SNP data as a reliable solution for the traceability of breed-specific pig products.
20
Recapitulating whole genome based population genetic structure for Indian wild tigers through an ancestry informative marker panel. Heredity (Edinb) 2022; 128:88-96. [PMID: 34857925 PMCID: PMC8813985 DOI: 10.1038/s41437-021-00477-y] [Citation(s) in RCA: 2] [Impact Index Per Article: 1.0] [Received: 10/29/2020] [Revised: 09/30/2021] [Accepted: 10/01/2021] [Indexed: 02/03/2023]
Abstract
Identification of genetic structure within wildlife populations has implications for their conservation and management. Accurately inferring population genetic structure requires whole-genome data across the geographical range of the species, which can be resource-intensive. A cheaper strategy is to employ a subset of markers that can efficiently recapitulate the population genetic structure inferred from the whole-genome data. Such ancestry informative markers (AIMs) based on single nucleotide polymorphisms (SNPs) have rarely been developed for endangered species such as tigers. Here, we first identify the population structure of the Indian tiger using whole-genome sequences and then develop an AIMs panel with a minimum number of SNPs that can recapitulate this structure. We identified four population clusters of Indian tigers, with North-East, North-West, and South Indian tigers forming three separate groups, and Terai and Central Indian tigers forming a single cluster. To evaluate the robustness of our AIMs panel, we applied it to a separate dataset of tigers from across India. Out of 92 SNPs present in our AIMs panel, 49 were present in the new dataset. These 49 SNPs were sufficient to recapitulate the population genetic structure obtained from the whole-genome data. To the best of our knowledge, this is the first-ever SNP-based AIMs panel for big cats, which can be used as a cost-effective alternative to whole-genome sequencing for detecting the biogeographical origin of Indian tigers. Our study can be used as a guideline for developing an AIMs panel for the management of other endangered species for which obtaining whole-genome sequences is difficult.
21
Bedhane M, van der Werf J, de las Heras-Saldana S, Lim D, Park B, Na Park M, Seung Hee R, Clark S. The accuracy of genomic prediction for meat quality traits in Hanwoo cattle when using genotypes from different SNP densities and preselected variants from imputed whole genome sequence. ANIMAL PRODUCTION SCIENCE 2022. [DOI: 10.1071/an20659] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 11/23/2022]
Abstract
Context
Genomic prediction is the use of genomic data in the estimation of genomic breeding values (GEBV) in animal breeding. In beef cattle breeding programs, genomic prediction increases the rates of genetic gain by increasing the accuracy of selection at earlier ages.
Aims
The objectives of the study were to examine the effect of single-nucleotide polymorphism (SNP) density and to evaluate the effect of using SNPs preselected from imputed whole-genome sequence for genomic prediction.
Methods
Genomic and phenotypic data from 2110 Hanwoo steers were used to predict GEBV for marbling score (MS), meat texture (MT), and meat colour (MC) traits. Three types of SNP densities including 50k, high-density (HD), and whole-genome sequence data and preselected SNPs from genome-wide association study (GWAS) were used for genomic prediction analyses. Two scenarios (independent and dependent discovery populations) were used to select top significant SNPs. The accuracy of GEBV was assessed using random cross-validation. Genomic best linear unbiased prediction (GBLUP) was used to predict the breeding values for each trait.
Key results
Our results showed very similar prediction accuracies across all SNP densities used in the study. The prediction accuracy among traits ranged from 0.29±0.05 for MC to 0.46±0.04 for MS. Depending on the trait, prediction accuracy improved by up to 5% when the SNPs preselected from the GWAS analysis were included in the prediction analysis.
Conclusions
High SNP density such as HD and the whole-genome sequence data yielded a similar prediction accuracy in Hanwoo beef cattle. Therefore, the 50K SNP chip panel is sufficient to capture the relationships in a breed with a small effective population size such as the Hanwoo cattle population. Preselected variants improved prediction accuracy when they were included in the genomic prediction model.
Implications
The genomic prediction accuracies estimated in Hanwoo cattle are moderate, and searching for SNPs that are more productive could increase the accuracy of estimated breeding values for the studied traits.
22
23
Wilmot H, Bormann J, Soyeurt H, Hubin X, Glorieux G, Mayeres P, Bertozzi C, Gengler N. Development of a genomic tool for breed assignment by comparison of different classification models: Application to three local cattle breeds. J Anim Breed Genet 2021; 139:40-61. [PMID: 34427366 DOI: 10.1111/jbg.12643] [Citation(s) in RCA: 13] [Impact Index Per Article: 4.3] [Received: 03/11/2021] [Revised: 08/06/2021] [Accepted: 08/08/2021] [Indexed: 12/11/2022]
Abstract
Assignment of individual cattle to a specific breed often cannot rely on pedigree information. This is especially the case for local breeds, for which genomic assignment tools are required to allow individuals of unknown origin to be included in their herd books. A breed assignment model can be based on two specific stages: (a) the selection of breed-informative markers and (b) the assignment of individuals to a breed with a classification method. However, the performance of combinations of the methods used in these two stages has rarely been studied until now. In this study, combinations of 16 different SNP panels with four classification methods were developed on 562 reference genotypes from 12 cattle breeds. Based on their performance, the best models were validated on three local breeds of interest. In cross-validation, 14 models had a global cross-validation accuracy higher than 90%, with a maximum of 98.22%. In validation, the best models used 7,153 or 2,005 SNPs, selected with a partial least squares-discriminant analysis (PLS-DA), and assigned individuals to breeds based on nearest shrunken centroids. The average validation sensitivities of the two best models for the three local breeds of interest were 98.33% and 97.5%. Moreover, the results reported in this study suggest that further studies should consider the PLS-DA method when selecting breed-informative SNPs.
Affiliation(s)
- Hélène Wilmot
- National Fund for Scientific Research (F.R.S.-FNRS), Brussels, Belgium
- TERRA Teaching and Research Centre, Gembloux Agro-Bio Tech, University of Liège, Gembloux, Belgium
| | - Jeanne Bormann
- Administration of Technical Agricultural Services (ASTA), Luxembourg, Grand Duchy of Luxembourg
| | - Hélène Soyeurt
- TERRA Teaching and Research Centre, Gembloux Agro-Bio Tech, University of Liège, Gembloux, Belgium
| | | | | | | | | | - Nicolas Gengler
- TERRA Teaching and Research Centre, Gembloux Agro-Bio Tech, University of Liège, Gembloux, Belgium
| |
Collapse
|
24
|
Genome-wide selection of discriminant SNP markers for breed assignment in indigenous sheep breeds. Annals of Animal Science 2021. [DOI: 10.2478/aoas-2020-0097] [Citation(s) in RCA: 2] [Impact Index Per Article: 0.7] [Indexed: 11/21/2022]
Abstract
The assignment of an individual to its true population of origin is one of the most important practical applications of genomic data in animal breeding. The aim of this study was to develop a statistical method and then to identify the minimum number of informative SNP markers from high-throughput genotyping data able to trace the true breed of unknown samples in indigenous sheep breeds. A total of 217 animals from the Zel, Lori-Bakhtiari, Afshari, Moqani and Qezel breeds and a wild-type Iranian sheep breed were genotyped using the Illumina OvineSNP50K BeadChip. After SNP quality control, principal component analysis (PCA) was used to determine how the animals were allocated to groups using all genotyped markers. The results revealed that the first principal component (PC1) separated the domestic from the wild sheep breeds, and all domestic breeds were separated from each other on PC2. The genetic distance between breeds was calculated using the FST and Reynolds methods, and the results showed that the breeds were well differentiated. A statistical method was developed using stepwise discriminant analysis (SDA) and linear discriminant analysis (LDA) to reduce the number of SNPs needed to discriminate the 6 Iranian sheep populations, and K-fold cross-validation was employed to evaluate the assignment success rate of a selected subset of SNPs. The procedure reduced the pool of markers to 201 SNPs that discriminated all sheep populations with 100% accuracy. Moreover, a discriminant analysis of principal components (DAPC) using the 201 linearly independent SNPs showed that these markers assigned all individuals to their true breed. Finally, these 201 SNPs were successfully applied to an independent out-group of 96 Baluchi sheep samples, and the results indicated that these markers correctly allocate all unknown samples to their true population of origin. In general, the results of this study indicate that the combined use of the SDA and LDA techniques represents an efficient strategy for selecting a reduced pool of highly discriminant markers.
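The FST-based differentiation mentioned in this abstract can be illustrated with a minimal Python sketch. It is not the authors' code: it assumes the simplest textbook form of Wright's FST for one biallelic SNP and two equally weighted populations, and the function names and example frequencies are hypothetical.

```python
from typing import Dict, List, Tuple


def fst_two_pops(p1: float, p2: float) -> float:
    """Wright's FST for one biallelic SNP, two equally weighted populations.

    p1 and p2 are the frequencies of the same allele in each population.
    """
    p_bar = (p1 + p2) / 2.0
    h_total = 2.0 * p_bar * (1.0 - p_bar)  # expected heterozygosity, pooled
    if h_total == 0.0:
        return 0.0  # monomorphic SNP: no differentiation to measure
    # mean expected heterozygosity within the two populations
    h_within = (2.0 * p1 * (1.0 - p1) + 2.0 * p2 * (1.0 - p2)) / 2.0
    return (h_total - h_within) / h_total


def rank_by_fst(freqs: Dict[str, Tuple[float, float]]) -> List[str]:
    """Return SNP names ordered from most to least differentiated."""
    return sorted(freqs, key=lambda name: fst_two_pops(*freqs[name]), reverse=True)


# Hypothetical allele frequencies in two breeds
example = {"snp_a": (0.8, 0.2), "snp_b": (0.5, 0.5), "snp_c": (1.0, 0.0)}
ranking = rank_by_fst(example)
```

A fixed difference (`snp_c`) gives FST = 1, identical frequencies (`snp_b`) give 0, so ranking on this statistic puts the most breed-informative markers first.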
|
25
|
Development and Validation of a Multi-Locus PCR-HRM Method for Species Identification in Mytilus Genus with Food Authenticity Purposes. Foods 2021; 10:foods10081684. [PMID: 34441462 PMCID: PMC8391999 DOI: 10.3390/foods10081684] [Citation(s) in RCA: 5] [Impact Index Per Article: 1.7] [Received: 06/07/2021] [Revised: 07/06/2021] [Accepted: 07/06/2021] [Indexed: 11/17/2022] Open
Abstract
DNA-based methods using informative markers such as single nucleotide polymorphisms (SNPs) are suitable for the reliable species identification (SI) needed to enforce compliance with seafood labelling regulations (EU No.1379/2013). We developed a panel of 10 highly informative SNPs to be genotyped by PCR-high resolution melting (HRM) for SI in the Mytilus genus through in silico and in vitro stages. Its fitness for purpose and concordance were assessed by an internal validation process and by transfer to a second laboratory. The method could identify M. chilensis, M. edulis, M. galloprovincialis and M. trossulus mussels, whether fresh, frozen or canned with brine, oil and scallop sauce, but not in preserves containing acetic acid (wine vinegar) and tomato sauce. False-positive and false-negative rates were zero. Sensitivity, expressed as limit of detection (LOD), ranged between 5 and 8 ng/μL. The method was robust against small variations in DNA quality, annealing time and temperature, primer concentration, reaction volume and HRM kit. Reference materials and 220 samples were tested in an inter-laboratory assay, obtaining an “almost perfect agreement” (κ = 0.925, p < 0.001). In conclusion, the method was suitable for the intended use and for application in the seafood industry.
|
26
|
Hayah I, Ababou M, Botti S, Badaoui B. Comparison of three statistical approaches for feature selection for fine-scale genetic population assignment in four pig breeds. Trop Anim Health Prod 2021; 53:395. [PMID: 34245361 DOI: 10.1007/s11250-021-02824-x] [Citation(s) in RCA: 1] [Impact Index Per Article: 0.3] [Received: 12/10/2020] [Accepted: 06/18/2021] [Indexed: 10/20/2022]
Abstract
BACKGROUND: Assigning animals to their corresponding breeds through breed-informative single-nucleotide polymorphisms (SNPs) is required in many fields. For instance, it is used in the traceability and authentication of meat and other livestock products. SNP information for several pig breeds is now accessible thanks to the availability of dense SNP chips. These SNP chips cover a large number of molecular markers distributed across the entire genome. To identify the pig breed from a sample of industrial meat, one must analyze a large panel of genetic markers, depending on the SNP chip used. The analysis of such large datasets requires intensive work. This leads to the idea of creating less dense chips of breed-informative markers based on a reduced number of SNPs, so that analyzing the data from genotyping these reduced chips requires less time and effort. AIM: The objective of this study is to find the most informative SNPs for discriminating between four pig breeds, namely Duroc, Landrace, Large White, and Pietrain. METHOD: The Illumina Porcine 60 k SNP chip was used to genotype SNPs distributed all over the individuals' genomes. First, we used three different statistical approaches for feature selection: (i) principal component analysis (PCA), (ii) least absolute shrinkage and selection operator (LASSO), and (iii) random forest (RF). These three approaches identified three sets of SNPs, one set per approach. Then, we combined the results of the three methods by setting up a final panel containing the SNPs that appear in all three sets. RESULTS: Separately, each method resulted in a panel with the corresponding most discriminating SNPs. The PCA, the LASSO, and the random forest with the Boruta algorithm highlighted 28,816, 50, and 286 SNPs, respectively. The number of SNPs selected by PCA is high compared to Boruta and LASSO because PCA chooses the variables while preserving as much information about the data as possible. The only downside of LASSO regression is that, among a group of correlated variables, LASSO tends to select only one variable and ignore the others regardless of their importance. In contrast to LASSO, the Boruta algorithm considers the interdependence between SNPs and selects informative variables even if they are correlated and have the same effect. The three panels shared 23 SNPs; the distribution of the individuals according to these SNPs showed a grouping of individuals of each breed in well-defined clusters without any overlap. CONCLUSIONS: The biological pathways represented by the 23 breed-informative SNPs resulting from the combination of PCA, LASSO, and Boruta should be explored in further analysis. The results provided by our study are promising for further applications of this method in other livestock animals.
Affiliation(s)
- Ichrak Hayah
- Plant and Microbial Biotechnologies, Biodiversity, and Environment (BioBio), Mohammed V University in Rabat, 4 Ibn Battouta Avenue, B.P. 1014 RP, Rabat, Morocco
- Mouna Ababou
- Laboratory of Human Pathologies, Genomic Center of Human Pathologies, Mohammed V University in Rabat, 4 Ibn Battouta Avenue, B.P. 1014 RP, Rabat, Morocco
- Sara Botti
- PTP Science Park, Via Einstein - Loc. Cascina Codazza, 26900, Lodi, Italy
- Bouabid Badaoui
- Plant and Microbial Biotechnologies, Biodiversity, and Environment (BioBio), Mohammed V University in Rabat, 4 Ibn Battouta Avenue, B.P. 1014 RP, Rabat, Morocco
|
27
|
de Medeiros LA, Ribas CC, Lima AP. Genetic Diversification of Adelphobates quinquevittatus (Anura: Dendrobatidae) and the Influence of Upper Madeira River Historical Dynamics. Evol Biol 2021. [DOI: 10.1007/s11692-021-09536-y] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 11/30/2022]
|
28
|
Strucken EM, Swaminathan M, Gibson JP. Small SNP panels for breed proportion estimation in Indian crossbred dairy cattle. J Anim Breed Genet 2021; 138:698-707. [PMID: 33687116 PMCID: PMC8519156 DOI: 10.1111/jbg.12544] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 11/24/2020] [Revised: 01/26/2021] [Accepted: 02/20/2021] [Indexed: 11/29/2022]
Abstract
Reliably identifying breed proportions in crossbred cattle on smallholder farms is a crucial step towards improving mating decisions and optimizing management in these systems. High-density genotype information can estimate higher-order breed proportions accurately but is too expensive for mass application in smallholder systems. We used high-density genotype information (777 k SNPs) of 623 crossbred cattle from India that had Holstein-Friesian (HFX) and/or Jersey and indigenous breeds in their ancestry to select a smaller number of SNPs for breed proportion estimation. The accuracy of estimates obtained from panels with 100–500 SNPs was compared to estimates based on all SNPs. Panels were selected for the highest absolute allele frequency difference between exotic dairy versus indigenous Bos indicus, or between HFX versus Jersey breeds. A step-wise pruning approach was developed, showing that an increased physical distance between markers of 8.5 Mb improved breed proportion estimation compared to a standard 1 Mb distance. A panel of 500 SNPs optimized to estimate HFX versus Jersey versus indicine ancestry was able to estimate indicine breed proportions with r² = .991, HFX proportions with r² = .979 and Jersey proportions with r² = .949. The number of markers was a deciding factor in estimation accuracy, together with the distribution of markers across the genome.
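The selection idea described in this abstract (rank SNPs by absolute allele frequency difference, then enforce a minimum physical spacing between chosen markers) might be sketched as follows. This is a simplified greedy illustration under stated assumptions, not the step-wise pruning procedure of the paper; the `Snp` record, the function name and the example values are all hypothetical.

```python
from typing import List, NamedTuple


class Snp(NamedTuple):
    name: str
    chrom: str
    pos_bp: int     # physical position in base pairs
    freq_a: float   # allele frequency in reference group A
    freq_b: float   # allele frequency in reference group B


def select_spaced_panel(snps: List[Snp], panel_size: int, min_gap_bp: int) -> List[Snp]:
    """Greedily pick SNPs with the largest |freq_a - freq_b|, skipping any
    SNP closer than min_gap_bp to an already chosen SNP on the same chromosome."""
    ranked = sorted(snps, key=lambda s: abs(s.freq_a - s.freq_b), reverse=True)
    chosen: List[Snp] = []
    for snp in ranked:
        if len(chosen) == panel_size:
            break
        too_close = any(
            c.chrom == snp.chrom and abs(c.pos_bp - snp.pos_bp) < min_gap_bp
            for c in chosen
        )
        if not too_close:
            chosen.append(snp)
    return chosen


# Hypothetical markers
markers = [
    Snp("s1", "1", 1_000_000, 0.90, 0.10),
    Snp("s2", "1", 2_000_000, 0.85, 0.10),
    Snp("s3", "1", 12_000_000, 0.70, 0.20),
    Snp("s4", "2", 1_500_000, 0.60, 0.20),
]
wide = [s.name for s in select_spaced_panel(markers, 3, 8_500_000)]
narrow = [s.name for s in select_spaced_panel(markers, 3, 1_000_000)]
```

With the wider gap, the second-ranked marker is skipped because it sits 1 Mb from the top marker, which spreads the panel across the genome at the cost of a slightly lower frequency difference per SNP.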
Affiliation(s)
- Eva M Strucken
- Centre for Genetic Analysis and Applications, School of Environmental and Rural Science, University of New England, Armidale, NSW, Australia
- John P Gibson
- Centre for Genetic Analysis and Applications, School of Environmental and Rural Science, University of New England, Armidale, NSW, Australia
|
29
|
Gebrehiwot NZ, Strucken EM, Marshall K, Aliloo H, Gibson JP. SNP panels for the estimation of dairy breed proportion and parentage assignment in African crossbred dairy cattle. Genet Sel Evol 2021; 53:21. [PMID: 33653262 PMCID: PMC7923343 DOI: 10.1186/s12711-021-00615-4] [Citation(s) in RCA: 8] [Impact Index Per Article: 2.7] [Received: 06/23/2020] [Accepted: 02/17/2021] [Indexed: 01/10/2023] Open
Abstract
BACKGROUND: Understanding the relationship between genetic admixture and phenotypic performance is crucial for the optimization of crossbreeding programs. The use of small sets of informative ancestry markers can be a cost-effective option for the estimation of breed composition and for parentage assignment in situations where pedigree recording is difficult. The objectives of this study were to develop small single nucleotide polymorphism (SNP) panels that can accurately estimate the total dairy proportion and assign parentage in both West and East African crossbred dairy cows. METHODS: Medium- and high-density SNP genotype data (Illumina BovineSNP50 and BovineHD Beadchip) for 4231 animals sampled from African crossbreds, African Bos taurus, European Bos taurus, Bos indicus, and African indigenous populations were used. For estimating breed composition, the absolute differences in allele frequency were calculated between pure ancestral breeds to identify SNPs with the highest discriminating power, and different combinations of SNPs weighted by ancestral origin were tested against estimates based on all available SNPs. For parentage assignment, informative SNPs were selected based on the highest minor allele frequency (MAF) in African crossbred populations under two scenarios: (1) parents were selected among all the animals with known genotypes, and (2) parents were selected only among the animals known to be a parent of at least one progeny. RESULTS: For the medium-density genotype data, SNPs selected for the largest differences in allele frequency between West African indigenous and European Bos taurus breeds performed best for most African crossbred populations and achieved a prediction accuracy (r²) for breed composition of 0.926 to 0.961 with 200 SNPs. For the high-density dataset, a panel with 70% of the SNPs selected on their largest difference in allele frequency between African and European Bos taurus performed best or very near best across all crossbred populations, with r² ranging from 0.978 to 0.984 with 200 SNPs. In all African crossbred populations, unambiguous parentage assignment was possible with ≥ 300 SNPs for the majority of the panels for Scenario 1 and ≥ 200 SNPs for Scenario 2. CONCLUSIONS: The identified low-cost SNP assays could overcome incomplete or inaccurate pedigree records in African smallholder systems and allow effective breeding decisions to produce progeny of desired breed composition.
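The MAF-based ranking used here for parentage panels can be illustrated with a short Python sketch. It assumes genotypes coded as 0, 1 or 2 copies of one allele with `None` for missing calls; that coding, the function names and the toy data are assumptions for illustration, not details taken from the study.

```python
from typing import Dict, List, Optional, Sequence


def minor_allele_frequency(genotypes: Sequence[Optional[int]]) -> float:
    """MAF from genotypes coded 0/1/2 (copies of one allele); None = missing."""
    called = [g for g in genotypes if g is not None]
    if not called:
        return 0.0
    freq = sum(called) / (2.0 * len(called))  # frequency of the counted allele
    return min(freq, 1.0 - freq)


def top_maf_snps(genotype_table: Dict[str, Sequence[Optional[int]]], n: int) -> List[str]:
    """Names of the n SNPs with the highest MAF in the given population."""
    ranked = sorted(
        genotype_table,
        key=lambda name: minor_allele_frequency(genotype_table[name]),
        reverse=True,
    )
    return ranked[:n]


# Hypothetical genotypes for four animals at three SNPs
table = {
    "snp_a": [0, 1, 2, 1],
    "snp_b": [2, 2, 2, 1],
    "snp_c": [0, None, 0, 1],
}
panel = top_maf_snps(table, 2)
```

SNPs with MAF near 0.5 are the most useful for parentage because a random non-parent is more likely to show an incompatible genotype at such loci.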
Affiliation(s)
- Netsanet Z. Gebrehiwot
- Centre for Genetic Analysis and Applications, School of Environmental and Rural Science, University of New England, Armidale, NSW 2351 Australia
- Eva M. Strucken
- Centre for Genetic Analysis and Applications, School of Environmental and Rural Science, University of New England, Armidale, NSW 2351 Australia
- Karen Marshall
- International Livestock Research Institute and Centre for Tropical Livestock Genetics and Health, Nairobi, Kenya
- Hassan Aliloo
- Centre for Genetic Analysis and Applications, School of Environmental and Rural Science, University of New England, Armidale, NSW 2351 Australia
- John P. Gibson
- Centre for Genetic Analysis and Applications, School of Environmental and Rural Science, University of New England, Armidale, NSW 2351 Australia
|
30
|
Kumar H, Panigrahi M, Saravanan KA, Parida S, Bhushan B, Gaur GK, Dutt T, Mishra BP, Singh RK. SNPs with intermediate minor allele frequencies facilitate accurate breed assignment of Indian Tharparkar cattle. Gene 2021; 777:145473. [PMID: 33549713 DOI: 10.1016/j.gene.2021.145473] [Citation(s) in RCA: 9] [Impact Index Per Article: 3.0] [Received: 10/03/2020] [Revised: 01/23/2021] [Accepted: 01/28/2021] [Indexed: 10/22/2022]
Abstract
The Tharparkar cattle breed is widely known for its superior milch quality and hardiness attributes. This study aimed to develop an ultra-low-density breed-specific single nucleotide polymorphism (SNP) genotype panel to accurately quantify Tharparkar populations in biological samples. We randomly selected and genotyped 72 Tharparkar animals from the Cattle & Buffalo Farm of IVRI, India. These Bovine SNP50 BeadChip genotype data were merged with online data from six indigenous cattle breeds and five taurine breeds. We used a combination of pre-selection statistics and the MAF-LD method developed in our laboratory to analyze the genotypic data obtained from 317 individuals of 12 distinct breeds and identify breed-informative SNPs for the selection of Tharparkar cattle. This methodology identified 63 unique Tharparkar-specific SNPs near intermediate gene frequencies. We report several informative SNPs in genes/QTL regions affecting phenotypes or production traits that might differentiate the Tharparkar breed.
Affiliation(s)
- Harshit Kumar
- Division of Animal Genetics, Indian Veterinary Research Institute, Izatnagar, Bareilly 243122, UP, India
- Manjit Panigrahi
- Division of Animal Genetics, Indian Veterinary Research Institute, Izatnagar, Bareilly 243122, UP, India
- K A Saravanan
- Division of Animal Genetics, Indian Veterinary Research Institute, Izatnagar, Bareilly 243122, UP, India
- Subhashree Parida
- Division of Pharmacology & Toxicology, ICAR-Indian Veterinary Research Institute, Izatnagar, Bareilly 243122, UP, India
- Bharat Bhushan
- Division of Animal Genetics, Indian Veterinary Research Institute, Izatnagar, Bareilly 243122, UP, India
- G K Gaur
- Division of Animal Genetics, Indian Veterinary Research Institute, Izatnagar, Bareilly 243122, UP, India
- Triveni Dutt
- Livestock Production and Management Section, ICAR-Indian Veterinary Research Institute, Izatnagar, Bareilly 243122, UP, India
- B P Mishra
- Division of Animal Biotechnology, ICAR-Indian Veterinary Research Institute, Izatnagar, Bareilly 243122, UP, India
- R K Singh
- Division of Animal Biotechnology, ICAR-Indian Veterinary Research Institute, Izatnagar, Bareilly 243122, UP, India
|
31
|
Momeni J, Parejo M, Nielsen RO, Langa J, Montes I, Papoutsis L, Farajzadeh L, Bendixen C, Căuia E, Charrière JD, Coffey MF, Costa C, Dall'Olio R, De la Rúa P, Drazic MM, Filipi J, Galea T, Golubovski M, Gregorc A, Grigoryan K, Hatjina F, Ilyasov R, Ivanova E, Janashia I, Kandemir I, Karatasou A, Kekecoglu M, Kezic N, Matray ES, Mifsud D, Moosbeckhofer R, Nikolenko AG, Papachristoforou A, Petrov P, Pinto MA, Poskryakov AV, Sharipov AY, Siceanu A, Soysal MI, Uzunov A, Zammit-Mangion M, Vingborg R, Bouga M, Kryger P, Meixner MD, Estonba A. Authoritative subspecies diagnosis tool for European honey bees based on ancestry informative SNPs. BMC Genomics 2021; 22:101. [PMID: 33535965 PMCID: PMC7860026 DOI: 10.1186/s12864-021-07379-7] [Citation(s) in RCA: 24] [Impact Index Per Article: 8.0] [Received: 05/29/2020] [Accepted: 01/08/2021] [Indexed: 11/10/2022] Open
Abstract
BACKGROUND: With numerous endemic subspecies representing four of its five evolutionary lineages, Europe holds a large fraction of Apis mellifera genetic diversity. This diversity and the natural distribution range have been altered by anthropogenic factors. The conservation of this natural heritage relies on the availability of accurate tools for subspecies diagnosis. Based on pool-sequence data from 2145 worker bees representing 22 populations sampled across Europe, we employed two highly discriminative approaches (PCA and FST) to select the most informative SNPs for ancestry inference. RESULTS: Using a supervised machine learning (ML) approach and a set of 3896 genotyped individuals, we could show that the 4094 selected single nucleotide polymorphisms (SNPs) provide accurate ancestry inference in European honey bees. The best ML model was the Linear Support Vector Classifier (Linear SVC), which correctly assigned most individuals to one of the 14 subspecies or different genetic origins with a mean accuracy of 96.2% ± 0.8 SD. A total of 3.8% of test individuals were misclassified, most probably due to limited differentiation between the subspecies caused by close geographical proximity, human interference with the genetic integrity of reference subspecies, or a combination thereof. CONCLUSIONS: The diagnostic tool presented here will contribute to sustainable conservation and support breeding activities in order to preserve the genetic heritage of European honey bees.
Affiliation(s)
- Jamal Momeni
- Eurofins Genomics Europe Genotyping A/S (EFEG), (Former GenoSkan A/S), Aarhus, Denmark
- Melanie Parejo
- Laboratory Genetics, University of the Basque Country (UPV/EHU), Leioa, Bilbao, Spain; Swiss Bee Research Center, Agroscope, Bern, Switzerland
- Rasmus O Nielsen
- Eurofins Genomics Europe Genotyping A/S (EFEG), (Former GenoSkan A/S), Aarhus, Denmark
- Jorge Langa
- Laboratory Genetics, University of the Basque Country (UPV/EHU), Leioa, Bilbao, Spain
- Iratxe Montes
- Laboratory Genetics, University of the Basque Country (UPV/EHU), Leioa, Bilbao, Spain
- Laetitia Papoutsis
- Laboratory of Agricultural Zoology and Entomology, Agricultural University of Athens, Athens, Greece
- Leila Farajzadeh
- Department of Molecular Biology and Genetics, Aarhus University, Aarhus, Denmark
- Christian Bendixen
- Department of Molecular Biology and Genetics, Aarhus University, Aarhus, Denmark
- Eliza Căuia
- Institutul de Cercetare Dezvoltare pentru Apicultura SA, Bucharest, Romania
- Cecilia Costa
- CREA Research Centre for Agriculture and Environment, Bologna, Italy
- Janja Filipi
- Department of Ecology, Agronomy and Aquaculture, University of Zadar, Zadar, Croatia
- Ales Gregorc
- Faculty of Agriculture and Life Sciences, University of Maribor, Maribor, Slovenia
- Fani Hatjina
- Department of Apiculture, Agricultural Organization 'DEMETER', Thessaloniki, Greece
- Rustem Ilyasov
- Division of Life Sciences, Major of Biological Sciences, and Convergence Research Center for Insect Vectors, Incheon National University, Incheon, Korea; Institute of Biochemistry and Genetics, Ufa Federal Research Centre of the Russian Academy of Sciences, Ufa, Russia
- David Mifsud
- Division of Rural Sciences and Food Systems, Institute of Earth Systems, University of Malta, Msida, Malta
- Rudolf Moosbeckhofer
- Österreichische Agentur für Gesundheit und Ernährungssicherheit GmbH, Wien, Austria
- Alexei G Nikolenko
- Institute of Biochemistry and Genetics, Ufa Federal Research Centre of the Russian Academy of Sciences, Ufa, Russia
- Plamen Petrov
- Agricultural University of Plovdiv, Plovdiv, Bulgaria
- M Alice Pinto
- Centro de Investigação de Montanha (CIMO), Instituto Politécnico de Bragança, Bragança, Portugal
- Aleksandr V Poskryakov
- Institute of Biochemistry and Genetics, Ufa Federal Research Centre of the Russian Academy of Sciences, Ufa, Russia
- Adrian Siceanu
- Institutul de Cercetare Dezvoltare pentru Apicultura SA, Bucharest, Romania
- Aleksandar Uzunov
- Landesbetrieb Landwirtschaft Hessen, Bee Institute Kirchhain, Kirchhain, Germany; Faculty of Agricultural Sciences and Food, University Ss. Cyril and Methodius, Skopje, Republic of Macedonia
- Rikke Vingborg
- Eurofins Genomics Europe Genotyping A/S (EFEG), (Former GenoSkan A/S), Aarhus, Denmark
- Maria Bouga
- Laboratory of Agricultural Zoology and Entomology, Agricultural University of Athens, Athens, Greece
- Per Kryger
- Department of Agroecology, Aarhus University, Slagelse, Denmark
- Marina D Meixner
- Landesbetrieb Landwirtschaft Hessen, Bee Institute Kirchhain, Kirchhain, Germany
- Andone Estonba
- Laboratory Genetics, University of the Basque Country (UPV/EHU), Leioa, Bilbao, Spain
|
32
|
Estimating breed composition for pigs: A case study focused on Mangalitsa pigs and two methods. Livest Sci 2021. [DOI: 10.1016/j.livsci.2021.104398] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 11/18/2022]
|
33
|
Identification of Ancestry Informative Marker (AIM) Panels to Assess Hybridisation between Feral and Domestic Sheep. Animals (Basel) 2020; 10:ani10040582. [PMID: 32235592 PMCID: PMC7222383 DOI: 10.3390/ani10040582] [Citation(s) in RCA: 8] [Impact Index Per Article: 2.0] [Received: 03/01/2020] [Revised: 03/21/2020] [Accepted: 03/25/2020] [Indexed: 11/30/2022] Open
Abstract
Simple Summary: Once present in the entirety of Europe, mouflon (wild sheep) became extinct due to intense hunting, but remnant populations survived and became feral on the Mediterranean islands of Corsica and Sardinia. Although now protected by regional laws, Sardinian mouflon is threatened by crossbreeding with domestic sheep, causing genetic hybridisation. The spread of domestic genes can be detrimental for wild populations as it dilutes the genetic features that characterise them. This work aimed to identify diagnostic tools that could be applied to monitor the level of hybridisation between mouflon and domestic sheep. Tens of thousands of genetic markers known as single nucleotide polymorphisms (SNPs) were screened, and we identified the smallest number of SNPs necessary to discriminate between pure mouflon and sheep. We produced four SNP panels of different sizes that were able to assess the hybridisation level of a mouflon, and we verified that the efficacy of the SNP panels is independent of the domestic sheep breed involved in the hybrid. The implementation of these results into actual diagnostic tools will help the conservation of this unique and irreplaceable mouflon population, and the methodology applied can easily be transferred to other case studies of interest.
Abstract: Hybridisation of wild populations with their domestic counterparts can lead to the loss of wildtype genetic integrity, outbreeding depression, and loss of adaptive features. The Mediterranean island of Sardinia hosts one of the last extant autochthonous European mouflon (Ovis aries musimon) populations. Although conservation policies, including reintroduction plans, have been enforced to preserve Sardinian mouflon, crossbreeding with domestic sheep has been documented. We identified panels of single nucleotide polymorphisms (SNPs) that could act as ancestry informative markers able to assess admixture in feral × domestic sheep hybrids. The medium-density SNP array genotyping data of Sardinian mouflon and domestic sheep (O. aries aries) showing pure ancestry were used as references. We applied a two-step selection algorithm to these data, consisting of preselection via Principal Component Analysis followed by a supervised machine learning classification method based on random forest, to develop SNP panels of various sizes. We generated ancestry informative marker (AIM) panels and tested their ability to assess admixture in mouflon × domestic sheep hybrids both in simulated and real populations of known ancestry proportions. All the AIM panels recorded high correlations with the ancestry proportion computed using the full medium-density SNP array. The AIM panels proposed here may be used by conservation practitioners as diagnostic tools to exclude hybrids from reintroduction plans and improve conservation strategies for mouflon populations.
|
34
|
A machine learning approach for the identification of population-informative markers from high-throughput genotyping data: application to several pig breeds. Animal 2019; 14:223-232. [PMID: 31603060 DOI: 10.1017/s1751731119002167] [Citation(s) in RCA: 27] [Impact Index Per Article: 5.4] [Indexed: 11/07/2022] Open
Abstract
Single nucleotide polymorphisms (SNPs) able to describe population differences can be used for important applications in livestock, including breed assignment of individual animals, authentication of mono-breed products and parentage verification, among several other applications. To identify the most discriminating SNPs among the thousands of markers in the available commercial SNP chip tools, several methods have been used. Random forest (RF) is a machine learning technique that has been proposed for this purpose. In this study, we used RF to analyse PorcineSNP60 BeadChip array genotyping data obtained from a total of 2737 pigs of 7 Italian pig breeds (3 cosmopolitan-derived breeds: Italian Large White, Italian Duroc and Italian Landrace, and 4 autochthonous breeds: Apulo-Calabrese, Casertana, Cinta Senese and Nero Siciliano) to identify breed-informative, reduced SNP panels using the Mean Decrease in the Gini Index and the Mean Decrease in Accuracy parameters with stability evaluation. Other reduced informative SNP panels were obtained using Delta, Fixation index and principal component analysis statistics, and their performances were compared with those of the RF-defined panels using the RF classification method and its derived Out Of Bag rates and correct prediction proportions. Therefore, the performances of a total of six reduced panels were evaluated. The correct assignment of the animals to their breed was close to 100% for all tested approaches. Porcine chromosome 8 harboured the largest number of selected SNPs across all panels. Many SNPs were included in genomic regions in which previous studies identified signatures of selection or genes (e.g. ESR1, KITL and LCORL) that could contribute to explaining, at least in part, phenotypically or economically relevant traits that might differentiate cosmopolitan and autochthonous pig breeds. Random forest used as a preselection statistic highlighted informative SNPs that were not the same as those identified by other methods. This might be due to specific features of this machine learning methodology. It will be interesting to explore whether adapting RF methods for the identification of selection signature regions could describe population-specific features that are not captured by other approaches.
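The Gini-based importance mentioned in this abstract builds on a simple quantity: how much a single genotype split reduces the Gini impurity of the breed labels. The Python sketch below shows only that building block for one SNP and one threshold; a random forest would average such decreases over many nodes and trees. The 0/1/2 genotype coding, the function names and the toy data are assumptions made for illustration.

```python
from collections import Counter
from typing import Sequence


def gini_impurity(labels: Sequence[str]) -> float:
    """Gini impurity of a set of class labels (0 when all labels agree)."""
    n = len(labels)
    if n == 0:
        return 0.0
    return 1.0 - sum((count / n) ** 2 for count in Counter(labels).values())


def gini_decrease(genotypes: Sequence[int], breeds: Sequence[str], threshold: int) -> float:
    """Impurity decrease from splitting animals on genotype <= threshold."""
    left = [b for g, b in zip(genotypes, breeds) if g <= threshold]
    right = [b for g, b in zip(genotypes, breeds) if g > threshold]
    n = len(breeds)
    weighted_children = (len(left) / n) * gini_impurity(left) + (
        len(right) / n
    ) * gini_impurity(right)
    return gini_impurity(breeds) - weighted_children


# Hypothetical data: four animals from two breeds
breeds = ["A", "A", "B", "B"]
informative = gini_decrease([0, 0, 2, 2], breeds, threshold=0)
uninformative = gini_decrease([0, 2, 0, 2], breeds, threshold=0)
```

A SNP whose genotypes line up with breed removes all impurity in one split, while a SNP unrelated to breed leaves the impurity unchanged.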
|
35
|
Hulsegge I, Schoon M, Windig J, Neuteboom M, Hiemstra SJ, Schurink A. Development of a genetic tool for determining breed purity of cattle. Livest Sci 2019. [DOI: 10.1016/j.livsci.2019.03.002] [Citation(s) in RCA: 7] [Impact Index Per Article: 1.4] [Indexed: 10/27/2022]
|
36
|
Ahmad SF, Panigrahi M, Ali A, Dar RR, Narayanan K, Bhushan B. Evaluation of two bovine SNP genotyping arrays for breed clustering and stratification analysis in well-known taurine and indicine breeds. Anim Biotechnol 2019; 31:268-275. [PMID: 30857468 DOI: 10.1080/10495398.2019.1578227] [Citation(s) in RCA: 3] [Impact Index Per Article: 0.6] [Indexed: 12/22/2022]
Abstract
The present study aimed to evaluate the efficiency of two Bovine SNP genotyping arrays (i.e., 50 K and HD) for breed clustering and stratification-related studies in taurine and indicine breeds. The whole-genome SNP data at the two densities were assembled into three datasets (A, B and C). Dataset A (N = 213) included 50 K genotypic data for five taurine (Holstein-Friesian, Guernsey, Brown Swiss, Angus and Jersey) and two indicine (Gir and Nellore) breeds. Dataset B (N = 241) included the same breeds with HD density data. Dataset C (N = 299) included 50 K SNP genotypic data for six taurine (Holstein-Friesian, Jersey, Guernsey, Brown Swiss, Angus and Hereford) and six indicine (Hariana, Kankrej, Brahman, Nellore, Sahiwal and Gir) breeds. The analysis was done using the ADMIXTURE program (bioinformatics-based) with cross-validation errors, and Principal Component Analysis (statistical analysis). The proportion of polymorphic markers and the minor allele frequencies were assessed for each breed. The proportion of polymorphic markers was consistently higher in taurine breeds than in breeds from the indicine group. Minor allele frequency estimates and ADMIXTURE results showed differential patterns for the two lineages. However, no significant increase in the accuracy of genomic clustering was found on moving from 50 K to HD density data.
Affiliation(s)
- Sheikh Firdous Ahmad
- Division of Animal Genetics, ICAR-Indian Veterinary Research Institute, Bareilly, UP, India
- Manjit Panigrahi
- Division of Animal Genetics, ICAR-Indian Veterinary Research Institute, Bareilly, UP, India
- Ajaz Ali
- Division of Animal Reproduction, ICAR-Indian Veterinary Research Institute, Bareilly, UP, India
- Rouf Rashid Dar
- Division of Animal Reproduction, ICAR-Indian Veterinary Research Institute, Bareilly, UP, India
- Krishnaswamy Narayanan
- Division of Animal Reproduction, ICAR-Indian Veterinary Research Institute, Bareilly, UP, India
- Bharat Bhushan
- Division of Animal Genetics, ICAR-Indian Veterinary Research Institute, Bareilly, UP, India
|
37
|
Das R, Roy R, Venkatesh N. Using Ancestry Informative Markers (AIMs) to Detect Fine Structures Within Gorilla Populations. Front Genet 2019; 10:43. [PMID: 30800141 PMCID: PMC6375890 DOI: 10.3389/fgene.2019.00043] [Citation(s) in RCA: 3] [Impact Index Per Article: 0.6] [Received: 09/03/2018] [Accepted: 01/21/2019] [Indexed: 12/04/2022]
Abstract
The knowledge of ancestral origin is monumental in conservation of endangered animals since it can aid in preservation of population level genetic integrity and prevent inbreeding among related individuals. Despite maintenance of studbook, the biogeographical affiliation of most captive gorillas is largely unknown, which has constrained management of captive gorillas aiming at maximizing genetic diversity at the population level. In recent years, ancestry informative markers (AIMs) have been successfully employed for the inference of genomic ancestry in a wide range of studies in evolutionary genetics, biomedical research, genetic stock identification, introgression analysis and forensic analyses. In this study, we sought to derive the AIMs yielding the most cohesive and faithful understanding of biogeographical affiliation of query gorillas. To this end, we compared three commonly used AIMs-determining methods, namely Infocalc, FST, and Smart Principal Component Analysis (SmartPCA), with ADMIXTURE, using gorilla genome data available through the Great Ape Genome Project database. Our findings suggest that the SNPs that were detected by at least three of the four AIMs-determining approaches (N = 1,531) are likely the most suitable for delineation of gorilla AIMs. This set recapitulated the finer structure within western lowland gorilla genomes with a high degree of precision. We further validated the robustness of our results using a randomized negative control containing the same number of SNPs. To the best of our knowledge, this is the first report of an AIMs panel for gorillas that may aid in developing cost-effective resources for large-scale demographic analyses, and greatly help in conservation of this charismatic mega-fauna.
Affiliation(s)
- Ranajit Das
- Manipal Centre for Natural Sciences, Manipal Academy of Higher Education, Manipal, India
- Ria Roy
- Department of Biotechnology Engineering, Sahrdaya College of Engineering and Technology, Kodakara, India
- Neha Venkatesh
- Department of Genetics, University of Mysore, Mysore, India
|
38
|
Kerr Q, Fuentes‐Pardo AP, Kho J, McDermid JL, Ruzzante DE. Temporal stability and assignment power of adaptively divergent genomic regions between herring (Clupea harengus) seasonal spawning aggregations. Ecol Evol 2019; 9:500-510. [PMID: 30680131 PMCID: PMC6342187 DOI: 10.1002/ece3.4768] [Citation(s) in RCA: 6] [Impact Index Per Article: 1.2] [Received: 10/18/2018] [Revised: 11/07/2018] [Accepted: 11/12/2018] [Indexed: 11/07/2022]
Abstract
Atlantic herring (Clupea harengus), a vital ecosystem component and target of the largest Northwest Atlantic pelagic fishery, undergo seasonal spawning migrations that result in elusive sympatric population structure. Herring spawn mostly in fall or spring, and genomic differentiation was recently detected between these groups. Here we used a subset of this differentiation, 66 single nucleotide polymorphisms (SNPs) to analyze the temporal dynamics of this local adaptation and the applicability of SNP subsets in stock assessment. We showed remarkable temporal stability of genomic differentiation corresponding to spawning season, between samples taken a decade apart (2005 N = 90 vs. 2014 N = 71) in the Gulf of St. Lawrence, and new evidence of limited interbreeding between spawning components. We also examined an understudied and overexploited herring population in Bras d'Or lake (N = 97); using highly reduced SNP panels (N SNPs > 6), we verified little-known sympatric spawning populations within this unique inland sea. These results describe consistent local adaptation, arising from asynchronous reproduction in a migratory and dynamic marine species. Our research demonstrates the efficiency and precision of SNP-based assessments of sympatric subpopulations; and indeed, this temporally stable local adaptation underlines the importance of such fine-scale management practices.
Affiliation(s)
- Quentin Kerr
- Department of Biology, Dalhousie University, Halifax, Nova Scotia, Canada
- James Kho
- Department of Biology, Dalhousie University, Halifax, Nova Scotia, Canada
- Jenni L. McDermid
- Marine Fish and Mammals Section, Fisheries and Oceans Canada, Gulf Fisheries Centre, Moncton, New Brunswick, Canada
|
39
|
Jorde PE, Synnes A, Espeland SH, Sodeland M, Knutsen H. Can we rely on selected genetic markers for population identification? Evidence from coastal Atlantic cod. Ecol Evol 2018; 8:12547-12558. [PMID: 30619564 PMCID: PMC6308871 DOI: 10.1002/ece3.4648] [Citation(s) in RCA: 8] [Impact Index Per Article: 1.3] [Received: 05/18/2018] [Revised: 09/30/2018] [Accepted: 10/03/2018] [Indexed: 01/03/2023]
Abstract
The use of genetic markers under putative selection in population studies carries the potential for erroneous identification of populations and misassignment of individuals to population of origin. Selected markers are nevertheless attractive, especially in marine organisms that are characterized by weak population structure at neutral loci. Highly fecund species may tolerate the cost of strong selective mortality during early life stages, potentially leading to a shift in offspring genotypes away from the parental proportions. In Atlantic cod, recent genetic studies have uncovered different genotype clusters apparently representing phenotypically cryptic populations that coexist in coastal waters. Here, we tested if a high-graded SNP panel specifically designed to classify individual cod to population of origin may be unreliable because of natural selection acting on the SNPs or their linked background. Temporal samples of cod were collected from two fjords, starting at the earliest life stage (pelagic eggs) and carried on until late autumn (bottom-settled juveniles), covering the period during summer of high natural mortality. Despite the potential for selective mortality during the study period, we found no evidence for selection, as both cod types occurred throughout the season, already in the earliest egg samples, and there was no evidence for a shift during the season in the proportions of one or the other type. We conclude that high-graded marker panels under putative natural selection represent a valid and useful tool for identifying biological population structure in this highly fecund species and presumably in others.
Affiliation(s)
- Per Erik Jorde
- Department of Biosciences, Centre for Ecological and Evolutionary Synthesis, University of Oslo, Oslo, Norway
- Institute of Marine Research, His, Norway
- Ann‐Elin Synnes
- Centre of Coastal Research, University of Agder, Kristiansand, Norway
- Sigurd Heiberg Espeland
- Institute of Marine Research, His, Norway
- Centre of Coastal Research, University of Agder, Kristiansand, Norway
- Marte Sodeland
- Centre of Coastal Research, University of Agder, Kristiansand, Norway
- Halvor Knutsen
- Institute of Marine Research, His, Norway
- Centre of Coastal Research, University of Agder, Kristiansand, Norway
|
40
|
Henriques D, Parejo M, Vignal A, Wragg D, Wallberg A, Webster MT, Pinto MA. Developing reduced SNP assays from whole-genome sequence data to estimate introgression in an organism with complex genetic patterns, the Iberian honeybee (Apis mellifera iberiensis). Evol Appl 2018; 11:1270-1282. [PMID: 30151039 PMCID: PMC6099811 DOI: 10.1111/eva.12623] [Citation(s) in RCA: 23] [Impact Index Per Article: 3.8] [Received: 11/01/2017] [Accepted: 02/11/2018] [Indexed: 01/01/2023]
Abstract
The most important managed pollinator, the honeybee (Apis mellifera L.), has been subject to a growing number of threats. In western Europe, one such threat is large-scale introductions of commercial strains (C-lineage ancestry), which is leading to introgressive hybridization and even the local extinction of native honeybee populations (M-lineage ancestry). Here, we developed reduced assays of highly informative SNPs from 176 whole genomes to estimate C-lineage introgression in the most diverse and evolutionarily complex subspecies in Europe, the Iberian honeybee (Apis mellifera iberiensis). We started by evaluating the effects of sample size and sampling a geographically restricted area on the number of highly informative SNPs. We demonstrated that a bias in the number of fixed SNPs (FST = 1) is introduced when the sample size is small (N ≤ 10) and when sampling only captures a small fraction of a population's genetic diversity. These results underscore the importance of having a representative sample when developing reliable reduced SNP assays for organisms with complex genetic patterns. We used a training data set to design four independent SNP assays selected from pairwise FST between the Iberian and C-lineage honeybees. The designed assays, which were validated in holdout and simulated hybrid data sets, proved to be highly accurate and can be readily used for monitoring populations not only in the native range of A. m. iberiensis in Iberia but also in the introduced range in the Balearic islands, Macaronesia and South America, in a time- and cost-effective manner. While our approach used the Iberian honeybee as model system, it has a high value in a wide range of scenarios for the monitoring and conservation of potentially hybridized domestic and wildlife populations.
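The small-sample bias described in this abstract can be illustrated with a minimal sketch. The abstract does not state which FST estimator was used, so the heterozygosity-based Wright formula below is an assumption, as are the hypothetical true allele frequencies (0.95 and 0.05), the helper names, and the sample sizes. The sketch only shows why SNPs that are not truly fixed can appear fixed (FST = 1) when few individuals are sampled.

```python
import random


def wright_fst(p1, p2):
    """FST for one biallelic SNP from two equally weighted allele frequencies.

    Assumed estimator: (H_T - H_S) / H_T with expected heterozygosities.
    """
    p_bar = (p1 + p2) / 2
    h_total = 2 * p_bar * (1 - p_bar)
    if h_total == 0:
        return 0.0  # SNP is monomorphic across both populations
    h_within = (2 * p1 * (1 - p1) + 2 * p2 * (1 - p2)) / 2
    return (h_total - h_within) / h_total


def sampled_freq(true_freq, n_individuals, rng):
    """Allele frequency observed in a sample of diploid individuals."""
    n_alleles = 2 * n_individuals
    count = sum(rng.random() < true_freq for _ in range(n_alleles))
    return count / n_alleles


def fraction_apparently_fixed(n_individuals, n_snps=500, seed=1):
    """Share of simulated SNPs that look fixed (FST = 1) at a given sample size."""
    rng = random.Random(seed)
    fixed = 0
    for _ in range(n_snps):
        # Hypothetical true frequencies: strongly differentiated, but not fixed.
        p1 = sampled_freq(0.95, n_individuals, rng)
        p2 = sampled_freq(0.05, n_individuals, rng)
        if wright_fst(p1, p2) == 1.0:
            fixed += 1
    return fixed / n_snps


if __name__ == "__main__":
    for n in (5, 100):
        print(n, fraction_apparently_fixed(n))
```

With these assumed frequencies, a sample of five individuals per population reports many SNPs as fixed, whereas a sample of 100 reports almost none, which is the direction of the bias the abstract describes.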
Affiliation(s)
- Dora Henriques
- Mountain Research Centre (CIMO), Polytechnic Institute of Bragança, Bragança, Portugal
- Centre of Molecular and Environmental Biology (CBMA), University of Minho, Braga, Portugal
- Melanie Parejo
- Agroscope, Swiss Bee Research Centre, Bern, Switzerland
- Institute of Bee Health, Vetsuisse Faculty, University of Bern, Bern, Switzerland
- Alain Vignal
- GenPhySE, Université de Toulouse, INRA, INPT, INP‐ENVT, Castanet Tolosan, France
- David Wragg
- The Roslin Institute, University of Edinburgh, Edinburgh, UK
- Andreas Wallberg
- Department of Medical Biochemistry and Microbiology, Science for Life Laboratory, Uppsala University, Uppsala, Sweden
- Matthew T. Webster
- Department of Medical Biochemistry and Microbiology, Science for Life Laboratory, Uppsala University, Uppsala, Sweden
- M. Alice Pinto
- Mountain Research Centre (CIMO), Polytechnic Institute of Bragança, Bragança, Portugal
|
41
|
Genetic structure of six cattle populations revealed by transcriptome-wide SNPs and gene expression. Genes Genomics 2018; 40:715-724. [PMID: 29934811 PMCID: PMC6015124 DOI: 10.1007/s13258-018-0677-1] [Citation(s) in RCA: 1] [Impact Index Per Article: 0.2] [Received: 05/26/2017] [Accepted: 02/27/2018] [Indexed: 01/29/2023]
Abstract
China has abundant cattle breeds/populations, and the systematic discovery of genomic variants is essential for marker-assisted selection and conservation of genetic resources. In the present study, we employed whole transcriptome sequencing (RNA-Seq) to reveal the genetic structure among six Chinese cattle populations according to transcriptome-wide SNPs and gene expression. A total of 68,094 variants, consisting of 61,754 SNPs and 6340 InDels, were detected and widely distributed among all chromosomes, and they revealed clear patterns of population structure. We also found significantly different densities of variant distribution among genes. Additionally, we assembled a total of 15,992 genes and detected obvious differences in expression profiles among populations. In contrast to genomic variants, gene expression levels failed to support the expected population structure. Here, we provide a global landscape of the differentially expressed genes among these cattle populations.
|
42
|
Gobena M, Elzo MA, Mateescu RG. Population Structure and Genomic Breed Composition in an Angus-Brahman Crossbred Cattle Population. Front Genet 2018; 9:90. [PMID: 29636769 PMCID: PMC5881247 DOI: 10.3389/fgene.2018.00090] [Citation(s) in RCA: 18] [Impact Index Per Article: 3.0] [Received: 11/07/2017] [Accepted: 03/05/2018] [Indexed: 12/27/2022]
Abstract
Crossbreeding is a common strategy used in tropical and subtropical regions to enhance beef production, and having accurate knowledge of breed composition is essential for the success of a crossbreeding program. Although pedigree records have been traditionally used to obtain the breed composition of crossbred cattle, the accuracy of pedigree-based breed composition can be reduced by inaccurate and/or incomplete records and Mendelian sampling. Breed composition estimation from genomic data has multiple advantages including higher accuracy without being affected by missing, incomplete, or inaccurate records and the ability to be used as independent authentication of breed in breed-labeled beef products. The present study was conducted with 676 Angus–Brahman crossbred cattle with genotype and pedigree information to evaluate the feasibility and accuracy of using genomic data to determine breed composition. We used genomic data in parametric and non-parametric methods to detect population structure due to differences in breed composition while accounting for the confounding effect of close familial relationships. By applying principal component analysis (PCA) and the maximum likelihood method of ADMIXTURE to genomic data, it was possible to successfully characterize population structure resulting from heterogeneous breed ancestry, while accounting for close familial relationships. PCA results offered additional insight into the different hierarchies of genetic variation structuring. The first principal component was strongly correlated with Angus–Brahman proportions, and the second represented variation within animals that have a relatively more extended Brangus lineage—indicating the presence of a distinct pattern of genetic variation in these cattle. Although there was strong agreement between breed proportions estimated from pedigree and genetic information, there were significant discrepancies between these two methods for certain animals. 
This was most likely due to inaccuracies in the pedigree-based estimation of breed composition, which supported the case for using genomic information to complement and/or replace pedigree information when estimating breed composition. Comparison with a supervised analysis where purebreds are used as the training set suggests that accurate predictions can be achieved even in the absence of purebred population information.
Affiliation(s)
- Mesfin Gobena
- Department of Animal Sciences, University of Florida, Gainesville, FL, United States
- Mauricio A Elzo
- Department of Animal Sciences, University of Florida, Gainesville, FL, United States
- Raluca G Mateescu
- Department of Animal Sciences, University of Florida, Gainesville, FL, United States
|
43
|
Preselection statistics and Random Forest classification identify population informative single nucleotide polymorphisms in cosmopolitan and autochthonous cattle breeds. Animal 2018. [DOI: 10.1017/s1751731117001355] [Citation(s) in RCA: 14] [Impact Index Per Article: 2.3] [Indexed: 12/14/2022]
|
44
|
Kavakiotis I, Samaras P, Triantafyllidis A, Vlahavas I. FIFS: A data mining method for informative marker selection in high dimensional population genomic data. Comput Biol Med 2017; 90:146-154. [DOI: 10.1016/j.compbiomed.2017.09.020] [Citation(s) in RCA: 7] [Impact Index Per Article: 1.0] [Received: 08/02/2017] [Revised: 08/29/2017] [Accepted: 09/26/2017] [Indexed: 12/16/2022]
|
45
|
Penalized classification for optimal statistical selection of markers from high-throughput genotyping: application in sheep breeds. Animal 2017; 12:1118-1125. [PMID: 29061210 DOI: 10.1017/s175173111700266x] [Citation(s) in RCA: 1] [Impact Index Per Article: 0.1] [Indexed: 11/07/2022]
Abstract
The identification of individuals' breed of origin has several practical applications in livestock and is useful in different biological contexts such as conservation genetics, breeding and authentication of animal products. In this paper, penalized multinomial regression was applied to identify the minimum number of single nucleotide polymorphisms (SNPs) from high-throughput genotyping data for individual assignment to dairy sheep breeds reared in Sicily. The combined use of penalized multinomial regression and stability selection reduced the number of SNPs required to 48. A final validation step on an independent population was carried out, with 100% of individuals correctly classified. The results of independent analyses, such as admixture, FST, principal component analysis and random forest, confirmed the ability of these methods to select distinctive markers. The identified SNPs may constitute a starting point for the development of a SNP-based identification test as a tool for breed assignment and traceability of animal products.
|
46
|
Tonussi RL, Silva RMDO, Magalhães AFB, Espigolan R, Peripolli E, Olivieri BF, Feitosa FLB, Lemos MVA, Berton MP, Chiaia HLJ, Pereira ASC, Lôbo RB, Bezerra LAF, Magnabosco CDU, Lourenço DAL, Aguilar I, Baldi F. Application of single step genomic BLUP under different uncertain paternity scenarios using simulated data. PLoS One 2017; 12:e0181752. [PMID: 28957330 PMCID: PMC5619718 DOI: 10.1371/journal.pone.0181752] [Citation(s) in RCA: 12] [Impact Index Per Article: 1.7] [Received: 09/22/2016] [Accepted: 07/06/2017] [Indexed: 11/26/2022]
Abstract
The objective of this study was to investigate the application of BLUP and single step genomic BLUP (ssGBLUP) models in different scenarios of paternity uncertainty with different strategies of scaling the G matrix to match the A22 matrix, using simulated data for beef cattle. Genotypes, pedigree, and phenotypes for age at first calving (AFC) and weight at 550 days (W550) were simulated using heritabilities based on real data (0.12 for AFC and 0.34 for W550). Paternity uncertainty scenarios using 0, 25, 50, 75, and 100% of multiple sires (MS) were studied. The simulated genome had a total length of 2,333 cM, containing 735,293 biallelic markers and 7,000 QTLs randomly distributed over the 29 BTA. It was assumed that QTLs explained 100% of the genetic variance. For QTL, the number of alleles per locus randomly ranged from two to four. The BLUP model that considers phenotypic and pedigree data, and the ssGBLUP model that combines phenotypic, pedigree and genomic information were used for genetic evaluations. Four ways of scaling the mean of the genomic matrix (G) to match the mean of the pedigree relationship matrix among genotyped animals (A22) were tested. Accuracy, bias, and inflation were investigated for five groups of animals: ALL = all animals; BULL = only bulls; GEN = genotyped animals; FEM = females; and YOUNG = young males. With the BLUP model, the accuracies of genetic evaluations decreased for both traits as the proportion of unknown sires in the population increased. The EBV accuracy reduction was higher for the GEN and YOUNG groups. By analyzing the scenarios for YOUNG (from 0 to 100% of MS), the decrease was 87.8 and 86% for AFC and W550, respectively. When applying the ssGBLUP model, the accuracies of genetic evaluation also decreased for both traits as the proportion of MS in the pedigree increased. However, the accuracy reduction was smaller than that observed for the BLUP model. 
Using the same comparison (scenario 0 to 100% of MS), the accuracy reductions were 38 and 44.6% for AFC and W550, respectively. There were no differences between the strategies for scaling the G matrix for ALL, BULL, and FEM groups under the different scenarios with missing pedigree. These results indicate that the uninformative part of the A22 matrix and genotyped animals with paternity uncertainty did not influence the scaling of the G matrix. On the basis of the results, it is important to have a G matrix on the same scale as the A22 matrix, especially for the evaluation of young animals in situations with missing pedigree information. In these situations, the ssGBLUP model is an appropriate alternative to obtain a more reliable and less biased estimate of breeding values, especially for young animals with few or no phenotypic records. For accurate and unbiased genomic predictions with ssGBLUP, it is necessary to ensure that the G matrix is compatible with the A22 matrix, even in situations with paternity uncertainty.
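The idea of putting G on the same scale as A22 can be sketched as a linear rescaling. The abstract does not describe the four scaling strategies that were tested, so the version below is an assumption chosen for illustration: it finds constants a and b such that G* = a + b·G has the same mean diagonal and the same mean off-diagonal as A22. The function names and the small example matrices are hypothetical.

```python
def mean_diag(m):
    """Mean of the diagonal elements of a square matrix (list of lists)."""
    n = len(m)
    return sum(m[i][i] for i in range(n)) / n


def mean_offdiag(m):
    """Mean of the off-diagonal elements of a square matrix."""
    n = len(m)
    total = sum(m[i][j] for i in range(n) for j in range(n) if i != j)
    return total / (n * (n - 1))


def scale_g_to_a22(g, a22):
    """Return G* = a + b*G whose mean diagonal and off-diagonal match A22.

    Solves the two equations
        a + b * mean_diag(G)    = mean_diag(A22)
        a + b * mean_offdiag(G) = mean_offdiag(A22)
    and applies the result element-wise. Assumes mean_diag(G) differs
    from mean_offdiag(G), otherwise the system has no unique solution.
    """
    d_g, o_g = mean_diag(g), mean_offdiag(g)
    d_a, o_a = mean_diag(a22), mean_offdiag(a22)
    b = (d_a - o_a) / (d_g - o_g)
    a = d_a - b * d_g
    return [[a + b * x for x in row] for row in g]


# Hypothetical 3-animal example: G is centred near zero off the diagonal,
# A22 carries pedigree relationships between the same animals.
g = [[0.9, -0.1, 0.0],
     [-0.1, 1.1, 0.1],
     [0.0, 0.1, 1.0]]
a22 = [[1.0, 0.5, 0.25],
       [0.5, 1.0, 0.25],
       [0.25, 0.25, 1.0]]
g_scaled = scale_g_to_a22(g, a22)

if __name__ == "__main__":
    for row in g_scaled:
        print([round(x, 3) for x in row])
```

Matching two moments rather than a single overall mean is only one possible design; a simpler variant would add one constant to every element of G so that the overall means agree.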
Affiliation(s)
- Rafael Lara Tonussi
- Department of Animal Science, School of Agricultural and Veterinarian Sciences, Jaboticabal, São Paulo, Brazil
- Rafael Espigolan
- Department of Animal Science, School of Agricultural and Veterinarian Sciences, Jaboticabal, São Paulo, Brazil
- Elisa Peripolli
- Department of Animal Science, School of Agricultural and Veterinarian Sciences, Jaboticabal, São Paulo, Brazil
- Bianca Ferreira Olivieri
- Department of Animal Science, School of Agricultural and Veterinarian Sciences, Jaboticabal, São Paulo, Brazil
- Fabieli Loise Braga Feitosa
- Department of Animal Science, School of Agricultural and Veterinarian Sciences, Jaboticabal, São Paulo, Brazil
- Mariana Piatto Berton
- Department of Animal Science, School of Agricultural and Veterinarian Sciences, Jaboticabal, São Paulo, Brazil
- Ignácio Aguilar
- Department of Animal Breeding, National Institute of Agricultural Research, Las Brujas, Uruguay
- Fernando Baldi
- Department of Animal Science, School of Agricultural and Veterinarian Sciences, Jaboticabal, São Paulo, Brazil
|
47
|
Quinet C, Czaplicki G, Dion E, Dal Pozzo F, Kurz A, Saegerman C. First Results in the Use of Bovine Ear Notch Tag for Bovine Viral Diarrhoea Virus Detection and Genetic Analysis. PLoS One 2016; 11:e0164451. [PMID: 27764130 PMCID: PMC5072587 DOI: 10.1371/journal.pone.0164451] [Citation(s) in RCA: 11] [Impact Index Per Article: 1.4] [Received: 02/11/2016] [Accepted: 09/26/2016] [Indexed: 11/19/2022]
Abstract
BACKGROUND Infection due to bovine viral diarrhoea virus (BVDV) is endemic in most cattle-producing countries throughout the world. The key elements of a BVDV control programme are biosecurity, elimination of persistently infected animals and surveillance. Bovine viral diarrhoea (BVD) is a notifiable disease in Belgium and an official eradication programme started from January 2015, based on testing ear notches sampled during the official identification and registration of calves at birth. An antigen-capture ELISA test based on the detection of BVDV Erns protein is used. Ear notch sample may also be used to characterize the genotype of the calf when appropriate elution/dilution buffer is added. Both BVDV antigen-ELISA analysis and animal traceability could be performed. METHODOLOGY With regards to the reference protocol used in the preparation of ear notch samples, alternative procedures were tested in terms of BVDV analytic sensitivity, diagnostic sensitivity and specificity, as well as quality and purity of animal DNA. PRINCIPAL FINDINGS/SIGNIFICANCE The Allflex DNA Buffer D showed promising results in BVDV diagnosis and genome analyses, opening new perspectives for the livestock industry by the exploitation of the animal genome. Due to the high number of cattle involved in the Belgian official BVDV eradication programme based on ear notch tags sample, a large database on both BVDV status of newborn calves and cattle genome could be created for subsequent different uses (e.g. traceability, determination of parentage, genetic signatures throughout the genome associated with particular traits) evolving through a more integrated animal health.
Affiliation(s)
- Elise Dion
- Arsia, Health Department, Ciney, Belgium
- Fabiana Dal Pozzo
- Research Unit in Epidemiology and Risk analysis applied to Veterinary Sciences (UREAR-ULg), Fundamental and Applied Research for Animal and Health (FARAH), Faculty of Veterinary Medicine, University of Liege, Liege, Belgium
- Anke Kurz
- IFN Schönow GmbH, Bernau bei Berlin, Germany
- Claude Saegerman
- Research Unit in Epidemiology and Risk analysis applied to Veterinary Sciences (UREAR-ULg), Fundamental and Applied Research for Animal and Health (FARAH), Faculty of Veterinary Medicine, University of Liege, Liege, Belgium
|
48
|
Sorbolini S, Gaspa G, Steri R, Dimauro C, Cellesi M, Stella A, Marras G, Marsan PA, Valentini A, Macciotta NPP. Use of canonical discriminant analysis to study signatures of selection in cattle. Genet Sel Evol 2016; 48:58. [PMID: 27521154 PMCID: PMC4983034 DOI: 10.1186/s12711-016-0236-7] [Citation(s) in RCA: 11] [Impact Index Per Article: 1.4] [Received: 01/26/2016] [Accepted: 08/01/2016] [Indexed: 01/08/2023]
Abstract
BACKGROUND Cattle include a large number of breeds that are characterized by marked phenotypic differences and thus constitute a valuable model to study genome evolution in response to processes such as selection and domestication. Detection of "signatures of selection" is a useful approach to study the evolutionary pressures experienced throughout history. In the present study, signatures of selection were investigated in five cattle breeds farmed in Italy using a multivariate approach. METHODS A total of 4094 bulls from five breeds with different production aptitudes (two dairy breeds: Italian Holstein and Italian Brown Swiss; two beef breeds: Piemontese and Marchigiana; and one dual purpose breed: Italian Simmental) were genotyped using the Illumina BovineSNP50 v.1 beadchip. Canonical discriminant analysis was carried out on the matrix of single nucleotide polymorphisms (SNP) genotyping data, separately for each chromosome. Scores for each canonical variable were calculated and then plotted in the canonical space to quantify the distance between breeds. SNPs for which the correlation with the canonical variable was in the 99th percentile for a specific chromosome were considered to be significantly associated with that variable. Results were compared with those obtained using an FST-based approach. RESULTS Based on the results of the canonical discriminant analysis, a large number of signatures of selection were detected, among which several had strong signals in genomic regions that harbour genes known to have an impact on production and morphological bovine traits, including MSTN, LCT, GHR, SCD, NCAPG, KIT, and ASIP. Moreover, new putative candidate genes were identified, such as GCK, B3GALNT1, MGAT1, GALNTL1, PRNP, and PRND. Similar results were obtained with the FST-based approach. CONCLUSIONS The use of canonical discriminant analysis on 50 K SNP genotypes allowed the extraction of new variables that maximize the separation between breeds. 
This approach is quite straightforward, it can compare more than two groups simultaneously, and relative distances between breeds can be visualized. The genes that were highlighted in the canonical discriminant analysis were in concordance with those obtained using the FST index.
Affiliation(s)
- Silvia Sorbolini
- Dipartimento di Agraria, Sezione di Scienze Zootecniche, Università degli Studi di Sassari, V. le Italia, 9, 07100, Sassari, Italy
- Giustino Gaspa
- Dipartimento di Agraria, Sezione di Scienze Zootecniche, Università degli Studi di Sassari, V. le Italia, 9, 07100, Sassari, Italy
- Roberto Steri
- Consiglio per la Ricerca e la Sperimentazione in Agricoltura, via Salaria 31, 00015, Monterotondo, Italy
- Corrado Dimauro
- Dipartimento di Agraria, Sezione di Scienze Zootecniche, Università degli Studi di Sassari, V. le Italia, 9, 07100, Sassari, Italy
- Massimo Cellesi
- Dipartimento di Agraria, Sezione di Scienze Zootecniche, Università degli Studi di Sassari, V. le Italia, 9, 07100, Sassari, Italy
- Paolo Ajmone Marsan
- Istituto di Zootecnica, Università Cattolica del Sacro Cuore, Piacenza, Italy
- Alessio Valentini
- Dipartimento per l'Innovazione dei Sistemi Biologici Agroalimentari e Forestali DIBAF, Università della Tuscia, Viterbo, Italy
- Nicolò Pietro Paolo Macciotta
- Dipartimento di Agraria, Sezione di Scienze Zootecniche, Università degli Studi di Sassari, V. le Italia, 9, 07100, Sassari, Italy
|
49
|
Tsai HY, Hamilton A, Tinch AE, Guy DR, Bron JE, Taggart JB, Gharbi K, Stear M, Matika O, Pong-Wong R, Bishop SC, Houston RD. Genomic prediction of host resistance to sea lice in farmed Atlantic salmon populations. Genet Sel Evol 2016; 48:47. [PMID: 27357694 PMCID: PMC4926294 DOI: 10.1186/s12711-016-0226-9] [Citation(s) in RCA: 144] [Impact Index Per Article: 18.0] [Received: 02/08/2016] [Accepted: 06/17/2016] [Indexed: 12/17/2022]
Abstract
Background: Sea lice have significant negative economic and welfare impacts on marine Atlantic salmon farming. Since host resistance to sea lice has a substantial genetic component, selective breeding can contribute to control of lice. Genomic selection uses genome-wide marker information to predict breeding values, and can achieve markedly higher accuracy than pedigree-based methods. Our aim was to assess the genetic architecture of host resistance to sea lice, and test the utility of genomic prediction of breeding values. Individual lice counts were measured in challenge experiments using two large Atlantic salmon post-smolt populations from a commercial breeding programme, which had genotypes for ~33 K single nucleotide polymorphisms (SNPs). The specific objectives were to: (i) estimate the heritability of host resistance; (ii) assess its genetic architecture by performing a genome-wide association study (GWAS); (iii) assess the accuracy of predicted breeding values using varying SNP densities (0.5 to 33 K) and compare it to that of pedigree-based prediction; and (iv) evaluate the accuracy of prediction in closely and distantly related animals.
Results: Heritability of host resistance was significant (0.22 to 0.33) in both populations using either pedigree or genomic relationship matrices. The GWAS suggested that lice resistance is a polygenic trait, and no genome-wide significant quantitative trait loci were identified. Based on cross-validation analysis, genomic predictions were more accurate than pedigree-based predictions for both populations. Although prediction accuracies were highest when closely related animals were used in the training and validation sets, the benefit of having genomic versus pedigree-based predictions within a population increased as the relationships between training and validation sets decreased. Prediction accuracy reached an asymptote with a SNP density of ~5 K within populations, although higher SNP density was advantageous for cross-population prediction.
Conclusions: Host resistance to sea lice in farmed Atlantic salmon has a significant genetic component. Phenotypes relating to host resistance can be predicted with moderate to high accuracy within populations, with a major advantage of genomic over pedigree-based methods, even at relatively sparse SNP densities. Prediction accuracies across populations were low, but improved with higher marker densities. Genomic selection can contribute to lice control in salmon farming.
Affiliation(s)
- Hsin-Yuan Tsai
- The Roslin Institute and Royal (Dick) School of Veterinary Studies, The University of Edinburgh, Midlothian, EH25 9RG, UK
- Alastair Hamilton
- Landcatch Natural Selection Ltd., 15 Beta Centre, Stirling University Innovation Park, Stirling, FK9 4NF, UK
- Alan E Tinch
- Landcatch Natural Selection Ltd., 15 Beta Centre, Stirling University Innovation Park, Stirling, FK9 4NF, UK
- Derrick R Guy
- Landcatch Natural Selection Ltd., 15 Beta Centre, Stirling University Innovation Park, Stirling, FK9 4NF, UK
- James E Bron
- Institute of Aquaculture, University of Stirling, Stirling, FK9 4LA, UK
- John B Taggart
- Institute of Aquaculture, University of Stirling, Stirling, FK9 4LA, UK
- Karim Gharbi
- Edinburgh Genomics, Ashworth Laboratories, King's Buildings, University of Edinburgh, Edinburgh, EH9 3JT, UK
- Michael Stear
- Institute of Biodiversity, Animal Health and Comparative Medicine, University of Glasgow, Bearsden Road, Glasgow, G61 1QH, UK
- Oswald Matika
- The Roslin Institute and Royal (Dick) School of Veterinary Studies, The University of Edinburgh, Midlothian, EH25 9RG, UK
- Ricardo Pong-Wong
- The Roslin Institute and Royal (Dick) School of Veterinary Studies, The University of Edinburgh, Midlothian, EH25 9RG, UK
- Steve C Bishop
- The Roslin Institute and Royal (Dick) School of Veterinary Studies, The University of Edinburgh, Midlothian, EH25 9RG, UK
- Ross D Houston
- The Roslin Institute and Royal (Dick) School of Veterinary Studies, The University of Edinburgh, Midlothian, EH25 9RG, UK
|
50
|
Yaro M, Munyard KA, Stear MJ, Groth DM. Molecular identification of livestock breeds: a tool for modern conservation biology. Biol Rev Camb Philos Soc 2016; 92:993-1010. [DOI: 10.1111/brv.12265] [Citation(s) in RCA: 8] [Impact Index Per Article: 1.0] [Received: 06/15/2015] [Revised: 02/14/2016] [Accepted: 02/18/2016] [Indexed: 12/22/2022]
Affiliation(s)
- Mohammed Yaro
- School of Biomedical Sciences, CHIRI Biosciences Research Precinct, Faculty of Health Sciences; Curtin University; GPO Box U1987 Perth WA 6845 Australia
- Kylie A. Munyard
- School of Biomedical Sciences, CHIRI Biosciences Research Precinct, Faculty of Health Sciences; Curtin University; GPO Box U1987 Perth WA 6845 Australia
- Michael J. Stear
- Institute of Biodiversity, Animal Health and Comparative Medicine; University of Glasgow; Bearsden Road Glasgow G61 1QH U.K.
- David M. Groth
- School of Biomedical Sciences, CHIRI Biosciences Research Precinct, Faculty of Health Sciences; Curtin University; GPO Box U1987 Perth WA 6845 Australia
|