1
|
Liu P, Wilson P, Redquest B, Keobouasone S, Manseau M. Seq2Sat and SatAnalyzer toolkit: Towards comprehensive microsatellite genotyping from sequencing data. Mol Ecol Resour 2024; 24:e13929. [PMID: 38289068 DOI: 10.1111/1755-0998.13929] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 09/11/2023] [Revised: 12/22/2023] [Accepted: 01/09/2024] [Indexed: 03/06/2024]
Abstract
Accurate and efficient microsatellite loci genotyping is an essential process in population genetics that is also used in various demographic analyses. Protocols for next-generation sequencing of microsatellite loci enable high-throughput and cross-compatible allele scoring, two common needs that conventional capillary-based approaches do not address. To improve this process, we have developed an all-in-one software tool, Seq2Sat (sequence to microsatellite), written in C++, to support automated microsatellite genotyping. It takes raw reads of microsatellite amplicons directly and conducts read quality control before inferring genotypes based on read depth, read ratio, sequence composition and length. We have also developed a module for sex identification based on sex chromosome-specific locus amplicons. To allow for greater user access and to complement autoscoring, we developed SatAnalyzer (microsatellite analyzer), a user-friendly web-based platform that conducts reads-to-report analyses by calling Seq2Sat for genotype autoscoring and produces interactive genotype graphs for manual editing. SatAnalyzer also allows users to troubleshoot multiplex optimization by analysing read quality and distribution across loci and samples in support of high-quality library preparation. To evaluate its performance, we benchmarked our toolkit Seq2Sat/SatAnalyzer against a conventional capillary gel method and existing microsatellite genotyping software, MEGASAT, using two datasets. Results showed that SatAnalyzer can achieve >99.70% genotyping accuracy and that Seq2Sat is ~5 times faster than MEGASAT, despite generating many more informative tables and figures. Seq2Sat and SatAnalyzer are freely available on GitHub (https://github.com/ecogenomicscanada/Seq2Sat) and Docker Hub (https://hub.docker.com/r/rocpengliu/satanalyzer).
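The idea of inferring a genotype from read depth and read ratio can be illustrated with a minimal sketch. This is not Seq2Sat's actual algorithm: the function name `call_genotype` and the thresholds `min_depth` and `min_ratio` are hypothetical, chosen only for illustration, and sequence composition is ignored.

```python
def call_genotype(allele_reads, min_depth=10, min_ratio=0.3):
    """Call a diploid genotype at one locus from per-allele read counts.

    allele_reads: dict mapping an allele label (here, repeat length) to the
    number of reads supporting it.
    min_depth, min_ratio: illustrative thresholds, NOT Seq2Sat defaults.
    Returns a sorted tuple of two alleles, or None if total depth is too low.
    """
    total = sum(allele_reads.values())
    if total < min_depth:
        return None  # too few reads to score this locus

    # Rank alleles by read count (ties broken by allele label).
    ranked = sorted(allele_reads.items(), key=lambda kv: (-kv[1], kv[0]))
    top_allele, top_count = ranked[0]

    if len(ranked) > 1:
        second_allele, second_count = ranked[1]
        # Heterozygous only if the second allele is well supported
        # relative to the first (read ratio test).
        if second_count / top_count >= min_ratio:
            return tuple(sorted((top_allele, second_allele)))

    return (top_allele, top_allele)  # homozygous call
```

With these assumed thresholds, a locus with 50 and 45 reads for alleles 12 and 14 is called heterozygous, whereas 90 reads against 8 is called homozygous because the minor allele looks like stutter or noise.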
|
2
|
Pina A, Irisarri P, Errea P, Zhebentyayeva T. Mapping Quantitative Trait Loci Associated With Graft (In)Compatibility in Apricot (Prunus armeniaca L.). FRONTIERS IN PLANT SCIENCE 2021; 12:622906. [PMID: 33679836 PMCID: PMC7933020 DOI: 10.3389/fpls.2021.622906] [Citation(s) in RCA: 7] [Impact Index Per Article: 1.8] [Received: 10/29/2020] [Accepted: 01/08/2021] [Indexed: 05/29/2023]
Abstract
Graft incompatibility (GI) between the most popular Prunus rootstocks and apricot cultivars is one of the major problems for rootstock usage and improvement. Failure to produce long-living healthy grafts greatly affects the range of available Prunus rootstocks for apricot cultivation. Despite recent advances related to the molecular mechanisms of graft-union formation between rootstock and scion, information on genetic control of this trait in woody plants is essentially missing because of a lack of hybrid crosses segregating for the trait. In this study, we have employed next-generation sequencing technology to generate single-nucleotide polymorphism (SNP) markers and construct parental linkage maps for an apricot F1 population "Moniqui (Mo)" × "Paviot (Pa)" segregating for the ability to form successful grafts with the universal Prunus rootstock "Marianna 2624". To localize genomic regions associated with this trait, we genotyped 138 individuals from the "Mo × Pa" cross and constructed medium-saturated genetic maps. The female "Mo" and male "Pa" maps were composed of 557 and 501 SNPs and organized in eight linkage groups that covered 780.2 and 690.4 cM of genetic distance, respectively. Parental maps were aligned to the Prunus persica v2.0 genome and revealed a high colinearity with the Prunus reference map. Two-year phenotypic data for characters associated with unsuccessful grafting, such as necrotic line (NL), bark and wood discontinuities (BD and WD), and an overall estimate of graft (in)compatibility (GI), were collected for mapping quantitative trait loci (QTLs) on both parental maps. On the map of the graft-compatible parent "Pa", two genomic regions on LG5 (44.9-60.8 cM) and LG8 (33.2-39.2 cM) were associated with graft (in)compatibility characters at different significance levels, depending on the phenotypic dataset. Of these, the LG8 QTL interval was most consistent between the years and supported by two significant and two putative QTLs.
To the best of our knowledge, this is the first report on QTLs for graft (in)compatibility in woody plants. Results of this work will provide a valuable genomic resource for apricot breeding programs and facilitate future efforts focused on candidate gene discovery for graft (in)compatibility in apricot and other Prunus species.
|
3
|
Giovambattista G, Takeshima SN, Moe KK, Pereira Rico JA, Polat M, Loza Vega A, Arce Cabrera ON, Aida Y. BoLA-DRB3 genetic diversity in Highland Creole cattle from Bolivia. HLA 2020; 96:688-696. [PMID: 33094557 DOI: 10.1111/tan.14120] [Citation(s) in RCA: 5] [Impact Index Per Article: 1.0] [Received: 03/11/2020] [Revised: 10/14/2020] [Accepted: 10/16/2020] [Indexed: 01/24/2023]
Abstract
The genetic diversity of the BoLA-DRB3 gene has been reported in different cattle breeds owing to its central role in the immune response. However, it is still unknown in hundreds of cattle breeds, especially native populations. Here, we studied BoLA-DRB3 genetic diversity in Highland Creole cattle (CrAl) from Western Bolivia, raised at altitudes between 3800 and 4200 m. DNA samples from 48 CrAl cattle were genotyped for BoLA-DRB3 exon 2 alleles using polymerase chain reaction-sequence-based typing (PCR-SBT). The results were compared with 1341 previously reported data from Tropical Creole cattle and other breeds raised in the region. Twenty-three BoLA-DRB3 alleles were identified in CrAl, including the BoLA-DRB3*029:02 variant previously detected in other Creole cattle. Observed and expected heterozygosity were 0.87 and 0.93, respectively. Nucleotide diversity and the number of pairwise differences were 0.078 and 19.46, respectively. The average numbers of nonsynonymous and synonymous substitutions were 0.037 and 0.097 for the entire BoLA-DRB3 exon 2, and 0.129 and 0.388 for the antigen-binding site, respectively. Venn analysis and a review of the IPD-MHC database and the literature showed that 2 of 64 alleles were only detected in CrAl, including BoLA-DRB3*029:01 previously reported in African cattle and *048:01 detected in Philippine cattle. Two additional alleles, BoLA-DRB3*007:02 and *029:02, were only present in CrAl and Lowland Creole cattle. Principal Component Analysis (PCA) showed that Bolivian Creole cattle breeds were closely located but were distant from the Colombian Hartón del Valle Creole. FST analysis showed a low degree of genetic differentiation between Highland and Lowland Bolivian Creole cattle (FST = 0.015). The present results contribute to increasing our knowledge of BoLA-DRB3 genetic diversity in cattle breeds.
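Observed and expected heterozygosity, as reported in the abstract above, can be computed from a list of diploid genotypes. The following is a minimal sketch using the textbook definitions; the abstract does not state which estimator or software the authors used, and the function name `heterozygosity` and its `unbiased` option are illustrative assumptions.

```python
from collections import Counter


def heterozygosity(genotypes, unbiased=False):
    """Return (observed, expected) heterozygosity for one locus.

    genotypes: list of (allele1, allele2) tuples, one per individual.
    unbiased: if True, apply the small-sample correction 2n / (2n - 1)
              to the expected value (an optional, assumed choice).
    """
    n = len(genotypes)
    if n == 0:
        raise ValueError("at least one genotype is required")

    # Observed: fraction of individuals carrying two different alleles.
    h_obs = sum(1 for a, b in genotypes if a != b) / n

    # Expected under Hardy-Weinberg: 1 - sum of squared allele frequencies.
    counts = Counter(allele for g in genotypes for allele in g)
    total = 2 * n
    h_exp = 1.0 - sum((c / total) ** 2 for c in counts.values())
    if unbiased:
        h_exp *= total / (total - 1)

    return h_obs, h_exp
```

An observed value below the expected one, as in the abstract (0.87 versus 0.93), indicates fewer heterozygotes than Hardy-Weinberg proportions would predict; whether that deficit is statistically meaningful requires a separate test.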
|
4
|
Jia J, Lin K, Sun H, Dai JJ, Yang ZQ. Identification of two novel KIR3DL1 subtypes, KIR3DL1*0010104 and KIR3DL1*0010105. HLA 2018; 93:138-139. [PMID: 30582293 DOI: 10.1111/tan.13457] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 12/06/2018] [Revised: 12/12/2018] [Accepted: 12/21/2018] [Indexed: 11/29/2022]
Abstract
KIR3DL1*0010104 and KIR3DL1*0010105 share a common 4 bp deletion in their intron 2.
|
5
|
Schmidt DJ, Fallon S, Roberts DT, Espinoza T, McDougall A, Brooks SG, Kind PK, Bond NR, Kennard MJ, Hughes JM. Monitoring age-related trends in genomic diversity of Australian lungfish. Mol Ecol 2018; 27:3231-3241. [PMID: 29989297 DOI: 10.1111/mec.14791] [Citation(s) in RCA: 5] [Impact Index Per Article: 0.7] [Received: 03/22/2018] [Revised: 06/28/2018] [Accepted: 07/01/2018] [Indexed: 11/28/2022]
Abstract
An important challenge for conservation science is to detect declines in intraspecific diversity so that management action can be guided towards populations or species at risk. The lifespan of Australian lungfish (Neoceratodus forsteri) exceeds 80 years, and human impacts on breeding habitat over the last half century may have impeded recruitment, leaving populations dominated by old postreproductive individuals, potentially resulting in a small and declining breeding population. Here, we conduct a "single-sample" evaluation of genetic erosion within contemporary populations of the Australian lungfish. Genetic erosion is a temporal decline in intraspecific diversity due to factors such as reduced population size and inbreeding. We examined whether young individuals showed signs of reduced genetic diversity and/or inbreeding using a novel bomb radiocarbon dating method to age lungfish nonlethally, based on 14C ratios of scales. A total of 15,201 single nucleotide polymorphic (SNP) loci were genotyped in 92 individuals ranging in age from 2 to 77 years old. Standardized individual heterozygosity and individual inbreeding coefficients varied widely within and between riverine populations, but neither was associated with age, so perceived problems with recruitment have not translated into genetic erosion that could be considered a proximate threat to lungfish populations. Conservation concern has surrounded Australian lungfish for over a century. However, our results suggest that long-lived threatened species can maintain stable levels of intraspecific variability when sufficient reproductive opportunities exist over the course of a long lifespan.
|
6
|
Malmberg MM, Shi F, Spangenberg GC, Daetwyler HD, Cogan NOI. Diversity and Genome Analysis of Australian and Global Oilseed Brassica napus L. Germplasm Using Transcriptomics and Whole Genome Re-sequencing. FRONTIERS IN PLANT SCIENCE 2018; 9:508. [PMID: 29725344 PMCID: PMC5917405 DOI: 10.3389/fpls.2018.00508] [Citation(s) in RCA: 14] [Impact Index Per Article: 2.0] [Received: 01/02/2018] [Accepted: 04/03/2018] [Indexed: 05/21/2023]
Abstract
Intensive breeding of Brassica napus has resulted in relatively low diversity, such that B. napus would benefit from germplasm improvement schemes that sustain diversity. As such, samples representative of global germplasm pools need to be assessed for existing population structure, diversity and linkage disequilibrium (LD). Complexity reduction genotyping-by-sequencing (GBS) methods, including GBS-transcriptomics (GBS-t), enable cost-effective screening of a large number of samples, while whole genome re-sequencing (WGR) delivers the ability to generate large numbers of unbiased genomic single nucleotide polymorphisms (SNPs), and identify structural variants (SVs). Furthermore, the development of genomic tools based on whole genomes representative of global oilseed diversity and orientated by the reference genome has substantial industry relevance and will be highly beneficial for canola breeding. As recent studies have focused on European and Chinese varieties, a global diversity panel as well as a substantial number of Australian spring types were included in this study. Focusing on industry relevance, 633 varieties were initially genotyped using GBS-t to examine population structure using 61,037 SNPs. Subsequently, 149 samples representative of global diversity were selected for WGR and both data sets used for a side-by-side evaluation of diversity and LD. The WGR data was further used to develop genomic resources consisting of a list of 4,029,750 high-confidence SNPs annotated using SnpEff, and SVs in the form of 10,976 deletions and 2,556 insertions. These resources form the basis of a reliable and repeatable system allowing greater integration between canola genomics studies, with a strong focus on breeding germplasm and industry applicability.
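The abstract does not say which linkage disequilibrium statistic was used. One widely used pairwise measure is r², sketched below for two biallelic SNPs under the assumption that phased haplotypes are available; the function name `ld_r2` and the 0/1 allele coding are illustrative choices, not taken from the study.

```python
def ld_r2(haplotypes):
    """Compute pairwise LD (r squared) between two biallelic SNPs.

    haplotypes: list of (a, b) tuples, where a and b are 0 or 1 and give
    the allele carried at SNP A and SNP B on one phased haplotype.
    Returns None if either SNP is monomorphic (r squared is undefined).
    """
    n = len(haplotypes)
    if n == 0:
        return None

    p_a = sum(a for a, _ in haplotypes) / n          # freq of allele 1 at SNP A
    p_b = sum(b for _, b in haplotypes) / n          # freq of allele 1 at SNP B
    p_ab = sum(1 for a, b in haplotypes if a == 1 and b == 1) / n

    d = p_ab - p_a * p_b                             # disequilibrium coefficient D
    denom = p_a * (1 - p_a) * p_b * (1 - p_b)
    if denom == 0:
        return None
    return d * d / denom
```

Values near 1 mean the two SNPs are almost always inherited together, values near 0 mean they are close to independent. With unphased genotype data, as is typical for GBS, haplotype frequencies would first have to be estimated, which this sketch does not attempt.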
|
7
|
Clevenger JP, Korani W, Ozias-Akins P, Jackson S. Haplotype-Based Genotyping in Polyploids. FRONTIERS IN PLANT SCIENCE 2018; 9:564. [PMID: 29755500 PMCID: PMC5932196 DOI: 10.3389/fpls.2018.00564] [Citation(s) in RCA: 37] [Impact Index Per Article: 5.3] [Received: 01/30/2018] [Accepted: 04/10/2018] [Indexed: 05/05/2023]
Abstract
Accurate identification of polymorphisms from sequence data is crucial to unlocking the potential of high-throughput sequencing for genomics. Single nucleotide polymorphisms (SNPs) are difficult to identify accurately in polyploid crops because the duplicative nature of polyploid genomes leads to low confidence in the true alignment of short reads. Implementing a haplotype-based method that contrasts subgenome-specific sequences leads to higher accuracy of SNP identification in polyploids. To test this method, a large-scale 48K SNP array (Axiom Arachis2) was developed for Arachis hypogaea (peanut), an allotetraploid, in which 1,674 haplotype-based SNPs were included. Results of the array show that 74% of the haplotype-based SNP markers could be validated, which is considerably higher than previous methods used for peanut. The haplotype method has been implemented in a standalone program, HAPLOSWEEP, which takes BAM files and a VCF file as input and identifies haplotype-based markers. Haplotype discovery can be made within single reads or span paired reads, and can leverage long-read technology by targeting any length of haplotype. Haplotype-based genotyping is applicable in all allopolyploid genomes and provides confidence in marker identification and in silico-based genotyping for polyploid genomics.
|
8
|
Gazave E, Tassone EE, Ilut DC, Wingerson M, Datema E, Witsenboer HMA, Davis JB, Grant D, Dyer JM, Jenks MA, Brown J, Gore MA. Population Genomic Analysis Reveals Differential Evolutionary Histories and Patterns of Diversity across Subgenomes and Subpopulations of Brassica napus L. FRONTIERS IN PLANT SCIENCE 2016; 7:525. [PMID: 27148342 PMCID: PMC4838616 DOI: 10.3389/fpls.2016.00525] [Citation(s) in RCA: 13] [Impact Index Per Article: 1.4] [Received: 12/27/2015] [Accepted: 04/04/2016] [Indexed: 05/08/2023]
Abstract
The allotetraploid species Brassica napus L. is a global crop of major economic importance, providing canola oil (seed) and vegetables for human consumption and fodder and meal for livestock feed. Characterizing the genetic diversity present in the extant germplasm pool of B. napus is fundamental to better conserve, manage and utilize the genetic resources of this species. We used sequence-based genotyping to identify and genotype 30,881 SNPs in a diversity panel of 782 B. napus accessions, representing samples of winter and spring growth habits originating from 33 countries across Europe, Asia, and America. We detected strong population structure broadly concordant with growth habit and geography, and identified three major genetic groups: spring (SP), winter Europe (WE), and winter Asia (WA). Subpopulation-specific polymorphism patterns suggest enriched genetic diversity within the WA group and a smaller effective breeding population for the SP group compared to WE. Interestingly, the two subgenomes of B. napus appear to have different geographic origins, with phylogenetic analysis placing WE and WA as basal clades for the other subpopulations in the C and A subgenomes, respectively. Finally, we identified 16 genomic regions where the patterns of diversity differed markedly from the genome-wide average, several of which are suggestive of genomic inversions. The results obtained in this study constitute a valuable resource for worldwide breeding efforts and the genetic dissection and prediction of complex B. napus traits.
|
9
|
Abstract
The emergence of new sequencing technologies has provided fast and cost-efficient strategies for high-resolution mapping of complex genomes. Although these approaches hold great promise to accelerate genome analysis, their application in studying genetic variation in wheat has been hindered by the complexity of its polyploid genome. Here, we applied next-generation sequencing of a wheat doubled-haploid mapping population for high-resolution gene mapping and tested its utility for ordering shotgun sequence contigs of a flow-sorted wheat chromosome. A bioinformatics pipeline was developed for reliable variant analysis of sequence data generated for polyploid wheat mapping populations. The results of variant mapping were consistent with the results obtained using the wheat 9000 SNP iSelect assay. A reference map of the wheat genome integrating 2740 gene-associated single-nucleotide polymorphisms from the wheat iSelect assay, 1351 diversity array technology, 118 simple sequence repeat/sequence-tagged sites, and 416,856 genotyping-by-sequencing markers was developed. By analyzing the sequenced megabase-size regions of the wheat genome, we showed that mapped markers are located within 40-100 kb of genes, providing a possibility for high-resolution mapping at the level of a single gene. In our population, gene loci controlling a seed color phenotype cosegregated with 2459 markers, including one that was located within the red seed color gene. We demonstrate that the high-density reference map presented here is a useful resource for gene mapping and for linking physical and genetic maps of the wheat genome.
|