1
Jackson TK, Rhode C. Comparative genomics of dusky kob (Argyrosomus japonicus, Sciaenidae) conspecifics: Evidence for speciation and the genetic mechanisms underlying traits. J Fish Biol 2024. [PMID: 38885946 DOI: 10.1111/jfb.15844] [Received: 10/13/2023] [Revised: 04/17/2024] [Accepted: 05/28/2024] [Indexed: 06/20/2024]
Abstract
Dusky kob (Argyrosomus japonicus) is a commercially important finfish, indigenous to South Africa, Australia, and China. Previous studies highlighted differences in genetic composition, life history, and morphology of the species across geographic regions. A draft genome sequence of 0.742 Gb (N50 = 5.49 Mb; BUSCO completeness = 97.8%) and 22,438 predicted protein-coding genes was generated for the South African (SA) conspecific. A comparison with the Chinese (CN) conspecific revealed a core set of 32,068 orthologous protein clusters across both genomes. The SA genome exhibited 440 unique clusters compared to 1928 unique clusters in the CN genome. Transportation and immune response processes were overrepresented among the SA accessory genome, whereas the CN accessory genome was enriched for immune response, DNA transposition, and sensory detection (FDR-adjusted p < 0.01). These unique clusters may represent an adaptive component of the species' pangenome that could explain population divergence due to differential environmental specialisation. Furthermore, 700 single-copy orthologues (SCOs) displayed evidence of positive selection between the SA and CN genomes, and globally these genomes shared only 92% similarity, suggesting they might be distinct species. These genes primarily play roles in metabolism and digestion, illustrating the evolutionary pathways that differentiate the species. Understanding these genomic mechanisms underlying adaptation and evolution within and between species provides valuable insights into growth and maturation of kob, traits that are particularly relevant to commercial aquaculture.
Affiliation(s)
- Tassin Kim Jackson
  - Department of Genetics, Stellenbosch University, Stellenbosch, South Africa
- Clint Rhode
  - Department of Genetics, Stellenbosch University, Stellenbosch, South Africa
2
Microsatellite Genome-Wide Database Development for the Commercial Blackhead Seabream (Acanthopagrus schlegelii). Genes (Basel) 2023; 14:genes14030620. [PMID: 36980892 PMCID: PMC10048070 DOI: 10.3390/genes14030620] [Received: 02/08/2023] [Revised: 02/26/2023] [Accepted: 02/27/2023] [Indexed: 03/05/2023]
Abstract
Simple sequence repeats (SSRs), markers with high polymorphism and co-dominance, are a crucial resource for genetic research, but few SSR markers have been reported in blackhead seabream. The availability of the blackhead seabream genome assembly made it possible to identify microsatellite markers genome-wide and to build a genome-wide microsatellite database through bioinformatic analysis. In this study, a total of 412,381 SSRs were identified in the 688.08 Mb genome using Krait software. Whole-genome sequences (10×) of 42 samples were aligned against the reference genome and genotyped with HipSTR by comparing and counting repeat number variation across the SSR loci. A total of 156,086 SSRs with a 2–4 bp repeat were genotyped by HipSTR, accounting for 55.78% of the 2–4 bp SSRs in the reference genome. Comparison of HipSTR genotypes with PCR amplification showed high genotyping accuracy. A set of 109,131 loci with ≥ 3 alleles and ≥ 6 genotyped individuals was retained to constitute the polymorphic SSR database. Fifty-one polymorphic SSR loci were identified through PCR amplification. This strategy not only yielded a large set of polymorphic SSRs but also eliminated the need for laborious experimental screening. The SSR markers developed in this study may facilitate blackhead seabream research and lay a foundation for gene tagging and genetic linkage analysis, including marker-assisted selection, genetic mapping, and comparative genomic analysis.
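The retention rule stated in this abstract (at least 3 alleles and at least 6 genotyped individuals per locus) can be sketched in Python as follows. This is only an illustration under stated assumptions: the nested-dictionary input, the locus and sample names, and the function `polymorphic_loci` are hypothetical, and they do not reflect HipSTR's actual output format or the authors' pipeline.

```python
def polymorphic_loci(genotypes, min_alleles=3, min_individuals=6):
    """Keep loci that meet both thresholds.

    genotypes: hypothetical layout, {locus: {sample: (allele_a, allele_b) or None}},
    where alleles are repeat counts and None marks a sample with no call.
    Returns {locus: sorted list of distinct alleles} for the retained loci.
    """
    kept = {}
    for locus, calls in genotypes.items():
        called = [g for g in calls.values() if g is not None]
        alleles = {allele for genotype in called for allele in genotype}
        if len(alleles) >= min_alleles and len(called) >= min_individuals:
            kept[locus] = sorted(alleles)
    return kept


# Toy data (invented for illustration only).
toy = {
    # 3 alleles, 6 called samples: retained
    "ssr_1": {"s1": (10, 10), "s2": (10, 12), "s3": (12, 14),
              "s4": (10, 14), "s5": (12, 12), "s6": (14, 14)},
    # only 2 alleles: dropped
    "ssr_2": {"s1": (8, 8), "s2": (8, 9), "s3": (9, 9),
              "s4": (8, 8), "s5": (8, 9), "s6": (9, 9)},
    # 3 alleles but only 5 called samples: dropped
    "ssr_3": {"s1": (5, 6), "s2": (6, 7), "s3": (5, 7),
              "s4": (5, 5), "s5": (7, 7), "s6": None},
}
kept = polymorphic_loci(toy)
```

Applying both thresholds together is what distinguishes a polymorphic locus from one that merely appears variable because few samples were genotyped.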
3
Tian HF, Hu QM, Li Z. Genome-wide identification of simple sequence repeats and development of polymorphic SSR markers in swamp eel (Monopterus albus). Sci Prog 2021; 104:368504211035597. [PMID: 34375541 PMCID: PMC10358632 DOI: 10.1177/00368504211035597] [Indexed: 11/15/2022]
Abstract
OBJECTIVES Swamp eel is a model species for sex reversal and an aquaculture fish in China. A local strain of Monopterus albus with deep yellow coloration and large spots has been chosen for continued selective breeding. The objectives of this study were to characterize the simple sequence repeats (SSRs) of M. albus in the recently assembled genome and to develop polymorphic SSRs for future breeding programs. METHODS Genome-wide SSRs were mined using MISA software, and their types and genomic distribution patterns were investigated. Based on the available flanking sequences, primer pairs were developed in batches, and polymorphic SSRs were identified using the Polymorphic SSR Retrieval tool. The polymorphic SSRs obtained were validated using e-PCR and capillary electrophoresis, and were then used to investigate the genetic diversity of one breeding population. RESULTS A total of 364,802 SSRs were identified in the assembled M. albus genome. The total length, density and frequency of SSRs were 8,204,641 bp, 10,259 bp/Mb, and 456.16 loci/Mb, respectively. Mononucleotide repeats were predominant among SSRs (33.33%), and AC and AAT repeats were the most abundant di- and trinucleotide repeat motifs. A total of 287,189 primer pairs were designed, and a high-density physical map was constructed (359.11 markers per Mb). A total of 871 polymorphic SSRs were identified, and 38 of 101 randomly selected SSRs were validated using e-PCR and capillary electrophoresis. Using these 38 polymorphic SSRs, 201 alleles were detected and the level of genetic diversity (Na, PIC, HO, and He) was evaluated. CONCLUSIONS The genome-wide SSRs and newly developed SSR markers will provide a useful tool for genetic mapping and diversity analysis in swamp eel. The high level of genetic diversity (Na = 5.29, PIC = 0.5068, HO = 0.4665, He = 0.5525) but excess of homozygotes (FIS = 0.155) in one breeding population provides baseline information for future breeding programs.
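The summary statistics in this abstract can be cross-checked with a few lines of arithmetic. The Python sketch below uses only the figures quoted above; the function names are illustrative, and the formulas are the textbook definitions of expected heterozygosity and of FIS as 1 − HO/He, which the abstract does not confirm were the exact estimators used. Computing FIS from the mean HO and He gives about 0.156, close to the reported 0.155; the small gap may reflect rounding or per-locus averaging, which the abstract does not specify.

```python
def expected_heterozygosity(allele_counts):
    """Textbook expected heterozygosity, 1 - sum(p_i^2), from allele counts."""
    total = sum(allele_counts)
    return 1.0 - sum((count / total) ** 2 for count in allele_counts)


def inbreeding_coefficient(h_obs, h_exp):
    """Textbook F_IS = 1 - H_O / H_e (positive values mean a heterozygote deficit)."""
    return 1.0 - h_obs / h_exp


# Figures quoted in the abstract.
n_ssrs = 364_802          # SSR loci identified
total_ssr_bp = 8_204_641  # total SSR length in bp
loci_per_mb = 456.16      # reported frequency

# Genome size implied by count / frequency (an inference, not a reported value).
implied_genome_mb = n_ssrs / loci_per_mb
density_bp_per_mb = total_ssr_bp / implied_genome_mb

mean_alleles = 201 / 38   # alleles detected across the 38 validated loci
fis_from_means = inbreeding_coefficient(0.4665, 0.5525)
```

The implied genome size comes out near 800 Mb, and the density recomputed from it agrees with the reported 10,259 bp/Mb, so the three reported SSR figures are mutually consistent.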
Affiliation(s)
- Hai-feng Tian
  - Yangtze River Fisheries Research Institute, Chinese Academy of Fishery Sciences, Wuhan, Hubei, China
- Qiao-mu Hu
  - Yangtze River Fisheries Research Institute, Chinese Academy of Fishery Sciences, Wuhan, Hubei, China
- Zhong Li
  - Yangtze River Fisheries Research Institute, Chinese Academy of Fishery Sciences, Wuhan, Hubei, China
4
Zhu L, Wu H, Li H, Tang H, Zhang L, Xu H, Jiao F, Wang N, Yang L. Short Tandem Repeats in plants: Genomic distribution and function prediction. Electron J Biotechnol 2021. [DOI: 10.1016/j.ejbt.2020.12.003] [Indexed: 11/25/2022]
5
Tørresen OK, Star B, Mier P, Andrade-Navarro MA, Bateman A, Jarnot P, Gruca A, Grynberg M, Kajava AV, Promponas VJ, Anisimova M, Jakobsen KS, Linke D. Tandem repeats lead to sequence assembly errors and impose multi-level challenges for genome and protein databases. Nucleic Acids Res 2019; 47:10994-11006. [PMID: 31584084 PMCID: PMC6868369 DOI: 10.1093/nar/gkz841] [Received: 06/07/2019] [Revised: 09/03/2019] [Accepted: 10/01/2019] [Indexed: 12/13/2022]
Abstract
The widespread occurrence of repetitive stretches of DNA in genomes of organisms across the tree of life imposes fundamental challenges for sequencing, genome assembly, and automated annotation of genes and proteins. This multi-level problem can lead to errors in genome and protein databases that are often not recognized or acknowledged. As a consequence, end users working with sequences with repetitive regions are faced with 'ready-to-use' deposited data whose trustworthiness is difficult to determine, let alone to quantify. Here, we provide a review of the problems associated with tandem repeat sequences that originate from different stages during the sequencing-assembly-annotation-deposition workflow, and that may proliferate in public database repositories affecting all downstream analyses. As a case study, we provide examples of the Atlantic cod genome, whose sequencing and assembly were hindered by a particularly high prevalence of tandem repeats. We complement this case study with examples from other species, where mis-annotations and sequencing errors have propagated into protein databases. With this review, we aim to raise the awareness level within the community of database users, and alert scientists working in the underlying workflow of database creation that the data they omit or improperly assemble may well contain important biological information valuable to others.
Affiliation(s)
- Ole K Tørresen
  - Centre for Ecological and Evolutionary Synthesis, Department of Biosciences, University of Oslo, NO-0316 Oslo, Norway
- Bastiaan Star
  - Centre for Ecological and Evolutionary Synthesis, Department of Biosciences, University of Oslo, NO-0316 Oslo, Norway
- Pablo Mier
  - Faculty of Biology, Johannes Gutenberg University Mainz, Hans-Dieter-Husch-Weg 15, 55128 Mainz, Germany
- Miguel A Andrade-Navarro
  - Faculty of Biology, Johannes Gutenberg University Mainz, Hans-Dieter-Husch-Weg 15, 55128 Mainz, Germany
- Alex Bateman
  - European Molecular Biology Laboratory, European Bioinformatics Institute (EMBL-EBI), Wellcome Genome Campus, Hinxton, CB10 1SD, UK
- Patryk Jarnot
  - Institute of Informatics, Silesian University of Technology, Akademicka 16, 44-100 Gliwice, Poland
- Aleksandra Gruca
  - Institute of Informatics, Silesian University of Technology, Akademicka 16, 44-100 Gliwice, Poland
- Marcin Grynberg
  - Institute of Biochemistry and Biophysics PAS, Pawińskiego 5A, 02-106 Warsaw, Poland
- Andrey V Kajava
  - Centre de Recherche en Biologie cellulaire de Montpellier, UMR 5237 CNRS, Universite Montpellier, 1919 Route de Mende, CEDEX 5, 34293 Montpellier, France
  - Institut de Biologie Computationnelle, 34095 Montpellier, France
- Vasilis J Promponas
  - Bioinformatics Research Laboratory, Department of Biological Sciences, University of Cyprus, PO Box 20537, CY 1678 Nicosia, Cyprus
- Maria Anisimova
  - Institute of Applied Simulations, School of Life Sciences and Facility Management, Zurich University of Applied Sciences (ZHAW), Wädenswil, Switzerland
  - Swiss Institute of Bioinformatics (SIB), Lausanne, Switzerland
- Kjetill S Jakobsen
  - Centre for Ecological and Evolutionary Synthesis, Department of Biosciences, University of Oslo, NO-0316 Oslo, Norway
- Dirk Linke
  - Section for Genetics and Evolutionary Biology, Department of Biosciences, University of Oslo, NO-0316 Oslo, Norway
6
Wang X, Zhang Y, Qiao L, Chen B. Comparative analyses of simple sequence repeats (SSRs) in 23 mosquito species genomes: Identification, characterization and distribution (Diptera: Culicidae). Insect Sci 2019; 26:607-619. [PMID: 29484820 PMCID: PMC7379697 DOI: 10.1111/1744-7917.12577] [Received: 06/02/2017] [Revised: 01/20/2018] [Accepted: 01/24/2018] [Indexed: 05/28/2023]
Abstract
Simple sequence repeats (SSRs) exist in both eukaryotic and prokaryotic genomes and are the most popular genetic markers, but the SSRs of mosquito genomes are still not well understood. In this study, we identified and analyzed the SSRs in 23 mosquito species at the whole-genome level, using Drosophila melanogaster as a reference. The results show that SSR numbers (33 076-560 175/genome) and genome sizes (574.57-1342.21 Mb) are significantly positively correlated (R² = 0.8992, P < 0.01), but the correlation varies among individual species. Among the six SSR types, mono- to trinucleotide SSRs are dominant, with cumulative percentages of 95.14%-99.00% and densities of 195.65/Mb-787.51/Mb, whereas tetra- to hexanucleotide SSRs are rare, with 1.12%-4.22% and 3.76/Mb-40.23/Mb. The (A/T)n, (AC/GT)n and (AGC/GCT)n motifs are the most frequent in mononucleotide, dinucleotide and trinucleotide SSRs, respectively, and the motif frequencies of tetra- to hexanucleotide SSRs appear to be species-specific. SSRs of 10-20 bp in length are dominant, numbering 110 561 ± 93 482 and accounting for 87.25% ± 5.73% on average, and both number and frequency decline as length increases. Most SSRs (83.34% ± 7.72%) are located in intergenic regions, followed by intron regions (11.59% ± 5.59%), exon regions (3.74% ± 1.95%), and untranslated regions (1.32% ± 1.39%). The mono-, di- and trinucleotide SSRs are the main SSRs in both gene regions (98.55% ± 0.85%) and exon regions (99.27% ± 0.52%). An average of 42.52% of total genes contain SSRs, and the preference for SSR occurrence in different gene subcategories is species-specific. The study provides useful insights into SSR diversity, characteristics and distribution in the genomes of 23 mosquito species.
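Genome-wide SSR identification of the kind summarized here is normally done with dedicated tools, but the underlying idea (finding perfect tandem repeats of 1 to 6 bp motifs above a minimum repeat count) can be sketched with regular expressions. The Python example below is only a minimal illustration: the function names and the minimum-repeat thresholds are assumptions, since the abstract does not state the search criteria, and it is not the method used by the authors.

```python
import re

# Assumed thresholds (motif length -> minimum number of repeats); not from the abstract.
MIN_REPEATS = {1: 10, 2: 6, 3: 5, 4: 5, 5: 5, 6: 5}


def is_primitive(motif):
    """True if the motif is not itself a repetition of a shorter unit (e.g. 'AA', 'ACAC')."""
    n = len(motif)
    return not any(n % k == 0 and motif == motif[:k] * (n // k) for k in range(1, n))


def find_ssrs(seq, min_repeats=MIN_REPEATS):
    """Return sorted (start, motif, repeat_count) tuples for perfect SSRs in seq."""
    seq = seq.upper()
    hits = []
    for k, min_rep in sorted(min_repeats.items()):
        # A k-bp motif followed by at least (min_rep - 1) further copies of itself.
        pattern = re.compile(r"([ACGT]{%d})\1{%d,}" % (k, min_rep - 1))
        for match in pattern.finditer(seq):
            motif = match.group(1)
            # Skip e.g. 'AA' x 6, which is already reported as 'A' x 12.
            if is_primitive(motif):
                hits.append((match.start(), motif, len(match.group(0)) // k))
    return sorted(hits)


# Toy sequence (invented): a mono-, a di- and a trinucleotide SSR separated by spacers.
toy_seq = "GG" + "A" * 12 + "CT" + "AC" * 7 + "G" + "AGC" * 5 + "T"
toy_hits = find_ssrs(toy_seq)
```

Dividing the number of hits per motif length by the sequence length in Mb would give densities comparable in kind to the per-Mb figures quoted in the abstract, although real tools also handle imperfect and compound repeats that this sketch ignores.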
Affiliation(s)
- Xiao‐Ting Wang
  - Chongqing Key Laboratory of Vector Insects; Chongqing Key Laboratory of Animal Biology; Institute of Entomology and Molecular Biology, Chongqing Normal University, Chongqing, China
- Yu‐Juan Zhang
  - Chongqing Key Laboratory of Vector Insects; Chongqing Key Laboratory of Animal Biology; Institute of Entomology and Molecular Biology, Chongqing Normal University, Chongqing, China
- Liang Qiao
  - Chongqing Key Laboratory of Vector Insects; Chongqing Key Laboratory of Animal Biology; Institute of Entomology and Molecular Biology, Chongqing Normal University, Chongqing, China
- Bin Chen
  - Chongqing Key Laboratory of Vector Insects; Chongqing Key Laboratory of Animal Biology; Institute of Entomology and Molecular Biology, Chongqing Normal University, Chongqing, China
7
Tørresen OK, Brieuc MSO, Solbakken MH, Sørhus E, Nederbragt AJ, Jakobsen KS, Meier S, Edvardsen RB, Jentoft S. Genomic architecture of haddock (Melanogrammus aeglefinus) shows expansions of innate immune genes and short tandem repeats. BMC Genomics 2018; 19:240. [PMID: 29636006 PMCID: PMC5894186 DOI: 10.1186/s12864-018-4616-y] [Received: 08/16/2017] [Accepted: 03/22/2018] [Indexed: 02/06/2023]
Abstract
BACKGROUND Increased availability of genome assemblies for non-model organisms has resulted in invaluable biological and genomic insight into numerous vertebrates, including teleosts. Sequencing of the Atlantic cod (Gadus morhua) genome and the genomes of many of its relatives (Gadiformes) demonstrated a shared loss of the major histocompatibility complex (MHC) II genes 100 million years ago. An improved version of the Atlantic cod genome assembly shows an extreme density of tandem repeats compared to other vertebrate genome assemblies. Highly contiguous assemblies are therefore needed to further investigate the unusual immune system of the Gadiformes, and whether the high density of tandem repeats found in Atlantic cod is a shared trait in this group. RESULTS Here, we have sequenced and assembled the genome of haddock (Melanogrammus aeglefinus), a relative of Atlantic cod, using a combination of PacBio and Illumina reads. Comparative analyses reveal that the haddock genome contains an even higher density of tandem repeats outside and within protein coding sequences than Atlantic cod. Further, both species show an elevated number of tandem repeats in genes mainly involved in signal transduction compared to other teleosts. A characterization of the immune gene repertoire demonstrates a substantial expansion of MHCI in Atlantic cod compared to haddock. In contrast, the Toll-like receptors show a similar pattern of gene losses and expansions. For the NOD-like receptors (NLRs), another gene family associated with the innate immune system, we find a large expansion common to all teleosts, with possible lineage-specific expansions in zebrafish, stickleback and the codfishes. CONCLUSIONS The generation of a highly contiguous genome assembly of haddock revealed that the high density of short tandem repeats as well as expanded immune gene families is not unique to Atlantic cod, but possibly a feature common to all, or most, codfishes. A shared expansion of NLR genes in teleosts suggests that the NLRs have a more substantial role in the innate immunity of teleosts than other vertebrates. Moreover, we find that high copy number genes combined with variable genome assembly qualities may impede complete characterization of these genes, i.e. the number of NLRs in different teleost species might be underestimates.
Affiliation(s)
- Ole K Tørresen
  - Centre for Ecological and Evolutionary Synthesis, Department of Biosciences, University of Oslo, Oslo, Norway
- Marine S O Brieuc
  - Centre for Ecological and Evolutionary Synthesis, Department of Biosciences, University of Oslo, Oslo, Norway
- Monica H Solbakken
  - Centre for Ecological and Evolutionary Synthesis, Department of Biosciences, University of Oslo, Oslo, Norway
- Elin Sørhus
  - Institute of Marine Research, Bergen, Norway
- Alexander J Nederbragt
  - Centre for Ecological and Evolutionary Synthesis, Department of Biosciences, University of Oslo, Oslo, Norway
  - Biomedical Informatics Research Group, Department of Informatics, University of Oslo, Oslo, Norway
- Kjetill S Jakobsen
  - Centre for Ecological and Evolutionary Synthesis, Department of Biosciences, University of Oslo, Oslo, Norway
- Sissel Jentoft
  - Centre for Ecological and Evolutionary Synthesis, Department of Biosciences, University of Oslo, Oslo, Norway
8
Tørresen OK, Star B, Jentoft S, Reinar WB, Grove H, Miller JR, Walenz BP, Knight J, Ekholm JM, Peluso P, Edvardsen RB, Tooming-Klunderud A, Skage M, Lien S, Jakobsen KS, Nederbragt AJ. An improved genome assembly uncovers prolific tandem repeats in Atlantic cod. BMC Genomics 2017; 18:95. [PMID: 28100185 PMCID: PMC5241972 DOI: 10.1186/s12864-016-3448-x] [Received: 07/27/2016] [Accepted: 12/20/2016] [Indexed: 01/06/2023]
Abstract
BACKGROUND The first Atlantic cod (Gadus morhua) genome assembly published in 2011 was one of the early genome assemblies exclusively based on high-throughput 454 pyrosequencing. Since then, rapid advances in sequencing technologies have led to a multitude of assemblies generated for complex genomes, although many of these are of a fragmented nature with a significant fraction of bases in gaps. The development of long-read sequencing and improved software now enables the generation of more contiguous genome assemblies. RESULTS By combining data from Illumina, 454 and the longer PacBio sequencing technologies, as well as integrating the results of multiple assembly programs, we have created a substantially improved version of the Atlantic cod genome assembly. The sequence contiguity of this assembly is increased fifty-fold and the proportion of gap-bases has been reduced fifteen-fold. Compared to other vertebrates, the assembly contains an unusually high density of tandem repeats (TRs). Indeed, retrospective analyses reveal that gaps in the first genome assembly were largely associated with these TRs. We show that 21% of the TRs across the assembly, 19% in the promoter regions and 12% in the coding sequences are heterozygous in the sequenced individual. CONCLUSIONS The inclusion of PacBio reads combined with the use of multiple assembly programs drastically improved the Atlantic cod genome assembly by successfully resolving long TRs. The high frequency of heterozygous TRs within or in the vicinity of genes in the genome indicates considerable standing genomic variation in Atlantic cod populations, which is likely of evolutionary importance.
Affiliation(s)
- Ole K. Tørresen
  - Centre for Ecological and Evolutionary Synthesis, Department of Biosciences, University of Oslo, Oslo, NO-0316 Norway
- Bastiaan Star
  - Centre for Ecological and Evolutionary Synthesis, Department of Biosciences, University of Oslo, Oslo, NO-0316 Norway
- Sissel Jentoft
  - Centre for Ecological and Evolutionary Synthesis, Department of Biosciences, University of Oslo, Oslo, NO-0316 Norway
  - Department of Natural Sciences, University of Agder, Kristiansand, NO-4604 Norway
- William B. Reinar
  - Centre for Ecological and Evolutionary Synthesis, Department of Biosciences, University of Oslo, Oslo, NO-0316 Norway
- Harald Grove
  - Centre for Integrative Genetics (CIGENE), Department of Animal and Aquacultural Sciences, Norwegian University of Life Sciences, Ås, NO-1432 Norway
- Jason R. Miller
  - J. Craig Venter Institute, 9704 Medical Center Drive, Rockville, 20850 MD USA
- Brian P. Walenz
  - Genome Informatics Section, Computational and Statistical Genomics Branch, National Human Genome Research Institute, National Institutes of Health, Bethesda, 20892 MD USA
- James Knight
  - Yale School of Medicine, Yale University, New Haven, 06520 CT USA
- Ave Tooming-Klunderud
  - Centre for Ecological and Evolutionary Synthesis, Department of Biosciences, University of Oslo, Oslo, NO-0316 Norway
- Morten Skage
  - Centre for Ecological and Evolutionary Synthesis, Department of Biosciences, University of Oslo, Oslo, NO-0316 Norway
- Sigbjørn Lien
  - Centre for Integrative Genetics (CIGENE), Department of Animal and Aquacultural Sciences, Norwegian University of Life Sciences, Ås, NO-1432 Norway
- Kjetill S. Jakobsen
  - Centre for Ecological and Evolutionary Synthesis, Department of Biosciences, University of Oslo, Oslo, NO-0316 Norway
- Alexander J. Nederbragt
  - Centre for Ecological and Evolutionary Synthesis, Department of Biosciences, University of Oslo, Oslo, NO-0316 Norway
  - Biomedical Informatics Research Group, Department of Informatics, University of Oslo, Oslo, NO-0316 Norway
9
Cheng J, Zhao Z, Li B, Qin C, Wu Z, Trejo-Saavedra DL, Luo X, Cui J, Rivera-Bustamante RF, Li S, Hu K. A comprehensive characterization of simple sequence repeats in pepper genomes provides valuable resources for marker development in Capsicum. Sci Rep 2016; 6:18919. [PMID: 26739748 PMCID: PMC4703971 DOI: 10.1038/srep18919] [Received: 08/27/2015] [Accepted: 11/30/2015] [Indexed: 02/05/2023]
Abstract
The sequences of the full set of pepper genomes, including nuclear, mitochondrial and chloroplast, are now available for use. However, the overall distribution of simple sequence repeats (SSRs) in these genomes and its practical implications for molecular marker development in Capsicum have not yet been described. Here, an average of 868,047.50, 45.50 and 30.00 SSR loci were identified in the nuclear, mitochondrial and chloroplast genomes of pepper, respectively. Subsequently, systematic comparisons of various species, genome types, motif lengths, repeat numbers and classified types were executed and discussed. In addition, a local database composed of 113,500 in silico unique SSR primer pairs was built using an in-house bioinformatics workflow. As a pilot study, 65 polymorphic markers were validated among a wide collection of 21 Capsicum genotypes, with allele number and polymorphic information content value per marker ranging from 2 to 6 and 0.05 to 0.64, respectively. Finally, a comparison of the clustering results with those of a previous study indicated the usability of the newly developed SSR markers. In summary, this first report on the comprehensive characterization of SSR motifs in pepper genomes and the very large set of SSR primer pairs will benefit various genetic studies in Capsicum.
Affiliation(s)
- Jiaowen Cheng
  - College of Horticulture, South China Agricultural University, Guangzhou 510642, China
- Zicheng Zhao
  - Department of Computer Science, City University of Hong Kong, Hong Kong 999077, China
- Bo Li
  - College of Horticulture, South China Agricultural University, Guangzhou 510642, China
- Cheng Qin
  - Pepper Institute, Zunyi Academy of Agricultural Sciences, Zunyi, Guizhou 563102, China
- Zhiming Wu
  - College of Horticulture and Landscape Architecture, Zhongkai University of Agriculture and Engineering, Guangzhou 510225, China
- Diana L. Trejo-Saavedra
  - Departamento de Ingeniería Genética, Centro de Investigación y de Estudios Avanzados del IPN (Cinvestav)-Unidad Irapuato, Irapuato 36821, México
- Xirong Luo
  - Pepper Institute, Zunyi Academy of Agricultural Sciences, Zunyi, Guizhou 563102, China
- Junjie Cui
  - College of Horticulture, South China Agricultural University, Guangzhou 510642, China
- Rafael F. Rivera-Bustamante
  - Departamento de Ingeniería Genética, Centro de Investigación y de Estudios Avanzados del IPN (Cinvestav)-Unidad Irapuato, Irapuato 36821, México
- Shuaicheng Li
  - Department of Computer Science, City University of Hong Kong, Hong Kong 999077, China
- Kailin Hu
  - College of Horticulture, South China Agricultural University, Guangzhou 510642, China