1
Singh KP, Kumari P, Yadava DK. Development of de-novo transcriptome assembly and SSRs in allohexaploid Brassica with functional annotations and identification of heat-shock proteins for thermotolerance. Front Genet 2022; 13:958217. [PMID: 36186472 PMCID: PMC9524822 DOI: 10.3389/fgene.2022.958217] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 05/31/2022] [Accepted: 08/23/2022] [Indexed: 11/20/2022] Open
Abstract
Crop Brassicas contain monogenomic and digenomic species, with no evidence of a trigenomic Brassica in nature. Through somatic fusion (Sinapis alba + B. juncea), a novel allohexaploid trigenomic Brassica (H1 = AABBSS; 2n = 60) was produced and used for transcriptome analysis to uncover genes for thermotolerance, annotations, and microsatellite markers for future molecular breeding. Illumina NovaSeq 6000 generated a total of 76,055,546 paired-end raw reads, which were used for de-novo assembly, resulting in the development of 486,066 transcripts. A total of 133,167 coding sequences (CDSs) were predicted from transcripts with a mean length of 507.12 bp and 46.15% GC content. The BLASTX search of CDSs against public protein databases showed a maximum of 126,131 (94.72%) and a minimum of 29,810 (22.39%) positive hits. Furthermore, 953,773 gene ontology (GO) terms were found in 77,613 (58.28%) CDSs, which were divided into biological processes (49.06%), cellular components (31.67%), and molecular functions (19.27%). CDSs were assigned to 144 pathways by a pathway study using the KEGG database and 1,551 pathways by a similar analysis using the Reactome database. Further investigation led to the discovery of genes encoding over 2,000 heat shock proteins (HSPs). The discovery of a large number of HSPs in allohexaploid Brassica validated our earlier findings for heat tolerance at seed maturity. A total of 15,736 SSRs were found in 13,595 CDSs, with an average of one SSR per 4.29 kb length and an SSR frequency of 11.82%. The first transcriptome assembly of a meiotically stable allohexaploid Brassica is presented in this article, along with functional annotations and the presence of SSRs, which could aid future genetic and genomic studies.
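The SSR density and frequency quoted in the abstract above can be checked with a few lines of arithmetic. The sketch below assumes (this is not stated in the abstract) that the density is the total CDS length, taken as CDS count times mean CDS length, divided by the SSR count, and that "SSR frequency" means SSRs per CDS. Under those assumptions the reported 4.29 kb and 11.82% are reproduced. All variable names are illustrative.

```python
# Figures quoted in the abstract; the formulas below are assumptions
# about how the reported summary statistics were derived.
n_cds = 133_167            # predicted coding sequences (CDSs)
mean_cds_len_bp = 507.12   # mean CDS length in bp
n_ssrs = 15_736            # SSRs detected
n_cds_with_ssr = 13_595    # CDSs containing at least one SSR

# Total length searched, in kb (assumed: count x mean length).
total_kb = n_cds * mean_cds_len_bp / 1000

# Average spacing: one SSR per this many kb.
kb_per_ssr = total_kb / n_ssrs

# Assumed definition of "SSR frequency": SSRs per CDS, as a percentage.
ssr_frequency_pct = 100 * n_ssrs / n_cds

# Share of CDSs that carry at least one SSR (a different quantity).
cds_with_ssr_pct = 100 * n_cds_with_ssr / n_cds

print(f"one SSR per {kb_per_ssr:.2f} kb")
print(f"SSR frequency: {ssr_frequency_pct:.2f}%")
print(f"CDSs with at least one SSR: {cds_with_ssr_pct:.2f}%")
```

Note that the share of CDSs carrying an SSR is lower than the SSR frequency because some CDSs contain more than one SSR.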
Affiliation(s)
- Preetesh Kumari
- Genetics Division, ICAR—Indian Agricultural Research Institute, New Delhi, India
- *Correspondence: Preetesh Kumari,
2
Li X, Qiao L, Chen B, Zheng Y, Zhi C, Zhang S, Pan Y, Cheng Z. SSR markers development and their application in genetic diversity evaluation of garlic (Allium sativum) germplasm. Plant Diversity 2022; 44:481-491. [PMID: 36187554 PMCID: PMC9512637 DOI: 10.1016/j.pld.2021.08.001] [Citation(s) in RCA: 9] [Impact Index Per Article: 4.5] [Received: 03/23/2021] [Revised: 07/26/2021] [Accepted: 08/01/2021] [Indexed: 05/25/2023]
Abstract
Garlic (Allium sativum), an asexually propagated vegetable and medicinal crop, has abundant genetic variation. Genetic diversity evaluation based on molecular markers has clear advantages owing to their genomic abundance, insensitivity to the environment, and lack of tissue specificity. However, the limited number of available DNA markers, especially SSR markers, is insufficient for genetic diversity assessment studies in garlic. In this study, 4372 EST-SSR markers were newly developed, and 12 polymorphic markers together with 17 other garlic SSR markers were used to assess the genetic diversity and population structure of 127 garlic accessions. The average polymorphism information content (PIC) of these 29 SSR markers was 0.36, ranging from 0.22 to 0.49. Seventy-nine polymorphic loci were detected among these accessions, with an average of 3.48 polymorphic loci per SSR. Clustering analyses based on genetic distances obtained from either the SSR genotype data or the phenotypic data of morphological traits both divided the 127 garlic accessions into three clusters. Moreover, the Mantel test showed that genetic distance had no significant correlation with geographic distance, and only weak correlations were found between genetic distance and the phenotypic traits. AMOVA showed that most of the genetic variation in this garlic germplasm collection existed within populations or clusters. The results of this study will be of great value for genetic and breeding studies in garlic and will enhance the utilization of this garlic germplasm.
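For readers unfamiliar with PIC, the sketch below computes it from the allele frequencies at one locus using the widely used Botstein et al. (1980) formula, PIC = 1 - Σ p_i² - Σ_{i<j} 2 p_i² p_j². This is an assumption for illustration: the abstract does not say which PIC formula or software the authors used, and the function name `pic` is hypothetical.

```python
def pic(allele_freqs):
    """Polymorphism information content for one locus.

    Uses the Botstein et al. (1980) formula (assumed here, not taken
    from the abstract): 1 - sum(p_i^2) - sum_{i<j} 2 * p_i^2 * p_j^2.
    """
    freqs = list(allele_freqs)
    if abs(sum(freqs) - 1.0) > 1e-6:
        raise ValueError("allele frequencies must sum to 1")
    n = len(freqs)
    # Probability that two random alleles are identical.
    homozygosity = sum(p * p for p in freqs)
    # Correction for matings that are uninformative despite heterozygosity.
    cross = sum(
        2 * freqs[i] ** 2 * freqs[j] ** 2
        for i in range(n)
        for j in range(i + 1, n)
    )
    return 1 - homozygosity - cross


# Two equally frequent alleles: 1 - 0.5 - 2 * 0.25 * 0.25 = 0.375
print(pic([0.5, 0.5]))
```

A monomorphic locus (`pic([1.0])`) gives 0, and the value rises as more alleles occur at balanced frequencies.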
3
Lopez L, Wolf EM, Pires JC, Edger PP, Koch MA. Molecular Resources from Transcriptomes in the Brassicaceae Family. Frontiers in Plant Science 2017; 8:1488. [PMID: 28900436 PMCID: PMC5581910 DOI: 10.3389/fpls.2017.01488] [Citation(s) in RCA: 4] [Impact Index Per Article: 0.6] [Received: 06/16/2017] [Accepted: 08/11/2017] [Indexed: 06/07/2023]
Abstract
The rapidly falling costs and the increasing availability of large DNA sequence data sets facilitate the fast and affordable mining of large molecular marker data sets for comprehensive evolutionary studies. The Brassicaceae (mustards) are an important species-rich family in the plant kingdom with taxa distributed worldwide and a complex evolutionary history. We performed Simple Sequence Repeat (SSR) mining using de novo assembled transcriptomes from 19 species across the Brassicaceae in order to study SSR evolution and provide comprehensive sets of molecular markers for genetic studies within the family. Moreover, we selected the genus Cochlearia to test the transferability and polymorphism of these markers among species. Additionally, we annotated the Cochlearia pyrenaica transcriptome in order to identify the position of each of the mined SSRs. While we introduce a new set of tools that will further enable evolutionary studies across the Brassicaceae, we also discuss some broader aspects of SSR evolution. Overall, we developed 2012 ready-to-use SSR markers with their respective primers in 19 Brassicaceae species and a high-quality annotated transcriptome for C. pyrenaica. As indicated by our transferability test with the genus Cochlearia, these SSRs are transferable to species within the genus, increasing exponentially the number of targeted species. Also, our polymorphism results showed substantial levels of variability for these markers. Finally, despite the family's complex evolutionary history, SSR evolution across the Brassicaceae is highly conserved and we found no deviation from patterns reported in other angiosperms.
Affiliation(s)
- Lua Lopez
- Biodiversity and Plant Systematics, Centre of Organismal Studies, University of Heidelberg, Heidelberg, Germany
- Eva M. Wolf
- Biodiversity and Plant Systematics, Centre of Organismal Studies, University of Heidelberg, Heidelberg, Germany
- J. Chris Pires
- Division of Biological Sciences, University of Missouri, Columbia, MO, United States
- Patrick P. Edger
- Department of Horticulture, Michigan State University, East Lansing, MI, United States
- Ecology, Evolutionary Biology and Behavior, Michigan State University, East Lansing, MI, United States
- Marcus A. Koch
- Biodiversity and Plant Systematics, Centre of Organismal Studies, University of Heidelberg, Heidelberg, Germany
4
Mining microsatellite markers from public expressed sequence tags databases for the study of threatened plants. BMC Genomics 2015; 16:781. [PMID: 26463180 PMCID: PMC4603344 DOI: 10.1186/s12864-015-2031-1] [Citation(s) in RCA: 16] [Impact Index Per Article: 1.8] [Received: 02/25/2015] [Accepted: 10/09/2015] [Indexed: 11/10/2022] Open
Abstract
BACKGROUND Simple Sequence Repeats (SSRs) are widely used in population genetic studies but their classical development is costly and time-consuming. The ever-increasing DNA datasets generated by high-throughput techniques offer an inexpensive alternative for SSR discovery. Expressed Sequence Tags (ESTs) have been widely used as an SSR source for plants of economic relevance but their application to non-model species is still modest. METHODS Here, we explored the use of publicly available ESTs (GenBank at the National Center for Biotechnology Information, NCBI) for SSR development in non-model plants, focusing on genera listed by the International Union for the Conservation of Nature (IUCN). We also searched two model genera with fully annotated genomes for EST-SSRs, Arabidopsis and Oryza, and used them as controls for genome distribution analyses. Overall, we downloaded 16 031 555 sequences for 258 plant genera, which were mined for SSRs and their primers with the help of QDD1. Genome distribution analyses in Oryza and Arabidopsis were done by blasting the sequences with SSRs against the Oryza sativa and Arabidopsis thaliana reference genomes using the Basic Local Alignment Search Tool (BLAST) on the NCBI website. Finally, we performed an empirical test to determine the performance of our EST-SSRs in a few individuals from four species of two eudicot genera, Trifolium and Centaurea. RESULTS We explored a total of 14 498 726 EST sequences from the dbEST database (NCBI) in 257 plant genera from the IUCN Red List. We identified a very large number (17 102) of ready-to-test EST-SSRs in most plant genera (193) at no cost. Overall, dinucleotide and trinucleotide repeats were the prevalent types but the abundance of the various types of repeat differed between taxonomic groups. Control genomes revealed that trinucleotide repeats were mostly located in coding regions while dinucleotide repeats were largely associated with untranslated regions. Our results from the empirical test revealed considerable amplification success and transferability between congenerics. CONCLUSIONS The present work represents the first large-scale study developing SSRs by utilizing publicly accessible EST databases in threatened plants. Here we provide a very large number of ready-to-test EST-SSRs (17 102) for 193 genera. The cross-species transferability suggests that the number of possible target species would be large. Since trinucleotide repeats are abundant and mainly linked to exons, they might be useful in evolutionary and conservation studies. Altogether, our study strongly supports the use of EST databases as an extremely affordable and fast alternative for SSR development in threatened plants.
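To make the idea of mining SSRs from sequence data concrete, here is a minimal regex-based sketch that finds perfect di- to hexanucleotide repeats in a single sequence. It is not QDD or any other tool named above, and it does no primer design. The function names and the minimum repeat counts in `MIN_REPEATS` are hypothetical choices for illustration; real pipelines use their own thresholds.

```python
import re

# Hypothetical thresholds: motif length -> minimum number of tandem copies.
MIN_REPEATS = {2: 6, 3: 5, 4: 4, 5: 4, 6: 4}


def is_primitive(motif):
    """True if the motif is not itself a repeat of a shorter unit (e.g. AGAG)."""
    n = len(motif)
    return not any(
        n % d == 0 and motif == motif[:d] * (n // d) for d in range(1, n)
    )


def find_ssrs(seq, min_repeats=MIN_REPEATS):
    """Return sorted (start, motif, copies) tuples for perfect SSRs in seq."""
    seq = seq.upper()
    hits = []
    for k, min_rep in min_repeats.items():
        # One motif of length k followed by at least (min_rep - 1) more copies.
        pattern = re.compile(r"([ACGT]{%d})\1{%d,}" % (k, min_rep - 1))
        for m in pattern.finditer(seq):
            motif = m.group(1)
            # Skip e.g. AGAG reported as a tetranucleotide repeat.
            if is_primitive(motif):
                hits.append((m.start(), motif, len(m.group(0)) // k))
    return sorted(hits)


example = "TTGC" + "AG" * 8 + "CCTA" + "ACG" * 5 + "TT"
print(find_ssrs(example))
```

Start positions are 0-based. The sketch only reports perfect repeats and ignores compound or interrupted microsatellites, which dedicated tools handle.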
5
Chen H, Qiao L, Wang L, Wang S, Blair MW, Cheng X. Assessment of genetic diversity and population structure of mung bean (Vigna radiata) germplasm using EST-based and genomic SSR markers. Gene 2015; 566:175-83. [PMID: 25895480 DOI: 10.1016/j.gene.2015.04.043] [Citation(s) in RCA: 27] [Impact Index Per Article: 3.0] [Received: 12/29/2014] [Revised: 04/15/2015] [Accepted: 04/16/2015] [Indexed: 01/30/2023]
Abstract
Mung bean is an important legume crop in tropical and subtropical countries of Asia and has high nutritional and economic value. However, the genetic diversity of mung bean is poorly characterized. In this study, our goal was to develop and use microsatellite or simple sequence repeat (SSR) markers for germplasm evaluation. In total, 500 novel expressed sequence tag (EST)-based SSRs (eSSRs) and genomic SSRs (gSSRs) were developed from mung bean transcriptome and genome sequences. Of these, only 58 were useful for diversity evaluation in a panel of 157 cultivated and wild mung bean accessions from different collection sites in East Asia. On average, 2.66 alleles were detected per locus, which shows that polymorphism is generally low for the species. The average polymorphic information content (PIC) of gSSRs was higher than that of eSSRs, and most of the polymorphic gSSRs were composed of di- and tri-nucleotide repeats (52.4% and 38.1% of all loci, respectively). The genotypes were differentiated into nine subgroups by cluster analysis, and the wild mung bean accessions separated well from the cultivated accessions. Analysis of molecular variance indicated that 22% of the variance was observed among populations and 78% was due to differences within populations. Clustering and population structure analyses showed that non-Chinese cultivated and wild mung bean accessions were separated from Chinese accessions, but no geographical distinctions existed between genotypes collected in China. Interestingly, the average PIC value of cultivated mung bean (0.36) was higher than that of wild mung bean (0.25), showing that further collecting and wide crosses are necessary for mung bean improvement.
Affiliation(s)
- Honglin Chen
- National Key Facility for Crop Gene Resources Genetic Improvement, Institute of Crop Science, Chinese Academy of Agricultural Sciences, Beijing 100081, China
- Ling Qiao
- National Key Facility for Crop Gene Resources Genetic Improvement, Institute of Crop Science, Chinese Academy of Agricultural Sciences, Beijing 100081, China; College of Agriculture, Shanxi Agricultural University, Taigu 030801, China
- Lixia Wang
- National Key Facility for Crop Gene Resources Genetic Improvement, Institute of Crop Science, Chinese Academy of Agricultural Sciences, Beijing 100081, China
- Suhua Wang
- National Key Facility for Crop Gene Resources Genetic Improvement, Institute of Crop Science, Chinese Academy of Agricultural Sciences, Beijing 100081, China
- Matthew Wohlgemuth Blair
- Department of Agricultural and Environmental Sciences, Tennessee State University, Nashville, TN 37209, USA
- Xuzhen Cheng
- National Key Facility for Crop Gene Resources Genetic Improvement, Institute of Crop Science, Chinese Academy of Agricultural Sciences, Beijing 100081, China
6
Müller BSDF, Sakamoto T, de Menezes IPP, Prado GS, Martins WS, Brondani C, de Barros EG, Vianello RP. Analysis of BAC-end sequences in common bean (Phaseolus vulgaris L.) towards the development and characterization of long motifs SSRs. Plant Molecular Biology 2014; 86:455-470. [PMID: 25164100 DOI: 10.1007/s11103-014-0240-7] [Citation(s) in RCA: 4] [Impact Index Per Article: 0.4] [Received: 11/20/2013] [Accepted: 08/14/2014] [Indexed: 06/03/2023]
Abstract
The increasing volume of genomic data on the Phaseolus vulgaris species has contributed to its importance as a model genetic species and positively affected the investigation of other legumes of scientific and economic value. To expand and gain a more in-depth knowledge of the common bean genome, the ends of a number of bacterial artificial chromosomes (BACs) were sequenced and annotated, and the presence of repetitive sequences was determined. In total, 52,270 BESs (BAC-end sequences), equivalent to 32 Mbp (~6%) of the genome, were processed. In total, 3,789 BES-SSRs were identified, with a distribution of one SSR (simple sequence repeat) per 8.36 kbp, and 2,000 were suitable for the development of SSR markers, of which 194 were evaluated in low-resolution screening. Of 40 BES-SSRs based on long-motif SSRs (≥ trinucleotides) analyzed in high-resolution genotyping, 34 amplified equally well in the Andean and Mesoamerican gene pools, exhibiting an average gene diversity (H_E) of 0.490 and 5.59 alleles/locus; six of these, classified as Class I, showed H_E ≥ 0.7. The PCoA and structure analysis discriminated the gene pools (K = 2, FST = 0.733). Of the 52,270 BESs, 2% corresponded to transcription factors and 3% to transposable elements. Putative functions were identified for 24,321 BESs, and functional categories (gene ontology) were assigned to 19,363. This study identified highly polymorphic BES-SSRs containing tri- to hexanucleotide motifs that bring together relevant genetic characteristics useful for breeding programs. Additionally, the BESs were incorporated into the international genome-sequencing project for the common bean.
Affiliation(s)
- Bárbara Salomão de Faria Müller
- Laboratório de Genética Molecular de Plantas, Instituto de Biotecnologia Aplicada à Agropecuária (BIOAGRO), Universidade Federal de Viçosa (UFV), Viçosa, MG, Brazil
7
Vukosavljev M, Esselink GD, van ’t Westende WPC, Cox P, Visser RGF, Arens P, Smulders MJM. Efficient development of highly polymorphic microsatellite markers based on polymorphic repeats in transcriptome sequences of multiple individuals. Mol Ecol Resour 2014; 15:17-27. [DOI: 10.1111/1755-0998.12289] [Citation(s) in RCA: 39] [Impact Index Per Article: 3.9] [Received: 02/18/2014] [Revised: 05/29/2014] [Accepted: 05/30/2014] [Indexed: 11/26/2022]
Affiliation(s)
- M. Vukosavljev
- Wageningen UR Plant Breeding; Wageningen University & Research Centre; P.O. Box 386 NL-6700AJ Wageningen the Netherlands
- C.T. de Wit Graduate School for Production Ecology and Resource Conservation (PE&RC); Wageningen the Netherlands
- G. D. Esselink
- Wageningen UR Plant Breeding; Wageningen University & Research Centre; P.O. Box 386 NL-6700AJ Wageningen the Netherlands
- W. P. C. van ’t Westende
- Wageningen UR Plant Breeding; Wageningen University & Research Centre; P.O. Box 386 NL-6700AJ Wageningen the Netherlands
- P. Cox
- Roath BV; Eindhoven the Netherlands
- R. G. F. Visser
- Wageningen UR Plant Breeding; Wageningen University & Research Centre; P.O. Box 386 NL-6700AJ Wageningen the Netherlands
- P. Arens
- Wageningen UR Plant Breeding; Wageningen University & Research Centre; P.O. Box 386 NL-6700AJ Wageningen the Netherlands
- M. J. M. Smulders
- Wageningen UR Plant Breeding; Wageningen University & Research Centre; P.O. Box 386 NL-6700AJ Wageningen the Netherlands