Reference Citation Analysis: Find an Article, Find a Category, Find a Journal, Find a Scholar

Total Articles: 98 (from Reference Citation Analysis)
Article PDFs: 49
Cited by ≥ 1: 56
Searched Name: simple sequence repeats

Journal Articles
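
The "Impact Index Per Article" values listed in the table below appear to equal the citation count divided by the "Number of Years" column, rounded to one decimal place (for example, 89 / 12 ≈ 7.4 for the first entry). This page does not state that definition, so the following Python sketch treats it as an assumption inferred from the listed values; the function name `impact_index` is hypothetical.

```python
from decimal import Decimal, ROUND_HALF_UP


def impact_index(citations: int, years: int) -> Decimal:
    """Citations per year, rounded half-up to one decimal place.

    Assumption: this mirrors how the listed "Impact Index Per Article"
    relates to the "Citation(s) in RCA" and "Number of Years" columns.
    It is inferred from the table values, not from a documented formula.
    """
    if years <= 0:
        raise ValueError("years must be a positive integer")
    ratio = Decimal(citations) / Decimal(years)
    # Half-up rounding: 26 / 8 = 3.25 is listed as 3.3, not 3.2.
    return ratio.quantize(Decimal("0.1"), rounding=ROUND_HALF_UP)


if __name__ == "__main__":
    # (citations, years) pairs taken from entries 1, 6 and 8 of the table.
    for citations, years in [(89, 12), (26, 8), (23, 5)]:
        print(citations, years, impact_index(citations, years))
```

`Decimal` with explicit half-up rounding is used instead of the built-in `round`, which rounds exact halves to the nearest even digit and would not reproduce the listed value for entry 6.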

Rank	Citation Analysis	Article Type	Number of Years	Citation(s) in RCA
1	Li C, Zhu Y, Guo X, Sun C, Luo H, Song J, Li Y, Wang L, Qian J, Chen S. Transcriptome analysis reveals ginsenosides biosynthetic genes, microRNAs and simple sequence repeats in Panax ginseng C. A. Meyer. BMC Genomics 2013;14:245. [PMID: 23577925 PMCID: PMC3637502 DOI: 10.1186/1471-2164-14-245] [Citation(s) in RCA: 89] [Impact Index Per Article: 7.4] [Received: 09/02/2012] [Accepted: 04/02/2013] [Indexed: 01/06/2023] Abstract: BACKGROUND: Panax ginseng C. A. Meyer is one of the most widely used medicinal plants. Complete genome information for this species remains unavailable due to its large genome size. At present, analysis of expressed sequence tags is still the most powerful tool for large-scale gene discovery. The global expressed sequence tags from P. ginseng tissues, especially those isolated from stems, leaves and flowers, are still limited, hindering in-depth study of P. ginseng. RESULTS: Two 454 pyrosequencing runs generated a total of 2,423,076 reads from P. ginseng roots, stems, leaves and flowers. The high-quality reads from each of the tissues were independently assembled into separate and shared contigs. In the separately assembled database, 45,849, 6,172, 4,041 and 3,273 unigenes were only found in the roots, stems, leaves and flowers database, respectively. In the jointly assembled database, 178,145 unigenes were observed, including 86,609 contigs and 91,536 singletons. Among the 178,145 unigenes, 105,522 were identified for the first time, of which 65.6% were identified in the stem, leaf or flower cDNA libraries of P. ginseng. After annotation, we discovered 223 unigenes involved in ginsenoside backbone biosynthesis. Additionally, a total of 326 potential cytochrome P450 and 129 potential UDP-glycosyltransferase sequences were predicted based on the annotation results, some of which may encode enzymes responsible for ginsenoside backbone modification. A BLAST search of the obtained high-quality reads identified 14 potential microRNAs in P. ginseng, which were estimated to target 100 protein-coding genes, including transcription factors, transporters and DNA binding proteins, among others. In addition, a total of 13,044 simple sequence repeats were identified from the 178,145 unigenes. CONCLUSIONS: This study provides global expressed sequence tags for P. ginseng, which will contribute significantly to further genome-wide research and analyses in this species. The novel unigenes identified here enlarge the available P. ginseng gene pool and will facilitate gene discovery. In addition, the identification of microRNAs and the prediction of targets from this study will provide information on gene transcriptional regulation in P. ginseng. Finally, the analysis of simple sequence repeats will provide genetic markers for molecular breeding and genetic applications in this species. Key Words: expressed sequence tag microrna simple sequence repeats ginsenoside panax ginseng c. a. meyer MeSH Headings: Breeding Expressed Sequence Tags/metabolism Gene Expression Profiling Genes, Plant/genetics Ginsenosides/biosynthesis MicroRNAs/genetics Microsatellite Repeats/genetics Molecular Sequence Annotation Organ Specificity Panax/genetics Panax/metabolism Plant Structures/genetics	research-article	12	89
2	Isokpehi RD, Simmons SS, Cohly HHP, Ekunwe SIN, Begonia GB, Ayensu WK. Identification of drought-responsive universal stress proteins in viridiplantae. Bioinform Biol Insights 2011;5:41-58. [PMID: 21423406 PMCID: PMC3045048 DOI: 10.4137/bbi.s6061] [Citation(s) in RCA: 45] [Impact Index Per Article: 3.2] [Indexed: 01/10/2023] Abstract: Genes encoding proteins that contain the universal stress protein (USP) domain are known to provide bacteria, archaea, fungi, protozoa, and plants with the ability to respond to a plethora of environmental stresses. Specifically in plants, drought tolerance is a desirable phenotype. However, limited focused and organized functional genomic datasets exist on drought-responsive plant USP genes to facilitate their characterization. The overall objective of the investigation was to identify diverse plant universal stress proteins and Expressed Sequence Tags (ESTs) responsive to water-deficit stress. We hypothesize that cross-database mining of functional annotations in protein and gene transcript bioinformatics resources would help identify candidate drought-responsive universal stress proteins and transcripts from multiple plant species. Our bioinformatics approach retrieved, mined and integrated comprehensive functional annotation data on 511 protein and 1561 EST sequences from 161 viridiplantae taxa. A total of 32 drought-responsive ESTs from 7 plant genera Glycine, Hordeum, Manihot, Medicago, Oryza, Pinus and Triticum were identified. Two Arabidopsis USP genes At3g62550 and At3g53990 that encode an ATP-binding motif were up-regulated in a drought microarray dataset. Further, a dataset of 80 simple sequence repeats (SSRs) linked to 20 singletons and 47 transcript assemblies was constructed. Integrating the datasets on SSRs and drought-responsive ESTs identified three drought-responsive ESTs from bread wheat (BE604157), soybean (BM887317) and maritime pine (BX682209). The SSR sequence types were CAG, ATA and AT respectively. The datasets from cross-database mining provide organized resources for the characterization of USP genes as useful targets for engineering plant varieties tolerant to unfavorable environmental conditions. Key Words: Pfam Uniprot drought expressed sequence tags microsatellite plants salinity simple sequence repeats universal stress protein domain viridiplantae	Journal Article	14	45
3	Ai B, Gao Y, Zhang X, Tao J, Kang M, Huang H. Comparative transcriptome resources of eleven Primulina species, a group of 'stone plants' from a biodiversity hot spot. Mol Ecol Resour 2014;15:619-32. [PMID: 25243665 DOI: 10.1111/1755-0998.12333] [Citation(s) in RCA: 39] [Impact Index Per Article: 3.5] [Received: 05/16/2014] [Revised: 09/15/2014] [Accepted: 09/17/2014] [Indexed: 11/28/2022] Abstract: The genus Primulina is an emerging model system in studying the drivers and mechanisms of species diversification, for its high species richness and endemism, together with high degree of habitat specialization. In this study, we sequenced transcriptomes for eleven Primulina species across the phylogeny of the genus using the Illumina HiSeq 2000 platform. A total of 336 million clean reads were processed into 355 573 unigenes with a mean length of 1336 bp and an N50 value of 2191 bp after pooling and reassembling twelve individual pre-assembled unigene sets. Of these unigenes, 249 973 (70%) were successfully annotated and 256 601 (72%) were identified as coding sequences (CDSs). We identified a total of 38 279 simple sequence repeats (SSRs) and 367 123 single nucleotide polymorphisms (SNPs). Marker validation assay revealed that 354 (27.3%) of the 1296 SSR and 795 (39.6%) of the 2008 SNP loci showed successful genotyping performance and exhibited expected polymorphism profiles. We screened 834 putative single-copy nuclear genes and proved their high effectiveness in phylogeny construction and estimation of ancestral population parameters. We identified a total of 85 candidate orthologs under positive selection for 46 of the 66 species pairs. This study provided an efficient application of RNA-seq in development of genomic resources for a group of 'stone plants' from south China Karst regions, a biodiversity hot spot of the World. The assembled unigenes with annotations and the massive gene-associated molecular markers would help guide further molecular systematic, population genetic and ecological genomics studies in Primulina and its relatives. Key Words: Primulina positive selection simple sequence repeats single nucleotide polymorphisms single-copy nuclear genes transcriptome	Research Support, Non-U.S. Gov't	11	39
4	Adams RH, Blackmon H, Reyes-Velasco J, Schield DR, Card DC, Andrew AL, Waynewood N, Castoe TA. Microsatellite landscape evolutionary dynamics across 450 million years of vertebrate genome evolution. Genome 2016;59:295-310. [PMID: 27064176 DOI: 10.1139/gen-2015-0124] [Citation(s) in RCA: 30] [Impact Index Per Article: 3.3] [Indexed: 11/22/2022] Abstract: The evolutionary dynamics of simple sequence repeats (SSRs or microsatellites) across the vertebrate tree of life remain largely undocumented and poorly understood. In this study, we analyzed patterns of genomic microsatellite abundance and evolution across 71 vertebrate genomes. The highest abundances of microsatellites exist in the genomes of ray-finned fishes, squamate reptiles, and mammals, while crocodilian, turtle, and avian genomes exhibit reduced microsatellite landscapes. We used comparative methods to infer evolutionary rates of change in microsatellite abundance across vertebrates and to highlight particular lineages that have experienced unusually high or low rates of change in genomic microsatellite abundance. Overall, most variation in microsatellite content, abundance, and evolutionary rate is observed among major lineages of reptiles, yet we found that several deeply divergent clades (i.e., squamate reptiles and mammals) contained relatively similar genomic microsatellite compositions. Archosauromorph reptiles (turtles, crocodilians, and birds) exhibit reduced genomic microsatellite content and the slowest rates of microsatellite evolution, in contrast to squamate reptile genomes that have among the highest rates of microsatellite evolution. Substantial branch-specific shifts in SSR content in primates, monotremes, rodents, snakes, and fish are also evident. Collectively, our results support multiple major shifts in microsatellite genomic landscapes among vertebrates. Key Words: comparative genomics ensemencement de microsatellites génomique comparée microsatellite seeding repeat elements répétitions en tandem simple sequence repeats séquences simples répétées tandem repeats éléments répétés	Journal Article	9	30
5	Cherukupalli N, Divate M, Mittapelli SR, Khareedu VR, Vudem DR. De novo Assembly of Leaf Transcriptome in the Medicinal Plant Andrographis paniculata. FRONTIERS IN PLANT SCIENCE 2016;7:1203. [PMID: 27582746 PMCID: PMC4987368 DOI: 10.3389/fpls.2016.01203] [Citation(s) in RCA: 27] [Impact Index Per Article: 3.0] [Received: 02/15/2016] [Accepted: 07/28/2016] [Indexed: 05/05/2023] Abstract: Andrographis paniculata is an important medicinal plant containing various bioactive terpenoids and flavonoids. Despite its importance in herbal medicine, no ready-to-use transcript sequence information for this plant is available in public databases. This study mainly deals with the sequencing of RNA from A. paniculata leaf using the Illumina HiSeq™ 2000 platform followed by de novo transcriptome assembly. A total of 189.22 million high quality paired reads were generated and 170,724 transcripts were predicted in the primary assembly. Secondary assembly generated a transcriptome size of ~88 Mb with 83,800 clustered transcripts. Based on the similarity searches against plant non-redundant protein database, gene ontology, and eukaryotic orthologous groups, 49,363 transcripts were annotated constituting up to 58.91% of the identified unigenes. Annotation of transcripts using the Kyoto Encyclopedia of Genes and Genomes database revealed 5606 transcripts plausibly involved in 140 pathways including biosynthesis of terpenoids and other secondary metabolites. Transcription factor analysis showed 6767 unique transcripts belonging to 97 different transcription factor families. A total number of 124 CYP450 transcripts belonging to seven divergent clans have been identified. Transcriptome revealed 146 different transcripts coding for enzymes involved in the biosynthesis of terpenoids of which 35 contained terpene synthase motifs. This study also revealed 32,341 simple sequence repeats (SSRs) in 23,168 transcripts. Assembled sequences of transcriptome of A. paniculata generated in this study are made available, for the first time, in the TSA database, which provides useful information for functional and comparative genomic analysis besides identification of key enzymes involved in the various pathways of secondary metabolism. Key Words: Andrographis paniculata cytochrome P450 de novo assembly leaf transcriptome simple sequence repeats terpenoid biosynthesis	research-article	9	27
6	Tanwar UK, Pruthi V, Randhawa GS. RNA-Seq of Guar (Cyamopsis tetragonoloba, L. Taub.) Leaves: De novo Transcriptome Assembly, Functional Annotation and Development of Genomic Resources. FRONTIERS IN PLANT SCIENCE 2017;8:91. [PMID: 28210265 PMCID: PMC5288370 DOI: 10.3389/fpls.2017.00091] [Citation(s) in RCA: 26] [Impact Index Per Article: 3.3] [Received: 09/21/2016] [Accepted: 01/16/2017] [Indexed: 05/24/2023] Abstract: Genetic improvement in industrially important guar (Cyamopsis tetragonoloba, L. Taub.) crop has been hindered due to the lack of sufficient genomic or transcriptomic resources. In this study, RNA-Seq technology was employed to characterize the transcriptome of leaf tissues from two guar varieties, namely, M-83 and RGC-1066. Approximately 30 million high-quality paired-end reads of each variety generated by Illumina HiSeq platform were used for de novo assembly by Trinity program. A total of 62,146 non-redundant unigenes with an average length of 679 bp were obtained. The quality assessment of assembled unigenes revealed 87.50% of complete and 97.18% partial core eukaryotic genes (CEGs). Sequence similarity analyses and annotation of the unigenes against non-redundant protein (Nr) and Gene Ontology (GO) databases identified 175,882 GO annotations. A total of 11,308 guar unigenes were annotated with various enzyme codes (EC) and categorized in six categories with 55 subclasses. The annotation of biochemical pathways resulted in a total of 11,971 unigenes assigned with 145 KEGG maps and 1759 enzyme codes. The species distribution analysis of the unigenes showed highest similarity with Glycine max genes. A total of 5773 potential simple sequence repeats (SSRs) and 3594 high-quality single nucleotide polymorphisms (SNPs) were identified. Out of 20 randomly selected SSRs for wet laboratory validation, 13 showed consistent PCR amplification in both guar varieties. In silico studies identified 145 polymorphic SSR markers in two varieties. To the best of our knowledge, this is the first report on transcriptome analysis and SNP identification in guar to date. Key Words: molecular markers next generation sequencing simple sequence repeats single nucleotide polymorphisms transcriptome analysis	research-article	8	26
7	Zhu C, Tong J, Yu X, Guo W, Wang X, Liu H, Feng X, Sun Y, Liu L, Fu B. A second-generation genetic linkage map for bighead carp (Aristichthys nobilis) based on microsatellite markers. Anim Genet 2014;45:699-708. [PMID: 25040196 DOI: 10.1111/age.12194] [Citation(s) in RCA: 23] [Impact Index Per Article: 2.1] [Accepted: 05/13/2014] [Indexed: 01/03/2023] Abstract: Bighead carp (Aristichthys nobilis) is an important aquaculture fish worldwide. Genetic linkage maps for the species were previously reported, but map resolution remained to be improved. In this study, a second-generation genetic linkage map was constructed for bighead carp through a pseudo-testcross strategy using interspecific hybrids between bighead carp and silver carp. Of the 754 microsatellites genotyped in two interspecific mapping families (with 77 progenies for each family), 659 markers were assigned to 24 linkage groups, which were equal to the chromosome numbers of the haploid genome. The consensus map spanned 1917.3 cM covering 92.8% of the estimated bighead carp genome with an average marker interval of 2.9 cM. The length of linkage groups ranged from 52.2 to 133.5 cM with an average of 79.9 cM. The number of markers per linkage group varied from 11 to 55 with an average of 27.5 per linkage group. Normality tests on interval distances of the map showed a non-normal marker distribution; however, significant correlation was found between the length of linkage group and the number of markers below the 0.01 significance level (two-tailed). The length of the female map was 1.12 times that of the male map, and the average recombination ratio of female to male was 1.10:1. Visual inspection showed that distorted markers gathered in some linkage groups and in certain regions of the male and female maps. This well-defined genetic linkage map will provide a basic framework for further genome mapping of quantitative traits, comparative mapping and marker-assisted breeding in bighead carp. Key Words: Aristichthys nobilis genetic map interspecific hybrids pseudo-testcross simple sequence repeats	Research Support, Non-U.S. Gov't	11	23
8	Eisenbart SK, Alzheimer M, Pernitzsch SR, Dietrich S, Stahl S, Sharma CM. A Repeat-Associated Small RNA Controls the Major Virulence Factors of Helicobacter pylori. Mol Cell 2020;80:210-226.e7. [PMID: 33002424 DOI: 10.1016/j.molcel.2020.09.009] [Citation(s) in RCA: 23] [Impact Index Per Article: 4.6] [Received: 02/05/2020] [Revised: 07/29/2020] [Accepted: 09/04/2020] [Indexed: 12/12/2022] Abstract: Many bacterial pathogens regulate their virulence genes via phase variation, whereby length-variable simple sequence repeats control the transcription or coding potential of those genes. Here, we have exploited this relationship between DNA structure and physiological function to discover a globally acting small RNA (sRNA) regulator of virulence in the gastric pathogen Helicobacter pylori. Our study reports the first sRNA whose expression is affected by a variable thymine (T) stretch in its promoter. We show the sRNA post-transcriptionally represses multiple major pathogenicity factors of H. pylori, including CagA and VacA, by base pairing to their mRNAs. We further demonstrate transcription of the sRNA is regulated by the nickel-responsive transcriptional regulator NikR (thus named NikS for nickel-regulated sRNA), thereby linking virulence factor regulation to nickel concentrations. Using in-vitro infection experiments, we demonstrate NikS affects host cell internalization and epithelial barrier disruption. Together, our results show NikS is a phase-variable, post-transcriptional global regulator of virulence properties in H. pylori. Key Words: CagA Helicobacter pylori VacA regulatory networks phase-variation post-transcriptional regulation simple sequence repeats small regulatory RNA virulence factors virulence regulator	Research Support, Non-U.S. Gov't	5	23
9	Ding M, Jiang Y, Cao Y, Lin L, He S, Zhou W, Rong J. Gene expression profile analysis of Ligon lintless-1 (Li1) mutant reveals important genes and pathways in cotton leaf and fiber development. Gene 2013;535:273-85. [PMID: 24279997 DOI: 10.1016/j.gene.2013.11.017] [Citation(s) in RCA: 22] [Impact Index Per Article: 1.8] [Received: 04/08/2013] [Revised: 11/02/2013] [Accepted: 11/13/2013] [Indexed: 12/14/2022] Abstract: Ligon lintless-1 (Li1) is a monogenic dominant mutant of Gossypium hirsutum (upland cotton) with a phenotype of impaired vegetative growth and short lint fibers. Despite years of research involving genetic mapping and gene expression profile analysis of Li1 mutant ovule tissues, the gene remains uncloned and the underlying pathway of cotton fiber elongation is still unclear. In this study, we report the whole genome-level deep-sequencing analysis of leaf tissues of the Li1 mutant. Differentially expressed genes in leaf tissues of mutant versus wild-type (WT) plants are identified, and the underlying pathways and potential genes that control leaf and fiber development are inferred. The results show that transcription factors AS2, YABBY5, and KANDI-like are significantly differentially expressed in mutant tissues compared with WT ones. Interestingly, several fiber development-related genes are found in the downregulated gene list of the mutant leaf transcriptome. These genes include heat shock protein family, cytoskeleton arrangement, cell wall synthesis, energy, H2O2 metabolism-related genes, and WRKY transcription factors. This finding suggests that the genes are involved in leaf morphology determination and fiber elongation. The expression data are also compared with the previously published microarray data of Li1 ovule tissues. Comparative analysis of the ovule transcriptomes of Li1 and WT reveals that a number of pathways important for fiber elongation are enriched in the downregulated gene list at different fiber development stages (0, 6, 9, 12, 15, 18 dpa). Differentially expressed genes identified in both leaf and fiber samples are aligned with cotton whole genome sequences and combined with the genetic fine mapping results to identify a list of candidate genes for Li1. Key Words: 1-aminocyclopropane-1-carboxylic acid oxidase 6-phosphogluconate dehydrogenase 6PGDH ACO ADNI1 ALDH AP2 AP2-EREBP transcription factors AS1 AS2 ASYMMETRIC LEAVES1 ASYMMETRIC LEAVES2 ATBFRUCT1 ATSS3 Arabidopsis beta-fructofuranosidase Arabidopsis starch synthase 3 COX DEG DNA complementary to RNA DPA EST FDR Fiber development Fl G-6-PDH GA GAPC GL1 GLABROUS1 GO GhACT1 GhAPX GhKCH1 GhPOX Gossypium hirsutum actin1 Gossypium hirsutum ascorbate peroxidase Gossypium hirsutum kinesin1 Gossypium hirsutum peroxidase HSP IAA-biosynthetic gene iaaM KNAT1 KNAT2 KNAT6 LHCA LHCB Li1 Li2 Ligon lintless-1 Ligon lintless-2 MDH NAD NAD(P)H dehydrogenase NDF4 NDH-dependent cyclic electron flow 4 PCR PFK PGK PIP2 PIP2 aquaporins PPDK PSAF PSAH PSBQ RFLP RNA-seq RNase ROS RT-PCR SHM4 SSCP SSR TCA cycle Transcriptome UBQ7 UCP5 UDP-glucose 4-epimerase gene UGE WT XTH adenine nucleotide transporter 1 aldehyde dehydrogenase cDNA chlorophyll a/b-binding protein cytochrome c oxidase dNTPs day post anthesis deoxyribonucleoside triphosphates differentially expressed genes expressed sequence tag false discovery rate fuzzless–lintless mutant gene ontology gibberellic acid glucose-6-phosphate dehydrogenase glyceraldehyde-3-phosphate dehydrogenase heat shock protein homeobox protein knotted-1-like 1 homeobox protein knotted-1-like 2 homeobox protein knotted-1-like 6 iaaM light harvesting complex protein malate dehydrogenase phosphofructokinase phosphoglycerate kinase photosystem I reaction center subunit III photosystem I reaction center subunit VI polymerase chain reaction pyruvate orthophosphate dikinase reactive oxygen species real time PCR restriction fragment length polymorphism ribonuclease serine hydroxymethyltransferase 4 simple sequence repeats single-strand conformation polymorphism subunit of oxygen evolving system of photosystem II tricarboxylic acid cycle ubiquitin 7 uncoupling protein 5 wild type xyloglucan endotransglycosylases/hydrolases	Research Support, Non-U.S. Gov't	12	22
10	D'Agostino N, Tamburino R, Cantarella C, De Carluccio V, Sannino L, Cozzolino S, Cardi T, Scotti N. The Complete Plastome Sequences of Eleven Capsicum Genotypes: Insights into DNA Variation and Molecular Evolution. Genes (Basel) 2018;9:E503. [PMID: 30336638 PMCID: PMC6210379 DOI: 10.3390/genes9100503] [Citation(s) in RCA: 20] [Impact Index Per Article: 2.9] [Received: 09/17/2018] [Revised: 10/11/2018] [Accepted: 10/11/2018] [Indexed: 11/16/2022] Abstract: Members of the genus Capsicum are of great economic importance, including both wild forms and cultivars of peppers and chilies. The high number of potentially informative characteristics that can be identified through next-generation sequencing technologies gave a huge boost to evolutionary and comparative genomic research in higher plants. Here, we determined the complete nucleotide sequences of the plastomes of eight Capsicum species (eleven genotypes), representing the three main taxonomic groups in the genus and estimated molecular diversity. Comparative analyses highlighted a wide spectrum of variation, ranging from point mutations to small/medium size insertions/deletions (InDels), with accD, ndhB, rpl20, ycf1, and ycf2 being the most variable genes. The global pattern of sequence variation is consistent with the phylogenetic signal. Maximum-likelihood tree estimation revealed that Capsicum chacoense is sister to the baccatum complex. Divergence and positive selection analyses unveiled that protein-coding genes were generally well conserved, but we identified 25 positive signatures distributed in six genes involved in different essential plastid functions, suggesting positive selection during evolution of Capsicum plastomes. Finally, the identified sequence variation allowed us to develop simple PCR-based markers useful in future work to discriminate species belonging to different Capsicum complexes. Key Words: chloroplast genome microsatellites molecular markers next-generation sequencing pepper perfect tandem repeats sequence variability simple sequence repeats single-nucleotide polymorphism Grants: PON02_00395_3215002 "GenHORT". Italian Ministry of Research (MIUR)	research-article	7	20
11	Molecular Genotyping (SSR) and Agronomic Phenotyping for Utilization of Durum Wheat (Triticum durum Desf.) Ex Situ Collection from Southern Italy: A Combined Approach Including Pedigreed Varieties. Genes (Basel) 2018;9:genes9100465. [PMID: 30241387 PMCID: PMC6211131 DOI: 10.3390/genes9100465] [Citation(s) in RCA: 19] [Impact Index Per Article: 2.7] [Received: 08/16/2018] [Revised: 09/18/2018] [Accepted: 09/18/2018] [Indexed: 11/17/2022] Abstract: In South Italy durum wheat (Triticum durum Desf.) has a long tradition of growing and breeding. Accessions collected and now preserved ex situ are a valuable genetic resource, but their effective use in agriculture and breeding programs remains very low. In this study, a small number (44) of simple sequence repeat (SSR) molecular markers were used to detect patterns of diversity for 136 accessions collected in South Italy over time, to identify the genepool of origin, and to establish similarities with 28 Italian varieties with known pedigree grown in Italy over the same time period. Phenotyping was conducted for 12 morphophysiological characters of agronomic interest. Based on discriminant analysis of principal components (DAPC) and STRUCTURE analysis, six groups were identified; the assignment of varieties reflected the genetic basis and breeding strategies involved in their development. Some "old" varieties grown today are the result of evolution through natural hybridization and conservative pure line selection. A small number of molecular markers and little phenotyping coupled with powerful statistical analysis and comparison to pedigreed varieties can provide enough information on the genetic structure of durum wheat germplasm for a quick screening of the germplasm collection able to identify accessions for breeding or introduction in low input agriculture. Key Words: Triticum durum (Desf.) genetic diversity germplasm morphophysiological traits simple sequence repeats	Journal Article	7	19
12	Molnár I, Cifuentes M, Schneider A, Benavente E, Molnár-Láng M. Association between simple sequence repeat-rich chromosome regions and intergenomic translocation breakpoints in natural populations of allopolyploid wild wheats. ANNALS OF BOTANY 2011;107:65-76. [PMID: 21036694 PMCID: PMC3002473 DOI: 10.1093/aob/mcq215] [Citation(s) in RCA: 18] [Impact Index Per Article: 1.3] [Received: 05/08/2010] [Revised: 07/16/2010] [Accepted: 09/21/2010] [Indexed: 05/04/2023] Abstract: BACKGROUND AND AIMS: Repetitive DNA sequences are thought to be involved in the formation of chromosomal rearrangements. The aim of this study was to analyse the distribution of microsatellite clusters in Aegilops biuncialis and Aegilops geniculata, and its relationship with the intergenomic translocations in these allotetraploid species, wild genetic resources for wheat improvement. METHODS: The chromosomal localization of (ACG)(n) and (GAA)(n) microsatellite sequences in Ae. biuncialis and Ae. geniculata and in their diploid progenitors Aegilops comosa and Aegilops umbellulata was investigated by sequential in situ hybridization with simple sequence repeat (SSR) probes and repeated DNA probes (pSc119·2, Afa family and pTa71) and by dual-colour genomic in situ hybridization (GISH). Thirty-two Ae. biuncialis and 19 Ae. geniculata accessions were screened by GISH for intergenomic translocations, which were further characterized by fluorescence in situ hybridization and GISH. KEY RESULTS: Single pericentromeric (ACG)(n) signals were localized on most U and on some M genome chromosomes, whereas strong pericentromeric and several intercalary and telomeric (GAA)(n) sites were observed on the Aegilops chromosomes. Three Ae. biuncialis accessions carried 7U(b)-7M(b) reciprocal translocations and one had a 7U(b)-1M(b) rearrangement, while two Ae. geniculata accessions carried 7U(g)-1M(g) or 5U(g)-5M(g) translocations. Conspicuous (ACG)(n) and/or (GAA)(n) clusters were located near the translocation breakpoints in eight of the ten translocated chromosomes analysed, SSR bands and breakpoints being statistically located at the same chromosomal site in six of them. CONCLUSIONS: Intergenomic translocation breakpoints are frequently mapped to SSR-rich chromosomal regions in the allopolyploid species examined, suggesting that microsatellite repeated DNA sequences might facilitate the formation of those chromosomal rearrangements. The (ACG)(n) and (GAA)(n) SSR motifs serve as additional chromosome markers for the karyotypic analysis of UM genome Aegilops species. Key Words: aegilops sp. simple sequence repeats karyotype evolution intergenomic translocations two-colour gish fish MeSH Headings: Biological Evolution Chromosomes, Plant/genetics Genome, Plant Hybridization, Genetic In Situ Hybridization In Situ Hybridization, Fluorescence Microsatellite Repeats Minisatellite Repeats Phylogeny Ploidies Poaceae/genetics Translocation, Genetic	research-article	14	18
13	Avvaru AK, Saxena S, Sowpati DT, Mishra RK. MSDB: A Comprehensive Database of Simple Sequence Repeats. Genome Biol Evol 2018;9:1797-1802. [PMID: 28854643 PMCID: PMC5533116 DOI: 10.1093/gbe/evx132] [Citation(s) in RCA: 17] [Impact Index Per Article: 2.4] [Accepted: 06/12/2017] [Indexed: 11/13/2022] Abstract: Microsatellites, also known as Simple Sequence Repeats (SSRs), are short tandem repeats of 1-6 nt motifs present in all genomes, particularly eukaryotes. Besides their usefulness as genome markers, SSRs have been shown to perform important regulatory functions, and variations in their length at coding regions are linked to several disorders in humans. Microsatellites show a taxon-specific enrichment in eukaryotic genomes, and some may be functional. MSDB (Microsatellite Database) is a collection of >650 million SSRs from 6,893 species including Bacteria, Archaea, Fungi, Plants, and Animals. This database is by far the most exhaustive resource to access and analyze SSR data of multiple species. In addition to exploring data in a customizable tabular format, users can view and compare the data of multiple species simultaneously using our interactive plotting system. MSDB is developed using the Django framework and MySQL. It is freely available at http://tdb.ccmb.res.in/msdb. Key Words: Django; JavaScript; database; genomics; microsatellites; simple sequence repeats.	Research Support, Non-U.S. Gov't	7	17
14	Sihag P, Sagwal V, Kumar A, Balyan P, Mir RR, Dhankher OP, Kumar U. Discovery of miRNAs and Development of Heat-Responsive miRNA-SSR Markers for Characterization of Wheat Germplasm for Terminal Heat Tolerance Breeding. Front Genet 2021;12:699420. [PMID: 34394189 PMCID: PMC8356722 DOI: 10.3389/fgene.2021.699420] [Citation(s) in RCA: 16] [Impact Index Per Article: 4.0] [Received: 04/23/2021] [Accepted: 06/30/2021] [Indexed: 11/13/2022] Abstract: A large proportion of the Asian population fulfills their energy requirements from wheat (Triticum aestivum L.). Wheat quality and yield are critically affected by terminal heat stress across the globe. It affects approximately 40% of the wheat-cultivating regions of the world. Therefore, there is a critical need to develop improved terminal heat-tolerant wheat varieties. Marker-assisted breeding with genic simple sequence repeat (SSR) markers has been used for developing terminal heat-tolerant wheat varieties; however, only a few studies have involved the use of microRNA (miRNA)-based SSR markers (miRNA-SSRs) in wheat, which were found to be key players in various abiotic stresses. In the present study, we identified 104 heat-stress-responsive miRNAs reported in various crops. Out of these, 70 miRNA-SSR markers have been validated on a set of 20 terminal heat-tolerant and heat-susceptible wheat genotypes. Among these, only 19 miRNA-SSR markers were found to be polymorphic, which were further used to study the genetic diversity and population structure. The polymorphic miRNA-SSRs amplified 61 SSR loci with an average of 2.9 alleles per locus. The polymorphic information content (PIC) value of polymorphic miRNA-SSRs ranged from 0.10 to 0.87 with a mean value of 0.48. The dendrogram constructed using the unweighted neighbor-joining method and population structure analysis clustered these 20 wheat genotypes into 3 clusters. The target genes of these miRNAs are involved either directly or indirectly in providing tolerance to heat stress. Furthermore, two polymorphic markers, miR159c and miR165b, were declared very promising diagnostic markers, since these markers showed specific alleles and discriminated terminal heat-tolerant genotypes from the susceptible genotypes. Thus, these identified miRNA-SSR markers will prove useful in the characterization of wheat germplasm through the study of genetic diversity and population structural analysis and in wheat molecular breeding programs aimed at terminal heat tolerance of wheat varieties. Key Words: Triticum aestivum L.; genetic diversity; heat responsive genes; marker assisted breeding; population structure; simple sequence repeats.	Journal Article	4	16
15	Leino MW, Boström E, Hagenblad J. Twentieth-century changes in the genetic composition of Swedish field pea metapopulations. Heredity (Edinb) 2013;110:338-46. [PMID: 23169556 PMCID: PMC3607183 DOI: 10.1038/hdy.2012.93] [Citation(s) in RCA: 16] [Impact Index Per Article: 1.3] [Received: 05/14/2012] [Revised: 10/04/2012] [Accepted: 10/05/2012] [Indexed: 01/08/2023] Abstract: Landrace crops are formed by local adaptation, genetic drift and gene flow through seed exchange. In reverse, the study of genetic structure between landrace populations can reveal the effects of these forces over time. We present here the analysis of genetic diversity in 40 Swedish field pea (Pisum sativum L.) populations, either available as historical seed samples from the late nineteenth century or as extant gene bank accessions assembled in the late twentieth century. The historical material shows constant high levels of within-population diversity, whereas the extant accessions show varying, and overall lower, levels of within-population diversity. Structure and principal component analysis cluster most accessions, both extant and historical, in groups after geographical origin. County-wise analyses of the accessions show that the genetic diversity of the historical accessions is largely overlapping. In contrast, most extant accessions show signs of genetic drift. They harbor a subset of the alleles found in the historical accessions and are more differentiated from each other. These results reflect how historically present metapopulations have been preserved during the twentieth century, although as genetically isolated populations. Key Words: Pisum sativum; simple sequence repeats; population structure; landraces; aged DNA; seed exchange. MeSH Headings: Alleles; Breeding; Genetic Variation; Genetics, Population; Geography; Pisum sativum/genetics; Seeds/genetics; Selection, Genetic; Sweden.	research-article	12	16
16	Wu G, Zhang L, Yin Y, Wu J, Yu L, Zhou Y, Li M. Sequencing, de novo assembly and comparative analysis of Raphanus sativus transcriptome. FRONTIERS IN PLANT SCIENCE 2015;6:198. [PMID: 26029219 PMCID: PMC4428447 DOI: 10.3389/fpls.2015.00198] [Citation(s) in RCA: 15] [Impact Index Per Article: 1.5] [Received: 10/23/2014] [Accepted: 03/12/2015] [Indexed: 05/29/2023] Abstract: Raphanus sativus is an important Brassicaceae plant and also an edible vegetable with great economic value. However, there is currently not enough transcriptome information on R. sativus tissues, which impedes further functional genomics research on R. sativus. In this study, RNA-seq technology was employed to characterize the transcriptome of leaf tissues. Approximately 70 million clean paired-end reads were obtained and used for de novo assembly by the Trinity program, which generated 68,086 unigenes with an average length of 576 bp. All the unigenes were annotated against GO and KEGG databases. Meanwhile, we merged leaf sequencing data with existing root sequencing data and obtained a better de novo assembly of R. sativus using the Oases program. Accordingly, potential simple sequence repeats (SSRs), transcription factors (TFs) and enzyme codes were identified in R. sativus. Additionally, we detected a total of 3563 significantly differentially expressed genes (DEGs, P = 0.05) and tissue-specific biological processes between leaf and root tissues. Furthermore, a TFs-based regulation network was constructed using Cytoscape software. Taken together, these results not only provide a comprehensive genomic resource of R. sativus but also shed light on functional genomic and proteomic research on R. sativus in the future. Key Words: RNA sequencing; Raphanus sativus; simple sequence repeats; transcription factor; transcriptome.	research-article	10	15
17	Herbert A. Simple Repeats as Building Blocks for Genetic Computers. Trends Genet 2020;36:739-750. [PMID: 32690316 DOI: 10.1016/j.tig.2020.06.012] [Citation(s) in RCA: 14] [Impact Index Per Article: 2.8] [Received: 04/03/2020] [Revised: 06/17/2020] [Accepted: 06/22/2020] [Indexed: 11/15/2022] Abstract: Processing of RNA involves heterogeneous nuclear ribonucleoproteins. The simple sequence repeats (SSRs) they bind can also adopt alternative DNA structures, like Z DNA, triplexes, G quadruplexes, and I motifs. Those SSRs capable of switching conformation under physiological conditions (called flipons) are genetic elements that can encode alternative RNA processing by their effects on RNA processivity, most likely as DNA:RNA hybrids. Flipons are elements of a binary, instructive genetic code directing how genomic sequences are compiled into transcripts. The combinatorial nature of this code provides a rich set of options for creating genetic computers able to reproduce themselves and use a heritable and evolvable code to optimize survival. The underlying computational logic potentiates a diverse set of genetic programs that modify cis-mediated heritability and disease risk. Key Words: DNA computing; G quadruplex; I motif; R loop; RNA splicing; Z DNA; flipons; simple sequence repeats; triplex. MeSH Headings: Animals; DNA/chemistry; DNA/genetics; G-Quadruplexes; Genetic Code; Genome; Genomics; Humans; Microsatellite Repeats; RNA/chemistry; RNA/genetics.	Review	5	14
18	De Novo Sequencing and Hybrid Assembly of the Biofuel Crop Jatropha curcas L.: Identification of Quantitative Trait Loci for Geminivirus Resistance. Genes (Basel) 2019;10:genes10010069. [PMID: 30669588 PMCID: PMC6356885 DOI: 10.3390/genes10010069] [Citation(s) in RCA: 13] [Impact Index Per Article: 2.2] [Received: 10/24/2018] [Revised: 12/03/2018] [Accepted: 12/07/2018] [Indexed: 12/26/2022] Abstract: Jatropha curcas is an important perennial, drought tolerant plant that has been identified as a potential biodiesel crop. We report here the hybrid de novo genome assembly of J. curcas generated using Illumina and PacBio sequencing technologies, and identification of quantitative loci for Jatropha Mosaic Virus (JMV) resistance. In this study, we generated scaffolds of 265.7 Mbp in length, which correspond to 84.8% of the gene space, using Benchmarking Universal Single-Copy Orthologs (BUSCO) analysis. Additionally, 96.4% of predicted protein-coding genes were captured in RNA sequencing data, which reconfirms the accuracy of the assembled genome. The genome was utilized to identify 12,103 dinucleotide simple sequence repeat (SSR) markers, which were exploited in genetic diversity analysis to identify genetically distinct lines. A total of 207 polymorphic SSR markers were employed to construct a genetic linkage map for JMV resistance, using an interspecific F₂ mapping population involving susceptible J. curcas and resistant Jatropha integerrima as parents. Quantitative trait locus (QTL) analysis led to the identification of three minor QTLs for JMV resistance, and the same has been validated in an alternate F₂ mapping population. These validated QTLs were utilized in marker-assisted breeding for JMV resistance. Comparative genomics of oil-producing genes across selected oil producing species revealed 27 conserved genes and 2986 orthologous protein clusters in Jatropha. This reference genome assembly gives an insight into the understanding of the complex genetic structure of Jatropha, and serves as source for the development of agronomically improved virus-resistant and oil-producing lines. Key Words: Euphorbiaceae; biofuel; geminivirus; genome sequencing; hybrid sequencing; linkage map; oil seed; simple sequence repeats; transcriptomics.	Journal Article	6	13
19	Alam CM, Iqbal A, Sharma A, Schulman AH, Ali S. Microsatellite Diversity, Complexity, and Host Range of Mycobacteriophage Genomes of the Siphoviridae Family. Front Genet 2019;10:207. [PMID: 30923537 PMCID: PMC6426759 DOI: 10.3389/fgene.2019.00207] [Citation(s) in RCA: 12] [Impact Index Per Article: 2.0] [Received: 05/04/2018] [Accepted: 02/26/2019] [Indexed: 01/21/2023] Abstract: The incidence, distribution, and variation of simple sequence repeats (SSRs) in viruses is instrumental in understanding the functional and evolutionary aspects of repeat sequences. Full-length genome sequences retrieved from NCBI were used for extraction and analysis of repeat sequences using IMEx software. We have also developed two MATLAB-based tools for extraction of gene locations from GenBank in tabular format and simulation of this data with SSR incidence data. The present study, encompassing 147 Mycobacteriophage genomes, revealed 25,284 SSRs and 1,127 compound SSRs (cSSRs) through IMEx. Mono- to hexa-nucleotide motifs were present. The SSR count per genome ranged from 78 (M100) to 342 (M58) while cSSR incidence ranged from 1 (M138) to 17 (M28, M73). Though cSSRs were present in all the genomes, their frequency and SSR to cSSR conversion percentage varied from 1.08 (M138 with 93 SSRs) to 8.33 (M116 with 96 SSRs). In terms of localization, the SSRs were predominantly localized to coding regions (∼78%). Interestingly, genomes of around 50 kb contained a similar number of SSRs/cSSRs to that in a 110 kb genome, suggesting functional relevance for SSRs, which was substantiated by variation in motif constitution between species with different host range. The three species with broad host range (M97, M100, M116) have around 90% of their mono-nucleotide repeat motifs composed of G or C and only M16 has both A and T mononucleotide motifs. Around 20% of the di-nucleotide repeat motifs in the genomes exhibiting a broad host range were CT/TC, which were either absent or represented to a much lesser extent in the other genomes. Key Words: Mycobacteriophage; dMAX; host range; imperfect microsatellite extractor; simple sequence repeats.	Journal Article	6	12
20	Gao C, Wang Q, Ying Z, Ge Y, Cheng R. Molecular structure and phylogenetic analysis of complete chloroplast genomes of medicinal species Paeonia lactiflora from Zhejiang Province. MITOCHONDRIAL DNA PART B-RESOURCES 2020;5:1077-1078. [PMID: 33366882 PMCID: PMC7748428 DOI: 10.1080/23802359.2020.1721372] [Citation(s) in RCA: 11] [Impact Index Per Article: 2.2] [Indexed: 11/18/2022] Abstract: Paeonia lactiflora is a geo-authentic and superior medicinal plant in Zhejiang province. Here, we report the complete chloroplast genome sequence of P. lactiflora. The total genome size of P. lactiflora is 153,405 bp, including a small single-copy (SSC) region of 16,969 bp and a large single-copy (LSC) region of 84,340 bp, separated by a pair of inverted repeats (IRs) of 26,048 bp each. The overall annotated gene number is 109, comprising 76 protein-coding genes, 29 tRNAs and 4 rRNAs. The overall GC content of P. lactiflora is 38.43%, with the highest GC content of 42.99% in the IR regions. A total of 52 simple sequence repeats are identified in the cp genome of P. lactiflora. Phylogenetic analysis indicated a sister relationship between P. lactiflora and P. veitchii, and supported a unique evolutionary status of Family Paeoniaceae. This work provides a valuable genetic resource to develop robust markers and investigate the population genetic diversity of this famous medicinal species. Key Words: Paeonia lactiflora; chloroplast genome; phylogenetic analysis; simple sequence repeats.	Journal Article	5	11
21	Transcriptomic Analysis of the Endangered Neritid Species Clithon retropictus: De Novo Assembly, Functional Annotation, and Marker Discovery. Genes (Basel) 2016;7:genes7070035. [PMID: 27455329 PMCID: PMC4962005 DOI: 10.3390/genes7070035] [Citation(s) in RCA: 11] [Impact Index Per Article: 1.2] [Received: 04/28/2016] [Revised: 07/05/2016] [Accepted: 07/06/2016] [Indexed: 11/25/2022] Abstract: An aquatic gastropod belonging to the family Neritidae, Clithon retropictus is listed as an endangered class II species in South Korea. The lack of information on its genomic background limits the ability to obtain functional data resources and inhibits informed conservation planning for this species. In the present study, the transcriptomic sequencing and de novo assembly of C. retropictus generated a total of 241,696,750 high-quality reads. These assembled to 282,838 unigenes with mean and N50 lengths of 736.9 and 1201 base pairs, respectively. Of these, 125,616 unigenes were subjected to annotation analysis with known proteins in Protostome DB, COG, GO, and KEGG protein databases (BLASTX; E ≤ 0.00001) and with known nucleotides in the Unigene database (BLASTN; E ≤ 0.00001). The GO analysis indicated that cellular process, cell, and catalytic activity are the predominant GO terms in the biological process, cellular component, and molecular function categories, respectively. In addition, 2093 unigenes were distributed in 107 different KEGG pathways. Furthermore, 49,280 simple sequence repeats were identified in the unigenes (>1 kilobase sequences). This is the first report on the identification of transcriptomic and microsatellite resources for C. retropictus, which opens up the possibility of exploring traits related to the adaptation and acclimatization of this species. Key Words: Clithon retropictus; de novo assembly; functional annotation; simple sequence repeats; transcriptome.	Journal Article	9	11
22	Cantarella C, D'Agostino N. PSR: polymorphic SSR retrieval. BMC Res Notes 2015;8:525. [PMID: 26428628 PMCID: PMC4591729 DOI: 10.1186/s13104-015-1474-4] [Citation(s) in RCA: 10] [Impact Index Per Article: 1.0] [Received: 03/06/2015] [Accepted: 09/21/2015] [Indexed: 11/10/2022] Abstract: BACKGROUND: With the advent of high-throughput sequencing technologies, large-scale identification of microsatellites became affordable and was especially directed to non-model species. By contrast, few efforts have been published toward the automatic identification of polymorphic microsatellites by exploiting sequence redundancy. Few tools for genotyping microsatellite repeats have been implemented so far that are able to manage huge amounts of sequence data and handle the SAM/BAM file format. Most of them have been developed for and tested on human or model organisms with high quality reference genomes. RESULTS: In this note we describe polymorphic SSR retrieval (PSR), a read counter and simple sequence repeat (SSR) length polymorphism detection tool. It is written in Perl and was developed to identify length polymorphisms in perfect microsatellites exploiting next generation sequencing (NGS) data. PSR has been developed bearing in mind plant non-model species for which de novo transcriptome assembly is generally the first sequence resource available to be used for SSR-mining. PSR is divided into two modules: the read-counting module (PSR_read_retrieval) identifies all the reads that cover the full length of perfect microsatellites; the comparative module (PSR_poly_finder) detects both heterozygous and homozygous alleles at each microsatellite locus across all genotypes under investigation. Two threshold values to call a length polymorphism and reduce the number of false positives can be defined by the user: the minimum number of reads overlapping the repetitive stretch and the minimum read depth. The first parameter determines if the microsatellite-containing sequence must be processed or not, while the second one is decisive for the identification of minor alleles. PSR was tested on two different case studies. The first study aims at the identification of polymorphic SSRs in a set of de novo assembled transcripts defined by RNA-sequencing of two different plant genotypes. The second research activity aims to investigate sequence variations within a collection of newly sequenced chloroplast genomes. In both cases PSR results are in agreement with those obtained by capillary gel separation. CONCLUSION: PSR has been specifically developed from the need to automate the gene-based and genome-wide identification of polymorphic microsatellites from NGS data. It overcomes the limits related to the existing and time-consuming efforts based on tools developed in the pre-NGS era. Key Words: simple sequence repeats; length polymorphism; polymorphic microsatellites; NGS; SAM/BAM format. MeSH Headings: Base Sequence; DNA, Chloroplast/genetics; Electrophoresis, Capillary; Genetic Loci; Genotype; Microsatellite Repeats/genetics; Molecular Sequence Data; Polymorphism, Genetic; RNA, Messenger/genetics; RNA, Messenger/metabolism; Reference Standards; Software; Workflow. Grants: 27307C2007 NIEHS NIH HHS.	research-article	10	10
23	Ranade SS, Ganea LS, Razzak AM, García Gil MR. Fungal Infection Increases the Rate of Somatic Mutation in Scots Pine (Pinus sylvestris L.). J Hered 2015;106:386-94. [PMID: 25890976 DOI: 10.1093/jhered/esv017] [Citation(s) in RCA: 10] [Impact Index Per Article: 1.0] [Received: 11/25/2014] [Accepted: 03/16/2015] [Indexed: 11/13/2022] Abstract: Somatic mutations are transmitted during mitosis in developing somatic tissue. Somatic cells bearing the mutations can develop into reproductive (germ) cells and the somatic mutations are then passed on to the next generation of plants. Somatic mutations are a source of variation essential to evolve new defense strategies and adapt to the environment. Stem rust disease in Scots pine has a negative effect on wood quality, and thus adversely affects the economy. It is caused by the 2 most destructive fungal species in Scandinavia: Peridermium pini and Cronartium flaccidum. We studied nuclear genome stability in Scots pine under biotic stress (fungus-infected, 22 trees) compared to a control population (plantation, 20 trees). Stability was assessed as accumulation of new somatic mutations in 10 microsatellite loci selected for genotyping. Microsatellites are widely used as molecular markers in population genetics studies of plants, and are particularly used for detection of somatic mutations as their rate of mutation is of a much higher magnitude when compared with other DNA markers. We report double the rate of somatic mutation per locus in the fungus-infected trees (4.8×10⁻³ mutations per locus), as compared to the controls (2.0×10⁻³ mutations per locus) when individual samples were analyzed at 10 different microsatellite markers. Pearson's chi-squared test indicated a significant effect of the fungal infection which increased the number of mutations in the fungus-infected trees (χ² = 12.9883, df = 1, P = 0.0003134). Key Words: Scots pine; Somatic mutation; abiotic stress; microsatellite; simple sequence repeats.	Research Support, Non-U.S. Gov't	10	10
24	De Novo Transcriptome Sequencing Analysis of cDNA Library and Large-Scale Unigene Assembly in Japanese Red Pine (Pinus densiflora). Int J Mol Sci 2015;16:29047-59. [PMID: 26690126 PMCID: PMC4691086 DOI: 10.3390/ijms161226139] [Citation(s) in RCA: 10] [Impact Index Per Article: 1.0] [Received: 09/29/2015] [Revised: 11/13/2015] [Accepted: 11/26/2015] [Indexed: 01/28/2023] Abstract: Japanese red pine (Pinus densiflora) is extensively cultivated in Japan, Korea, China, and Russia and is harvested for timber, pulpwood, garden, and paper markets. However, genetic information and molecular markers were very scarce for this species. In this study, over 51 million clean sequencing reads from P. densiflora mRNA were produced using Illumina paired-end sequencing technology. It yielded 83,913 unigenes with a mean length of 751 bp, of which 54,530 (64.98%) unigenes showed similarity to sequences in the NCBI database. Among these, the best matches in the NCBI Nr database were Picea sitchensis (41.60%), Amborella trichopoda (9.83%), and Pinus taeda (4.15%). A total of 1953 putative microsatellites were identified in 1784 unigenes using MISA (MicroSAtellite) software, of which the tri-nucleotide repeats were most abundant (50.18%), and 629 EST-SSR (expressed sequence tag-simple sequence repeat) primer pairs were successfully designed. Among 20 EST-SSR primer pairs randomly chosen, 17 markers yielded amplification products of the expected size in P. densiflora. Our results will provide a valuable resource for gene-function analysis, germplasm identification, molecular marker-assisted breeding and resistance-related gene(s) mapping for P. densiflora. Key Words: EST-SSR; marker discovery; Pinus densiflora; simple sequence repeats; transcriptome sequencing; unigene assembly.	Journal Article	10	10
25	Zelewska MA, Pulijala M, Spencer-Smith R, Mahmood HTNA, Norman B, Churchward CP, Calder A, Snyder LAS. Phase variable DNA repeats in Neisseria gonorrhoeae influence transcription, translation, and protein sequence variation. Microb Genom 2016;2:e000078. [PMID: 28348872 PMCID: PMC5320596 DOI: 10.1099/mgen.0.000078] [Citation(s) in RCA: 10] [Impact Index Per Article: 1.1] [Received: 05/23/2016] [Accepted: 07/08/2016] [Indexed: 12/23/2022] Abstract: There are many types of repeated DNA sequences in the genomes of the species of the genus Neisseria, from homopolymeric tracts to tandem repeats of hundreds of bases. Some of these have roles in the phase-variable expression of genes. When a repeat mediates phase variation, reversible switching between tract lengths occurs, which in the species of the genus Neisseria most often causes the gene to switch between on and off states through frame shifting of the open reading frame. Changes in repeat tract lengths may also influence the strength of transcription from a promoter. For phenotypes that can be readily observed, such as expression of the surface-expressed Opa proteins or pili, verification that repeats are mediating phase variation is relatively straightforward. For other genes, particularly those where the function has not been identified, gathering evidence of repeat tract changes can be more difficult. Here we present analysis of the repetitive sequences that could mediate phase variation in the Neisseria gonorrhoeae strain NCCP11945 genome sequence and compare these results with other gonococcal genome sequences. Evidence is presented for an updated phase-variable gene repertoire in this species, including a class of phase variation that causes amino acid changes at the C-terminus of the protein, not previously described in N. gonorrhoeae. Key Words: C-terminal variation; gonococcus; homopolymeric tract; phase variation; simple sequence repeats.	Research Support, Non-U.S. Gov't	9	10
