1
Fandade V, Singh P, Singh D, Sharma H, Thakur G, Saini S, Kumar P, Mantri S, Bishnoi OP, Roy J. Genome-wide identification of microsatellites for mapping, genetic diversity and cross-transferability in wheat (Triticum spp). Gene 2024; 896:148039. [PMID: 38036075 DOI: 10.1016/j.gene.2023.148039] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 09/09/2023] [Revised: 11/20/2023] [Accepted: 11/27/2023] [Indexed: 12/02/2023]
Abstract
Wheat (Triticum aestivum L.) is a crucial global staple crop and is consistently being improved to enhance yield, disease resistance, and quality traits. However, the development of molecular markers is challenging because of its hexaploid genome. Molecular marker systems such as simple sequence repeats (SSRs) and single nucleotide polymorphisms (SNPs) are helpful for breeding, but SNPs have limitations due to their development cost and their conversion to breeder markers. The study proposed an in-silico approach that uses low-cost transcriptome sequencing of two parental lines, 'TAC 75' and 'WH 1105', to identify polymorphic SSRs for mapping in a recombinant inbred line (RIL) population. This study introduces a new approach to bridge wheat genetics intricacies and next-generation sequencing potential. It presents a comprehensive genome-wide SSR distribution using the IWGSC CS RefSeq v2.1 genome assembly and identifies 189 polymorphic loci through an in-silico strategy. Of these, 54.76% showed polymorphism between parents, surpassing the traditionally low polymorphism success rate. A RIL population screening validated these markers, demonstrating the fitness of the identified markers through chi-square tests. The designed SSRs were also validated for genetic diversity analysis in a subset of 37 Indian wheat genotypes and for cross-transferability in wild and related wheat species. In the diversity analysis, a subset of 38 markers revealed 95 alleles (2.5 alleles/locus), indicating substantial genetic variation. Population structure analysis unveiled three distinct groups, supported by phylogenetic and PCoA analyses. Further, the polymorphic SSRs were analyzed for SSR-gene association using gene ontology analysis. By utilizing the developing seed transcriptome data of the parental lines, the study has enhanced the precision of polymorphic SSR identification and facilitated mapping in the RIL population.
This study pioneers the use of transcriptome sequencing and genetic mapping to overcome challenges posed by the intricate wheat genome. This approach offers a cost-effective, less labour-intensive alternative to conventional methods, providing a platform for advancing wheat breeding research.
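The abstract reports that marker fitness in the RIL population was checked with chi-square tests but does not state the expected ratio. As a minimal sketch, assuming the usual 1:1 expectation for the two parental alleles at a codominant marker in a RIL population, the goodness-of-fit test could look like the following. The function name and the line counts are hypothetical and are not taken from the paper.

```python
import math

def chi_square_1to1(count_a, count_b):
    """Chi-square goodness-of-fit of two genotype classes against a 1:1 ratio.

    count_a, count_b: number of lines carrying each parental allele
    (hypothetical inputs). Returns (statistic, p_value) for 1 degree of freedom.
    """
    total = count_a + count_b
    if total == 0:
        raise ValueError("no scored lines")
    expected = total / 2.0
    stat = ((count_a - expected) ** 2 + (count_b - expected) ** 2) / expected
    # For 1 degree of freedom the upper-tail probability is erfc(sqrt(x / 2)).
    p_value = math.erfc(math.sqrt(stat / 2.0))
    return stat, p_value

# Hypothetical marker scored on 100 lines: 60 carry one allele, 40 the other.
stat, p = chi_square_1to1(60, 40)
print(f"chi2 = {stat:.2f}, p = {p:.4f}")
```

Under this assumption, a marker whose p-value falls below the chosen significance level would be flagged as showing segregation distortion.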
Affiliation(s)
- Vikas Fandade
  - Agri-Food Biotechnology Division, National Agri-Food Biotechnology Institute, Mohali-140306, Punjab, India; Regional Centre for Biotechnology, NCR Biotech Science Cluster, 3rd Milestone, Faridabad-Gurgaon Expressway, Faridabad, Haryana 121001, India.
- Pradeep Singh
  - Agri-Food Biotechnology Division, National Agri-Food Biotechnology Institute, Mohali-140306, Punjab, India.
- Dalwinder Singh
  - Department of Anatomy and Cell Biology, University of Western Ontario, London, Canada.
- Himanshu Sharma
  - Agri-Food Biotechnology Division, National Agri-Food Biotechnology Institute, Mohali-140306, Punjab, India.
- Garima Thakur
  - Protection for Plant Varieties and Farmers Rights Authority, New Delhi, India.
- Shivangi Saini
  - Agri-Food Biotechnology Division, National Agri-Food Biotechnology Institute, Mohali-140306, Punjab, India.
- Prashant Kumar
  - Agri-Food Biotechnology Division, National Agri-Food Biotechnology Institute, Mohali-140306, Punjab, India; Regional Centre for Biotechnology, NCR Biotech Science Cluster, 3rd Milestone, Faridabad-Gurgaon Expressway, Faridabad, Haryana 121001, India.
- Shrikant Mantri
  - Agri-Food Biotechnology Division, National Agri-Food Biotechnology Institute, Mohali-140306, Punjab, India.
- O P Bishnoi
  - Department of Genetics and Plant Breeding, Chaudhary Charan Singh Haryana Agricultural University, Hisar-125004, India.
- Joy Roy
  - Agri-Food Biotechnology Division, National Agri-Food Biotechnology Institute, Mohali-140306, Punjab, India.
2
Duhan N, Kaur S, Kaundal R. ranchSATdb: A Genome-Wide Simple Sequence Repeat (SSR) Markers Database of Livestock Species for Mutant Germplasm Characterization and Improving Farm Animal Health. Genes (Basel) 2023; 14:1481. [PMID: 37510385 PMCID: PMC10378808 DOI: 10.3390/genes14071481] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 05/26/2023] [Revised: 07/13/2023] [Accepted: 07/14/2023] [Indexed: 07/30/2023]
Abstract
Microsatellites, also known as simple sequence repeats (SSRs), are polymorphic loci that play an important role in genome research, animal breeding, and disease control. Ranch animals are important components of the agricultural landscape. The ranch animal SSR database, ranchSATdb, is a web resource which contains 15,520,263 putative SSR markers. This database provides a comprehensive tool for performing end-to-end marker selection, from SSR prediction to generating marker primers and assessing their cross-species feasibility, visualizing the resulting markers, and finding similarities between genomic repeat sequences, all in one place without the need to switch between other resources. The user-friendly online interface allows users to browse SSRs by genomic coordinates, repeat motif sequence, chromosome, motif type, motif frequency, and functional annotation. Users may enter their preferred flanking region around the repeat to retrieve the nucleotide sequence, investigate the genes in which SSRs are present or the genes between which SSRs lie, generate custom primers, and run in silico validation of primers using electronic PCR. For customized sequences, an SSR prediction pipeline called miSATminer is also provided. New species will be added to the database on a regular basis. To improve animal health via genomic selection, we hope that ranchSATdb will be a useful tool for mapping quantitative trait loci (QTLs) and for marker-assisted selection. The web resource is freely accessible at https://bioinfo.usu.edu/ranchSATdb/.
Affiliation(s)
- Naveen Duhan
  - Department of Plants, Soils, and Climate/Center for Integrated BioSystems, College of Agriculture and Applied Sciences, Utah State University, Logan, UT 84322, USA
  - Bioinformatics Facility, Center for Integrated BioSystems, Utah State University, Logan, UT 84322, USA
- Simardeep Kaur
  - Department of Plants, Soils, and Climate/Center for Integrated BioSystems, College of Agriculture and Applied Sciences, Utah State University, Logan, UT 84322, USA
  - Division of Biochemistry, ICAR-Indian Agricultural Research Institute, New Delhi 110012, India
  - ICAR-Research Complex for North Eastern Hill Region (NEH), Umiam 793103, India
- Rakesh Kaundal
  - Department of Plants, Soils, and Climate/Center for Integrated BioSystems, College of Agriculture and Applied Sciences, Utah State University, Logan, UT 84322, USA
  - Bioinformatics Facility, Center for Integrated BioSystems, Utah State University, Logan, UT 84322, USA
3
Tan C, Zhang H, Chen H, Guan M, Zhu Z, Cao X, Ge X, Zhu B, Chen D. First Report on Development of Genome-Wide Microsatellite Markers for Stock (Matthiola incana L.). Plants (Basel) 2023; 12:748. [PMID: 36840095 PMCID: PMC9965543 DOI: 10.3390/plants12040748] [Citation(s) in RCA: 3] [Impact Index Per Article: 3.0] [Received: 12/16/2022] [Revised: 01/30/2023] [Accepted: 02/02/2023] [Indexed: 06/18/2023]
Abstract
Stock (Matthiola incana (L.) R. Br.) is a famous annual ornamental plant with important ornamental and economic value. The lack of DNA molecular markers has limited genetic analysis, genome evolution, and marker-assisted selective breeding studies of M. incana. Therefore, more DNA markers are needed to support the further elucidation of the biology and genetics of M. incana. In this study, a high-quality genome of M. incana was initially assembled and a set of effective SSR primers was developed at the whole-genome level using genome data. A total of 45,612 loci of SSRs were identified; the di-nucleotide motifs were the most abundant (77.35%). In total, 43,540 primer pairs were designed, of which 300 were randomly selected for PCR validation, and as the success rate for amplification. In addition, 22 polymorphic SSR markers were used to analyze the genetic diversity of 40 stock varieties. Clustering analysis showed that all varieties could be divided into two clusters with a genetic distance of 0.68, which were highly consistent with their flower shape (potted or cut type). Moreover, we have verified that these SSR markers are effective and transferable within the Brassicaceae family. In this study, potential SSR molecular markers were successfully developed for 40 M. incana varieties using whole genome analysis, providing an important genetic tool for theoretical and applied research on M. incana.
Affiliation(s)
- Chen Tan
  - College of Life Sciences, Gannan Normal University, Ganzhou 341000, China
- Haimei Zhang
  - College of Life Sciences, Gannan Normal University, Ganzhou 341000, China
- Haidong Chen
  - College of Life Sciences, Gannan Normal University, Ganzhou 341000, China
- Miaotian Guan
  - College of Life Sciences, Gannan Normal University, Ganzhou 341000, China
- Zhenzhi Zhu
  - College of Life Sciences, Gannan Normal University, Ganzhou 341000, China
- Xueying Cao
  - College of Life Sciences, Gannan Normal University, Ganzhou 341000, China
- Xianhong Ge
  - College of Plant Science and Technology, Huazhong Agricultural University, Wuhan 431700, China
- Bo Zhu
  - College of Life Sciences, Gannan Normal University, Ganzhou 341000, China
- Daozong Chen
  - College of Life Sciences, Gannan Normal University, Ganzhou 341000, China
4
Bharti PK, Husai A. Mining and analysis of microsatellites in human coronavirus genomes using the in-house built Java pipeline. Genomics Inform 2022; 20:e35. [PMID: 36239112 PMCID: PMC9576472 DOI: 10.5808/gi.20033] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 05/29/2020] [Accepted: 09/14/2022] [Indexed: 11/20/2022]
Abstract
Microsatellites or simple sequence repeats are motifs of 1 to 6 nucleotides in length present in both coding and non-coding regions of DNA. They are widely distributed across the genomes of prokaryotes, eukaryotes, bacteria, and viruses and are used as molecular markers in studies of DNA variation, gene regulation, genetic diversity, and evolution. However, in vitro microsatellite identification proves to be time-consuming and expensive. Therefore, the present research focused on using an in-house built Java pipeline to identify and analyse perfect and compound microsatellites, design primers, and compute related statistics in seven complete coronavirus genome sequences, including the genome of coronavirus disease 2019, where the host is Homo sapiens. Based on the search criteria, the total number of perfect simple sequence repeats (SSRs) among the seven genomic sequences was found to range from 76 to 118 and the number of compound SSRs from 1 to 10, reflecting the low conversion of perfect simple sequence repeats to compound repeats. Furthermore, the incidence of SSRs was positively but not significantly correlated with genome size (R2 = 0.45, p > 0.05), with SSR relative abundance (R2 = 0.18, p > 0.05), and with relative density (R2 = 0.23, p > 0.05). Dinucleotide repeats were the most abundant in the coding region of the genome, followed by tri-, mono-, and tetranucleotide repeats. This comparative study would help us understand evolutionary relationships, genetic diversity, and hypervariability in minimal time and at minimal cost.
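The abstract defines perfect microsatellites as tandem repeats of 1 to 6 nucleotide motifs but does not give the search criteria or the internals of the Java pipeline. The sketch below is therefore not the authors' pipeline: it is a minimal regular-expression scan in Python, and the minimum repeat counts in `MIN_REPEATS`, the function name, and the test sequence are all assumptions chosen for illustration.

```python
import re

# Hypothetical minimum copy numbers per motif length (1 to 6 nt).
MIN_REPEATS = {1: 6, 2: 3, 3: 3, 4: 3, 5: 3, 6: 3}

def find_perfect_ssrs(seq, min_repeats=MIN_REPEATS):
    """Return (start, end, motif, copies) for perfect tandem repeats.

    Coordinates are 0-based and half-open. Motifs that are themselves
    repeats of a shorter unit (e.g. "AA" or "ATAT") are skipped so the
    same tract is not reported under several motif lengths.
    """
    seq = seq.upper()
    hits = []
    for motif_len, min_rep in sorted(min_repeats.items()):
        # One captured motif followed by at least (min_rep - 1) more copies.
        pattern = re.compile(r"([ACGT]{%d})\1{%d,}" % (motif_len, min_rep - 1))
        for m in pattern.finditer(seq):
            motif = m.group(1)
            reducible = any(
                motif == motif[:k] * (motif_len // k)
                for k in range(1, motif_len)
                if motif_len % k == 0
            )
            if reducible:
                continue
            copies = (m.end() - m.start()) // motif_len
            hits.append((m.start(), m.end(), motif, copies))
    hits.sort()
    return hits

# Toy sequence with an (AT)5 tract and an (A)7 tract.
example = "GGC" + "AT" * 5 + "CCGT" + "A" * 7 + "TG"
for hit in find_perfect_ssrs(example):
    print(hit)
```

Real tools differ in how they treat overlapping tracts, partial trailing copies, and ambiguous bases, so counts from this sketch should not be expected to match the figures reported in the abstract.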
Affiliation(s)
- P K Bharti
  - School of Computer Science, Shri Venkateshwara University, Gajraula 244236, Uttar Pradesh, India
- Akhtar Husai
  - Department of Computer Science & IT, MJP Rohilkhand University, Bareilly 243006, Uttar Pradesh, India
5
Sahu BP, Majee P, Singh RR, Sahoo N, Nayak D. Genome-wide identification and characterization of microsatellite markers within the Avipoxviruses. 3 Biotech 2022; 12:113. [PMID: 35497507 PMCID: PMC9008116 DOI: 10.1007/s13205-022-03169-4] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 01/03/2022] [Accepted: 03/19/2022] [Indexed: 11/01/2022]
Abstract
Microsatellite markers or Simple Sequence Repeats (SSRs) are gaining importance for molecular characterization of viruses as well as estimation of evolution patterns due to their highly polymorphic nature. The Avipoxvirus is the causative agent of pox-like lesions in more than 300 birds and one of the major diseases contributing to the extinction of endangered avian species. Therefore, we conducted a genome-wide analysis to decipher the type and distribution pattern of SSRs in 14 complete genomes derived from the Avipoxvirus genus. The in-silico screening revealed 917-2632 SSRs per strain. For compound SSRs (cSSRs), the value ranged from 44 to 255 per genome. Our analysis indicates that dinucleotide repeats (52.74%) are the most abundant, followed by mononucleotide (34.79%), trinucleotide (11.57%), tetranucleotide (0.64%), pentanucleotide (0.12%) and hexanucleotide (0.15%) repeats. The Relative Abundance (RA) and Relative Density (RD) of microsatellites ranged within 5.5-8.12 and 33.08-53.58 bp/kb, respectively. The RA and RD values of compound microsatellites ranged between 0.25-0.82 and 4.64-15.12 bp/kb, respectively. The analysis of cSSR motif composition revealed that most of the compound microsatellites were made up of two microsatellites, with some unique duplicated motif patterns such as (TA)-x-(TA) and (TCA)-x-(TCA), and self-complementary motifs such as (TA)-x-(AT). Finally, we validated forty sets of compound microsatellite markers through an in-vitro approach using clinical specimens and mapped the sequencing products against the database through comparative genomics approaches.
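The abstract reports Relative Abundance (RA) and Relative Density (RD) without defining them. A common convention in this literature, assumed here rather than confirmed by the source, is RA = number of SSRs per kb of sequence analysed and RD = total SSR length in bp per kb of sequence analysed. A minimal sketch under that assumption follows; the function name and the toy coordinates are hypothetical.

```python
def relative_abundance_and_density(ssr_spans, genome_length_bp):
    """Compute (RA, RD) from SSR coordinates.

    ssr_spans: iterable of (start, end) pairs, 0-based half-open.
    RA is SSR count per kb; RD is SSR bases per kb (assumed definitions).
    """
    if genome_length_bp <= 0:
        raise ValueError("genome length must be positive")
    spans = list(ssr_spans)
    kb = genome_length_bp / 1000.0
    ra = len(spans) / kb
    rd = sum(end - start for start, end in spans) / kb
    return ra, rd

# Toy input: three SSRs (10, 12 and 18 bp long) in a 2,000 bp sequence.
ra, rd = relative_abundance_and_density([(0, 10), (50, 62), (100, 118)], 2000)
print(ra, rd)
```

The same function applies to compound microsatellites if each compound tract is passed as a single span.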
6
Nishimura K, Motoki K, Yamazaki A, Takisawa R, Yasui Y, Kawai T, Ushijima K, Nakano R, Nakazaki T. MIG-seq is an effective method for high-throughput genotyping in wheat (Triticum spp.). DNA Res 2022; 29:6567359. [PMID: 35412600 PMCID: PMC9035812 DOI: 10.1093/dnares/dsac011] [Citation(s) in RCA: 2] [Impact Index Per Article: 1.0] [Received: 12/27/2021] [Accepted: 04/08/2022] [Indexed: 12/04/2022]
Abstract
MIG-seq (Multiplexed inter-simple sequence repeats genotyping by sequencing) has been developed as a low-cost genotyping technology, although the number of polymorphisms obtained is assumed to be minimal, resulting in limited application of this technique to analyses of agricultural plants. We applied MIG-seq to 12 plant species that include various crops and investigated the relationship between genome size and the number of bases that can be stably sequenced. The genome size and the number of loci that can be sequenced by MIG-seq are positively correlated. This is due to the linkage between genome size and the number of simple sequence repeats (SSRs) throughout the genome. The applicability of MIG-seq to population structure analysis, linkage mapping, and quantitative trait loci (QTL) analysis in wheat, which has a relatively large genome, was further evaluated. The results of population structure analysis for tetraploid wheat showed differences among collection sites and subspecies, which agreed with previous findings. Additionally, in wheat biparental mapping populations, over 3,000 SNPs/indels with low deficiency were detected using MIG-seq, and the QTL analysis was able to detect recognized flowering-related genes. These results revealed the effectiveness of MIG-seq for genomic analysis of agricultural plants with large genomes, including wheat.
Affiliation(s)
- Kazusa Nishimura
  - Graduate School of Agriculture, Kyoto University, Kizugawa City, Kyoto Prefecture 619-0218, Japan
- Ko Motoki
  - Graduate School of Agriculture, Kyoto University, Kizugawa City, Kyoto Prefecture 619-0218, Japan
- Akira Yamazaki
  - Graduate School of Agriculture, Kyoto University, Kizugawa City, Kyoto Prefecture 619-0218, Japan
  - Faculty of Agriculture, Kindai University, Nara City, Nara Prefecture 631-8505, Japan
- Rihito Takisawa
  - Faculty of Agriculture, Ryukoku University, Otsu City, Shiga Prefecture 520-2194, Japan
- Yasuo Yasui
  - Graduate School of Agriculture, Kyoto University, Kizugawa City, Kyoto Prefecture 619-0218, Japan
- Takashi Kawai
  - Graduate School of Environmental and Life Science, Okayama University, Okayama City, Okayama Prefecture 700-8530, Japan
- Koichiro Ushijima
  - Graduate School of Environmental and Life Science, Okayama University, Okayama City, Okayama Prefecture 700-8530, Japan
- Ryohei Nakano
  - Graduate School of Agriculture, Kyoto University, Kizugawa City, Kyoto Prefecture 619-0218, Japan
- Tetsuya Nakazaki
  - Graduate School of Agriculture, Kyoto University, Kizugawa City, Kyoto Prefecture 619-0218, Japan
7
Duhan N, Kaundal R. LegumeSSRdb: A Comprehensive Microsatellite Marker Database of Legumes for Germplasm Characterization and Crop Improvement. Int J Mol Sci 2021; 22:11350. [PMID: 34768782 PMCID: PMC8583334 DOI: 10.3390/ijms222111350] [Citation(s) in RCA: 3] [Impact Index Per Article: 1.0] [Received: 08/31/2021] [Revised: 10/04/2021] [Accepted: 10/14/2021] [Indexed: 11/16/2022]
Abstract
Microsatellites, or simple sequence repeats (SSRs), are polymorphic loci that play a major role as molecular markers for genome analysis and plant breeding. The legume SSR database is a webserver which contains SSRs from the genomes of 13 legume species. A total of 3,706,276 SSRs are present in the database, 698,509 of which are genic SSRs, and 3,007,772 are non-genic. This webserver is an integrated tool for end-to-end marker selection, from generating SSRs to designing and validating primers, visualizing the results, and blasting the genomic sequences in one place without juggling several resources. The user-friendly web interface allows users to browse SSRs based on genomic region, chromosome, motif type, repeat motif sequence, and motif frequency, and advanced searches allow users to search by chromosome location range and SSR length. Users can specify their desired flanking region around a repeat and obtain the sequence, explore the genes in which the SSRs are present or the genes between which the SSRs are bound, design custom primers, and perform in silico validation using PCR. An SSR prediction pipeline is implemented where users can submit their own genomic sequence to generate SSRs. This webserver will be updated frequently with more species over time. We believe that legumeSSRdb will be a useful resource for marker-assisted selection and for mapping quantitative trait loci (QTLs) to practice genomic selection and improve crop health. The database can be freely accessed at http://bioinfo.usu.edu/legumeSSRdb/.
Affiliation(s)
- Naveen Duhan
  - Department of Plants, Soils and Climate, CAAS, Utah State University, Logan, UT 84321, USA
  - Center for Integrated BioSystems (CIB), CAAS, Utah State University, Logan, UT 84321, USA
- Rakesh Kaundal
  - Department of Plants, Soils and Climate, CAAS, Utah State University, Logan, UT 84321, USA
  - Center for Integrated BioSystems (CIB), CAAS, Utah State University, Logan, UT 84321, USA
  - Department of Computer Science, CoS, Utah State University, Logan, UT 84321, USA
  - Correspondence: Tel.: +1-435-797-4117; Fax: +1-435-797-2766
8
Li D, Shi R, Zhang H, Huang H, Pan S, Liang Y, Peng S, Tan Z. The only conserved microsatellite in coding regions of ebolavirus is the editing site. Biochem Biophys Res Commun 2021; 565:79-84. [PMID: 34098315 DOI: 10.1016/j.bbrc.2021.05.093] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 05/13/2021] [Revised: 05/26/2021] [Accepted: 05/27/2021] [Indexed: 11/29/2022]
Abstract
Many viral genomes, including those of Ebolavirus, have been found to contain microsatellites (SSRs), and the majority of Ebolavirus microsatellite sites are distributed in protein-coding regions of the genomes. Here, we identified a total of 212 reserved microsatellite sites in the protein-coding regions of 213 genomic sequences from five Ebolavirus species. Among these reserved microsatellite sites, only one is significantly conserved across the sampled Ebolavirus genomic sequences, and this microsatellite is located at the RNA editing site of the GP gene, indicating selective relevance to RNA editing there. This analysis may help to further explore the biological significance of various microsatellites in Ebolavirus genomes.
Affiliation(s)
- Douyue Li
  - Bioinformatics Center, College of Biology, Hunan University, Changsha, 410082, China
- Ruixue Shi
  - Bioinformatics Center, College of Biology, Hunan University, Changsha, 410082, China
- Hongxi Zhang
  - Bioinformatics Center, College of Biology, Hunan University, Changsha, 410082, China
- Hanrou Huang
  - Bioinformatics Center, College of Biology, Hunan University, Changsha, 410082, China
- Saichao Pan
  - Bioinformatics Center, College of Biology, Hunan University, Changsha, 410082, China
- Yuling Liang
  - Bioinformatics Center, College of Biology, Hunan University, Changsha, 410082, China
- Shan Peng
  - Bioinformatics Center, College of Biology, Hunan University, Changsha, 410082, China
- Zhongyang Tan
  - Bioinformatics Center, College of Biology, Hunan University, Changsha, 410082, China.
9
Jain A, Sharma PC. Occurrence and distribution of compound microsatellites in the genomes of three economically important virus families. Infect Genet Evol 2021; 92:104853. [PMID: 33839312 DOI: 10.1016/j.meegid.2021.104853] [Citation(s) in RCA: 2] [Impact Index Per Article: 0.7] [Received: 01/17/2021] [Revised: 04/01/2021] [Accepted: 04/04/2021] [Indexed: 11/15/2022]
Abstract
Microsatellites are nonrandom hypervariable iterations of one to six nucleotides, existing across the coding as well as noncoding regions of virtually all known genomes, arising primarily due to polymerase slippage and unequal crossing over during replication events. Two or more perfect microsatellites located in close proximity form compound microsatellites. We studied the distribution of compound microsatellites in 118 ssDNA virus genomes belonging to three economically important virus families, namely Anelloviridae, Circoviridae, and Parvoviridae, known to predominantly infect livestock and humans. Among these virus families, 0-58.49% of perfect microsatellites were involved in the formation of compound microsatellites, the majority being located in the coding regions. No clear relationship existed between the genomic features (genome size and GC%) and compound microsatellite characteristics (relative abundance and relative density). The majority of the compound microsatellites resulted from di-SSR couples. A strong positive relationship was observed between the maximum distance value and length of compound microsatellite, percentage of microsatellites involved in the compound microsatellite formation, and relative microsatellite density. The degree of variability among microsatellite characteristics studied was largely a species-specific phenomenon. A major proportion of compound microsatellites was represented by similar motif combinations. The findings of the present study will help in better understanding of the structural, functional, and evolutionary role of compound microsatellites prevailing in the smaller genomes.
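The abstract describes compound microsatellites as two or more perfect microsatellites in close proximity and mentions a maximum distance value, without giving the threshold used. The sketch below illustrates one way to group perfect SSRs into compound tracts; the threshold `dmax`, the function names, and the toy coordinates are assumptions, not values from the paper.

```python
def group_compound_ssrs(ssr_spans, dmax):
    """Group perfect SSRs whose gap to the previous SSR is <= dmax bp.

    ssr_spans: iterable of non-overlapping (start, end) pairs,
    0-based half-open. Returns only groups with two or more members.
    """
    groups, current = [], []
    for span in sorted(ssr_spans):
        if current and span[0] - current[-1][1] <= dmax:
            current.append(span)
        else:
            if current:
                groups.append(current)
            current = [span]
    if current:
        groups.append(current)
    return [g for g in groups if len(g) >= 2]

def percent_in_compound(ssr_spans, dmax):
    """Percentage of perfect SSRs that are part of a compound tract."""
    spans = list(ssr_spans)
    if not spans:
        return 0.0
    involved = sum(len(g) for g in group_compound_ssrs(spans, dmax))
    return 100.0 * involved / len(spans)

# Toy coordinates: six perfect SSRs, with dmax set to a hypothetical 10 bp.
spans = [(0, 10), (15, 27), (100, 112), (118, 130), (132, 140), (300, 310)]
print(group_compound_ssrs(spans, dmax=10))
print(round(percent_in_compound(spans, dmax=10), 2))
```

Raising `dmax` can only merge more SSRs into compound tracts, which is consistent with the positive relationship the abstract reports between the maximum distance value and the share of microsatellites involved.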
Affiliation(s)
- Ankit Jain
  - Merck Life Science Pvt. Ltd, Sector-17, Chandigarh, India
- Prakash C Sharma
  - University School of Biotechnology, Guru Gobind Singh Indraprastha University, Dwarka Sector-16 C, New Delhi 11078, India.
10
Song X, Yang T, Zhang X, Yuan Y, Yan X, Wei Y, Zhang J, Zhou C. Comparison of the Microsatellite Distribution Patterns in the Genomes of Euarchontoglires at the Taxonomic Level. Front Genet 2021; 12:622724. [PMID: 33719337 PMCID: PMC7953163 DOI: 10.3389/fgene.2021.622724] [Citation(s) in RCA: 6] [Impact Index Per Article: 2.0] [Received: 10/29/2020] [Accepted: 02/05/2021] [Indexed: 02/05/2023]
Abstract
Microsatellite or simple sequence repeat (SSR) instability within genes can induce genetic variation. The SSR signatures remain largely unknown in different clades within Euarchontoglires, one of the most successful mammalian radiations. Here, we conducted a genome-wide characterization of microsatellite distribution patterns at different taxonomic levels in 153 Euarchontoglires genomes. Our results showed that the abundance and density of the SSRs were significantly positively correlated with primate genome size, but no significant relationship with the genome size of rodents was found. Furthermore, a higher level of complexity for perfect SSR (P-SSR) attributes was observed in rodents than in primates. The most frequent type of P-SSR was the mononucleotide P-SSR in the genomes of primates, tree shrews, and colugos, while mononucleotide or dinucleotide motif types were dominant in the genomes of rodents and lagomorphs. Furthermore, (A)n was the most abundant motif in primate genomes, but (A)n, (AC)n, or (AG)n was the most abundant motif in rodent genomes which even varied within the same genus. The GC content and the repeat copy numbers of P-SSRs varied in different species when compared at different taxonomic levels, reflecting underlying differences in SSR mutation processes. Notably, the CDSs containing P-SSRs were categorized by functions and pathways using Gene Ontology and Kyoto Encyclopedia of Genes and Genomes annotations, highlighting their roles in transcription regulation. Generally, this work will aid future studies of the functional roles of the taxonomic features of microsatellites during the evolution of mammals in Euarchontoglires.
Affiliation(s)
- Xuhao Song
  - Key Laboratory of Southwest China Wildlife Resources Conservation (Ministry of Education), China West Normal University, Nanchong, China
  - Institute of Ecology, China West Normal University, Nanchong, China
- Tingbang Yang
  - Key Laboratory of Southwest China Wildlife Resources Conservation (Ministry of Education), China West Normal University, Nanchong, China
  - Institute of Ecology, China West Normal University, Nanchong, China
- Xinyi Zhang
  - Key Laboratory of Southwest China Wildlife Resources Conservation (Ministry of Education), China West Normal University, Nanchong, China
- Ying Yuan
  - Key Laboratory of Southwest China Wildlife Resources Conservation (Ministry of Education), China West Normal University, Nanchong, China
- Xianghui Yan
  - Key Laboratory of Southwest China Wildlife Resources Conservation (Ministry of Education), China West Normal University, Nanchong, China
- Yi Wei
  - Key Laboratory of Southwest China Wildlife Resources Conservation (Ministry of Education), China West Normal University, Nanchong, China
  - Institute of Ecology, China West Normal University, Nanchong, China
- Jun Zhang
  - Key Laboratory of Southwest China Wildlife Resources Conservation (Ministry of Education), China West Normal University, Nanchong, China
  - Institute of Ecology, China West Normal University, Nanchong, China
- Caiquan Zhou
  - Key Laboratory of Southwest China Wildlife Resources Conservation (Ministry of Education), China West Normal University, Nanchong, China
  - Institute of Ecology, China West Normal University, Nanchong, China
11
Genome-wide in silico identification and characterization of Simple Sequence Repeats in diverse completed SARS-CoV-2 genomes. Gene Rep 2021; 23:101020. [PMID: 33521382 PMCID: PMC7835092 DOI: 10.1016/j.genrep.2021.101020] [Citation(s) in RCA: 5] [Impact Index Per Article: 1.7] [Received: 10/03/2020] [Revised: 12/06/2020] [Accepted: 12/29/2020] [Indexed: 12/19/2022]
Abstract
Simple sequence repeats (SSRs), or microsatellites, are short repeat sequences that have been extensively studied in eukaryotic (plants) and prokaryotic (bacteria) organisms. Compared to other organisms, the presence and incidence of SSRs in viral genomes are less studied. With the emergence of novel infectious viruses over the past few decades, it is imperative to study the genetic diversity in such viruses to predict their evolutionary and functional changes over time. Following the emergence of SARS-CoV-2, we have assembled 121 complete genomes reported from 31 countries across the six continents for the identification and characterization of SSRs. Using two independent SSR identification tools, we have found remarkable consistency in the diversity of microsatellite patterns (38–42 per genome) in the 121 analyzed SARS-CoV-2 genomes, indicating their important role in genome stability. Among the identified motifs, trinucleotide and hexanucleotide repeats were found to be the most abundant, followed by mono- and di-nucleotide repeats. There were no tetra- or penta-nucleotide repeats in the analyzed SARS-CoV-2 genomes. The discovery of microsatellites in SARS-CoV-2 genomes may become useful for population genetics, evolutionary analysis, strain identification and the study of genetic variation.
12
Li D, Pan S, Zhang H, Fu Y, Peng Z, Zhang L, Peng S, Xu F, Huang H, Shi R, Zheng H, Peng Y, Tan Z. A comprehensive microsatellite landscape of human Y-DNA at kilobase resolution. BMC Genomics 2021; 22:76. [PMID: 33482734 PMCID: PMC7821415 DOI: 10.1186/s12864-021-07389-5] [Citation(s) in RCA: 2] [Impact Index Per Article: 0.7] [Received: 07/13/2020] [Accepted: 01/13/2021] [Indexed: 12/12/2022]
Abstract
Background: Though interest in human simple sequence repeats (SSRs) is increasing, little is known about the exact distributional features of the numerous SSRs in human Y-DNA at the chromosomal level. Herein, a total of 540 maps were established that clearly display the SSR landscape in every bin of 1 kilobase pair (Kbp) along the sequenced part of the human reference Y-DNA (NC_000024.10), using a differential method that we developed to improve on the existing method for revealing SSR distributional characteristics in large genomic sequences. Results: The maps show that SSRs accumulate significantly, forming density peaks in at least 2040 bins of 1 Kbp, which involve different coding, noncoding and intergenic regions of the Y-DNA. Ten especially high density peaks have been reported to be associated with biological significance, suggesting that the other hundreds of especially high density peaks might also be biologically significant and worth further analysis. In contrast, the maps also show that SSRs are extremely sparse in at least 207 bins of 1 Kbp, including many noncoding and intergenic regions of the Y-DNA, which is inconsistent with the widely accepted view that SSRs are mostly rich in these regions; these sparse distributions are possibly due to strong regional selection. Additionally, many regions of the Y-DNA harbor SSR clusters with the same or similar motifs. Conclusions: These 540 maps may provide important information on position-related SSR distributional features along the human reference Y-DNA for a better understanding of the genome structure of the Y-DNA. This study may contribute to further exploring the biological significance and distribution law of the huge numbers of SSRs in human Y-DNA.
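The abstract describes SSR maps at a resolution of 1 Kbp bins but does not spell out the authors' differential method. The sketch below is not that method: it only illustrates the simpler underlying idea of counting how many SSR bases fall in each fixed-size bin. The function name, the default bin size argument, and the toy coordinates are assumptions.

```python
def ssr_bases_per_bin(ssr_spans, seq_length, bin_size=1000):
    """Count SSR-covered bases in each consecutive bin of the sequence.

    ssr_spans: iterable of (start, end) pairs, 0-based half-open, each
    lying within [0, seq_length). An SSR crossing a bin boundary
    contributes to both bins in proportion to its overlap.
    """
    n_bins = (seq_length + bin_size - 1) // bin_size
    bins = [0] * n_bins
    for start, end in ssr_spans:
        pos = start
        while pos < end:
            b = pos // bin_size
            # End of the part of this SSR that lies inside bin b.
            chunk_end = min((b + 1) * bin_size, end)
            bins[b] += chunk_end - pos
            pos = chunk_end
    return bins

# Toy input: three SSRs on a 3,000 bp sequence; the second spans a boundary.
print(ssr_bases_per_bin([(10, 30), (990, 1010), (2500, 2512)], 3000))
```

Bins with unusually high or low values in such a profile would correspond to the density peaks and sparse regions the abstract refers to, although the thresholds for calling them are not given in the source.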
Affiliation(s)
- Douyue Li
- Bioinformatics Center, College of Biology, Hunan University, Changsha, 410082, China
| | - Saichao Pan
- Bioinformatics Center, College of Biology, Hunan University, Changsha, 410082, China
| | - Hongxi Zhang
- Bioinformatics Center, College of Biology, Hunan University, Changsha, 410082, China
| | - Yongzhuo Fu
- Bioinformatics Center, College of Biology, Hunan University, Changsha, 410082, China
| | - Zhuli Peng
- Bioinformatics Center, College of Biology, Hunan University, Changsha, 410082, China
| | - Liang Zhang
- Bioinformatics Center, College of Biology, Hunan University, Changsha, 410082, China
| | - Shan Peng
- Bioinformatics Center, College of Biology, Hunan University, Changsha, 410082, China
| | - Fei Xu
- Department of Mathematics, Wilfrid Laurier University, Waterloo, Ontario, N2L 3C5, Canada
| | - Hanrou Huang
- Bioinformatics Center, College of Biology, Hunan University, Changsha, 410082, China
| | - Ruixue Shi
- Bioinformatics Center, College of Biology, Hunan University, Changsha, 410082, China
| | - Heping Zheng
- Bioinformatics Center, College of Biology, Hunan University, Changsha, 410082, China
| | - Yousong Peng
- Bioinformatics Center, College of Biology, Hunan University, Changsha, 410082, China
| | - Zhongyang Tan
- Bioinformatics Center, College of Biology, Hunan University, Changsha, 410082, China.
|
13
|
Wallace MA, Coffman KA, Gilbert C, Ravindran S, Albery GF, Abbott J, Argyridou E, Bellosta P, Betancourt AJ, Colinet H, Eric K, Glaser-Schmitt A, Grath S, Jelic M, Kankare M, Kozeretska I, Loeschcke V, Montchamp-Moreau C, Ometto L, Onder BS, Orengo DJ, Parsch J, Pascual M, Patenkovic A, Puerma E, Ritchie MG, Rota-Stabelli O, Schou MF, Serga SV, Stamenkovic-Radak M, Tanaskovic M, Veselinovic MS, Vieira J, Vieira CP, Kapun M, Flatt T, González J, Staubach F, Obbard DJ. The discovery, distribution, and diversity of DNA viruses associated with Drosophila melanogaster in Europe. Virus Evol 2021; 7:veab031. [PMID: 34408913 PMCID: PMC8363768 DOI: 10.1093/ve/veab031] [Citation(s) in RCA: 15] [Impact Index Per Article: 5.0] [Indexed: 12/14/2022] Open
Abstract
Drosophila melanogaster is an important model for antiviral immunity in arthropods, but very few DNA viruses have been described from the family Drosophilidae. This deficiency limits our opportunity to use natural host-pathogen combinations in experimental studies, and may bias our understanding of the Drosophila virome. Here, we report fourteen DNA viruses detected in a metagenomic analysis of 6668 pool-sequenced Drosophila, sampled from forty-seven European locations between 2014 and 2016. These include three new nudiviruses, a new and divergent entomopoxvirus, a virus related to Leptopilina boulardi filamentous virus, and a virus related to Musca domestica salivary gland hypertrophy virus. We also find an endogenous genomic copy of galbut virus, a double-stranded RNA partitivirus, segregating at very low frequency. Remarkably, we find that Drosophila Vesanto virus, a small DNA virus previously described as a bidnavirus, may be composed of up to twelve segments and thus represent a new lineage of segmented DNA viruses. Two of the DNA viruses, Drosophila Kallithea nudivirus and Drosophila Vesanto virus, are relatively common, found in 2 per cent or more of wild flies. The others are rare, with many likely to be represented by a single infected fly. We find that virus prevalence in Europe reflects the prevalence seen in publicly available datasets, with Drosophila Kallithea nudivirus and Drosophila Vesanto virus the only ones commonly detectable in public data from wild-caught flies and large population cages, and the other viruses being rare or absent. These analyses suggest that DNA viruses are at lower prevalence than RNA viruses in D. melanogaster, and may be less likely to persist in laboratory cultures. Our findings go some way to redressing an earlier bias toward RNA virus studies in Drosophila, and lay the foundation needed to harness the power of Drosophila as a model system for the study of DNA viruses.
Affiliation(s)
- Megan A Wallace
- The European Drosophila Population Genomics Consortium (DrosEU)
- Ashworth Laboratories, Institute of Evolutionary Biology, University of Edinburgh, Charlotte Auerbach Road, Edinburgh EH9 3FL, UK
| | - Kelsey A Coffman
- Department of Entomology, University of Georgia, Athens, GA, USA
| | - Clément Gilbert
- The European Drosophila Population Genomics Consortium (DrosEU)
- Université Paris-Saclay, CNRS, IRD, UMR Évolution, Génomes, Comportement et Écologie, 91198 Gif-sur-Yvette, France
| | - Sanjana Ravindran
- Ashworth Laboratories, Institute of Evolutionary Biology, University of Edinburgh, Charlotte Auerbach Road, Edinburgh EH9 3FL, UK
| | - Gregory F Albery
- Department of Biology, Georgetown University, Washington, DC, USA
| | - Jessica Abbott
- The European Drosophila Population Genomics Consortium (DrosEU)
- Department of Biology, Section for Evolutionary Ecology, Lund University, Sölvegatan 37, Lund 223 62, Sweden
| | - Eliza Argyridou
- The European Drosophila Population Genomics Consortium (DrosEU)
- Division of Evolutionary Biology, Faculty of Biology, Ludwig-Maximilians-Universität München, Planegg, Germany
| | - Paola Bellosta
- The European Drosophila Population Genomics Consortium (DrosEU)
- Department of Cellular, Computational and Integrative Biology, CIBIO University of Trento, Via Sommarive 9, Trento 38123, Italy
- Department of Medicine & Endocrinology, NYU Langone Medical Center, 550 First Avenue, New York, NY 10016, USA
| | - Andrea J Betancourt
- The European Drosophila Population Genomics Consortium (DrosEU)
- Institute of Integrative Biology, University of Liverpool, Liverpool L69 7ZB, UK
| | - Hervé Colinet
- The European Drosophila Population Genomics Consortium (DrosEU)
- UMR CNRS 6553 ECOBIO, Université de Rennes1, Rennes, France
| | - Katarina Eric
- The European Drosophila Population Genomics Consortium (DrosEU)
- Institute for Biological Research “Sinisa Stankovic”, National Institute of Republic of Serbia, University of Belgrade, Bulevar despota Stefana 142, Belgrade, Serbia
| | - Amanda Glaser-Schmitt
- The European Drosophila Population Genomics Consortium (DrosEU)
- Division of Evolutionary Biology, Faculty of Biology, Ludwig-Maximilians-Universität München, Planegg, Germany
| | - Sonja Grath
- The European Drosophila Population Genomics Consortium (DrosEU)
- Division of Evolutionary Biology, Faculty of Biology, Ludwig-Maximilians-Universität München, Planegg, Germany
| | - Mihailo Jelic
- The European Drosophila Population Genomics Consortium (DrosEU)
- Faculty of Biology, University of Belgrade, Studentski trg 16, Belgrade, Serbia
| | - Maaria Kankare
- The European Drosophila Population Genomics Consortium (DrosEU)
- Department of Biological and Environmental Science, University of Jyväskylä, Finland
| | - Iryna Kozeretska
- The European Drosophila Population Genomics Consortium (DrosEU)
- National Antarctic Scientific Center of Ukraine, 16 Shevchenko Avenue, Kyiv, 01601, Ukraine
| | - Volker Loeschcke
- The European Drosophila Population Genomics Consortium (DrosEU)
- Department of Biology, Genetics, Ecology and Evolution, Aarhus University, Ny Munkegade 116, Aarhus C DK-8000, Denmark
| | - Catherine Montchamp-Moreau
- The European Drosophila Population Genomics Consortium (DrosEU)
- Université Paris-Saclay, CNRS, IRD, UMR Évolution, Génomes, Comportement et Écologie, 91198 Gif-sur-Yvette, France
| | - Lino Ometto
- The European Drosophila Population Genomics Consortium (DrosEU)
- Department of Biology and Biotechnology, University of Pavia, Pavia 27100, Italy
| | - Banu Sebnem Onder
- The European Drosophila Population Genomics Consortium (DrosEU)
- Department of Biology, Faculty of Science, Hacettepe University, Ankara, Turkey
| | - Dorcas J Orengo
- The European Drosophila Population Genomics Consortium (DrosEU)
- Departament de Genètica, Microbiologia i Estadística and Institut de Recerca de la Biodiversitat (IRBio), Universitat de Barcelona, Barcelona, Spain
| | - John Parsch
- The European Drosophila Population Genomics Consortium (DrosEU)
- Division of Evolutionary Biology, Faculty of Biology, Ludwig-Maximilians-Universität München, Planegg, Germany
| | - Marta Pascual
- The European Drosophila Population Genomics Consortium (DrosEU)
- Departament de Genètica, Microbiologia i Estadística and Institut de Recerca de la Biodiversitat (IRBio), Universitat de Barcelona, Barcelona, Spain
| | - Aleksandra Patenkovic
- The European Drosophila Population Genomics Consortium (DrosEU)
- Institute for Biological Research “Sinisa Stankovic”, National Institute of Republic of Serbia, University of Belgrade, Bulevar despota Stefana 142, Belgrade, Serbia
| | - Eva Puerma
- The European Drosophila Population Genomics Consortium (DrosEU)
- Departament de Genètica, Microbiologia i Estadística and Institut de Recerca de la Biodiversitat (IRBio), Universitat de Barcelona, Barcelona, Spain
| | - Michael G Ritchie
- The European Drosophila Population Genomics Consortium (DrosEU)
- Centre for Biological Diversity, St Andrews University, St Andrews HY15 4SS, UK
| | - Omar Rota-Stabelli
- The European Drosophila Population Genomics Consortium (DrosEU)
- Research and Innovation Center, Fondazione E. Mach, San Michele all’Adige (TN) 38010, Italy
- Centre Agriculture Food Environment, University of Trento, San Michele all’Adige (TN) 38010, Italy
| | - Mads Fristrup Schou
- The European Drosophila Population Genomics Consortium (DrosEU)
- Department of Biology, Section for Evolutionary Ecology, Lund University, Sölvegatan 37, Lund 223 62, Sweden
- Department of Bioscience, Aarhus University, Aarhus, Denmark
| | - Svitlana V Serga
- The European Drosophila Population Genomics Consortium (DrosEU)
- National Antarctic Scientific Center of Ukraine, 16 Shevchenko Avenue, Kyiv, 01601, Ukraine
- Taras Shevchenko National University of Kyiv, 64 Volodymyrska str, Kyiv 01601, Ukraine
| | - Marina Stamenkovic-Radak
- The European Drosophila Population Genomics Consortium (DrosEU)
- Faculty of Biology, University of Belgrade, Studentski trg 16, Belgrade, Serbia
| | - Marija Tanaskovic
- The European Drosophila Population Genomics Consortium (DrosEU)
- Institute for Biological Research “Sinisa Stankovic”, National Institute of Republic of Serbia, University of Belgrade, Bulevar despota Stefana 142, Belgrade, Serbia
| | - Marija Savic Veselinovic
- The European Drosophila Population Genomics Consortium (DrosEU)
- Faculty of Biology, University of Belgrade, Studentski trg 16, Belgrade, Serbia
| | - Jorge Vieira
- The European Drosophila Population Genomics Consortium (DrosEU)
- Instituto de Biologia Molecular e Celular (IBMC), University of Porto, Porto, Portugal
- Instituto de Investigação e Inovação em Saúde, University of Porto, i3S, Porto, Portugal
| | - Cristina P Vieira
- The European Drosophila Population Genomics Consortium (DrosEU)
- Instituto de Biologia Molecular e Celular (IBMC), University of Porto, Porto, Portugal
- Instituto de Investigação e Inovação em Saúde, University of Porto, i3S, Porto, Portugal
| | - Martin Kapun
- The European Drosophila Population Genomics Consortium (DrosEU)
- Department of Evolutionary Biology and Environmental Studies, University of Zürich, Zürich, Switzerland
- Division of Cell & Developmental Biology, Medical University of Vienna, Vienna, Austria
| | - Thomas Flatt
- The European Drosophila Population Genomics Consortium (DrosEU)
- Department of Biology, University of Fribourg, Fribourg CH-1700, Switzerland
| | - Josefa González
- The European Drosophila Population Genomics Consortium (DrosEU)
- Institute of Evolutionary Biology (CSIC-UPF), Barcelona, Spain
| | - Fabian Staubach
- The European Drosophila Population Genomics Consortium (DrosEU)
- Department of Evolution and Ecology, University of Freiburg, Freiburg 79104, Germany
| | - Darren J Obbard
- The European Drosophila Population Genomics Consortium (DrosEU)
- Ashworth Laboratories, Institute of Evolutionary Biology, University of Edinburgh, Charlotte Auerbach Road, Edinburgh EH9 3FL, UK
|
14
|
citSATdb: Genome-Wide Simple Sequence Repeat (SSR) Marker Database of Citrus Species for Germplasm Characterization and Crop Improvement. Genes (Basel) 2020; 11:genes11121486. [PMID: 33321957 PMCID: PMC7764524 DOI: 10.3390/genes11121486] [Citation(s) in RCA: 2] [Impact Index Per Article: 0.5] [Received: 11/13/2020] [Revised: 11/23/2020] [Accepted: 11/26/2020] [Indexed: 11/17/2022] Open
Abstract
Microsatellites or simple sequence repeats (SSRs) are popular co-dominant markers that play an important role in crop improvement. To enhance genomic resources in general horticulture, we identified SSRs in the genomes of eight citrus species and characterized their frequency and distribution in different genomic regions. Citrus is the world's most widely cultivated fruit crop. We have implemented a microsatellite database, citSATdb, having the highest number (~1,296,500) of putative SSR markers from the genus Citrus, represented by eight species. The database is based on a three-tier approach using MySQL, PHP, and Apache. The markers can be searched using multiple search parameters including chromosome/scaffold number(s), motif types, repeat nucleotides (1-6), SSR length, patterns of repeat motifs and chromosome/scaffold location. The cross-species transferability of selected markers can be checked using e-PCR. Further, the markers can be visualized using the JBrowse feature. These markers can be used for distinctness, uniformity, and stability (DUS) tests for variety identification, marker-assisted selection (MAS), gene discovery, QTL mapping, and germplasm characterization. citSATdb represents a comprehensive source of markers for developing/implementing new approaches for molecular breeding, required to enhance Citrus productivity. The potential polymorphic SSR markers identified by cross-species transferability could be used for genetic diversity and population distinction in other species.
|
15
|
Satyam R, Jha NK, Kar R, Jha SK, Sharma A, Kumar D, Nand P, Ruokolainen J, Kesari KK, Kamal MA. Deciphering the SSR incidences across viral members of Coronaviridae family. Chem Biol Interact 2020; 331:109226. [PMID: 32971122 PMCID: PMC7505113 DOI: 10.1016/j.cbi.2020.109226] [Citation(s) in RCA: 3] [Impact Index Per Article: 0.8] [Received: 05/10/2020] [Revised: 08/05/2020] [Accepted: 08/11/2020] [Indexed: 12/19/2022]
Abstract
The presence of simple sequence repeats (SSRs), in both genic and intergenic regions, has been widely studied in eukaryotes, prokaryotes, and viruses. In the current study, we undertook a survey to analyze the frequency and distribution of microsatellites or SSRs in multiple genomes of Coronaviridae members. We identified 919 SSRs of length ≥12 bp across 55 reference genomes, the majority of which (838 SSRs) were found in genic regions. The in-silico analysis further identified a preferential abundance of hexameric SSRs over any other size-based motif class. Our analysis shows that the genome size and GC content of the genome had a weak influence on SSR frequency and density. However, we find a positive correlation of SSR GC content with genomic GC content. We also report relatively low abundances of all theoretically possible 501 repeat motif classes in all the genomes of Coronaviridae. The majority of SSRs were AT-rich. Overall, we see an underrepresentation of SSRs across the genomes of Coronaviridae. In addition, our integrative study highlights the presence of SSRs in the ORF1ab (nsp3, nsp4, nsp5A_3CLpro and nsp5B_3CLpro, nsp6, nsp10, nsp12, nsp13, & nsp15 domains), S, ORF3a, ORF7a, N & 3' UTR regions of SARS-CoV-2, which harbour multiple mutations (with the 3' UTR and ORF1ab SSRs serving as major mutational hotspots). This indicates that the genic SSRs are under selection pressure against mutations that might alter the reading frame and, at the same time, are responsible for rapid protein evolution. Our preliminary results indicate the significance of the limited repertoire of SSRs in the genomes of Coronaviridae.
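The figure of 501 theoretically possible repeat motif classes can be reproduced with a short sketch. It assumes the usual convention, which the abstract does not spell out, that motifs of 1 to 6 bp belong to the same class when they are cyclic shifts or reverse complements of one another, and that motifs which are themselves repeats of a shorter unit (for example AA or ACAC) are excluded. All function names are illustrative.

```python
from itertools import product

COMPLEMENT = str.maketrans("ACGT", "TGCA")


def is_primitive(motif):
    """True if the motif is not a repetition of a shorter unit (e.g. 'ACAC')."""
    n = len(motif)
    return not any(
        n % k == 0 and motif == motif[:k] * (n // k) for k in range(1, n)
    )


def canonical(motif):
    """Smallest string among all cyclic shifts of the motif and of its reverse complement."""
    reverse_complement = motif.translate(COMPLEMENT)[::-1]
    shifts = [
        m[i:] + m[:i] for m in (motif, reverse_complement) for i in range(len(m))
    ]
    return min(shifts)


def count_motif_classes(max_len=6):
    """Number of distinct motif classes for each motif length from 1 to max_len."""
    counts = {}
    for n in range(1, max_len + 1):
        motifs = ("".join(p) for p in product("ACGT", repeat=n))
        counts[n] = len({canonical(m) for m in motifs if is_primitive(m)})
    return counts


counts = count_motif_classes()
print(counts)                # {1: 2, 2: 4, 3: 10, 4: 33, 5: 102, 6: 350}
print(sum(counts.values()))  # 501
```

Under this convention, (AC)n, (CA)n, (GT)n and (TG)n all describe the same double-stranded repeat, which is why only four dinucleotide classes remain.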
Affiliation(s)
- Rohit Satyam
- Department of Biotechnology, Noida Institute of Engineering and Technology (NIET), Greater Noida, India
| | - Niraj Kumar Jha
- Department of Biotechnology, School of Engineering & Technology (SET), Sharda University, Greater Noida, 201310, India.
| | - Rohan Kar
- Indian Institute of Management Ahmedabad (IIMA), Gujarat, 380015, India
| | - Saurabh Kumar Jha
- Department of Biotechnology, School of Engineering & Technology (SET), Sharda University, Greater Noida, 201310, India
| | - Ankur Sharma
- Department of Life Science, School of Basic Science & Research, Sharda University, Greater Noida, 201310, India
| | - Dhruv Kumar
- Amity Institute of Molecular Medicine and Stem Cell Research (AIMMSCR), Amity University Uttar Pradesh, Noida, 201313, India
| | - Parma Nand
- Department of Biotechnology, School of Engineering & Technology (SET), Sharda University, Greater Noida, 201310, India
| | - Mohammad Amjad Kamal
- King Fahd Medical Research Center, King Abdulaziz University, P. O. Box 80216, Jeddah, 21589, Saudi Arabia; Enzymoics, Novel Global Community Educational Foundation, 7 Peterlee Place, Hebersham, NSW, 2770, Australia
|
16
|
Pereira F. Evolutionary dynamics of the SARS-CoV-2 ORF8 accessory gene. Infect Genet Evol 2020; 85:104525. [PMID: 32890763 PMCID: PMC7467077 DOI: 10.1016/j.meegid.2020.104525] [Citation(s) in RCA: 72] [Impact Index Per Article: 18.0] [Received: 05/30/2020] [Revised: 08/28/2020] [Accepted: 08/29/2020] [Indexed: 01/08/2023]
Abstract
The new SARS-CoV-2 poses a significant threat to human health, but many aspects of its basic biology remain unknown. Its genome encodes accessory genes that differ significantly among coronaviruses and contribute to the pathogenicity of the virus. Among accessory genes, open reading frame 8 (ORF8) stands out by being highly variable and showing structural changes suspected to be related to the ability of the virus to spread. However, the function of ORF8 remains to be elucidated, making it less studied than other SARS-CoV-2 genes. Here I show that ORF8 is poorly conserved among related coronaviruses. The ORF8 phylogeny built using 11,113 SARS-CoV-2 sequences revealed traces of a typical expanding population with a small number of highly frequent lineages. Interestingly, I detected several nonsense mutations and three main deletions in the ORF8 gene that either remove or significantly change the ORF8 protein. These findings suggest that SARS-CoV-2 can persist without a functional ORF8 protein. Deletion breakpoints were located in predicted hairpins, suggesting a possible involvement of these elements in the rearrangement process. Although the function of ORF8 remains to be elucidated, its structural plasticity and high diversity suggest an important role in SARS-CoV-2 pathogenicity.
Affiliation(s)
- Filipe Pereira
- Departamento de Ciências da Vida, Universidade de Coimbra, Calçada Martim de Freitas, 3000-456 Coimbra, Portugal; IDENTIFICA, Science and Technology Park of the University of Porto - UPTEC, Rua Alfredo Allen, N.°455/461, 4200-135 Porto, Portugal.
|
17
|
Comparative analysis, distribution, and characterization of microsatellites in Orf virus genome. Sci Rep 2020; 10:13852. [PMID: 32807836 PMCID: PMC7431841 DOI: 10.1038/s41598-020-70634-6] [Citation(s) in RCA: 2] [Impact Index Per Article: 0.5] [Received: 03/01/2019] [Accepted: 07/01/2020] [Indexed: 11/09/2022] Open
Abstract
Genome-wide in-silico identification of microsatellites, or simple sequence repeats (SSRs), in the Orf virus (ORFV), the causative agent of contagious ecthyma, was carried out to investigate their type, distribution and potential role in genome evolution. We investigated eleven ORFV strains and found 1,036-1,181 microsatellites per strain. Further screening revealed 83-107 compound SSRs (cSSRs) per genome. Our analysis indicates that dinucleotide repeats (76.9%) are the most abundant, followed by trinucleotide (17.7%), mononucleotide (4.9%), tetranucleotide (0.4%) and hexanucleotide (0.2%) repeats. The Relative Abundance (RA) and Relative Density (RD) of these SSRs ranged from 7.6 to 8.4 and from 53.0 to 59.5 bp/kb, respectively, while for cSSRs the RA and RD ranged from 0.6 to 0.8 and from 12.1 to 17.0 bp/kb, respectively. In regression analysis, all parameters (SSR incidence, RA, and RD) correlated significantly with GC content. With genome size, however, only SSR incidence correlated significantly; the other parameters did not. Nearly all cSSRs were composed of two microsatellites and showed no bias toward a particular motif. Motif duplication patterns, such as (C)-x-(C), (TG)-x-(TG), (AT)-x-(AT) and (TC)-x-(TC), and self-complementary motifs, such as (GC)-x-(CG), (TC)-x-(AG), (GT)-x-(CA) and (TC)-x-(AG), were observed in the cSSRs. Finally, in-silico polymorphism was assessed, followed by in-vitro validation using PCR analysis and sequencing. The thirteen polymorphic SSR markers developed in this study were further characterized by mapping against the sequences present in the database. The results of the present study indicate that these SSRs could be a useful tool for identification, analysis of genetic diversity, and understanding the evolutionary status of the virus.
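As a rough illustration of the two summary statistics quoted in this abstract, the sketch below computes them under the definitions commonly used in this literature, which are assumptions here because the abstract does not define them: Relative Abundance as the number of SSRs per kb of sequence, and Relative Density as the number of SSR base pairs per kb. The regex SSR definition, the `min_len` threshold and the function name are hypothetical and do not reproduce the tool or thresholds used in the study.

```python
import re

# Assumed SSR definition for this sketch: a 1-6 bp motif repeated >= 3 times.
SSR_PATTERN = re.compile(r"([ACGT]{1,6}?)\1{2,}")


def relative_abundance_and_density(seq, min_len=6):
    """Return (RA, RD): SSRs per kb and SSR base pairs per kb of sequence."""
    if not seq:
        raise ValueError("sequence must not be empty")
    tract_lengths = [
        m.end() - m.start()
        for m in SSR_PATTERN.finditer(seq.upper())
        if m.end() - m.start() >= min_len
    ]
    ra = 1000 * len(tract_lengths) / len(seq)  # SSR count per 1000 bp
    rd = 1000 * sum(tract_lengths) / len(seq)  # SSR bases per 1000 bp
    return ra, rd


# Toy 22 bp sequence containing a single 10 bp (AT)5 tract.
toy = "GCTAGC" + "AT" * 5 + "CGTCAG"
ra, rd = relative_abundance_and_density(toy)
print(f"RA = {ra:.1f} SSRs/kb, RD = {rd:.1f} bp/kb")
```

Because both values are normalised by sequence length, they can be compared across genomes of different sizes, which is how the per-strain ranges in the abstract are meant to be read.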
|
18
|
Zhang H, Li D, Zhao X, Pan S, Wu X, Peng S, Huang H, Shi R, Tan Z. Relatively semi-conservative replication and a folded slippage model for short tandem repeats. BMC Genomics 2020; 21:563. [PMID: 32807079 PMCID: PMC7430839 DOI: 10.1186/s12864-020-06949-5] [Citation(s) in RCA: 3] [Impact Index Per Article: 0.8] [Received: 04/13/2020] [Accepted: 07/27/2020] [Indexed: 12/11/2022] Open
Abstract
Background: The ubiquitous presence of short tandem repeats (STRs) in virtually all genomes implies their functional relevance, yet a widely accepted definition of an STR has not been established. Previous studies have mainly focused on relatively longer STRs, while shorter repeats were generally excluded. Herein, we adopted more generous criteria to define shorter repeats, which led to the definition of a much larger number of STRs that lack prior analysis. Using this definition, we analyzed the short repeats in 55 randomly selected segments in 55 randomly selected genomic sequences from a fairly wide range of species covering animals, plants, fungi, protozoa, bacteria, archaea and viruses. Results: Our analysis reveals a high percentage of short repeats in all 55 randomly selected segments, indicating that the universal presence of high-content short repeats could be a common characteristic of genomes across all biological kingdoms. Therefore, it is reasonable to assume a mechanism for continuous production of repeats that can make the replicating process relatively semi-conservative. We propose a folded replication slippage model that considers the geometric space of nucleotides and hydrogen bond stability to explain the mechanism more explicitly, improving on the existing straight-line slippage model. The folded slippage model can explain the expansion and contraction of mono- to hexanucleotide repeats with proper folding angles. Analysis of external forces in the folding template strands also suggests that expansion is more common than contraction in short tandem repeats. Conclusion: The folded replication slippage model provides a reasonable explanation for the continuous occurrence of simple sequence repeats in genomes. This model also contributes to the explanation of STR-to-genome evolution and is an alternative model that complements semi-conservative replication.
Affiliation(s)
- Hongxi Zhang
- Bioinformatics Center, College of Biology, Hunan University, Changsha, 410082, China
| | - Douyue Li
- Bioinformatics Center, College of Biology, Hunan University, Changsha, 410082, China
| | - Xiangyan Zhao
- Bioinformatics Center, College of Biology, Hunan University, Changsha, 410082, China
| | - Saichao Pan
- Bioinformatics Center, College of Biology, Hunan University, Changsha, 410082, China
| | - Xiaolong Wu
- Bioinformatics Center, College of Biology, Hunan University, Changsha, 410082, China
| | - Shan Peng
- Bioinformatics Center, College of Biology, Hunan University, Changsha, 410082, China
| | - Hanrou Huang
- Bioinformatics Center, College of Biology, Hunan University, Changsha, 410082, China
| | - Ruixue Shi
- Bioinformatics Center, College of Biology, Hunan University, Changsha, 410082, China
| | - Zhongyang Tan
- Bioinformatics Center, College of Biology, Hunan University, Changsha, 410082, China.
|
19
|
Song X, Yang T, Yan X, Zheng F, Xu X, Zhou C. Comparison of microsatellite distribution patterns in twenty-nine beetle genomes. Gene 2020; 757:144919. [PMID: 32603771 DOI: 10.1016/j.gene.2020.144919] [Citation(s) in RCA: 9] [Impact Index Per Article: 2.3] [Received: 02/01/2020] [Revised: 06/15/2020] [Accepted: 06/20/2020] [Indexed: 01/20/2023]
Abstract
Simple sequence repeats (SSRs) represent an important source of genetic variation that provides a basis for adaptation to different environments in organisms. In this study, we examined the distribution patterns of SSRs in twenty-nine beetle genomes and carried out Gene Ontology (GO) analysis of CDSs embedded with perfect SSRs (P-SSRs). The results demonstrated that imperfect SSRs (I-SSRs) represented the most abundant SSR category in beetle genomes and in different genomic regions (CDS, exon, and intron regions). The numbers of P-SSRs, I-SSRs, compound SSRs, and variable number tandem repeats were positively correlated with beetle genome size, whereas neither the frequency nor the density of the SSRs was correlated with genome size. Moreover, our results demonstrated that common genomic features of P-SSRs within the same suborder or family of Coleoptera were rare. Mono-, di-, tri-, or tetranucleotide SSRs were the most abundant P-SSR categories in beetle genomes. The preferred predominant repeat motif among the mononucleotide P-SSRs was (A)n, but the most frequent repeat motifs for other length classes varied differentially among these genomes. Furthermore, the P-SSR type with the highest GC content differed in the beetle genomes and in different genomic regions. CV (coefficient of variability) analysis demonstrated that the repeat copy numbers of P-SSRs presented relatively higher variation in introns than in CDSs and exons. The GO terms of CDSs containing P-SSRs for molecular functions were mainly enriched in "binding" and "transcription". Our findings will be useful for studying the functional roles of microsatellite heterogeneity in beetle adaptation.
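The CV comparison mentioned in this abstract can be sketched as follows, assuming the coefficient of variability is the standard deviation of repeat copy numbers divided by their mean. Whether the authors used the population or sample standard deviation is not stated, so the population form is an assumption here. The copy-number lists are made-up toy values, not data from the study.

```python
from statistics import mean, pstdev


def coefficient_of_variation(copy_numbers):
    """Population standard deviation divided by the mean of the copy numbers."""
    if not copy_numbers:
        raise ValueError("need at least one copy number")
    mu = mean(copy_numbers)
    if mu == 0:
        raise ValueError("mean copy number must be non-zero")
    return pstdev(copy_numbers) / mu


# Hypothetical repeat copy numbers of P-SSRs in two genomic regions.
intron_copies = [5, 6, 12, 20, 7]
cds_copies = [5, 5, 6, 5, 6]

print(f"intron CV = {coefficient_of_variation(intron_copies):.2f}")
print(f"CDS CV    = {coefficient_of_variation(cds_copies):.2f}")
```

A higher CV for one region means its repeat copy numbers are more dispersed relative to their mean, which is the sense in which the abstract describes introns as more variable than CDSs and exons.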
Affiliation(s)
- Xuhao Song
- Key Laboratory of Southwest China Wildlife Resources Conservation (Ministry of Education), China West Normal University, Nanchong 637009, Sichuan Province, China.
| | - Tingbang Yang
- Key Laboratory of Southwest China Wildlife Resources Conservation (Ministry of Education), China West Normal University, Nanchong 637009, Sichuan Province, China
| | - Xianghui Yan
- Key Laboratory of Southwest China Wildlife Resources Conservation (Ministry of Education), China West Normal University, Nanchong 637009, Sichuan Province, China
| | - Fake Zheng
- Key Laboratory of Southwest China Wildlife Resources Conservation (Ministry of Education), China West Normal University, Nanchong 637009, Sichuan Province, China
| | - Xiaoqin Xu
- Key Laboratory of Southwest China Wildlife Resources Conservation (Ministry of Education), China West Normal University, Nanchong 637009, Sichuan Province, China
| | - Caiquan Zhou
- Key Laboratory of Southwest China Wildlife Resources Conservation (Ministry of Education), China West Normal University, Nanchong 637009, Sichuan Province, China.
|
20
|
Wokorach G, Otim G, Njuguna J, Edema H, Njung'e V, Machuka EM, Yao N, Stomeo F, Echodu R. Genomic analysis of Sweet potato feathery mottle virus from East Africa. Physiol Mol Plant Pathol 2020; 110:101473. [PMID: 32454559 PMCID: PMC7233136 DOI: 10.1016/j.pmpp.2020.101473] [Citation(s) in RCA: 3] [Impact Index Per Article: 0.8] [Received: 11/15/2019] [Revised: 02/07/2020] [Accepted: 02/10/2020] [Indexed: 06/11/2023]
Abstract
Sweet potato feathery mottle virus is a potyvirus that infects sweet potato. The genome of the virus was analysed to understand its genetic diversity, evolution and gene flow. Motifs, nucleotide identity and a phylogenetic tree were used to determine the phylogroup of the isolates. Gene flow and genetic diversity were tested using DnaSP v.5. Codon evolution was tested using three methods embedded in Datamonkey. The results indicate the occurrence of an isolate of phylogroup B within East Africa. Low genetic differentiation was observed between isolates from Kenya and Uganda, indicating gene flow between the two countries. Four genes were found to have positively selected codons bordering or occurring within functional motifs. A motif within the P1 gene evolved differently between phylogroups A and B. The evidence of gene flow indicates frequent exchange of the virus between the two countries, and the P1 gene motif provides a possible marker that can be used for mapping the distribution of the phylogroups.
Affiliation(s)
- Godfrey Wokorach
- Biosciences Research Laboratory, Gulu University, P.O. Box 166, Gulu, Uganda
| | - Geoffrey Otim
- Biosciences Research Laboratory, Gulu University, P.O. Box 166, Gulu, Uganda
- Faculty of Agriculture, Gulu University, P.O. Box 166, Gulu, Uganda
| | - Joyce Njuguna
- Biosciences Eastern and Central Africa, International Livestock Research Institute (BecA-ILRI) Hub, P.O. Box 30709, Nairobi, 00100, Kenya
| | - Hilary Edema
- Biosciences Research Laboratory, Gulu University, P.O. Box 166, Gulu, Uganda
| | - Vincent Njung'e
- Biosciences Eastern and Central Africa, International Livestock Research Institute (BecA-ILRI) Hub, P.O. Box 30709, Nairobi, 00100, Kenya
| | - Eunice M. Machuka
- Biosciences Eastern and Central Africa, International Livestock Research Institute (BecA-ILRI) Hub, P.O. Box 30709, Nairobi, 00100, Kenya
| | - Nasser Yao
- Biosciences Eastern and Central Africa, International Livestock Research Institute (BecA-ILRI) Hub, P.O. Box 30709, Nairobi, 00100, Kenya
| | - Francesca Stomeo
- Biosciences Eastern and Central Africa, International Livestock Research Institute (BecA-ILRI) Hub, P.O. Box 30709, Nairobi, 00100, Kenya
| | - Richard Echodu
- Biosciences Research Laboratory, Gulu University, P.O. Box 166, Gulu, Uganda
- Faculty of Agriculture, Gulu University, P.O. Box 166, Gulu, Uganda
- Department of Biology, Faculty of Science, Gulu University, P.O. Box 166, Gulu, Uganda
| |
|
21
|
Li D, Zhang H, Peng S, Pan S, Tan Z. Conserved microsatellites may contribute to stem-loop structures in 5', 3' terminals of Ebolavirus genomes. Biochem Biophys Res Commun 2019; 514:726-733. [PMID: 31078274 PMCID: PMC7092875 DOI: 10.1016/j.bbrc.2019.04.192] [Citation(s) in RCA: 2] [Impact Index Per Article: 0.4] [Received: 04/09/2019] [Revised: 04/25/2019] [Accepted: 04/28/2019] [Indexed: 12/12/2022]
Abstract
Microsatellites (SSRs) are ubiquitous in coding and non-coding regions of the Ebolavirus genomes. We synthetically analyzed the microsatellites in whole-genome and terminal regions of 219 Ebolavirus genomes from five species. The Ebolavirus sequences showed small intraspecific variation and large interspecific variation, especially in the terminal non-coding regions. Only five conserved microsatellites were detected in the complete genomes, and four of them, which base-paired well and helped form conserved stem-loop structures, appeared mainly in the terminal non-coding regions. These results suggest that the conserved microsatellites may be evolutionarily selected to form conserved secondary structures in the 5′ and 3′ terminals of Ebolavirus genomes. This may help to understand the biological significance of microsatellites in Ebolavirus and in other virus genomes. Conserved microsatellites mainly occurred in 5′, 3′ terminal non-coding regions. Conserved microsatellites may contribute to conserved stem-loop structures. Conserved microsatellites might be preserved under greater evolutionary pressure.
Affiliation(s)
- Douyue Li
- Bioinformatics Center, College of Biology, Hunan University, Changsha, China
| | - Hongxi Zhang
- Bioinformatics Center, College of Biology, Hunan University, Changsha, China
| | - Shan Peng
- Bioinformatics Center, College of Biology, Hunan University, Changsha, China
| | - Saichao Pan
- Bioinformatics Center, College of Biology, Hunan University, Changsha, China
| | - Zhongyang Tan
- Bioinformatics Center, College of Biology, Hunan University, Changsha, China.
| |
|
22
|
Alam CM, Iqbal A, Sharma A, Schulman AH, Ali S. Microsatellite Diversity, Complexity, and Host Range of Mycobacteriophage Genomes of the Siphoviridae Family. Front Genet 2019; 10:207. [PMID: 30923537 PMCID: PMC6426759 DOI: 10.3389/fgene.2019.00207] [Citation(s) in RCA: 12] [Impact Index Per Article: 2.4] [Received: 05/04/2018] [Accepted: 02/26/2019] [Indexed: 01/21/2023] Open
Abstract
The incidence, distribution, and variation of simple sequence repeats (SSRs) in viruses are instrumental in understanding the functional and evolutionary aspects of repeat sequences. Full-length genome sequences retrieved from NCBI were used for extraction and analysis of repeat sequences using IMEx software. We have also developed two MATLAB-based tools for extraction of gene locations from GenBank in tabular format and simulation of this data with SSR incidence data. The present study, encompassing 147 Mycobacteriophage genomes, revealed 25,284 SSRs and 1,127 compound SSRs (cSSRs) through IMEx. Mono- to hexa-nucleotide motifs were present. The SSR count per genome ranged from 78 (M100) to 342 (M58), while cSSR incidence ranged from 1 (M138) to 17 (M28, M73). Though cSSRs were present in all the genomes, their frequency and SSR to cSSR conversion percentage varied from 1.08 (M138 with 93 SSRs) to 8.33 (M116 with 96 SSRs). In terms of localization, the SSRs were predominantly localized to coding regions (∼78%). Interestingly, genomes of around 50 kb contained a similar number of SSRs/cSSRs to that in a 110 kb genome, suggesting functional relevance for SSRs, which was substantiated by variation in motif constitution between species with different host range. The three species with broad host range (M97, M100, M116) have around 90% of their mono-nucleotide repeat motifs composed of G or C, and only M16 has both A and T mononucleotide motifs. Around 20% of the di-nucleotide repeat motifs in the genomes exhibiting a broad host range were CT/TC, which were either absent or represented to a much lesser extent in the other genomes.
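The conversion percentages quoted in this abstract can be checked with a short calculation. The following is a minimal Python sketch, assuming that the "SSR to cSSR conversion percentage" is simply the cSSR count divided by the SSR count, times 100; that definition, the function name `cssr_percentage`, and the cSSR count of 8 used for M116 are assumptions inferred from the reported figures, not values stated by the authors.

```python
def cssr_percentage(ssr_count: int, cssr_count: int) -> float:
    """Return cSSR count as a percentage of SSR count, rounded to 2 decimals.

    Assumed definition: cSSR% = cSSR / SSR * 100.
    """
    if ssr_count <= 0:
        raise ValueError("ssr_count must be positive")
    return round(100.0 * cssr_count / ssr_count, 2)


# M138: 93 SSRs and 1 cSSR (both figures appear in the abstract)
print(cssr_percentage(93, 1))   # 1.08

# M116: 96 SSRs (from the abstract); 8 cSSRs is an inferred, hypothetical
# count that happens to reproduce the reported 8.33
print(cssr_percentage(96, 8))   # 8.33
```

Under this assumed definition, the lower bound of 1.08 follows directly from the M138 figures given in the abstract.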
Affiliation(s)
- Chaudhary Mashhood Alam
- Luke/BI Plant Genome Dynamics Lab, Institute of Biotechnology and Viikki Plant Science Centre, University of Helsinki, Helsinki, Finland.,Ingenious e-Brain Solutions, Gurugram, India
| | - Asif Iqbal
- PIRO Technologies Private Limited, New Delhi, India
| | - Anjana Sharma
- Department of Biomedical Sciences, SRCASW, University of Delhi, New Delhi, India
| | - Alan H Schulman
- Luke/BI Plant Genome Dynamics Lab, Institute of Biotechnology and Viikki Plant Science Centre, University of Helsinki, Helsinki, Finland.,Natural Resources Institute Finland (Luke), Helsinki, Finland
| | - Safdar Ali
- Department of Biomedical Sciences, SRCASW, University of Delhi, New Delhi, India.,Department of Biological Sciences, Aliah University, Kolkata, India
| |
|
23
|
Comparative analysis on precise distribution-patterns of microsatellites in HIV-1 with differential statistical method. GENE REPORTS 2018. [DOI: 10.1016/j.genrep.2018.06.007] [Citation(s) in RCA: 3] [Impact Index Per Article: 0.5] [Indexed: 11/30/2022]
|
24
|
Portis E, Lanteri S, Barchi L, Portis F, Valente L, Toppino L, Rotino GL, Acquadro A. Comprehensive Characterization of Simple Sequence Repeats in Eggplant (Solanum melongena L.) Genome and Construction of a Web Resource. FRONTIERS IN PLANT SCIENCE 2018; 9:401. [PMID: 29643862 PMCID: PMC5883146 DOI: 10.3389/fpls.2018.00401] [Citation(s) in RCA: 14] [Impact Index Per Article: 2.3] [Received: 01/11/2018] [Accepted: 03/13/2018] [Indexed: 05/21/2023]
Abstract
We have characterized the simple sequence repeat (SSR) markers of the eggplant (Solanum melongena) using a recent high quality sequence of its whole genome. We found nearly 133,000 perfect SSRs, a density of 125.5 SSRs/Mbp, and also about 178,400 imperfect SSRs. Of the perfect SSRs, 15.6% were complex, with two stretches of repeats separated by an intervening block of <100 nt. Di- and trinucleotide SSRs accounted, respectively, for 43 and 37% of the total. The SSRs were classified according to their number of repeats and overall length, and were assigned to their linkage group. We found 2,449 of the perfect SSRs in 2,086 genes, with an overall density of 18.5 SSRs/Mbp across the gene space; 3,524 imperfect SSRs were present in 2,924 genes at a density of 26.7 SSRs/Mbp. Putative functions were assigned via ontology to genes containing at least one SSR. Using this data we developed an "Eggplant Microsatellite DataBase" (EgMiDB) which permits identification of SSR markers in terms of their location on the genome, type of repeat (perfect vs. imperfect), motif type, sequence, repeat number and genomic/gene context. It also suggests forward and reverse primers. We employed an in silico PCR analysis to validate these SSR markers, using as templates two CDS sets and three assembled transcriptomes obtained from diverse eggplant accessions.
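As an illustration of how perfect SSRs of the kind counted in this abstract can be located in a sequence, the following is a minimal Python sketch based on regular expressions. It is not the pipeline used by the authors; the minimum repeat thresholds in `MIN_REPEATS`, the function names, and the toy sequence are all assumptions chosen for illustration.

```python
import re

# Illustrative minimum repeat counts per motif length (assumed values,
# not taken from the abstract)
MIN_REPEATS = {1: 10, 2: 6, 3: 5, 4: 5, 5: 5, 6: 5}


def is_primitive(motif):
    """True if the motif is not itself a repetition of a shorter unit (e.g. 'ATAT')."""
    n = len(motif)
    return not any(n % k == 0 and motif[:k] * (n // k) == motif for k in range(1, n))


def find_perfect_ssrs(seq, min_repeats=MIN_REPEATS):
    """Return sorted (start, motif, repeat_count) tuples for perfect SSRs in seq."""
    seq = seq.upper()
    hits = []
    for k, n in sorted(min_repeats.items()):
        # A k-mer followed by at least n-1 exact copies of itself
        pattern = re.compile(r"([ACGT]{%d})\1{%d,}" % (k, n - 1))
        for m in pattern.finditer(seq):
            motif = m.group(1)
            if is_primitive(motif):  # skip e.g. 'AA' reported as a dinucleotide
                hits.append((m.start(), motif, len(m.group(0)) // k))
    return sorted(hits)


# Toy sequence: (AG)7 and (AAG)5 embedded in short flanking bases
toy = "TC" + "AG" * 7 + "CCT" + "AAG" * 5 + "TTTT"
for start, motif, count in find_perfect_ssrs(toy):
    print(start, motif, count)
```

The sketch reports only perfect repeats; imperfect and compound SSRs, which the abstract also counts, would need an additional step that tolerates mismatches or merges neighbouring hits separated by a short gap.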
Affiliation(s)
- Ezio Portis
- Dipartimento di Scienze Agrarie, Forestali ed Alimentari – Plant Genetics and Breeding, Università degli Studi di Torino, Turin, Italy
| | - Sergio Lanteri
- Dipartimento di Scienze Agrarie, Forestali ed Alimentari – Plant Genetics and Breeding, Università degli Studi di Torino, Turin, Italy
- *Correspondence: Sergio Lanteri
| | - Lorenzo Barchi
- Dipartimento di Scienze Agrarie, Forestali ed Alimentari – Plant Genetics and Breeding, Università degli Studi di Torino, Turin, Italy
| | - Laura Toppino
- CREA-GB, Research Centre for Genomics and Bioinformatics, Lodi, Italy
| | - Alberto Acquadro
- Dipartimento di Scienze Agrarie, Forestali ed Alimentari – Plant Genetics and Breeding, Università degli Studi di Torino, Turin, Italy
| |
|
25
|
Distinct patterns of simple sequence repeats and GC distribution in intragenic and intergenic regions of primate genomes. Aging (Albany NY) 2017; 8:2635-2654. [PMID: 27644032 PMCID: PMC5191860 DOI: 10.18632/aging.101025] [Citation(s) in RCA: 17] [Impact Index Per Article: 2.4] [Received: 06/08/2016] [Accepted: 08/22/2016] [Indexed: 01/23/2023]
Abstract
As the first systematic examination of simple sequence repeats (SSRs) and guanine-cytosine (GC) distribution in intragenic and intergenic regions of ten primates, our study showed that SSRs and GC displayed nonrandom distribution in both intragenic and intergenic regions, suggesting that they have potential roles in transcriptional or translational regulation. Our results suggest that the majority of SSRs are distributed in non-coding regions, such as introns, TEs, and intergenic regions. In these primates, trinucleotide perfect (P) SSRs were the most abundant repeat type in the 5'UTRs and CDSs, whereas mononucleotide P-SSRs were the most abundant in the introns, 3'UTRs, TEs, and intergenic regions. The GC content varied greatly among the different intragenic and intergenic regions: 5'UTRs > CDSs > 3'UTRs > TEs > introns > intergenic regions, and high GC content was frequently found in exon-rich regions. Our results also showed that, within the same intragenic and intergenic regions, the distribution of GC content was highly similar across the different primates. Tri- and hexanucleotide P-SSRs had the highest GC content in the 5'UTRs and CDSs, whereas mononucleotide P-SSRs had the lowest GC content in the six genomic regions of these primates. The most frequent motifs of each length varied markedly among the genomic regions.
|
26
|
Genome-wide In Silico Analysis, Characterization and Identification of Microsatellites in Spodoptera littoralis Multiple nucleopolyhedrovirus (SpliMNPV). Sci Rep 2016; 6:33741. [PMID: 27650818 PMCID: PMC5030640 DOI: 10.1038/srep33741] [Citation(s) in RCA: 13] [Impact Index Per Article: 1.6] [Received: 02/05/2016] [Accepted: 09/01/2016] [Indexed: 01/10/2023] Open
Abstract
In this study, we undertook a survey to analyze the distribution and frequency of microsatellites or Simple Sequence Repeats (SSRs) in Spodoptera littoralis multiple nucleopolyhedrovirus (SpliMNPV) genome (isolate AN-1956). Out of the 55 microsatellite motifs, identified in the SpliMNPV-AN1956 genome using in silico analysis (inclusive of mono-, di-, tri- and hexa-nucleotide repeats), 39 were found to be distributed within coding regions (cSSRs), whereas 16 were observed to lie within intergenic or noncoding regions. Among the 39 motifs located in coding regions, 21 were located in annotated functional genes whilst 18 were identified in unknown functional genes (hypothetical proteins). Among the identified motifs, trinucleotide (80%) repeats were found to be the most abundant followed by dinucleotide (13%), mononucleotide (5%) and hexanucleotide (2%) repeats. The 39 motifs located within coding regions were further validated in vitro by using PCR analysis, while the 21 motifs located within known functional genes (15 genes) were characterized using nucleotide sequencing. A comparison of the sequence analysis data of the 21 sequenced cSSRs with the published sequences is presented. Finally, the developed SSR markers of the 39 motifs were further mapped/localized onto the SpliMNPV-AN1956 genome. In conclusion, the SSR markers specific to SpliMNPV, developed in this study, could be a useful tool for the identification of isolates and analysis of genetic diversity and viral evolutionary status.
|
27
|
Comparative analysis of microsatellites and compound microsatellites in T4-like viruses. Gene 2016; 575:695-701. [DOI: 10.1016/j.gene.2015.09.053] [Citation(s) in RCA: 4] [Impact Index Per Article: 0.5] [Received: 01/22/2015] [Revised: 09/16/2015] [Accepted: 09/21/2015] [Indexed: 01/27/2023]
|
28
|
Han B, Wang C, Tang Z, Ren Y, Li Y, Zhang D, Dong Y, Zhao X. Genome-Wide Analysis of Microsatellite Markers Based on Sequenced Database in Chinese Spring Wheat (Triticum aestivum L.). PLoS One 2015; 10:e0141540. [PMID: 26536014 PMCID: PMC4633229 DOI: 10.1371/journal.pone.0141540] [Citation(s) in RCA: 37] [Impact Index Per Article: 4.1] [Received: 06/12/2015] [Accepted: 10/10/2015] [Indexed: 12/12/2022] Open
Abstract
Microsatellites or simple sequence repeats (SSRs) are distributed across both prokaryotic and eukaryotic genomes and have been widely used for genetic studies and molecular marker-assisted breeding in crops. Though an ordered draft sequence of hexaploid bread wheat has been announced, a systematic analysis of SSRs in wheat has not yet been reported. In the present study, we identified 364,347 SSRs from among 10,603,760 sequences of the Chinese spring wheat (CSW) genome, which were present at a density of 36.68 SSR/Mb. In total, we detected 488 types of motifs ranging from di- to hexanucleotides, among which dinucleotide repeats dominated, accounting for approximately 42.52% of the genome. The density of tri- to hexanucleotide repeats was 24.97%, 4.62%, 3.25% and 24.65%, respectively. AG/CT, AAG/CTT, AGAT/ATCT, AAAAG/CTTTT and AAAATT/AATTTT were the most frequent repeats among di- to hexanucleotide repeats. Among the 21 chromosomes of CSW, the density of repeats was highest on chromosome 2D and lowest on chromosome 3A. The proportions of di-, tri-, tetra-, penta- and hexanucleotide repeats on each chromosome, and even on the whole genome, were almost identical. In addition, 295,267 SSR markers were successfully developed from the 21 chromosomes of CSW, which cover the entire genome at a density of 29.73 per Mb. All of the SSR markers were validated by reverse electronic-Polymerase Chain Reaction (re-PCR); 70,564 (23.9%) were found to be monomorphic and 224,703 (76.1%) were found to be polymorphic. A total of 45 monomorphic markers were selected randomly for validation purposes; 24 (53.3%) amplified one locus, 8 (17.8%) amplified multiple identical loci, and 13 (28.9%) did not amplify any fragments from the genomic DNA of CSW.
Then a dendrogram was generated based on the 24 monomorphic SSR markers among 20 wheat cultivars and three species of its diploid ancestors showing that monomorphic SSR markers represented a promising source to increase the number of genetic markers available for the wheat genome. The results of this study will be useful for investigating the genetic diversity and evolution among wheat and related species. At the same time, the results will facilitate comparative genomic studies and marker-assisted breeding (MAS) in plants.
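The counts, percentages and densities reported in this abstract are related by simple arithmetic, which can be checked as follows. This is a minimal Python sketch; the helper names `pct` and `implied_length_mb` are hypothetical, and the sequence length derived from the density is an inference from the reported figures, not a value stated by the authors.

```python
def pct(part, total):
    """Percentage of part in total, rounded to one decimal place."""
    return round(100.0 * part / total, 1)


def implied_length_mb(count, density_per_mb):
    """Sequence length (Mb) implied by a count and a density per Mb."""
    return count / density_per_mb


# Marker polymorphism reported after re-PCR validation
total_markers = 295_267
monomorphic, polymorphic = 70_564, 224_703
assert monomorphic + polymorphic == total_markers
print(pct(monomorphic, total_markers))   # 23.9
print(pct(polymorphic, total_markers))   # 76.1

# Outcome of the 45 randomly selected monomorphic markers
for n in (24, 8, 13):
    print(n, pct(n, 45))                 # 53.3, 17.8 and 28.9 respectively

# Sequence length implied by 364,347 SSRs at 36.68 SSR/Mb (an inference)
print(round(implied_length_mb(364_347, 36.68)))  # roughly 9,933 Mb
```

The marker density of 29.73 per Mb for 295,267 markers implies a very similar length, so the two reported densities are mutually consistent.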
Affiliation(s)
- Bin Han
- College of Bio-engineering, Shanxi University, Taiyuan, China
| | - Changbiao Wang
- Biotechnology Research Center, Shanxi Academy of Agricultural Sciences, Taiyuan, China
- * E-mail: (ZHT); (DYZ); (CBW)
| | - Zhaohui Tang
- Biotechnology Research Center, Shanxi Academy of Agricultural Sciences, Taiyuan, China
- * E-mail: (ZHT); (DYZ); (CBW)
| | - Yongkang Ren
- Institute of Crop Science, Shanxi Academy of Agricultural Sciences, Taiyuan, China
| | - Yali Li
- Biotechnology Research Center, Shanxi Academy of Agricultural Sciences, Taiyuan, China
| | - Dayong Zhang
- Provincial Key Laboratory of Agrobiology, Institute of Biotechnology, Jiangsu Academy of Agricultural Sciences, Nanjing, China
- * E-mail: (ZHT); (DYZ); (CBW)
| | - Yanhui Dong
- Biotechnology Research Center, Shanxi Academy of Agricultural Sciences, Taiyuan, China
| | - Xinghua Zhao
- Biotechnology Research Center, Shanxi Academy of Agricultural Sciences, Taiyuan, China
| |
|
29
|
Han B, Wang C, Tang Z, Ren Y, Li Y, Zhang D, Dong Y, Zhao X. Genome-Wide Analysis of Microsatellite Markers Based on Sequenced Database in Chinese Spring Wheat (Triticum aestivum L.). PLoS One 2015; 10:e0141540. [PMID: 26536014 DOI: 10.1371/journal.pone.0141540.t006] [Citation(s) in RCA: 1] [Impact Index Per Article: 0.1] [Received: 06/12/2015] [Accepted: 10/10/2015] [Indexed: 05/21/2023] Open
|
30
|
Genome wide survey of microsatellites in ssDNA viruses infecting vertebrates. Gene 2014; 552:209-18. [DOI: 10.1016/j.gene.2014.09.032] [Citation(s) in RCA: 8] [Impact Index Per Article: 0.8] [Received: 05/12/2014] [Revised: 08/15/2014] [Accepted: 09/15/2014] [Indexed: 01/26/2023]
|
31
|
The analysis of microsatellites and compound microsatellites in 56 complete genomes of Herpesvirales. Gene 2014; 551:103-9. [DOI: 10.1016/j.gene.2014.08.054] [Citation(s) in RCA: 12] [Impact Index Per Article: 1.2] [Received: 06/07/2014] [Revised: 08/09/2014] [Accepted: 08/26/2014] [Indexed: 01/13/2023]
|
32
|
Qin L, Ma Y, Liang P, Tan Z, Li S. Differential distributions of mononucleotide repeat sequences in 256 viral genomes and its potential implications. Gene 2014; 544:159-64. [PMID: 24786215 DOI: 10.1016/j.gene.2014.04.063] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 02/27/2014] [Revised: 04/14/2014] [Accepted: 04/26/2014] [Indexed: 11/18/2022]
Abstract
Mononucleotide repeats (MNRs) have been systematically investigated in the genomes of eukaryotic and prokaryotic organisms. However, detailed information on the distribution of MNRs in viral genomes is limited. In this study, we examined the distributions of MNRs in 256 fully sequenced virus genomes; these distributions showed extensive variation across viral genomes and were significantly influenced by both genome size and CG content. Furthermore, the ratio of the observed to the expected number of MNRs (O/E ratio) appears to be influenced by both the host range and genome type of a particular virus. Additionally, the densities and frequencies of MNRs in genic regions are lower than in non-coding regions, suggesting that selective pressure acts on viral genomes. We also discuss the potential functional roles that these MNR loci could play in virus genomes. To our knowledge, this is the first analysis focusing on MNRs in viruses, and our study could have potential implications for a deeper understanding of virus genome stability and the co-evolution that occurs between a virus and its host.
Affiliation(s)
- Lü Qin
- State Key Laboratory of Biology of Plant Diseases and Insect Pests, Institute of Plant Protection, Chinese Academy of Agricultural Sciences, Beijing 100193, China; College of Biology, State Key Laboratory for Chemo/Biosensing and Chemometrics, Hunan University, Changsha 410082, China
| | - Yuxin Ma
- State Key Laboratory of Biology of Plant Diseases and Insect Pests, Institute of Plant Protection, Chinese Academy of Agricultural Sciences, Beijing 100193, China
| | - Pengbo Liang
- College of Agronomy and Biotechnology, China Agricultural University, Beijing 100094, China
| | - Zhongyang Tan
- State Key Laboratory of Biology of Plant Diseases and Insect Pests, Institute of Plant Protection, Chinese Academy of Agricultural Sciences, Beijing 100193, China; College of Biology, State Key Laboratory for Chemo/Biosensing and Chemometrics, Hunan University, Changsha 410082, China.
| | - Shifang Li
- State Key Laboratory of Biology of Plant Diseases and Insect Pests, Institute of Plant Protection, Chinese Academy of Agricultural Sciences, Beijing 100193, China.
| |
|
33
|
Alam CM, Singh AK, Sharfuddin C, Ali S. Incidence, complexity and diversity of simple sequence repeats across potexvirus genomes. Gene 2014; 537:189-96. [DOI: 10.1016/j.gene.2014.01.007] [Citation(s) in RCA: 16] [Impact Index Per Article: 1.6] [Received: 09/17/2013] [Revised: 11/15/2013] [Accepted: 01/04/2014] [Indexed: 01/18/2023]
|
34
|
Qin L, Zhang Z, Zhao X, Wu X, Chen Y, Tan Z, Li S. Survey and analysis of simple sequence repeats (SSRs) present in the genomes of plant viroids. FEBS Open Bio 2014; 4:185-9. [PMID: 24649400 PMCID: PMC3953718 DOI: 10.1016/j.fob.2014.02.001] [Citation(s) in RCA: 7] [Impact Index Per Article: 0.7] [Received: 11/16/2013] [Revised: 01/28/2014] [Accepted: 02/04/2014] [Indexed: 10/25/2022] Open
Abstract
Extensive simple sequence repeat (SSR) surveys have been performed for eukaryotic, prokaryotic and viral genomes, but information regarding SSRs in viroids is limited. We undertook a survey to examine the presence of SSRs in viroid genomes. Our results show that the distribution of SSRs in viroids may influence secondary structure, and that SSRs could play a role in generating genetic diversity. We also discuss the potential evolutionary role of repeated sequences in the viroid genome. This is the first report of SSR loci in viroids, and our study could be helpful in understanding the structure and evolution of viroid genomes.
Affiliation(s)
- Lü Qin
- State Key Laboratory of Biology of Plant Diseases and Insect Pests, Institute of Plant Protection, Chinese Academy of Agricultural Sciences, Beijing 100193, China ; College of Biology, State Key Laboratory for Chemo/Biosensing and Chemometrics, Hunan University, Changsha 410082, China
| | - Zhixiang Zhang
- State Key Laboratory of Biology of Plant Diseases and Insect Pests, Institute of Plant Protection, Chinese Academy of Agricultural Sciences, Beijing 100193, China
| | - Xiangyan Zhao
- College of Biology, State Key Laboratory for Chemo/Biosensing and Chemometrics, Hunan University, Changsha 410082, China
| | - Xiaolong Wu
- College of Biology, State Key Laboratory for Chemo/Biosensing and Chemometrics, Hunan University, Changsha 410082, China
| | - Yubao Chen
- Department of Computational Biology, Beijing Computing Center, Yongfeng Industry Base, Beijing 100094, China
| | - Zhongyang Tan
- State Key Laboratory of Biology of Plant Diseases and Insect Pests, Institute of Plant Protection, Chinese Academy of Agricultural Sciences, Beijing 100193, China ; College of Biology, State Key Laboratory for Chemo/Biosensing and Chemometrics, Hunan University, Changsha 410082, China
| | - Shifang Li
- State Key Laboratory of Biology of Plant Diseases and Insect Pests, Institute of Plant Protection, Chinese Academy of Agricultural Sciences, Beijing 100193, China
| |
|
35
|
Abstract
Tandem repeats (TRs) occur extensively in the genomes of prokaryotes and eukaryotes. Based on the sequenced genomes and gene annotations of 31 plant and algal species in Phytozome version 8.0 (http://www.phytozome.net/), we examined TRs on a genome-wide scale, characterized their distributions and motif features, and explored their putative biological functions. Among the 31 species, no significant correlation was detected between the TR density and genome size. Interestingly, green alga Chlamydomonas reinhardtii (42,059 bp/Mbp) and castor bean Ricinus communis (55,454 bp/Mbp) showed much higher TR densities than all other species (13,209 bp/Mbp on average). In the 29 land plants, including 22 dicots, 5 monocots, and 2 bryophytes, 5′-UTR and upstream intergenic 200-nt (UI200) regions had the first and second highest TR densities, whereas in the two green algae (C. reinhardtii and Volvox carteri) the first and second highest densities were found in intron and coding sequence (CDS) regions, respectively. In CDS regions, trinucleotide and hexanucleotide motifs were those most frequently represented in all species. In intron regions, especially in the two green algae, significantly more TRs were detected near the intron–exon junctions. Within intergenic regions in dicots and monocots, more TRs were found near both the 5′ and 3′ ends of genes. GO annotation in two green algae revealed that the genes with TRs in introns are significantly involved in transcriptional and translational processing. As the first systematic examination of TRs in plant and green algal genomes, our study showed that TRs displayed nonrandom distribution in both intragenic and intergenic regions, suggesting that they have potential roles in transcriptional or translational regulation in plants and green algae.
|
36
|
In-silico analysis of simple and imperfect microsatellites in diverse tobamovirus genomes. Gene 2013; 530:193-200. [DOI: 10.1016/j.gene.2013.08.046] [Citation(s) in RCA: 18] [Impact Index Per Article: 1.6] [Received: 05/13/2013] [Revised: 08/10/2013] [Accepted: 08/13/2013] [Indexed: 11/20/2022]
|