1
|
Bharti PK, Husai A. Mining and analysis of microsatellites in human coronavirus genomes using the in-house built Java pipeline. Genomics Inform 2022; 20:e35. [PMID: 36239112 PMCID: PMC9576472 DOI: 10.5808/gi.20033] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 05/29/2020] [Accepted: 09/14/2022] [Indexed: 11/20/2022] Open
Abstract
Microsatellites or simple sequence repeats are motifs of 1 to 6 nucleotides in length present in both coding and non-coding regions of DNA. They are widely distributed throughout the genomes of prokaryotes, eukaryotes, bacteria, and viruses and are used as molecular markers in studies of DNA variation, gene regulation, genetic diversity, and evolution. However, in vitro microsatellite identification is time-consuming and expensive. Therefore, the present research focused on using an in-house built Java pipeline to identify and analyse perfect and compound microsatellites, design primers, and compute related statistics in seven complete coronavirus genome sequences, including the genome of coronavirus disease 2019, where the host is Homo sapiens. Based on the search criteria, the total number of perfect simple sequence repeats (SSRs) across the seven genomic sequences ranged from 76 to 118 and the number of compound SSRs from 1 to 10, reflecting the low conversion of perfect repeats to compound repeats. Furthermore, the incidence of SSRs was positively but not significantly correlated with genome size (R² = 0.45, p > 0.05), with SSR relative abundance (R² = 0.18, p > 0.05), and with relative density (R² = 0.23, p > 0.05). Dinucleotide repeats were the most abundant in the coding region of the genome, followed by tri-, mono-, and tetranucleotide repeats. This comparative study would help us understand the evolutionary relationship, genetic diversity, and hypervariability in minimal time and cost.
Affiliation(s)
- P K Bharti
  - School of Computer Science, Shri Venkateshwara University, Gajraula 244236, Uttar Pradesh, India
- Akhtar Husai
  - Department of Computer Science & IT, MJP Rohilkhand University, Bareilly 243006, Uttar Pradesh, India
|
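As a hedged illustration of the kind of search that entry 1 describes (perfect repeats of 1 to 6 nt motifs, followed by relative abundance and relative density), the Python sketch below may help. It is not the authors' Java pipeline. The minimum-repeat thresholds, the function names, and the toy sequence are assumptions made for illustration, and relative abundance and relative density are taken here as SSR count per kb and SSR base pairs per kb, a common convention that the abstract itself does not spell out.

```python
import re

# Minimum number of repeat units per motif length (1..6 nt).
# ASSUMPTION: illustrative thresholds, not the search criteria of the paper.
MIN_REPEATS = {1: 6, 2: 3, 3: 3, 4: 3, 5: 3, 6: 3}


def find_perfect_ssrs(seq, min_repeats=MIN_REPEATS):
    """Return sorted (start, end, motif, repeat_count) tuples for perfect SSRs."""
    seq = seq.upper()
    hits = []
    for size, min_rep in sorted(min_repeats.items()):
        # One motif of `size` nt followed by at least (min_rep - 1) exact copies.
        pattern = re.compile(r"([ACGTU]{%d})\1{%d,}" % (size, min_rep - 1))
        for m in pattern.finditer(seq):
            motif = m.group(1)
            # Skip motifs that are themselves repeats of a shorter unit,
            # e.g. "AA" or "ATAT", so each tract is reported once.
            if any(size % k == 0 and motif == motif[:k] * (size // k)
                   for k in range(1, size)):
                continue
            hits.append((m.start(), m.end(), motif, (m.end() - m.start()) // size))
    return sorted(hits)


def relative_abundance(hits, genome_length):
    """Number of SSRs per kb of sequence (assumed definition)."""
    return len(hits) / (genome_length / 1000.0)


def relative_density(hits, genome_length):
    """Total bp covered by SSRs per kb of sequence (assumed definition)."""
    return sum(end - start for start, end, _, _ in hits) / (genome_length / 1000.0)


# Toy sequence (hypothetical, not taken from any genome in the listing).
toy = "GGC" + "A" * 7 + "CGC" + "AT" * 4 + "CCG" + "GAA" * 3 + "TTC"
hits = find_perfect_ssrs(toy)
print(hits)  # [(3, 10, 'A', 7), (13, 21, 'AT', 4), (24, 33, 'GAA', 3)]
print(relative_abundance(hits, len(toy)), relative_density(hits, len(toy)))
```

A regular expression with a backreference keeps the sketch short; a production tool would more likely scan the sequence once and also handle overlapping candidates and compound repeats.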
2
|
Li D, Shi R, Zhang H, Huang H, Pan S, Liang Y, Peng S, Tan Z. The only conserved microsatellite in coding regions of ebolavirus is the editing site. Biochem Biophys Res Commun 2021; 565:79-84. [PMID: 34098315 DOI: 10.1016/j.bbrc.2021.05.093] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 05/13/2021] [Revised: 05/26/2021] [Accepted: 05/27/2021] [Indexed: 11/29/2022]
Abstract
Many viral genomes, including those of Ebolavirus, have been found to contain microsatellites (SSRs), and the majority of Ebolavirus microsatellite sites are distributed in protein-coding regions of the genomes. Here, we identified a total of 212 reserved microsatellite sites in the protein-coding regions of 213 genomic sequences from five Ebolavirus species. Among these reserved microsatellite sites, only one is significantly conserved across the sampled Ebolavirus genomic sequences, and this microsatellite is located at the RNA editing site of the GP gene, indicating selective relevance to RNA editing there. This analysis may help to further explore the biological significance of various microsatellites in Ebolavirus genomes.
Affiliation(s)
- Douyue Li
  - Bioinformatics Center, College of Biology, Hunan University, Changsha, 410082, China
- Ruixue Shi
  - Bioinformatics Center, College of Biology, Hunan University, Changsha, 410082, China
- Hongxi Zhang
  - Bioinformatics Center, College of Biology, Hunan University, Changsha, 410082, China
- Hanrou Huang
  - Bioinformatics Center, College of Biology, Hunan University, Changsha, 410082, China
- Saichao Pan
  - Bioinformatics Center, College of Biology, Hunan University, Changsha, 410082, China
- Yuling Liang
  - Bioinformatics Center, College of Biology, Hunan University, Changsha, 410082, China
- Shan Peng
  - Bioinformatics Center, College of Biology, Hunan University, Changsha, 410082, China
- Zhongyang Tan
  - Bioinformatics Center, College of Biology, Hunan University, Changsha, 410082, China.
|
3
|
Genome-wide in silico identification and characterization of Simple Sequence Repeats in diverse completed SARS-CoV-2 genomes. Gene Reports 2021; 23:101020. [PMID: 33521382 PMCID: PMC7835092 DOI: 10.1016/j.genrep.2021.101020] [Citation(s) in RCA: 5] [Impact Index Per Article: 1.7] [Received: 10/03/2020] [Revised: 12/06/2020] [Accepted: 12/29/2020] [Indexed: 12/19/2022]
Abstract
Simple sequence repeats (SSRs), or microsatellites, are short repeat sequences that have been extensively studied in eukaryotic (plants) and prokaryotic (bacteria) organisms. Compared to other organisms, the presence and incidence of SSRs in viral genomes are less studied. With the emergence of novel infectious viruses over the past few decades, it is imperative to study the genetic diversity in such viruses to predict their evolutionary and functional changes over time. Following the emergence of SARS-CoV-2, we have assembled 121 complete genomes reported from 31 countries across the six continents for the identification and characterization of SSRs. Using two independent SSR identification tools, we found remarkable consistency in the diversity of microsatellite patterns (38–42 per genome) in the 121 analyzed SARS-CoV-2 genomes, indicating their important role in genome stability. Among the identified motifs, trinucleotide and hexanucleotide repeats were found to be the most abundant forms, followed by mono- and di-nucleotide repeats. There were no tetra- or penta-nucleotide repeats in the analyzed SARS-CoV-2 genomes. The discovery of microsatellites in SARS-CoV-2 genomes may become useful for population genetics, evolutionary analysis, strain identification and the study of genetic variation.
|
4
|
Characterization of a Novel Mitovirus of the Sand Fly Lutzomyia longipalpis Using Genomic and Virus-Host Interaction Signatures. Viruses 2020; 13:v13010009. [PMID: 33374584 PMCID: PMC7822452 DOI: 10.3390/v13010009] [Citation(s) in RCA: 15] [Impact Index Per Article: 3.8] [Received: 11/30/2020] [Revised: 12/17/2020] [Accepted: 12/21/2020] [Indexed: 02/06/2023] Open
Abstract
Hematophagous insects act as the major reservoirs of infectious agents due to their intimate contact with a large variety of vertebrate hosts. Lutzomyia longipalpis is the main vector of Leishmania chagasi in the New World, but its role as a host of viruses is poorly understood. In this work, Lu. longipalpis RNA libraries were subjected to progressive assembly using viral profile HMMs as seeds. A sequence phylogenetically related to fungal viruses of the genus Mitovirus was identified and this novel virus was named Lul-MV-1. The 2697-base genome presents a single gene coding for an RNA-directed RNA polymerase with an organellar genetic code. To determine the possible host of Lul-MV-1, we analyzed the molecular characteristics of the viral genome. Dinucleotide composition and codon usage showed profiles similar to mitochondrial DNA of invertebrate hosts. Also, the virus-derived small RNA profile was consistent with the activation of the siRNA pathway, with size distribution and 5′ base enrichment analogous to those observed in viruses of sand flies, reinforcing Lu. longipalpis as a putative host. Finally, RT-PCR of different insect pools and sequences of public Lu. longipalpis RNA libraries confirmed the high prevalence of Lul-MV-1. This is the first report of a mitovirus infecting an insect host.
|
5
|
Zhang H, Li D, Zhao X, Pan S, Wu X, Peng S, Huang H, Shi R, Tan Z. Relatively semi-conservative replication and a folded slippage model for short tandem repeats. BMC Genomics 2020; 21:563. [PMID: 32807079 PMCID: PMC7430839 DOI: 10.1186/s12864-020-06949-5] [Citation(s) in RCA: 3] [Impact Index Per Article: 0.8] [Received: 04/13/2020] [Accepted: 07/27/2020] [Indexed: 12/11/2022] Open
Abstract
Background: The ubiquitous presence of short tandem repeats (STRs) in virtually all genomes implies their functional relevance, yet a widely accepted definition of STRs has not been established. Previous studies have mainly focused on relatively long STRs, while shorter repeats were generally excluded. Here, we adopted more generous criteria to define shorter repeats, which yields a much larger number of STRs that have not previously been analyzed. Using this definition, we analyzed the short repeats in 55 randomly selected segments in 55 randomly selected genomic sequences from a fairly wide range of species covering animals, plants, fungi, protozoa, bacteria, archaea and viruses. Results: Our analysis reveals a high percentage of short repeats in all 55 randomly selected segments, indicating that the universal presence of high-content short repeats could be a common characteristic of genomes across all biological kingdoms. Therefore, it is reasonable to assume a mechanism for continuous production of repeats that can make the replicating process relatively semi-conservative. We propose a folded replication slippage model that considers the geometric space of nucleotides and hydrogen bond stability to explain the mechanism more explicitly, improving on the existing straight-line slippage model. The folded slippage model can explain the expansion and contraction of mono- to hexa-nucleotide repeats with proper folding angles. Analysis of external forces in the folding template strands also suggests that expansion is more common than contraction in short tandem repeats. Conclusion: The folded replication slippage model provides a reasonable explanation for the continuous occurrence of simple sequence repeats in genomes. This model also contributes to the explanation of STR-to-genome evolution and is an alternative model that complements semi-conservative replication.
Affiliation(s)
- Hongxi Zhang
  - Bioinformatics Center, College of Biology, Hunan University, Changsha, 410082, China
- Douyue Li
  - Bioinformatics Center, College of Biology, Hunan University, Changsha, 410082, China
- Xiangyan Zhao
  - Bioinformatics Center, College of Biology, Hunan University, Changsha, 410082, China
- Saichao Pan
  - Bioinformatics Center, College of Biology, Hunan University, Changsha, 410082, China
- Xiaolong Wu
  - Bioinformatics Center, College of Biology, Hunan University, Changsha, 410082, China
- Shan Peng
  - Bioinformatics Center, College of Biology, Hunan University, Changsha, 410082, China
- Hanrou Huang
  - Bioinformatics Center, College of Biology, Hunan University, Changsha, 410082, China
- Ruixue Shi
  - Bioinformatics Center, College of Biology, Hunan University, Changsha, 410082, China
- Zhongyang Tan
  - Bioinformatics Center, College of Biology, Hunan University, Changsha, 410082, China.
|
6
|
Li D, Zhang H, Peng S, Pan S, Tan Z. Conserved microsatellites may contribute to stem-loop structures in 5', 3' terminals of Ebolavirus genomes. Biochem Biophys Res Commun 2019; 514:726-733. [PMID: 31078274 PMCID: PMC7092875 DOI: 10.1016/j.bbrc.2019.04.192] [Citation(s) in RCA: 2] [Impact Index Per Article: 0.4] [Received: 04/09/2019] [Revised: 04/25/2019] [Accepted: 04/28/2019] [Indexed: 12/12/2022]
Abstract
Microsatellites (SSRs) are ubiquitous in coding and non-coding regions of Ebolavirus genomes. We synthetically analyzed the microsatellites in the whole genomes and terminal regions of 219 Ebolavirus genomes from five species. The Ebolavirus sequences showed small intraspecies variations and large interspecific variations, especially in the terminal non-coding regions. Only five conserved microsatellites were detected in the complete genomes, and four of them, which were well base-paired and helped form conserved stem-loop structures, mainly appeared in the terminal non-coding regions. These results suggest that the conserved microsatellites may be evolutionarily selected to form conserved secondary structures in the 5′ and 3′ terminals of Ebolavirus genomes. This may help to understand the biological significance of microsatellites in Ebolavirus and also in other virus genomes. Conserved microsatellites mainly occurred in 5′, 3′ terminal non-coding regions. Conserved microsatellites may contribute to conserved stem-loop structures. Conserved microsatellites might be preserved under greater evolutionary pressure.
Affiliation(s)
- Douyue Li
  - Bioinformatics Center, College of Biology, Hunan University, Changsha, China
- Hongxi Zhang
  - Bioinformatics Center, College of Biology, Hunan University, Changsha, China
- Shan Peng
  - Bioinformatics Center, College of Biology, Hunan University, Changsha, China
- Saichao Pan
  - Bioinformatics Center, College of Biology, Hunan University, Changsha, China
- Zhongyang Tan
  - Bioinformatics Center, College of Biology, Hunan University, Changsha, China.
|
7
|
Genome-wide In Silico Analysis, Characterization and Identification of Microsatellites in Spodoptera littoralis Multiple nucleopolyhedrovirus (SpliMNPV). Sci Rep 2016; 6:33741. [PMID: 27650818 PMCID: PMC5030640 DOI: 10.1038/srep33741] [Citation(s) in RCA: 13] [Impact Index Per Article: 1.6] [Received: 02/05/2016] [Accepted: 09/01/2016] [Indexed: 01/10/2023] Open
Abstract
In this study, we undertook a survey to analyze the distribution and frequency of microsatellites or Simple Sequence Repeats (SSRs) in Spodoptera littoralis multiple nucleopolyhedrovirus (SpliMNPV) genome (isolate AN-1956). Out of the 55 microsatellite motifs, identified in the SpliMNPV-AN1956 genome using in silico analysis (inclusive of mono-, di-, tri- and hexa-nucleotide repeats), 39 were found to be distributed within coding regions (cSSRs), whereas 16 were observed to lie within intergenic or noncoding regions. Among the 39 motifs located in coding regions, 21 were located in annotated functional genes whilst 18 were identified in unknown functional genes (hypothetical proteins). Among the identified motifs, trinucleotide (80%) repeats were found to be the most abundant followed by dinucleotide (13%), mononucleotide (5%) and hexanucleotide (2%) repeats. The 39 motifs located within coding regions were further validated in vitro by using PCR analysis, while the 21 motifs located within known functional genes (15 genes) were characterized using nucleotide sequencing. A comparison of the sequence analysis data of the 21 sequenced cSSRs with the published sequences is presented. Finally, the developed SSR markers of the 39 motifs were further mapped/localized onto the SpliMNPV-AN1956 genome. In conclusion, the SSR markers specific to SpliMNPV, developed in this study, could be a useful tool for the identification of isolates and analysis of genetic diversity and viral evolutionary status.
|
8
|
Comparative analysis of microsatellites and compound microsatellites in T4-like viruses. Gene 2016; 575:695-701. [DOI: 10.1016/j.gene.2015.09.053] [Citation(s) in RCA: 4] [Impact Index Per Article: 0.5] [Received: 01/22/2015] [Revised: 09/16/2015] [Accepted: 09/21/2015] [Indexed: 01/27/2023]
|
9
|
|
10
|
Hatcher EL, Wang C, Lefkowitz EJ. Genome variability and gene content in chordopoxviruses: dependence on microsatellites. Viruses 2015; 7:2126-46. [PMID: 25912716 PMCID: PMC4411693 DOI: 10.3390/v7042126] [Citation(s) in RCA: 15] [Impact Index Per Article: 1.7] [Received: 02/27/2015] [Revised: 03/24/2015] [Accepted: 04/17/2015] [Indexed: 11/20/2022] Open
Abstract
To investigate gene loss in poxviruses belonging to the Chordopoxvirinae subfamily, we assessed the gene content of representative members of the subfamily, and determined whether individual genes present in each genome were intact, truncated, or fragmented. When nonintact genes were identified, the early stop mutations (ESMs) leading to gene truncation or fragmentation were analyzed. Of all the ESMs present in these poxvirus genomes, over 65% co-localized with microsatellites—simple sequence nucleotide repeats. On average, microsatellites comprise 24% of the nucleotide sequence of these poxvirus genomes. These simple repeats have been shown to exhibit high rates of variation, and represent a target for poxvirus protein variation, gene truncation, and reductive evolution.
Affiliation(s)
- Eneida L Hatcher
  - Department of Microbiology, University of Alabama at Birmingham, BBRB 276/11, 845 19th St S, Birmingham, AL 35222, USA.
- Chunlin Wang
  - Stanford Genome Technology Center, Stanford University, 855 California Ave, Palo Alto, CA 94304, USA.
- Elliot J Lefkowitz
  - Department of Microbiology, University of Alabama at Birmingham, BBRB 276/11, 845 19th St S, Birmingham, AL 35222, USA.
|
11
|
Genome wide survey of microsatellites in ssDNA viruses infecting vertebrates. Gene 2014; 552:209-18. [DOI: 10.1016/j.gene.2014.09.032] [Citation(s) in RCA: 8] [Impact Index Per Article: 0.8] [Received: 05/12/2014] [Revised: 08/15/2014] [Accepted: 09/15/2014] [Indexed: 01/26/2023]
|
12
|
The analysis of microsatellites and compound microsatellites in 56 complete genomes of Herpesvirales. Gene 2014; 551:103-9. [DOI: 10.1016/j.gene.2014.08.054] [Citation(s) in RCA: 12] [Impact Index Per Article: 1.2] [Received: 06/07/2014] [Revised: 08/09/2014] [Accepted: 08/26/2014] [Indexed: 01/13/2023]
|
13
|
Qin L, Ma Y, Liang P, Tan Z, Li S. Differential distributions of mononucleotide repeat sequences in 256 viral genomes and its potential implications. Gene 2014; 544:159-64. [PMID: 24786215 DOI: 10.1016/j.gene.2014.04.063] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 02/27/2014] [Revised: 04/14/2014] [Accepted: 04/26/2014] [Indexed: 11/18/2022]
Abstract
Mononucleotide repeats (MNRs) have been systematically investigated in the genomes of eukaryotic and prokaryotic organisms. However, detailed information on the distribution of MNRs in viral genomes is limited. In this study, we examined the distributions of MNRs in 256 fully sequenced virus genomes. The distributions showed extensive variation across viral genomes and were significantly influenced by both genome size and CG content. Furthermore, the ratio of the observed to the expected number of MNRs (O/E ratio) appears to be influenced by both the host range and genome type of a particular virus. Additionally, the densities and frequencies of MNRs in genic regions are lower than in non-coding regions, suggesting that selective pressure acts on viral genomes. We also discuss the potential functional roles that these MNR loci could play in virus genomes. To our knowledge, this is the first analysis focusing on MNRs in viruses, and our study could have potential implications for a deeper understanding of virus genome stability and the co-evolution that occurs between a virus and its host.
Affiliation(s)
- Lü Qin
  - State Key Laboratory of Biology of Plant Diseases and Insect Pests, Institute of Plant Protection, Chinese Academy of Agricultural Sciences, Beijing 100193, China; College of Biology, State Key Laboratory for Chemo/Biosensing and Chemometrics, Hunan University, Changsha 410082, China
- Yuxin Ma
  - State Key Laboratory of Biology of Plant Diseases and Insect Pests, Institute of Plant Protection, Chinese Academy of Agricultural Sciences, Beijing 100193, China
- Pengbo Liang
  - College of Agronomy and Biotechnology, China Agricultural University, Beijing 100094, China
- Zhongyang Tan
  - State Key Laboratory of Biology of Plant Diseases and Insect Pests, Institute of Plant Protection, Chinese Academy of Agricultural Sciences, Beijing 100193, China; College of Biology, State Key Laboratory for Chemo/Biosensing and Chemometrics, Hunan University, Changsha 410082, China.
- Shifang Li
  - State Key Laboratory of Biology of Plant Diseases and Insect Pests, Institute of Plant Protection, Chinese Academy of Agricultural Sciences, Beijing 100193, China.
|
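The observed-to-expected (O/E) ratio mentioned in the abstract of entry 13 can be sketched in Python as follows. This is only one possible reading: the abstract does not state how the expected number was derived, so the sketch assumes a simple null model in which bases are drawn independently with the frequencies seen in the sequence. The function names, the minimum tract length of 6, and the toy sequence are likewise hypothetical.

```python
import re


def observed_mnr_count(seq, base, min_len):
    """Count maximal runs of `base` that are at least `min_len` long."""
    pattern = "%s{%d,}" % (re.escape(base), min_len)
    return len(re.findall(pattern, seq.upper()))


def expected_mnr_count(seq, base, min_len):
    """Expected number of such runs under an i.i.d. base-composition model.

    ASSUMPTION: each position is `base` with probability p, independently.
    A run of length >= min_len starts at position 1 with probability p**k,
    and at any later position with probability (1 - p) * p**k, where the
    factor (1 - p) requires the preceding base to differ.
    """
    n = len(seq)
    if n < min_len or n == 0:
        return 0.0
    p = seq.upper().count(base) / n
    return (p ** min_len) * (1 + (n - min_len) * (1 - p))


def oe_ratio(seq, base, min_len=6):
    """Observed / expected number of mononucleotide repeats for one base."""
    expected = expected_mnr_count(seq, base, min_len)
    if expected == 0:
        return float("nan")
    return observed_mnr_count(seq, base, min_len) / expected


# Toy sequence (hypothetical): two poly-A tracts of length 6.
toy = "AAAAAACGTAAAAAAT"
print(observed_mnr_count(toy, "A", 6))  # 2
print(oe_ratio(toy, "A", 6))
```

A ratio above 1 would suggest that tracts are more frequent than base composition alone predicts under this null model; other null models (for example, shuffled sequences or Markov models) would give different expected values.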
14
|
George B, Gnanasekaran P, Jain SK, Chakraborty S. Genome wide survey and analysis of small repetitive sequences in caulimoviruses. Infection, Genetics and Evolution 2014; 27:15-24. [PMID: 24999243 DOI: 10.1016/j.meegid.2014.06.018] [Citation(s) in RCA: 5] [Impact Index Per Article: 0.5] [Received: 02/06/2014] [Revised: 06/01/2014] [Accepted: 06/22/2014] [Indexed: 12/19/2022]
Abstract
Microsatellites are known to be ubiquitous across all kingdoms of life, including viruses. Members of the Caulimoviridae family severely affect growth of vegetable and fruit plants and reduce economic yield in diverse cropping systems worldwide. Here, we analyzed the nature and distribution of both simple and complex microsatellites present in the complete genomes of 44 species of Caulimoviridae. Our results showed that, in all analyzed genomes, genome size and GC content had a weak influence on the number, relative abundance and relative density of microsatellites. For each genome, mono- and dinucleotide repeats were found to be highly predominant and are overrepresented in the genomes of the majority of caulimoviruses. AT/TA and GAA/AAG/AGA were the most abundant di- and trinucleotide repeat motifs, respectively. Repeats larger than trinucleotide were rarely found in these genomes. Comparative study of the occurrence, abundance and density of microsatellites among available RNA and DNA viral genomes indicated that simple repeats were least abundant in genomes of caulimoviruses. Polymorphic repeats, even though rare, were observed in the large intergenic region of the genome, indicating that strand slippage and/or unequal recombination processes do occur in caulimoviruses. To our knowledge, this is the first analysis of microsatellites occurring in any dsDNA viral genome. Characterization of such variations in repeat sequences would be important in deciphering the origin, mutational processes, and role of repeat sequences in viral genomes.
Affiliation(s)
- Biju George
  - Molecular Virology Laboratory, School of Life Sciences, Jawaharlal Nehru University, New Delhi 110067, India
- Prabu Gnanasekaran
  - Molecular Virology Laboratory, School of Life Sciences, Jawaharlal Nehru University, New Delhi 110067, India
- S K Jain
  - Department of Biotechnology, Jamia Hamdard University, New Delhi, Delhi 110062, India
- Supriya Chakraborty
  - Molecular Virology Laboratory, School of Life Sciences, Jawaharlal Nehru University, New Delhi 110067, India.
|
15
|
Chen M, Tan Z, Zeng G, Zeng Z. Differential distribution of compound microsatellites in various Human Immunodeficiency Virus Type 1 complete genomes. Infection, Genetics and Evolution 2012; 12:1452-7. [DOI: 10.1016/j.meegid.2012.05.006] [Citation(s) in RCA: 27] [Impact Index Per Article: 2.3] [Received: 08/23/2011] [Revised: 05/04/2012] [Accepted: 05/12/2012] [Indexed: 12/21/2022]
|
16
|
Zhao X, Tian Y, Yang R, Feng H, Ouyang Q, Tian Y, Tan Z, Li M, Niu Y, Jiang J, Shen G, Yu R. Coevolution between simple sequence repeats (SSRs) and virus genome size. BMC Genomics 2012; 13:435. [PMID: 22931422 PMCID: PMC3585866 DOI: 10.1186/1471-2164-13-435] [Citation(s) in RCA: 36] [Impact Index Per Article: 3.0] [Received: 04/18/2012] [Accepted: 08/18/2012] [Indexed: 12/26/2022] Open
Abstract
Background: The relationship between the level of repetitiveness in genomic sequence and genome size has been investigated using complete prokaryotic and eukaryotic genomes, but such studies have rarely been made in virus genomes. Results: In this study, a total of 257 viruses were examined, which cover 90% of genera. The results showed that the occurrence of simple sequence repeats (SSRs) is strongly, positively and significantly correlated with genome size. Each repeat class is distributed within a certain range of genome sequence length. Mono-, di- and tri- repeats are widely distributed in all virus genomes; tetra- SSRs are a common component of genomes more than 100 kb in size; among genomes < 100 kb, no more than 50% contain penta- and hexa- SSRs. Principal components analysis (PCA) indicated that dinucleotide repeats contribute most strongly to the differences in SSRs among virus genomes. The results also showed that SSRs tend to accumulate in larger virus genomes, and the longer the genome sequence, the longer the repeat units. Conclusions: We conducted this research at the level of viruses as a whole. We concluded that genome size is an important factor affecting the occurrence of SSRs; hosts are also responsible for the variance in SSR content to a certain degree.
Affiliation(s)
- Xiangyan Zhao
  - Chinese Academy of Inspection and Quarantine, Beijing, 100029, China
|
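The kind of correlation reported in entry 16 (SSR occurrence against genome size) can be illustrated with a small Pearson correlation sketch in Python. The abstract does not say which statistic or software the authors used, so Pearson's r is an assumption here, and the genome sizes and SSR counts in the usage lines are made-up placeholder values, not data from any paper in this listing.

```python
import math


def pearson_r(xs, ys):
    """Pearson correlation coefficient between two equal-length sequences."""
    n = len(xs)
    if n != len(ys) or n < 2:
        raise ValueError("need two sequences of equal length >= 2")
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    syy = sum((y - mean_y) ** 2 for y in ys)
    if sxx == 0 or syy == 0:
        raise ValueError("correlation is undefined for a constant sequence")
    return sxy / math.sqrt(sxx * syy)


def r_squared(xs, ys):
    """Coefficient of determination for a simple linear fit of ys on xs."""
    return pearson_r(xs, ys) ** 2


# HYPOTHETICAL placeholder values, for illustration only.
genome_size_kb = [10, 20, 30, 40]
ssr_counts = [12, 25, 31, 48]
print(pearson_r(genome_size_kb, ssr_counts))
print(r_squared(genome_size_kb, ssr_counts))
```

For a simple linear regression with one predictor, the R² values quoted in such abstracts equal the square of Pearson's r, which is why the sketch derives one from the other.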
17
|
Differential distribution and occurrence of simple sequence repeats in diverse geminivirus genomes. Virus Genes 2012; 45:556-66. [DOI: 10.1007/s11262-012-0802-1] [Citation(s) in RCA: 21] [Impact Index Per Article: 1.8] [Received: 05/28/2012] [Accepted: 07/31/2012] [Indexed: 01/13/2023]
|