1. Panda S, Swain SK, Sahu BP, Sarangi R. Insights into genome plasticity and gene regulation in Orientia tsutsugamushi through genome-wide mining of microsatellite markers. 3 Biotech 2023; 13:366. [PMID: 37840877 PMCID: PMC10575825 DOI: 10.1007/s13205-023-03795-6] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 03/01/2023] [Accepted: 09/25/2023] [Indexed: 10/17/2023]
Abstract
Microsatellite markers are used for molecular identification and characterization, as well as for estimating evolutionary patterns, owing to their highly polymorphic nature. Repeats make up 40% of the entire genome of Orientia tsutsugamushi (OT) but have not yet been characterized. Thus, we investigated the genome-wide presence of microsatellites within nine complete genomes of OT and analyzed their distribution pattern, composition, and complexity. The in-silico study revealed that the OT genome is enriched with microsatellites, with a total of 126,187 SSRs and 10,374 cSSRs throughout the genome, of which 70% and 30% lie within the coding and non-coding regions, respectively. The relative density (RD) and relative abundance (RA) of SSRs were 42-44.43/kb and 6.25-6.59/kb, while for cSSRs these values ranged from 7.06 to 8.1/kb and 0.50 to 0.55/kb, respectively. However, RA and RD were weakly correlated with genome size and incidence of microsatellites. Mononucleotide repeats (54.55%) were prevalent over di- (33.22%), tri- (11.88%), tetra- (0.27%), penta- (0.02%), and hexanucleotide (0.04%) repeats, with poly (A/T) richness over poly (G/C). The motif composition of cSSRs revealed that most cSSRs were made up of two microsatellites with unique duplication patterns such as AT-x-AT and CG-x-CG. To our knowledge, this is the first study of microsatellites in the OT genome; characterizing such variation in repeat sequences is important for deciphering the origin, rate of mutation, and role of repeat sequences in the genome. The greater number of microsatellites within the coding region provides insight into genome plasticity that may interfere with gene regulation to mitigate host-pathogen interaction and evolution of the species.
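The RA and RD figures quoted in abstracts like this one can be illustrated with a small sketch. It assumes the definitions commonly used in this literature: RA is the number of microsatellites per kb of sequence, and RD is the total length of microsatellites (bp) per kb. The regex-based scanner, the `MIN_REPEATS` thresholds, and all function names are hypothetical stand-ins for illustration; they are not the tool or the settings used by the authors.

```python
import re

# Hypothetical minimum repeat counts per motif length (mono- to hexanucleotide).
# Real studies choose their own thresholds; these are assumptions.
MIN_REPEATS = {1: 6, 2: 3, 3: 3, 4: 3, 5: 3, 6: 3}


def find_ssrs(seq, min_repeats=MIN_REPEATS):
    """Return sorted (start, end, motif) tuples for perfect tandem repeats.

    `end` is exclusive, so `end - start` is the repeat length in bp.
    """
    seq = seq.upper()
    hits = []
    for size, reps in sorted(min_repeats.items()):
        # A motif of `size` bases followed by at least `reps - 1` exact copies.
        pattern = re.compile(r"([ACGT]{%d})\1{%d,}" % (size, reps - 1))
        for m in pattern.finditer(seq):
            motif = m.group(1)
            # Skip motifs that are repeats of a shorter unit (e.g. "AA", "ATAT");
            # those tracts are reported under the shorter motif size instead.
            if any(motif == motif[:k] * (size // k)
                   for k in range(1, size) if size % k == 0):
                continue
            hits.append((m.start(), m.end(), motif))
    return sorted(hits)


def relative_abundance(ssrs, genome_len):
    """Number of SSRs per kb of sequence."""
    return len(ssrs) * 1000 / genome_len


def relative_density(ssrs, genome_len):
    """Total bp covered by SSRs per kb of sequence."""
    return sum(end - start for start, end, _ in ssrs) * 1000 / genome_len


demo = "AAAAAAGC" + "ATATAT" + "GCTAGC"   # 20 bp toy sequence
ssrs = find_ssrs(demo)
print(ssrs)                                 # [(0, 6, 'A'), (8, 14, 'AT')]
print(relative_abundance(ssrs, len(demo)))  # 100.0
print(relative_density(ssrs, len(demo)))    # 600.0
```

The toy values are far larger than those of real genomes because the demo sequence is only 20 bp and mostly repeats; the point is only how the two ratios are formed.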
Affiliation(s)
- Subhasmita Panda
  - Department of Pediatrics, IMS and SUM Hospital, Siksha ‘O’ Anusandhan (Deemed to be University), K8, Kalinga Nagar, Bhubaneswar, Odisha 751003 India
- Subrat Kumar Swain
  - Medical Research Laboratory, IMS and SUM Hospital, Siksha ‘O’ Anusandhan (Deemed to be University), K8, Kalinga Nagar, Bhubaneswar, Odisha 751003 India
- Basanta Pravas Sahu
  - School of Biological Sciences, The University of Hong Kong, Pokfulam, Hong Kong
- Rachita Sarangi
  - Department of Pediatrics, IMS and SUM Hospital, Siksha ‘O’ Anusandhan (Deemed to be University), K8, Kalinga Nagar, Bhubaneswar, Odisha 751003 India
2. Sahu BP, Majee P, Singh RR, Sahoo N, Nayak D. Recombination drives the emergence of orf virus diversity: evidence from the first complete genome sequence of an Indian orf virus isolate and comparative genomic analysis. Arch Virol 2022; 167:1571-1576. [PMID: 35546377 PMCID: PMC9094603 DOI: 10.1007/s00705-022-05443-5] [Citation(s) in RCA: 5] [Impact Index Per Article: 2.5] [Received: 11/27/2021] [Accepted: 03/07/2022] [Indexed: 11/02/2022]
Abstract
Contagious pustular dermatitis is a disease that primarily infects small ruminants and possesses zoonotic potential. It is caused by orf virus (ORFV), a member of the genus Parapoxvirus. In this study, we evaluated an ORFV outbreak in goats in Madhya Pradesh, a state in central India, during 2017. The transboundary potential of this virus was evaluated by constructing phylogenetic trees. The complete genome sequence of an ORFV isolate named Ind/MP/17 was found to be 139,807 bp in length with 63.7% GC content and 132 open reading frames (ORFs) flanked by 3,910-bp inverted terminal repeats (ITRs). An investigation into evolutionary parameters such as selection pressure (θ = dN/dS) and nucleotide diversity (π) demonstrated that ORFV has undergone purifying selection. A total of 40 recombination events were identified, 21 of which were evident in the Ind/MP/17 genome, indicating its ability to generate new variants.
Affiliation(s)
- Basanta Pravas Sahu
  - Department of Biological Sciences and Biomedical Engineering, Indian Institute of Technology Indore, Indore, MP, 453331, India
- Prativa Majee
  - Department of Biological Sciences and Biomedical Engineering, Indian Institute of Technology Indore, Indore, MP, 453331, India
- Ravi Raj Singh
  - Department of Biological Sciences and Biomedical Engineering, Indian Institute of Technology Indore, Indore, MP, 453331, India
- Niranjana Sahoo
  - Centre for Wildlife Health, College of Veterinary Sciences and Animal Husbandry, Bhubaneswar, 751030, India
- Debasis Nayak
  - Department of Biological Science, Indian Institute of Science Education and Research (IISER) Bhopal, Bhopal Bypass Road, Bhauri, Madhya Pradesh, 462066, India
3. Sahu BP, Majee P, Singh RR, Sahoo N, Nayak D. Genome-wide identification and characterization of microsatellite markers within the Avipoxviruses. 3 Biotech 2022; 12:113. [PMID: 35497507 PMCID: PMC9008116 DOI: 10.1007/s13205-022-03169-4] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 01/03/2022] [Accepted: 03/19/2022] [Indexed: 11/01/2022]
Abstract
Microsatellite markers, or Simple Sequence Repeats (SSRs), are gaining importance for molecular characterization of viruses as well as for estimating evolutionary patterns, owing to their highly polymorphic nature. Avipoxvirus is the causative agent of pox-like lesions in more than 300 birds and one of the major diseases contributing to the extinction of endangered avian species. Therefore, we conducted a genome-wide analysis to decipher the type and distribution pattern of microsatellites in 14 complete genomes from the Avipoxvirus genus. The in-silico screening revealed 917-2632 SSRs per strain. For compound SSRs (cSSRs), the count was 44-255 per genome. Our analysis indicates that dinucleotide repeats (52.74%) are the most abundant, followed by mononucleotide (34.79%), trinucleotide (11.57%), tetranucleotide (0.64%), pentanucleotide (0.12%), and hexanucleotide (0.15%) repeats. The Relative Abundance (RA) and Relative Density (RD) of microsatellites ranged within 5.5-8.12 and 33.08-53.58 bp/kb, respectively. The RA and RD of compound microsatellites ranged between 0.25-0.82 and 4.64-15.12 bp/kb, respectively. The motif composition of cSSRs revealed that most compound microsatellites were made up of two microsatellites, with some unique duplicated motif patterns such as (TA)-x-(TA) and (TCA)-x-(TCA), and self-complementary motifs such as (TA)-x-(AT). Finally, we validated forty sets of compound microsatellite markers through an in-vitro approach using clinical specimens and mapped the sequencing products against the database through comparative genomics approaches. Supplementary Information: The online version contains supplementary material available at 10.1007/s13205-022-03169-4.
4. Jain A, Sharma PC. Occurrence and distribution of compound microsatellites in the genomes of three economically important virus families. Infection, Genetics and Evolution 2021; 92:104853. [PMID: 33839312 DOI: 10.1016/j.meegid.2021.104853] [Citation(s) in RCA: 2] [Impact Index Per Article: 0.7] [Received: 01/17/2021] [Revised: 04/01/2021] [Accepted: 04/04/2021] [Indexed: 11/15/2022]
Abstract
Microsatellites are nonrandom hypervariable iterations of one to six nucleotides, existing across the coding as well as noncoding regions of virtually all known genomes, arising primarily due to polymerase slippage and unequal crossing over during replication events. Two or more perfect microsatellites located in close proximity form compound microsatellites. We studied the distribution of compound microsatellites in 118 ssDNA virus genomes belonging to three economically important virus families, namely Anelloviridae, Circoviridae, and Parvoviridae, known to predominantly infect livestock and humans. Among these virus families, 0-58.49% of perfect microsatellites were involved in the formation of compound microsatellites, the majority being located in the coding regions. No clear relationship existed between the genomic features (genome size and GC%) and compound microsatellite characteristics (relative abundance and relative density). The majority of the compound microsatellites resulted from di-SSR couples. A strong positive relationship was observed between the maximum distance value and length of compound microsatellite, percentage of microsatellites involved in the compound microsatellite formation, and relative microsatellite density. The degree of variability among microsatellite characteristics studied was largely a species-specific phenomenon. A major proportion of compound microsatellites was represented by similar motif combinations. The findings of the present study will help in better understanding of the structural, functional, and evolutionary role of compound microsatellites prevailing in the smaller genomes.
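The definition above (two or more perfect microsatellites in close proximity forming a compound microsatellite) can be sketched as a simple grouping step. This is a minimal illustration, not the authors' procedure: the function names are hypothetical, and the maximum allowed gap `dmax` of 10 bp is an assumed default, since the abstract only refers to a "maximum distance value" without stating one.

```python
def group_compound(ssrs, dmax=10):
    """Group SSRs into compound microsatellites.

    `ssrs` is a list of (start, end, motif) tuples with `end` exclusive.
    Neighbouring SSRs are joined when the gap between them is <= dmax bp
    (dmax=10 is an assumption). Only groups of two or more SSRs are returned.
    """
    groups, current = [], []
    for ssr in sorted(ssrs):
        if current and ssr[0] - current[-1][1] <= dmax:
            current.append(ssr)
        else:
            if len(current) >= 2:
                groups.append(current)
            current = [ssr]
    if len(current) >= 2:
        groups.append(current)
    return groups


def motif_pattern(group):
    """Render a group in the (motif)-x-(motif) notation used in these abstracts."""
    return "-x-".join("(%s)" % motif for _, _, motif in group)


ssrs = [(0, 6, "A"), (8, 14, "AT"), (40, 46, "CG"), (100, 109, "TCA")]
groups = group_compound(ssrs, dmax=10)
print(groups)                    # [[(0, 6, 'A'), (8, 14, 'AT')]]
print(motif_pattern(groups[0]))  # (A)-x-(AT)
```

Raising `dmax` merges more neighbours into the same locus, which is why the abstract reports that the maximum distance value is related to compound microsatellite length and to the share of microsatellites involved in compound formation.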
Affiliation(s)
- Ankit Jain
  - Merck Life Science Pvt. Ltd, Sector-17, Chandigarh, India
- Prakash C Sharma
  - University School of Biotechnology, Guru Gobind Singh Indraprastha University, Dwarka Sector-16 C, New Delhi 11078, India
5. Comparative analysis, distribution, and characterization of microsatellites in Orf virus genome. Sci Rep 2020; 10:13852. [PMID: 32807836 PMCID: PMC7431841 DOI: 10.1038/s41598-020-70634-6] [Citation(s) in RCA: 2] [Impact Index Per Article: 0.5] [Received: 03/01/2019] [Accepted: 07/01/2020] [Indexed: 11/09/2022]
Abstract
Genome-wide in-silico identification of microsatellites, or simple sequence repeats (SSRs), in the Orf virus (ORFV), the causative agent of contagious ecthyma, has been carried out to investigate their type, distribution, and potential role in genome evolution. We investigated eleven ORFV strains and found 1,036-1,181 microsatellites per strain. Further screening revealed 83-107 compound SSRs (cSSRs) per genome. Our analysis indicates that dinucleotide repeats (76.9%) are the most abundant, followed by trinucleotide (17.7%), mononucleotide (4.9%), tetranucleotide (0.4%), and hexanucleotide (0.2%) repeats. The Relative Abundance (RA) and Relative Density (RD) of these SSRs varied between 7.6-8.4 and 53.0-59.5 bp/kb, respectively, while for cSSRs the RA and RD ranged from 0.6-0.8 and 12.1-17.0 bp/kb, respectively. Regression analysis showed that all parameters, such as the incidence of SSRs, RA, and RD, were significantly correlated with GC content. With genome size, however, all parameters except SSR incidence were non-significantly correlated. Nearly all cSSRs were composed of two microsatellites, which showed no bias toward a particular motif. Motif duplication patterns, such as (C)-x-(C), (TG)-x-(TG), (AT)-x-(AT), and (TC)-x-(TC), and self-complementary motifs, such as (GC)-x-(CG), (TC)-x-(AG), (GT)-x-(CA), and (TC)-x-(AG), were observed in the cSSRs. Finally, in-silico polymorphism was assessed, followed by in-vitro validation using PCR analysis and sequencing. The thirteen polymorphic SSR markers developed in this study were further characterized by mapping against the sequences present in the database. The results of the present study indicate that these SSRs could be a useful tool for identification, analysis of genetic diversity, and understanding the evolutionary status of the virus.
6. Alam CM, Iqbal A, Sharma A, Schulman AH, Ali S. Microsatellite Diversity, Complexity, and Host Range of Mycobacteriophage Genomes of the Siphoviridae Family. Front Genet 2019; 10:207. [PMID: 30923537 PMCID: PMC6426759 DOI: 10.3389/fgene.2019.00207] [Citation(s) in RCA: 12] [Impact Index Per Article: 2.4] [Received: 05/04/2018] [Accepted: 02/26/2019] [Indexed: 01/21/2023]
Abstract
The incidence, distribution, and variation of simple sequence repeats (SSRs) in viruses are instrumental in understanding the functional and evolutionary aspects of repeat sequences. Full-length genome sequences retrieved from NCBI were used for extraction and analysis of repeat sequences using IMEx software. We have also developed two MATLAB-based tools for extraction of gene locations from GenBank in tabular format and simulation of these data with SSR incidence data. The present study, encompassing 147 Mycobacteriophage genomes, revealed 25,284 SSRs and 1,127 compound SSRs (cSSRs) through IMEx. Mono- to hexa-nucleotide motifs were present. The SSR count per genome ranged from 78 (M100) to 342 (M58), while cSSR incidence ranged from 1 (M138) to 17 (M28, M73). Though cSSRs were present in all the genomes, their frequency and SSR to cSSR conversion percentage varied from 1.08 (M138 with 93 SSRs) to 8.33 (M116 with 96 SSRs). In terms of localization, the SSRs were predominantly localized to coding regions (∼78%). Interestingly, genomes of around 50 kb contained a similar number of SSRs/cSSRs to that in a 110 kb genome, suggesting functional relevance for SSRs, which was substantiated by variation in motif constitution between species with different host range. The three species with broad host range (M97, M100, M116) have around 90% of their mono-nucleotide repeat motifs composed of G or C, and only M16 has both A and T mononucleotide motifs. Around 20% of the di-nucleotide repeat motifs in the genomes exhibiting a broad host range were CT/TC, which were either absent or represented to a much lesser extent in the other genomes.
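The conversion percentages quoted above can be checked with a short calculation. The sketch assumes that the SSR to cSSR conversion percentage is the cSSR count divided by the SSR count, times 100; the abstract does not state the formula, but its figures are consistent with it. The function name is hypothetical, and the cSSR count of 8 for M116 is inferred from the reported percentage rather than stated in the abstract.

```python
def cssr_percent(n_cssr, n_ssr):
    """Assumed definition: cSSR count as a percentage of SSR count."""
    return 100.0 * n_cssr / n_ssr


# M138: 1 cSSR and 93 SSRs are both given in the abstract.
print(round(cssr_percent(1, 93), 2))  # 1.08

# M116: 96 SSRs is given; 8 cSSRs is an inference from the reported 8.33%.
print(round(cssr_percent(8, 96), 2))  # 8.33
```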
Affiliation(s)
- Chaudhary Mashhood Alam
  - Luke/BI Plant Genome Dynamics Lab, Institute of Biotechnology and Viikki Plant Science Centre, University of Helsinki, Helsinki, Finland; Ingenious e-Brain Solutions, Gurugram, India
- Asif Iqbal
  - PIRO Technologies Private Limited, New Delhi, India
- Anjana Sharma
  - Department of Biomedical Sciences, SRCASW, University of Delhi, New Delhi, India
- Alan H Schulman
  - Luke/BI Plant Genome Dynamics Lab, Institute of Biotechnology and Viikki Plant Science Centre, University of Helsinki, Helsinki, Finland; Natural Resources Institute Finland (Luke), Helsinki, Finland
- Safdar Ali
  - Department of Biomedical Sciences, SRCASW, University of Delhi, New Delhi, India; Department of Biological Sciences, Aliah University, Kolkata, India
7. Comparative analysis on precise distribution-patterns of microsatellites in HIV-1 with differential statistical method. Gene Reports 2018. [DOI: 10.1016/j.genrep.2018.06.007] [Citation(s) in RCA: 3] [Impact Index Per Article: 0.5] [Indexed: 11/30/2022]
8. Survey and analysis of simple sequence repeats (SSRs) in three genomes of Candida species. Gene 2016; 584:129-35. [PMID: 26883055 DOI: 10.1016/j.gene.2016.02.018] [Citation(s) in RCA: 5] [Impact Index Per Article: 0.6] [Received: 10/26/2015] [Revised: 01/15/2016] [Accepted: 02/12/2016] [Indexed: 11/23/2022]
Abstract
Simple sequence repeats (SSRs), or microsatellites, which are composed of tandemly repeated short units of 1-6 bp, have attracted continuous attention. Here, the distribution, composition, and polymorphism of microsatellites and compound microsatellites were analyzed in three available genomes of Candida species (Candida dubliniensis, Candida glabrata and Candida orthopsilosis). The results show that there were 118,047, 66,259 and 61,119 microsatellites in the genomes of C. dubliniensis, C. glabrata and C. orthopsilosis, respectively. The SSRs covered more than 1/3 of the genome length in the three species. Microsatellites consisting only of bases A and/or T, such as (A)n, (T)n, (AT)n, (TA)n, (AAT)n, (TAA)n, (TTA)n, (ATA)n, (ATT)n and (TAT)n, were predominant in the three genomes. Microsatellite lengths were concentrated at 6 bp and 9 bp, both in the three genomes and in their coding sequences. Moreover, the relative abundance (19.89/kbp) and relative density (167.87 bp/kbp) of SSRs in the mitochondrial sequence of C. glabrata were significantly greater than those in any of the genomes or chromosomes of the three species. In addition, the distance between any two adjacent microsatellites was an important factor influencing the formation of compound microsatellites. The analysis may be helpful for further study of the roles of microsatellites in the origination, organization, and evolution of the genomes of Candida species.
9. Comparative analysis of microsatellites and compound microsatellites in T4-like viruses. Gene 2016; 575:695-701. [DOI: 10.1016/j.gene.2015.09.053] [Citation(s) in RCA: 4] [Impact Index Per Article: 0.5] [Received: 01/22/2015] [Revised: 09/16/2015] [Accepted: 09/21/2015] [Indexed: 01/27/2023]
10. Genome wide survey of microsatellites in ssDNA viruses infecting vertebrates. Gene 2014; 552:209-18. [DOI: 10.1016/j.gene.2014.09.032] [Citation(s) in RCA: 8] [Impact Index Per Article: 0.8] [Received: 05/12/2014] [Revised: 08/15/2014] [Accepted: 09/15/2014] [Indexed: 01/26/2023]
11. The analysis of microsatellites and compound microsatellites in 56 complete genomes of Herpesvirales. Gene 2014; 551:103-9. [DOI: 10.1016/j.gene.2014.08.054] [Citation(s) in RCA: 12] [Impact Index Per Article: 1.2] [Received: 06/07/2014] [Revised: 08/09/2014] [Accepted: 08/26/2014] [Indexed: 01/13/2023]
12. Alam CM, Singh AK, Sharfuddin C, Ali S. In-silico exploration of thirty alphavirus genomes for analysis of the simple sequence repeats. Meta Gene 2014; 2:694-705. [PMID: 25606453 PMCID: PMC4287844 DOI: 10.1016/j.mgene.2014.09.005] [Citation(s) in RCA: 7] [Impact Index Per Article: 0.7] [Received: 06/17/2014] [Revised: 09/08/2014] [Accepted: 09/10/2014] [Indexed: 11/29/2022]
Abstract
The compilation of simple sequence repeats (SSRs) in viruses and their analysis with reference to incidence, distribution, and variation would be instrumental in understanding the functional and evolutionary aspects of repeat sequences. The present study encompasses the analysis of SSRs across 30 species of alphaviruses. The full-length genome sequences, accessed from NCBI, were used for extraction and analysis of repeat sequences using IMEx software. The repeats of different motif sizes (mono- to penta-nucleotide) observed therein exhibited variable incidence across the species. As expected, mononucleotide A/T was the most prevalent, followed by dinucleotide AG/GA and trinucleotide AAG/GAA in these genomes. The conversion of SSRs to imperfect microsatellites or compound microsatellites (cSSRs) is low. cSSRs, primarily constituted by variant motifs, accounted for up to 12.5% of the SSRs. Interestingly, seven species lacked cSSRs in their genomes. However, the SSRs and cSSRs are predominantly localized to the coding region ORFs for non-structural and structural proteins. The relative frequencies of different classes of simple and compound microsatellites within and across genomes have been highlighted. This is the first analysis of SSRs and cSSRs in alphaviruses. We analysed differential frequency and distribution patterns of SSRs and cSSRs. We studied localization of SSRs and cSSRs in alphavirus proteomics. This study would help in better understanding the evolutionary biology of alphaviruses.
Affiliation(s)
- Avadhesh Kumar Singh
  - Department of Biomedical Sciences, SRCASW, University of Delhi, Vasundhara Enclave, New Delhi 110096, India
- Safdar Ali
  - Department of Biomedical Sciences, SRCASW, University of Delhi, Vasundhara Enclave, New Delhi 110096, India