1
Tseng MC, Lee YH, Yen TB, Li SM. Genome-wide characterization of microsatellites in cobia Rachycentron canadum (Linnaeus, 1766): Survey and analysis of their abundance and diversity. Journal of Fish Biology 2024; 104:44-55. [PMID: 37658731 DOI: 10.1111/jfb.15552] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 06/05/2023] [Revised: 08/23/2023] [Accepted: 08/31/2023] [Indexed: 09/05/2023]
Abstract
The cobia Rachycentron canadum, mainly distributed in the warm waters of tropical and subtropical regions around the world, remains a fish of considerable economic importance. The detailed diversity and number of microsatellite sequences in the cobia genome remain poorly understood. The primary aim of this work was to identify and quantify the various SSR sequences in the cobia genome. More than 280,000 sequences were sequenced and screened using next-generation sequencing technology and microsatellite identification. Among perfect mononucleotide, dinucleotide, and trinucleotide microsatellites, the most numerous motifs were (A)10/(T)10, (AC)6/(TG)6, and (AAT)5-32, respectively. Among tetranucleotide and pentanucleotide microsatellites (TTMs and PTMs), the most numerous motifs were (ATCT)5-32 and (TCAT)5-31 in TTMs and (CTCTC)5-9 in PTMs, whereas hexanucleotide microsatellites are rarely observed in the cobia genome. The c. 38,000 composite microsatellite sequences are extremely diverse, including compound (11.71%), interrupted compound (71.77%), complex (0.45%), and interrupted complex (16.07%) types. In this study, we developed a convenient recording system for writing down and categorizing diverse composite microsatellite types. This system will support the exploration of repeat origins, evolutionary mechanisms, and the application of polymorphic microsatellites.
Affiliation(s)
- Mei-Chen Tseng
- Department of Aquaculture, National Pingtung University of Science and Technology, Pingtung 912, Taiwan, R.O.C
- Yen-Hung Lee
- Tungkang Aquaculture Research Center, Fisheries Research Institute, MOA, Pingtung 928, Taiwan, R.O.C
- Tsair-Bor Yen
- Department of Tropical Agriculture and International Cooperation, National Pingtung University of Science and Technology, Pingtung 912, Taiwan, R.O.C
- Shu-Ming Li
- Department of Aquaculture, National Pingtung University of Science and Technology, Pingtung 912, Taiwan, R.O.C
2
Panda S, Swain SK, Sahu BP, Sarangi R. Insights into genome plasticity and gene regulation in Orientia tsutsugamushi through genome-wide mining of microsatellite markers. 3 Biotech 2023; 13:366. [PMID: 37840877 PMCID: PMC10575825 DOI: 10.1007/s13205-023-03795-6] [Citation(s) in RCA: 1] [Impact Index Per Article: 1.0] [Received: 03/01/2023] [Accepted: 09/25/2023] [Indexed: 10/17/2023] Open
Abstract
Microsatellite markers are used for molecular identification and characterization as well as for estimating evolutionary patterns because of their highly polymorphic nature. Repeats make up 40% of the entire genome of Orientia tsutsugamushi (OT) but have not yet been characterized. We therefore investigated the genome-wide presence of microsatellites within nine complete genomes of OT and analyzed their distribution pattern, composition, and complexity. The in-silico study revealed that the OT genome is enriched with microsatellites, with a total of 126,187 SSRs and 10,374 cSSRs throughout the genome, of which 70% and 30% lie within the coding and non-coding regions, respectively. The relative density (RD) and relative abundance (RA) of SSRs were 42-44.43/kb and 6.25-6.59/kb, while for cSSRs these values ranged from 7.06 to 8.1/kb and 0.50 to 0.55/kb, respectively. However, RA and RD were weakly correlated with genome size and incidence of microsatellites. Mononucleotide repeats (54.55%) were prevalent over di- (33.22%), tri- (11.88%), tetra- (0.27%), penta- (0.02%), and hexanucleotide (0.04%) repeats, with poly (A/T) richness over poly (G/C). The motif composition of cSSRs revealed that most cSSRs were made up of two microsatellites with unique duplication patterns such as AT-x-AT and CG-x-CG. To our knowledge, this is the first study of microsatellites in the OT genome; characterizing such variation in repeat sequences would be important in deciphering the origin, rate of mutation, and role of repeat sequences in the genome. The larger number of microsatellites within the coding region provides an insight into the genome plasticity that may interfere with gene regulation to mitigate host-pathogen interaction and evolution of the species.
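The abstract reports RA and RD without defining them. The sketch below assumes the definitions commonly used in this literature: RA as the number of SSRs per kilobase of sequence, and RD as the number of base pairs covered by SSRs per kilobase. The function names and the example genome are hypothetical and are not taken from the paper.

```python
def relative_abundance(ssr_count: int, genome_length_bp: int) -> float:
    """Number of SSRs per kilobase of sequence (assumed definition)."""
    return ssr_count / (genome_length_bp / 1000)


def relative_density(ssr_total_bp: int, genome_length_bp: int) -> float:
    """Base pairs covered by SSRs per kilobase of sequence (assumed definition)."""
    return ssr_total_bp / (genome_length_bp / 1000)


# Hypothetical genome: 2,000,000 bp with 13,000 SSRs covering 86,000 bp.
ra = relative_abundance(13_000, 2_000_000)
rd = relative_density(86_000, 2_000_000)
print(ra, rd)
```

Under these assumed definitions, dividing RD by RA gives the mean SSR length in base pairs, which is a quick consistency check on reported values.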
Affiliation(s)
- Subhasmita Panda
- Department of Pediatrics, IMS and SUM Hospital, Siksha ‘O’ Anusandhan (Deemed to be University), K8, Kalinga Nagar, Bhubaneswar, Odisha 751003 India
- Subrat Kumar Swain
- Medical Research Laboratory, IMS and SUM Hospital, Siksha ‘O’ Anusandhan (Deemed to be University), K8, Kalinga Nagar, Bhubaneswar, Odisha 751003 India
- Basanta Pravas Sahu
- School of Biological Sciences, The University of Hong Kong, Pokfulam, Hong Kong
- Rachita Sarangi
- Department of Pediatrics, IMS and SUM Hospital, Siksha ‘O’ Anusandhan (Deemed to be University), K8, Kalinga Nagar, Bhubaneswar, Odisha 751003 India
3
Bharti PK, Husai A. Mining and analysis of microsatellites in human coronavirus genomes using the in-house built Java pipeline. Genomics Inform 2022; 20:e35. [PMID: 36239112 PMCID: PMC9576472 DOI: 10.5808/gi.20033] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 05/29/2020] [Accepted: 09/14/2022] [Indexed: 11/20/2022] Open
Abstract
Microsatellites or simple sequence repeats are motifs of 1 to 6 nucleotides in length present in both coding and non-coding regions of DNA. They are widely distributed across the genomes of prokaryotes, eukaryotes, bacteria, and viruses and are used as molecular markers in studies of DNA variation, gene regulation, genetic diversity, and evolution. However, in vitro microsatellite identification is time-consuming and expensive. The present research therefore used an in-house built Java pipeline to identify, analyse, design primers for, and compute related statistics of perfect and compound microsatellites in seven complete coronavirus genome sequences, including the genome of coronavirus disease 2019, where the host is Homo sapiens. Based on the search criteria, the total number of perfect simple sequence repeats (SSRs) across the seven genomic sequences ranged from 76 to 118 and the number of compound SSRs from 1 to 10, reflecting the low conversion of perfect simple sequence repeats to compound repeats. Furthermore, the incidence of SSRs was positively but not significantly correlated with genome size (R2 = 0.45, p > 0.05), with simple sequence repeat relative abundance (R2 = 0.18, p > 0.05), and with relative density (R2 = 0.23, p > 0.05). Dinucleotide repeats were the most abundant in the coding region of the genome, followed by tri-, mono-, and tetranucleotide repeats. This comparative study would help us understand the evolutionary relationship, genetic diversity, and hypervariability in minimal time and cost.
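To make the idea of in-silico identification of perfect microsatellites concrete, here is a minimal regular-expression sketch in Python. It is not the authors' Java pipeline; the function name and the minimum repeat thresholds in `MIN_REPEATS` are hypothetical choices for illustration, since the abstract does not state the search criteria.

```python
import re

# Hypothetical minimum repeat counts per motif length (1 to 6 nt);
# real studies choose their own thresholds.
MIN_REPEATS = {1: 6, 2: 3, 3: 3, 4: 3, 5: 3, 6: 3}


def find_perfect_ssrs(seq, min_repeats=MIN_REPEATS):
    """Return sorted (start, end, motif, repeat_count) tuples, end exclusive."""
    seq = seq.upper()
    hits = []
    for k in sorted(min_repeats):
        n = min_repeats[k]
        # A k-nt motif followed by at least n - 1 further copies of itself.
        pattern = re.compile(r"([ACGT]{%d})\1{%d,}" % (k, n - 1))
        for m in pattern.finditer(seq):
            motif = m.group(1)
            # Skip motifs that are repeats of a shorter unit, e.g. "AA" or "ATAT".
            if any(k % d == 0 and motif == motif[:d] * (k // d) for d in range(1, k)):
                continue
            hits.append((m.start(), m.end(), motif, (m.end() - m.start()) // k))
    return sorted(hits)


example = "GGC" + "A" * 8 + "CGT" + "AC" * 4 + "GG"
print(find_perfect_ssrs(example))
```

Because each motif length is scanned with non-overlapping matches, a discarded non-primitive match can occasionally hide an adjacent repeat; a production tool would handle such edge cases more carefully.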
Affiliation(s)
- P K Bharti
- School of Computer Science, Shri Venkateshwara University, Gajraula 244236, Uttar Pradesh, India
- Akhtar Husai
- Department of Computer Science & IT, MJP Rohilkhand University, Bareilly 243006, Uttar Pradesh, India
4
Sahu BP, Majee P, Singh RR, Sahoo N, Nayak D. Genome-wide identification and characterization of microsatellite markers within the Avipoxviruses. 3 Biotech 2022; 12:113. [PMID: 35497507 PMCID: PMC9008116 DOI: 10.1007/s13205-022-03169-4] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 01/03/2022] [Accepted: 03/19/2022] [Indexed: 11/01/2022] Open
Abstract
Microsatellite markers or Simple Sequence Repeats (SSRs) are gaining importance for molecular characterization of viruses as well as estimation of evolution patterns due to their highly polymorphic nature. Avipoxvirus is the causative agent of pox-like lesions in more than 300 bird species and one of the major diseases contributing to the extinction of endangered avian species. Therefore, we conducted a genome-wide analysis to decipher the type and distribution pattern of SSRs in 14 complete genomes from the Avipoxvirus genus. The in-silico screening revealed 917-2632 SSRs per strain. For compound SSRs (cSSRs), the value was 44-255 per genome. Our analysis indicates that dinucleotide repeats (52.74%) are the most abundant, followed by mononucleotide (34.79%), trinucleotide (11.57%), tetranucleotide (0.64%), pentanucleotide (0.12%), and hexanucleotide (0.15%) repeats. The Relative Abundance (RA) and Relative Density (RD) of microsatellites ranged within 5.5-8.12 and 33.08-53.58 bp/kb, respectively. The RA and RD values of compound microsatellites ranged between 0.25-0.82 and 4.64-15.12 bp/kb, respectively. The analysis of motif composition of cSSRs revealed that most compound microsatellites were made up of two microsatellites, with some unique duplicated motif patterns such as (TA)-x-(TA) and (TCA)-x-(TCA), and self-complementary motifs such as (TA)-x-(AT). Finally, we validated forty sets of compound microsatellite markers through an in-vitro approach utilizing clinical specimens and mapping the sequencing products with the database through comparative genomics approaches. Supplementary Information: The online version contains supplementary material available at 10.1007/s13205-022-03169-4.
5
Jain A, Sharma PC. Occurrence and distribution of compound microsatellites in the genomes of three economically important virus families. Infection, Genetics and Evolution 2021; 92:104853. [PMID: 33839312 DOI: 10.1016/j.meegid.2021.104853] [Citation(s) in RCA: 2] [Impact Index Per Article: 0.7] [Received: 01/17/2021] [Revised: 04/01/2021] [Accepted: 04/04/2021] [Indexed: 11/15/2022]
Abstract
Microsatellites are nonrandom hypervariable iterations of one to six nucleotides, existing across the coding as well as noncoding regions of virtually all known genomes, arising primarily due to polymerase slippage and unequal crossing over during replication events. Two or more perfect microsatellites located in close proximity form compound microsatellites. We studied the distribution of compound microsatellites in 118 ssDNA virus genomes belonging to three economically important virus families, namely Anelloviridae, Circoviridae, and Parvoviridae, known to predominantly infect livestock and humans. Among these virus families, 0-58.49% of perfect microsatellites were involved in the formation of compound microsatellites, the majority being located in the coding regions. No clear relationship existed between the genomic features (genome size and GC%) and compound microsatellite characteristics (relative abundance and relative density). The majority of the compound microsatellites resulted from di-SSR couples. A strong positive relationship was observed between the maximum distance value and length of compound microsatellite, percentage of microsatellites involved in the compound microsatellite formation, and relative microsatellite density. The degree of variability among microsatellite characteristics studied was largely a species-specific phenomenon. A major proportion of compound microsatellites was represented by similar motif combinations. The findings of the present study will help in better understanding of the structural, functional, and evolutionary role of compound microsatellites prevailing in the smaller genomes.
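The definition of a compound microsatellite as two or more perfect microsatellites in close proximity can be sketched as a simple grouping step over already-detected SSRs. In the sketch below, `dmax` stands for the maximum distance value mentioned in the abstract; its default of 10 bp, the function names, and the example coordinates are hypothetical.

```python
def group_compound(ssrs, dmax=10):
    """Group SSRs whose gap to the previous SSR is at most dmax bp.

    ssrs: iterable of (start, end, motif) tuples, end exclusive.
    Returns only groups with two or more members (compound microsatellites).
    """
    groups, current = [], []
    for ssr in sorted(ssrs):
        if current and ssr[0] - current[-1][1] <= dmax:
            current.append(ssr)
        else:
            if len(current) >= 2:
                groups.append(current)
            current = [ssr]
    if len(current) >= 2:
        groups.append(current)
    return groups


def notation(group):
    """Render a group as, e.g., (AT)4-x3-(AT)4, where x3 is a 3 bp spacer."""
    parts = []
    for i, (start, end, motif) in enumerate(group):
        if i:
            parts.append("x%d" % (start - group[i - 1][1]))
        parts.append("(%s)%d" % (motif, (end - start) // len(motif)))
    return "-".join(parts)


ssrs = [(0, 8, "AT"), (11, 19, "AT"), (100, 106, "GC")]
compound = group_compound(ssrs)
print([notation(g) for g in compound])
```

Raising `dmax` can only merge more neighbours into a group, which is one way to see why the share of microsatellites involved in compound formation would be expected to grow with the maximum distance value.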
Affiliation(s)
- Ankit Jain
- Merck Life Science Pvt. Ltd, Sector-17, Chandigarh, India
- Prakash C Sharma
- University School of Biotechnology, Guru Gobind Singh Indraprastha University, Dwarka Sector-16 C, New Delhi 11078, India.
6
Li D, Pan S, Zhang H, Fu Y, Peng Z, Zhang L, Peng S, Xu F, Huang H, Shi R, Zheng H, Peng Y, Tan Z. A comprehensive microsatellite landscape of human Y-DNA at kilobase resolution. BMC Genomics 2021; 22:76. [PMID: 33482734 PMCID: PMC7821415 DOI: 10.1186/s12864-021-07389-5] [Citation(s) in RCA: 2] [Impact Index Per Article: 0.7] [Received: 07/13/2020] [Accepted: 01/13/2021] [Indexed: 12/12/2022] Open
Abstract
Background: Though interest in human simple sequence repeats (SSRs) is increasing, little is known about the exact distributional features of numerous SSRs in human Y-DNA at chromosomal level. Herein, totally 540 maps were established, which could clearly display SSR landscape in every bin of 1 k base pairs (Kbp) along the sequenced part of human reference Y-DNA (NC_000024.10), by our developed differential method for improving the existing method to reveal SSR distributional characteristics in large genomic sequences. Results: The maps show that SSRs accumulate significantly with forming density peaks in at least 2040 bins of 1 Kbp, which involve different coding, noncoding and intergenic regions of the Y-DNA, and 10 especially high density peaks were reported to associate with biological significances, suggesting that the other hundreds of especially high density peaks might also be biologically significant and worth further analyzing. In contrast, the maps also show that SSRs are extremely sparse in at least 207 bins of 1 Kbp, including many noncoding and intergenic regions of the Y-DNA, which is inconsistent with the widely accepted view that SSRs are mostly rich in these regions, and these sparse distributions are possibly due to powerfully regional selection. Additionally, many regions harbor SSR clusters with same or similar motif in the Y-DNA. Conclusions: These 540 maps may provide the important information of clearly position-related SSR distributional features along the human reference Y-DNA for better understanding the genome structures of the Y-DNA. This study may contribute to further exploring the biological significance and distribution law of the huge numbers of SSRs in human Y-DNA. Supplementary Information: The online version contains supplementary material available at 10.1186/s12864-021-07389-5.
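The per-bin view of SSR distribution described here can be illustrated with a plain counting sketch. This is not the authors' differential method, which the abstract does not specify; it only shows how SSR start positions might be tallied into 1 Kbp bins. The function name and the example positions are hypothetical.

```python
from collections import Counter


def ssr_counts_per_bin(ssr_starts, seq_length, bin_size=1000):
    """Count SSRs per fixed-size bin, assigning each SSR by its 0-based start."""
    n_bins = -(-seq_length // bin_size)  # ceiling division
    counts = Counter(start // bin_size for start in ssr_starts)
    return [counts.get(i, 0) for i in range(n_bins)]


# Hypothetical SSR start positions on a 3,500 bp sequence.
profile = ssr_counts_per_bin([5, 950, 1000, 2500, 2999], 3500)
print(profile)
```

Bins with unusually high or zero counts in such a profile correspond loosely to the density peaks and sparse regions the abstract discusses.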
Affiliation(s)
- Douyue Li
- Bioinformatics Center, College of Biology, Hunan University, Changsha, 410082, China
- Saichao Pan
- Bioinformatics Center, College of Biology, Hunan University, Changsha, 410082, China
- Hongxi Zhang
- Bioinformatics Center, College of Biology, Hunan University, Changsha, 410082, China
- Yongzhuo Fu
- Bioinformatics Center, College of Biology, Hunan University, Changsha, 410082, China
- Zhuli Peng
- Bioinformatics Center, College of Biology, Hunan University, Changsha, 410082, China
- Liang Zhang
- Bioinformatics Center, College of Biology, Hunan University, Changsha, 410082, China
- Shan Peng
- Bioinformatics Center, College of Biology, Hunan University, Changsha, 410082, China
- Fei Xu
- Department of Mathematics, Wilfrid Laurier University, Waterloo, Ontario, N2L 3C5, Canada
- Hanrou Huang
- Bioinformatics Center, College of Biology, Hunan University, Changsha, 410082, China
- Ruixue Shi
- Bioinformatics Center, College of Biology, Hunan University, Changsha, 410082, China
- Heping Zheng
- Bioinformatics Center, College of Biology, Hunan University, Changsha, 410082, China
- Yousong Peng
- Bioinformatics Center, College of Biology, Hunan University, Changsha, 410082, China
- Zhongyang Tan
- Bioinformatics Center, College of Biology, Hunan University, Changsha, 410082, China.
7
Laskar R, Jilani MG, Ali S. Implications of genome simple sequence repeats signature in 98 Polyomaviridae species. 3 Biotech 2021; 11:35. [PMID: 33432281 PMCID: PMC7787124 DOI: 10.1007/s13205-020-02583-w] [Citation(s) in RCA: 2] [Impact Index Per Article: 0.7] [Received: 09/25/2020] [Accepted: 11/02/2020] [Indexed: 01/21/2023] Open
Abstract
The analysis of simple sequence repeats (SSRs) in 98 genomes across four genera of the family Polyomaviridae was performed. The genome size ranged from 3962 (BM87) to 7369 bp (BM85) but maximum genomes were in the range of 5-5.5 kb. The GC% had an average of 42% and ranged between 34.69 (BM95) and 52.35 (BM81). A total of 3036 SSRs and 223 cSSRs were extracted using IMEx with incident frequency from 18 to 56 and 0 to 7, respectively. The most prevalent mono-nucleotide repeat motif was "T" (48.95%) followed by "A" (33.48%). "AT/TA" was the most prevalent dinucleotide motif closely followed by "CT/TC". The distribution was expectedly more in the coding region with 77.6% SSRs of which nearly half were in Large T Antigen (LTA) gene. Notably, most viruses with humans, apes and related species as host exhibited exclusivity of mono-nucleotide repeats in AT region, a proposed predictive marker for determination of humans as host in the virus in course of its evolution. Each genome has a unique SSR signature which is pivotal for viral evolution particularly in terms of host divergence. Supplementary Information: The online version contains supplementary material available at 10.1007/s13205-020-02583-w.
Affiliation(s)
- Rezwanuzzaman Laskar
- Clinical and Applied Genomics (CAG) Laboratory, Department of Biological Sciences, Aliah University, IIA/27, Newtown, Kolkata, 700160 India
- Md Gulam Jilani
- Clinical and Applied Genomics (CAG) Laboratory, Department of Biological Sciences, Aliah University, IIA/27, Newtown, Kolkata, 700160 India
- Safdar Ali
- Clinical and Applied Genomics (CAG) Laboratory, Department of Biological Sciences, Aliah University, IIA/27, Newtown, Kolkata, 700160 India
8
Comparative analysis, distribution, and characterization of microsatellites in Orf virus genome. Sci Rep 2020; 10:13852. [PMID: 32807836 PMCID: PMC7431841 DOI: 10.1038/s41598-020-70634-6] [Citation(s) in RCA: 2] [Impact Index Per Article: 0.5] [Received: 03/01/2019] [Accepted: 07/01/2020] [Indexed: 11/09/2022] Open
Abstract
Genome-wide in-silico identification of microsatellites or simple sequence repeats (SSRs) in the Orf virus (ORFV), the causative agent of contagious ecthyma, was carried out to investigate their type, distribution, and potential role in genome evolution. We investigated eleven ORFV strains and found 1,036-1,181 microsatellites per strain. Further screening revealed 83-107 compound SSRs (cSSRs) per genome. Our analysis indicates that dinucleotide repeats (76.9%) are the most abundant, followed by trinucleotide (17.7%), mononucleotide (4.9%), tetranucleotide (0.4%), and hexanucleotide (0.2%) repeats. The Relative Abundance (RA) and Relative Density (RD) of these SSRs varied between 7.6-8.4 and 53.0-59.5 bp/kb, respectively, while for cSSRs the RA and RD ranged from 0.6-0.8 and 12.1-17.0 bp/kb, respectively. In regression analysis, the incidence of SSRs, RA, and RD were all significantly correlated with GC content. With genome size, however, only the incidence of SSRs was significantly correlated. Nearly all cSSRs were composed of two microsatellites, which showed no bias toward a particular motif. Motif duplication patterns such as (C)-x-(C), (TG)-x-(TG), (AT)-x-(AT), and (TC)-x-(TC), and self-complementary motifs such as (GC)-x-(CG), (TC)-x-(AG), (GT)-x-(CA) and (TC)-x-(AG), were observed in the cSSRs. Finally, in-silico polymorphism was assessed, followed by in-vitro validation using PCR analysis and sequencing. The thirteen polymorphic SSR markers developed in this study were further characterized by mapping with the sequence present in the database. The results of the present study indicate that these SSRs could be a useful tool for identification, analysis of genetic diversity, and understanding the evolutionary status of the virus.
9
Du L, Liu Q, Zhao K, Tang J, Zhang X, Yue B, Fan Z. PSMD: An extensive database for pan-species microsatellite investigation and marker development. Mol Ecol Resour 2019; 20:283-291. [PMID: 31599098 DOI: 10.1111/1755-0998.13098] [Citation(s) in RCA: 5] [Impact Index Per Article: 1.0] [Received: 06/11/2019] [Revised: 09/18/2019] [Accepted: 09/24/2019] [Indexed: 12/21/2022]
Abstract
Microsatellites are widely distributed throughout nearly all genomes which have been extensively exploited as powerful genetic markers for diverse applications due to their high polymorphisms. Their length variations are involved in gene regulation and implicated in numerous genetic diseases even in cancers. Although much effort has been devoted in microsatellite database construction, the existing microsatellite databases still had some drawbacks, such as limited number of species, unfriendly export format, missing marker development, lack of compound microsatellites and absence of gene annotation, which seriously restricted researchers to perform downstream analysis. In order to overcome the above limitations, we developed PSMD (Pan-Species Microsatellite Database, http://big.cdu.edu.cn/psmd/) as a web-based database to facilitate researchers to easily identify microsatellites, exploit reliable molecular markers and compare microsatellite distribution pattern on genome-wide scale. In current release, PSMD comprises 678,106,741 perfect microsatellites and 43,848,943 compound microsatellites from 18,408 organisms, which covered almost all species with available genomic data. In addition to interactive browse interface, PSMD also offers a flexible filter function for users to quickly gain desired microsatellites from large data sets. PSMD allows users to export GFF3 formatted file and CSV formatted statistical file for downstream analysis. We also implemented an online tool for analysing occurrence of microsatellites with user-defined parameters. Furthermore, Primer3 was embedded to help users to design high-quality primers with customizable settings. To our knowledge, PSMD is the most extensive resource which is likely to be adopted by scientists engaged in biological, medical, environmental and agricultural research.
Affiliation(s)
- Lianming Du
- Institute for Advanced Study, Chengdu University, Chengdu, China
- Qin Liu
- Key Laboratory of Bio-resources and Eco-environment, Ministry of Education, College of Life Science, Sichuan University, Chengdu, China; College of Life Sciences and Food Engineering, Yibin University, Yibin, China
- Kelei Zhao
- Institute for Advanced Study, Chengdu University, Chengdu, China
- Jie Tang
- School of Pharmacy and Bioengineering, Chengdu University, Chengdu, China
- Xiuyue Zhang
- Key Laboratory of Bio-resources and Eco-environment, Ministry of Education, College of Life Science, Sichuan University, Chengdu, China
- Bisong Yue
- Key Laboratory of Bio-resources and Eco-environment, Ministry of Education, College of Life Science, Sichuan University, Chengdu, China
- Zhenxin Fan
- Key Laboratory of Bio-resources and Eco-environment, Ministry of Education, College of Life Science, Sichuan University, Chengdu, China
10
Ledenyova ML, Tkachenko GA, Shpak IM. Imperfect and Compound Microsatellites in the Genomes of Burkholderia pseudomallei Strains. Mol Biol 2019. [DOI: 10.1134/s0026893319010084] [Citation(s) in RCA: 2] [Impact Index Per Article: 0.4] [Indexed: 11/23/2022]
11
Alam CM, Iqbal A, Sharma A, Schulman AH, Ali S. Microsatellite Diversity, Complexity, and Host Range of Mycobacteriophage Genomes of the Siphoviridae Family. Front Genet 2019; 10:207. [PMID: 30923537 PMCID: PMC6426759 DOI: 10.3389/fgene.2019.00207] [Citation(s) in RCA: 12] [Impact Index Per Article: 2.4] [Received: 05/04/2018] [Accepted: 02/26/2019] [Indexed: 01/21/2023] Open
Abstract
The incidence, distribution, and variation of simple sequence repeats (SSRs) in viruses is instrumental in understanding the functional and evolutionary aspects of repeat sequences. Full-length genome sequences retrieved from NCBI were used for extraction and analysis of repeat sequences using IMEx software. We have also developed two MATLAB-based tools for extraction of gene locations from GenBank in tabular format and simulation of this data with SSR incidence data. Present study encompassing 147 Mycobacteriophage genomes revealed 25,284 SSRs and 1,127 compound SSRs (cSSRs) through IMEx. Mono- to hexa-nucleotide motifs were present. The SSR count per genome ranged from 78 (M100) to 342 (M58) while cSSRs incidence ranged from 1 (M138) to 17 (M28, M73). Though cSSRs were present in all the genomes, their frequency and SSR to cSSR conversion percentage varied from 1.08 (M138 with 93 SSRs) to 8.33 (M116 with 96 SSRs). In terms of localization, the SSRs were predominantly localized to coding regions (∼78%). Interestingly, genomes of around 50 kb contained a similar number of SSRs/cSSRs to that in a 110 kb genome, suggesting functional relevance for SSRs which was substantiated by variation in motif constitution between species with different host range. The three species with broad host range (M97, M100, M116) have around 90% of their mono-nucleotide repeat motifs composed of G or C and only M16 has both A and T mononucleotide motifs. Around 20% of the di-nucleotide repeat motifs in the genomes exhibiting a broad host range were CT/TC, which were either absent or represented to a much lesser extent in the other genomes.
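The SSR to cSSR conversion percentage quoted above is consistent with a simple ratio of cSSR count to SSR count. The sketch below assumes that definition, which the abstract does not spell out. The M138 inputs (1 cSSR, 93 SSRs) come from the abstract; the count of 8 cSSRs for M116 is inferred from the reported 8.33% of 96 SSRs rather than stated in the source.

```python
def cssr_percent(cssr_count: int, ssr_count: int) -> float:
    """cSSRs as a percentage of SSRs in one genome (assumed definition)."""
    return 100 * cssr_count / ssr_count


m138 = round(cssr_percent(1, 93), 2)   # M138: 1 cSSR among 93 SSRs
m116 = round(cssr_percent(8, 96), 2)   # M116: 8 cSSRs is an inferred count
print(m138, m116)
```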
Affiliation(s)
- Chaudhary Mashhood Alam
- Luke/BI Plant Genome Dynamics Lab, Institute of Biotechnology and Viikki Plant Science Centre, University of Helsinki, Helsinki, Finland; Ingenious e-Brain Solutions, Gurugram, India
- Asif Iqbal
- PIRO Technologies Private Limited, New Delhi, India
- Anjana Sharma
- Department of Biomedical Sciences, SRCASW, University of Delhi, New Delhi, India
- Alan H Schulman
- Luke/BI Plant Genome Dynamics Lab, Institute of Biotechnology and Viikki Plant Science Centre, University of Helsinki, Helsinki, Finland; Natural Resources Institute Finland (Luke), Helsinki, Finland
- Safdar Ali
- Department of Biomedical Sciences, SRCASW, University of Delhi, New Delhi, India; Department of Biological Sciences, Aliah University, Kolkata, India
12
Yan C, Du J, Gao L, Li Y, Hou X. The complete chloroplast genome sequence of watercress (Nasturtium officinale R. Br.): Genome organization, adaptive evolution and phylogenetic relationships in Cardamineae. Gene 2019; 699:24-36. [PMID: 30849538 DOI: 10.1016/j.gene.2019.02.075] [Citation(s) in RCA: 43] [Impact Index Per Article: 8.6] [Received: 11/15/2018] [Revised: 02/16/2019] [Accepted: 02/18/2019] [Indexed: 12/12/2022]
Abstract
Watercress (Nasturtium officinale R. Br.), an aquatic leafy vegetable of the Brassicaceae family, is known as a nutritional powerhouse. Here, we de novo sequenced and assembled the complete chloroplast (cp) genome of watercress based on combined PacBio and Illumina data. The cp genome is 155,106 bp in length, exhibiting a typical quadripartite structure including a pair of inverted repeats (IRA and IRB) of 26,505 bp separated by a large single copy (LSC) region of 84,265 bp and a small single copy (SSC) region of 17,831 bp. The genome contained 113 unique genes, including 79 protein-coding genes, 30 tRNAs and 4 rRNAs, with 20 duplicate in the IRs. Compared with the prior cp genome of watercress deposited in GenBank, 21 single nucleotide polymorphisms (SNPs) and 27 indels were identified, mainly located in noncoding sequences. A total of 49 repeat structures and 71 simple sequence repeats (SSRs) were detected. Codon usage showed a bias for A/T-ending codons in the cp genome of watercress. Moreover, 45 RNA editing sites were predicted in 16 genes, all for C-to-U transitions. A comparative plastome study with Cardamineae species revealed a conserved gene order and high similarity of protein-coding sequences. Analysis of the Ka/Ks ratios of Cardamineae suggested positive selection exerted on the ycf2 gene in watercress, which might reflect specific adaptations of watercress to its particular living environment. Phylogenetic analyses based on complete cp genomes and common protein-coding genes from 56 species showed that the genus Nasturtium was a sister to Cardamine in the Cardamineae tribe. Our study provides valuable resources for future evolution, population genetics and molecular biology studies of watercress.
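The reported lengths can be checked for internal consistency: in a quadripartite chloroplast genome, the total length should equal the LSC plus the SSC plus two copies of the inverted repeat, and the unique gene count should equal the sum of its categories. The short check below uses only figures from the abstract; the variable names are illustrative.

```python
# Region lengths in bp, as reported in the abstract.
lsc, ssc, ir = 84_265, 17_831, 26_505

# Quadripartite structure: LSC + SSC + two inverted repeats (IRA and IRB).
total_length = lsc + ssc + 2 * ir

# Unique genes: protein-coding + tRNA + rRNA.
unique_genes = 79 + 30 + 4

print(total_length, unique_genes)
```

Both sums match the reported totals of 155,106 bp and 113 unique genes.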
Affiliation(s)
- Chao Yan
- State Key Laboratory of Crop Genetics and Germplasm Enhancement, Key Laboratory of Biology and Genetic Improvement of Horticultural Crops (East China), Ministry of Agriculture and Rural Affairs of the P.R. China, Nanjing Agricultural University, Nanjing 210095, China
- Jianchang Du
- Provincial Key Laboratory of Agrobiology, Institute of Crop Germplasm and Biotechnology, Jiangsu Academy of Agricultural Sciences, Nanjing 210014, China
- Lu Gao
- State Key Laboratory of Crop Genetics and Germplasm Enhancement, Key Laboratory of Biology and Genetic Improvement of Horticultural Crops (East China), Ministry of Agriculture and Rural Affairs of the P.R. China, Nanjing Agricultural University, Nanjing 210095, China
- Ying Li
- State Key Laboratory of Crop Genetics and Germplasm Enhancement, Key Laboratory of Biology and Genetic Improvement of Horticultural Crops (East China), Ministry of Agriculture and Rural Affairs of the P.R. China, Nanjing Agricultural University, Nanjing 210095, China
- Xilin Hou
- State Key Laboratory of Crop Genetics and Germplasm Enhancement, Key Laboratory of Biology and Genetic Improvement of Horticultural Crops (East China), Ministry of Agriculture and Rural Affairs of the P.R. China, Nanjing Agricultural University, Nanjing 210095, China.
13
Bhat NN, Mahiya-Farooq, Padder BA, Shah M, Dar M, Nabi A, Bano A, Rasool RS, Sana-Surma. Microsatellite mining in the genus Colletotrichum. Gene Reports 2018. [DOI: 10.1016/j.genrep.2018.09.001] [Citation(s) in RCA: 6] [Impact Index Per Article: 1.0] [Indexed: 10/28/2022]
14
Sen S, Dehury B, Sahu J, Rathi S, Yadav RNS. Mining and comparative survey of EST-SSR markers among members of Euphorbiaceae family. Mol Biol Rep 2018; 45:453-468. [PMID: 29626317 DOI: 10.1007/s11033-018-4181-0] [Citation(s) in RCA: 4] [Impact Index Per Article: 0.7] [Received: 11/20/2017] [Accepted: 04/02/2018] [Indexed: 11/30/2022]
Abstract
Euphorbiaceae is a family of flowering plants of tropical and subtropical regions, rich in secondary metabolites of economic importance. To understand and assess the genetic makeup among the members, this study was undertaken to characterize and compare SSR markers from publicly available ESTs and GSSs of nine selected species of the family. Mining of SSRs was performed by MISA, primer designing by Primer3, while functional annotation, gene ontology (GO) and enrichment analysis were performed by Blast2GO. A total of 12,878 SSRs were detected from 101,701 EST sequences. SSR density ranged from 1 SSR/3.22 kb to 1 SSR/15.65 kb. A total of 1873 primer pairs were designed for the annotated SSR-Contigs. About 77.07% of SSR-ESTs could be assigned a significant match to the protein database. 3037 unique SSR-FDM were assigned and IPR003657 (WRKY Domain) was found to be the most dominant FDM among the members. The 1810 unique GO terms obtained were further subjected to enrichment analysis to obtain 513 statistically significant GO terms mapped to the SSR-containing ESTs. The most frequent enriched GO terms were GO:0003824 for molecular function, GO:0006350 for biological process and GO:0005886 for cellular component, justifying the richness of defensive secondary metabolites and phytomedicine within the family. The results from this study provide tangible insight into the genetic make-up and distribution of SSRs. Functional annotation matched many genes of unknown function, which may be considered as novel genes or genes responsible for stress-specific secondary metabolites. Further studies are required to understand stress-specific genes accountable for leveraging the synthesis of secondary metabolites.
Affiliation(s)
- Surojit Sen
- Centre for Biotechnology and Bioinformatics, Dibrugarh University, Dibrugarh, Assam, India.
| | - Budheswar Dehury
- Biomedical Informatics Centre, ICMR-Regional Medical Research Centre, Nalco Square, Chandrasekharpur, Bhubaneswar, Odisha, 751023, India
| | - Jagajjit Sahu
- Distributed Information Center, Department of Agricultural Biotechnology, Assam Agricultural University, Jorhat, Assam, 785013, India
| | - Sunayana Rathi
- Department of Biochemistry and Agricultural Chemistry, Assam Agricultural University, Jorhat, Assam, 785013, India
| | | |
|
15
|
Genome sequencing and analysis of Alcaligenes faecalis subsp. phenolicus MB207. Sci Rep 2018; 8:3616. [PMID: 29483539 PMCID: PMC5827749 DOI: 10.1038/s41598-018-21919-4] [Citation(s) in RCA: 19] [Impact Index Per Article: 3.2] [Reference Citation Analysis] [Abstract] [Track Full Text] [Download PDF] [Figures] [Journal Information] [Subscribe] [Scholar Register] [Received: 09/22/2017] [Accepted: 02/08/2018] [Indexed: 11/12/2022] Open
Abstract
Bacteria within the genus Alcaligenes exhibit diverse properties but remain largely unexplored at genome scale. To shed light on the genome structure, heterogeneity and traits of Alcaligenes species, the genome of Alcaligenes faecalis subsp. phenolicus MB207, isolated from tannery effluent, was sequenced and assembled. The genome was compared to the whole genome sequences of genus Alcaligenes present in the National Centre for Biotechnology Information database. Core, pan and species-specific gene sequences, i.e. singletons, were identified. Members of this genus did not portray exceptional genetic heterogeneity or conservation and out of 5,166 protein coding genes from the pooled genome dataset, 2429 (47.01%) contributed to the core, 1193 (23.09%) to singletons and 1544 (29.88%) to the accessory genome. The secondary metabolite forming apparatus, antibiotic production and resistance were also profiled. The Alcaligenes faecalis subsp. phenolicus MB207 genome contained a copious number of bioremediation genes, i.e. metal tolerance and xenobiotic degrading genes. This study marks this strain as a prospective eco-friendly bacterium with numerous benefits for environment-related research. Availability of the whole genome sequence heralds an opportunity for researchers to explore enzymes and apparatus for sustainable environmental clean-up as well as the production of important compounds and substances.
|
16
|
Xu Y, Li W, Hu Z, Zeng T, Shen Y, Liu S, Zhang X, Li J, Yue B. Genome-wide mining of perfect microsatellites and tetranucleotide orthologous microsatellites estimates in six primate species. Gene 2017; 643:124-132. [PMID: 29223358 DOI: 10.1016/j.gene.2017.12.008] [Citation(s) in RCA: 10] [Impact Index Per Article: 1.4] [Reference Citation Analysis] [Abstract] [Key Words] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Received: 03/22/2017] [Revised: 12/04/2017] [Accepted: 12/06/2017] [Indexed: 12/16/2022]
Abstract
Advances in genome sequencing and in silico mining tools have provided new opportunities for comparative primate genomics of microsatellites. The numbers of SSRs (simple sequence repeats) were not correlated with genome size (Pearson, r=0.310, p=0.550) but were positively correlated with the total length of SSRs (Pearson, r=0.992, p=0.00). A total of 224,289 tetranucleotide orthologous microsatellite families and 367 single-copy orthologous SSR loci were found in six primate species by homologous alignment. The inner mutation types of single-copy orthologous SSR loci included copy number variance, point mutation, and chromosomal translocation. The accumulated repeat times and average length of tetranucleotide orthologous microsatellites in Rhinopithecus roxellana, Papio anubis and Macaca mulatta were greater than in Homo sapiens and Pan troglodytes, which showed that the tetranucleotide orthologous SSR loci had more repeat times and longer average length on the branches with earlier divergence time; one exception may be Microcebus murinus, as a primitive monkey with the smallest morphology in Malagasy. Our results indicated that single-copy tetranucleotide orthologous SSR sequences accumulated individual mutations more slowly through time in H. sapiens and P. troglodytes than in R. roxellana, P. anubis and M. mulatta. However, such divergence would not arise uniformly in all branches of the primate tree. A comparison of genomic sequence assemblages would offer remarkable insights into similarities and contrasts, and into the evolutionary processes of the microsatellites involved in human and nonhuman primate species.
Affiliation(s)
- Yongtao Xu
- Key Laboratory of Bio-resources and Eco-environment (Ministry of Education), College of Life Sciences, Sichuan University, Chengdu 610064, PR China
| | - Wujiao Li
- Key Laboratory of Bio-resources and Eco-environment (Ministry of Education), College of Life Sciences, Sichuan University, Chengdu 610064, PR China
| | - Zongxiu Hu
- Yibin Hengshu Animal Models Resource Industry Technology Academy, Yibin 644609, PR China
| | - Tao Zeng
- Yibin Hengshu Animal Models Resource Industry Technology Academy, Yibin 644609, PR China
| | - Yongmei Shen
- Sichuan Engineering Research Center for Medical Animal, Chengdu 610064, PR China
| | - Sanxu Liu
- Key Laboratory of Bio-resources and Eco-environment (Ministry of Education), College of Life Sciences, Sichuan University, Chengdu 610064, PR China
| | - Xiuyue Zhang
- Sichuan Key Laboratory of Conservation Biology on Endangered Wildlife, College of Life Sciences, Sichuan University, Chengdu 610064, PR China
| | - Jing Li
- Sichuan Key Laboratory of Conservation Biology on Endangered Wildlife, College of Life Sciences, Sichuan University, Chengdu 610064, PR China
| | - Bisong Yue
- Sichuan Key Laboratory of Conservation Biology on Endangered Wildlife, College of Life Sciences, Sichuan University, Chengdu 610064, PR China.
| |
|
17
|
Zhang H, Hall N, McElroy JS, Lowe EK, Goertzen LR. Complete plastid genome sequence of goosegrass (Eleusine indica) and comparison with other Poaceae. Gene 2016; 600:36-43. [PMID: 27899326 DOI: 10.1016/j.gene.2016.11.038] [Citation(s) in RCA: 9] [Impact Index Per Article: 1.1] [Reference Citation Analysis] [Abstract] [Key Words] [MESH Headings] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Received: 07/07/2016] [Revised: 11/12/2016] [Accepted: 11/24/2016] [Indexed: 11/18/2022]
Abstract
Eleusine indica, also known as goosegrass, is a serious weed in at least 42 countries. In this paper we report the complete plastid genome sequence of goosegrass obtained by de novo assembly of paired-end and mate-paired reads generated by Illumina sequencing of total genomic DNA. The goosegrass plastome is a circular molecule of 135,151 bp in length, consisting of two single-copy regions separated by a pair of inverted repeats (IRs) of 20,919 bases each. The large (LSC) and the small (SSC) single-copy regions span 80,667 bases and 12,646 bases, respectively. The plastome of goosegrass has 38.19% GC content and includes 108 unique genes, of which 76 are protein-coding, 28 are transfer RNA, and 4 are ribosomal RNA. The goosegrass plastome sequence was compared to eight other species of Poaceae. Although generally conserved with respect to Poaceae, this genomic resource will be useful for evolutionary studies within this weed species and the genus Eleusine.
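The region lengths quoted in this abstract can be checked against the reported total, since a quadripartite plastome is the sum of the LSC, the SSC, and two IR copies. A minimal sketch in Python, using only the figures from the abstract (the variable names are illustrative, not from the paper):

```python
# Region lengths in bp, taken from the abstract above.
LSC_BP = 80_667   # large single-copy region
SSC_BP = 12_646   # small single-copy region
IR_BP = 20_919    # one inverted repeat; the plastome carries two copies

plastome_bp = LSC_BP + SSC_BP + 2 * IR_BP
ir_fraction = 2 * IR_BP / plastome_bp

# Unique gene counts by category, also from the abstract.
unique_genes = {"protein_coding": 76, "tRNA": 28, "rRNA": 4}

print(plastome_bp)                 # 135151
print(round(ir_fraction, 3))       # 0.31
print(sum(unique_genes.values()))  # 108
```

Both the total length and the gene count reproduce the values stated in the abstract.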
Affiliation(s)
- Hui Zhang
- Department of Crop, Soil and Environmental Science, Auburn University, AL 36849, USA
| | - Nathan Hall
- Department of Biological Sciences, Auburn University, AL 36849, USA
| | - J Scott McElroy
- Department of Crop, Soil and Environmental Science, Auburn University, AL 36849, USA.
| | - Elijah K Lowe
- Department of Biology and Evolution of Marine Organisms, Stazione Zoologica Anton Dohrn, Villa Comunale, 80121 Napoli, Italy; BEACON Center for the Study of Evolution in Action, Michigan State University, East Lansing, MI, USA
| | | |
|
18
|
Survey and analysis of simple sequence repeats (SSRs) in three genomes of Candida species. Gene 2016; 584:129-35. [PMID: 26883055 DOI: 10.1016/j.gene.2016.02.018] [Citation(s) in RCA: 5] [Impact Index Per Article: 0.6] [Reference Citation Analysis] [Abstract] [Key Words] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Received: 10/26/2015] [Revised: 01/15/2016] [Accepted: 02/12/2016] [Indexed: 11/23/2022]
Abstract
Simple sequence repeats (SSRs) or microsatellites, which are composed of tandemly repeated short units of 1-6 bp, have attracted continuous attention. Here, the distribution, composition and polymorphism of microsatellites and compound microsatellites were analyzed in three available genomes of Candida species (Candida dubliniensis, Candida glabrata and Candida orthopsilosis). The results show that there were 118,047, 66,259 and 61,119 microsatellites in the genomes of C. dubliniensis, C. glabrata and C. orthopsilosis, respectively. The SSRs covered more than 1/3 of the genome length in the three species. Microsatellites consisting only of the bases A and/or T, such as (A)n, (T)n, (AT)n, (TA)n, (AAT)n, (TAA)n, (TTA)n, (ATA)n, (ATT)n and (TAT)n, were predominant in the three genomes. Microsatellite lengths were concentrated at 6 bp and 9 bp, both in the three genomes and in their coding sequences. Moreover, the relative abundance (19.89/kbp) and relative density (167.87 bp/kbp) of SSRs in the mitochondrial sequence of C. glabrata were significantly greater than those in any of the genomes or chromosomes of the three species. In addition, the distance between any two adjacent microsatellites was an important factor influencing the formation of compound microsatellites. The analysis may be helpful for further studying the roles of microsatellites in the origination, organization and evolution of the genomes of Candida species.
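The two summary statistics quoted in this abstract carry units of SSRs per kbp (relative abundance) and SSR bp per kbp (relative density). A minimal Python sketch of how such figures could be computed for perfect repeats follows. It is not the tool used in the study: the regex-based finder, the `MIN_REPEATS` thresholds, and all function names are assumptions chosen for illustration, and non-overlapping regex scanning may miss repeats in unusual edge cases.

```python
import re

# Hypothetical minimum repeat counts per motif length (1-6 bp); not from the paper.
MIN_REPEATS = {1: 6, 2: 3, 3: 3, 4: 3, 5: 3, 6: 3}

def find_ssrs(seq, min_repeats=MIN_REPEATS):
    """Return sorted (start, end, motif, repeat_count) tuples for perfect SSRs."""
    seq = seq.upper()
    hits = []
    for k, n in sorted(min_repeats.items()):
        # A k-bp motif followed by at least n-1 further copies of itself.
        pattern = re.compile(r"([ACGT]{%d})\1{%d,}" % (k, n - 1))
        for m in pattern.finditer(seq):
            motif = m.group(1)
            # Skip motifs that are repeats of a shorter unit, e.g. "AA" or "ATAT".
            if any(motif == motif[:d] * (k // d) for d in range(1, k) if k % d == 0):
                continue
            hits.append((m.start(), m.end(), motif, (m.end() - m.start()) // k))
    return sorted(hits)

def relative_abundance(hits, seq_len):
    """Number of SSRs per kbp of sequence."""
    return len(hits) / (seq_len / 1000)

def relative_density(hits, seq_len):
    """Total bp covered by SSRs per kbp of sequence."""
    return sum(end - start for start, end, _, _ in hits) / (seq_len / 1000)

# Toy sequence: (A)8, (AT)4 and (AAT)3 separated by short spacers.
seq = "G" + "A" * 8 + "CGC" + "AT" * 4 + "GCG" + "AAT" * 3 + "C"
hits = find_ssrs(seq)
print(hits)
print(relative_abundance(hits, len(seq)), relative_density(hits, len(seq)))
```

On the 33 bp toy sequence the finder reports three SSRs covering 25 bp, so both statistics are simply those counts rescaled to a 1 kbp window.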
|
19
|
Comparative analysis of microsatellites and compound microsatellites in T4-like viruses. Gene 2016; 575:695-701. [DOI: 10.1016/j.gene.2015.09.053] [Citation(s) in RCA: 4] [Impact Index Per Article: 0.5] [Reference Citation Analysis] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Received: 01/22/2015] [Revised: 09/16/2015] [Accepted: 09/21/2015] [Indexed: 01/27/2023]
|
20
|
George B, George B, Awasthi M, Singh RN. In silico genome-wide identification and analysis of microsatellite repeats in the largest RNA virus family (Closteroviridae). Turk J Biol 2016. [DOI: 10.3906/biy-1503-11] [Citation(s) in RCA: 1] [Impact Index Per Article: 0.1] [Reference Citation Analysis] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Indexed: 01/09/2023] Open
|
21
|
Ngai MY, Saitou N. The effect of perfection status on mutation rates of microsatellites in primates. ANTHROPOL SCI 2016. [DOI: 10.1537/ase.160124] [Citation(s) in RCA: 4] [Impact Index Per Article: 0.5] [Reference Citation Analysis] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Indexed: 11/01/2022]
Affiliation(s)
- Ming Yin Ngai
- Department of Biological Sciences, Graduate School of Science, The University of Tokyo, Tokyo
- Division of Population Genetics, National Institute of Genetics, Mishima
| | - Naruya Saitou
- Department of Biological Sciences, Graduate School of Science, The University of Tokyo, Tokyo
- Division of Population Genetics, National Institute of Genetics, Mishima
| |
|
22
|
Demers JE, Jiménez-Gasco MDM. Evolution of Nine Microsatellite Loci in the Fungus Fusarium oxysporum. J Mol Evol 2015; 82:27-37. [PMID: 26661928 DOI: 10.1007/s00239-015-9725-5] [Citation(s) in RCA: 3] [Impact Index Per Article: 0.3] [Reference Citation Analysis] [Abstract] [Key Words] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Received: 12/23/2014] [Accepted: 11/19/2015] [Indexed: 12/11/2022]
Abstract
The evolution of nine microsatellites and one minisatellite was investigated in the fungus Fusarium oxysporum and sister taxa Fusarium redolens and Fusarium verticillioides. Compared to other organisms, fungi have been reported to contain fewer and less polymorphic microsatellites. Mutational patterns over evolutionary time were studied for these ten loci by mapping changes in core repeat numbers onto a phylogeny based on the sequence of the conserved translation elongation factor 1-α gene. The patterns of microsatellite formation, expansion, and interruption by base substitutions were followed across the phylogeny, showing that these loci are evolving in a manner similar to that of microsatellites in other eukaryotes. Most mutations could be fit to a stepwise mutation model, but a few appear to have involved multiple repeat units. No evidence of gene conversion was seen at the minisatellite locus, which may also be mutating by replication slippage. Some homoplastic numbers of repeat units were observed for these loci, and polymorphisms in the regions flanking the microsatellites may provide better genetic markers for population genetics studies of these species.
Affiliation(s)
- Jill E Demers
- Department of Plant Pathology & Environmental Microbiology, The Pennsylvania State University, University Park, PA, USA; USDA-ARS Systematic Mycology and Microbiology Laboratory, 10300 Baltimore Ave., Beltsville, MD, 20705, USA.
| | - María del Mar Jiménez-Gasco
- Department of Plant Pathology & Environmental Microbiology, The Pennsylvania State University, University Park, PA, USA.
| |
|
23
|
Schemerhorn BJ, Crane YM, Crane CF. The evolution of Hessian fly from the Old World to the New World: Evidence from molecular markers. INSECT SCIENCE 2015; 22:768-784. [PMID: 25263747 DOI: 10.1111/1744-7917.12175] [Citation(s) in RCA: 1] [Impact Index Per Article: 0.1] [Reference Citation Analysis] [Abstract] [Key Words] [MESH Headings] [Track Full Text] [Subscribe] [Scholar Register] [Accepted: 08/19/2014] [Indexed: 06/03/2023]
Abstract
Eighteen polymorphic microsatellite loci and 11 single-nucleotide polymorphisms were genotyped in 1,095 individual Hessian fly specimens representing 23 populations from North America, southern Europe, and southwest Asia. The genotypes were used to assess genetic diversity and interrelationship of Hessian fly populations. While phylogenetic analysis indicates that the American populations most similar to Eurasian populations come from the east coast of the United States, genetic distance is least between (Alabama and California) and (Kazakhstan and Spain). Allelic diversity and frequency vary across North America, but they are not correlated with distance from the historically documented point of introduction in New York City or with temperature or precipitation. Instead, the greatest allelic diversity mostly occurs in areas with Mediterranean climates. The microsatellite data indicate a general deficiency for heterozygotes in Hessian fly. The North American population structure is consistent with multiple introductions, isolation by distance, and human-abetted dispersal by bulk transport of puparia in infested straw or on harvesting equipment.
Affiliation(s)
| | | | - Charles F Crane
- USDA-ARS
- Department of Botany and Plant Pathology, Purdue University, West Lafayette, IN, 47907, USA
| |
|
24
|
Basharat Z, Yasmin A. Survey of compound microsatellites in multiple Lactobacillus genomes. Can J Microbiol 2015; 61:898-902. [PMID: 26445296 DOI: 10.1139/cjm-2015-0136] [Citation(s) in RCA: 5] [Impact Index Per Article: 0.6] [Reference Citation Analysis] [Abstract] [Key Words] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Indexed: 01/22/2023]
Abstract
Distinct simple sequence repeats with 2 or more individual microsatellites joined together or lying adjacent to each other are identified as compound microsatellites. Investigation of such composite microsatellites in the genomes of genus Lactobacillus was the aim of this study. In silico inspection of microsatellite clustering in genomes of 14 Lactobacillus species revealed a wealth of compound microsatellites. All of the mined compound microsatellites were imperfect, were composed of variant motifs, and increased in all genomes, with maximum distance (dMAX) increments of 10 to 50. The majority of these repeats were present in the coding regions. A correlation of microsatellite to compound microsatellite density was detected. The difference established in compound microsatellite division among eukaryotes, Escherichia coli, and lactobacilli is suggestive of diverse genomic features and elementary distinction between creation and fixation methods of compound microsatellites among these organisms.
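The definition in this abstract, individual microsatellites lying within a maximum distance (dMAX) of each other, can be illustrated with a short grouping routine. The Python sketch below is an assumption-laden illustration rather than the study's method: the function name `group_compound`, the half-open `(start, end)` interval convention, and the toy coordinates are all hypothetical.

```python
def group_compound(ssrs, dmax):
    """Group SSR intervals into compound microsatellites.

    ssrs: iterable of (start, end) half-open intervals.
    dmax: largest gap in bp allowed between neighbouring SSRs in one group.
    Returns a list of groups, each holding two or more intervals.
    """
    groups, current = [], []
    for start, end in sorted(ssrs):
        if current and start - current[-1][1] <= dmax:
            # Close enough to the previous SSR: extend the current group.
            current.append((start, end))
        else:
            # Gap too large: keep the finished group only if it is compound.
            if len(current) >= 2:
                groups.append(current)
            current = [(start, end)]
    if len(current) >= 2:
        groups.append(current)
    return groups

# Hypothetical SSR coordinates: three close together, one far away.
ssrs = [(1, 9), (12, 20), (23, 32), (100, 110)]
for dmax in (0, 3, 10, 70):
    groups = group_compound(ssrs, dmax)
    print(dmax, [len(g) for g in groups])
```

With these toy coordinates no compound microsatellite is found at dMAX = 0, three SSRs merge at dMAX = 3 or 10, and all four merge at dMAX = 70, which shows why reported compound microsatellite counts depend on the chosen dMAX.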
Affiliation(s)
- Zarrin Basharat
- Microbiology and Biotechnology Research Laboratory, Department of Environmental Sciences, Fatima Jinnah Women University 46000, Pakistan
| | - Azra Yasmin
- Microbiology and Biotechnology Research Laboratory, Department of Environmental Sciences, Fatima Jinnah Women University 46000, Pakistan
| |
|
25
|
George B, Alam CM, Kumar RV, Gnanasekaran P, Chakraborty S. Potential linkage between compound microsatellites and recombination in geminiviruses: Evidence from comparative analysis. Virology 2015; 482:41-50. [DOI: 10.1016/j.virol.2015.03.003] [Citation(s) in RCA: 18] [Impact Index Per Article: 2.0] [Reference Citation Analysis] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Received: 11/14/2014] [Revised: 02/16/2015] [Accepted: 03/05/2015] [Indexed: 01/10/2023]
|
26
|
George B, George B, Awasthi M, Singh RN. Genome wide survey and analysis of microsatellites in Tombusviridae family. Genes Genomics 2015. [DOI: 10.1007/s13258-015-0295-0] [Citation(s) in RCA: 1] [Impact Index Per Article: 0.1] [Reference Citation Analysis] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Indexed: 11/24/2022]
|
27
|
George B, Bhatt BS, Awasthi M, George B, Singh AK. Comparative analysis of microsatellites in chloroplast genomes of lower and higher plants. Curr Genet 2015; 61:665-77. [PMID: 25999216 DOI: 10.1007/s00294-015-0495-9] [Citation(s) in RCA: 64] [Impact Index Per Article: 7.1] [Reference Citation Analysis] [Abstract] [Key Words] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Received: 01/08/2015] [Revised: 05/05/2015] [Accepted: 05/08/2015] [Indexed: 12/29/2022]
Abstract
Microsatellites, or simple sequence repeats (SSRs), are repetitive DNA sequences in which tandem repeats of one to six base pairs are present a number of times. Chloroplast genome sequences have been shown to possess extensive variations in the length, number and distribution of SSRs. However, a comparative analysis of chloroplast microsatellites is not available. Considering their potential importance in generating genomic diversity, we have systematically analysed the abundance and distribution of simple and compound microsatellites in 164 sequenced chloroplast genomes from a wide range of plants. The key findings of these studies are (1) a large number of mononucleotide repeats as compared to SSR(2-6) (di-, tri-, tetra-, penta-, hexanucleotide repeats) are present in all chloroplast genomes investigated, (2) lower plants such as algae show wide variation in relative abundance, density and distribution of microsatellite repeats as compared to flowering plants, (3) longer SSRs are excluded from coding regions of most chloroplast genomes, (4) GC content has a weak influence on number, relative abundance and relative density of mononucleotide repeats as well as SSR(2-6). However, GC content showed a strong negative correlation with relative density (R² = 0.5, P < 0.05) and relative abundance (R² = 0.6, P < 0.05) of cSSRs. In summary, our comparative studies of chloroplast genomes illustrate the variable distribution of microsatellites and reveal that the chloroplast genomes of smaller plants possess relatively more genomic diversity than those of higher plants.
Affiliation(s)
- Biju George
- Blessy Software Solution, Sector 4/441, Malviya Nagar, Jaipur, 302017, Rajasthan, India.
| | - Bhavin S Bhatt
- School of Life Sciences, Central University of Gujarat, Gandhinagar, 382030, Gujarat, India
| | - Mayur Awasthi
- Mahatma Gandhi Chitrakoot Gramodaya Vishwavidhyalaya, Satna, 485334, Madhya Pradesh, India
| | - Binu George
- Blessy Software Solution, Sector 4/441, Malviya Nagar, Jaipur, 302017, Rajasthan, India
| | - Achuit K Singh
- School of Life Sciences, Central University of Gujarat, Gandhinagar, 382030, Gujarat, India.
| |
|
28
|
Mashhood Alam C, Sharfuddin C, Ali S. Analysis of Simple and Imperfect Microsatellites in Ebolavirus Species and Other Genomes of Filoviridae Family. Gene Cell Tissue 2015. [DOI: 10.17795/gct-26404] [Citation(s) in RCA: 4] [Impact Index Per Article: 0.4] [Reference Citation Analysis] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Indexed: 11/25/2022]
|
29
|
Sahu J, Das Talukdar A, Devi K, Choudhury MD, Barooah M, Modi MK, Sen P. E-Microsatellite Markers for Centella asiatica (Gotu Kola) Genome: Validation and Cross-Transferability in Apiaceae Family for Plant Omics Research and Development. OMICS-A JOURNAL OF INTEGRATIVE BIOLOGY 2015; 19:52-65. [DOI: 10.1089/omi.2014.0113] [Citation(s) in RCA: 6] [Impact Index Per Article: 0.7] [Reference Citation Analysis] [Track Full Text] [Subscribe] [Scholar Register] [Indexed: 11/12/2022]
Affiliation(s)
- Jagajjit Sahu
- Distributed Information Centre, Department of Agricultural Biotechnology, Assam Agricultural University, Assam, India
- Department of Life Science and Bioinformatics, Assam University, Assam, India
| | - Anupam Das Talukdar
- Department of Life Science and Bioinformatics, Assam University, Assam, India
| | - Kamalakshi Devi
- Distributed Information Centre, Department of Agricultural Biotechnology, Assam Agricultural University, Assam, India
| | | | - Madhumita Barooah
- Distributed Information Centre, Department of Agricultural Biotechnology, Assam Agricultural University, Assam, India
| | - Mahendra Kumar Modi
- Distributed Information Centre, Department of Agricultural Biotechnology, Assam Agricultural University, Assam, India
| | - Priyabrata Sen
- Distributed Information Centre, Department of Agricultural Biotechnology, Assam Agricultural University, Assam, India
| |
|
30
|
Sahu J, Sen P, Choudhury MD, Dehury B, Barooah M, Modi MK, Talukdar AD. Rediscovering medicinal plants' potential with OMICS: microsatellite survey in expressed sequence tags of eleven traditional plants with potent antidiabetic properties. OMICS-A JOURNAL OF INTEGRATIVE BIOLOGY 2014; 18:298-309. [PMID: 24802971 DOI: 10.1089/omi.2013.0147] [Citation(s) in RCA: 13] [Impact Index Per Article: 1.3] [Reference Citation Analysis] [Abstract] [Track Full Text] [Subscribe] [Scholar Register] [Indexed: 11/12/2022]
Abstract
Herbal medicines and traditionally used medicinal plants present an untapped potential for novel molecular target discovery using systems science and OMICS biotechnology driven strategies. Since up to 40% of the world's poor people have no access to government health services, traditional and folk medicines are often the only therapeutics available to them. In this vein, North East (NE) India is recognized for its rich bioresources. As part of the Indo-Burma hotspot, it is regarded as an epicenter of biodiversity for several plants having myriad traditional uses, including medicinal use. However, the improvement of these valuable bioresources through molecular breeding strategies, for example, using genic microsatellites or Simple Sequence Repeats (SSRs) or Expressed Sequence Tags (ESTs)-derived SSRs, has not been fully utilized on a large scale to date. In this study, we identified a total of 47,700 microsatellites from 109,609 ESTs of 11 medicinal plants (pineapple, papaya, noyontara, bitter orange, bermuda grass, ratalu, barbados nut, mango, mulberry, lotus, and guduchi) having proven antidiabetic properties. A total of 58,159 primer pairs were designed for the non-redundant 8060 SSR-positive ESTs and putative functions were assigned to 4483 unique contigs. Among the identified microsatellites, excluding mononucleotide repeats, di-/trinucleotides are predominant, among which repeat motifs of AG/CT and AAG/CTT were most abundant. Similarity search of SSR-containing ESTs and antidiabetic gene sequences revealed 11 microsatellites linked to antidiabetic genes in five plants. GO term enrichment analysis revealed a total of 80 enriched GO terms widely distributed in 53 biological processes, 17 molecular functions, and 10 cellular components associated with the 11 markers. The present study therefore provides concrete insights into the frequency and distribution of SSRs in important medicinal resources. The microsatellite markers reported here markedly add to the genetic stock for cross transferability in these plants and the literature on biomarkers and novel drug discovery for common chronic diseases such as diabetes.
Affiliation(s)
- Jagajjit Sahu
- Agri-Bioinformatics Promotion Programme, Department of Agricultural Biotechnology, Assam Agricultural University, Assam, India
| | | | | | | | | | | | | |
|
31
|
The analysis of microsatellites and compound microsatellites in 56 complete genomes of Herpesvirales. Gene 2014; 551:103-9. [DOI: 10.1016/j.gene.2014.08.054] [Citation(s) in RCA: 12] [Impact Index Per Article: 1.2] [Reference Citation Analysis] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Received: 06/07/2014] [Revised: 08/09/2014] [Accepted: 08/26/2014] [Indexed: 01/13/2023]
|
32
|
Alam CM, Singh AK, Sharfuddin C, Ali S. In silico exploration of thirty alphavirus genomes for analysis of the simple sequence repeats. Meta Gene 2014; 2:694-705. [PMID: 25606453 PMCID: PMC4287844 DOI: 10.1016/j.mgene.2014.09.005] [Citation(s) in RCA: 7] [Impact Index Per Article: 0.7] [Reference Citation Analysis] [Abstract] [Key Words] [Track Full Text] [Download PDF] [Figures] [Journal Information] [Subscribe] [Scholar Register] [Received: 06/17/2014] [Revised: 09/08/2014] [Accepted: 09/10/2014] [Indexed: 11/29/2022] Open
Abstract
The compilation of simple sequence repeats (SSRs) in viruses and its analysis with reference to incidence, distribution and variation would be instrumental in understanding the functional and evolutionary aspects of repeat sequences. The present study encompasses the analysis of SSRs across 30 species of alphaviruses. The full-length genome sequences, accessed from NCBI, were used for extraction and analysis of repeat sequences using IMEx software. The repeats of different motif sizes (mono- to penta-nucleotide) observed therein exhibited variable incidence across the species. Expectedly, mononucleotide A/T was the most prevalent, followed by dinucleotide AG/GA and trinucleotide AAG/GAA in these genomes. The conversion of SSRs to imperfect microsatellites or compound microsatellites (cSSRs) is low. cSSRs, primarily constituted by variant motifs, accounted for up to 12.5% of the SSRs. Interestingly, seven species lacked cSSRs in their genomes. However, the SSRs and cSSRs are predominantly localized to the coding region ORFs for non-structural and structural proteins. The relative frequencies of different classes of simple and compound microsatellites within and across genomes have been highlighted. This is the first analysis of SSRs and cSSRs in alphaviruses. We analysed differential frequency and distribution patterns of SSRs and cSSRs. We studied localization of SSRs and cSSRs in alphavirus proteomics. This study would help in better understanding the evolutionary biology of alphaviruses.
Affiliation(s)
| | - Avadhesh Kumar Singh
- Department of Biomedical Sciences, SRCASW, University of Delhi, Vasundhara Enclave, New Delhi 110096, India
| | | | - Safdar Ali
- Department of Biomedical Sciences, SRCASW, University of Delhi, Vasundhara Enclave, New Delhi 110096, India
| |
|
33
|
George B, Gnanasekaran P, Jain SK, Chakraborty S. Genome wide survey and analysis of small repetitive sequences in caulimoviruses. INFECTION GENETICS AND EVOLUTION 2014; 27:15-24. [PMID: 24999243 DOI: 10.1016/j.meegid.2014.06.018] [Citation(s) in RCA: 5] [Impact Index Per Article: 0.5] [Reference Citation Analysis] [Abstract] [Key Words] [Track Full Text] [Subscribe] [Scholar Register] [Received: 02/06/2014] [Revised: 06/01/2014] [Accepted: 06/22/2014] [Indexed: 12/19/2022]
Abstract
Microsatellites are known to exhibit ubiquitous presence across all kingdoms of life including viruses. Members of the Caulimoviridae family severely affect growth of vegetable and fruit plants and reduce economic yield in diverse cropping systems worldwide. Here, we analyzed the nature and distribution of both simple and complex microsatellites present in complete genome of 44 species of Caulimoviridae. Our results showed, in all analyzed genomes, genome size and GC content had a weak influence on number, relative abundance and relative density of microsatellites, respectively. For each genome, mono- and dinucleotide repeats were found to be highly predominant and are overrepresented in genome of majority of caulimoviruses. AT/TA and GAA/AAG/AGA was the most abundant di- and trinucleotide repeat motif, respectively. Repeats larger than trinucleotide were rarely found in these genomes. Comparative study of occurrence, abundance and density of microsatellite among available RNA and DNA viral genomes indicated that simple repeats were least abundant in genomes of caulimoviruses. Polymorphic repeats even though rare were observed in the large intergenic region of the genome, indicating strand slippage and/or unequal recombination processes do occur in caulimoviruses. To our knowledge, this is the first analysis of microsatellites occurring in any dsDNA viral genome. Characterization of such variations in repeat sequences would be important in deciphering the origin, mutational processes, and role of repeat sequences in viral genomes.
Affiliation(s)
- Biju George
- Molecular Virology Laboratory, School of Life Sciences, Jawaharlal Nehru University, New Delhi 110067, India
- Prabu Gnanasekaran
- Molecular Virology Laboratory, School of Life Sciences, Jawaharlal Nehru University, New Delhi 110067, India
- S K Jain
- Department of Biotechnology, Jamia Hamdard University, New Delhi, Delhi 110062, India
- Supriya Chakraborty
- Molecular Virology Laboratory, School of Life Sciences, Jawaharlal Nehru University, New Delhi 110067, India
|
34
|
Brittain A, Stroebele E, Erives A. Microsatellite repeat instability fuels evolution of embryonic enhancers in Hawaiian Drosophila. PLoS One 2014; 9:e101177. [PMID: 24978198 PMCID: PMC4076327 DOI: 10.1371/journal.pone.0101177] [Citation(s) in RCA: 9] [Impact Index Per Article: 0.9] [Received: 04/02/2014] [Accepted: 06/03/2014] [Indexed: 12/16/2022] Open
Abstract
For ∼30 million years, the eggs of Hawaiian Drosophila were laid in ever-changing environments caused by high rates of island formation. The associated diversification of the size and developmental rate of the syncytial fly embryo would have altered morphogenic gradients, thus necessitating frequent evolutionary compensation of transcriptional responses. We investigate the consequences these radiations had on transcriptional enhancers patterning the embryo to see whether their pattern of molecular evolution is different from non-Hawaiian species. We identify and functionally assay in transgenic D. melanogaster the Neurogenic Ectoderm Enhancers from two different Hawaiian Drosophila groups: (i) the picture wing group, and (ii) the modified mouthparts group. We find that the binding sites in this set of well-characterized enhancers are footprinted by diverse microsatellite repeat (MSR) sequences. We further show that Hawaiian embryonic enhancers in general are enriched in MSR relative to both Hawaiian non-embryonic enhancers and non-Hawaiian embryonic enhancers. We propose embryonic enhancers are sensitive to Activator spacing because they often serve as assembly scaffolds for the aggregation of transcription factor activator complexes. Furthermore, as most indels are produced by microsatellite repeat slippage, enhancers from Hawaiian Drosophila lineages, which experience dynamic evolutionary pressures, would become grossly enriched in MSR content.
Affiliation(s)
- Andrew Brittain
- Department of Biology, University of Iowa, Iowa City, Iowa, United States of America
- Elizabeth Stroebele
- Department of Biology, University of Iowa, Iowa City, Iowa, United States of America
- Albert Erives
- Department of Biology, University of Iowa, Iowa City, Iowa, United States of America
|
35
|
Singh AK, Alam CM, Sharfuddin C, Ali S. Frequency and distribution of simple and compound microsatellites in forty-eight Human papillomavirus (HPV) genomes. INFECTION GENETICS AND EVOLUTION 2014; 24:92-8. [PMID: 24662441 DOI: 10.1016/j.meegid.2014.03.010] [Citation(s) in RCA: 13] [Impact Index Per Article: 1.3] [Received: 01/18/2014] [Revised: 03/02/2014] [Accepted: 03/12/2014] [Indexed: 12/14/2022]
Abstract
Simple sequence repeats (SSRs) are tandem-repeated sequences ubiquitously present but differentially distributed across genomes. Present study is a systematic analysis for incidence, composition and complexity of different microsatellites in 48 representative Human papillomavirus (HPV) genomes. The analysis revealed a total of 1868 SSRs and 120 cSSRs. However, four genomes (HPV-60, HPV-92, HPV-112 and HPV-136) lacked any cSSR content; while HPV-31 accounted for a maximum of 10 cSSRs. An overall increase in cSSR% with higher dMAX was observed. The SSRs and cSSRs were prevalent in coding regions. Poly(A/T) repeats were significantly more abundant than poly(G/C) repeats possibly due to high (A/T) content of the HPV genomes. Further, higher prevalence of di-nucleotide repeats over tri-nucleotide repeats may be attributed to instability of former because of higher slippage rate. An in-depth study of the satellite sequences would provide an insight into the imperfections and evolution of microsatellites.
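The abstract does not define dMAX. The sketch below assumes the meaning commonly used in compound-microsatellite surveys: the maximum number of bases allowed between two adjacent SSRs for them to be counted as one compound microsatellite (cSSR). The function name `group_compound` and the coordinates are hypothetical and are not taken from the study.

```python
def group_compound(ssrs, dmax):
    """Group adjacent SSRs into compound microsatellites (cSSRs).

    ssrs: list of (start, end) half-open intervals, sorted by start.
    dmax: assumed maximum gap (in bases) between neighbouring SSRs.
    Returns a list of groups; each group holds two or more SSRs.
    """
    groups, current = [], []
    for ssr in ssrs:
        # Extend the current group while the gap to the previous SSR is small enough.
        if current and ssr[0] - current[-1][1] <= dmax:
            current.append(ssr)
        else:
            # A lone SSR is not compound, so only keep groups of size >= 2.
            if len(current) > 1:
                groups.append(current)
            current = [ssr]
    if len(current) > 1:
        groups.append(current)
    return groups


# Illustrative coordinates only (gaps of 3, 33 and 8 bases).
example = [(0, 12), (15, 27), (60, 72), (80, 92)]
for dmax in (0, 10, 50):
    grouped = group_compound(example, dmax)
    in_cssr = sum(len(g) for g in grouped)
    print(dmax, len(grouped), in_cssr)
```

Under this assumption, a larger dMAX lets more SSRs fall inside compound tracts, although neighbouring cSSRs may also merge into a single longer one.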
Affiliation(s)
- Avadhesh Kumar Singh
- Department of Biomedical Sciences, SRCASW, University of Delhi, Vasundhara Enclave, New Delhi 110096, India
- Safdar Ali
- Department of Biomedical Sciences, SRCASW, University of Delhi, Vasundhara Enclave, New Delhi 110096, India
|
36
|
Alam CM, Singh AK, Sharfuddin C, Ali S. Incidence, complexity and diversity of simple sequence repeats across potexvirus genomes. Gene 2014; 537:189-96. [DOI: 10.1016/j.gene.2014.01.007] [Citation(s) in RCA: 16] [Impact Index Per Article: 1.6] [Received: 09/17/2013] [Revised: 11/15/2013] [Accepted: 01/04/2014] [Indexed: 01/18/2023]
|
37
|
Genome-wide scan for analysis of simple and imperfect microsatellites in diverse carlaviruses. INFECTION GENETICS AND EVOLUTION 2014; 21:287-94. [DOI: 10.1016/j.meegid.2013.11.018] [Citation(s) in RCA: 29] [Impact Index Per Article: 2.9] [Received: 08/19/2013] [Revised: 11/15/2013] [Accepted: 11/21/2013] [Indexed: 01/08/2023]
|
38
|
Behura SK, Severson DW. Association of microsatellite pairs with segmental duplications in insect genomes. BMC Genomics 2013; 14:907. [PMID: 24359442 PMCID: PMC3878106 DOI: 10.1186/1471-2164-14-907] [Citation(s) in RCA: 5] [Impact Index Per Article: 0.5] [Received: 07/17/2013] [Accepted: 12/16/2013] [Indexed: 11/30/2022] Open
Abstract
Background: Segmental duplications (SDs), also known as low-copy repeats, are DNA sequences of length greater than 1 kb which are duplicated with a high degree of sequence identity (greater than 90%) causing instability in genomes. SDs are generally found in the genome as mosaic forms of duplicated sequences which are generated by a two-step process: first, multiple duplicated sequences are aggregated at specific genomic regions, and then, these primary duplications undergo multiple secondary duplications. However, the mechanism of how duplicated sequences are aggregated in the first place is not well understood.
Results: By analyzing the distribution of microsatellite sequences among twenty insect species in a genome-wide manner it was found that pairs of microsatellites along with the intervening sequences were duplicated multiple times in each genome. They were found as low copy repeats or segmental duplications when the duplicated loci were greater than 1 kb in length and had greater than 90% sequence similarity. By performing a sliding-window genomic analysis for number of paired microsatellites and number of segmental duplications, it was observed that regions rich in repetitive paired microsatellites tend to get richer in segmental duplication suggesting a “rich-gets-richer” mode of aggregation of the duplicated loci in specific regions of the genome. Results further show that the relationship between number of paired microsatellites and segmental duplications among the species is independent of the known phylogeny suggesting that association of microsatellites with segmental duplications may be a species-specific evolutionary process. It was also observed that the repetitive microsatellite pairs are associated with gene duplications but those sequences are rarely retained in the orthologous genes between species. Although some of the duplicated sequences with microsatellites as termini were found within transposable elements (TEs) of Drosophila, most of the duplications are found in the TE-free and gene-free regions of the genome.
Conclusion: The study clearly suggests that microsatellites are instrumental in extensive sequence duplications that may contribute to species-specific evolution of genome plasticity in insects.
Affiliation(s)
- Susanta K Behura
- Eck Institute for Global Health, Department of Biological Sciences, University of Notre Dame, Notre Dame, IN 46556, USA
|
39
|
In-silico analysis of simple and imperfect microsatellites in diverse tobamovirus genomes. Gene 2013; 530:193-200. [DOI: 10.1016/j.gene.2013.08.046] [Citation(s) in RCA: 18] [Impact Index Per Article: 1.6] [Received: 05/13/2013] [Revised: 08/10/2013] [Accepted: 08/13/2013] [Indexed: 11/20/2022]
|
40
|
Dickey AM, Hall PM, Shatters RG, Mckenzie CL. Evolution and homoplasy at the Bem6 microsatellite locus in three sweetpotato whitefly (Bemisia tabaci) cryptic species. BMC Res Notes 2013; 6:249. [PMID: 23819589 PMCID: PMC3716913 DOI: 10.1186/1756-0500-6-249] [Citation(s) in RCA: 6] [Impact Index Per Article: 0.5] [Received: 03/21/2013] [Accepted: 06/26/2013] [Indexed: 11/19/2022] Open
Abstract
Background: The evolution of individual microsatellite loci is often complex and homoplasy is common but often goes undetected. Sequencing alleles at a microsatellite locus can provide a more complete picture of the common evolutionary mechanisms occurring at that locus and can reveal cases of homoplasy. Within species, homoplasy can lead to an underestimate of differentiation among populations, and among species, homoplasy can produce a misleading interpretation regarding shared alleles and hybridization. This is especially problematic with cryptic species.
Results: By sequencing alleles from three cryptic species of the sweetpotato whitefly (Bemisia tabaci), designated MEAM1, MED, and NW, the evolution of the putatively dinucleotide Bem6 (CA₈)imp microsatellite locus is inferred as one of primarily stepwise mutation occurring at four distinct heptanucleotide tandem repeats. In two of the species this pattern yields a compound tandem repeat. Homoplasy was detected both among species and within species.
Conclusions: In the absence of sequencing, size homoplasious alleles at the Bem6 locus lead to an overestimate of alleles shared and hybridization among cryptic species of Bemisia tabaci. Furthermore, the compound heptanucleotide motif structure of a putative dinucleotide microsatellite has implications for the nomenclature of heptanucleotide tandem repeats with step-wise evolution.
Affiliation(s)
- Aaron M Dickey
- USDA-ARS, U.S. Horticultural Research Laboratory, 2001 South Rock Rd, Fort Pierce, FL 34945, USA
- Current address: Mid-Florida Research & Education Center, University of Florida, 2725 Binion Rd, Apopka, FL 32703, USA
- Paula M Hall
- Mid-Florida Research & Education Center, University of Florida, 2725 Binion Rd, Apopka, FL 32703, USA
- Robert G Shatters
- USDA-ARS, U.S. Horticultural Research Laboratory, 2001 South Rock Rd, Fort Pierce, FL 34945, USA
- Cindy L Mckenzie
- USDA-ARS, U.S. Horticultural Research Laboratory, 2001 South Rock Rd, Fort Pierce, FL 34945, USA
|
41
|
Alam CM, George B, Sharfuddin C, Jain S, Chakraborty S. Occurrence and analysis of imperfect microsatellites in diverse potyvirus genomes. Gene 2013; 521:238-44. [DOI: 10.1016/j.gene.2013.02.045] [Citation(s) in RCA: 18] [Impact Index Per Article: 1.6] [Received: 09/06/2012] [Revised: 02/14/2013] [Accepted: 02/19/2013] [Indexed: 12/30/2022]
|
42
|
Chen M, Tan Z, Zeng G, Zeng Z. Differential distribution of compound microsatellites in various Human Immunodeficiency Virus Type 1 complete genomes. INFECTION GENETICS AND EVOLUTION 2012; 12:1452-7. [DOI: 10.1016/j.meegid.2012.05.006] [Citation(s) in RCA: 27] [Impact Index Per Article: 2.3] [Received: 08/23/2011] [Revised: 05/04/2012] [Accepted: 05/12/2012] [Indexed: 12/21/2022]
|
43
|
Meglécz E, Nève G, Biffin E, Gardner MG. Breakdown of phylogenetic signal: a survey of microsatellite densities in 454 shotgun sequences from 154 non model eukaryote species. PLoS One 2012; 7:e40861. [PMID: 22815847 PMCID: PMC3397955 DOI: 10.1371/journal.pone.0040861] [Citation(s) in RCA: 39] [Impact Index Per Article: 3.3] [Received: 03/19/2012] [Accepted: 06/14/2012] [Indexed: 11/19/2022] Open
Abstract
Microsatellites are ubiquitous in Eukaryotic genomes. A more complete understanding of their origin and spread can be gained from a comparison of their distribution within a phylogenetic context. Although information for model species is accumulating rapidly, it is insufficient due to a lack of species depth, thus intragroup variation is necessarily ignored. As such, apparent differences between groups may be overinflated and generalizations cannot be inferred until an analysis of the variation that exists within groups has been conducted. In this study, we examined microsatellite coverage and motif patterns from 454 shotgun sequences of 154 Eukaryote species from eight distantly related phyla (Cnidaria, Arthropoda, Onychophora, Bryozoa, Mollusca, Echinodermata, Chordata and Streptophyta) to test if a consistent phylogenetic pattern emerges from the microsatellite composition of these species. It is clear from our results that data from model species provide incomplete information regarding the existing microsatellite variability within the Eukaryotes. A very strong heterogeneity of microsatellite composition was found within most phyla, classes and even orders. Autocorrelation analyses indicated that while microsatellite contents of species within clades more recent than 200 Mya tend to be similar, the autocorrelation breaks down and becomes negative or non-significant with increasing divergence time. Therefore, the age of the taxon seems to be a primary factor in degrading the phylogenetic pattern present among related groups. The most recent classes or orders of Chordates still retain the pattern of their common ancestor. However, within older groups, such as classes of Arthropods, the phylogenetic pattern has been scrambled by the long independent evolution of the lineages.
Affiliation(s)
- Emese Meglécz
- IMBE UMR 7263 CNRS IRD, Aix-Marseille University, Marseille, France
|
45
|
Chen M, Zeng G, Tan Z, Jiang M, Zhang J, Zhang C, Lu L, Lin Y, Peng J. Compound microsatellites in complete Escherichia coli genomes. FEBS Lett 2011; 585:1072-6. [DOI: 10.1016/j.febslet.2011.03.005] [Citation(s) in RCA: 30] [Impact Index Per Article: 2.3] [Received: 01/04/2011] [Revised: 02/28/2011] [Accepted: 03/02/2011] [Indexed: 01/09/2023]
|
46
|
Mudunuri SB, Kumar P, Rao AA, Pallamsetty S, Nagarajaram HA. G-IMEx: A comprehensive software tool for detection of microsatellites from genome sequences. Bioinformation 2010; 5:221-3. [PMID: 21364802 PMCID: PMC3040503 DOI: 10.6026/97320630005221] [Citation(s) in RCA: 10] [Impact Index Per Article: 0.7] [Received: 06/28/2010] [Accepted: 08/25/2010] [Indexed: 01/03/2023] Open
Abstract
Microsatellites are ubiquitous short tandem repeats found in all known genomes and are known to play a very important role in various studies and fields, including DNA fingerprinting, paternity studies, evolutionary studies, and the virulence and adaptation of certain bacteria and viruses. Due to the sequencing of several genomes and the availability of enormous amounts of sequence data during the past few years, computational studies of microsatellites are of interest to many researchers. In this context, we developed a software tool called Imperfect Microsatellite Extractor (IMEx) to extract perfect, imperfect and compound microsatellites from genome sequences along with their complete statistics. Recently we developed a user-friendly graphical interface in Java for IMEx, to be used as stand-alone software named G-IMEx. G-IMEx takes a nucleotide sequence as input and produces results in both HTML and text formats. The Linux version of G-IMEx can be downloaded for free from http://www.cdfd.org.in/imex.
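As a rough illustration of what extracting perfect microsatellites involves, the following minimal Python sketch scans a sequence with regular expressions. It is not the IMEx or G-IMEx algorithm; the function name `find_perfect_ssrs` and the minimum repeat thresholds in `MIN_REPEATS` are hypothetical choices made for this example, and imperfect or compound repeats are not handled.

```python
import re

# Hypothetical minimum copy numbers per motif length (1 to 6 bases); not IMEx defaults.
MIN_REPEATS = {1: 10, 2: 6, 3: 5, 4: 5, 5: 5, 6: 5}


def find_perfect_ssrs(seq, min_repeats=MIN_REPEATS):
    """Return sorted (start, end, motif, copies) tuples for perfect tandem repeats.

    Coordinates are 0-based and half-open.
    """
    seq = seq.upper()
    hits = []
    for k, n in sorted(min_repeats.items()):
        # Group 1 captures a k-base motif; the backreference requires n-1 or more extra copies.
        pattern = re.compile(r"([ACGT]{%d})\1{%d,}" % (k, n - 1))
        for m in pattern.finditer(seq):
            motif = m.group(1)
            # Skip motifs that are repeats of a shorter unit (e.g. "AA", "ATAT"),
            # since those tracts belong to the shorter motif class.
            if any(motif == motif[:d] * (k // d) for d in range(1, k) if k % d == 0):
                continue
            hits.append((m.start(), m.end(), motif, (m.end() - m.start()) // k))
    return sorted(hits)


# Toy sequence: a 7-copy AC repeat and a 12-base poly(A) tract.
toy = "GG" + "AC" * 7 + "TT" + "A" * 12 + "CGT"
for start, end, motif, copies in find_perfect_ssrs(toy):
    print(start, end, motif, copies)
```

The thresholds are the main design choice here: lowering them reports more, shorter tracts, so counts from different surveys are only comparable when the same minimum repeat numbers are used.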
Affiliation(s)
- Suresh B Mudunuri
- Department of Computer Science and Engineering, Aditya Engineering College (AEC), Surampalem 533 437, India
- Pankaj Kumar
- Laboratory of Computational Biology, Centre for DNA Fingerprinting and Diagnostics (CDFD), Hyderabad 500 001, India
- Allam Appa Rao
- Jawaharlal Nehru Technological University (JNTU), Kakinada, 533 003, India
- S Pallamsetty
- Department of Computer Science and Systems Engineering, Andhra University College of Engineering (AUCE), Visakhapatnam 530 003, India
- H A Nagarajaram
- Laboratory of Computational Biology, Centre for DNA Fingerprinting and Diagnostics (CDFD), Hyderabad 500 001, India
|
47
|
Macdonald AJ, Sarre SD, Fitzsimmons NN, Aitken N. Determining microsatellite genotyping reliability and mutation detection ability: an approach using small-pool PCR from sperm DNA. Mol Genet Genomics 2010; 285:1-18. [PMID: 20957392 DOI: 10.1007/s00438-010-0577-9] [Citation(s) in RCA: 10] [Impact Index Per Article: 0.7] [Received: 05/31/2010] [Accepted: 09/10/2010] [Indexed: 11/26/2022]
Abstract
Microsatellite genotyping from trace DNA is now common in fields as diverse as medicine, forensics and wildlife genetics. Conversely, small-pool PCR (SP-PCR) has been used to investigate microsatellite mutation mechanisms in human DNA, but has had only limited application to non-human species. Trace DNA and SP-PCR studies share many challenges, including problems associated with allelic drop-out, false alleles and other PCR artefacts, and the need to reliably identify genuine alleles and/or mutations. We provide a framework for the validation of such studies without a multiple tube approach and demonstrate the utility of that approach with an analysis of microsatellite mutations in the tammar wallaby (Macropus eugenii). Specifically, we amplified three autosomal microsatellites from somatic DNA to characterise efficiency and reliability of PCR from low-template DNA. Reconstruction experiments determined our ability to discriminate mutations from parental alleles. We then developed rules to guide data interpretation. We estimated mutation rates in sperm DNA to range from 1.5 × 10⁻² to 2.2 × 10⁻³ mutations per locus per generation. Large multi-step mutations were observed, providing evidence for complex mutation processes at microsatellites and potentially violating key assumptions in the stepwise mutation model. Our data demonstrate the necessity of actively searching for large mutation events when investigating microsatellite evolution and highlight the need for a thorough understanding of microsatellite amplification characteristics before embarking on SP-PCR or trace DNA studies.
Affiliation(s)
- Anna J Macdonald
- Institute for Applied Ecology, University of Canberra, Canberra, ACT 2601, Australia
|
48
|
Buschiazzo E, Gemmell NJ. Conservation of human microsatellites across 450 million years of evolution. Genome Biol Evol 2010; 2:153-65. [PMID: 20333231 PMCID: PMC2839350 DOI: 10.1093/gbe/evq007] [Citation(s) in RCA: 35] [Impact Index Per Article: 2.5] [Accepted: 02/02/2010] [Indexed: 11/21/2022] Open
Abstract
The sequencing and comparison of vertebrate genomes have enabled the identification of widely conserved genomic elements. Chief among these are genes and cis-regulatory regions, which are often under selective constraints that promote their retention in related organisms. The conservation of elements that either lack function or whose functions are yet to be ascribed has been relatively little investigated. In particular, microsatellites, a class of highly polymorphic repetitive sequences considered by most to be neutrally evolving junk DNA that is too labile to be maintained in distant species, have not been comprehensively studied in a comparative genomic framework. Here, we used the UCSC alignment of the human genome against those of 11 mammalian and five nonmammalian vertebrates to identify and examine the extent of conservation of human microsatellites in vertebrate genomes. Out of 696,016 microsatellites found in human sequences, 85.39% were conserved in at least one other species, whereas 28.65% and 5.98% were found in at least one and three nonprimate species, respectively. An exponential decline of microsatellite conservation with increasing evolutionary time, a comparable distribution of conserved versus nonconserved microsatellites in the human genome, and a positive correlation between microsatellite conservation and overall sequence conservation, all suggest that most microsatellites are only maintained in genomes by chance, although exceptionally conserved human microsatellites were also found in distant mammals and other vertebrates. Our findings provide the first comprehensive survey of microsatellite conservation across deep evolutionary timescales, in this case 450 Myr of vertebrate evolution, and provide new tools for the identification of functional conserved microsatellites, the development of cross-species microsatellite markers and the study of microsatellite evolution above the species level.
Affiliation(s)
- Emmanuel Buschiazzo
- School of Biological Sciences, University of Canterbury, Christchurch, New Zealand
|