101
Distinct patterns of simple sequence repeats and GC distribution in intragenic and intergenic regions of primate genomes. Aging (Albany NY) 2017; 8:2635-2654. [PMID: 27644032 PMCID: PMC5191860 DOI: 10.18632/aging.101025] [Citation(s) in RCA: 20] [Impact Index Per Article: 2.5] [Received: 06/08/2016] [Accepted: 08/22/2016] [Indexed: 01/23/2023]
Abstract
As the first systematic examination of simple sequence repeats (SSRs) and guanine-cytosine (GC) distribution in intragenic and intergenic regions of ten primates, our study showed that SSRs and GC content were nonrandomly distributed in both intragenic and intergenic regions, suggesting potential roles in transcriptional or translational regulation. Our results suggest that the majority of SSRs are located in non-coding regions, such as introns, TEs, and intergenic regions. In these primates, trinucleotide perfect (P) SSRs were the most abundant repeat type in the 5'UTRs and CDSs, whereas mononucleotide P-SSRs were the most abundant in the introns, 3'UTRs, TEs, and intergenic regions. GC content varied greatly among intragenic and intergenic regions (5'UTRs > CDSs > 3'UTRs > TEs > introns > intergenic regions), and high GC content was frequently found in exon-rich regions. Within the same intragenic or intergenic region, the distribution of GC content was highly similar across the different primates. Tri- and hexanucleotide P-SSRs had the highest GC content in the 5'UTRs and CDSs, whereas mononucleotide P-SSRs had the lowest GC content in all six genomic regions of these primates. The most frequent motifs of each length varied markedly among the genomic regions.
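The per-region GC comparison in this abstract can be illustrated with a minimal sketch. It assumes GC content is defined as (G + C) / (A + C + G + T), with ambiguous bases such as N ignored; the abstract does not state the exact definition used. The function name `gc_content` and the toy sequences are hypothetical and are not data from the study.

```python
def gc_content(seq: str) -> float:
    """Return the GC fraction of a DNA sequence, ignoring non-ACGT symbols."""
    seq = seq.upper()
    gc = seq.count("G") + seq.count("C")
    at = seq.count("A") + seq.count("T")
    total = gc + at
    # An empty or fully ambiguous sequence has no defined GC content; report 0.0.
    return gc / total if total else 0.0


# Hypothetical toy sequences standing in for annotated genomic regions.
regions = {"5'UTR": "GCGCCGGA", "intron": "ATTTAGCA"}

for name, seq in regions.items():
    print(f"{name}: GC = {gc_content(seq):.3f}")
```

In a real analysis the region sequences would come from a genome annotation rather than from literals, and the same function would be applied to each region class in turn.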
102
Mahfooz S, Singh SP, Mishra N, Mishra A. A Comparison of Microsatellites in Phytopathogenic Aspergillus Species in Order to Develop Markers for the Assessment of Genetic Diversity among Its Isolates. Front Microbiol 2017; 8:1774. [PMID: 28979242 PMCID: PMC5611378 DOI: 10.3389/fmicb.2017.01774] [Citation(s) in RCA: 12] [Impact Index Per Article: 1.5] [Received: 06/22/2017] [Accepted: 08/31/2017] [Indexed: 11/17/2022] Open
Abstract
Microsatellites (SSRs) occur in most fungal genomes; however, their abundance varies across species. In the present study, we analyzed the frequency of SSRs in the whole genomes and transcripts of two phytopathogenic species (Aspergillus niger and Aspergillus terreus) and compared them with two non-pathogenic species (Aspergillus nidulans and Aspergillus oryzae). Higher relative abundance and relative density of SSRs were observed in the whole genome and transcript sequences of the pathogenic Aspergillus when compared to the non-pathogenic. The relative abundance and density of SSRs were positively correlated with the G+C content of transcripts. Among the different classes of SSRs, tetra-nucleotide SSRs made up the largest percentage in A. niger (36.7%) and A. oryzae (35.9%), whereas tri-nucleotide SSRs predominated in A. nidulans and A. terreus (38.2 and 42.1%, respectively) in whole genome sequences. In transcripts, tri-nucleotide SSRs were the most abundant whereas di-nucleotide SSRs were the least favored. A motif conservation study among the transcripts revealed that only 27% of motifs were conserved within Aspergillus species. Furthermore, a similar relationship among the Ascomycetes was obtained on the basis of motif conservation and conserved genes (rDNA). To analyze the diversity present within the Indian isolates of Aspergillus, primers were successfully designed for 692 motifs in A. niger and A. terreus, of which 20 were selected for diversity analysis. Among all the markers amplified, 10 markers (83.3%) were polymorphic, whereas the remaining two markers (16.6%) were monomorphic. The ten polymorphic markers obtained in this investigation demonstrate the utility of the newly developed SSR markers for assessing genetic diversity among isolates of Aspergillus.
Affiliation(s)
- Aradhana Mishra
- Division of Plant Microbe Interaction, CSIR-National Botanical Research Institute, Lucknow, India
103
Bagshaw AT. Functional Mechanisms of Microsatellite DNA in Eukaryotic Genomes. Genome Biol Evol 2017; 9:2428-2443. [PMID: 28957459 PMCID: PMC5622345 DOI: 10.1093/gbe/evx164] [Citation(s) in RCA: 77] [Impact Index Per Article: 9.6] [Accepted: 08/23/2017] [Indexed: 02/06/2023] Open
Abstract
Microsatellite repeat DNA is best known for its length mutability, which is implicated in several neurological diseases and cancers, and often exploited as a genetic marker. Less well-known is the body of work exploring the widespread and surprisingly diverse functional roles of microsatellites. Recently, emerging evidence includes the finding that normal microsatellite polymorphism contributes substantially to the heritability of human gene expression on a genome-wide scale, calling attention to the task of elucidating the mechanisms involved. At present, these are underexplored, but several themes have emerged. I review evidence demonstrating roles for microsatellites in modulation of transcription factor binding, spacing between promoter elements, enhancers, cytosine methylation, alternative splicing, mRNA stability, selection of transcription start and termination sites, unusual structural conformations, nucleosome positioning and modification, higher order chromatin structure, noncoding RNA, and meiotic recombination hot spots.
104
Cao J, Wu L, Jin M, Li T, Hui K, Ren Q. Transcriptome profiling of the Macrobrachium rosenbergii lymphoid organ under the white spot syndrome virus challenge. Fish & Shellfish Immunology 2017; 67:27-39. [PMID: 28554835 DOI: 10.1016/j.fsi.2017.05.059] [Citation(s) in RCA: 12] [Impact Index Per Article: 1.5] [Received: 12/25/2016] [Revised: 05/23/2017] [Accepted: 05/25/2017] [Indexed: 06/07/2023]
Abstract
Macrobrachium rosenbergii is an economically important crustacean, and adult prawns are generally thought to be tolerant to white spot syndrome virus (WSSV) infection. Although certain genes are known to respond to WSSV infection and the lymphoid organ is an important immune organ, the response of the lymphoid organ to WSSV infection is unclear. Next-generation sequencing was employed in this study to determine the transcriptome differences between WSSV-infected and mock-infected lymphoid organs. A total of 44,606,694 and 40,384,856 clean reads were generated and assembled into 73,658 and 72,374 unigenes from the control sample and the WSSV infection sample, respectively. Based on homology searches and KEGG, GO, and COG analysis, 21,323 unigenes were annotated. Among them, 4951 differentially expressed genes were identified and categorized into 244 metabolic pathways. Coagulation cascades and pattern recognition receptor signaling pathways were used as examples to discuss the host response to WSSV infection. We also identified 12,308 simple sequence repeats, which can be further used as functional markers. The results contribute to a better understanding of the immune response of the prawn lymphoid organ to WSSV and provide information for identifying novel genes in the absence of the prawn genome.
Affiliation(s)
- Jun Cao
- Institute of Life Sciences, Jiangsu University, Zhenjiang, Jiangsu, People's Republic of China
- Lei Wu
- Jiangsu Key Laboratory for Biodiversity and Biotechnology, Jiangsu Key Laboratory for Aquatic Crustacean Diseases, College of Life Sciences, Nanjing Normal University, Nanjing 210046, People's Republic of China
- Min Jin
- State Key Laboratory Breeding Base of Marine Genetic Resource, Third Institute of Oceanography, SOA, Xiamen 361005, People's Republic of China
- Tingting Li
- Jiangsu Key Laboratory for Biodiversity and Biotechnology, Jiangsu Key Laboratory for Aquatic Crustacean Diseases, College of Life Sciences, Nanjing Normal University, Nanjing 210046, People's Republic of China
- Kaimin Hui
- Jiangsu Key Laboratory for Biodiversity and Biotechnology, Jiangsu Key Laboratory for Aquatic Crustacean Diseases, College of Life Sciences, Nanjing Normal University, Nanjing 210046, People's Republic of China
- Qian Ren
- Jiangsu Key Laboratory for Biodiversity and Biotechnology, Jiangsu Key Laboratory for Aquatic Crustacean Diseases, College of Life Sciences, Nanjing Normal University, Nanjing 210046, People's Republic of China; Co-Innovation Center for Marine Bio-Industry Technology of Jiangsu Province, Lianyungang, People's Republic of China
105
Abstract
Microsatellites or simple sequence repeats (SSRs) are found in most organisms and play an important role in genomic organization and function. To characterize the abundance of SSRs (1-6 base-pairs [bp]) on the cattle Y chromosome, the relative frequency and density of perfect or uninterrupted SSRs based on the published Y chromosome sequence were examined. A total of 17,273 perfect SSRs were found, with a total length of 324.78 kb, indicating that approximately 0.75% of the cattle Y chromosome sequence (43.30 Mb) comprises perfect SSRs, with an average length of 18.80 bp. The relative frequency and density were 398.92 loci/Mb and 7500.62 bp/Mb, respectively. The proportions of the six classes of perfect SSRs were highly variable on the cattle Y chromosome. Mononucleotide repeats had a total number of 8073 (46.74%) and an average length of 15.45 bp, and were the most abundant SSR class, while the percentages of di-, tetra-, tri-, penta-, and hexa-nucleotide repeats were 22.86%, 11.98%, 11.58%, 6.65%, and 0.19%, respectively. Different classes of SSRs varied in their repeat number, with the highest being 42 for dinucleotides. Results reveal that repeat categories A, AC, AT, AAC, AGC, GTTT, CTTT, ATTT, and AACTG predominate on the Y chromosome. This study provides insight into the organization of cattle Y chromosome repetitive DNA, as well as information useful for developing more polymorphic cattle Y-chromosome-specific SSRs.
Affiliation(s)
- Zhi-Jie Ma
- Academy of Animal Science and Veterinary Medicine, Qinghai University, Xining, Qinghai, China
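The summary statistics quoted in the cattle Y chromosome abstract follow from three numbers: the SSR count, the total SSR length, and the sequence length. The sketch below recomputes them under the assumption that relative frequency means loci per Mb and relative density means SSR base pairs per Mb; the function name `ssr_summary` and its dictionary keys are hypothetical.

```python
def ssr_summary(n_loci: int, total_ssr_bp: float, seq_bp: float) -> dict:
    """Summarise SSR abundance for one sequence."""
    seq_mb = seq_bp / 1e6
    return {
        "coverage_pct": 100 * total_ssr_bp / seq_bp,  # share of sequence in SSRs
        "mean_length_bp": total_ssr_bp / n_loci,      # average SSR length
        "loci_per_mb": n_loci / seq_mb,               # relative frequency
        "bp_per_mb": total_ssr_bp / seq_mb,           # relative density
    }


# Rounded values quoted in the abstract: 17,273 SSRs, 324.78 kb, 43.30 Mb.
stats = ssr_summary(n_loci=17_273, total_ssr_bp=324_780, seq_bp=43_300_000)

for key, value in stats.items():
    print(f"{key}: {value:.2f}")
```

Because the inputs are the rounded figures from the abstract, the recomputed frequency and density agree with the reported 398.92 loci/Mb and 7500.62 bp/Mb only to within rounding.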
106
Abstract
The instability of microsatellite DNA repeats is responsible for at least 40 neurodegenerative diseases. Recently, Mirkin and co-workers presented a novel mechanism for microsatellite expansions based on break-induced replication (BIR) at sites of microsatellite-induced replication stalling and fork collapse. The BIR model aims to explain single-step, large expansions of CAG/CTG trinucleotide repeats in dividing cells. BIR has been characterized extensively in Saccharomyces cerevisiae as a mechanism to repair broken DNA replication forks (single-ended DSBs) and degraded telomeric DNA. However, the structural footprints of BIR-like DSB repair have been recognized in human genomic instability and tied to the etiology of diverse developmental diseases; thus, the implications of the paper by Kim et al. (Kim JC, Harris ST, Dinter T, Shah KA, et al., Nat Struct Mol Biol 24: 55-60) extend beyond trinucleotide repeat expansion in yeast and microsatellite instability in human neurological disorders. Significantly, insight into BIR-like repair can explain certain pathways of complex genome rearrangements (CGRs) initiated at non-B form microsatellite DNA in human cancers.
Affiliation(s)
- Michael Leffak
- Department of Biochemistry and Molecular Biology, Boonshoft School of Medicine, Wright State University, Dayton, OH, USA
107
Li Z, Chen F, Huang C, Zheng W, Yu C, Cheng H, Zhou R. Genome-wide mapping and characterization of microsatellites in the swamp eel genome. Sci Rep 2017; 7:3157. [PMID: 28600492 PMCID: PMC5466649 DOI: 10.1038/s41598-017-03330-7] [Citation(s) in RCA: 23] [Impact Index Per Article: 2.9] [Received: 02/14/2017] [Accepted: 04/26/2017] [Indexed: 11/09/2022] Open
Abstract
We describe genome-wide screening and characterization of microsatellites in the swamp eel genome. A total of 99,293 microsatellite loci were identified in the genome, with an overall density of 179 microsatellites per megabase of genomic sequence. Dinucleotide microsatellites were the most abundant type, representing 71% of the total microsatellite loci, and AC-rich motifs were the most recurrent in all repeat types. Microsatellite frequency decreased as the number of repeat units increased, which was more obvious in long than in short microsatellite motifs. Most microsatellites were located in non-coding regions, whereas only approximately 1% were detected in coding regions. Trinucleotide repeats were the most abundant microsatellites in the coding regions, where they represented amino acid repeats in proteins. There was a chromosome-biased distribution of microsatellites in non-coding regions, with the highest density of 203.95/Mb on chromosome 8 and the lowest on chromosome 7 (164.06/Mb). The most abundant dinucleotide, (AC)n, was mainly located on chromosome 8. Notably, genomic mapping showed that there was a chromosome-biased association between the genomic distributions of microsatellites and transposon elements. Thus, the novel dataset of microsatellites in swamp eel provides a valuable resource for further studies on QTL-based selection breeding, genetic resource conservation and evolutionary genetics.
Affiliation(s)
- Zhigang Li
- Hubei Key Laboratory of Cell Homeostasis, Laboratory of Molecular and Developmental Genetics, College of Life Sciences, Wuhan University, Wuhan, 430072, P. R. China
- Feng Chen
- Hubei Key Laboratory of Cell Homeostasis, Laboratory of Molecular and Developmental Genetics, College of Life Sciences, Wuhan University, Wuhan, 430072, P. R. China
- Chunhua Huang
- Hubei Key Laboratory of Cell Homeostasis, Laboratory of Molecular and Developmental Genetics, College of Life Sciences, Wuhan University, Wuhan, 430072, P. R. China
- Weixin Zheng
- Hubei Key Laboratory of Cell Homeostasis, Laboratory of Molecular and Developmental Genetics, College of Life Sciences, Wuhan University, Wuhan, 430072, P. R. China
- Chunlai Yu
- Hubei Key Laboratory of Cell Homeostasis, Laboratory of Molecular and Developmental Genetics, College of Life Sciences, Wuhan University, Wuhan, 430072, P. R. China
- Hanhua Cheng
- Hubei Key Laboratory of Cell Homeostasis, Laboratory of Molecular and Developmental Genetics, College of Life Sciences, Wuhan University, Wuhan, 430072, P. R. China
- Rongjia Zhou
- Hubei Key Laboratory of Cell Homeostasis, Laboratory of Molecular and Developmental Genetics, College of Life Sciences, Wuhan University, Wuhan, 430072, P. R. China
108
Characterization of porcine simple sequence repeat variation on a population scale with genome resequencing data. Sci Rep 2017; 7:2376. [PMID: 28539617 PMCID: PMC5443785 DOI: 10.1038/s41598-017-02600-8] [Citation(s) in RCA: 7] [Impact Index Per Article: 0.9] [Received: 01/03/2017] [Accepted: 04/13/2017] [Indexed: 12/23/2022] Open
Abstract
Simple sequence repeats (SSRs) are used as polymorphic molecular markers in many species. They contribute very important functional variations in a range of complex traits; however, little is known about the variation of most SSRs in pig populations. Here, using genome resequencing data, we identified ~0.63 million polymorphic SSR loci from more than 100 individuals. Through intensive analysis of this dataset, we found that the SSR motif composition, motif length, total length of alleles and distribution of alleles all contribute to SSR variability. Furthermore, we found that CG-containing SSRs displayed significantly lower polymorphism and higher cross-species conservation. With a rigorous filter procedure, we provided a catalogue of 16,527 high-quality polymorphic SSRs, which displayed reliable results for the analysis of phylogenetic relationships and provided valuable summary statistics for 30 individuals equally selected from eight local Chinese pig breeds, six commercial lean pig breeds and Chinese wild boars. In addition, from the high-quality polymorphic SSR catalogue, we identified four loci with potential loss-of-function alleles. Overall, these analyses provide a valuable catalogue of polymorphic SSRs to the existing pig genetic variation database, and we believe this catalogue could be used for future genome-wide genetic analysis.
109
Liu S, Hou W, Sun T, Xu Y, Li P, Yue B, Fan Z, Li J. Genome-wide mining and comparative analysis of microsatellites in three macaque species. Mol Genet Genomics 2017; 292:537-550. [PMID: 28160080 DOI: 10.1007/s00438-017-1289-1] [Citation(s) in RCA: 19] [Impact Index Per Article: 2.4] [Received: 10/31/2016] [Accepted: 01/09/2017] [Indexed: 12/13/2022]
Abstract
Microsatellites are found in taxonomically diverse organisms, and such repeats are related to genomic structure, function and certain diseases. To characterize microsatellites in macaques, we searched for and compared SSRs with 1-6 bp nucleotide motifs in rhesus, cynomolgus and pigtailed macaques. A total of 1395671, 1284929 and 1266348 perfect SSRs were mined, respectively. The most frequent perfect SSRs were mononucleotide SSRs. GC content was highest in dinucleotide SSRs and lowest in mononucleotide SSRs. Chromosome size was positively correlated with SSR number and negatively correlated with the relative frequency and density of SSRs. The GC content of chromosome SSRs was negatively correlated with the relative frequency of SSRs and the GC content of chromosome sequences. The features of microsatellite distribution in the assembled genomes of the three species were highly similar, which suggests that the distributional pattern of microsatellites is probably conserved in the genus Macaca. The degenerated number of repeat motifs was found to be different in pentanucleotide and hexanucleotide repeats. Species-specific motifs for each macaque were significantly underrepresented. Overall, SSR frequencies of each chromosome in rhesus macaque were higher than in cynomolgus macaque. The maximum repeat numbers of mono- to pentanucleotide repeats in cynomolgus macaque were greater than in the other two macaques. These results emphasize the genetic diversity and phylogenetic relationships of Macaca species. Our data will be beneficial for comparative genome mapping and for understanding the distribution of SSRs and genome structure among these animal models, and provide a foundation for further development and identification of more macaque-specific SSRs.
Affiliation(s)
- Sanxu Liu
- Key Laboratory of Bio-resources and Eco-environment (Ministry of Education), College of Life Sciences, Sichuan University, Chengdu, 610064, People's Republic of China
- Wei Hou
- Key Laboratory of Bio-resources and Eco-environment (Ministry of Education), College of Life Sciences, Sichuan University, Chengdu, 610064, People's Republic of China
- Tianlin Sun
- Key Laboratory of Bio-resources and Eco-environment (Ministry of Education), College of Life Sciences, Sichuan University, Chengdu, 610064, People's Republic of China
- Yongtao Xu
- Key Laboratory of Bio-resources and Eco-environment (Ministry of Education), College of Life Sciences, Sichuan University, Chengdu, 610064, People's Republic of China
- Peng Li
- Key Laboratory of Bio-resources and Eco-environment (Ministry of Education), College of Life Sciences, Sichuan University, Chengdu, 610064, People's Republic of China
- Bisong Yue
- Key Laboratory of Bio-resources and Eco-environment (Ministry of Education), College of Life Sciences, Sichuan University, Chengdu, 610064, People's Republic of China; Sichuan Key Laboratory of Conservation Biology on Endangered Wildlife, College of Life Sciences, Sichuan University, Chengdu, 610064, People's Republic of China
- Zhenxin Fan
- Key Laboratory of Bio-resources and Eco-environment (Ministry of Education), College of Life Sciences, Sichuan University, Chengdu, 610064, People's Republic of China
- Jing Li
- Key Laboratory of Bio-resources and Eco-environment (Ministry of Education), College of Life Sciences, Sichuan University, Chengdu, 610064, People's Republic of China; Sichuan Key Laboratory of Conservation Biology on Endangered Wildlife, College of Life Sciences, Sichuan University, Chengdu, 610064, People's Republic of China
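Mining perfect SSRs with 1-6 bp motifs, as this entry describes, can be sketched with regular-expression backreferences. This is an illustrative approach, not the tool used in the study. The minimum repeat counts in `MIN_REPEATS` are assumptions chosen for the example, and the function name `find_perfect_ssrs` is hypothetical.

```python
import re

# Minimum number of repeat units per motif length (assumed thresholds;
# the cited study may have used different ones).
MIN_REPEATS = {1: 12, 2: 7, 3: 5, 4: 4, 5: 4, 6: 4}


def find_perfect_ssrs(seq, min_repeats=MIN_REPEATS):
    """Return (start, end, motif, repeat_count) for each perfect SSR found."""
    seq = seq.upper()
    hits = []
    for k, n in sorted(min_repeats.items()):
        # A k-base motif followed by at least n - 1 exact copies of itself.
        pattern = re.compile(r"([ACGT]{%d})\1{%d,}" % (k, n - 1))
        for m in pattern.finditer(seq):
            motif = m.group(1)
            # Skip motifs that are repeats of a shorter unit (e.g. "AA", "ACAC"),
            # since those tracts belong to the shorter motif class.
            if any(k % d == 0 and motif == motif[:d] * (k // d) for d in range(1, k)):
                continue
            hits.append((m.start(), m.end(), motif, (m.end() - m.start()) // k))
    return sorted(hits)


# Toy sequence containing (A)12, (AC)7 and (CAG)5 tracts.
toy = "GG" + "A" * 12 + "CT" + "AC" * 7 + "T" + "CAG" * 5 + "T"
for start, end, motif, count in find_perfect_ssrs(toy):
    print(start, end, motif, count)
```

Coordinates are 0-based and end-exclusive. The motif is reported as it first appears in the sequence, so grouping cyclic and reverse-complement equivalents (for example AC, CA, GT, TG) would need an extra normalisation step.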
110
111
Single Amino Acid Repeats in the Proteome World: Structural, Functional, and Evolutionary Insights. PLoS One 2016; 11:e0166854. [PMID: 27893794 PMCID: PMC5125637 DOI: 10.1371/journal.pone.0166854] [Citation(s) in RCA: 17] [Impact Index Per Article: 1.9] [Received: 08/28/2016] [Accepted: 11/05/2016] [Indexed: 12/15/2022] Open
Abstract
Microsatellites or simple sequence repeats (SSRs) are abundant, highly diverse stretches of short DNA repeats present in all genomes. Tandem mono/tri/hexanucleotide repeats in the coding regions contribute to single amino acid repeats (SAARs) in the proteome. While SSRs in the coding region always result in amino acid repeats, a majority of SAARs arise due to a combination of various codons representing the same amino acid and not as a consequence of SSR events. Certain amino acids are abundant in repeat regions, indicating a positive selection pressure behind the accumulation of SAARs. By analysing 22 proteomes including the human proteome, we explored the functional and structural relationship of amino acid repeats in an evolutionary context. Only ~15% of repeats are present in any known functional domain, while ~74% of repeats are present in the disordered regions, suggesting that SAARs add to the functionality of proteins by providing flexibility and stability and by acting as linker elements between domains. Comparison of SAAR-containing proteins across species reveals that while shorter repeats are conserved among orthologs, proteins with longer repeats, >15 amino acids, are unique to the respective organism. Lysine repeats are well conserved among orthologs with respect to their length and number of occurrences in a protein. Other amino acid repeats, such as those of glutamic acid, proline, serine and alanine, are generally conserved among the orthologs with varying repeat lengths. These findings suggest that SAARs have accumulated in the proteome under positive selection pressure and that they provide flexibility for optimal folding of functional/structural domains of proteins. The insights gained from our observations can help in effective designing and engineering of proteins with novel features.
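As an illustration of what a single amino acid repeat looks like in practice, the sketch below scans a protein sequence for runs of one residue. The minimum run length of 5 is an assumption made for this example, since the abstract does not give the threshold used. The function name `find_saars` and the toy peptide are hypothetical.

```python
import re


def find_saars(protein: str, min_len: int = 5):
    """Return (start, residue, run_length) for each run of one amino acid."""
    # One residue letter followed by at least min_len - 1 copies of itself.
    pattern = re.compile(r"([A-Z])\1{%d,}" % (min_len - 1))
    return [
        (m.start(), m.group(1), m.end() - m.start())
        for m in pattern.finditer(protein.upper())
    ]


# Toy peptide with a six-residue glutamine run and a five-residue serine run.
for start, residue, length in find_saars("MKQQQQQQAPSSSSSK"):
    print(start, residue, length)
```

Positions are 0-based. Mapping each run back to its codons would show whether it derives from a perfect trinucleotide SSR or from mixed synonymous codons, which is the distinction the abstract draws.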
112
Gadgil R, Barthelemy J, Lewis T, Leffak M. Replication stalling and DNA microsatellite instability. Biophys Chem 2016; 225:38-48. [PMID: 27914716 DOI: 10.1016/j.bpc.2016.11.007] [Citation(s) in RCA: 36] [Impact Index Per Article: 4.0] [Received: 10/05/2016] [Revised: 11/05/2016] [Accepted: 11/05/2016] [Indexed: 01/08/2023]
Abstract
Microsatellites are short, tandemly repeated DNA motifs of 1-6 nucleotides, also termed simple sequence repeats (SSRs) or short tandem repeats (STRs). Collectively, these repeats comprise approximately 3% of the human genome Subramanian et al. (2003), Lander and Lander (2001) [1,2], and represent a large reservoir of loci highly prone to mutations Sun et al. (2012), Ellegren (2004) [3,4] that contribute to human evolution and disease. Microsatellites are known to stall and reverse replication forks in model systems Pelletier et al. (2003), Samadashwily et al. (1997), Kerrest et al. (2009) [5-7], and are hotspots of chromosomal double strand breaks (DSBs). We briefly review the relationship of these repeated sequences to replication stalling and genome instability, and present recent data on the impact of replication stress on DNA fragility at microsatellites in vivo.
Affiliation(s)
- R Gadgil
- Department of Biochemistry and Molecular Biology, Boonshoft School of Medicine, Wright State University, Dayton, OH 45435, USA
- J Barthelemy
- Department of Biochemistry and Molecular Biology, Boonshoft School of Medicine, Wright State University, Dayton, OH 45435, USA
- T Lewis
- Department of Biochemistry and Molecular Biology, Boonshoft School of Medicine, Wright State University, Dayton, OH 45435, USA
- M Leffak
- Department of Biochemistry and Molecular Biology, Boonshoft School of Medicine, Wright State University, Dayton, OH 45435, USA
113
Liu L, Qin M, Yang L, Song Z, Luo L, Bao H, Ma Z, Zhou Z, Xu J. A genome-wide analysis of simple sequence repeats in Apis cerana and its development as polymorphism markers. Gene 2016; 599:53-59. [PMID: 27836668 DOI: 10.1016/j.gene.2016.11.016] [Citation(s) in RCA: 13] [Impact Index Per Article: 1.4] [Received: 05/13/2016] [Revised: 10/16/2016] [Accepted: 11/07/2016] [Indexed: 11/24/2022]
Abstract
The Asian honeybee (Apis cerana) is an important indigenous species that plays an indispensable role in ecological balance and biological diversity. Few studies have characterized the simple sequence repeats (SSRs) of A. cerana, so in this study a genome-wide screening for SSRs was first performed in the genome of A. cerana and compared with that of the western honeybee (Apis mellifera). There were 209,991 SSRs distributed throughout the genome of A. cerana (Korea strain), and di-nucleotides were the most frequent SSR type. Both the total number and the density of SSRs in the A. cerana genome were smaller than those in the A. mellifera genome. By comparing length differences at SSR loci among several isolates based on sequence alignment, 218 potentially polymorphic SSR primers derived from A. cerana were identified. Five of these SSR markers were evaluated for amplification in twenty-eight colonies of Apis cerana cerana (Chinese honeybee) and were highly polymorphic, with polymorphism information content (PIC) values ranging from 0.47 to 0.61. These results will contribute to the further development of more effective SSR markers derived from A. cerana, which can be used to study the genetic structure and population polymorphism of the Asian honeybee.
Affiliation(s)
- Lu Liu
- College of Life Sciences, Chongqing Normal University, Chongqing, China
- Mingzhu Qin
- College of Life Sciences, Chongqing Normal University, Chongqing, China
- Lin Yang
- College of Life Sciences, Chongqing Normal University, Chongqing, China
- Zhenzhen Song
- College of Life Sciences, Chongqing Normal University, Chongqing, China
- Li Luo
- College of Life Sciences, Chongqing Normal University, Chongqing, China
- Hongyin Bao
- College of Life Sciences, Chongqing Normal University, Chongqing, China
- Zhenggang Ma
- College of Life Sciences, Chongqing Normal University, Chongqing, China
- Zeyang Zhou
- College of Life Sciences, Chongqing Normal University, Chongqing, China; State Key Laboratory of Silkworm Genome Biology, Southwest University, Chongqing, China
- Jinshan Xu
- College of Life Sciences, Chongqing Normal University, Chongqing, China
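The polymorphism information content (PIC) values reported for the Apis cerana markers can be illustrated with a short calculation. The abstract does not say which formula was used; this sketch assumes the widely used definition of Botstein et al. (1980), PIC = 1 - Σ p_i² - Σ_{i<j} 2 p_i² p_j², where p_i is the frequency of allele i. The function name `pic` and the example frequencies are hypothetical.

```python
def pic(freqs):
    """Polymorphism information content for one marker.

    freqs: allele frequencies at the locus, which must sum to 1.
    """
    if abs(sum(freqs) - 1.0) > 1e-6:
        raise ValueError("allele frequencies must sum to 1")
    # Expected homozygosity: sum of squared allele frequencies.
    homozygosity = sum(p * p for p in freqs)
    # Correction for uninformative matings, summed over unordered allele pairs.
    correction = sum(
        2 * freqs[i] ** 2 * freqs[j] ** 2
        for i in range(len(freqs))
        for j in range(i + 1, len(freqs))
    )
    return 1.0 - homozygosity - correction


# Hypothetical allele frequencies, not data from the study.
print(round(pic([0.5, 0.5]), 3))            # two equally frequent alleles
print(round(pic([0.25, 0.25, 0.25, 0.25]), 3))  # four equally frequent alleles
```

Under this definition a biallelic marker cannot exceed 0.375, so the reported range of 0.47 to 0.61 implies that the evaluated markers carried more than two alleles.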
114
Kiran K, Rawal HC, Dubey H, Jaswal R, Devanna BN, Gupta DK, Bhardwaj SC, Prasad P, Pal D, Chhuneja P, Balasubramanian P, Kumar J, Swami M, Solanke AU, Gaikwad K, Singh NK, Sharma TR. Draft Genome of the Wheat Rust Pathogen (Puccinia triticina) Unravels Genome-Wide Structural Variations during Evolution. Genome Biol Evol 2016; 8:2702-21. [PMID: 27521814 PMCID: PMC5630921 DOI: 10.1093/gbe/evw197] [Citation(s) in RCA: 44] [Impact Index Per Article: 4.9] [Accepted: 08/06/2016] [Indexed: 01/02/2023] Open
Abstract
Leaf rust is one of the most important diseases of wheat and is caused by Puccinia triticina, a highly variable rust pathogen prevalent worldwide. Decoding the genome of this pathogen will help in unraveling the molecular basis of its evolution and in the identification of genes responsible for its various biological functions. We generated high quality draft genome sequences (approximately 100-106 Mb) of two races of P. triticina: the variable and virulent Race77 and the old, avirulent Race106. The genomes of races 77 and 106 had 33X and 27X coverage, respectively. We predicted 27678 and 26384 genes, with average lengths of 1,129 and 1,086 bases in races 77 and 106, respectively, and found that the genomes consisted of 37.49% and 39.99% repetitive sequences. Genome wide comparative analysis revealed that Race77 differs substantially from Race106 with regard to segmental duplication (SD), repeat element, and SNP/InDel characteristics. Comparative analyses showed that Race77 is a recent, highly variable and adapted race compared with Race106. Further sequence analyses of 13 additional pathotypes of Race77 clearly differentiated the recent, active and virulent, from the older pathotypes. Average densities of 2.4 SNPs and 0.32 InDels per kb were obtained for all P. triticina pathotypes. Secretome analysis demonstrated that Race77 has more virulence factors than Race106, which may be responsible for the greater degree of adaptation of this pathogen. We also found that genes under greater selection pressure were conserved in the genomes of both races, and may affect functions crucial for the higher levels of virulence factors in Race77. This study provides insights into the genome structure, genome organization, molecular basis of variation, and pathogenicity of P. triticina. The genome sequence data generated in this study have been submitted to public domain databases and will be an important resource for comparative genomics studies of the more than 4000 existing Puccinia species.
Affiliation(s)
- Kanti Kiran
- ICAR-National Research Centre on Plant Biotechnology, New Delhi, India
- Hukam C Rawal
- ICAR-National Research Centre on Plant Biotechnology, New Delhi, India
- Himanshu Dubey
- ICAR-National Research Centre on Plant Biotechnology, New Delhi, India
- Rajdeep Jaswal
- ICAR-National Research Centre on Plant Biotechnology, New Delhi, India
- B N Devanna
- ICAR-National Research Centre on Plant Biotechnology, New Delhi, India
- Subhash C Bhardwaj
- ICAR - Indian Institute of Wheat and Barley Research, Regional Station, Flowerdale, Shimla, India
- P Prasad
- ICAR - Indian Institute of Wheat and Barley Research, Regional Station, Flowerdale, Shimla, India
- Dharam Pal
- ICAR - Indian Agricultural Research Institute, Regional Station Tutikandi Centre, Shimla, India
- J Kumar
- ICAR - National Institute of Biotic Stress Management, Raipur, Chhattisgarh, India
- M Swami
- ICAR-Indian Agricultural Research Institute, Regional Station, Wellington, India
- Kishor Gaikwad
- ICAR-National Research Centre on Plant Biotechnology, New Delhi, India
- Nagendra K Singh
- ICAR-National Research Centre on Plant Biotechnology, New Delhi, India
- Tilak Raj Sharma
- ICAR-National Research Centre on Plant Biotechnology, New Delhi, India
| |
Collapse
|
115
|
|
116
|
Shao C, Lin M, Zhou Z, Zhou Y, Shen Y, Xue A, Zhou H, Tang Q, Xie J. Mutation analysis of 19 autosomal short tandem repeats in Chinese Han population from Shanghai. Int J Legal Med 2016; 130:1439-1444. [DOI: 10.1007/s00414-016-1427-z] [Citation(s) in RCA: 14] [Impact Index Per Article: 1.6] [Received: 05/25/2016] [Accepted: 07/19/2016] [Indexed: 10/21/2022]
|
117
|
Lin JC, Wang CC, Jiang RS, Wang WY, Liu SA. Impact of microsatellite alteration in surgical margins on local recurrence in oral cavity cancer patients. Eur Arch Otorhinolaryngol 2016; 274:431-439. [PMID: 27430224 DOI: 10.1007/s00405-016-4215-y] [Citation(s) in RCA: 4] [Impact Index Per Article: 0.4] [Received: 06/18/2016] [Accepted: 07/13/2016] [Indexed: 10/21/2022]
Abstract
The aim of this study was to investigate the association between microsatellite alteration in the surgical margins and local recurrence of oral cavity squamous cell carcinoma patients. Surgical specimens confirmed by pathological examination and corresponding surgical margins were collected from 120 oral cavity squamous cell carcinoma patients. Ten microsatellite markers were examined in the tumor specimens and paired surgical margins, which proved to be negative on pathological assessment. The specimens and surgical margins were amplified by polymerase chain reaction followed by computerized analysis. Forty-two specimens (35.0 %) with microsatellite instability (MSI) in at least one marker were found, and more than half of the specimens (n = 73, 60.8 %) had loss of heterozygosity (LOH) in at least one marker. Although MSI and LOH were not associated with the prognosis of oral cavity squamous cell carcinoma patients, presence of MSI in the tumor-free surgical margins increased the risk of local recurrence (hazard ratio: 9.549; 95 % confidence interval: 4.143-22.01). Genetic analysis of tumor-free surgical margins is a useful tool for identifying oral cavity squamous cell carcinoma patients who are vulnerable to local recurrence.
Affiliation(s)
- Jin-Ching Lin
- Department of Radiation Oncology, Taichung Veterans General Hospital, Taichung, Taiwan; Faculty of Medicine, School of Medicine, National Yang-Ming University, Taipei, Taiwan
- Chen-Chi Wang
- Department of Otolaryngology, Taichung Veterans General Hospital, No. 1650, Sec 4, Taiwan Boulevard, Taichung, Taiwan; Faculty of Medicine, School of Medicine, National Yang-Ming University, Taipei, Taiwan
- Rong-San Jiang
- Department of Otolaryngology, Taichung Veterans General Hospital, No. 1650, Sec 4, Taiwan Boulevard, Taichung, Taiwan
- Wen-Yi Wang
- Department of Nursing, HungKuang University, Taichung, Taiwan
- Shih-An Liu
- Department of Otolaryngology, Taichung Veterans General Hospital, No. 1650, Sec 4, Taiwan Boulevard, Taichung, Taiwan; Faculty of Medicine, School of Medicine, National Yang-Ming University, Taipei, Taiwan; Department of Medical Research, China Medical University Hospital, China Medical University, Taichung, Taiwan
|
118
|
Fungtammasan A, Tomaszkiewicz M, Campos-Sánchez R, Eckert KA, DeGiorgio M, Makova KD. Reverse Transcription Errors and RNA-DNA Differences at Short Tandem Repeats. Mol Biol Evol 2016; 33:2744-58. [PMID: 27413049 PMCID: PMC5026258 DOI: 10.1093/molbev/msw139] [Citation(s) in RCA: 12] [Impact Index Per Article: 1.3] [Indexed: 12/11/2022] Open
Abstract
Transcript variation has important implications for organismal function in health and disease. Most transcriptome studies focus on assessing variation in gene expression levels and isoform representation. Variation at the level of transcript sequence is caused by RNA editing and transcription errors, and leads to nongenetically encoded transcript variants, or RNA–DNA differences (RDDs). Such variation has been understudied, in part because its detection is obscured by reverse transcription (RT) and sequencing errors. It has only been evaluated for intertranscript base substitution differences. Here, we investigated transcript sequence variation for short tandem repeats (STRs). We developed the first maximum-likelihood estimator (MLE) to infer RT error and RDD rates, taking next generation sequencing error rates into account. Using the MLE, we empirically evaluated RT error and RDD rates for STRs in a large-scale DNA and RNA replicated sequencing experiment conducted in a primate species. The RT error rates increased exponentially with STR length and were biased toward expansions. The RDD rates were approximately 1 order of magnitude lower than the RT error rates. The RT error rates estimated with the MLE from a primate data set were concordant with those estimated with an independent method, barcoded RNA sequencing, from a Caenorhabditis elegans data set. Our results have important implications for medical genomics, as STR allelic variation is associated with >40 diseases. STR nonallelic transcript variation can also contribute to disease phenotype. The MLE and empirical rates presented here can be used to evaluate the probability of disease-associated transcripts arising due to RDD.
Affiliation(s)
- Arkarachai Fungtammasan
- Integrative Biosciences, Bioinformatics and Genomics Option, Pennsylvania State University; Department of Biology, Pennsylvania State University; Center for Medical Genomics, Pennsylvania State University; Huck Institute of Genome Sciences, Pennsylvania State University
- Marta Tomaszkiewicz
- Department of Biology, Pennsylvania State University; Center for Medical Genomics, Pennsylvania State University
- Rebeca Campos-Sánchez
- Department of Biology, Pennsylvania State University; Center for Medical Genomics, Pennsylvania State University
- Kristin A Eckert
- Center for Medical Genomics, Pennsylvania State University; Department of Pathology, The Jake Gittlen Laboratories for Cancer Research, The Pennsylvania State University College of Medicine
- Michael DeGiorgio
- Department of Biology, Pennsylvania State University; Center for Medical Genomics, Pennsylvania State University; Institute for CyberScience, Pennsylvania State University
- Kateryna D Makova
- Department of Biology, Pennsylvania State University; Center for Medical Genomics, Pennsylvania State University; Huck Institute of Genome Sciences, Pennsylvania State University
|
119
|
Xu Y, Hu Z, Wang C, Zhang X, Li J, Yue B. Characterization of perfect microsatellite based on genome-wide and chromosome level in Rhesus monkey (Macaca mulatta). Gene 2016; 592:269-75. [PMID: 27395431 DOI: 10.1016/j.gene.2016.07.016] [Citation(s) in RCA: 19] [Impact Index Per Article: 2.1] [Received: 03/01/2016] [Revised: 06/21/2016] [Accepted: 07/05/2016] [Indexed: 10/21/2022]
Abstract
Microsatellite studies at the chromosome level would contribute to the biometric correlation analysis of chromosomes and to microsatellite applications on specific chromosomes. In this study, the total microsatellite length of 1,141,024 loci was 21.8 Mb, which covered about 0.74% of the male Rhesus monkey genome. Perfect mononucleotide SSRs were the most abundant, followed by the pattern: perfect di- > tetra- > tri- > penta- > hexanucleotide SSRs. Repeat numbers mainly fell in the ranges 12-32 (mono-), 7-23 (di-), 5-10 (tri-), 4-14 (tetra-), 4-9 (penta-), and 4-8 (hexa-). The largest number of SSRs was found on chromosome 1 with 94,347 loci, followed by chromosomes 3, 2, 7 and 5, and the smallest number was on chromosome 18. The predominant repeat types in the male Rhesus monkey genome and chromosome Y were basically A, AC, AG, AAT, AAC, AAAT, AAAC, AAAG, AAACA and AAACAA. The SSR number of all chromosomes was closely positively correlated with chromosome sequence size (r=0.969, p<0.01), and significantly negatively correlated with abundance (r=-0.24, 0.01<p<0.05). The lengths of all chromosomes were significantly negatively correlated with microsatellite density (r=-0.456, 0.01<p<0.05), and relative abundance and density of SSRs in all chromosomes were significantly negatively correlated with SSR GC content (r=-0.939/-0.928, p<0.01). The SSR GC content on chromosome X (16.71%) was found to be the highest in the female Rhesus monkey, which might contribute to the DNA methylation of CpG islands for sex chromosome X inactivation and expression regulation. These results, together with the exported tetranucleotide repeat sequences in each chromosome for primer design, would facilitate the exploration of microsatellite structural function, composition mode and molecular marker development in the Rhesus monkey genome.
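A minimal sketch of how perfect SSRs of the kind counted above could be located in a sequence. The abstract does not state the search criteria, so the minimum repeat numbers below are an assumption taken from the lower ends of the reported repeat ranges (12, 7, 5, 4, 4, 4 for mono- to hexanucleotide motifs), and the function names are hypothetical. This is not the authors' software.

```python
import re

# Assumed minimum repeat numbers per motif length (not stated in the abstract).
MIN_REPEATS = {1: 12, 2: 7, 3: 5, 4: 4, 5: 4, 6: 4}

def is_primitive(motif):
    """True if the motif is not itself a repetition of a shorter unit (e.g. 'AA' is not)."""
    k = len(motif)
    return not any(k % d == 0 and motif == motif[:d] * (k // d) for d in range(1, k))

def find_perfect_ssrs(seq, min_repeats=MIN_REPEATS):
    """Return sorted (start, end, motif, repeat_count) tuples for perfect SSRs in seq."""
    seq = seq.upper()
    hits = []
    for k, n in sorted(min_repeats.items()):
        # A k-bp unit followed by at least n-1 exact copies of itself.
        pattern = re.compile(r"([ACGT]{%d})\1{%d,}" % (k, n - 1))
        for m in pattern.finditer(seq):
            motif = m.group(1)
            if is_primitive(motif):
                hits.append((m.start(), m.end(), motif, (m.end() - m.start()) // k))
    return sorted(hits)

# Toy sequence (made up for illustration): A x12, AC x7 and AAT x5 with short spacers.
toy = "GG" + "A" * 12 + "CT" + "AC" * 7 + "GC" + "AAT" * 5 + "G"
for hit in find_perfect_ssrs(toy):
    print(hit)
```

Summing `end - start` over all hits and dividing by the sequence length would give the genome coverage figure of the kind quoted in the abstract.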
Affiliation(s)
- Yongtao Xu
- Key Laboratory of Bio-resources and Eco-environment (Ministry of Education), College of Life Sciences, Sichuan University, Chengdu 610064, PR China; Sichuan Key Laboratory of Conservation Biology on Endangered Wildlife, College of Life Sciences, Sichuan University, Chengdu 610064, PR China
- Zongxiu Hu
- Yibin HengShu Animal Models Resourse Industry Technology Academy, Yibin 644609, PR China
- Chen Wang
- Key Laboratory of Bio-resources and Eco-environment (Ministry of Education), College of Life Sciences, Sichuan University, Chengdu 610064, PR China
- Xiuyue Zhang
- Key Laboratory of Bio-resources and Eco-environment (Ministry of Education), College of Life Sciences, Sichuan University, Chengdu 610064, PR China; Sichuan Key Laboratory of Conservation Biology on Endangered Wildlife, College of Life Sciences, Sichuan University, Chengdu 610064, PR China
- Jing Li
- Key Laboratory of Bio-resources and Eco-environment (Ministry of Education), College of Life Sciences, Sichuan University, Chengdu 610064, PR China; Sichuan Key Laboratory of Conservation Biology on Endangered Wildlife, College of Life Sciences, Sichuan University, Chengdu 610064, PR China
- Bisong Yue
- Key Laboratory of Bio-resources and Eco-environment (Ministry of Education), College of Life Sciences, Sichuan University, Chengdu 610064, PR China; Sichuan Key Laboratory of Conservation Biology on Endangered Wildlife, College of Life Sciences, Sichuan University, Chengdu 610064, PR China
|
120
|
Genomic leftovers: identifying novel microsatellites, over-represented motifs and functional elements in the human genome. Sci Rep 2016; 6:27722. [PMID: 27278669 PMCID: PMC4899811 DOI: 10.1038/srep27722] [Citation(s) in RCA: 5] [Impact Index Per Article: 0.6] [Received: 07/15/2015] [Accepted: 05/23/2016] [Indexed: 01/29/2023] Open
Abstract
The human genome is 99% complete. This study contributes to filling the 1% gap by enriching previously unknown repeat regions called microsatellites (MST). We devised a Global MST Enrichment (GME) kit to enrich and next-generation sequence 2 colorectal cell lines and 16 normal human samples to illustrate its utility in identifying contigs from reads that do not map to the genome reference. The analysis of these samples yielded 790 novel extra-referential concordant contigs that are observed in more than one sample. We searched for evidence of functional elements in the concordant contigs in two ways: (1) BLAST-ing each contig against normal RNA-Seq samples, and (2) checking for predicted functional elements using GlimmerHMM. Of the 790 concordant contigs, 37 had an exact match to at least one RNA-Seq read; 15 aligned to more than 100 RNA-Seq reads. Of the 249 concordant contigs predicted by GlimmerHMM to have functional elements, 6 had at least one exact RNA-Seq match. BLAST-ing these novel contigs against all publicly available sequences confirmed that they were found in human and chimpanzee BAC and fosmid clones sequenced as part of the original human genome project. These extra-referential contigs predominantly contained pentameric repeats, especially two motifs: AATGG and GTGGA.
|
121
|
Mahfooz S, Singh SP, Rakh R, Bhattacharya A, Mishra N, Singh PC, Chauhan PS, Nautiyal CS, Mishra A. A Comprehensive Characterization of Simple Sequence Repeats in the Sequenced Trichoderma Genomes Provides Valuable Resources for Marker Development. Front Microbiol 2016; 7:575. [PMID: 27199911 PMCID: PMC4846858 DOI: 10.3389/fmicb.2016.00575] [Citation(s) in RCA: 15] [Impact Index Per Article: 1.7] [Received: 01/16/2016] [Accepted: 04/07/2016] [Indexed: 11/13/2022] Open
Abstract
Members of the genus Trichoderma are known worldwide for mycoparasitism. To gain a better insight into the organization and evolution of their genomes, we used an in silico approach to compare the occurrence, relative abundance and density of SSRs in Trichoderma atroviride, T. harzianum, T. reesei, and T. virens. Our analysis revealed that in all four genome sequences studied, the occurrence, relative abundance, and density of microsatellites varied and were not influenced by genome size. The relative abundance and density of SSRs positively correlated with the G + C content of their genomes. The maximum frequency of SSRs was observed in the smallest genome, that of T. reesei, whereas it was lowest in the second smallest genome, that of T. atroviride. Among the different classes of repeats, tri-nucleotide repeats were the most abundant in all the genomes, accounting for ∼38%, whereas hexa-nucleotide repeats were the least abundant (∼10.2%). Further evaluation of the conservation of motifs in the transcript sequences showed 49.5% conservation among all the motifs. In order to study polymorphism in Trichoderma isolates, 12 polymorphic SSR markers were developed. Of the 12 markers, 6 are from T. atroviride and the remaining 6 belong to T. harzianum. SSR markers from T. atroviride were found to be more polymorphic, with an average polymorphism information content value of 0.745, compared with 0.615 for T. harzianum. The twelve polymorphic markers obtained in this study clearly demonstrate the utility of newly developed SSR markers in establishing genetic relationships among different isolates of Trichoderma.
Affiliation(s)
- Sahil Mahfooz
- Division of Plant Microbe Interaction, Council of Scientific and Industrial Research-National Botanical Research Institute Lucknow, India
- Satyendra P Singh
- Division of Plant Microbe Interaction, Council of Scientific and Industrial Research-National Botanical Research Institute Lucknow, India
- Ramraje Rakh
- Maharashtra Institute of Medical Sciences and Research Medical College Latur, India
- Arpita Bhattacharya
- Division of Plant Microbe Interaction, Council of Scientific and Industrial Research-National Botanical Research Institute Lucknow, India
- Nishtha Mishra
- Division of Plant Microbe Interaction, Council of Scientific and Industrial Research-National Botanical Research Institute Lucknow, India
- Poonam C Singh
- Division of Plant Microbe Interaction, Council of Scientific and Industrial Research-National Botanical Research Institute Lucknow, India
- Puneet S Chauhan
- Division of Plant Microbe Interaction, Council of Scientific and Industrial Research-National Botanical Research Institute Lucknow, India
- Chandra S Nautiyal
- Division of Plant Microbe Interaction, Council of Scientific and Industrial Research-National Botanical Research Institute Lucknow, India
- Aradhana Mishra
- Division of Plant Microbe Interaction, Council of Scientific and Industrial Research-National Botanical Research Institute Lucknow, India
|
122
|
Someswara Rao C, Raju SV. Next generation sequencing (NGS) database for tandem repeats with multiple pattern 2°-shaft multicore string matching. GENOMICS DATA 2016; 7:307-17. [PMID: 26981434 PMCID: PMC4778683 DOI: 10.1016/j.gdata.2016.01.015] [Citation(s) in RCA: 5] [Impact Index Per Article: 0.6] [Received: 12/14/2015] [Revised: 01/15/2016] [Accepted: 01/27/2016] [Indexed: 11/25/2022]
Abstract
Next generation sequencing (NGS) technologies have been rapidly applied in biomedical and biological research in recent years. To provide a comprehensive NGS resource for research, in this paper we consider 10 repeat loci: TAGA, TCAT, GAAT, AGAT, AGAA, GATA, TATC, CTTT, TCTG and TCTA. We then developed the NGS Tandem Repeat Database (TandemRepeatDB) for all the chromosomes of the Homo sapiens, Callithrix jacchus, Chlorocebus sabaeus, Gorilla gorilla, Macaca fascicularis, Macaca mulatta, Nomascus leucogenys, Pan troglodytes, Papio anubis and Pongo abelii genome data sets for all of these loci. We find the successive occurrence frequency for all 10 of the above SSRs (simple sequence repeats) in these genome data sets on a chromosome-by-chromosome basis with multiple pattern 2° shaft multicore string matching.
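To make "successive occurrence frequency" concrete, the following is a minimal sketch that reports runs of back-to-back copies of one motif in a sequence. It uses a plain regular expression, not the multiple pattern 2° shaft multicore string matching algorithm named in the abstract, and the function name `tandem_runs` and the `min_copies` parameter are hypothetical.

```python
import re

# The ten motifs listed in the abstract.
MOTIFS = ["TAGA", "TCAT", "GAAT", "AGAT", "AGAA", "GATA", "TATC", "CTTT", "TCTG", "TCTA"]

def tandem_runs(sequence, motif, min_copies=2):
    """Return (start, copies) for each run of at least min_copies consecutive copies of motif."""
    pattern = re.compile(r"(?:%s){%d,}" % (re.escape(motif), min_copies))
    return [(m.start(), (m.end() - m.start()) // len(motif))
            for m in pattern.finditer(sequence.upper())]

# Toy sequence (made up for illustration): three successive TAGA copies, then an isolated one.
toy = "CCTAGATAGATAGAGGTAGA"
for motif in MOTIFS:
    runs = tandem_runs(toy, motif)
    if runs:
        print(motif, runs)
```

Because TAGA, AGAT and GATA are rotations of one another, the same stretch of sequence is reported under more than one motif, shifted by one or two bases; a per-chromosome tally would need to decide whether to count such overlapping runs once or several times.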
Affiliation(s)
- S Viswanadha Raju
- Department of CSE, JNTUCEJ, JNTUniversity Hyderabad, Telangana, India
|
123
|
Shao C, Zhang Y, Zhou Y, Zhu W, Xu H, Liu Z, Tang Q, Shen Y, Xie J. Identification and characterization of the highly polymorphic locus D14S739 in the Han Chinese population. Croat Med J 2016; 56:482-9. [PMID: 26526885 PMCID: PMC4655933 DOI: 10.3325/cmj.2015.56.482] [Citation(s) in RCA: 7] [Impact Index Per Article: 0.8] [Indexed: 11/05/2022] Open
Abstract
AIM To systematically select and evaluate short tandem repeats (STRs) on chromosome 14 and obtain new STR loci as expanded genotyping markers for forensic application. METHODS STRs on chromosome 14 were filtered from the Tandem Repeats Database and further selected based on their positions on the chromosome, repeat patterns of the core sequences, sequence homology of the flanking regions, and suitability of flanking regions for primer design. The STR locus with the highest heterozygosity and polymorphism information content (PIC) was selected for further analysis of genetic polymorphism, forensic parameters, and the core sequence. RESULTS Among 26 STR loci selected as candidates, D14S739 had the highest heterozygosity (0.8691) and PIC (0.8432), and showed no deviation from the Hardy-Weinberg equilibrium. Fourteen alleles were observed, ranging in size from 21 to 34 tetranucleotide units in the core region of (GATA)9-18 (GACA)7-12 GACG (GACA)2 GATA. Paternity testing showed no mutations. CONCLUSION D14S739 is a highly informative STR locus and could be a suitable genetic marker for forensic applications in the Han Chinese population.
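For readers unfamiliar with the two statistics used to rank loci here, this is a minimal sketch of how expected heterozygosity and PIC are commonly computed from allele frequencies: H = 1 - Σ p_i² and PIC = 1 - Σ p_i² - Σ_{i<j} 2 p_i² p_j². The abstract does not say whether the reported heterozygosity is observed or expected, so the expected form is an assumption, and the frequencies below are made up for illustration, not the D14S739 data.

```python
from itertools import combinations

def expected_heterozygosity(freqs):
    """Expected heterozygosity under Hardy-Weinberg equilibrium: 1 - sum(p_i^2)."""
    return 1.0 - sum(p * p for p in freqs)

def pic(freqs):
    """Polymorphism information content: 1 - sum(p_i^2) - sum_{i<j} 2 * p_i^2 * p_j^2."""
    homozygosity = sum(p * p for p in freqs)
    cross_term = sum(2 * (p * p) * (q * q) for p, q in combinations(freqs, 2))
    return 1.0 - homozygosity - cross_term

# Hypothetical allele frequencies for a four-allele locus (must sum to 1).
freqs = [0.25, 0.25, 0.25, 0.25]
print(expected_heterozygosity(freqs))  # 0.75
print(pic(freqs))                      # 0.703125
```

PIC is always at most the expected heterozygosity, and the gap narrows as the number of alleles grows, which is consistent with the two values reported for D14S739 being close.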
Affiliation(s)
- Jianhui Xie
- Department of Forensic Medicine, Shanghai Medical College of Fudan University, Shanghai, China
|
124
|
Cheng J, Zhao Z, Li B, Qin C, Wu Z, Trejo-Saavedra DL, Luo X, Cui J, Rivera-Bustamante RF, Li S, Hu K. A comprehensive characterization of simple sequence repeats in pepper genomes provides valuable resources for marker development in Capsicum. Sci Rep 2016; 6:18919. [PMID: 26739748 PMCID: PMC4703971 DOI: 10.1038/srep18919] [Citation(s) in RCA: 39] [Impact Index Per Article: 4.3] [Received: 08/27/2015] [Accepted: 11/30/2015] [Indexed: 02/05/2023] Open
Abstract
The sequences of the full set of pepper genomes, including nuclear, mitochondrial and chloroplast, are now available for use. However, the overall distribution of simple sequence repeats (SSRs) in these genomes and their practical implications for molecular marker development in Capsicum have not yet been described. Here, an average of 868,047.50, 45.50 and 30.00 SSR loci were identified in the nuclear, mitochondrial and chloroplast genomes of pepper, respectively. Subsequently, systematic comparisons of various species, genome types, motif lengths, repeat numbers and classified types were executed and discussed. In addition, a local database composed of 113,500 in silico unique SSR primer pairs was built using a homemade bioinformatics workflow. As a pilot study, 65 polymorphic markers were validated among a wide collection of 21 Capsicum genotypes, with allele number and polymorphic information content value per marker ranging from 2 to 6 and 0.05 to 0.64, respectively. Finally, a comparison of the clustering results with those of a previous study indicated the usability of the newly developed SSR markers. In summary, this first report on the comprehensive characterization of SSR motifs in pepper genomes and the very large set of SSR primer pairs will benefit various genetic studies in Capsicum.
Affiliation(s)
- Jiaowen Cheng
- College of Horticulture, South China Agricultural University, Guangzhou 510642, China
- Zicheng Zhao
- Department of Computer Science, City University of Hong Kong, Hong Kong 999077, China
- Bo Li
- College of Horticulture, South China Agricultural University, Guangzhou 510642, China
- Cheng Qin
- Pepper Institute, Zunyi Academy of Agricultural Sciences, Zunyi, Guizhou 563102, China
- Zhiming Wu
- College of Horticulture and Landscape Architecture, Zhongkai University of Agriculture and Engineering, Guangzhou 510225, China
- Diana L. Trejo-Saavedra
- Departamento de Ingeniería Genética, Centro de Investigación y de Estudios Avanzados del IPN (Cinvestav)-Unidad Irapuato, Irapuato 36821, México
- Xirong Luo
- Pepper Institute, Zunyi Academy of Agricultural Sciences, Zunyi, Guizhou 563102, China
- Junjie Cui
- College of Horticulture, South China Agricultural University, Guangzhou 510642, China
- Rafael F. Rivera-Bustamante
- Departamento de Ingeniería Genética, Centro de Investigación y de Estudios Avanzados del IPN (Cinvestav)-Unidad Irapuato, Irapuato 36821, México
- Shuaicheng Li
- Department of Computer Science, City University of Hong Kong, Hong Kong 999077, China
- Kailin Hu
- College of Horticulture, South China Agricultural University, Guangzhou 510642, China
|
125
|
Engineered Nucleases and Trinucleotide Repeat Diseases. ADVANCES IN EXPERIMENTAL MEDICINE AND BIOLOGY 2016. [DOI: 10.1007/978-1-4939-3509-3_9] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 10/22/2022]
|
126
|
Ranade SS, Lin YC, Van de Peer Y, García-Gil MR. Comparative in silico analysis of SSRs in coding regions of high confidence predicted genes in Norway spruce (Picea abies) and Loblolly pine (Pinus taeda). BMC Genet 2015; 16:149. [PMID: 26706685 PMCID: PMC4691297 DOI: 10.1186/s12863-015-0304-y] [Citation(s) in RCA: 3] [Impact Index Per Article: 0.3] [Received: 08/20/2015] [Accepted: 12/10/2015] [Indexed: 11/24/2022] Open
Abstract
Background: Microsatellites or simple sequence repeats (SSRs) are DNA sequences consisting of 1–6 bp tandem repeat motifs present in the genome. SSRs are considered to be one of the most powerful tools in genetic studies. We carried out a comparative study of perfect SSR loci belonging to class I (≥20 bp) and class II (≥12 and <20 bp) types located in coding regions of high confidence genes in Picea abies and Pinus taeda. SSRLocator was used to retrieve SSRs from the full length CDS of predicted genes in both species. Results: Trimers were the most abundant motifs in class I followed by hexamers in Picea abies, while trimers and hexamers were equally abundant in Pinus taeda class I SSRs. Hexamers were most frequent within class II SSRs followed by trimers, in both species. Although the frequency of genes containing SSRs was slightly higher in Pinus taeda, SSR counts per Mbp for class I were similar in both species (P-value = 0.22), while for class II SSRs they were significantly higher in Picea abies (P-value = 0.00009). AT-rich motifs were more abundant than GC-rich motifs within class II SSRs in both species (P-values = 10⁻⁹ and 0). With reference to class I SSRs, AT-rich and GC-rich motifs were detected with equal frequency in Pinus taeda (P-value = 0.24), while in Picea abies, GC-rich motifs were detected with higher frequency than AT-rich motifs (P-value = 0.0005). Conclusions: Our study gives a comparative overview of the genome SSR composition based on high confidence genes in the two recently sequenced and economically important conifers, and also provides information on functional molecular markers that can be applied in genetic studies in Pinus and Picea species.
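The class I / class II split defined in the Background is a simple length rule, sketched below. The function name `classify_ssr` is hypothetical, and this is an illustration of the stated thresholds only, not of how SSRLocator works internally.

```python
def classify_ssr(motif, repeats):
    """Classify a perfect SSR by total length: class I (>= 20 bp), class II (>= 12 and < 20 bp)."""
    length_bp = len(motif) * repeats
    if length_bp >= 20:
        return "class I"
    if length_bp >= 12:
        return "class II"
    return None  # shorter than 12 bp: outside both classes

print(classify_ssr("AAG", 7))     # 21 bp -> class I
print(classify_ssr("AAG", 4))     # 12 bp -> class II
print(classify_ssr("GCTGCA", 3))  # 18 bp -> class II
```

Under this rule a hexamer needs only 2 copies to reach class II but 4 copies to reach class I, which may be one reason hexamers dominate class II counts.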
Affiliation(s)
- Sonali Sachin Ranade
- Department of Forest Genetics and Plant Physiology, Umeå Plant Science Centre, Swedish University of Agricultural Sciences, SE-901 83, Umeå, Sweden
- Yao-Cheng Lin
- Department of Plant Systems Biology (VIB) and Department of Plant Biotechnology and Bioinformatics, Ghent University, Technologiepark 927, 9052, Ghent, Belgium
- Yves Van de Peer
- Department of Plant Systems Biology (VIB) and Department of Plant Biotechnology and Bioinformatics, Ghent University, Technologiepark 927, 9052, Ghent, Belgium; Genomics Research Institute, University of Pretoria, Hatfield Campus, Pretoria, 0028, South Africa; Bioinformatics Institute Ghent, Ghent University, 9052, Ghent, Belgium
- María Rosario García-Gil
- Department of Forest Genetics and Plant Physiology, Umeå Plant Science Centre, Swedish University of Agricultural Sciences, SE-901 83, Umeå, Sweden
|
127
|
Fertin G, Jean G, Radulescu A, Rusu I. Hybrid de novo tandem repeat detection using short and long reads. BMC Med Genomics 2015; 8 Suppl 3:S5. [PMID: 26399998 PMCID: PMC4582210 DOI: 10.1186/1755-8794-8-s3-s5] [Citation(s) in RCA: 10] [Impact Index Per Article: 1.0] [Indexed: 12/31/2022] Open
Abstract
Background: As one of the most studied genome rearrangements, tandem repeats have a considerable impact on the genetic background of inherited diseases. Many methods designed for tandem repeat detection on reference sequences obtain high quality results. However, in a de novo context, where no reference sequence is available, tandem repeat detection remains a difficult problem. The short reads obtained with second-generation sequencing methods are not long enough to span regions that contain long repeats. This length limitation was tackled by the long reads obtained with third-generation sequencing platforms such as Pacific Biosciences technologies. Nevertheless, the gain in read length came with a significant increase in the error rate. The main objective of current studies on long reads is to handle this high error rate, which can reach 16%. Methods: In this paper we present MixTaR, the first de novo method for tandem repeat detection that combines the high quality of short reads and the large length of long reads. Our hybrid algorithm uses the set of short reads for tandem repeat pattern detection based on a de Bruijn graph. These patterns are then validated using the long reads, and the tandem repeat sequences are constructed using local greedy assemblies. Results: MixTaR is tested with both simulated and real reads from complex organisms. For a complete analysis of its robustness to errors, we use short and long reads with different error rates. The results are then analysed in terms of the number of tandem repeats detected and the length of their patterns. Conclusions: Our method shows high precision and sensitivity. With low false positive rates even for highly erroneous reads, MixTaR is able to detect accurate tandem repeats with pattern lengths varying within a significant interval.
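The idea behind pattern detection on a de Bruijn graph can be illustrated with a toy sketch: a tandemly repeated pattern typically traces a cycle in the graph built from the k-mers of the reads. The code below only demonstrates that general property on one error-free read; it is not MixTaR's algorithm, it skips the long-read validation and local assembly steps, and the function names are hypothetical.

```python
from collections import defaultdict

def de_bruijn(reads, k):
    """Build a de Bruijn graph: nodes are (k-1)-mers, edges come from the k-mers in the reads."""
    graph = defaultdict(set)
    for read in reads:
        for i in range(len(read) - k + 1):
            kmer = read[i:i + k]
            graph[kmer[:-1]].add(kmer[1:])
    return graph

def cycles_through(graph, start, max_len):
    """Return node paths of simple cycles (at most max_len nodes) that start and end at start."""
    found = []
    stack = [(start, [start])]
    while stack:
        node, path = stack.pop()
        for nxt in graph.get(node, ()):
            if nxt == start:
                found.append(path)          # closed the loop back to the start node
            elif nxt not in path and len(path) < max_len:
                stack.append((nxt, path + [nxt]))
    return found

# Toy read (made up): the pattern ACG repeated four times between short flanks.
graph = de_bruijn(["TTACGACGACGACGGT"], k=4)
patterns = ["".join(node[0] for node in path) for path in cycles_through(graph, "ACG", max_len=6)]
print(patterns)  # ['ACG']
```

In this toy case the node ACG has two successors (CGA inside the repeat, CGG leaving it), which is the kind of branching a real method has to resolve, for example by checking candidates against longer reads as the abstract describes.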
|
128
|
Qi WH, Jiang XM, Du LM, Xiao GS, Hu TZ, Yue BS, Quan QM. Genome-Wide Survey and Analysis of Microsatellite Sequences in Bovid Species. PLoS One 2015. [PMID: 26196922 PMCID: PMC4510479 DOI: 10.1371/journal.pone.0133667] [Citation(s) in RCA: 25] [Impact Index Per Article: 2.5] [Indexed: 11/30/2022] Open
Abstract
Microsatellites or simple sequence repeats (SSRs) have become the most popular source of genetic markers, and are ubiquitously distributed in many eukaryotic and prokaryotic genomes. This is the first study examining and comparing SSRs in completely sequenced genomes of the Bovidae. We analyzed and compared the number of SSRs, relative abundance, relative density, guanine-cytosine (GC) content and proportion of SSRs in six taxonomically different bovid species: Bos taurus, Bubalus bubalis, Bos mutus, Ovis aries, Capra hircus, and Pantholops hodgsonii. Our analysis revealed that, based on our search criteria, the total number of perfect SSRs found ranged from 663,079 to 806,907 and covered from 0.44% to 0.48% of the bovid genomes. Relative abundance and density of SSRs in these bovid genomes were not significantly correlated with genome size (Pearson, r < 0.420, p > 0.05). Perfect mononucleotide SSRs were the most abundant, followed by the pattern: perfect di- > tri- > penta- > tetra- > hexanucleotide SSRs. Generally, the number of SSRs, relative abundance, and relative density of SSRs decreased as the motif repeat length increased in each species of Bovidae. GC content was highest in trinucleotide SSRs and lowest in mononucleotide SSRs in the six bovid genomes. The GC contents of tri- and pentanucleotide SSRs showed a great deal of similarity among different chromosomes of B. taurus, O. aries, and C. hircus. The SSR number of all chromosomes in B. taurus, O. aries, and C. hircus is closely positively correlated with chromosome sequence size (Pearson, r > 0.980, p < 0.01) and significantly negatively correlated with GC content (Pearson, r < -0.638, p < 0.01). Relative abundance and density of SSRs in all chromosomes of the three species were significantly negatively correlated with GC content (Pearson, r < -0.333, P < 0.05) but not significantly correlated with chromosome sequence size (Pearson, r < -0.185, P > 0.05). Relative abundances of the same nucleotide SSR type showed great similarity among different chromosomes of B. taurus, O. aries, and C. hircus.
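The correlations quoted above are Pearson coefficients. As a reminder of what that statistic measures, here is a minimal sketch that computes r for per-chromosome values. The numbers are invented for illustration and are not taken from the study, and significance testing (the p-values) is left out.

```python
import math

def pearson_r(xs, ys):
    """Pearson correlation coefficient between two equal-length lists of numbers."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    return cov / math.sqrt(var_x * var_y)

# Hypothetical per-chromosome data: sequence size (Mb) and number of SSR loci.
chrom_size_mb = [158.0, 137.0, 121.0, 120.0, 112.0]
ssr_count = [43500, 37900, 33100, 33400, 30600]
print(round(pearson_r(chrom_size_mb, ssr_count), 3))
```

A value close to +1, as reported for SSR number against chromosome size, means larger chromosomes carry proportionally more SSRs; relative abundance (loci per Mb) removes that size effect, which is why it can show a different, weaker relationship.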
Affiliation(s)
- Wen-Hua Qi
- College of Life Science and Engineering, Chongqing Three Gorges University, Chongqing, 404100, China
- Xue-Mei Jiang
- College of Environmental and Chemistry Engineering, Chongqing Three Gorges University, Chongqing, 404100, China
- Lian-Ming Du
- Key Laboratory of Bio-resources and Eco-environment (Ministry of Education), College of Life Sciences, Sichuan University, Chengdu, 610064, China
- Guo-Sheng Xiao
- College of Life Science and Engineering, Chongqing Three Gorges University, Chongqing, 404100, China
- Ting-Zhang Hu
- College of Life Science and Engineering, Chongqing Three Gorges University, Chongqing, 404100, China
- Bi-Song Yue
- Key Laboratory of Bio-resources and Eco-environment (Ministry of Education), College of Life Sciences, Sichuan University, Chengdu, 610064, China
- Qiu-Mei Quan
- School of Life Sciences, China West Normal University, Nanchong, 637009, China
|
129
|
Ma Z. Genome-wide characterization of perfect microsatellites in yak (Bos grunniens). Genetica 2015; 143:515-20. [PMID: 26071092 DOI: 10.1007/s10709-015-9849-y] [Citation(s) in RCA: 7] [Impact Index Per Article: 0.7] [Received: 09/10/2014] [Accepted: 06/05/2015] [Indexed: 11/25/2022]
Abstract
Microsatellites or simple sequence repeats (SSRs) constitute a significant portion of genomes and play an important role in gene function and genome organization. The availability of a complete genome sequence for yak (Bos grunniens) has made it possible to carry out genome-wide analysis of microsatellites in this species. We analyzed the abundance and density of perfect SSRs in the yak genome. We found a total of 723,172 SSRs with 1-6 bp nucleotide motifs, indicating that about 0.47 % of the yak whole genome sequence (2.66 Gb) comprises perfect SSRs, the average length of which was 17.34 bp. The average frequency and density of perfect SSRs were 272.18 loci/Mb and 4719.25 bp/Mb, respectively. The six classes of perfect SSRs were not evenly represented in the yak genome. Mononucleotide repeats (44.04 %), with a total number of 318,435 and an average length of 14.71 bp, appeared to be the most abundant SSR class, while the percentages of dinucleotide, trinucleotide, pentanucleotide, tetranucleotide and hexanucleotide repeats were 24.11 %, 15.80 %, 9.50 %, 6.40 % and 0.15 %, respectively. Different repeat classes of SSRs varied in their repeat number, with the highest being 1206. Our results suggest that 15 motifs comprised the predominant categories, each with a frequency above 1 locus/Mb: A, AC, AT, AG, AGC, AAC, AAT, ACC, ATTT, GTTT, AATG, CTTT, ATGG, AACTG and ATCTG.
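The two summary statistics quoted in this abstract, frequency (loci/Mb) and density (bp/Mb), can be illustrated with a short Python sketch. The minimum repeat thresholds in `MIN_REPEATS`, the function names, and the toy sequence are assumptions made for illustration; the abstract does not state the search criteria that were actually used.

```python
import re

# Hypothetical minimum copy numbers per motif length (1-6 bp).
# The abstract does not give its thresholds; these are placeholders.
MIN_REPEATS = {1: 12, 2: 7, 3: 5, 4: 4, 5: 4, 6: 4}


def is_primitive(motif):
    """True if the motif is not itself a repeat of a shorter unit (e.g. 'AA')."""
    return motif not in (motif + motif)[1:-1]


def find_perfect_ssrs(seq, min_repeats=MIN_REPEATS):
    """Return (start, motif, copies) for each perfect SSR, sorted by start."""
    seq = seq.upper()
    hits = []
    for k, n in min_repeats.items():
        # One motif of length k followed by at least n-1 exact copies of it.
        pattern = re.compile(r"([ACGT]{%d})\1{%d,}" % (k, n - 1))
        for m in pattern.finditer(seq):
            motif = m.group(1)
            # Skip motifs such as 'AA' or 'ACAC' that belong to a shorter class.
            if is_primitive(motif):
                hits.append((m.start(), motif, len(m.group(0)) // k))
    return sorted(hits)


def abundance_and_density(hits, genome_bp):
    """Frequency (loci/Mb) and density (bp/Mb) of the SSRs in `hits`."""
    mb = genome_bp / 1e6
    total_bp = sum(len(motif) * copies for _, motif, copies in hits)
    return len(hits) / mb, total_bp / mb


# Toy sequence (not real data): one (A)12 run and one (AC)7 run.
toy = "GGG" + "A" * 12 + "GTG" + "AC" * 7 + "TT"
toy_hits = find_perfect_ssrs(toy)
```

Dividing density by frequency gives the mean SSR length. With the values reported above, 4719.25 bp/Mb divided by 272.18 loci/Mb is about 17.34 bp, which matches the average length stated in the abstract.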
Affiliation(s)
- Zhijie Ma
- Qinghai Academy of Animal Science and Veterinary Medicine, Qinghai University, No. 1 Weier Road, Bio-Science Industrial District, Xining, 810016, Qinghai, People's Republic of China
|
130
|
Pramod S, Perkins AD, Welch ME. Patterns of microsatellite evolution inferred from the Helianthus annuus (Asteraceae) transcriptome. J Genet 2015; 93:431-42. [PMID: 25189238 DOI: 10.1007/s12041-014-0402-z] [Citation(s) in RCA: 7] [Impact Index Per Article: 0.7] [Indexed: 10/24/2022]
Abstract
The distribution of microsatellites in exons and their association with gene ontology (GO) terms are explored to elucidate patterns of microsatellite evolution in the common sunflower, Helianthus annuus. The relative position, motif, size and level of impurity were estimated for each microsatellite in the unigene database available from the Compositae Genome Project (CGP), and statistical analyses were performed to determine if differences in microsatellite distributions and enrichment within certain GO terms were significant. There are more translated than untranslated microsatellites, implying that many bring about structural changes in proteins. However, the greatest density is observed within the UTRs, particularly 5'UTRs. Further, UTR microsatellites are purer and longer than coding region microsatellites. This suggests that UTR microsatellites are either younger and under more relaxed constraints, or that purifying selection limits impurities, and directional selection favours their expansion. GOs associated with response to various environmental stimuli including water deprivation and salt stress were significantly enriched with microsatellites. This may suggest that these GOs are more labile in plant genomes, or that selection has favoured the maintenance of microsatellites in these genes over others. This study shows that the distribution of transcribed microsatellites in H. annuus is nonrandom, that coding region microsatellites are under greater constraint than UTR microsatellites, and that these sequences are enriched within genes that regulate plant responses to environmental stress and stimuli.
Affiliation(s)
- Sreepriya Pramod
- Department of Biological Sciences, Mississippi State University, 219 Harned Hall, 295 Lee Boulevard, MS 39762, USA
|
131
|
Baptiste BA, Jacob KD, Eckert KA. Genetic evidence that both dNTP-stabilized and strand slippage mechanisms may dictate DNA polymerase errors within mononucleotide microsatellites. DNA Repair (Amst) 2015; 29:91-100. [PMID: 25758780 DOI: 10.1016/j.dnarep.2015.02.016] [Citation(s) in RCA: 8] [Impact Index Per Article: 0.8] [Received: 11/07/2014] [Revised: 02/15/2015] [Accepted: 02/16/2015] [Indexed: 12/19/2022]
Abstract
Mononucleotide microsatellites are tandem repeats of a single base pair, abundant within coding exons and frequent sites of mutation in the human genome. Because the repeated unit is one base pair, multiple mechanisms of insertion/deletion (indel) mutagenesis are possible, including strand-slippage, dNTP-stabilized, and misincorporation-misalignment. Here, we examine the effects of polymerase identity (mammalian Pols α, β, κ, and η), template sequence, dNTP pool size, and reaction temperature on indel errors during in vitro synthesis of mononucleotide microsatellites. We utilized the ratio of insertion to deletion errors as a genetic indicator of mechanism. Strikingly, we observed a statistically significant bias toward deletion errors within mononucleotide repeats for the majority of the 28 DNA template and polymerase combinations examined, with notable exceptions based on sequence and polymerase identity. Using mutator forms of Pol β did not substantially alter the error specificity, suggesting that the mispairing-misalignment mechanism is not a primary mechanism. Based on our results for mammalian DNA polymerases representing three structurally distinct families, we suggest that dNTP-stabilized mutagenesis may be an alternative mechanism for mononucleotide microsatellite indel mutation. The change from a predominantly dNTP-stabilized mechanism to a strand-slippage mechanism with increasing microsatellite length may account for the differential rates of tandem repeat mutation that are observed genome-wide.
Affiliation(s)
- Beverly A Baptiste
- The Jake Gittlen Laboratories for Cancer Research and the Department of Pathology, Pennsylvania State University College of Medicine, 500 University Drive, Hershey, PA 17033, USA
- Kimberly D Jacob
- The Jake Gittlen Laboratories for Cancer Research and the Department of Pathology, Pennsylvania State University College of Medicine, 500 University Drive, Hershey, PA 17033, USA
- Kristin A Eckert
- The Jake Gittlen Laboratories for Cancer Research and the Department of Pathology, Pennsylvania State University College of Medicine, 500 University Drive, Hershey, PA 17033, USA
|
132
|
Pathak RU, Srinivasan A, Mishra RK. Genome-wide mapping of matrix attachment regions in Drosophila melanogaster. BMC Genomics 2014; 15:1022. [PMID: 25424749 PMCID: PMC4301625 DOI: 10.1186/1471-2164-15-1022] [Citation(s) in RCA: 26] [Impact Index Per Article: 2.4] [Received: 06/20/2014] [Accepted: 11/12/2014] [Indexed: 12/12/2022] Open
Abstract
Background: The eukaryotic genome acquires functionality upon proper packaging within the nucleus. This process is facilitated by the structural framework of the Nuclear Matrix, a nucleo-proteinaceous meshwork. Matrix Attachment Regions (MARs) in the genome serve as anchoring sites to this framework. Results: Here we report direct sequencing of the MAR preparation from Drosophila melanogaster embryos and identify >7350 MARs. These amount to ~2.5% of the fly genome and often coincide with AT-rich non-coding regions. We find significant association of MARs with the origins of replication, transcription start sites, paused RNA Polymerase II sites and exons, but not introns, of highly expressed genes. We also identified sequence motifs and repeats that constitute MARs. Conclusion: Our data reveal the contact points of the genome to the nuclear architecture and provide a link between nuclear functions and genomic packaging. Electronic supplementary material: The online version of this article (doi:10.1186/1471-2164-15-1022) contains supplementary material, which is available to authorized users.
Affiliation(s)
- Rakesh K Mishra
- Centre for Cellular and Molecular Biology, Council of Scientific and Industrial Research, Uppal Road, Hyderabad 500 007, India
|
133
|
Shen C, Wang X, Tian L, Che G. Microsatellite alteration in multiple primary lung cancer. J Thorac Dis 2014; 6:1499-505. [PMID: 25364529 DOI: 10.3978/j.issn.2072-1439.2014.09.14] [Citation(s) in RCA: 9] [Impact Index Per Article: 0.8] [Received: 05/09/2014] [Accepted: 08/28/2014] [Indexed: 02/05/2023]
Abstract
Patients with pulmonary neoplasms have an increased risk of developing a second tumor of the lung, either at the same time or at a different time. It is important to determine whether the second tumor represents an independent primary tumor or a recurrence/metastasis, because this significantly changes management and prognosis. Microsatellite instability (MSI) and loss of heterozygosity (LOH) represent molecular disorders acquired by the cell during neoplastic transformation. Both are associated with genetic instability. Functional silencing of tumour suppressor genes may be the consequence of genomic instability, particularly of the globally occurring LOH phenomenon. Numerous studies have confirmed the role of MSI/LOH at both the early and the late stages of multiple primary lung cancer. This paper reviews the published literature on the significance of MSI/LOH in multiple primary lung cancer. Additionally, a new method based on allelic variation at polymorphic microsatellite markers is described; it does not rely on collection of normal tissue, can be performed with minimal tumor sample, and will complement clinical criteria for discriminating between multiple primary cancers and solitary metastatic disease.
Affiliation(s)
- Cheng Shen
- Department of Thoracic Surgery, West-China Hospital, Sichuan University, Chengdu 610041, China
- Xin Wang
- Department of Thoracic Surgery, West-China Hospital, Sichuan University, Chengdu 610041, China
- Long Tian
- Department of Thoracic Surgery, West-China Hospital, Sichuan University, Chengdu 610041, China
- Guowei Che
- Department of Thoracic Surgery, West-China Hospital, Sichuan University, Chengdu 610041, China
|
134
|
Ramamoorthy S, Garapati HS, Mishra RK. Length and sequence dependent accumulation of simple sequence repeats in vertebrates: Potential role in genome organization and regulation. Gene 2014; 551:167-75. [DOI: 10.1016/j.gene.2014.08.052] [Citation(s) in RCA: 8] [Impact Index Per Article: 0.7] [Received: 02/21/2014] [Revised: 08/03/2014] [Accepted: 08/25/2014] [Indexed: 10/24/2022]
|
135
|
Jiang Q, Li Q, Yu H, Kong L. Genome-wide analysis of simple sequence repeats in marine animals-a comparative approach. Marine Biotechnology (New York, N.Y.) 2014; 16:604-619. [PMID: 24939717 DOI: 10.1007/s10126-014-9580-1] [Citation(s) in RCA: 8] [Impact Index Per Article: 0.7] [Received: 02/01/2014] [Accepted: 05/22/2014] [Indexed: 06/03/2023]
Abstract
Tandem simple sequence repeats (SSRs) are among the most popular molecular markers in genetic analysis owing to their ubiquitous occurrence, high reproducibility, multiallelic nature, and codominant mode. Their high mutability gives SSRs a role in genome evolution, and they correspondingly show different patterns. Comparative analysis of genomic SSRs in different taxonomic groups usually focuses on land species, while marine animals have been neglected. This study examined the abundance of genomic SSRs with repeat unit lengths of 1-6 bp in 30 marine animals from nine taxonomic groups and compared them with land species. Thousands of SSRs were discovered in every organism, providing a huge resource for the development of molecular markers. The thirty marine animals showed profound differences in SSR characteristics, but some group-specific trends were also found. Both similarities and differences in repeat patterns were discovered between the land and marine species. Two taxon-specific SSR types were discovered: the pentanucleotide motif AGAGG in Euteleostei and the hexanucleotide repeat ATGTAC in Porifera and Echinodermata. Gene ontology (GO) enrichment analysis of two representative species (Amphimedon queenslandica for Porifera and Strongylocentrotus purpuratus for Echinodermata) revealed a functional preference among genes associated with the ATGTAC motif, which might hint at evolutionary significance.
Affiliation(s)
- Qun Jiang
- The Key Laboratory of Mariculture, Ministry of Education, Ocean University of China, 266003, Qingdao, China
|
136
|
Press MO, Carlson KD, Queitsch C. The overdue promise of short tandem repeat variation for heritability. Trends Genet 2014; 30:504-12. [PMID: 25182195 DOI: 10.1016/j.tig.2014.07.008] [Citation(s) in RCA: 71] [Impact Index Per Article: 6.5] [Received: 06/12/2014] [Revised: 07/23/2014] [Accepted: 07/24/2014] [Indexed: 12/11/2022]
Abstract
Short tandem repeat (STR) variation has been proposed as a major explanatory factor in the heritability of complex traits in humans and model organisms. However, we still struggle to incorporate STR variation into genotype-phenotype maps. We review here the promise of STRs in contributing to complex trait heritability and highlight the challenges that STRs pose due to their repetitive nature. We argue that STR variants are more likely than single-nucleotide variants to have epistatic interactions, reiterate the need for targeted assays to genotype STRs accurately, and call for more appropriate statistical methods in detecting STR-phenotype associations. Lastly, we suggest that somatic STR variation within individuals may serve as a read-out of disease susceptibility, and is thus potentially a valuable covariate for future association studies.
Affiliation(s)
- Maximilian O Press
- Department of Genome Sciences, University of Washington, Foege Building S-250, Box 355065, 3720 15th Avenue NE, Seattle, WA 98195-5065, USA
- Keisha D Carlson
- Department of Genome Sciences, University of Washington, Foege Building S-250, Box 355065, 3720 15th Avenue NE, Seattle, WA 98195-5065, USA
- Christine Queitsch
- Department of Genome Sciences, University of Washington, Foege Building S-250, Box 355065, 3720 15th Avenue NE, Seattle, WA 98195-5065, USA
|
137
|
Biswas MK, Xu Q, Mayer C, Deng X. Genome wide characterization of short tandem repeat markers in sweet orange (Citrus sinensis). PLoS One 2014; 9:e104182. [PMID: 25148383 PMCID: PMC4141690 DOI: 10.1371/journal.pone.0104182] [Citation(s) in RCA: 37] [Impact Index Per Article: 3.4] [Received: 03/28/2014] [Accepted: 07/09/2014] [Indexed: 11/18/2022] Open
Abstract
Sweet orange (Citrus sinensis) is one of the major cultivated and most-consumed citrus species. With the goal of enhancing the genomic resources in citrus, we surveyed, developed and characterized microsatellite markers in the ≈347 Mb sequence assembly of the sweet orange genome. A total of 50,846 SSRs were identified with a frequency of 146.4 SSRs/Mbp. Dinucleotide repeats are the most frequent repeat class and the highest density of SSRs was found in chromosome 4. SSRs are non-randomly distributed in the genome and most of the SSRs (62.02%) are located in the intergenic regions. We found that AT-rich SSRs are more frequent than GC-rich SSRs. A total number of 21,248 SSR primers were successfully developed, which represents 89 SSR markers per Mb of the genome. A subset of 950 developed SSR primer pairs were synthesized and tested by wet lab experiments on a set of 16 citrus accessions. In total we identified 534 (56.21%) polymorphic SSR markers that will be useful in citrus improvement. The number of amplified alleles ranges from 2 to 12 with an average of 4 alleles per marker and an average PIC value of 0.75. The newly developed sweet orange primer sequences, their in silico PCR products, exact position in the genome assembly and putative function are made publicly available. We present the largest number of SSR markers ever developed for a citrus species. Almost two thirds of the markers are transferable to 16 citrus relatives and may be used for constructing a high density linkage map. In addition, they are valuable for marker-assisted selection studies, population structure analyses and comparative genomic studies of C. sinensis with other citrus related species. Altogether, these markers provide a significant contribution to the citrus research community.
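The abstract reports an average PIC (polymorphism information content) value without giving the formula. The following is a minimal Python sketch, assuming the study used the standard definition PIC = 1 - Σp_i² - Σ_{i<j} 2p_i²p_j² (Botstein et al., 1980), where p_i is the frequency of allele i. The function name and the allele frequencies are illustrative and are not taken from the study.

```python
from itertools import combinations


def pic(freqs):
    """Polymorphism information content for one marker.

    PIC = 1 - sum(p_i^2) - sum over i<j of 2 * p_i^2 * p_j^2
    """
    if abs(sum(freqs) - 1.0) > 1e-9:
        raise ValueError("allele frequencies must sum to 1")
    homozygosity = sum(p * p for p in freqs)
    correction = sum(2 * (p * p) * (q * q) for p, q in combinations(freqs, 2))
    return 1.0 - homozygosity - correction


# Illustrative frequencies only (not data from the study).
two_equal = pic([0.5, 0.5])    # 0.375
four_equal = pic([0.25] * 4)   # 0.703125
```

Under this definition, PIC rises with the number of alleles and with how evenly their frequencies are spread, which is why it is used to rank markers by informativeness.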
Affiliation(s)
- Manosh Kumar Biswas
- Key Laboratory of Horticultural Plant Biology of Ministry of Education (MOE), Huazhong Agricultural University, Wuhan, Hubei, P.R. China
- Qiang Xu
- Key Laboratory of Horticultural Plant Biology of Ministry of Education (MOE), Huazhong Agricultural University, Wuhan, Hubei, P.R. China
- Xiuxin Deng
- Key Laboratory of Horticultural Plant Biology of Ministry of Education (MOE), Huazhong Agricultural University, Wuhan, Hubei, P.R. China
|
138
|
Abstract
It is widely appreciated that short tandem repeat (STR) variation underlies substantial phenotypic variation in organisms. Some propose that the high mutation rates of STRs in functional genomic regions facilitate evolutionary adaptation. Despite their high mutation rate, some STRs show little to no variation in populations. One such STR occurs in the Arabidopsis thaliana gene PFT1 (MED25), where it encodes an interrupted polyglutamine tract. Although the PFT1 STR is large (∼270 bp), and thus expected to be extremely variable, it shows only minuscule variation across A. thaliana strains. We hypothesized that the PFT1 STR is under selective constraint, due to previously undescribed roles in PFT1 function. We investigated this hypothesis using plants expressing transgenic PFT1 constructs with either an endogenous STR or synthetic STRs of varying length. Transgenic plants carrying the endogenous PFT1 STR generally performed best in complementing a pft1 null mutant across adult PFT1-dependent traits. In stark contrast, transgenic plants carrying a PFT1 transgene lacking the STR phenocopied a pft1 loss-of-function mutant for flowering time phenotypes and were generally hypomorphic for other traits, establishing the functional importance of this domain. Transgenic plants carrying various synthetic constructs occupied the phenotypic space between wild-type and pft1 loss-of-function mutants. By varying PFT1 STR length, we discovered that PFT1 can act as either an activator or repressor of flowering in a photoperiod-dependent manner. We conclude that the PFT1 STR is constrained to its approximate wild-type length by its various functional requirements. Our study implies that there is strong selection on STRs not only to generate allelic diversity, but also to maintain certain lengths pursuant to optimal molecular function.
|
139
|
Milani D, Cabral-de-Mello DC. Microsatellite organization in the grasshopper Abracris flavolineata (Orthoptera: Acrididae) revealed by FISH mapping: remarkable spreading in the A and B chromosomes. PLoS One 2014; 9:e97956. [PMID: 24871300 PMCID: PMC4037182 DOI: 10.1371/journal.pone.0097956] [Citation(s) in RCA: 30] [Impact Index Per Article: 2.7] [Received: 01/29/2014] [Accepted: 04/27/2014] [Indexed: 12/12/2022] Open
Abstract
With the aim of acquiring deeper knowledge about the chromosomal organization of repetitive DNAs in grasshoppers, we used fluorescent in situ hybridization (FISH) to map the distribution of 16 microsatellite repeats, including mono-, di-, tri- and tetra-nucleotides, in the chromosomes of the species Abracris flavolineata (Acrididae), which harbors a B chromosome. FISH revealed two main patterns: (i) exclusively scattered signals, and (ii) scattered and specific signals, forming evident blocks. The enrichment was observed in both euchromatic and heterochromatic areas, and only the motif (C)30 was absent in heterochromatin. The A and B chromosomes were enriched with all the elements that were mapped, with more distinctive blocks for (GA)15 and (GAG)10 observed in the B chromosome. For the A complement, distinctive blocks were noticed for (A)30, (CA)15, (CG)15, (GA)15, (CAC)10, (CAA)10, (CGG)10, (GAA)10, (GAC)10 and (GATA)8. These results revealed an intense spreading of microsatellites in the A. flavolineata genome that was independent of the A+T or G+C enrichment in the repeats. The data indicate that microsatellites are a component of the B chromosome and could be involved in the evolution of this element in this species, although no specific relationship with any A chromosome was observed that would shed light on its origin. The systematic analysis presented here contributes to the knowledge of repetitive DNA chromosomal organization among grasshoppers, including the B chromosomes.
Affiliation(s)
- Diogo Milani
- UNESP - Univ Estadual Paulista, Instituto de Biociências/IB, Departamento de Biologia, Rio Claro, São Paulo, Brazil
|
140
|
Prevalence and implications of elevated microsatellite alterations at selected tetranucleotides in cancer. Br J Cancer 2014; 111:823-7. [PMID: 24691426 PMCID: PMC4150258 DOI: 10.1038/bjc.2014.167] [Citation(s) in RCA: 49] [Impact Index Per Article: 4.5] [Revised: 03/01/2014] [Accepted: 03/05/2014] [Indexed: 12/22/2022] Open
Abstract
Elevated microsatellite alterations at selected tetranucleotides (EMAST), a variation of microsatellite instability (MSI), has been reported in a variety of malignancies (e.g., neoplasias of the lung, head and neck, colorectal region, skin, urinary tract and reproductive organs). EMAST is more prominent at organ sites with potential external exposure to carcinogens (e.g., head, neck, lung, urinary bladder and colon), although the specific molecular mechanisms leading to EMAST remain elusive. Because it is often associated with advanced stages of malignancy, EMAST may be a consequence of rapid cell proliferation and increased mutagenesis. Moreover, defects in DNA mismatch repair enzyme complexes, TP53 mutation status and peritumoural inflammation involving T cells have been described in EMAST tumours. At various tumour sites, EMAST and high-frequency MSI share no clinicopathological features or molecular mechanisms, suggesting their existence as separate entities. Thus EMAST should be explored, because its presence in human cells may reflect both increased risk and the potential for early detection. In particular, the potential use of EMAST in prognosis and prediction may yield novel types of therapeutic intervention, particularly those involving the immune system. This review will summarise the current information concerning EMAST in cancer to highlight the knowledge gaps that require further research.
|
141
|
Kim TM, Laird PW, Park PJ. The landscape of microsatellite instability in colorectal and endometrial cancer genomes. Cell 2014; 155:858-68. [PMID: 24209623 DOI: 10.1016/j.cell.2013.10.015] [Citation(s) in RCA: 272] [Impact Index Per Article: 24.7] [Received: 01/25/2013] [Revised: 07/11/2013] [Accepted: 10/02/2013] [Indexed: 12/30/2022]
Abstract
Microsatellites, simple tandem repeats present at millions of sites in the human genome, can shorten or lengthen due to a defect in DNA mismatch repair. We present here a comprehensive genome-wide analysis of the prevalence, mutational spectrum, and functional consequences of microsatellite instability (MSI) in cancer genomes. We analyzed MSI in 277 colorectal and endometrial cancer genomes (including 57 microsatellite-unstable ones) using exome and whole-genome sequencing data. Recurrent MSI events in coding sequences showed tumor type specificity, elevated frameshift-to-inframe ratios, and lower transcript levels than wild-type alleles. Moreover, genome-wide analysis revealed differences in the distribution of MSI versus point mutations, including overrepresentation of MSI in euchromatic and intronic regions compared to heterochromatic and intergenic regions, respectively, and depletion of MSI at nucleosome-occupied sequences. Our results provide a panoramic view of MSI in cancer genomes, highlighting their tumor type specificity, impact on gene expression, and the role of chromatin organization.
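The frameshift-to-inframe ratio mentioned in this abstract rests on a simple rule: a coding indel preserves the reading frame only if its net length change is a multiple of three. The following is a minimal Python sketch of that rule, with hypothetical function names and made-up length changes rather than data or methods from the study.

```python
def classify_indel(length_change_bp):
    """Classify a coding indel by its net length change in bp (nonzero)."""
    if length_change_bp == 0:
        raise ValueError("a length change of 0 is not an indel")
    return "inframe" if length_change_bp % 3 == 0 else "frameshift"


def frameshift_to_inframe_ratio(length_changes):
    """Ratio of frameshift to in-frame events; None if no in-frame events."""
    labels = [classify_indel(d) for d in length_changes]
    inframe = labels.count("inframe")
    return labels.count("frameshift") / inframe if inframe else None


# Made-up example: -1 and +2 shift the frame; -3 and +6 do not.
example_ratio = frameshift_to_inframe_ratio([-1, +2, -3, +6])
```

For a mononucleotide repeat, a gain or loss of one or two units always shifts the frame, whereas a trinucleotide repeat changes length in frame-preserving steps.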
Affiliation(s)
- Tae-Min Kim
- Center for Biomedical Informatics, Harvard Medical School, Boston, MA 02115, USA; Cancer Evolution Research Center, College of Medicine, The Catholic University of Korea, Seoul 137-701, Korea
|
142
|
Wang Y, Chen M, Wang H, Wang JF, Bao D. Microsatellites in the genome of the edible mushroom, Volvariella volvacea. BioMed Research International 2014; 2014:281912. [PMID: 24575404 PMCID: PMC3915763 DOI: 10.1155/2014/281912] [Citation(s) in RCA: 21] [Impact Index Per Article: 1.9] [Received: 10/02/2013] [Revised: 10/23/2013] [Accepted: 10/23/2013] [Indexed: 01/13/2023]
Abstract
Using bioinformatics software and databases, we have characterized the microsatellite pattern in the V. volvacea genome and compared it with microsatellite patterns found in the genomes of four other edible fungi: Coprinopsis cinerea, Schizophyllum commune, Agaricus bisporus, and Pleurotus ostreatus. A total of 1346 microsatellites have been identified, with mononucleotides being the most frequent motif. The relative abundance of microsatellites was lower in coding regions, at 21 per Mb. However, the microsatellites in the V. volvacea gene models showed a greater tendency to be located in the CDS regions. There was also a higher preponderance of trinucleotide repeats, especially in the kinase genes, which implied a possible role in phenotypic variation. Among the five fungal genomes, microsatellite abundance appeared to be unrelated to genome size. Furthermore, the short motifs (mono- to trinucleotides) outnumbered other categories, although these differed in proportion. Data analysis indicated a possible relationship between the most frequent microsatellite types and the genetic distance between the five fungal genomes.
Affiliation(s)
- Ying Wang
- National Engineering Research Center of Edible Fungi and Key Laboratory of Applied Mycological Resources and Utilization, Ministry of Agriculture and Shanghai Key Laboratory of Agricultural Genetics and Breeding and Institute of Edible Fungi, Shanghai Academy of Agriculture Science, Shanghai 201403, China
- Mingjie Chen
- National Engineering Research Center of Edible Fungi and Key Laboratory of Applied Mycological Resources and Utilization, Ministry of Agriculture and Shanghai Key Laboratory of Agricultural Genetics and Breeding and Institute of Edible Fungi, Shanghai Academy of Agriculture Science, Shanghai 201403, China
- Hong Wang
- National Engineering Research Center of Edible Fungi and Key Laboratory of Applied Mycological Resources and Utilization, Ministry of Agriculture and Shanghai Key Laboratory of Agricultural Genetics and Breeding and Institute of Edible Fungi, Shanghai Academy of Agriculture Science, Shanghai 201403, China
- Jing-Fang Wang
- Key Laboratory of Systems Biomedicine, Shanghai Center for Systems Biomedicine, Shanghai Jiao Tong University, Shanghai 200240, China
- Dapeng Bao
- National Engineering Research Center of Edible Fungi and Key Laboratory of Applied Mycological Resources and Utilization, Ministry of Agriculture and Shanghai Key Laboratory of Agricultural Genetics and Breeding and Institute of Edible Fungi, Shanghai Academy of Agriculture Science, Shanghai 201403, China
|
143
|
Abstract
Tandem repeats (TRs) extensively exist in the genomes of prokaryotes and eukaryotes. Based on the sequenced genomes and gene annotations of 31 plant and algal species in Phytozome version 8.0 (http://www.phytozome.net/), we examined TRs in a genome-wide scale, characterized their distributions and motif features, and explored their putative biological functions. Among the 31 species, no significant correlation was detected between the TR density and genome size. Interestingly, green alga Chlamydomonas reinhardtii (42,059 bp/Mbp) and castor bean Ricinus communis (55,454 bp/Mbp) showed much higher TR densities than all other species (13,209 bp/Mbp on average). In the 29 land plants, including 22 dicots, 5 monocots, and 2 bryophytes, 5′-UTR and upstream intergenic 200-nt (UI200) regions had the first and second highest TR densities, whereas in the two green algae (C. reinhardtii and Volvox carteri) the first and second highest densities were found in intron and coding sequence (CDS) regions, respectively. In CDS regions, trinucleotide and hexanucleotide motifs were those most frequently represented in all species. In intron regions, especially in the two green algae, significantly more TRs were detected near the intron–exon junctions. Within intergenic regions in dicots and monocots, more TRs were found near both the 5′ and 3′ ends of genes. GO annotation in two green algae revealed that the genes with TRs in introns are significantly involved in transcriptional and translational processing. As the first systematic examination of TRs in plant and green algal genomes, our study showed that TRs displayed nonrandom distribution for both intragenic and intergenic regions, suggesting that they have potential roles in transcriptional or translational regulation in plants and green algae.
|
144
|
GATA simple sequence repeats function as enhancer blocker boundaries. Nat Commun 2013; 4:1844. [PMID: 23673629 DOI: 10.1038/ncomms2872] [Citation(s) in RCA: 21] [Impact Index Per Article: 1.8] [Received: 10/13/2012] [Accepted: 04/11/2013] [Indexed: 11/09/2022] Open
Abstract
Simple sequence repeats (SSRs) account for ~3% of the human genome, but their functional significance remains unclear. One of the prominent SSRs, the GATA tetranucleotide repeat, has preferentially accumulated in complex organisms. GATA repeats are particularly enriched on the human Y chromosome, and their non-random distribution and exclusive association with genes expressed during early development indicate a role in coordinated gene regulation. Here we show that GATA repeats have enhancer blocker activity in Drosophila and human cells. This enhancer blocker activity is seen in the transgenic as well as the native context of the enhancers at various developmental stages. These findings ascribe functional significance to SSRs and offer an explanation as to why SSRs, especially GATA, may have accumulated in complex organisms.
|
145
|
Ream DC, Murakami ST, Schmidt EE, Huang GH, Liang C, Friedberg I, Cheng XW. Comparative analysis of error-prone replication mononucleotide repeats across baculovirus genomes. Virus Res 2013; 178:217-25. [PMID: 24140718 DOI: 10.1016/j.virusres.2013.10.005] [Citation(s) in RCA: 4] [Impact Index Per Article: 0.3] [Received: 07/06/2013] [Revised: 10/04/2013] [Accepted: 10/07/2013] [Indexed: 11/25/2022]
Abstract
Genome replication by the baculovirus DNA polymerase often generates errors in mononucleotide repeat (MNR) sequences due to replication slippage. This results in the inactivation of genes, which affects different stages of the cell infection cycle. Here we mapped these MNRs in 59 baculovirus genomes. We found that the MNR frequencies of baculovirus genomes differ and are not correlated with genome size. Although the average A/T content of baculoviruses is 58.67%, the A/T MNR frequency is significantly higher than that of the G/C MNRs. Furthermore, the A7/T7 MNRs are the most frequent of those we studied. Finally, MNR frequencies in different classes of baculovirus genes, such as immediate early genes, differ between baculovirus genomes, suggesting that the distribution and frequency of different MNRs are unique to each baculovirus species or strain. Therefore, the results of this study can help select appropriate baculoviruses for the development of biological insecticides.
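Mapping MNRs amounts to finding runs of a single base and tallying them by base and run length. The sketch below shows one way this could be done on a single strand. The function name and the `min_run` cutoff of 7 are assumptions made for illustration, and the abstract does not state the thresholds or strand handling the authors used.

```python
import re
from collections import Counter


def count_mnrs(seq, min_run=7):
    """Tally mononucleotide runs as {(base, run_length): count}.

    min_run is a hypothetical cutoff, not a value taken from the study.
    """
    counts = Counter()
    # Each alternative matches a maximal run of one base.
    for m in re.finditer(r"A+|C+|G+|T+", seq.upper()):
        run = m.group()
        if len(run) >= min_run:
            counts[(run[0], len(run))] += 1
    return counts
```

Summing the entries for `("A", 7)` and `("T", 7)` would give an A7/T7 count of the kind the abstract refers to, under the assumption that A and T runs are pooled.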
Affiliation(s)
- David C Ream
- Department of Microbiology, Miami University, Oxford, OH, USA
|
146
|
Chapal-Ilani N, Maruvka YE, Spiro A, Reizel Y, Adar R, Shlush LI, Shapiro E. Comparing algorithms that reconstruct cell lineage trees utilizing information on microsatellite mutations. PLoS Comput Biol 2013; 9:e1003297. [PMID: 24244121 PMCID: PMC3828138 DOI: 10.1371/journal.pcbi.1003297] [Citation(s) in RCA: 12] [Impact Index Per Article: 1.0] [Received: 12/01/2012] [Accepted: 09/09/2013] [Indexed: 11/18/2022]
Abstract
The cells of an organism proliferate and die to build, maintain, renew and repair it. The cellular history of an organism up to any point in time can be captured by a cell lineage tree in which vertices represent all organism cells, past and present, and directed edges represent progeny relations among them. The root represents the fertilized egg, and the leaves represent extant and dead cells. Somatic mutations accumulated during cell division endow each organism cell with a genomic signature that is unique with a very high probability. Distances between such genomic signatures can be used to reconstruct an organism's cell lineage tree. Cell populations possess unique features that are absent or rare in organism populations (e.g., the presence of stem cells and a small number of generations since the zygote) and do not undergo sexual reproduction; hence the reconstruction of cell lineage trees calls for careful examination and adaptation of the standard tools of population genetics. Our lab developed a method for reconstructing cell lineage trees by examining only mutations in highly variable microsatellite loci (MS, also called short tandem repeats, STR). In this study we use experimental data on somatic mutations in MS of individual cells in humans and mice in order to validate and quantify the utility of known lineage tree reconstruction algorithms in this context. We employed extensive measurements of somatic mutations in individual cells which were isolated from healthy and diseased tissues of mice and humans. The validation was done by analyzing the ability to infer known and clear biological scenarios. In general, we found that if the biological scenario is simple, almost all algorithms tested can infer it. Another somewhat surprising conclusion is that the best algorithm among those tested is Neighbor Joining where the distance measure used is normalized absolute distance. We include our full dataset in Tables S1, S2, S3, S4, S5 to enable further analysis of this data by others.
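The abstract names a normalized absolute distance as the best-performing input to Neighbor Joining but does not define it. The sketch below assumes one plausible reading: the mean absolute difference in repeat number over the loci that were successfully genotyped in both cells, with `None` marking a missing locus. The function names and the data layout are hypothetical, and the paper's exact definition may differ.

```python
def normalized_absolute_distance(cell_a, cell_b):
    """Mean absolute repeat-number difference over loci present in both cells.

    Assumed definition for illustration; None marks a locus with no call.
    """
    shared = [
        (a, b)
        for a, b in zip(cell_a, cell_b)
        if a is not None and b is not None
    ]
    if not shared:
        return float("nan")  # no locus in common, distance undefined
    return sum(abs(a - b) for a, b in shared) / len(shared)


def distance_matrix(cells):
    """Symmetric pairwise distance matrix for a list of cell signatures."""
    n = len(cells)
    matrix = [[0.0] * n for _ in range(n)]
    for i in range(n):
        for j in range(i + 1, n):
            d = normalized_absolute_distance(cells[i], cells[j])
            matrix[i][j] = matrix[j][i] = d
    return matrix
```

A matrix built this way could then be passed to any standard Neighbor Joining implementation. Normalizing by the number of shared loci keeps cells with many missing calls from appearing artificially close.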
Affiliation(s)
- Noa Chapal-Ilani
- Department of Computer Science and Applied Mathematics, Weizmann Institute of Science, Rehovot, Israel
|
147
|
Mutation rates, spectra, and genome-wide distribution of spontaneous mutations in mismatch repair deficient yeast. G3-GENES GENOMES GENETICS 2013; 3:1453-65. [PMID: 23821616 PMCID: PMC3755907 DOI: 10.1534/g3.113.006429] [Citation(s) in RCA: 86] [Impact Index Per Article: 7.2] [Indexed: 01/18/2023]
Abstract
DNA mismatch repair is a highly conserved DNA repair pathway. In humans, germline mutations in hMSH2 or hMLH1, key components of mismatch repair, have been associated with Lynch syndrome, a leading cause of inherited cancer mortality. Current estimates of the mutation rate and the mutational spectra in mismatch repair defective cells are primarily limited to a small number of individual reporter loci. Here we use the yeast Saccharomyces cerevisiae to generate a genome-wide view of the rates, spectra, and distribution of mutation in the absence of mismatch repair. We performed mutation accumulation assays and next generation sequencing on 19 strains, including 16 msh2 missense variants implicated in Lynch cancer syndrome. The mutation rate for DNA mismatch repair null strains was approximately 1 mutation per genome per generation, 225-fold greater than the wild-type rate. The mutations were distributed randomly throughout the genome, independent of replication timing. The mutation spectra included insertions/deletions at homopolymeric runs (87.7%) and at larger microsatellites (5.9%), as well as transitions (4.5%) and transversions (1.9%). Additionally, repeat regions with proximal repeats are more likely to be mutated. A bias toward deletions at homopolymers and insertions at (AT)n microsatellites suggests a different mechanism for mismatch generation at these sites. Interestingly, 5% of the single base pair substitutions might represent double-slippage events that occurred at the junction of immediately adjacent repeats, resulting in a shift in the repeat boundary. These data suggest that tumor suppressors containing homopolymeric runs with proximal repeats warrant closer scrutiny as potential drivers of oncogenesis in mismatch repair defective cells.
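The figures in this abstract can be cross-checked with a little arithmetic. The sketch below derives the implied wild-type rate from the stated fold change and confirms that the four spectrum categories account for all mutations. The variable names are illustrative, and the null rate of 1 is the abstract's approximate value, so the derived wild-type rate is approximate as well.

```python
# Values taken from the abstract; the null rate is stated as approximate.
null_rate = 1.0        # mutations per genome per generation
fold_increase = 225.0  # null rate relative to wild type

# Implied wild-type rate, roughly 0.0044 mutations per genome per generation.
wild_type_rate = null_rate / fold_increase

# Mutation spectrum in percent of all observed mutations.
spectrum = {
    "homopolymer indels": 87.7,
    "microsatellite indels": 5.9,
    "transitions": 4.5,
    "transversions": 1.9,
}
total = sum(spectrum.values())
# Share of mutations that are insertions/deletions at repeats.
indel_share = spectrum["homopolymer indels"] + spectrum["microsatellite indels"]
```

Under these numbers, insertions and deletions at repeats make up 93.6% of the spectrum and single base substitutions the remaining 6.4%.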
|
148
|
Grandi FC, An W. Non-LTR retrotransposons and microsatellites: Partners in genomic variation. Mob Genet Elements 2013; 3:e25674. [PMID: 24195012 PMCID: PMC3812793 DOI: 10.4161/mge.25674] [Citation(s) in RCA: 29] [Impact Index Per Article: 2.4] [Received: 04/22/2013] [Revised: 07/07/2013] [Accepted: 07/09/2013] [Indexed: 01/10/2023]
Abstract
The human genome is laden with both non-LTR (long terminal repeat) retrotransposons and microsatellite repeats. Both types of sequences are able to mutagenize the genomes of human individuals, either actively or passively, and are therefore poised to dynamically alter the human genomic landscape across generations. Non-LTR retrotransposons, such as L1 and Alu, are a major source of new microsatellites, which are born both concurrently with and subsequently to L1 and Alu integration into the genome. Likewise, the mutation dynamics of microsatellite repeats have a direct impact on the fitness of their non-LTR retrotransposon parent owing to microsatellite expansion and contraction. This review explores the interactions and dynamics between non-LTR retrotransposons and microsatellites in the context of genomic variation and evolution.
Affiliation(s)
- Fiorella C Grandi
- School of Molecular Biosciences and Center for Reproductive Biology; Washington State University; Pullman, WA USA
|
149
|
Gao C, Ren X, Mason AS, Li J, Wang W, Xiao M, Fu D. Revisiting an important component of plant genomes: microsatellites. FUNCTIONAL PLANT BIOLOGY : FPB 2013; 40:645-661. [PMID: 32481138 DOI: 10.1071/fp12325] [Citation(s) in RCA: 23] [Impact Index Per Article: 1.9] [Received: 10/31/2012] [Accepted: 01/16/2013] [Indexed: 06/11/2023]
Abstract
Microsatellites are some of the most highly variable repetitive DNA tracts in genomes. Few studies focus on whether the characteristic instability of microsatellites is linked to phenotypic effects in plants. We summarise recent data to investigate how microsatellite variations affect gene expression and hence phenotype. We discuss how the basic characteristics of microsatellites may contribute to phenotypic effects. In summary, microsatellites in plants are universal and highly mutable, they coexist and coevolve with transposable elements, and are under selective pressure. The number of motif nucleotides, the type of motif and transposon activity all contribute to the nonrandom generation and decay of microsatellites, and to conservation and distribution biases. Although microsatellites are generated by accident, they mature through responses to environmental change before final decay. This process is mediated by organism adjustment mechanisms, which maintain a balance between birth versus death and growth versus decay in microsatellites. Close relationships also exist between the physical structure, variation and functionality of microsatellites: in most plant species, sequences containing microsatellites are associated with catalytic activity and binding functions, are expressed in the membrane and organelles, and participate in the developmental and metabolic processes. Microsatellites contribute to genome structure and functional plasticity, and may be considered to promote species evolution in plants in response to environmental changes. In conclusion, the generation, loss, functionality and evolution of microsatellites can be related to plant gene expression and functional alterations. The effect of microsatellites on phenotypic variation may be as significant in plants as it is in animals.
Affiliation(s)
- Caihua Gao
- Engineering Research Center of South Upland Agriculture, Ministry of Education, College of Agronomy and Biotechnology, Southwest University, Chongqing 400715, China
- Xiaodong Ren
- Engineering Research Center of South Upland Agriculture, Ministry of Education, College of Agronomy and Biotechnology, Southwest University, Chongqing 400715, China
- Annaliese S Mason
- Centre for Integrative Legume Research and School of Agriculture and Food Sciences, The University of Queensland, Brisbane 4072, Qld, Australia
- Jiana Li
- Engineering Research Center of South Upland Agriculture, Ministry of Education, College of Agronomy and Biotechnology, Southwest University, Chongqing 400715, China
- Wei Wang
- Engineering Research Center of South Upland Agriculture, Ministry of Education, College of Agronomy and Biotechnology, Southwest University, Chongqing 400715, China
- Meili Xiao
- Engineering Research Center of South Upland Agriculture, Ministry of Education, College of Agronomy and Biotechnology, Southwest University, Chongqing 400715, China
- Donghui Fu
- Key Laboratory of Crop Physiology, Ecology and Genetic Breeding, Ministry of Education, Jiangxi Agricultural University, Nanchang, Jiangxi 330045, China
|
150
|
Hahn Y. Evidence for the dissemination of cryptic non-coding RNAs transcribed from intronic and intergenic segments by retroposition. Bioinformatics 2013; 29:1593-9. [PMID: 23652427 DOI: 10.1093/bioinformatics/btt258] [Citation(s) in RCA: 1] [Impact Index Per Article: 0.1] [Indexed: 11/12/2022]
Abstract
MOTIVATION: Insertion of DNA segments is one mechanism by which genomes evolve. The bulk of genomic segments are now known to be transcribed into long and short non-coding RNAs (ncRNAs), promoter-associated transcripts and enhancer-templated transcripts. These various cryptic ncRNAs are thought to be dispersed in the human and other genomes by retroposition. RESULTS: In this study, I report clear evidence for dissemination of cryptic ncRNAs transcribed from intronic and intergenic segments by retroposition. I used highly stringent conditions to find recently retroposed ncRNAs that had a poly(A) tract and were flanked by target site duplication. I identified 73 instances of retroposition in the human, mouse, and rat genomes (12, 36 and 25 instances, respectively). The inserted segments, in some cases, served as a novel exon or promoter for the associated gene, resulting in novel transcript variants. Some disseminated sequences showed sequence conservation across animals, implying a possible regulatory role. My results indicate that retroposition is one of the mechanisms for dispersion of ncRNAs. I propose that these newly inserted segments may play a role in genome evolution by potentially functioning as novel exons, promoters or enhancers. SUPPLEMENTARY INFORMATION: Supplementary data are available at Bioinformatics online.
Affiliation(s)
- Yoonsoo Hahn
- Department of Life Science, Research Center for Biomolecules and Biosystems, Chung-Ang University, Seoul 156-756, Korea.
|