1
Maekawa K, Yamada S, Sharma R, Chaudhuri J, Keeney S. Triple-helix potential of the mouse genome. Proc Natl Acad Sci U S A 2022; 119:e2203967119. [PMID: 35503911 PMCID: PMC9171763 DOI: 10.1073/pnas.2203967119] [Citation(s) in RCA: 7] [Impact Index Per Article: 3.5] [Received: 03/07/2022] [Accepted: 03/30/2022] [Indexed: 01/14/2023]
Abstract
Certain DNA sequences, including mirror-symmetric polypyrimidine•polypurine runs, are capable of folding into a triple-helix–containing non–B-form DNA structure called H-DNA. Such H-DNA–forming sequences occur frequently in many eukaryotic genomes, including in mammals, and multiple lines of evidence indicate that these motifs are mutagenic and can impinge on DNA replication, transcription, and other aspects of genome function. In this study, we show that the triplex-forming potential of H-DNA motifs in the mouse genome can be evaluated using S1-sequencing (S1-seq), which uses the single-stranded DNA (ssDNA)–specific nuclease S1 to generate deep-sequencing libraries that report on the position of ssDNA throughout the genome. When S1-seq was applied to genomic DNA isolated from mouse testis cells and splenic B cells, we observed prominent clusters of S1-seq reads that appeared to be independent of endogenous double-strand breaks, that coincided with H-DNA motifs, and that correlated strongly with the triplex-forming potential of the motifs. Fine-scale patterns of S1-seq reads, including a pronounced strand asymmetry in favor of centrally positioned reads on the pyrimidine-containing strand, suggested that this S1-seq signal is specific for one of the four possible isomers of H-DNA (H-y5). By leveraging the abundance and complexity of naturally occurring H-DNA motifs across the mouse genome, we further defined how polypyrimidine repeat length and the presence of repeat-interrupting substitutions modify the structure of H-DNA. This study provides an approach for studying DNA secondary structure genome-wide at high spatial resolution.
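As a rough illustration of what a "mirror-symmetric polypyrimidine run" looks like at the sequence level, the sketch below scans one strand for C/T-only runs that read the same forwards and backwards. It is a toy under stated assumptions, not the motif definition used in the study: the function name `mirror_pyrimidine_runs`, the `min_len` threshold, and the requirement of perfect symmetry with no central spacer are all hypothetical choices made for this example.

```python
import re

def mirror_pyrimidine_runs(seq, min_len=10):
    """Yield (start, run) for C/T-only runs that are perfect mirror repeats.

    Assumption for illustration: a run qualifies only if the whole run equals
    its own reverse (not its reverse complement) and is at least min_len long.
    """
    pattern = re.compile(r"[CT]{%d,}" % min_len)
    for match in pattern.finditer(seq.upper()):
        run = match.group(0)
        if run == run[::-1]:  # mirror symmetry: same bases read in either direction
            yield match.start(), run

# Toy input: a 10-nt mirror-symmetric pyrimidine run flanked by purines.
hits = list(mirror_pyrimidine_runs("AACTCTTTTCTCGG"))
```

Practical H-DNA motif scans typically tolerate a short spacer and imperfect symmetry, so a filter this strict would miss many candidate motifs.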
Affiliation(s)
- Kaku Maekawa
  - Molecular Biology Program, Memorial Sloan Kettering Cancer Center, New York, NY 10065
  - Department of Radiation Genetics, Graduate School of Medicine, Kyoto University, Kyoto 606-8501, Japan
- Shintaro Yamada
  - Molecular Biology Program, Memorial Sloan Kettering Cancer Center, New York, NY 10065
  - Department of Radiation Genetics, Graduate School of Medicine, Kyoto University, Kyoto 606-8501, Japan
- Rahul Sharma
  - Immunology Program, Memorial Sloan Kettering Cancer Center, New York, NY 10065
- Jayanta Chaudhuri
  - Immunology Program, Memorial Sloan Kettering Cancer Center, New York, NY 10065
- Scott Keeney
  - Molecular Biology Program, Memorial Sloan Kettering Cancer Center, New York, NY 10065
  - HHMI, Memorial Sloan Kettering Cancer Center, New York, NY 10065
2
Ranathunge C, Wheeler GL, Chimahusky ME, Kennedy MM, Morrison JI, Baldwin BS, Perkins AD, Welch ME. Transcriptome profiles of sunflower reveal the potential role of microsatellites in gene expression divergence. Mol Ecol 2018; 27:1188-1199. [PMID: 29419922 DOI: 10.1111/mec.14522] [Citation(s) in RCA: 10] [Impact Index Per Article: 1.7] [Received: 05/31/2017] [Revised: 01/18/2018] [Accepted: 01/29/2018] [Indexed: 12/17/2022]
Abstract
The mechanisms by which natural populations generate adaptive genetic variation are not well understood. Some studies propose that microsatellites can function as drivers of adaptive variation. Here, we tested a potentially adaptive role for transcribed microsatellites with natural populations of the common sunflower (Helianthus annuus L.) by assessing the enrichment of microsatellites in genes that show expression divergence across latitudes. Seeds collected from six populations at two distinct latitudes in Kansas and Oklahoma were planted and grown in a common garden. Morphological measurements from the common garden demonstrated that phenotypic variation among populations is largely explained by underlying genetic variation. An RNA-Seq experiment was conducted with 96 of the individuals grown in the common garden, and 825 transcripts that were differentially expressed (DE) between the two latitudes were identified. DE transcripts and nondifferentially expressed (NDE) transcripts were then scanned for microsatellites, and the abundance of different motif lengths and types in both groups was estimated. Our results indicate that DE transcripts are significantly enriched with mononucleotide repeats and significantly depauperate in trinucleotide repeats. Further, the standardized mononucleotide repeat motif A and dinucleotide repeat motif AG were significantly enriched within DE transcripts, while motif types C, AT, ACC, and AAC in DE transcripts were significantly differentiated in microsatellite tract length between the two latitudes. The tract length differentiation at specific microsatellite motif types across latitudes and their enrichment within DE transcripts indicate a potential functional role for transcribed microsatellites in gene expression divergence in sunflower.
Affiliation(s)
- Chathurani Ranathunge
  - Department of Biological Sciences, Mississippi State University, Starkville, MS, USA
- Gregory L Wheeler
  - Department of Biological Sciences, Mississippi State University, Starkville, MS, USA
  - Department of Evolution, Ecology, and Organismal Biology, The Ohio State University, Columbus, OH, USA
- Melody E Chimahusky
  - Department of Biological Sciences, Mississippi State University, Starkville, MS, USA
- Meaghan M Kennedy
  - Department of Biological Sciences, Carnegie Mellon University, Pittsburgh, PA, USA
- Jesse I Morrison
  - Department of Plant and Soil Sciences, Mississippi State University, Starkville, MS, USA
- Brian S Baldwin
  - Department of Plant and Soil Sciences, Mississippi State University, Starkville, MS, USA
- Andy D Perkins
  - Department of Computer Science and Engineering, Mississippi State University, Starkville, MS, USA
- Mark E Welch
  - Department of Biological Sciences, Mississippi State University, Starkville, MS, USA
3
Carr CE, Ganugula R, Shikiya R, Soto AM, Marky LA. Effect of dC → d(m5C) substitutions on the folding of intramolecular triplexes with mixed TAT and C+GC base triplets. Biochimie 2018; 146:156-165. [PMID: 29277568 PMCID: PMC5811340 DOI: 10.1016/j.biochi.2017.12.008] [Citation(s) in RCA: 3] [Impact Index Per Article: 0.5] [Received: 10/19/2017] [Accepted: 12/19/2017] [Indexed: 12/31/2022]
Abstract
Oligonucleotide-directed triple helix formation has been recognized as a potential tool for targeting genes with high specificity. Cytosine methylation at the 5 position is a ubiquitous and stable regulatory modification, which could potentially stabilize triple helix formation. In this work, we have used a combination of calorimetric and spectroscopic techniques to study the intramolecular unfolding of four triplexes and two duplexes. We used the following triplex control sequence, named Control Tri, d(AGAGAC5TCTCTC5TCTCT), where C5 are loops of five cytosines. From this sequence, we studied three other sequences with dC → d(m5C) substitutions on the Hoogsteen strand (2MeH), Crick strand (2MeC) and both strands (4MeHC). Calorimetric studies determined that methylation does increase the thermal and enthalpic stability, leading to an overall favorable free energy, and that this increased stability is cumulative, i.e. methylation on both the Hoogsteen and Crick strands yields the largest favorable free energy. The differential uptake of protons, counterions and water was determined. It was found that methylation increases cytosine protonation by shifting the apparent pKa value to a higher pH; this increase in proton uptake coincides with a release of counterions during folding of the triplex, likely due to repulsion from the increased positive charge from the protonated cytosines. The immobilization of water was not affected for triplexes with methylated cytosines on their Hoogsteen or Crick strands, but was seen for the triplex where both strands are methylated. This may be due to the alignment in the major groove of the methyl groups on the cytosines with the methyl groups on the thymines, which causes an increase in structural water along the spine of the triplex.
Affiliation(s)
- Carolyn E Carr
  - Department of Pharmaceutical Sciences, University of Nebraska Medical Center, 986025 Nebraska Medical Center, Omaha, NE, 68198-6025, USA
- Rajkumar Ganugula
  - Department of Pharmaceutical Sciences, University of Nebraska Medical Center, 986025 Nebraska Medical Center, Omaha, NE, 68198-6025, USA
- Ronald Shikiya
  - Department of Pharmaceutical Sciences, University of Nebraska Medical Center, 986025 Nebraska Medical Center, Omaha, NE, 68198-6025, USA
- Ana Maria Soto
  - Department of Pharmaceutical Sciences, University of Nebraska Medical Center, 986025 Nebraska Medical Center, Omaha, NE, 68198-6025, USA
- Luis A Marky
  - Department of Pharmaceutical Sciences, University of Nebraska Medical Center, 986025 Nebraska Medical Center, Omaha, NE, 68198-6025, USA
4
Srivastava A, Kumar AS, Mishra RK. Vertebrate GAF/ThPOK: emerging functions in chromatin architecture and transcriptional regulation. Cell Mol Life Sci 2018; 75:623-633. [PMID: 28856379 PMCID: PMC11105447 DOI: 10.1007/s00018-017-2633-7] [Citation(s) in RCA: 10] [Impact Index Per Article: 1.7] [Received: 05/20/2017] [Revised: 08/09/2017] [Accepted: 08/25/2017] [Indexed: 12/31/2022]
Abstract
GAGA factor of Drosophila melanogaster (DmGAF) is a multifaceted transcription factor with diverse roles in chromatin regulation. Recently, ThPOK/c-Krox was identified as its vertebrate homologue (vGAF), which has a basic domain structure similar to DmGAF and is decorated with a number of post-translationally modified residues. In vertebrate genomes, vGAF associates with purine-rich GAGA sequences and performs diverse chromatin-mediated functions, viz., gene activation, repression and enhancer blocking. Expansion of regulatory chromatin proteins with the acquisition of PTMs appears to be the general trend that facilitated the evolution of complexity in vertebrates. Here, we compare the structural and functional features of vGAF with those of DmGAF and also assess the possible functional redundancy among paralogues of vGAF. We also discuss the underlying mechanisms which aid in the diverse and context-dependent functions of this protein.
Affiliation(s)
- Avinash Srivastava
  - CSIR-Centre for Cellular and Molecular Biology (CCMB), Uppal Road, Hyderabad, 500007, India
- Amitha Sampath Kumar
  - CSIR-Centre for Cellular and Molecular Biology (CCMB), Uppal Road, Hyderabad, 500007, India
- Rakesh K Mishra
  - CSIR-Centre for Cellular and Molecular Biology (CCMB), Uppal Road, Hyderabad, 500007, India
5
Abe H, Gemmell NJ. Evolutionary Footprints of Short Tandem Repeats in Avian Promoters. Sci Rep 2016; 6:19421. [PMID: 26766026 PMCID: PMC4725869 DOI: 10.1038/srep19421] [Citation(s) in RCA: 12] [Impact Index Per Article: 1.5] [Received: 08/20/2015] [Accepted: 12/11/2015] [Indexed: 01/12/2023]
Abstract
Short tandem repeats (STRs) or microsatellites are well-known sequence elements that may change the spacing between transcription factor binding sites (TFBSs) in promoter regions by expansion or contraction of repetitive units. Some of these mutations have the potential to contribute to phenotypic diversity by altering patterns of gene expression. To explore how repetitive sequence motifs within promoters have evolved in avian lineages under mutation-selection balance, more than 400 evolutionary conserved STRs (ecSTRs) were identified in this study by comparing the 2 kb upstream promoter sequences of chicken against those of other birds (turkey, duck, zebra finch, and flycatcher). The rate of conservation was significantly higher in AG dinucleotide repeats than in AC or AT repeats, with the expansion of AG motifs being noticeably constrained in passerines. Analysis of the relative distance between ecSTRs and TFBSs revealed a significantly higher rate of conserved TFBSs in the vicinity of ecSTRs in both chicken-duck and chicken-passerine comparisons. Our comparative study provides a novel insight into which intrinsic factors have influenced the degree of constraint on repeat expansion/contraction during avian promoter evolution.
Affiliation(s)
- Hideaki Abe
  - Department of Anatomy, University of Otago, Dunedin 9054, New Zealand
- Neil J Gemmell
  - Department of Anatomy, University of Otago, Dunedin 9054, New Zealand
  - Allan Wilson Centre for Molecular Ecology and Evolution, University of Otago, Dunedin 9054, New Zealand
6
Detection of G-quadruplex DNA using primer extension as a tool. PLoS One 2015; 10:e0119722. [PMID: 25799152 PMCID: PMC4370603 DOI: 10.1371/journal.pone.0119722] [Citation(s) in RCA: 20] [Impact Index Per Article: 2.2] [Received: 06/05/2014] [Accepted: 01/23/2015] [Indexed: 01/22/2023]
Abstract
DNA sequence and structure play a key role in imparting fragility to different regions of the genome. Recent studies have shown that non-B DNA structures play a key role in causing genomic instability, apart from their physiological roles at telomeres and promoters. Structures such as G-quadruplexes, cruciforms, and triplexes have been implicated in making DNA susceptible to breakage, resulting in genomic rearrangements. Hence, techniques that aid in the easy identification of such non-B DNA motifs will prove to be very useful in determining factors responsible for genomic instability. In this study, we provide evidence for the use of primer extension as a sensitive and specific tool to detect such altered DNA structures. We have used the G-quadruplex motif recently characterized at the BCL2 major breakpoint region as a proof of principle to demonstrate the advantages of the technique. Our results show that pause sites corresponding to the non-B DNA are specific, since they are absent when the G-quadruplex motif is mutated and their positions change in tandem with that of the primers. The efficiency of primer extension pause sites varied according to the concentration of the monovalent cations tested, which support G-quadruplex formation. Overall, our results demonstrate that primer extension is a strong in vitro tool to detect non-B DNA structures such as G-quadruplexes on plasmid DNA, which can be further adapted to identify non-B DNA structures, even at the genomic level.
7
Mohammadparast S, Bayat H, Biglarian A, Ohadi M. Exceptional expansion and conservation of a CT-repeat complex in the core promoter of PAXBP1 in primates. Am J Primatol 2014; 76:747-56. [PMID: 24573656 DOI: 10.1002/ajp.22266] [Citation(s) in RCA: 28] [Impact Index Per Article: 2.8] [Received: 07/25/2013] [Revised: 12/28/2013] [Accepted: 01/28/2014] [Indexed: 11/11/2022]
Abstract
Adaptive evolution may be linked with the genomic distribution and function of short tandem repeats (STRs). Proximity of the core promoter STRs to the +1 transcription start site (TSS), and their mutable nature, are characteristics that highlight those STRs as a novel source of interspecies variation. The PAXBP1 gene (alternatively known as GCFC1) core promoter contains the longest STR identified in a Homo sapiens gene core promoter. Indeed, this core promoter is a stretch of four consecutive CT-STRs. In the current study, we used the Ensembl, NCBI, and UCSC databases to analyze the evolutionary trend and functional implication of this CT-STR complex in six major lineages across vertebrates, including primates, non-primate mammals, birds, reptiles, amphibians, and fish. We observed exceptional expansion (≥4-repeats) and conservation of this CT-STR complex across primates, except the prosimians Microcebus murinus and Otolemur garnettii (Fisher exact P<4.1×10⁻⁷). H. sapiens has the most complex STR formula and the longest repeats. Macaca mulatta and Callithrix jacchus monkeys have the simplest STR formulas and the shortest repeats. CT≥4-repeats were not detected in non-primate lineages. Different length alleles across the PAXBP1 core promoter CT-STRs significantly altered gene expression in vitro (P<0.001, t-test). PAXBP1 has a crucial role in craniofacial development, myogenesis, and spine morphogenesis, properties that have diverged between primates and non-primates. To our knowledge, this is the first instance of expansion and conservation of an STR complex co-occurring specifically with the primate lineage.
Affiliation(s)
- Saeid Mohammadparast
  - Genetics Research Center, University of Social Welfare and Rehabilitation Sciences, Tehran, Iran
8
YY1 and a unique DNA repeat element regulates the transcription of mouse CS1 (CD319, SLAMF7) gene. Mol Immunol 2013; 54:254-63. [DOI: 10.1016/j.molimm.2012.12.017] [Citation(s) in RCA: 5] [Impact Index Per Article: 0.5] [Received: 10/03/2012] [Revised: 12/17/2012] [Accepted: 12/19/2012] [Indexed: 12/22/2022]
9
Vasquez KM, Wang G. The yin and yang of repair mechanisms in DNA structure-induced genetic instability. Mutat Res 2013; 743-744:118-131. [PMID: 23219604 PMCID: PMC3661696 DOI: 10.1016/j.mrfmmm.2012.11.005] [Citation(s) in RCA: 19] [Impact Index Per Article: 1.7] [Received: 08/28/2012] [Revised: 11/21/2012] [Accepted: 11/24/2012] [Indexed: 01/14/2023]
Abstract
DNA can adopt a variety of secondary structures that deviate from the canonical Watson-Crick B-DNA form. More than 10 types of non-canonical or non-B DNA secondary structures have been characterized, and the sequences that have the capacity to adopt such structures are very abundant in the human genome. Non-B DNA structures have been implicated in many important biological processes and can serve as sources of genetic instability, implicating them in disease and evolution. Non-B DNA conformations interact with a wide variety of proteins involved in replication, transcription, DNA repair, and chromatin architectural regulation. In this review, we will focus on the interactions of DNA repair proteins with non-B DNA and their roles in genetic instability, as the proteins and DNA involved in such interactions may represent plausible targets for selective therapeutic intervention.
Affiliation(s)
- Karen M Vasquez
  - Division of Pharmacology and Toxicology, College of Pharmacy, The University of Texas at Austin, Dell Pediatric Research Institute, 1400 Barbara Jordan Blvd. R1800, Austin, TX 78723, United States
- Guliang Wang
  - Division of Pharmacology and Toxicology, College of Pharmacy, The University of Texas at Austin, Dell Pediatric Research Institute, 1400 Barbara Jordan Blvd. R1800, Austin, TX 78723, United States
10
Sawaya S, Bagshaw A, Buschiazzo E, Kumar P, Chowdhury S, Black MA, Gemmell N. Microsatellite tandem repeats are abundant in human promoters and are associated with regulatory elements. PLoS One 2013; 8:e54710. [PMID: 23405090 PMCID: PMC3566118 DOI: 10.1371/journal.pone.0054710] [Citation(s) in RCA: 108] [Impact Index Per Article: 9.8] [Received: 10/03/2012] [Accepted: 12/18/2012] [Indexed: 12/13/2022]
Abstract
Tandem repeats are genomic elements that are prone to changes in repeat number and are thus often polymorphic. These sequences are found at a high density at the start of human genes, in the gene’s promoter. Increasing empirical evidence suggests that length variation in these tandem repeats can affect gene regulation. One class of tandem repeats, known as microsatellites, rapidly alter in repeat number. Some of the genetic variation induced by microsatellites is known to result in phenotypic variation. Recently, our group developed a novel method for measuring the evolutionary conservation of microsatellites, and with it we discovered that human microsatellites near transcription start sites are often highly conserved. In this study, we examined the properties of microsatellites found in promoters. We found a high density of microsatellites at the start of genes. We showed that microsatellites are statistically associated with promoters using a wavelet analysis, which allowed us to test for associations on multiple scales and to control for other promoter related elements. Because promoter microsatellites tend to be G/C rich, we hypothesized that G/C rich regulatory elements may drive the association between microsatellites and promoters. Our results indicate that CpG islands, G-quadruplexes (G4) and untranslated regulatory regions have highly significant associations with microsatellites, but controlling for these elements in the analysis does not remove the association between microsatellites and promoters. Due to their intrinsic lability and their overlap with predicted functional elements, these results suggest that many promoter microsatellites have the potential to affect human phenotypes by generating mutations in regulatory elements, which may ultimately result in disease. We discuss the potential functions of human promoter microsatellites in this context.
Affiliation(s)
- Sterling Sawaya
  - Centre for Reproduction and Genomics, Department of Anatomy, and Allan Wilson Centre for Molecular Ecology and Evolution, University of Otago, Dunedin, New Zealand
11
Core promoter STRs: novel mechanism for inter-individual variation in gene expression in humans. Gene 2011; 492:195-8. [PMID: 22037607 DOI: 10.1016/j.gene.2011.10.028] [Citation(s) in RCA: 24] [Impact Index Per Article: 1.8] [Received: 08/17/2011] [Revised: 09/27/2011] [Accepted: 10/11/2011] [Indexed: 11/21/2022]
Abstract
In a genome-scale analysis of the composition of core promoter sequences, we have recently shown that approximately 25% of the human protein-coding genes have at least one short tandem repeat (STR) of 3-repeats in their core promoters (i.e. the interval between -120 and +1). Through their nucleosome processing effect, GA-repeats play a crucial role in the regulation of gene transcription. In this study, we chose the human SRY (sex determining region Y)-box 5 (SOX5) gene as a prototype of the GA-rich core promoters to investigate the role of core promoter GA-STRs in gene expression. The human SOX5 gene is indispensable for diverse embryonic developmental processes, ranging from oligodendrocyte development and corticogenesis to chondrogenesis, and regulation of the cell cycle. Whereas the absolute purine/pyrimidine ratio of 99% of the genes ranges between 0.2 and 2, the composition of the core promoter of the two most ubiquitously expressed mRNAs of the human SOX5 gene (transcripts ID: ENST00000451604 and ENST00000309359) is exceptionally rich in purine nucleotides (purine/pyrimidine ratio: 61.5). Indeed, this core promoter is an island of four tandem GA-STRs, and lacks the known TATA and TATA-less elements for gene transcription. Evolutionary conservation of this region between human and mouse (75% homology) supports an important functional role for this promoter. In this study, we show that this nucleotide composition is indeed a potent promoter (p<1×10⁻¹⁰), and different haplotypes across the region result in significant differences in gene expression (p<1×10⁻⁶). To our knowledge, this is the first report of functional STRs in a human gene core promoter. Based on our search of the core promoters of all human protein-coding genes annotated in the GeneCards database (19,927 genes) for the presence of pure GA-STRs, 429 genes contain at least one GA(3)-repeat in their core promoter. Core promoters with pure GA-STRs of GA(4) and above were observed in 61 genes.
Our data unravel a novel mechanism for inter-individual variation in gene expression and complex traits/phenotypes through core promoter GA-STRs.
12
Hamarsheh O, Amro A. Characterization of simple sequence repeats (SSRs) from Phlebotomus papatasi (Diptera: Psychodidae) expressed sequence tags (ESTs). Parasit Vectors 2011; 4:189. [PMID: 21958493 PMCID: PMC3191335 DOI: 10.1186/1756-3305-4-189] [Citation(s) in RCA: 13] [Impact Index Per Article: 1.0] [Received: 09/01/2011] [Accepted: 09/29/2011] [Indexed: 10/31/2022]
Abstract
BACKGROUND: Phlebotomus papatasi is a natural vector of Leishmania major, which causes cutaneous leishmaniasis in many countries. Simple sequence repeats (SSRs), or microsatellites, are common in eukaryotic genomes and are short, repeated nucleotide sequence elements arrayed in tandem and flanked by non-repetitive regions. The enrichment methods used previously for finding new microsatellite loci in sand flies remain laborious and time consuming; in silico mining, which includes retrieval and screening of microsatellites from large amounts of sequence data in sequence databases using microsatellite search tools, can yield many new candidate markers. RESULTS: Simple sequence repeats (SSRs) were characterized in P. papatasi expressed sequence tags (ESTs) derived from a public database, the National Center for Biotechnology Information (NCBI). A total of 42,784 sequences were mined, and 1,499 SSRs were identified with a frequency of 3.5% and an average density of 15.55 kb per SSR. Dinucleotide motifs were the most common SSRs, accounting for 67%, followed by tri-, tetra-, and penta-nucleotide repeats, accounting for 31.1%, 1.5%, and 0.1%, respectively. The length of microsatellites varied from 5 to 16 repeats. Among dinucleotide types, AG and CT have the highest frequency. Dinucleotide SSR-ESTs are relatively biased toward an excess of (AX)n repeats and a low GC base content. Forty primer pairs were designed based on motif lengths for further experimental validation. CONCLUSION: The first large-scale survey of SSRs derived from P. papatasi is presented; dinucleotide SSRs identified are more frequent than other types. EST data mining is an effective strategy to identify functional microsatellites in P. papatasi.
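The in silico mining step described in this abstract can be sketched with a regular expression. The abstract does not name the search tool or its thresholds, so everything below is an assumption made for illustration: the function name `find_ssrs`, the motif sizes (2 to 5 nt), the minimum of five tandem copies (chosen because the reported tract lengths start at 5 repeats), and the restriction to perfect repeats.

```python
import re

# Hypothetical thresholds: motifs of 2-5 nt repeated at least 5 times in tandem.
MIN_REPEATS = 5
SSR_PATTERN = re.compile(r"([ACGT]{2,5}?)\1{%d,}" % (MIN_REPEATS - 1))

def find_ssrs(seq):
    """Return (motif, repeat_count, start) for each perfect SSR in seq."""
    hits = []
    for match in SSR_PATTERN.finditer(seq.upper()):
        motif = match.group(1)
        # A "motif" made of a single base (e.g. AA) is a homopolymer, not a
        # di- to penta-nucleotide repeat, so it is skipped here.
        if len(set(motif)) == 1:
            continue
        hits.append((motif, len(match.group(0)) // len(motif), match.start()))
    return hits

# Toy EST fragment containing six tandem AG units.
example_hits = find_ssrs("TT" + "AG" * 6 + "CC")

# The reported 3.5% is consistent with SSRs found per sequence mined.
ssr_frequency_pct = round(100 * 1499 / 42784, 1)
```

Because the motif group is matched lazily, a tract such as (AG)n is reported with the shortest repeating unit rather than as (AGAG)n.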
Affiliation(s)
- Omar Hamarsheh
  - Department of Biological Sciences, Faculty of Science and Technology, Al-Quds University, PO Box 51000, Jerusalem, Palestine
13
Buske FA, Mattick JS, Bailey TL. Potential in vivo roles of nucleic acid triple-helices. RNA Biol 2011; 8:427-39. [PMID: 21525785 DOI: 10.4161/rna.8.3.14999] [Citation(s) in RCA: 143] [Impact Index Per Article: 11.0] [Indexed: 11/19/2022]
Abstract
The ability of double-stranded DNA to form a triple-helical structure by hydrogen bonding with a third strand is well established, but the biological functions of these structures remain largely unknown. There is considerable albeit circumstantial evidence for the existence of nucleic triplexes in vivo and their potential participation in a variety of biological processes including chromatin organization, DNA repair, transcriptional regulation, and RNA processing has been investigated in a number of studies to date. There is also a range of possible mechanisms to regulate triplex formation through differential expression of triplex-forming RNAs, alteration of chromatin accessibility, sequence unwinding and nucleotide modifications. With the advent of next generation sequencing technology combined with targeted approaches to isolate triplexes, it is now possible to survey triplex formation with respect to their genomic context, abundance and dynamical changes during differentiation and development, which may open up new vistas in understanding genome biology and gene regulation.
Affiliation(s)
- Fabian A Buske
  - Institute for Molecular Bioscience, The University of Queensland, Brisbane, QLD, Australia
14
Han YJ, Ma SF, Yourek G, Park YD, Garcia JGN. A transcribed pseudogene of MYLK promotes cell proliferation. FASEB J 2011; 25:2305-12. [PMID: 21441351 DOI: 10.1096/fj.10-177808] [Citation(s) in RCA: 65] [Impact Index Per Article: 5.0] [Indexed: 01/16/2023]
Abstract
Pseudogenes are considered nonfunctional genomic artifacts of catastrophic pathways. Recent evidence, however, indicates novel roles for pseudogenes as regulators of gene expression. We tested the functionality of myosin light chain kinase pseudogene (MYLKP1) in human cells and tissues by RT-PCR, promoter activity, and cell proliferation assays. MYLKP1 is partially duplicated from the original MYLK gene that encodes nonmuscle and smooth muscle myosin light chain kinase (smMLCK) isoforms and regulates cell contractility and cytokinesis. Despite strong homology with the smMLCK promoter (∼ 89.9%), the MYLKP1 promoter is minimally active in normal bronchial epithelial cells but highly active in lung adenocarcinoma cells. Moreover, MYLKP1 and smMLCK exhibit negatively correlated transcriptional patterns in normal and cancer cells with MYLKP1 strongly expressed in cancer cells and smMLCK highly expressed in non-neoplastic cells. For instance, expression of smMLCK decreased (19.5 ± 4.7 fold) in colon carcinoma tissues compared to normal colon tissues. Mechanistically, MYLKP1 overexpression inhibits smMLCK expression in cancer cells by decreasing RNA stability, leading to increased cell proliferation. These studies provide strong evidence for the functional involvement of pseudogenes in carcinogenesis and suggest MYLKP1 as a potential novel diagnostic or therapeutic target in human cancers.
Affiliation(s)
- Yoo Jeong Han
  - Department of Medicine, University of Illinois at Chicago, Chicago, Illinois 60612-7227, USA
15
Darvish H, Nabi MO, Firouzabadi SG, Karimlou M, Heidari A, Najmabadi H, Ohadi M. Exceptional human core promoter nucleotide compositions. Gene 2011; 475:79-86. [PMID: 21277957 DOI: 10.1016/j.gene.2010.12.013] [Citation(s) in RCA: 18] [Impact Index Per Article: 1.4] [Received: 11/08/2010] [Revised: 12/22/2010] [Accepted: 12/27/2010] [Indexed: 11/28/2022]
Abstract
The proximal promoter sequences contain basic motifs for the expression of the downstream genes. We present genome-scale computational analyses of the 120-bp sequences immediately upstream of the +1 transcription start sites (TSSs) of 10,117 human protein-coding genes, and identify genes that are exceptional with respect to core promoter nucleotide composition. Our data reveal that while in 99% of the genes the absolute purine/pyrimidine ratio ranges between 0.2 and 2.5, certain genes show exceptional skew in this balance (e.g. ratios of 82.3 in VWA3A, 61.5 in Sox5, and 24.0 in BRWD3), and consist of islands of purines or pyrimidines. Furthermore, while over 95% of the genes lack more than one short tandem repeat (STR) in their core promoters, certain gene promoters are exceptionally rich in multiple STRs (e.g. eight consecutive STRs in UBE2QL1, and six STRs in GRIA2). We found sequence bias for the majority of those promoters across species, supporting functional roles for them in gene expression. Genes downstream of those promoters were also found to be of ontologic importance (i.e. we were able to track the majority of those genes to lower species such as Saccharomyces cerevisiae and Caenorhabditis elegans). The exceptional promoters presented in this study lack the conventional motifs of TATA and TATA-less promoters, hence offering novel mechanisms for gene expression. They may also provide potential mechanisms for inter-individual variations in gene expression, and complex traits/disorders.
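The purine/pyrimidine ratio at the centre of this abstract is simple to compute for any promoter window. The sketch below is a minimal illustration under stated assumptions, not the authors' pipeline: the function name `purine_pyrimidine_ratio` is hypothetical, the ratio is taken as (A+G)/(C+T) on the given strand, bases other than A, C, G, and T are ignored, and a window without pyrimidines is reported as infinity. The abstract does not spell out how its "absolute" ratio is defined.

```python
def purine_pyrimidine_ratio(seq):
    """Return (A+G)/(C+T) for a DNA string; bases other than ACGT are ignored."""
    seq = seq.upper()
    purines = seq.count("A") + seq.count("G")
    pyrimidines = seq.count("C") + seq.count("T")
    if pyrimidines == 0:
        # Assumption for this sketch: an all-purine window has an unbounded ratio.
        return float("inf")
    return purines / pyrimidines

# Toy window: six purines and two pyrimidines.
toy_ratio = purine_pyrimidine_ratio("AGAGAGCT")
```

A strongly purine-skewed window, such as a run of GA repeats with only a few pyrimidines, pushes this ratio far above the 0.2 to 2.5 range reported for 99% of genes.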
Affiliation(s)
- H Darvish
- Genetics Research Center, University of Social Welfare and Rehabilitation Sciences, Tehran, Iran
|
16
|
Bergquist H, Nikravesh A, Fernández RD, Larsson V, Nguyen CH, Good L, Zain R. Structure-specific recognition of Friedreich's ataxia (GAA)n repeats by benzoquinoquinoxaline derivatives. Chembiochem 2009; 10:2629-37. [PMID: 19746387 DOI: 10.1002/cbic.200900263] [Citation(s) in RCA: 18] [Impact Index Per Article: 1.3] [Indexed: 11/07/2022]
Abstract
Expansion of GAA triplet repeats in intron 1 of the FXN gene reduces frataxin expression and causes Friedreich's ataxia. (GAA)n repeats form non-B-DNA structures, including triple helix H-DNA and higher-order structures (sticky DNA). In the proposed mechanisms of frataxin gene silencing, central unanswered questions involve the characterization of non-B-DNA structure(s) that are strongly suggested to play a role in frataxin expression. Here we examined (GAA)n binding by triplex-stabilizing benzoquinoquinoxaline (BQQ) and the corresponding triplex-DNA-cleaving BQQ-1,10-phenanthroline (BQQ-OP) compounds. We also examined the ability of these compounds to act as structural probes for H-DNA formation within higher-order structures at pathological frataxin sequences in plasmids. DNA-complex-formation analyses with a gel-mobility-shift assay and sequence-specific probing of H-DNA-forming (GAA)n sequences by single-strand oligonucleotides and triplex-directed cleavage demonstrated that a parallel pyrimidine (rather than purine) triplex is the more stable motif formed at (GAA)n repeats under physiologically relevant conditions.
Affiliation(s)
- Helen Bergquist
- Department of Molecular Biology and Functional Genomics, Stockholm University, Svante Arrhenius väg 20C, 10691 Stockholm, Sweden
|
17
|
Smýkal P, Kalendar R, Ford R, Macas J, Griga M. Evolutionary conserved lineage of Angela-family retrotransposons as a genome-wide microsatellite repeat dispersal agent. Heredity (Edinb) 2009; 103:157-67. [PMID: 19384338 DOI: 10.1038/hdy.2009.45] [Citation(s) in RCA: 46] [Impact Index Per Article: 3.1] [Indexed: 02/06/2023] Open
Abstract
A detailed examination of 45 pea (Pisum sativum L.) simple sequence repeat (SSR) loci revealed that 21 of them included homologous sequences corresponding to the long terminal repeat (LTR) of a novel retrotransposon. Further investigation, including full-length sequencing, led to its classification as an RLC-Angela-family-FJ434420 element. The LTR contained a variable region ranging from a simple TC repeat, (TC)11, to more complex repeats of TC/CA, (TC)12-30 and (CA)18-22, and was up to 146 bp in length. These elements are the most abundant Ty1/copia retrotransposons identified in the pea genome and also occur in other legume species. Interestingly, analysis of 63 LTR-derived sequences originating from 30 legume species showed high phylogenetic conservation in their sequence, including the position of the variable SSR region. This extraordinary conservation led us to propose a new lineage, named MARTIANS, within the Angela family. Similar LTR structures and partial sequence similarities were detected in more distant members of this Angela family, the barley BARE-1 and rice RIRE-1 elements. Comparison of the LTR sequences from pea and Medicago truncatula elements indicated that microsatellites arise through the expansion of a pre-existing repeat motif. Thus, the presence of an SSR region within the LTR seems to be a typical feature of this MARTIANS lineage, and the evidence gathered from a wide range of species suggests that these elements may facilitate amplification and genome-wide dispersal of associated SSR sequences. The implications of this finding regarding the evolution of SSRs within the genome, as well as their utilization as molecular markers, are discussed.
Affiliation(s)
- P Smýkal
- Agritec Plant Research Ltd, Plant Biotechnology Department, Sumperk, Czech Republic.
|