251
|
Lambert S, Carr AM. Impediments to replication fork movement: stabilisation, reactivation and genome instability. Chromosoma 2013; 122:33-45. [DOI: 10.1007/s00412-013-0398-9] [Citation(s) in RCA: 79] [Impact Index Per Article: 7.2] [Received: 09/28/2012] [Revised: 02/11/2013] [Accepted: 02/11/2013] [Indexed: 01/02/2023]
|
252
|
Zhang L, Li L, Xu F, Qi H, Wang X, Que H, Zhang G. Fosmid library construction and end sequences analysis of the Pacific oyster, Crassostrea gigas. Molluscan Research 2013. [DOI: 10.1080/13235818.2012.754149] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 10/26/2022]
|
253
|
Single-stranded annealing induced by re-initiation of replication origins provides a novel and efficient mechanism for generating copy number expansion via non-allelic homologous recombination. PLoS Genet 2013; 9:e1003192. [PMID: 23300490 PMCID: PMC3536649 DOI: 10.1371/journal.pgen.1003192] [Citation(s) in RCA: 34] [Impact Index Per Article: 3.1] [Received: 09/27/2012] [Accepted: 11/08/2012] [Indexed: 11/24/2022] Open
Abstract
Copy number expansions such as amplifications and duplications contribute to human phenotypic variation, promote molecular diversification during evolution, and drive the initiation and/or progression of various cancers. The mechanisms underlying these copy number changes are still incompletely understood, however. We recently demonstrated that transient, limited re-replication from a single origin in Saccharomyces cerevisiae efficiently induces segmental amplification of the re-replicated region. Structural analyses of such re-replication induced gene amplifications (RRIGA) suggested that RRIGA could provide a new mechanism for generating copy number variation by non-allelic homologous recombination (NAHR). Here we elucidate this new mechanism and provide insight into why it is so efficient. We establish that sequence homology is both necessary and sufficient for repetitive elements to participate in RRIGA and show that their recombination occurs by a single-strand annealing (SSA) mechanism. We also find that re-replication forks are prone to breakage, accounting for the widespread DNA damage associated with deregulation of replication proteins. These breaks appear to stimulate NAHR between re-replicated repeat sequences flanking a re-initiating replication origin. Our results support a RRIGA model where the expansion of a re-replication bubble beyond flanking homologous sequences followed by breakage at both forks in trans provides an ideal structural context for SSA-mediated NAHR to form a head-to-tail duplication. Given the remarkable efficiency of RRIGA, we suggest it may be an unappreciated contributor to copy number expansions in both disease and evolution.
Duplications and amplifications of chromosomal segments are frequently observed in eukaryotic genomes, including both normal and cancerous human genomes. These copy number variations contribute to the phenotypic variation upon which natural selection acts. For example, the amplification of genes whose excessive copy number facilitates uncontrolled cell division is often selected for during tumor development. Copy number variations can often arise when repetitive sequence elements, which are dispersed throughout eukaryotic genomes, undergo a rearrangement called non-allelic homologous recombination. Exactly how these rearrangements occur is poorly understood. Here, using budding yeast to model this class of copy number variation, we uncover a new and highly efficient mechanism by which these variations can be generated. The precipitating event is the aberrant re-initiation of DNA replication at a replication origin. Normally the hundreds to thousands of origins scattered throughout a eukaryotic genome are tightly controlled such that each is permitted to initiate only once per cell cycle. However, disruptions in these controls can allow origins to re-initiate, and we show how the resulting DNA re-replication structure can be readily converted into a tandem duplication via non-allelic homologous recombination. Hence, the re-initiation of DNA replication is a potential source of copy number variation both in disease and during evolution.
|
254
|
Podgornaya O, Gavrilova E, Stephanova V, Demin S, Komissarov A. Large tandem repeats make up the chromosome bar code: a hypothesis. Advances in Protein Chemistry and Structural Biology 2013; 90:1-30. [PMID: 23582200 DOI: 10.1016/b978-0-12-410523-2.00001-8] [Citation(s) in RCA: 11] [Impact Index Per Article: 1.0] [Indexed: 01/21/2023]
Abstract
Much of tandem repeats' functional nature in any genome remains enigmatic because few tools are available for dissecting and elucidating the functions of repeated DNA. The large tandem repeat arrays (satellite DNA) found in two mouse whole-genome shotgun assemblies were classified into 4 superfamilies, 8 families, and 62 subfamilies. With the simplified variant of chromosome positioning of different tandem repeats, we noticed a nonuniform distribution instead of the positions reported for mouse major and minor satellites. It is visible that each chromosome possesses a kind of unique code made up of different large tandem repeats. The reference genomes allow marking only internal tandem repeats, and even with such limited data, the colored "bar code" made up of tandem repeats is visible. We suppose that tandem repeats bear the mechanism for chromosomes to recognize the regions to be associated. The associations, initially established via RNA, become fixed by histone modifications (the histone or chromatin code) and specific proteins. In such a way, associations, being at the beginning flexible and regulated, that is, adjustable, appear as irreversible and inheritable in cell generations. Tandem repeat multiformity tunes the developed nuclei 3D pattern by sequential steps of associations. A tandem repeat-based chromosome bar code could be the carrier of the genome structural information; that is, the order of precise tandem repeat association is the DNA morphogenetic program. Tandem repeats are the cores of the distinct 3D structures postulated in the "gene gating" hypothesis.
|
255
|
Shen JJ, Dushoff J, Bewick AJ, Chain FJ, Evans BJ. Genomic dynamics of transposable elements in the western clawed frog (Silurana tropicalis). Genome Biol Evol 2013; 5:998-1009. [PMID: 23645600 PMCID: PMC3673623 DOI: 10.1093/gbe/evt065] [Citation(s) in RCA: 12] [Impact Index Per Article: 1.1] [Accepted: 04/18/2013] [Indexed: 02/07/2023] Open
Abstract
Transposable elements (TEs) are repetitive DNA sequences that can make new copies of themselves that are inserted elsewhere in a host genome. The abundance and distributions of TEs vary considerably among phylogenetically diverse hosts. With the aim of exploring the basis of this variation, we evaluated correlations between several genomic variables and the presence of TEs and non-TE repeats in the complete genome sequence of the Western clawed frog (Silurana tropicalis). This analysis reveals patterns of TE insertion consistent with gene disruption but not with the insertional preference model. Analysis of non-TE repeats recovered unique features of their genome-wide distribution when compared with TE repeats, including no strong correlation with exons and a particularly strong negative correlation with GC content. We also collected polymorphism data from 25 TE insertion sites in 19 wild-caught S. tropicalis individuals. DNA transposon insertions were fixed at eight of nine sites and at a high frequency at one of nine, whereas insertions of long terminal repeat (LTR) and non-LTR retrotransposons were fixed at only 4 of 16 sites and at low frequency at 12 of 16. A maximum likelihood model failed to attribute these differences in insertion frequencies to variation in selection pressure on different classes of TE, opening the possibility that other phenomena such as variation in rates of replication or duration of residence in the genome could play a role. Taken together, these results identify factors that sculpt heterogeneity in TE distribution in S. tropicalis and illustrate that genomic dynamics differ markedly among TE classes and between TE and non-TE repeats.
Affiliation(s)
- Jiangshan J. Shen
- Department of Biology, McMaster University, Hamilton, Ontario, Canada
- Present address: Department of Pathology, The University of Hong Kong, Hong Kong, China
- Jonathan Dushoff
- Department of Biology, McMaster University, Hamilton, Ontario, Canada
- Adam J. Bewick
- Department of Biology, McMaster University, Hamilton, Ontario, Canada
- Frédéric J.J. Chain
- Department of Evolutionary Ecology, Max Planck Institute for Evolutionary Biology, Plön, Germany
- Ben J. Evans
- Department of Biology, McMaster University, Hamilton, Ontario, Canada
|
256
|
Abstract
Genomic regions that determine mating compatibility are subject to distinct evolutionary forces that can lead to a cessation of meiotic recombination and the accumulation of structural changes between members of the homologous chromosome pair. The relatively recent discovery of dimorphic mating-type chromosomes in fungi can aid the understanding of sex chromosome evolution that is common to dioecious plants and animals. For the anther-smut fungus, Microbotryum lychnidis-dioicae (= M. violaceum isolated from Silene latifolia), the extent of recombination cessation on the dimorphic mating-type chromosomes has been conflictingly reported. Comparison of restriction digest optical maps for the two mating-type chromosomes shows that divergence extends over 90% of the chromosome lengths, flanked at either end by two pseudoautosomal regions. Evidence to support the expansion of recombination cessation in stages from the mating-type locus toward the pseudoautosomal regions was not found, but evidence of such expansion could be obscured by ongoing processes that affect genome structure. This study encourages the comparison of forces that may drive large-scale recombination suppression in fungi and other eukaryotes characterized by dimorphic chromosome pairs associated with sexual life cycles.
|
257
|
Multiple pathways regulate minisatellite stability during stationary phase in yeast. G3: Genes, Genomes, Genetics 2012; 2:1185-95. [PMID: 23050229 PMCID: PMC3464111 DOI: 10.1534/g3.112.003673] [Citation(s) in RCA: 4] [Impact Index Per Article: 0.3] [Received: 07/05/2012] [Accepted: 08/05/2012] [Indexed: 12/20/2022]
Abstract
Alterations in minisatellite DNA repeat tracts in humans have been correlated with a number of serious disorders, including cancer. Despite their importance for human health, the genetic factors that influence minisatellite stability are not well understood. Previously, we identified mutations in the Saccharomyces cerevisiae zinc homeostasis genes ZRT1 and ZAP1 that significantly increase the frequency of minisatellite alteration specifically during stationary phase. In this work, we identified mutants of END3, PKC1, and RAD27 that increase minisatellite instability during stationary phase. Genetic analysis reveals that these genes, along with ZRT1 and ZAP1, comprise multiple pathways regulating minisatellite stability during stationary phase. Minisatellite alterations generated by perturbation of any of these pathways occur via homologous recombination. We present evidence that suggests formation of ssDNA or ssDNA breaks may play a primary role in stationary phase instability. Finally, we examined the roles of these pathways in the stability of a human minisatellite tract associated with the HRAS1 oncogene and found that loss of RAD27, but not END3 or PKC1, destabilizes the HRAS1 minisatellite in stationary phase yeast. This result indicates that the genetic control of stationary phase minisatellite stability is dependent on the sequence composition of the minisatellite itself.
|
258
|
Purves J, Blades M, Arafat Y, Malik SA, Bayliss CD, Morrissey JA. Variation in the genomic locations and sequence conservation of STAR elements among staphylococcal species provides insight into DNA repeat evolution. BMC Genomics 2012; 13:515. [PMID: 23020678 PMCID: PMC3532100 DOI: 10.1186/1471-2164-13-515] [Citation(s) in RCA: 5] [Impact Index Per Article: 0.4] [Received: 04/27/2012] [Accepted: 09/24/2012] [Indexed: 01/05/2023] Open
Abstract
Background: Staphylococcus aureus Repeat (STAR) elements are a type of interspersed intergenic direct repeat. In this study the conservation and variation in these elements were explored by bioinformatic analyses of published staphylococcal genome sequences and through sequencing of specific STAR element loci from a large set of S. aureus isolates.
Results: Using bioinformatic analyses, we found that the STAR elements were located in different genomic loci within each staphylococcal species. There was no correlation between the number of STAR elements in each genome and the evolutionary relatedness of staphylococcal species; however, higher levels of repeats were observed in both S. aureus and S. lugdunensis compared to other staphylococcal species. Unexpectedly, sequencing of the internal spacer sequences of individual repeat elements from multiple isolates showed conservation at the sequence level within deep evolutionary lineages of S. aureus. Whilst individual STAR element loci were demonstrated to expand and contract, the sequences associated with each locus were stable and distinct from one another.
Conclusions: The high degree of lineage and locus-specific conservation of these intergenic repeat regions suggests that STAR elements are maintained due to selective or molecular forces, with some of these elements having an important role in cell physiology. The high prevalence in two of the more virulent staphylococcal species is indicative of a potential role for STAR elements in pathogenesis.
|
259
|
Scala C, Tian X, Mehdiabadi NJ, Smith MH, Saxer G, Stephens K, Buzombo P, Strassmann JE, Queller DC. Amino acid repeats cause extraordinary coding sequence variation in the social amoeba Dictyostelium discoideum. PLoS One 2012; 7:e46150. [PMID: 23029418 PMCID: PMC3460934 DOI: 10.1371/journal.pone.0046150] [Citation(s) in RCA: 14] [Impact Index Per Article: 1.2] [Received: 07/16/2012] [Accepted: 08/28/2012] [Indexed: 12/19/2022] Open
Abstract
Protein sequences are normally the most conserved elements of genomes owing to purifying selection to maintain their functions. We document an extraordinary amount of within-species protein sequence variation in the model eukaryote Dictyostelium discoideum stemming from triplet DNA repeats coding for long strings of single amino acids. D. discoideum has a very large number of such strings, many of which are polyglutamine repeats, the same sequence that causes various neurological disorders in humans, such as Huntington’s disease. We show here that D. discoideum coding repeat loci are highly variable among individuals, making D. discoideum a candidate for the most variable proteome. The coding repeat loci are not significantly less variable than similar non-coding triplet repeats. This pattern is consistent with these amino-acid repeats being largely non-functional sequences evolving primarily by mutation and drift.
Affiliation(s)
- Clea Scala
- Department of Ecology and Evolutionary Biology, Rice University, Houston, Texas, United States of America
- Xiangjun Tian
- Department of Biology, Washington University in St. Louis, St. Louis, Missouri, United States of America
- Natasha J. Mehdiabadi
- Department of Ecology and Evolutionary Biology, Rice University, Houston, Texas, United States of America
- Margaret H. Smith
- Department of Ecology and Evolutionary Biology, Rice University, Houston, Texas, United States of America
- Gerda Saxer
- Department of Biochemistry and Cell Biology, Rice University, Houston, Texas, United States of America
- Katie Stephens
- Department of Ecology and Evolutionary Biology, Rice University, Houston, Texas, United States of America
- Prince Buzombo
- Department of Ecology and Evolutionary Biology, Rice University, Houston, Texas, United States of America
- Joan E. Strassmann
- Department of Biology, Washington University in St. Louis, St. Louis, Missouri, United States of America
- David C. Queller
- Department of Biology, Washington University in St. Louis, St. Louis, Missouri, United States of America
|
260
|
Glunčić M, Paar V. Direct mapping of symbolic DNA sequence into frequency domain in global repeat map algorithm. Nucleic Acids Res 2012; 41:e17. [PMID: 22977183 PMCID: PMC3592446 DOI: 10.1093/nar/gks721] [Citation(s) in RCA: 23] [Impact Index Per Article: 1.9] [Indexed: 01/22/2023] Open
Abstract
The main feature of the global repeat map (GRM) algorithm (www.hazu.hr/grm/software/win/grm2012.exe) is its ability to identify a broad variety of repeats of unbounded length that can be arbitrarily distant in sequences as large as human chromosomes. The efficacy is due to the use of a complete K-string ensemble, which enables a new method of direct mapping of a symbolic DNA sequence into the frequency domain, with straightforward identification of repeats as peaks in the GRM diagram. In this way, we obtain a very fast, efficient and highly automated repeat-finding tool. The method is robust to substitutions and insertions/deletions, as well as to various complexities of the sequence pattern. We present several case studies of GRM use to illustrate its capabilities: identification of α-satellite tandem repeats and higher order repeats (HORs), identification of Alu dispersed repeats and of Alu tandems, identification of the Period 3 pattern in exons, implementation of a ‘magnifying glass’ effect, identification of a complex HOR pattern, identification of inter-tandem transitional dispersed repeat sequences and identification of long segmental duplications. The GRM algorithm is particularly convenient for large repeat units, for highly mutated and/or complex repeats, and for global repeat maps of large genomic sequences (chromosomes and genomes).
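As a rough illustration of how a repeat period can appear as a peak, the following minimal Python sketch tallies the distances between consecutive occurrences of every K-string in a sequence. It is a simplified stand-in and not the GRM algorithm itself: the function name `kstring_distance_spectrum`, the default `k`, and the toy sequence are all assumptions made for this example.

```python
from collections import Counter


def kstring_distance_spectrum(seq: str, k: int = 3) -> Counter:
    """Count distances between consecutive occurrences of each K-string.

    Hypothetical helper for illustration only; a tandem repeat with
    period p tends to produce a peak at distance p.
    """
    last_seen = {}        # K-string -> index of its most recent occurrence
    spectrum = Counter()  # distance -> number of times it was observed
    for i in range(len(seq) - k + 1):
        kmer = seq[i:i + k]
        if kmer in last_seen:
            spectrum[i - last_seen[kmer]] += 1
        last_seen[kmer] = i
    return spectrum


# Toy sequence: six tandem copies of a 5-base unit.
spectrum = kstring_distance_spectrum("ACGTT" * 6, k=3)
peak_distance, peak_count = spectrum.most_common(1)[0]
print(peak_distance, peak_count)
```

On the toy input the dominant distance equals the length of the repeated unit. Real sequences with substitutions or indels would spread the signal across neighbouring distances, which this sketch does not try to handle.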
Affiliation(s)
- Matko Glunčić
- Faculty of Science, University of Zagreb, Bijenička 32 and Croatian Academy of Sciences and Arts, Zrinski trg 11, 10000 Zagreb, Croatia.
|
261
|
Almeida LA, Araujo R. Highlights on molecular identification of closely related species. Infection, Genetics and Evolution 2012; 13:67-75. [PMID: 22982158 DOI: 10.1016/j.meegid.2012.08.011] [Citation(s) in RCA: 35] [Impact Index Per Article: 2.9] [Received: 05/15/2012] [Revised: 08/06/2012] [Accepted: 08/08/2012] [Indexed: 10/27/2022]
Abstract
The term "complex" emerged in the literature at the beginning of the genomic era, associated with taxonomy and the grouping of organisms that belong to different species but exhibit similar patterns in their morphological, physiological and/or other phenotypic features. DNA-DNA hybridization values ~70% and high identity on 16S rRNA gene sequences were recommended for species delineation. Electrophoretic methods proved useful in some cases for species identification and population structure, but their reproducibility was questionable. Later, the implementation of polyphasic approaches involving phenotypic and molecular methods brought new insights into the analysis of population structure and phylogeny of several "species complexes", allowing the identification of new closely related species. Likewise, the introduction of multilocus sequence typing and sequencing analysis of several genes offered an evolutionary perspective to the term "species complex". Several centres worldwide have recently released increasing genetic information on distinct microbial species. A brief review will be presented to highlight the definition of "species complex" for selected microorganisms, mainly the prokaryotic Acinetobacter calcoaceticus-Acinetobacter baumannii, Borrelia burgdorferi sensu lato, Burkholderia cepacia, Mycobacterium tuberculosis and Nocardia asteroides complexes, and the eukaryotic Aspergillus fumigatus, Leishmania donovani and Saccharomyces sensu stricto complexes. The members of these complexes may show distinct epidemiology, pathogenicity and susceptibility, making their correct identification critical. Dynamics of prokaryotic and eukaryotic genomes can be very distinct and the term "species complex" should be carefully extended.
Affiliation(s)
- Lígia A Almeida
- IPATIMUP, Institute of Molecular Pathology and Immunology, University of Porto, Rua Dr. Roberto Frias s/n, 4200-465 Porto, Portugal.
|
262
|
Abstract
Genomic analyses increasingly make use of sophisticated statistical and computational approaches in investigations of genomic function and evolution. Scientists implementing and developing these approaches are often computational scientists, physicists, or mathematicians. This article aims to provide a compact overview of genome biology for these scientists. Thus, the article focuses on providing biological context to the genomic features, processes, and structures analysed by these approaches. Topics covered include (1) differences between eukaryotic and prokaryotic cells; (2) the physical structure of genomes and chromatin; (3) different categories of genomic regions, including those serving as templates for RNA and protein synthesis, regulatory regions, repetitive regions, and "architectural" or "organisational" regions, such as centromeres and telomeres; (4) the cell cycle; (5) an overview of transcription, translation, and protein structure; and (6) a glossary of relevant terms.
|
263
|
Sato T, Watanabe K, Tamotsu S, Ichikawa A, Schmidt-Rhaesa A. Diversity of nematomorph and cohabiting nematode parasites in riparian ecosystems around the Kii Peninsula, Japan. Can J Zool 2012. [DOI: 10.1139/z2012-048] [Citation(s) in RCA: 11] [Impact Index Per Article: 0.9] [Indexed: 11/22/2022]
Abstract
Nematomorph parasites manipulate terrestrial invertebrate hosts to seek out and enter streams, thereby providing substantial energy subsidies to stream salmonids. Despite this potential ecological role of nematomorphs, their diversity remains poorly understood. Using molecular (i.e., 18S rRNA and mitochondrial COI genes) and morphological approaches, we explored the species diversity of suspected nematomorph specimens, as well as their terrestrial orthopteran hosts, in 10 stream and riparian ecosystems around the Kii Peninsula, central Honshu, Japan. We distinguished seven species of nematomorphs belonging to three genera based on molecular and morphological data. The identifications by the two approaches were consistent with each other at the genus level but only partly at the species level. Furthermore, among the suspected nematomorph specimens, eight nematode species belonging to the orders Mermithida and Trichocephalida were found at two sites. Several orthopterans, mainly camel crickets, were infected by nematomorphs and by a nematode without obvious species specificity. These results suggest that diverse parasites and their orthopteran hosts drive the parasite-mediated energy flow across the stream and riparian ecosystems.
Affiliation(s)
- Takuya Sato
- The Hakubi Center for Advanced Research, Kyoto University, Yoshida-Ushinomiya-cyou, Sakyo-ku, Kyoto 606-8302, Japan
- Katsutoshi Watanabe
- Department of Zoology, Division of Biological Science, Graduate School of Science, Kyoto University, Kitashirakawa-Oiwakecho, Sakyo-ku, Kyoto 606-8502, Japan
- Satoshi Tamotsu
- Department of Biological Sciences, Nara Women’s University, Kitauoya-Nishi machi, Nara 630-8506, Japan
- Akihiko Ichikawa
- Orthopterological Society of Japan, 310 Kitadai Building, 17-13 Hirao-4 chome, Taisho-ku, Osaka 551-0012, Japan
- Andreas Schmidt-Rhaesa
- Zoological Museum, University Hamburg, Martin-Luther-King-Platz 3, 20146, Hamburg, Germany
|
264
|
Felicioli C, Marangoni R. BpMatch: an efficient algorithm for a segmental analysis of genomic sequences. IEEE/ACM Transactions on Computational Biology and Bioinformatics 2012; 9:1120-1127. [PMID: 22350206 DOI: 10.1109/tcbb.2012.30] [Citation(s) in RCA: 1] [Impact Index Per Article: 0.1] [Indexed: 05/31/2023]
Abstract
Here, we propose BpMatch: an algorithm that, working on a suitably modified suffix-tree data structure, is able to compute, in a fast and efficient way, the coverage of a source sequence S on a target sequence T, taking into account direct and reverse segments, possibly overlapping. Using BpMatch, the operator should define a priori the minimum length l of a segment and the minimum number of occurrences minRep, so that only segments longer than l and having a number of occurrences greater than minRep are considered significant. BpMatch outputs the significant segments found and the computed segment-based distance. In the worst case, assuming the alphabet dimension d is a constant, the time required by BpMatch to calculate the coverage is O(l²n). On average, by setting l ≥ 2 log_d(n), the time required to calculate the coverage is only O(n). BpMatch, thanks to the minRep parameter, can also be used to perform a self-covering: to cover a sequence using segments coming from itself, while avoiding the trivial solution of having a single segment coincident with the whole sequence. The result of the self-covering approach is a spectral representation of the repeats contained in the sequence. BpMatch is freely available at: www.sourceforge.net/projects/bpmatch.
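The idea of covering a target with segments drawn from a source can be sketched in a few lines of Python. This is a naive illustration under stated assumptions, not BpMatch: it uses a hash table of length-l windows instead of a suffix tree, handles direct segments only (no reverse segments), treats segments of length at least l as eligible, and interprets minRep as an occurrence count within the source. The function name `coverage` and its parameters are hypothetical.

```python
from collections import Counter


def coverage(source: str, target: str, l: int, min_rep: int = 0) -> float:
    """Fraction of target positions covered by source segments.

    A target position counts as covered if it lies inside a length-l window
    of the target that occurs in the source more than min_rep times. Any
    longer shared segment is a union of such windows, so checking windows
    of exactly length l is enough for this simplified, direct-only variant.
    """
    if l <= 0:
        raise ValueError("l must be positive")
    if not target:
        return 0.0
    # Occurrence count of every length-l window in the source.
    counts = Counter(source[i:i + l] for i in range(len(source) - l + 1))
    covered = [False] * len(target)
    for i in range(len(target) - l + 1):
        if counts[target[i:i + l]] > min_rep:
            for j in range(i, i + l):
                covered[j] = True
    return sum(covered) / len(target)


# Half of the target is explained by a segment of the source.
print(coverage("ACGTACGT", "ACGTTTTT", l=4))
# Self-covering: with min_rep=1 the whole sequence (seen once) is rejected.
print(coverage("ACGTACGT", "ACGTACGT", l=8, min_rep=1))
```

This naive version costs roughly O(n·l) time per sequence because every window is sliced and hashed; it is meant only to make the roles of l and minRep concrete.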
|
265
|
Abstract
The Chinese virulent (CHv) strain of duck enteritis virus (DEV) has a genome of approximately 162,175 nucleotides with a GC content of 44.89%. Here we report the complete genomic sequence and annotation of DEV CHv, which offer an effective platform for providing authentic research experiences to novice scientists. In addition, knowledge of this virus will extend our general knowledge of DEV and will be useful for further studies of the mechanisms of virus replication and pathogenesis.
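For readers unfamiliar with the GC content figure quoted above, a minimal Python sketch of how such a percentage is typically computed is given below. The function name `gc_content` and the decision to ignore ambiguous bases such as N are assumptions made for this example; the abstract does not say how the authors computed their value.

```python
def gc_content(sequence: str) -> float:
    """Return the percentage of G and C among the A, C, G, T bases.

    Hypothetical helper: ambiguous symbols (for example N) are ignored.
    """
    seq = sequence.upper()
    total = sum(seq.count(base) for base in "ACGT")
    if total == 0:
        raise ValueError("sequence contains no A, C, G or T bases")
    gc = seq.count("G") + seq.count("C")
    return 100.0 * gc / total


print(gc_content("ATGC"))
```

Taking the reported figures at face value, 44.89% of 162,175 nucleotides corresponds to about 72,800 G or C bases.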
|
266
|
Lim KG, Kwoh CK, Hsu LY, Wirawan A. Review of tandem repeat search tools: a systematic approach to evaluating algorithmic performance. Brief Bioinform 2012; 14:67-81. [PMID: 22648964 DOI: 10.1093/bib/bbs023] [Citation(s) in RCA: 68] [Impact Index Per Article: 5.7] [Indexed: 01/31/2023] Open
Abstract
The prevalence of tandem repeats in eukaryotic genomes and their association with a number of genetic diseases have raised considerable interest in locating these repeats. Over the last 10-15 years, numerous tools have been developed for searching tandem repeats, but differences in the search algorithms adopted and difficulties with parameter settings have confounded many users, resulting in widely varying results. In this review, we have systematically separated the algorithmic aspect of the search tools from the influence of the parameter settings. We hope that this will give a better understanding of how the tools differ in algorithmic performance, their inherent constraints and how one should approach evaluating and selecting them.
Affiliation(s)
- Kian Guan Lim
- Division of Software and Information Systems, School of Computer Engineering, Nanyang Technological University, 50 Nanyang Avenue, Singapore 639798.
|
267
|
Balamurugan K, Tracey ML, Heine U, Maha GC, Duncan GT. Mutation at the human D1S80 minisatellite locus. ScientificWorldJournal 2012; 2012:917235. [PMID: 22645469 PMCID: PMC3356730 DOI: 10.1100/2012/917235] [Citation(s) in RCA: 2] [Impact Index Per Article: 0.2] [Received: 10/30/2011] [Accepted: 01/05/2012] [Indexed: 01/22/2023] Open
Abstract
Little is known about the general biology of minisatellites. The purpose of this study is to examine repeat mutations from the D1S80 minisatellite locus by sequence analysis to elucidate the mutational process at this locus. This is a highly polymorphic minisatellite locus, located in the subtelomeric region of chromosome 1. We have analyzed 90,000 human germline transmission events and found seven mutations at this locus. The D1S80 alleles of the parentage trio (the child, the mother, and the alleged father) were sequenced and the origin of the mutation was determined. Using American Association of Blood Banks (AABB) guidelines, we found a male mutation rate of 1.04 × 10⁻⁴ and a female mutation rate of 5.18 × 10⁻⁵, with an overall mutation rate of approximately 7.77 × 10⁻⁵. Also, in this study, we found that the identified mutations are in close proximity to the center of the repeat array rather than at its ends. Several studies have examined the mutational mechanisms of minisatellites according to the infinite allele model (IAM) and the one-step stepwise mutation model (SMM). In this study, we found that this locus fits the one-step stepwise mutation model (SMM) in six out of seven instances, similar to STR loci.
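The overall rate quoted in this abstract can be checked with simple arithmetic. The Python sketch below divides the reported number of mutations by the reported number of transmission events; the function name `mutation_rate` is hypothetical, and the abstract does not give the separate paternal and maternal counts, so only the overall figure is recomputed.

```python
def mutation_rate(mutations: int, transmissions: int) -> float:
    """Mutations per germline transmission event (illustrative helper)."""
    if transmissions <= 0:
        raise ValueError("transmissions must be positive")
    return mutations / transmissions


# Figures taken from the abstract: 7 mutations in 90,000 transmissions.
overall = mutation_rate(7, 90_000)
print(f"{overall:.2e}")  # prints 7.78e-05
```

The result, about 7.78 × 10⁻⁵, matches the reported value of approximately 7.77 × 10⁻⁵ up to rounding. As a loose consistency check, and assuming equal numbers of paternal and maternal transmissions (which the abstract does not state), the mean of the two sex-specific rates is about 7.79 × 10⁻⁵.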
Affiliation(s)
- Kuppareddi Balamurugan
- School of Criminal Justice, University of Southern Mississippi, 118 College Drive # 5127, Hattiesburg, MS 39406, USA.
|
268
|
Requena JM, Chicharro C, García L, Parrado R, Puerta CJ, Cañavate C. Sequence analysis of the 3'-untranslated region of HSP70 (type I) genes in the genus Leishmania: its usefulness as a molecular marker for species identification. Parasit Vectors 2012; 5:87. [PMID: 22541251 PMCID: PMC3425316 DOI: 10.1186/1756-3305-5-87] [Citation(s) in RCA: 22] [Impact Index Per Article: 1.8] [Received: 02/17/2012] [Accepted: 04/08/2012] [Indexed: 12/18/2022] Open
Abstract
Background: The leishmaniases are a group of clinically diverse diseases caused by parasites of the genus Leishmania. Distinguishing between species is crucial for correct diagnosis and prognosis as well as for treatment decisions. Recently, sequencing of the HSP70 coding region has been applied in phylogenetic studies and for identifying Leishmania species with excellent results.
Methods: In the present study, we analyzed the 3'-untranslated region (UTR) of the Leishmania HSP70-type I gene from 24 strains representing eleven Leishmania species, in the belief that this non-coding region would have a better discriminatory capacity for species typing than coding regions.
Results: There was a remarkable degree of sequence conservation in this region, even between species of the subgenera Leishmania and Viannia. In addition, the presence of many microsatellites was a common feature of the 3'-UTR of HSP70-I genes in the Leishmania genus. Finally, we constructed dendrograms based on global sequence alignments of the analyzed Leishmania species and strains; the results indicated that this particular region of HSP70 genes might be useful for species (or species complex) typing, improving for particular species the discrimination capacity of phylogenetic trees based on HSP70 coding sequences. Given the large size variation of the analyzed region between the Leishmania and Viannia subgenera, direct visualization of the PCR amplification product would allow discrimination between subgenera, and a HaeIII-PCR-RFLP analysis might be used for differentiating some species within each subgenus.
Conclusions: Sequence and phylogenetic analyses indicated that this region, which is readily amplified using a single pair of primers from both Old and New World Leishmania species, might be useful as a molecular marker for species discrimination.
Affiliation(s)
- Jose M Requena
  - Centro de Biología Molecular Severo Ochoa (CSIC-UAM), Universidad Autonoma de Madrid, 28049 Madrid, Spain.
|
269
|
Unique profile of ordered arrangements of repetitive elements in the C57BL/6J mouse genome implicating their functional roles. PLoS One 2012; 7:e35156. [PMID: 22529984 PMCID: PMC3329453 DOI: 10.1371/journal.pone.0035156] [Citation(s) in RCA: 4] [Impact Index Per Article: 0.3] [Received: 11/25/2011] [Accepted: 03/09/2012] [Indexed: 12/18/2022] Open
Abstract
The entirety of all protein coding sequences is reported to represent a small fraction (∼2%) of the mouse and human genomes; the vast majority of the rest of the genome is presumed to be repetitive elements (REs). In this study, the C57BL/6J mouse reference genome was subjected to an unbiased RE mining to establish a whole-genome profile of RE occurrence and arrangement. The C57BL/6J mouse genome was fragmented into an initial set of 5,321 units of 0.5 Mb, and surveyed for REs using unbiased self-alignment and dot-matrix protocols. The survey revealed that individual chromosomes had unique profiles of RE arrangement structures, named RE arrays. The RE populations in certain genomic regions were arranged into various forms of complexly organized structures using combinations of direct and/or inverse repeats. Some of these RE arrays spanned stretches of over 2 Mb, which may contribute to the structural configuration of the respective genomic regions. There were substantial differences in RE density among the 21 chromosomes, with chromosome Y being the most densely populated. In addition, the RE array population in the mouse chromosomes X and Y was substantially different from those of the reference human chromosomes. Conversion of the dot-matrix data pertaining to a tandem 13-repeat structure within the Ch7.032 genome unit into a line map of known REs revealed a repeat unit of ∼11.3 Kb as a mosaic of six different RE types. The data obtained from this study allowed for a comprehensive RE profiling, including the establishment of a library of RE arrays, of the reference mouse genome. Some of these RE arrays may participate in a spectrum of normal and disease biology that are specific for mice.
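The fragment-and-self-align survey described in this abstract can be illustrated with a minimal sketch. It is not the authors' protocol: the function names, the word size, and the toy sequences are assumptions chosen for illustration, and only direct repeats are reported (inverse repeats would also need a reverse-complement comparison).

```python
def split_into_units(sequence, unit_size):
    """Cut a sequence into consecutive units of at most unit_size bases."""
    return [sequence[i:i + unit_size] for i in range(0, len(sequence), unit_size)]


def self_dot_matrix(unit, word=4):
    """Return sorted (i, j) pairs, i < j, where the same word starts at both offsets.

    Each pair is one "dot" of a self-alignment dot matrix (hypothetical helper).
    """
    positions = {}
    for i in range(len(unit) - word + 1):
        positions.setdefault(unit[i:i + word], []).append(i)
    dots = []
    for offsets in positions.values():
        for a in range(len(offsets)):
            for b in range(a + 1, len(offsets)):
                dots.append((offsets[a], offsets[b]))
    return sorted(dots)


# Toy data; the study used 0.5 Mb units of the mouse reference genome.
units = split_into_units("ACGTACGTTTGACGTA", unit_size=8)
dots = self_dot_matrix("ACGTACGTTT", word=4)
```

Dots that line up along a diagonal at a fixed offset indicate a direct repeat with that period; a real survey would use far larger words and units than this toy example.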
|
270
|
Cheng J, Xue H, Zhao X. Variation of serine-aspartate repeats in membrane proteins possibly contributes to staphylococcal microevolution. PLoS One 2012; 7:e34756. [PMID: 22509353 PMCID: PMC3324548 DOI: 10.1371/journal.pone.0034756] [Citation(s) in RCA: 3] [Impact Index Per Article: 0.3] [Received: 09/25/2011] [Accepted: 03/05/2012] [Indexed: 11/18/2022] Open
Abstract
Tandem repeats (either as microsatellites or minisatellites) in eukaryotic and prokaryotic organisms are mutation-prone DNA. While minisatellites in prokaryotic genomes are underrepresented, the cell surface adhesins of bacteria often contain the minisatellite SD repeats, encoding the amino acid pair serine-aspartate, especially in staphylococcal strains. However, their relationship to biological functions is still elusive. In this study, effort was made to uncover the copy number variations of SD repeats by bioinformatic analysis and to detect changes in SD repeats during a plasmid-based assay, as a first step to understanding their biological functions. The SD repeats were found to be mainly present in the cell surface proteins. The SD repeats were genetically unstable and polymorphic in terms of copy numbers and sequence compositions. Unlike SNPs, the change in their copy number was reversible, without frame shifting. More significantly, a rearrangement hot spot, the ATTC/AGRT site, was found to be mainly responsible for the instability and reversibility of SD repeats. These characteristics of SD repeats may help bacteria respond to environmental changes with low cost, low risk and high efficiency.
Affiliation(s)
- Jing Cheng
  - Department of Animal Science, McGill University, Montreal, Quebec, Canada
- Huping Xue
  - Department of Animal Science, McGill University, Montreal, Quebec, Canada
  - State Key Laboratory of Genetic Engineering, School of Life Sciences, Fudan University, Shanghai, China
- Xin Zhao
  - Department of Animal Science, McGill University, Montreal, Quebec, Canada
|
271
|
Birkbak NJ, Wang ZC, Kim JY, Eklund AC, Li Q, Tian R, Bowman-Colin C, Li Y, Greene-Colozzi A, Iglehart JD, Tung N, Ryan PD, Garber JE, Silver DP, Szallasi Z, Richardson AL. Telomeric allelic imbalance indicates defective DNA repair and sensitivity to DNA-damaging agents. Cancer Discov 2012; 2:366-375. [PMID: 22576213 PMCID: PMC3806629 DOI: 10.1158/2159-8290.cd-11-0206] [Citation(s) in RCA: 434] [Impact Index Per Article: 36.2] [Indexed: 01/26/2023]
Abstract
DNA repair competency is one determinant of sensitivity to certain chemotherapy drugs, such as cisplatin. Cancer cells with intact DNA repair can avoid the accumulation of genome damage during growth and also can repair platinum-induced DNA damage. We sought genomic signatures indicative of defective DNA repair in cell lines and tumors and correlated these signatures to platinum sensitivity. The number of subchromosomal regions with allelic imbalance extending to the telomere (NtAI) predicted cisplatin sensitivity in vitro and pathologic response to preoperative cisplatin treatment in patients with triple-negative breast cancer (TNBC). In serous ovarian cancer treated with platinum-based chemotherapy, higher levels of NtAI forecast a better initial response. We found an inverse relationship between BRCA1 expression and NtAI in sporadic TNBC and serous ovarian cancers without BRCA1 or BRCA2 mutation. Thus, accumulation of telomeric allelic imbalance is a marker of platinum sensitivity and suggests impaired DNA repair. SIGNIFICANCE Mutations in BRCA genes cause defects in DNA repair that predict sensitivity to DNA damaging agents, including platinum; however, some patients without BRCA mutations also benefit from these agents. NtAI, a genomic measure of unfaithfully repaired DNA, may identify cancer patients likely to benefit from treatments targeting defective DNA repair.
Affiliation(s)
- Nicolai J Birkbak
  - Center for Biological Sequence Analysis, Technical University of Denmark, DK-2800 Lyngby, Denmark
  - Dana-Farber Cancer Institute, Harvard Medical School, Boston, MA 02215 USA
- Zhigang C Wang
  - Dana-Farber Cancer Institute, Harvard Medical School, Boston, MA 02215 USA
  - Brigham and Women's Hospital, Harvard Medical School, Boston, MA 02115 USA
- Ji-Young Kim
  - Brigham and Women's Hospital, Harvard Medical School, Boston, MA 02115 USA
  - CHA University School of Medicine, Seoul, Republic of Korea
- Aron C Eklund
  - Center for Biological Sequence Analysis, Technical University of Denmark, DK-2800 Lyngby, Denmark
- Qiyuan Li
  - Center for Biological Sequence Analysis, Technical University of Denmark, DK-2800 Lyngby, Denmark
  - Dana-Farber Cancer Institute, Harvard Medical School, Boston, MA 02215 USA
- Ruiyang Tian
  - Dana-Farber Cancer Institute, Harvard Medical School, Boston, MA 02215 USA
- Yang Li
  - Dana-Farber Cancer Institute, Harvard Medical School, Boston, MA 02215 USA
- J Dirk Iglehart
  - Dana-Farber Cancer Institute, Harvard Medical School, Boston, MA 02215 USA
  - Brigham and Women's Hospital, Harvard Medical School, Boston, MA 02115 USA
- Nadine Tung
  - Beth Israel Deaconess Medical Center, Harvard Medical School, Boston, MA 02215, USA
- Paula D Ryan
  - Fox Chase Cancer Center, 333 Cottman Avenue, Philadelphia, PA 19111
- Judy E Garber
  - Dana-Farber Cancer Institute, Harvard Medical School, Boston, MA 02215 USA
- Daniel P Silver
  - Dana-Farber Cancer Institute, Harvard Medical School, Boston, MA 02215 USA
  - Brigham and Women's Hospital, Harvard Medical School, Boston, MA 02115 USA
- Zoltan Szallasi
  - Center for Biological Sequence Analysis, Technical University of Denmark, DK-2800 Lyngby, Denmark
  - Children's Hospital Informatics Program at the Harvard-MIT Division of Health Sciences and Technology (CHIP@HST), Harvard Medical School, Boston, MA, 02115 USA
- Andrea L Richardson
  - Dana-Farber Cancer Institute, Harvard Medical School, Boston, MA 02215 USA
  - Brigham and Women's Hospital, Harvard Medical School, Boston, MA 02115 USA
|
272
|
Pellegrini M, Renda ME, Vecchio A. Tandem repeats discovery service (TReaDS) applied to finding novel cis-acting factors in repeat expansion diseases. BMC Bioinformatics 2012; 13 Suppl 4:S3. [PMID: 22536970 PMCID: PMC3303744 DOI: 10.1186/1471-2105-13-s4-s3] [Citation(s) in RCA: 9] [Impact Index Per Article: 0.8] [Indexed: 12/19/2022] Open
Abstract
Background Tandem repeats are multiple duplications of substrings in the DNA that occur contiguously, or at a short distance, and may involve some mutations (such as substitutions, insertions, and deletions). Tandem repeats have also been studied extensively for their association with the class of repeat expansion diseases (mostly affecting the nervous system). Comparative studies on the output of different tools for finding tandem repeats highlighted significant differences among the sets of detected tandem repeats, while many authors pointed out how critical the right choice of parameters is. Results In this paper we present TReaDS - Tandem Repeats Discovery Service, a tandem repeat meta search engine. TReaDS forwards user requests to several state-of-the-art tools for finding tandem repeats and merges their outcome into a single report, providing a global, synthetic, and comparative view of the results. In particular, TReaDS allows the user to (i) simultaneously run different algorithms on the same data set, (ii) choose for each algorithm a different setting of parameters, and (iii) obtain a report that can be downloaded for further, off-line, investigations. We used TReaDS to investigate sequences associated with repeat expansion diseases. Conclusions By using the tool TReaDS we discovered that, for 27 repeat expansion diseases out of a currently known set of 29, long fuzzy tandem repeats cover the expansion loci. Tests with control sets confirm the specificity of this association. This finding suggests that long fuzzy tandem repeats can be a new class of cis-acting elements involved in the mechanisms leading to the expansion instability.
We strongly believe that biologists will be interested in a tool that not only lets them use multiple search algorithms at the same time, with the same effort required to use just one of the systems, but also eases the burden of comparing and merging the results, thus expanding our capability to detect important phenomena related to tandem repeats.
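The merging step of a meta search engine of this kind can be sketched as follows. This is an illustrative assumption, not the TReaDS implementation: the `Call` record, the tool names `tool_a` and `tool_b`, and the rule that overlapping or touching closed intervals are merged are all hypothetical.

```python
from collections import namedtuple

# Hypothetical record for one tandem-repeat call: reporting tool and closed interval.
Call = namedtuple("Call", "tool start end")


def merge_calls(calls):
    """Group overlapping calls into regions and record which tools support each one."""
    regions = []
    for call in sorted(calls, key=lambda c: (c.start, c.end)):
        if regions and call.start <= regions[-1]["end"]:
            # Overlaps (or touches) the current region: extend it.
            regions[-1]["end"] = max(regions[-1]["end"], call.end)
            regions[-1]["tools"].add(call.tool)
        else:
            regions.append({"start": call.start, "end": call.end, "tools": {call.tool}})
    return regions


calls = [Call("tool_a", 10, 50), Call("tool_b", 40, 80), Call("tool_a", 200, 230)]
report = merge_calls(calls)
```

A region supported by several tools (here positions 10 to 80) is arguably a more robust call than one reported by a single tool, which is the kind of comparative view the abstract describes.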
Affiliation(s)
- Marco Pellegrini
  - Istituto di Informatica e Telematica, Consiglio Nazionale delle Ricerche, Pisa I-56124, Italy
|
273
|
Mogil LS, Slowikowski K, Laten HM. Computational and experimental analyses of retrotransposon-associated minisatellite DNAs in the soybean genome. BMC Bioinformatics 2012; 13 Suppl 2:S13. [PMID: 22536864 PMCID: PMC3305785 DOI: 10.1186/1471-2105-13-s2-s13] [Citation(s) in RCA: 4] [Impact Index Per Article: 0.3] [Indexed: 01/14/2023] Open
Abstract
BACKGROUND Retrotransposons are mobile DNA elements that spread through genomes via the action of element-encoded reverse transcriptases. They are ubiquitous constituents of most eukaryotic genomes, especially those of higher plants. The pericentromeric regions of soybean (Glycine max) chromosomes contain >3,200 intact copies of the Gmr9/GmOgre retrotransposon. Between the 3' end of the coding region and the long terminal repeat, this retrotransposon family contains a polymorphic minisatellite region composed of five distinct, interleaved minisatellite families. To better understand the possible role and origin of retrotransposon-associated minisatellites, a computational project to map and physically characterize all members of these families in the G. max genome, irrespective of their association with Gmr9, was undertaken. METHODS A computational pipeline was developed to map and analyze the organization and distribution of five Gmr9-associated minisatellites throughout the soybean genome. Polymerase chain reaction amplifications were used to experimentally assess the computational outputs. RESULTS A total of 63,841 copies of Gmr9-associated minisatellites were recovered from the assembled G. max genome. Ninety percent were associated with Gmr9, an additional 9% with other annotated retrotransposons, and 1% with uncharacterized repetitive DNAs. Monomers were tandemly interleaved and repeated up to 149 times per locus. CONCLUSIONS The computational pipeline enabled a fast, accurate, and detailed characterization of known minisatellites in a large, downloaded DNA database, and PCR amplification supported the general organization of these arrays.
Affiliation(s)
- Lauren S Mogil
  - Program in Bioinformatics Loyola University Chicago, 1032 W. Sheridan Rd, Chicago, IL 60660 USA
  - Department of Biology, Loyola University Chicago, 1032 W. Sheridan Rd, Chicago, IL 60660 USA
  - Present address: Department of Biochemistry and Molecular Biology, Mayo Graduate School, Rochester, MN 55905 USA
- Kamil Slowikowski
  - Program in Bioinformatics Loyola University Chicago, 1032 W. Sheridan Rd, Chicago, IL 60660 USA
- Howard M Laten
  - Program in Bioinformatics Loyola University Chicago, 1032 W. Sheridan Rd, Chicago, IL 60660 USA
  - Department of Biology, Loyola University Chicago, 1032 W. Sheridan Rd, Chicago, IL 60660 USA
|
274
|
Tang SJ. A Model of Repetitive-DNA-Organized Chromatin Network of Interphase Chromosomes. Genes (Basel) 2012; 3:167-75. [PMID: 24704848 PMCID: PMC3902797 DOI: 10.3390/genes3010167] [Citation(s) in RCA: 8] [Impact Index Per Article: 0.7] [Received: 01/14/2012] [Revised: 02/21/2012] [Accepted: 02/28/2012] [Indexed: 11/16/2022] Open
Abstract
During interphase, chromosomes are relatively de-condensed in the nuclear space. Interphase chromosomes are known to occupy nuclear space in a non-random manner (chromosome territory); however, their internal structures are poorly defined. In particular, little is understood about the molecular mechanisms that govern the internal organization of interphase chromosomes. The author recently proposed that pairing (or interaction) of repetitive DNA-containing chromatin regions is a critical driving force that specifies the higher-order organization of eukaryotic chromosomes. Guided by this theoretical framework and published experimental data on the structure of interphase chromosomes and the spatial distribution of repetitive DNA in interphase nuclei, I postulate here a molecular structure of chromatin organization in interphase chromosomes. According to this model, an interphase chromosome is a chromatin mesh (or lattice) that is formed by repeat pairing (RP). The mesh consists of two types of structural components: chromosome nodes and loose chromatin fibers. Chromosome nodes are DNA repeat assemblies (RAs) that are formed via RP, while loose fibers include chromatin loops that radiate from the nodes. Different loops crosslink by RPs and form a large integrated chromatin network. I suggest that the organization of the chromatin network of a given interphase chromosome is intrinsically specified by the distribution of repetitive DNA elements on the linear chromatin. The stability of the organization is governed by the collection of RA-formed nodes, and the dynamics of the organization is driven by the assembling and disassembling of the nodes.
Affiliation(s)
- Shao-Jun Tang
  - Department of Neuroscience and Cell Biology, University of Texas Medical Branch, Galveston, TX 77555, USA.
|
275
|
|
276
|
Abstract
Using High-Throughput DNA Sequencing (HTS) to examine gene expression is rapidly becoming a viable choice and is typically referred to as RNA-seq. Often the depth and breadth of coverage of RNA-seq data can exceed what is achievable using microarrays. However, the strengths of RNA-seq are often its greatest weaknesses. Accurately and comprehensively mapping millions of relatively short reads to a reference genome sequence can require not only specialized software, but also more structured and automated procedures to manage, analyze, and visualize the data. Additionally, the computational hardware required to efficiently process and store the data can be a necessary and often-overlooked component of a research plan. We discuss several aspects of the computational analysis of RNA-seq, including file management and data quality control, analysis, and visualization. We provide a framework for a standard nomenclature system that can facilitate automation and the ability to track data provenance. Finally, we provide a general workflow of the computational analysis of RNA-seq and a downloadable package of scripts to automate the processing.
|
277
|
Zhou MB, Liu XM, Tang DQ. Transposable elements in Phyllostachys pubescens (Poaceae) genome survey sequences and the full-length cDNA sequences, and their association with simple-sequence repeats. GENETICS AND MOLECULAR RESEARCH 2011; 10:3026-37. [PMID: 22180036 DOI: 10.4238/2011.december.6.3] [Citation(s) in RCA: 5] [Impact Index Per Article: 0.4] [Indexed: 11/03/2022]
Abstract
Phyllostachys pubescens is a woody bamboo with the highest ecological, economic, and cultural values of all bamboos in Asia. There is more genomic data available for P. pubescens than for any other bamboo species, including 2.12-Mb genome survey sequences (GSS) and 11.4-Mb full-length cDNA sequences (FL-cDNAs) currently deposited in GenBank. Analysis of these sequences revealed that transposable elements (TEs) are abundant, diverse and polyphyletic in the P. pubescens genome, of which Ty3-gypsy and Ty1-copia are the two most abundant families. Phylogenetic analysis showed that both elements probably arose before the Bambusoideae separated from the other Poaceae subfamilies. We found evidence that the distribution of some intragenic TEs correlated with transcript profiles; in particular, Mutator elements inserted preferentially into the transcripts of transcription factors. Additionally, we found that the abundance of SSRs in TEs (4.56%) was significantly higher than in GSS (0.098%) and in FL-cDNAs (2.60%) in the P. pubescens genome, and TA/AT and CT/AG repeats were found to be intimately associated with En/Spm and Mutator elements, respectively. Our data provide a glimpse of the structure and evolution of the P. pubescens genome, although large-scale sequencing would be required to fully understand its architecture.
Affiliation(s)
- M B Zhou
  - The Nurturing Station for the State Key Laboratory of Subtropical Silviculture, Zhejiang A & F University, LinAn, Zhejiang Province, P.R. China
|
278
|
Komissarov AS, Gavrilova EV, Demin SJ, Ishov AM, Podgornaya OI. Tandemly repeated DNA families in the mouse genome. BMC Genomics 2011; 12:531. [PMID: 22035034 PMCID: PMC3218096 DOI: 10.1186/1471-2164-12-531] [Citation(s) in RCA: 51] [Impact Index Per Article: 3.9] [Received: 02/05/2011] [Accepted: 10/28/2011] [Indexed: 12/23/2022] Open
Abstract
Background Functional and morphological studies of tandem DNA repeats, which make up a large portion of most genomes, are limited mostly by the incomplete characterization of these genome elements. We report here a genome-wide analysis of the large tandem repeats (TR) found in the mouse genome assemblies. Results Using a bioinformatics approach, we identified large TR with array size more than 3 kb in two mouse whole genome shotgun (WGS) assemblies. Large TR were classified based on sequence similarity, chromosome position, monomer length, array variability, and GC content; we identified four superfamilies, eight families, and 62 subfamilies, including 60 not previously described. 1) The superfamily of centromeric minor satellite is only found in the unassembled part of the reference genome. 2) The pericentromeric major satellite is the most abundant superfamily and reveals high order repeat structure. 3) The transposable element-related superfamily contains two families. 4) The superfamily of heterogeneous tandem repeats includes four families. One family is found only in the WGS, while two families represent tandem repeats with either single or multi locus location. Despite multi locus location, TRPC-21A-MM is placed into a separate family due to its abundance, strictly pericentromeric location, and resemblance to big human satellites. To confirm our data, we next performed in situ hybridization with three repeats from distinct families. The TRPC-21A-MM probe hybridized to chromosomes 3 and 17, the multi locus TR-22A-MM probe hybridized to ten chromosomes, and the single locus TR-54B-MM probe hybridized with the long loops that emerge from chromosome ends. In addition to the chromosomes predicted in silico, several extra chromosomes were positive for TR by in situ analysis, potentially indicating inaccurate genome assembly of the heterochromatic genome regions. Conclusions Chromosome-specific TR had been predicted for mouse but no reliable cytogenetic probes were available before.
We report a new analysis that identified in silico, and confirmed in situ, the chromosome 3/17-specific probe TRPC-21-MM. Thus, the new classification has proven to be a useful tool for continued genome study, while annotated TR can be a valuable source of cytogenetic probes for chromosome recognition.
|
279
|
Hile SE, Wang X, Lee MYWT, Eckert KA. Beyond translesion synthesis: polymerase κ fidelity as a potential determinant of microsatellite stability. Nucleic Acids Res 2011; 40:1636-47. [PMID: 22021378 PMCID: PMC3287198 DOI: 10.1093/nar/gkr889] [Citation(s) in RCA: 40] [Impact Index Per Article: 3.1] [Indexed: 01/12/2023] Open
Abstract
Microsatellite DNA synthesis represents a significant component of human genome replication that must occur faithfully. However, yeast replicative DNA polymerases do not possess high fidelity for microsatellite synthesis. We hypothesized that the structural features of Y-family polymerases that facilitate accurate translesion synthesis may promote accurate microsatellite synthesis. We compared human polymerases κ (Pol κ) and η (Pol η) fidelities to that of replicative human polymerase δ holoenzyme (Pol δ4), using the in vitro HSV-tk assay. Relative polymerase accuracy for insertion/deletion (indel) errors within 2-3 unit repeats internal to the HSV-tk gene concurred with the literature: Pol δ4 >> Pol κ or Pol η. In contrast, relative polymerase accuracy for unit-based indel errors within [GT](10) and [TC](11) microsatellites was: Pol κ ≥ Pol δ4 > Pol η. The magnitude of difference was greatest between Pols κ and δ4 with the [GT] template. Biochemically, Pol κ displayed less synthesis termination within the [GT] allele than did Pol δ4. In dual polymerase reactions, Pol κ competed with either a stalled or moving Pol δ4, thereby reducing termination. Our results challenge the ideology that pol κ is error prone, and suggest that DNA polymerases with complementary biochemical properties can function cooperatively at repetitive sequences.
Affiliation(s)
- Suzanne E Hile
  - Department of Pathology, Gittlen Cancer Research Foundation, Pennsylvania State University College of Medicine, 500 University Drive, Hershey, PA 17033, USA
|
280
|
Carr AM, Paek AL, Weinert T. DNA replication: failures and inverted fusions. Semin Cell Dev Biol 2011; 22:866-74. [PMID: 22020070 DOI: 10.1016/j.semcdb.2011.10.008] [Citation(s) in RCA: 29] [Impact Index Per Article: 2.2] [Received: 07/21/2011] [Accepted: 10/12/2011] [Indexed: 11/16/2022]
Abstract
DNA replication normally follows the rules passed down from Watson and Crick: the chromosome duplicates as dictated by its antiparallel strands, base-pairing and leading and lagging strand differences. Real-life replication is more complicated, fraught with perils posed by chromosome damage, by transcription of genes and by other events that disrupt progress of the DNA replication machinery. Understanding the replication fork, including DNA structures, associated replisome and its regulators, is key to understanding how cells overcome perils and minimize error. Replication fork error leads to genome rearrangements and, potentially, cell death. Interest in the replication fork and its errors has recently grown with the results of deep sequencing studies of human genomes. Several pathologies are associated with sometimes-bizarre genome rearrangements suggestive of elaborate replication fork failures. To try to understand the links between the replication fork, its failure and genome rearrangements, we discuss here phases of fork behavior (stall, collapse, restart and fork failures leading to rearrangements) and analyze two examples of instability from our own studies: one in fission yeast and the other in budding yeast.
Affiliation(s)
- Antony M Carr
  - Genome Damage and Stability Centre, School of Life Sciences, University of Sussex, Brighton, Sussex, UK.
|
281
|
Szpara ML, Tafuri YR, Parsons L, Shamim SR, Verstrepen KJ, Legendre M, Enquist LW. A wide extent of inter-strain diversity in virulent and vaccine strains of alphaherpesviruses. PLoS Pathog 2011; 7:e1002282. [PMID: 22022263 PMCID: PMC3192842 DOI: 10.1371/journal.ppat.1002282] [Citation(s) in RCA: 122] [Impact Index Per Article: 9.4] [Received: 04/20/2011] [Accepted: 08/10/2011] [Indexed: 12/17/2022] Open
Abstract
Alphaherpesviruses are widespread in the human population, and include herpes simplex virus 1 (HSV-1) and 2, and varicella zoster virus (VZV). These viral pathogens cause epithelial lesions, and then infect the nervous system to cause lifelong latency, reactivation, and spread. A related veterinary herpesvirus, pseudorabies (PRV), causes similar disease in livestock that results in significant economic losses. Vaccines developed for VZV and PRV serve as useful models for the development of an HSV-1 vaccine. We present full genome sequence comparisons of the PRV vaccine strain Bartha, and two virulent PRV isolates, Kaplan and Becker. These genome sequences were determined by high-throughput sequencing and assembly, and present new insights into the attenuation of a mammalian alphaherpesvirus vaccine strain. We find many previously unknown coding differences between PRV Bartha and the virulent strains, including changes to the fusion proteins gH and gB, and over forty other viral proteins. Inter-strain variation in PRV protein sequences is much closer to levels previously observed for HSV-1 than for the highly stable VZV proteome. Almost 20% of the PRV genome contains tandem short sequence repeats (SSRs), a class of nucleic acid motifs whose length variation has been associated with changes in DNA binding site efficiency, transcriptional regulation, and protein interactions. We find SSRs throughout the herpesvirus family, and provide the first global characterization of SSRs in viruses, both within and between strains. We find SSR length variation between different isolates of PRV and HSV-1, which may provide a new mechanism for phenotypic variation between strains. Finally, we detected a small number of polymorphic bases within each plaque-purified PRV strain, and we characterize the effect of passage and plaque-purification on these polymorphisms.
These data add to growing evidence that even plaque-purified stocks of stable DNA viruses exhibit limited sequence heterogeneity, which likely seeds future strain evolution. Alphaherpesviruses such as herpes simplex virus (HSV) are ubiquitous in the human population. HSV causes oral and genital lesions, and has co-morbidities in acquisition and spread of human immunodeficiency virus (HIV). The lack of a vaccine for HSV hinders medical progress for both of these infections. A related veterinary alphaherpesvirus, pseudorabies virus (PRV), has long served as a model for HSV vaccine development, because of their similar pathogenesis, neuronal spread, and infectious cycle. We present here the first full genome characterization of a live PRV vaccine strain, Bartha, and reveal a spectrum of unique mutations that are absent from two divergent wild-type PRV strains. These mutations can now be examined individually for their contribution to vaccine strain attenuation and for potential use in HSV vaccine development. These inter-strain comparisons also revealed an abundance of short repetitive elements in the PRV genome, a pattern which is repeated in other herpesvirus genomes and even the unrelated Mimivirus. We provide the first global characterization of repeats in viruses, comparing both their presence and their variation among different viral strains and species. Repetitive elements such as these have been shown to serve as hotspots of variation between individuals or strains of other organisms, generating adaptations or even disease states through changes in length of DNA-binding sites, protein folding motifs, and other structural elements. These data suggest for the first time that similar mechanisms could be widely distributed in viral biology as well.
Affiliation(s)
- Moriah L. Szpara
  - Department of Molecular Biology, Princeton University, Princeton, New Jersey, United States of America
  - Princeton Neuroscience Institute, Princeton University, Princeton, New Jersey, United States of America
- Yolanda R. Tafuri
  - Department of Molecular Biology, Princeton University, Princeton, New Jersey, United States of America
- Lance Parsons
  - Lewis-Sigler Institute for Integrative Genomics, Princeton University, Princeton, New Jersey, United States of America
- S. Rafi Shamim
  - Department of Molecular Biology, Princeton University, Princeton, New Jersey, United States of America
- Kevin J. Verstrepen
  - VIB lab for Systems Biology and CMPG Lab for Genetics and Genomics, KULeuven, Gaston Geenslaan 1, Leuven, Belgium
- Matthieu Legendre
  - Structural & Genomic Information Laboratory (CNRS, UPR2589), Mediterranean Institute of Microbiology, Aix-Marseille Université, Marseille, France
- L. W. Enquist
  - Department of Molecular Biology, Princeton University, Princeton, New Jersey, United States of America
  - Princeton Neuroscience Institute, Princeton University, Princeton, New Jersey, United States of America
|
282
|
Ma L, Jensen JS, Mancuso M, Hamasuna R, Jia Q, McGowin CL, Martin DH. Variability of trinucleotide tandem repeats in the MgPa operon and its repetitive chromosomal elements in Mycoplasma genitalium. J Med Microbiol 2011; 61:191-197. [PMID: 21997874 DOI: 10.1099/jmm.0.030858-0] [Citation(s) in RCA: 18] [Impact Index Per Article: 1.4] [Indexed: 11/18/2022]
Abstract
Mycoplasma genitalium, a human pathogen associated with sexually transmitted diseases, is unique in that it has the smallest genome of any known free-living organism. Despite its small genome, 4.7% of the total genomic sequence is devoted to making the MgPa adhesin operon (containing the MG190, MG191 and MG192 genes) and its repetitive chromosomal sequences (known as MgPars). The goals of this study were to investigate the location, organization and variability of trinucleotide tandem repeats (TTRs) in the MgPa operon and MgPars and to explore the possible mechanisms and role of TTR variations. By analysing the complete MgPa operon and complete or partial MgPar sequences in a collection of 15 geographically diverse clinical strains of M. genitalium, TTR sequences were identified in four regions in MG191, one region in MG192, and two or three regions in each of the nine MgPars except MgPar 3. These TTRs varied not only in repeat copy number but also in repeat unit sequence, both among and within strains. The key mechanisms for the TTR variations likely include recombination between MgPa and MgPars, and slipped-strand mispairing. TTR variation may represent a mechanism to maximize the variation of the MgPa operon, complementary to genetic variation involving segmental recombination between MgPa and MgPars, thus enhancing the organism's ability to adhere to and colonize host cells and to evade the host immune system.
Affiliation(s)
- Liang Ma
  - Section of Infectious Diseases, Department of Medicine, Louisiana State University Health Sciences Center, New Orleans, Louisiana, USA
- Jørgen S Jensen
  - Mycoplasma Laboratory, Statens Serum Institut, Copenhagen, Denmark
- Miriam Mancuso
  - Section of Infectious Diseases, Department of Medicine, Louisiana State University Health Sciences Center, New Orleans, Louisiana, USA
- Ryoichi Hamasuna
  - Department of Urology, University of Occupational and Environmental Health, Yahatanishi-ku, Kitakyushu, Japan
- Qiuyao Jia
  - Section of Infectious Diseases, Department of Medicine, Louisiana State University Health Sciences Center, New Orleans, Louisiana, USA
- Chris L McGowin
  - Section of Infectious Diseases, Department of Medicine, Louisiana State University Health Sciences Center, New Orleans, Louisiana, USA
- David H Martin
  - Section of Infectious Diseases, Department of Medicine, Louisiana State University Health Sciences Center, New Orleans, Louisiana, USA
|
283
|
Tyekucheva S, Yolken RH, McCombie WR, Parla J, Kramer M, Wheelan SJ, Sabunciyan S. Establishing the baseline level of repetitive element expression in the human cortex. BMC Genomics 2011; 12:495. [PMID: 21985647 PMCID: PMC3207997 DOI: 10.1186/1471-2164-12-495] [Citation(s) in RCA: 11] [Impact Index Per Article: 0.8] [Received: 12/16/2010] [Accepted: 10/10/2011] [Indexed: 11/10/2022]
Abstract
BACKGROUND: Although nearly half of the human genome is composed of repetitive sequences, the expression profile of these elements remains largely uncharacterized. Recently developed high-throughput sequencing technologies provide a powerful new set of tools to study repeat elements. Hence, we performed whole-transcriptome sequencing to investigate the expression of repetitive elements in human frontal cortex using postmortem tissue obtained from the Stanley Medical Research Institute.
RESULTS: We found that a significant number of reads from the human frontal cortex originate from repeat elements. We also noticed that Alu elements were expressed at levels higher than expected by random or background transcription. In contrast, L1 elements were expressed at lower than expected amounts.
CONCLUSIONS: Repetitive elements are expressed abundantly in the human brain. This expression pattern appears to be element specific and cannot be explained by random or background transcription. These results demonstrate that our knowledge about repetitive elements is far from complete. Further characterization is required to determine the mechanism, the control, and the effects of repeat element expression.
Affiliation(s)
- Svitlana Tyekucheva
  - Department of Biostatistics and Computational Biology, Dana-Farber Cancer Institute, 450 Brookline Ave, Boston, 02115, USA
|
284
|
|
285
|
Noskov VN, Lee NC, Larionov V, Kouprina N. Rapid generation of long tandem DNA repeat arrays by homologous recombination in yeast to study their function in mammalian genomes. Biol Proced Online 2011; 13:8. [PMID: 21982381 PMCID: PMC3200152 DOI: 10.1186/1480-9222-13-8] [Citation(s) in RCA: 12] [Impact Index Per Article: 0.9] [Received: 09/09/2011] [Accepted: 10/07/2011] [Indexed: 12/18/2022]
Abstract
We describe here a method to rapidly convert any desirable DNA fragment, as small as 100 bp, into long tandem DNA arrays up to 140 kb in size that are inserted into a microbe vector. This method includes rolling-circle phi29 amplification (RCA) of the sequence in vitro and assembly of the RCA products in vivo by homologous recombination in the yeast Saccharomyces cerevisiae. The method was successfully used for a functional analysis of centromeric and pericentromeric repeats and construction of new vehicles for gene delivery to mammalian cells. The method may have general application in elucidating the role of tandem repeats in chromosome organization and dynamics. Each cycle of the protocol takes about two weeks to complete.
Affiliation(s)
- Vladimir N Noskov
  - Laboratory of Molecular Pharmacology, National Cancer Institute, National Institutes of Health, 9000 Rockville Pike, Bethesda, Maryland 20892, USA.
|
286
|
PCR amplification of repetitive sequences as a possible approach in relative species quantification. Meat Sci 2011; 90:438-43. [PMID: 21944936 DOI: 10.1016/j.meatsci.2011.09.002] [Citation(s) in RCA: 16] [Impact Index Per Article: 1.2] [Received: 03/05/2011] [Revised: 09/01/2011] [Accepted: 09/03/2011] [Indexed: 11/22/2022]
Abstract
Both relative and absolute quantifications are possible in species quantification when single copy genomic DNA is used. However, amplification of single copy genomic DNA does not allow a limit of detection as low as that obtained from amplification of repetitive sequences. Amplification of repetitive sequences is therefore frequently used in absolute quantification, but problems occur in relative quantification because the number of repetitive sequences is unknown. A promising approach was developed in which data from amplification of repetitive sequences were used in relative quantification of species in binary mixtures. PCR LUX primers were designed that amplify repetitive and single copy sequences to establish species-dependent constants (SDCs), that is, the number of amplified repetitive sequences per genome. The SDCs and data from amplification of repetitive sequences were tested for their applicability to relatively quantify the amount of chicken DNA in a binary mixture of chicken DNA and pig DNA. However, the designed PCR primers lack the specificity required for regulatory species control.
|
287
|
G-quadruplex-induced instability during leading-strand replication. EMBO J 2011; 30:4033-46. [PMID: 21873979 DOI: 10.1038/emboj.2011.316] [Citation(s) in RCA: 250] [Impact Index Per Article: 19.2] [Received: 06/28/2011] [Accepted: 08/09/2011] [Indexed: 02/07/2023]
Abstract
G-quadruplexes are four-stranded nucleic acid structures whose biological functions remain poorly understood. In the yeast S. cerevisiae, we report that G-quadruplexes form and, if not properly processed, pose a specific challenge to replication. We show that the G-quadruplex-prone CEB1 tandem array is tolerated when inserted near the ARS305 replication origin in wild-type cells but is very frequently destabilized upon treatment with the potent Phen-DC(3) G-quadruplex ligand, or in the absence of the G-quadruplex-unwinding Pif1 helicase, only when the G-rich strand is the template of leading-strand replication. The orientation-dependent instability is associated with the formation of Rad51-Rad52-dependent X-shaped intermediates during replication detected by two-dimensional (2D) gels, and relies on the presence of intact G-quadruplex motifs in CEB1 and on the activity of ARS305. The asymmetrical behaviour of G-quadruplex-prone sequences during replication has implications for their evolutionary dynamics within genomes, including the maintenance of G-rich telomeres.
|
288
|
Abstract
Mutation rates vary significantly within the genome and across species. Recent studies revealed a long suspected replication-timing effect on mutation rate, but the mechanisms that regulate the increase in mutation rate as the genome is replicated remain unclear. Evidence is emerging, however, that DNA repair systems, in general, are less efficient in late replicating heterochromatic regions compared to early replicating euchromatic regions of the genome. At the same time, mutation rates in both vertebrates and invertebrates have been shown to vary with generation time (GT). GT is correlated with genome size, which suggests a possible nucleotypic effect on species-specific mutation rates. These and other observations all converge on a role for DNA replication checkpoints in modulating generation times and mutation rates during the DNA synthetic phase (S phase) of the cell cycle. The following will examine the potential role of the intra-S checkpoint in regulating cell cycle times (GT) and mutation rates in eukaryotes. This article was published online on August 5, 2011. An error was subsequently identified. This notice is included in the online and print versions to indicate that both have been corrected October 4, 2011.
Affiliation(s)
- John Herrick
  - Department of Physics, Simon Fraser University, 8888 University Drive, Burnaby, British Columbia, Canada.
|
289
|
Chromatin Organization by Repetitive Elements (CORE): A Genomic Principle for the Higher-Order Structure of Chromosomes. Genes (Basel) 2011; 2:502-15. [PMID: 24710208 PMCID: PMC3927610 DOI: 10.3390/genes2030502] [Citation(s) in RCA: 26] [Impact Index Per Article: 2.0] [Received: 06/24/2011] [Revised: 07/21/2011] [Accepted: 07/25/2011] [Indexed: 12/01/2022]
Abstract
Eukaryotic genomes contain a large amount of DNA repeats (also known as repetitive DNA, repetitive elements, and repetitive sequences). Here, I propose a role of repetitive DNA in the formation of higher-order structures of chromosomes. The central idea of this theory is that chromatin regions with repetitive sequences pair with regions harboring homologous repeats and that such somatic repeat pairing (RP) assembles repetitive DNA chromatin into compact chromosomal domains that specify chromatin folding in a site-directed manner. According to this theory, DNA repeats are not randomly distributed in the genome. Instead, they form a core framework that coordinates the architecture of chromosomes. In contrast to the viewpoint that DNA repeats are genomic ‘junk’, this theory advocates that repetitive sequences are chromatin organizer modules that determine chromatin-chromatin contact points within chromosomes. This novel concept, if correct, would suggest that DNA repeats in the linear genome encode a blueprint for higher-order chromosomal organization.
|
290
|
Chung BI, Lee KH, Shin KS, Kim WC, Kwon DN, You RN, Lee YK, Cho K, Cho DH. REMiner: a tool for unbiased mining and analysis of repetitive elements and their arrangement structures of large chromosomes. Genomics 2011; 98:381-9. [PMID: 21803149 DOI: 10.1016/j.ygeno.2011.07.002] [Citation(s) in RCA: 4] [Impact Index Per Article: 0.3] [Received: 02/15/2011] [Revised: 06/21/2011] [Accepted: 07/15/2011] [Indexed: 11/18/2022]
Abstract
Repetitive elements (REs) constitute a substantial portion of the genomes of human and other species; however, the RE profiles (type, density, and arrangement) within the individual genomes have not been fully characterized. In this study, we developed an RE analysis tool, called REMiner, for a chromosome-wide investigation into the occurrence of individual REs and arrangement of clusters of REs, and REMiner's functional features were examined using the human chromosome Y. The algorithm implemented by REMiner focused on unbiased mining of REs in large chromosomes and data interface within a viewer. The data from the chromosome demonstrated that REMiner is an efficient tool in regard to its capacity for a large query size and the availability of a high-resolution viewer, featuring instant retrieval of alignment data and control of magnification and identity ratio. The chromosome-wide survey identified a diverse population of ordered RE arrangements, which may participate in the genome biology.
Affiliation(s)
- Byung-Ik Chung
  - Division of Electrical Engineering, Korea Advanced Institute of Science and Technology, Daejeon, South Korea
|
291
|
Chevereau G, Arneodo A, Vaillant C. Influence of the genomic sequence on the primary structure of chromatin. FRONTIERS IN LIFE SCIENCE 2011. [DOI: 10.1080/21553769.2012.708882] [Citation(s) in RCA: 4] [Impact Index Per Article: 0.3] [Indexed: 10/28/2022]
|
292
|
Alexandrov BS, Valtchinov VI, Alexandrov LB, Gelev V, Dagon Y, Bock J, Kohane IS, Rasmussen KØ, Bishop AR, Usheva A. DNA dynamics is likely to be a factor in the genomic nucleotide repeats expansions related to diseases. PLoS One 2011; 6:e19800. [PMID: 21625483 PMCID: PMC3098838 DOI: 10.1371/journal.pone.0019800] [Citation(s) in RCA: 7] [Impact Index Per Article: 0.5] [Received: 12/20/2010] [Accepted: 04/15/2011] [Indexed: 11/23/2022]
Abstract
Trinucleotide repeat sequences (TRS) represent a common type of genomic DNA motif whose expansion is associated with a large number of human diseases. The molecular mechanisms driving the ongoing dynamic expansion of TRS across generations and within tissues, and its influence on genomic DNA functions, are not well understood. Here we report results for a novel and notable collective breathing behavior of genomic DNA of tandem TRS, leading to a propensity for large local DNA transient openings at physiological temperature. Our Langevin molecular dynamics (LMD) and Markov Chain Monte Carlo (MCMC) simulations demonstrate that the patterns of openings of various TRSs depend specifically on their length. The collective propensity for DNA strand separation of repeated sequences serves as a precursor for outsized intermediate bubble states independently of the G/C-content. We report that repeats have the potential to interfere with the binding of transcription factors to their consensus sequence by altered DNA breathing dynamics in proximity of the binding sites. These observations might influence ongoing attempts to use LMD and MCMC simulations for TRS-related modeling of genomic DNA functionality in elucidating the common denominators of the dynamic TRS expansion mutation with potential therapeutic applications.
Affiliation(s)
- Boian S. Alexandrov
  - Theoretical Division, Los Alamos National Laboratory, Los Alamos, New Mexico, United States of America
- Vlad I. Valtchinov
  - National Center for Biomedical Computing, Informatics for Integrating Biology and the Bedside, Boston, Massachusetts, United States of America
- Ludmil B. Alexandrov
  - Endocrinology, Beth Israel Deaconess Medical Center, Harvard Medical School, Boston, Massachusetts, United States of America
- Vladimir Gelev
  - Endocrinology, Beth Israel Deaconess Medical Center, Harvard Medical School, Boston, Massachusetts, United States of America
- Yossi Dagon
  - Endocrinology, Beth Israel Deaconess Medical Center, Harvard Medical School, Boston, Massachusetts, United States of America
- Jonathan Bock
  - Endocrinology, Beth Israel Deaconess Medical Center, Harvard Medical School, Boston, Massachusetts, United States of America
- Isaac S. Kohane
  - National Center for Biomedical Computing, Informatics for Integrating Biology and the Bedside, Boston, Massachusetts, United States of America
- Kim Ø. Rasmussen
  - Theoretical Division, Los Alamos National Laboratory, Los Alamos, New Mexico, United States of America
- Alan R. Bishop
  - Theoretical Division, Los Alamos National Laboratory, Los Alamos, New Mexico, United States of America
- Anny Usheva
  - Endocrinology, Beth Israel Deaconess Medical Center, Harvard Medical School, Boston, Massachusetts, United States of America
|
293
|
Zuo Z, Lin HK, Trakselis MA. Strand annealing and terminal transferase activities of a B-family DNA polymerase. Biochemistry 2011; 50:5379-90. [PMID: 21545141 DOI: 10.1021/bi200421g] [Citation(s) in RCA: 7] [Impact Index Per Article: 0.5] [Indexed: 11/28/2022]
Abstract
DNA replication polymerases have the inherent ability to faithfully and rapidly copy a DNA template according to precise Watson-Crick base pairing. The primary B-family DNA replication polymerase (Dpo1) in the hyperthermophilic archaeon, Sulfolobus solfataricus, is shown here to possess a remarkable DNA stabilizing ability for maintaining weak base pairing interactions to facilitate primer extension. This thermal stabilization by Dpo1 allowed for template-directed synthesis at temperatures more than 30 °C above the melting temperature of naked DNA. Surprisingly, Dpo1 also displays a competing terminal deoxynucleotide transferase (TdT) activity unlike any other B-family DNA polymerase. Dpo1 is shown to elongate single-stranded DNA in template-dependent and template-independent manners. Experiments with different homopolymeric templates indicate that initial deoxyribonucleotide incorporation is complementary to the template. Rate-limiting steps that include looping back and annealing to the template allow for a unique template-dependent terminal transferase activity. The multiple activities of this unique B-family DNA polymerase make this enzyme an essential component for DNA replication and DNA repair for the maintenance of the archaeal genome at high temperatures.
Affiliation(s)
- Zhongfeng Zuo
  - Department of Chemistry, University of Pittsburgh, Pittsburgh, PA 15260, USA
|
294
|
Kelly MK, Alver B, Kirkpatrick DT. Minisatellite alterations in ZRT1 mutants occur via RAD52-dependent and RAD52-independent mechanisms in quiescent stationary phase yeast cells. DNA Repair (Amst) 2011; 10:556-66. [PMID: 21515092 DOI: 10.1016/j.dnarep.2011.03.002] [Citation(s) in RCA: 3] [Impact Index Per Article: 0.2] [Received: 05/04/2010] [Revised: 02/21/2011] [Accepted: 03/04/2011] [Indexed: 12/11/2022]
Abstract
Alterations in minisatellite DNA repeat tracts are associated with a variety of human diseases including Type 1 diabetes, progressive myoclonus epilepsy, and some types of cancer. However, in spite of their role in human health, the factors required for minisatellite alterations are not well understood. We previously identified a stationary phase specific increase in minisatellite instability caused by mutations in the high affinity zinc transporter ZRT1, using a minisatellite inserted into the ADE2 locus in Saccharomyces cerevisiae. Here, we examined ZRT1-mediated minisatellite instability in yeast strains lacking key recombination genes to determine the mechanisms by which these alterations occur. Our analysis revealed that minisatellite alterations in a Δzrt1 mutant occur by a combination of RAD52-dependent and RAD52-independent mechanisms. In this study, plasmid-based experiments demonstrate that ZRT1-mediated minisatellite alterations occur independently of chromosomal context or adenine auxotrophy, and confirmed the stationary phase timing of the events. To further examine the stationary phase specificity of ZRT1-mediated minisatellite alterations, we deleted ETR1 and POR1, genes that were previously shown to differentially affect the viability of quiescent or nonquiescent cells in stationary phase populations. These experiments revealed that minisatellite alterations in Δzrt1 mutants occur exclusively in quiescent stationary phase cells. Finally, we show that loss of ZRT1 stimulates alterations in a derivative of the human HRAS1 minisatellite. We propose that the mechanism of ZRT1-mediated minisatellite instability during quiescence is relevant to human cells, and thus, human disease.
Affiliation(s)
- Maire K Kelly
  - Department of Genetics, Cell Biology and Development, University of Minnesota, Minneapolis, MN 55455, USA
|
295
|
Sundararajan R, Freudenreich CH. Expanded CAG/CTG repeat DNA induces a checkpoint response that impacts cell proliferation in Saccharomyces cerevisiae. PLoS Genet 2011; 7:e1001339. [PMID: 21437275 PMCID: PMC3060079 DOI: 10.1371/journal.pgen.1001339] [Citation(s) in RCA: 27] [Impact Index Per Article: 2.1] [Received: 08/27/2010] [Accepted: 02/15/2011] [Indexed: 11/18/2022]
Abstract
Repetitive DNA elements are mutational hotspots in the genome, and their instability is linked to various neurological disorders and cancers. Although it is known that expanded trinucleotide repeats can interfere with DNA replication and repair, the cellular response to these events has not been characterized. Here, we demonstrate that an expanded CAG/CTG repeat elicits a DNA damage checkpoint response in budding yeast. Using microcolony and single cell pedigree analysis, we found that cells carrying an expanded CAG repeat frequently experience protracted cell division cycles, persistent arrests, and morphological abnormalities. These phenotypes were further exacerbated by mutations in DSB repair pathways, including homologous recombination and end joining, implicating a DNA damage response. Cell cycle analysis confirmed repeat-dependent S phase delays and G2/M arrests. Furthermore, we demonstrate that the above phenotypes are due to the activation of the DNA damage checkpoint, since expanded CAG repeats induced the phosphorylation of the Rad53 checkpoint kinase in a rad52Δ recombination deficient mutant. Interestingly, cells mutated for the MRX complex (Mre11-Rad50-Xrs2), a central component of DSB repair which is required to repair breaks at CAG repeats, failed to elicit repeat-specific arrests, morphological defects, or Rad53 phosphorylation. We therefore conclude that damage at expanded CAG/CTG repeats is likely sensed by the MRX complex, leading to a checkpoint response. Finally, we show that repeat expansions preferentially occur in cells experiencing growth delays. Activation of DNA damage checkpoints in repeat-containing cells could contribute to the tissue degeneration observed in trinucleotide repeat expansion diseases.
Expansion of a CAG/CTG trinucleotide repeat is the causative mutation for multiple neurodegenerative diseases, including Huntington's disease, myotonic dystrophy, and multiple types of spinocerebellar ataxias. Two reasons for the cell death that occurs in these diseases are toxicity of the repeat-containing RNA and of the polyglutamine-containing protein product. Although the expanded repeat can interfere with DNA replication and repair, it was not known whether the presence of the repeat within the DNA causes any additional cellular toxicity. In this study, we show that an expanded CAG/CTG tract placed within the chromosome of the model eukaryote, budding yeast, elicits a cellular response that interferes with cell growth and division. The effect is enhanced when DNA repair pathways, particularly double-strand break repair, are compromised. Moreover, cells experiencing an arrest were more likely to have undergone further repeat expansions. We show that the conserved MRX protein complex locates to the expanded repeat and is required to sense the damage and activate the DNA damage response. Our results suggest that DNA damage at expanded CAG/CTG repeats could contribute to both tissue degeneration and further repeat instability in affected individuals.
|
296
|
Lee KH, Lee YK, Kwon DN, Chiu S, Chew V, Rah H, Kujawski G, Melhem R, Hsu K, Chung C, Greenhalgh DG, Cho K. Identification of a unique library of complex, but ordered, arrays of repetitive elements in the human genome and implication of their potential involvement in pathobiology. Exp Mol Pathol 2011; 90:300-11. [PMID: 21376035 DOI: 10.1016/j.yexmp.2011.02.007] [Citation(s) in RCA: 5] [Impact Index Per Article: 0.4] [Received: 02/03/2011] [Accepted: 02/18/2011] [Indexed: 12/16/2022]
Abstract
Approximately 2% of the human genome is reported to be occupied by genes. Various forms of repetitive elements (REs), both characterized and uncharacterized, are presumed to make up the vast majority of the rest of the genomes of human and other species. In conjunction with a comprehensive annotation of genes, information regarding components of genome biology, such as gene polymorphisms, non-coding RNAs, and certain REs, is found in human genome databases. However, the genome-wide profile of unique RE arrangements formed by different groups of REs has not yet been fully characterized. In this study, the entire human genome was subjected to an unbiased RE survey to establish a whole-genome profile of REs and their arrangements. Due to the limitation in query size within the bl2seq alignment program (National Center for Biotechnology Information [NCBI]) utilized for the RE survey, the entire NCBI reference human genome was fragmented into 6206 units of 0.5 million nucleotides. A number of RE arrangements with varying complexities and patterns were identified throughout the genome. Each chromosome had unique profiles of RE arrangements and density, and high levels of RE density were measured near the centromere regions. Subsequently, 175 complex RE arrangements, which were selected throughout the genome, were subjected to a comparison analysis using five different human genome sequences. Interestingly, three of the five human genome databases shared exactly the same arrangement patterns and sequences for all 175 RE arrangement regions (a total of 12,765,625 nucleotides). The findings from this study demonstrate that a substantial fraction of REs in the human genome are clustered into various forms of ordered structures. Further investigations are needed to examine whether some of these ordered RE arrangements contribute to human pathobiology as a functional genome unit.
Affiliation(s)
- Kang-Hoon Lee
  - Burn Research, Shriners Hospitals for Children Northern California and Department of Surgery, University of California-Davis, 2425 Stockton Blvd., Sacramento, CA 95817, USA
|
297
|
Malausa T, Gilles A, Meglécz E, Blanquart H, Duthoy S, Costedoat C, Dubut V, Pech N, Castagnone‐Sereno P, Délye C, Feau N, Frey P, Gauthier P, Guillemaud T, Hazard L, Le Corre V, Lung‐Escarmant B, Malé PJG, Ferreira S, Martin JF. High‐throughput microsatellite isolation through 454 GS‐FLX Titanium pyrosequencing of enriched DNA libraries. Mol Ecol Resour 2011; 11:638-44. [PMID: 21676194 DOI: 10.1111/j.1755-0998.2011.02992.x] [Citation(s) in RCA: 256] [Impact Index Per Article: 19.7] [Indexed: 11/28/2022]
Affiliation(s)
- THIBAUT MALAUSA
- INRA, UMR 1301 IBSV INRA/UNSA/CNRS, 400 Route des Chappes, BP 167, 06903 Sophia‐Antipolis Cedex, France
| | - ANDRÉ GILLES
- Aix‐Marseille Université, CNRS, IRD, UMR 6116 – IMEP, Equipe Evolution Génome Environnement, Centre Saint‐Charles, Case 36, 3 Place Victor Hugo, 13331 Marseille Cedex 3, France
| | - EMESE MEGLÉCZ
- Aix‐Marseille Université, CNRS, IRD, UMR 6116 – IMEP, Equipe Evolution Génome Environnement, Centre Saint‐Charles, Case 36, 3 Place Victor Hugo, 13331 Marseille Cedex 3, France
| | - HÉLÈNE BLANQUART
- Genoscreen, Genomic Platform and R&D, Campus de l’Institut Pasteur, 1 rue du Professeur Calmette, Bâtiment Guérin, 59000 Lille, France
| | - STÉPHANIE DUTHOY
- Genoscreen, Genomic Platform and R&D, Campus de l’Institut Pasteur, 1 rue du Professeur Calmette, Bâtiment Guérin, 59000 Lille, France
| | - CAROLINE COSTEDOAT
- Aix‐Marseille Université, CNRS, IRD, UMR 6116 – IMEP, Equipe Evolution Génome Environnement, Centre Saint‐Charles, Case 36, 3 Place Victor Hugo, 13331 Marseille Cedex 3, France
| | - VINCENT DUBUT
- Aix‐Marseille Université, CNRS, IRD, UMR 6116 – IMEP, Equipe Evolution Génome Environnement, Centre Saint‐Charles, Case 36, 3 Place Victor Hugo, 13331 Marseille Cedex 3, France
| | - NICOLAS PECH
- Aix‐Marseille Université, CNRS, IRD, UMR 6116 – IMEP, Equipe Evolution Génome Environnement, Centre Saint‐Charles, Case 36, 3 Place Victor Hugo, 13331 Marseille Cedex 3, France
| | | | - CHRISTOPHE DÉLYE
- INRA, UMR 1210 Biologie et Gestion des Adventices, 17 rue Sully, 21000 Dijon, France
| | - NICOLAS FEAU
- INRA, UMR 1202 BIOGECO, Equipe de Pathologie Forestière, Domaine de Pierroton, 69 route d’Arcachon, 33612 Cestas Cedex, France
| | - PASCAL FREY
- INRA, Nancy‐Université, UMR 1136, Interactions Arbres – Microorganismes, IFR 110, 54280 Champenoux, France
| | - PHILIPPE GAUTHIER
- UMR CBGP (INRA/IRD/Cirad/Montpellier SupAgro), Campus International de Baillarguet, CS 30016, 34988 Montferrier‐sur‐Lez Cedex, France
| | - THOMAS GUILLEMAUD
- INRA, UMR 1301 IBSV INRA/UNSA/CNRS, 400 Route des Chappes, BP 167, 06903 Sophia‐Antipolis Cedex, France
| | - LAURENT HAZARD
- INRA – UMR 1248 AGIR, BP 52627, 31326 Castanet‐Tolosan Cedex, France
| | - VALÉRIE LE CORRE
- INRA, UMR 1210 Biologie et Gestion des Adventices, 17 rue Sully, 21000 Dijon, France
| | - BRIGITTE LUNG‐ESCARMANT
- INRA, UMR 1202 BIOGECO, Equipe de Pathologie Forestière, Domaine de Pierroton, 69 route d’Arcachon, 33612 Cestas Cedex, France
- PIERRE‐JEAN G. MALÉ
- UMR Evolution et Diversité Biologique (Université Toulouse III; CNRS), 118 Route de Narbonne, 31062 Toulouse, France
- STÉPHANIE FERREIRA
- Genoscreen, Genomic Platform and R&D, Campus de l’Institut Pasteur, 1 rue du Professeur Calmette, Bâtiment Guérin, 59000 Lille, France
- JEAN‐FRANÇOIS MARTIN
- UMR CBGP (INRA/IRD/Cirad/Montpellier SupAgro), Campus International de Baillarguet, CS 30016, 34988 Montferrier‐sur‐Lez Cedex, France
|
298
|
Lin Y, Wilson JH. Transcription-induced DNA toxicity at trinucleotide repeats: double bubble is trouble. Cell Cycle 2011; 10:611-8. [PMID: 21293182 DOI: 10.4161/cc.10.4.14729] [Citation(s) in RCA: 34] [Impact Index Per Article: 2.6] [Indexed: 01/30/2023] Open
Abstract
Trinucleotide repeats (TNR) are a blessing and a curse. In coding regions, where they are enriched, short repeats offer the potential for continuous, rapid length variation with linked incremental changes in the activity of the encoded protein, a valuable source of variation for evolution. But at the upper end of these benign and beneficial lengths, trinucleotide repeats become very unstable, with a dangerous bias toward continual expansion, which can lead to neurological diseases in humans. The mechanisms of expansion are varied and the links to disease are complex. Where they have been delineated, however, they have often revealed unexpected, fundamental aspects of the underlying cell biology. Nowhere is this more apparent than in recent studies, which indicate that expanded CAG repeats can form toxic sites in the genome, which can, upon interaction with normal components of DNA metabolism, trigger cell death. Here we discuss the phenomenon of TNR-induced DNA toxicity, with special emphasis on the role of transcription. Transcription-induced DNA toxicity may have profound biological consequences, with particular relevance to repeat-associated neurodegenerative diseases.
Affiliation(s)
- Yunfu Lin
- Verna and Marrs McLean Department of Biochemistry and Molecular Biology, Baylor College of Medicine, Houston, TX USA.
|
299
|
Ubeda-Manzanaro M, Merlo MA, Palazón JL, Sarasquete C, Rebordinos L. Sequence characterization and phylogenetic analysis of the 5S ribosomal DNA in species of the family Batrachoididae. Genome 2011; 53:723-30. [PMID: 20924421 DOI: 10.1139/g10-048] [Citation(s) in RCA: 18] [Impact Index Per Article: 1.4] [Indexed: 11/22/2022]
Abstract
5S ribosomal DNA (rDNA) sequences were analyzed in four species belonging to different genera of the fish family Batrachoididae. Several 5S rDNA variants differing in their non-transcribed spacers (NTSs) were found and were grouped into two main types. Two species showed both types of 5S rDNA, whereas the other two species showed only one type. One type of NTS of Amphichthys cryptocentrus showed a high polymorphism due to several deletions and insertions, and phylogenetic analysis showed a between-species clustering of this type of NTS in Amphichthys cryptocentrus. These results suggest a clear differentiation in the model of 5S rDNA evolution of these four species of Batrachoididae, which appear to have been subject to processes of concerted evolution and birth-and-death evolution with purifying selection.
Affiliation(s)
- María Ubeda-Manzanaro
- Instituto de Ciencias Marinas de Andalucía - CSIC, Polígono Río San Pedro, 11510 Puerto Real, Cádiz, Spain
|
300
|
Gemayel R, Vinces MD, Legendre M, Verstrepen KJ. Variable tandem repeats accelerate evolution of coding and regulatory sequences. Annu Rev Genet 2011; 44:445-77. [PMID: 20809801 DOI: 10.1146/annurev-genet-072610-155046] [Citation(s) in RCA: 390] [Impact Index Per Article: 30.0] [Indexed: 11/09/2022]
Abstract
Genotype-to-phenotype mapping commonly focuses on two major classes of mutations: single nucleotide polymorphisms (SNPs) and copy number variation (CNV). Here, we discuss an underestimated third class of genotypic variation: changes in microsatellite and minisatellite repeats. Such tandem repeats (TRs) are ubiquitous, unstable genomic elements that have historically been designated as nonfunctional "junk DNA" and are therefore mostly ignored in comparative genomics. However, as many as 10% to 20% of eukaryotic genes and promoters contain an unstable repeat tract. Mutations in these repeats often have fascinating phenotypic consequences. For example, changes in unstable repeats located in or near human genes can lead to neurodegenerative diseases such as Huntington disease. Apart from their role in disease, variable repeats also confer useful phenotypic variability, including cell surface variability, plasticity in skeletal morphology, and tuning of the circadian rhythm. As such, TRs combine characteristics of genetic and epigenetic changes that may facilitate organismal evolvability.
Affiliation(s)
- Rita Gemayel
- Laboratory for Systems Biology, VIB, B-3001 Heverlee, Belgium
|