1
Flynn JM, Yamashita YM. The implications of satellite DNA instability on cellular function and evolution. Semin Cell Dev Biol 2024; 156:152-159. [PMID: 37852904 DOI: 10.1016/j.semcdb.2023.10.005] [Citation(s) in RCA: 2] [Impact Index Per Article: 2.0] [Received: 04/26/2023] [Revised: 09/21/2023] [Accepted: 10/11/2023] [Indexed: 10/20/2023]
Abstract
Abundant tandemly repeated satellite DNA is present in most eukaryotic genomes. Previous limitations including a pervasive view that it was uninteresting junk DNA, combined with challenges in studying it, are starting to dissolve - and recent studies have found important functions for satellite DNAs. The observed rapid evolution and implied instability of satellite DNA now has important significance for their functions and maintenance within the genome. In this review, we discuss the processes that lead to satellite DNA copy number instability, and the importance of mechanisms to manage the potential negative effects of instability. Satellite DNA is vulnerable to challenges during replication and repair, since it forms difficult-to-process secondary structures and its homology within tandem arrays can result in various types of recombination. Satellite DNA instability may be managed by DNA or chromatin-binding proteins ensuring proper nuclear localization and repair, or by proteins that process aberrant structures that satellite DNAs tend to form. We also discuss the pattern of satellite DNA mutations from recent mutation accumulation (MA) studies that have tracked changes in satellite DNA for up to 1000 generations with minimal selection. Finally, we highlight examples of satellite evolution from studies that have characterized satellites across millions of years of Drosophila fruit fly evolution, and discuss possible ways that selection might act on the satellite DNA composition.
Affiliation(s)
- Jullien M Flynn
- Whitehead Institute for Biomedical Research, Cambridge, MA, USA; Howard Hughes Medical Institute, Cambridge, MA, USA.
- Yukiko M Yamashita
- Whitehead Institute for Biomedical Research, Cambridge, MA, USA; Howard Hughes Medical Institute, Cambridge, MA, USA; Massachusetts Institute of Technology, Cambridge, MA, USA.
2
Danchin A. From chemical metabolism to life: the origin of the genetic coding process. Beilstein J Org Chem 2017; 13:1119-1135. [PMID: 28684991 PMCID: PMC5480338 DOI: 10.3762/bjoc.13.111] [Citation(s) in RCA: 13] [Impact Index Per Article: 1.9] [Received: 01/18/2017] [Accepted: 05/19/2017] [Indexed: 12/11/2022]
Abstract
Looking for origins is so much rooted in ideology that most studies reflect opinions that fail to explore the first realistic scenarios. To be sure, trying to understand the origins of life should be based on what we know of current chemistry in the solar system and beyond. There, amino acids and very small compounds such as carbon dioxide, dihydrogen or dinitrogen and their immediate derivatives are ubiquitous. Surface-based chemical metabolism using these basic chemicals is the most likely beginning in which amino acids, coenzymes and phosphate-based small carbon molecules were built up. Nucleotides, and of course RNAs, must have come to being much later. As a consequence, the key question to account for life is to understand how chemical metabolism that began with amino acids progressively shaped into a coding process involving RNAs. Here I explore the role of building up complementarity rules as the first information-based process that allowed for the genetic code to emerge, after RNAs were substituted to surfaces to carry over the basic metabolic pathways that drive the pursuit of life.
Affiliation(s)
- Antoine Danchin
- Institute of Cardiometabolism and Nutrition, Hôpital de la Pitié-Salpêtrière, 47 Boulevard de l'Hôpital, 75013, Paris, France
3
Vinogradov AE. Nucleotypic effect in homeotherms: body-mass-corrected basal metabolic rate of mammals is related to genome size. Evolution 1995; 49:1249-1259. [DOI: 10.1111/j.1558-5646.1995.tb04451.x] [Citation(s) in RCA: 74] [Impact Index Per Article: 10.6] [Received: 11/08/1993] [Accepted: 07/22/1994] [Indexed: 11/30/2022]
Affiliation(s)
- Alexander E. Vinogradov
- Institute of Cytology, Russian Academy of Sciences; Tikhoretsky Avenue 4 St. Petersburg 194064 Russia
4
Gallach M. 1.688 g/cm³ satellite-related repeats: a missing link to dosage compensation and speciation. Mol Ecol 2015. [DOI: 10.1111/mec.13335] [Citation(s) in RCA: 10] [Impact Index Per Article: 1.1] [Indexed: 12/27/2022]
Affiliation(s)
- Miguel Gallach
- Center for Integrative Bioinformatics Vienna (CIBIV); Max F Perutz Laboratories; University of Vienna and Medical University of Vienna; Campus Vienna Biocenter 5 A-1030 Vienna Austria
5
Suárez-Díaz E, García-Deister V. That 70s show: regulation, evolution and development beyond molecular genetics. Hist Philos Life Sci 2015; 36:503-524. [PMID: 26013314 DOI: 10.1007/s40656-014-0051-9] [Citation(s) in RCA: 1] [Impact Index Per Article: 0.1] [Received: 02/04/2014] [Accepted: 08/04/2014] [Indexed: 06/04/2023]
Abstract
This paper argues that the "long 1970s" (1969-1983) is an important though often overlooked period in the development of a rich landscape in the research of metabolism, development, and evolution. The period is marked by: shrinking public funding of basic science, shifting research agendas in molecular biology, the incorporation of new phenomena and experimental tools from previous biological research at the molecular level, and the development of recombinant DNA techniques. Research was reoriented towards eukaryotic cells and development, and in particular towards "giant" RNA processing and transcription. We will here focus on three different models of developmental regulation published in that period: the two models of eukaryotic genetic regulation at the transcriptional level that were developed by Georgii P. Georgiev on the one hand, and by Roy Britten and Eric Davidson on the other; and the model of genetic sufficiency and evolution of regulatory genes proposed by Emile Zuckerkandl. These three cases illustrate the range of exploratory hypotheses that characterised the challenging landscape of gene regulation in the 1970s, a period that in hindsight can be labelled as transitional, between the biology at the laboratory bench of the preceding period, and the biology of genetic engineering and intensive data-driven research that followed.
6
Scherrer K. Regulation of gene expression and the transcription factor cycle hypothesis. Biochimie 2012; 94:1057-68. [PMID: 22234303 DOI: 10.1016/j.biochi.2011.12.010] [Citation(s) in RCA: 5] [Impact Index Per Article: 0.4] [Received: 09/26/2011] [Accepted: 12/09/2011] [Indexed: 11/26/2022]
Abstract
Post-genomic data show unexpected extent of the transcribed genome and the size of individual primary transcripts. Hence, most cis-regulatory modules (CRMs) binding transcription factors (TFs) at promoter, enhancer and other sites are actually transcribed within full domain transcripts (FDTs). The ensemble of these CRMs placed way upstream of exon clusters, downstream and in intronic or intergenic positions represent a program of gene expression which has been formally analysed within the Gene and Genon concept [1,2]. This concept has emphasised the necessity to separate product information from regulative information to allow information-theoretic analysis of gene expression. Classically, TFs have been assumed to act at DNA level exclusively but evidence has accumulated indicating eventual post-transcriptional functions. The transcription factor cycle (TFC) hypothesis suggests the transfer of DNA-bound factors to nascent RNA. Exerting downstream functions in RNA processing and transport, these factors would be liberated by RNA processing and cycle back to the DNA maintaining active transcription. Sequestered on RNA in absence of processing they would constitute a negative feedback loop. The TFC concept may explain epigenetic regulation in mitosis and meiosis. In mitosis control factors may survive as single proteins but also attached to FDTs as organised complexes. This process might perpetuate in cell division conditioning of chromatin for transcription. As observed on lampbrush chromosomes formed in avian and amphibian oogenesis, in meiosis the genome is fully transcribed and oocytes conserve high Mr RNA of high sequence complexity. When new interphase chromosomes form in daughter cells and early embryogenesis, TFs and other factors attached to RNA might be reinserted onto the DNA.
Affiliation(s)
- Klaus Scherrer
- Inst. J. Monod, CNRS and University Paris Diderot, 9, rue Larrey, 75005 Paris, France
7
8
Bhadury P, Song B, Ward BB. Intron features of key functional genes mediating nitrogen metabolism in marine phytoplankton. Mar Genomics 2011; 4:207-13. [DOI: 10.1016/j.margen.2011.06.002] [Citation(s) in RCA: 6] [Impact Index Per Article: 0.5] [Received: 10/21/2010] [Revised: 05/31/2011] [Accepted: 06/04/2011] [Indexed: 10/18/2022]
9
Kumar RP, Senthilkumar R, Singh V, Mishra RK. Repeat performance: how do genome packaging and regulation depend on simple sequence repeats? Bioessays 2010; 32:165-74. [PMID: 20091758 DOI: 10.1002/bies.200900111] [Citation(s) in RCA: 19] [Impact Index Per Article: 1.4] [Indexed: 12/18/2022]
Abstract
Non-coding DNA has consistently increased during evolution of higher eukaryotes. Since the number of genes has remained relatively static during the evolution of complex organisms, it is believed that increased degree of sophisticated regulation of genes has contributed to the increased complexity. A higher proportion of non-coding DNA, including repeats, is likely to provide more complex regulatory potential. Here, we propose that repeats play a regulatory role by contributing to the packaging of the genome during cellular differentiation. Repeats, and in particular the simple sequence repeats, are proposed to serve as landmarks that can target regulatory mechanisms to a large number of genomic sites with the help of very few factors and regulate the linked loci in a coordinated manner. Repeats may, therefore, function as common target sites for regulatory mechanisms involved in the packaging and dynamic compartmentalization of the chromatin into active and inactive regions during cellular differentiation.
Affiliation(s)
- Ram Parikshan Kumar
- Centre for Cellular and Molecular Biology, Uppal Road, Hyderabad 500 007, India
10
Abstract
In the lab, the cis-regulatory network seems to exhibit great functional redundancy. Many experiments testing enhancer activity of neighboring cis-regulatory elements show largely overlapping expression domains. Of recent interest, mice in which cis-regulatory ultraconserved elements were knocked out showed no obvious phenotype, further suggesting functional redundancy. Here, we present a global evolutionary analysis of mammalian conserved nonexonic elements (CNEs), and find strong evidence to the contrary. Given a set of CNEs conserved between several mammals, we characterize functional dispensability as the propensity for the ancestral element to be lost in mammalian species internal to the spanned species tree. We show that ultraconserved-like elements are over 300-fold less likely than neutral DNA to have been lost during rodent evolution. In fact, many thousands of noncoding loci under purifying selection display near uniform indispensability during mammalian evolution, largely irrespective of nucleotide conservation level. These findings suggest that many genomic noncoding elements possess functions that contribute noticeably to organism fitness in naturally evolving populations.
Affiliation(s)
- Cory McLean
- Department of Computer Science, Stanford University, Stanford, California 94305, USA
11
Abstract
While less than 1.5% of the mammalian genome encodes proteins, it is now evident that the vast majority is transcribed, mainly into non-protein-coding RNAs. This raises the question of what fraction of the genome is functional, i.e., composed of sequences that yield functional products, are required for the expression (regulation or processing) of these products, or are required for chromosome replication and maintenance. Many of the observed noncoding transcripts are differentially expressed, and, while most have not yet been studied, increasing numbers are being shown to be functional and/or trafficked to specific subcellular locations, as well as exhibit subtle evidence of selection. On the other hand, analyses of conservation patterns indicate that only approximately 5% (3%-8%) of the human genome is under purifying selection for functions common to mammals. However, these estimates rely on the assumption that reference sequences (usually ancient transposon-derived sequences) have evolved neutrally, which may not be the case, and if so would lead to an underestimate of the fraction of the genome under evolutionary constraint. These analyses also do not detect functional sequences that are evolving rapidly and/or have acquired lineage-specific functions. Indeed, many regulatory sequences and known functional noncoding RNAs, including many microRNAs, are not conserved over significant evolutionary distances, and recent evidence from the ENCODE project suggests that many functional elements show no detectable level of sequence constraint. Thus, it is likely that much more than 5% of the genome encodes functional information, and although the upper bound is unknown, it may be considerably higher than currently thought.
Affiliation(s)
- Michael Pheasant
- ARC Special Research Centre for Functional and Applied Genomics, Institute for Molecular Bioscience, University of Queensland, St Lucia, Queensland 4072, Australia
12
Grechko VV, Ciobanu DG, Darevsky IS, Kosushkin SA, Kramerov DA. Molecular evolution of satellite DNA repeats and speciation of lizards of the genus Darevskia (Sauria: Lacertidae). Genome 2007; 49:1297-307. [PMID: 17213912 DOI: 10.1139/g06-089] [Citation(s) in RCA: 16] [Impact Index Per Article: 0.9] [Indexed: 11/22/2022]
Abstract
Satellite DNA repeats were studied in Caucasian populations of 18 rock lizard species of the genus Darevskia. Four subfamilies (Caucasian Lacerta satellites (CLsat)I-IV) were identified, which shared 70%-75% sequence similarity. The distribution of CLsat subfamilies among the species was studied. All the species could be divided into at least 3 clades, depending on the content of CLsat subfamilies in each genome: "saxicola", "rudis", and "mixta" lizards. CLsatI was found in all studied species, but in very different quantities; the "saxicola" group contained this subfamily predominantly. The "rudis" group also contained CLsatIII, and the "mixta" group carried considerable amounts of CLsatII. The highest concentrations of CLsatI and CLsatII were detected in 2 ground lizards--D. derjugini and D. praticola, respectively. D. parvula predominantly carried CLsatIII. CLsatIV was found only in the Crimean species D. lindholmi. The distribution patterns of satellite subfamilies show possible postglacial speciation within the genus Darevskia. A hybrid origin of species that possess 2 or 3 CLsat subfamilies and important clarifications to the systematics of the genus are proposed.
Affiliation(s)
- Vernata V Grechko
- Engelhardt Institute of Molecular Biology, Russian Academy of Sciences, Moscow, 119991, Vavilov st, 32, Russia.
13
Abstract
Research into the origins of introns is at a critical juncture in the resolution of theories on the evolution of early life (which came first, RNA or DNA?), the identity of LUCA (the last universal common ancestor, was it prokaryotic- or eukaryotic-like?), and the significance of noncoding nucleotide variation. One early notion was that introns would have evolved as a component of an efficient mechanism for the origin of genes. But alternative theories emerged as well. From the debate between the "introns-early" and "introns-late" theories came the proposal that introns arose before the origin of genetically encoded proteins and DNA, and the more recent "introns-first" theory, which postulates the presence of introns at that early evolutionary stage from a reconstruction of the "RNA world." Here we review seminal and recent ideas about intron origins. Recent discoveries about the patterns and causes of intron evolution make this one of the most hotly debated and exciting topics in molecular evolutionary biology today.
Affiliation(s)
- Francisco Rodríguez-Trelles
- Department of Ecology and Evolutionary Biology, University of California, Irvine, California 92697-2525, USA.
14
Efficient inefficiency: biochemical "junk" may represent molecular bridesmaids awaiting emergent function as a buffer against environmental fluctuation. Med Hypotheses 2006; 67:914-21. [PMID: 16581198 DOI: 10.1016/j.mehy.2006.02.022] [Citation(s) in RCA: 2] [Impact Index Per Article: 0.1] [Received: 01/03/2006] [Accepted: 02/01/2006] [Indexed: 10/24/2022]
Abstract
The biochemical function of many parts of the genome, transcriptome, proteome, and interactome remain largely unknown. We propose that portions of these fundamental building blocks of life have no current biochemical function per se. Rather, sections of these "omes" may contribute to an inventory of biochemical parts and circuits that participate in the development of emergent functions. Low fidelity deoxyribonucleic acid replication, transcription, translation, and post-translational modification all represent potential mechanisms to produce an inventory of parts. Stochastic processes that influence the conformations of ribonucleic acid molecules and proteins may also contribute to potential biochemical inventory. Some components of the biochemical inventory may enable future adaptations, some may produce disease, and some may remain useless. The function of many of these components await discovery, not by science, but by evolution. While carrying such purposeless biochemical units may appear to dilute fitness by exacting a thermodynamic cost, we argue that net fitness becomes enhanced when considering the value for potential future innovations. One can envision components that intermingle, interact, and act out mock pathways, but in most cases remain molecular bridesmaids. Given sufficiently low thermodynamic cost, such stochastic cycling may persist until a markedly advantageous or cataclysmically disadvantageous trait emerges. Maladaptive screening and utilization of inventory content can lead to disease phenotypes, a process buffered and regulated in part by the heat shock protein and stress response network. Whereas failure of the ubiquitin pathway to recycle misfolded proteins has become increasingly recognized as a source of disease, protein misfolding may itself represent one step in a process that maximizes functional innovation through increasing proteomic diversity. 
Fractal correlates of these processes occur at the organizational level of cells and organisms. That the abnormal accumulation of units induces local collapse may serve to limit the extension of damage to the greater system at large. The immune and cognitive systems that selectively sample and prune environmental content may serve as additional portals for innovation.
15
Nikolaou C, Almirantis Y. “Word” Preference in the Genomic Text and Genome Evolution: Different Modes of n-tuplet Usage in Coding and Noncoding Sequences. J Mol Evol 2005; 61:23-35. [PMID: 16059753 DOI: 10.1007/s00239-004-0209-2] [Citation(s) in RCA: 5] [Impact Index Per Article: 0.3] [Received: 07/07/2004] [Accepted: 02/02/2005] [Indexed: 10/25/2022]
Abstract
Extensive work on n-tuplet occurrence in genomic sequences has revealed the correlation of their usage with sequence origin. Parallel to that, there exist different restrictions in the nucleotide composition of coding and noncoding sequences that may result in distinct modes of usage of n-tuplets. The relatively simple approaches described herein focus on such differences. They are based on simple summation measures of n-tuplet frequencies, computed after filtering the background nucleotide composition. Among the main targets of this work is to draw some conclusions on the qualitative differences in the composition of genomic sequences depending on their functionality. Moreover, an evolutionary model is formulated, including simple forms of ubiquitous events of genome dynamics: genomic fusions, genome shuffling due to transpositions, replication slippage, and point mutations. This model is shown to be able to reproduce all the statistical features of genomic sequences discussed herein.
Affiliation(s)
- Christoforos Nikolaou
- Institute of Biology, National Research Center for Physical Sciences "Demokritos", 15310 Athens, Greece
16
Nikolaou C, Almirantis Y. A study of the middle-scale nucleotide clustering in DNA sequences of various origin and functionality, by means of a method based on a modified standard deviation. J Theor Biol 2002; 217:479-92. [PMID: 12234754 DOI: 10.1006/jtbi.2002.3045] [Citation(s) in RCA: 9] [Impact Index Per Article: 0.4] [Indexed: 11/22/2022]
Abstract
The deviation from randomness in the distribution of nucleotides in genomic sequences is quantified and studied, using a modified standard deviation (MSD). This method implies a "per block" computation of the standard deviation of the nucleotide frequencies of occurrence, using local means (means taken in a neighborhood of each block). This quantity may serve as a scale-dependent measure of the nucleotide clustering. In the present work, the meso-scale of tens of nucleotides is principally explored, by means of suitably adjusted filter parameters. This length scale is of an order of magnitude not directly affected by the grammar and syntax rules of the protein-coding procedure, remaining shorter than the scale of appearance of large-scale characteristics of the genome. MSD has been found to distinguish systematically between the sequences of different origin and functionality. The most near-random are found to be coding sequences of prokaryotes, while in intronic and intergenic regions of eukaryotic genomes, extended clustering of similar nucleotides is observed. The distributions of MSD values of large collections of sequences are found to be in most cases characteristic of their biological role and origin. Protein- and non-coding, prokaryotic and eukaryotic DNA as well as promoter, rRNA, viral and organelle sequences have been examined. The presented results corroborate a recently proposed model for genome evolution. The method is also applied for an assessment of the annotation of ORFs taken from the complete genome of Saccharomyces cerevisiae.
Affiliation(s)
- Christoforos Nikolaou
- Institute of Biology, National Research Center for Physical Sciences "Demokritos", 15310 Athens, Greece
17
Abstract
Within-intron difference of correlation with base composition of the adjacent exons was studied in the genomes of 34 species. For this purpose, GC-percent was determined for segments of 50 bp in length taken at both intron margins and in the internal part of the intron. It was found that in certain genomes the coefficient of correlation with GC-percent of the adjacent exon was significantly higher for the intron margin than for the internal part of the intron (homeotherms, cereals). Only part of this difference can be explained by unequal probability of insertion of transposable elements. Those multicellular organisms which have a low or no within-intron difference in correlation with the adjacent exons (anamniotes, invertebrates, dicots) show a higher local compositional heterogeneity (a greater exon/intron contrast in the GC-content). These results are evidence against the mutational bias being a possible explanation for the compositional genome heterogeneity. Thus, in the genomes with a high global heterogeneity there seems to be a selective force for compliance of intron base composition with the adjacent exons. This force is stronger in those parts of the intron that are closer to exons. In addition, the previously found positive general correlation between the genome size and average intron length was confirmed with a much larger dataset. However, within separate phylogenetic groups this rule can be broken, as it occurs in the cereals (family Poaceae), where a negative correlation was found.
Affiliation(s)
- A E Vinogradov
- Institute of Cytology, Russian Academy of Sciences, Tikhoretsky Avenue 4, 194064, St. Petersburg, Russia.
18
Brosius J. RNAs from all categories generate retrosequences that may be exapted as novel genes or regulatory elements. Gene 1999; 238:115-34. [PMID: 10570990 DOI: 10.1016/s0378-1119(99)00227-9] [Citation(s) in RCA: 275] [Impact Index Per Article: 11.0] [Indexed: 11/21/2022]
Abstract
While the significance of middle repetitive elements had been neglected for a long time, there are again tendencies to ascribe most members of a given middle repetitive sequence family a functional role--as if the discussion of SINE (short interspersed repetitive elements) function only can occupy extreme positions. In this article, I argue that differences between the various classes of retrosequences concern mainly their copy numbers. Consequently, the function of SINEs should be viewed as pragmatic such as, for example, mRNA-derived retrosequences, without underestimating the impact of retroposition for generation of novel protein coding genes or parts thereof (exon shuffling by retroposition) and in particular of SINEs (and retroelements) in modulating genes and their expression. Rapid genomic change by accumulating retrosequences may even facilitate speciation [McDonald, J.F., 1995. Transposable elements: possible catalysts of organismic evolution. Trends Ecol. Evol. 10, 123-126.] In addition to providing mobile regulatory elements, small RNA-derived retrosequences including SINEs can, in analogy to mRNA-derived retrosequences, also give rise to novel small RNA genes. Perhaps not representative for all SINE/master gene relationships, we gained significant knowledge by studying the small neuronal non-messenger RNAs, namely BC1 RNA in rodents and BC200 RNA in primates. BC1 is the first identified master gene generating a subclass of ID repetitive elements, and BC200 is the only known Alu element (monomeric) that was exapted as a novel small RNA encoding gene.
Affiliation(s)
- J Brosius
- Institute of Experimental Pathology/Molecular Neurobiology, ZMBE, University of Münster, Germany.
19
Pons J, Bruvo B, Juan C, Petitpierre E, Plohl M, Ugarković D. Conservation of satellite DNA in species of the genus Pimelia (Tenebrionidae, Coleoptera). Gene 1997; 205:183-90. [PMID: 9461393 DOI: 10.1016/s0378-1119(97)00402-2] [Citation(s) in RCA: 23] [Impact Index Per Article: 0.9] [Indexed: 02/06/2023]
Abstract
Satellite DNA has been characterized in six allopatric species from the genus Pimelia: P. interjecta, P. integra, P. variolosa and P. baetica, inhabiting Iberian Peninsula, and P. elevata and P. criba, endemic to Balearic Islands Ibiza and Mallorca, respectively. All species show the presence of a single satellite DNA of a basic monomer length of 357 bp and A+T content of 69%, comprising a considerable amount of the genome (39%-45%, corresponding to about 4.5 × 10^5 copies per haploid genome). The sequence analysis of 22 cloned repeats reveals very high intra- and interspecific sequence similarity. Phylogenetic analysis separates the satellite sequences into two clusters, each comprising clones from three species exclusively. Within the clusters, satellite clones are not grouped species-specifically, except those of P. integra where species-diagnostic nt substitutions are detected with a pattern that could be produced by gene conversion. Such high sequence conservation could be related to preservation of satellite DNA curvature, resulting in a higher order helical structure, proposed to act as a specific protein binding domain.
Affiliation(s)
- J Pons
- Departament de Biologia Ambiental, Universitat de les Illes Balears, Palma de Mallorca, Spain
20
Abstract
Transcriptional repression in eukaryotes often involves tens or hundreds of kilobase pairs, two to three orders of magnitude more than the bacterial operator/repressor model does. Classical repression, represented by this model, was maintained over the whole span of evolution under different guises, and consists of repressor factors interacting primarily with promoters and, in later evolution, also with enhancers. The use of much larger amounts of DNA in the other mode of repression, here called the sectorial mode ('superrepression'), results in the conceptual transfer of so-called junk DNA to the domain of functional DNA. This contribution to the solution of the c-value paradox involves perhaps 15% of genomic 'junk,' and encompasses the bulk of the introns, thought to fill a stabilizing role in sectorially repressed chromatin structures. In the case of developmental genes, such structures appear to be heterochromatoid in character. However, solid clues regarding general structural features of superrepressed terminal differentiation genes remain elusive. The competition among superrepressible DNA sectors for sectorially binding factors offers, in principle, a molecular mechanism for developmental switches. Position effect variegation may be considered an abnormal manifestation of normal processes that underlie development and involve heterochromatoid sectorial repression, which is apparently required for local elimination or modulation of morphological features (morpholysis). Sectorial repression of genes participating either in development or in terminal differentiation is considered instrumental in establishing stable cell types, and provides a basis for the distinction between determination and cell type specification. The gamut of possible stable cell types may have been broadened by the appearance in evolution of heavy isochores.
Additional types of relatively frequent GC-rich cis-acting DNA motifs may offer reiterated binding sites to factors endowed with a selective (though not individually strong) affinity for these motifs. The majority of sequence motifs thought to be used in superrepression need not be individually maintained by natural selection. It is re-emphasized that the dispensability of sequences is not an indicator of their nonfunctionality and that in many cases, along noncoding sequences, nucleotides tend to fill functions collectively, rather than individually.
Affiliation(s)
- E Zuckerkandl
- Institute of Molecular Medical Sciences, Palo Alto, CA 94306, USA
|
21
|
Losada A, Villasante A. Autosomal location of a new subtype of 1.688 satellite DNA of Drosophila melanogaster. Chromosome Res 1996; 4:372-83. [PMID: 8871826 DOI: 10.1007/bf02257273] [Citation(s) in RCA: 23] [Impact Index Per Article: 0.8] [Indexed: 02/02/2023]
Abstract
During the screening of a Drosophila melanogaster YAC library with DNA from the minichromosome Dp(1;f)1187 we isolated a clone, yw20D5, which contains a new subtype of 1.688 satellite DNA. Although the sequences of several monomers subcloned from the YAC show a considerable variation in length, the derived consensus sequence is 356-bp long. This new subtype and the one constituted by the 353-bp repeats are both located on the left arm heterochromatin of chromosome 3, arranged in separate arrays. Despite their autosomal location, phylogenetic relationships among 1.688 satellite sequences suggest that they may have originated from the 359-bp repeats of the X chromosome heterochromatin. We have used the new 356-bp repeats to investigate whether sequences related to the 1.688 satellite are dispersed along the euchromatic arms of the autosomes in a similar way to that in which they are found along the X chromosome euchromatin.
Affiliation(s)
- A Losada
- Centro de Biología Molecular Severo Ochoa (CSIC-UAM)
|
22
|
von Sternberg R. The role of constrained self-organization in genome structural evolution. Acta Biotheor 1996; 44:95-118. [PMID: 9028019 DOI: 10.1007/bf00048418] [Citation(s) in RCA: 4] [Impact Index Per Article: 0.1] [Indexed: 02/03/2023]
Abstract
A hypothesis of genome structural evolution is explored. Rapid and cohesive alterations in genome organization are viewed as resulting from the dynamic and constrained interactions of chromosomal subsystem components. A combination of macromolecular boundary conditions and DNA element involvement in far-from-equilibrium reactions is proposed to increase the complexity of genomic subsystems via the channelling of genome turnover; interactions between subsystems create higher-order subsystems expanding the phase space for further genetic evolution. The operation of generic constraints on structuration in genome evolution is suggested by i) universal, homoplasic features of chromosome organization and ii) the metastable nature of genome structures where lower-level flux is constrained by higher-order structures. Phenomena such as 'genomic shock', bursts of transposable element activity, concerted evolution, etc., are hypothesized to result from constrained systemic responses to endogenous/exogenous, micro/macro perturbations. The constraints operating on genome turnover are expected to increase with chromosomal structural complexity, the number of interacting subsystems, and the degree to which interactions between genomic components are tightly ordered.
Affiliation(s)
- R von Sternberg
- Center for Intelligent Systems, T.J. Watson School, State University of New York at Binghamton 13902, USA
|
23
|
Abstract
The Second International Workshop on Drosophila Heterochromatin, held in Honolulu from January 4-7, 1995, brought together about 70 scientists from the US, Canada, Germany, Italy, Russia, and the Netherlands. After the first of these international meetings, five years ago, Mary Lou Pardue and Wolfgang Hennig, in these columns, commented on its proceedings, and on heterochromatin in general. Although the questions that they raised cannot yet be answered exhaustively, important and sometimes surprising new observations have been made, some previously tentative answers have been firmed up, and some theoretical views underwent significant shifts. We wish to reflect here a few of the data presented at the second workshop, and express some thoughts suggested to us by these recent findings.
Affiliation(s)
- E Zuckerkandl
- Institute of Molecular Medical Sciences, 460 Page Mill Road, Palo Alto, CA 94306, USA
|
24
|
López-León MD, Vázquez P, Hewitt GM, Camacho JP. Cloning and sequence analysis of an extremely homogeneous tandemly repeated DNA in the grasshopper Eyprepocnemis plorans. Heredity (Edinb) 1995; 75 (Pt 4):370-5. [PMID: 7591833 DOI: 10.1038/hdy.1995.148] [Citation(s) in RCA: 22] [Impact Index Per Article: 0.8] [Indexed: 01/26/2023]
Abstract
Digestion of total nuclear DNA of the grasshopper Eyprepocnemis plorans with seven different restriction endonucleases (REs), and subsequent agarose gel electrophoresis, has shown the presence of highly repetitive DNA yielding the typical ladder-like banding pattern. The clearest pattern was produced by DraI, the monomer being some 180 bp. This repeat unit was subsequently cloned and sequenced. Bidirectional sequencing of five randomly chosen clones showed exactly the same nucleotides in all 180 positions. The possible explanations for such an extreme homogeneity of this tandem repeat are discussed in the light of current hypotheses on repetitive DNA function and molecular drive mechanisms.
Affiliation(s)
- M D López-León
- Departamento de Genética, Facultad de Ciencias, Universidad de Granada, Spain
|
25
|
Abstract
Reverse transcription has been an important mediator of genomic change. This influence dates back more than three billion years, when the RNA genome was converted into the DNA genome. While the current cellular role(s) of reverse transcriptase are not yet completely understood, it has become clear over the last few years that this enzyme is still responsible for generating significant genomic change and that its activities are one of the driving forces of evolution. Reverse transcriptase generates, for example, extra gene copies (retrogenes), using as a template mature messenger RNAs. Such retrogenes do not always end up as nonfunctional pseudogenes but form, after reinsertion into the genome, new unions with resident promoter elements that may alter the gene's temporal and/or spatial expression levels. More frequently, reverse transcriptase produces copies of nonmessenger RNAs, such as small nuclear or cytoplasmic RNAs. Extremely high copy numbers can be generated by this process. The resulting reinserted DNA copies are therefore referred to as short interspersed repetitive elements (SINEs). SINEs have long been considered selfish DNA, littering the genome via exponential propagation but not contributing to the host's fitness. Many SINEs, however, can give rise to novel genes encoding small RNAs, and are the migrant carriers of numerous control elements and sequence motifs that can equip resident genes with novel regulatory elements [Brosius J. and Gould S.J., Proc Natl Acad Sci USA 89, 10706-10710, 1992]. Retrosequences, such as SINEs and portions of retroelements (e.g., long terminal repeats, LTRs), are capable of donating sequence motifs for nucleosome positioning, DNA methylation, transcriptional enhancers and silencers, poly(A) addition sequences, determinants of RNA stability or transport, splice sites, and even amino acid codons for incorporation into open reading frames as novel protein domains. 
Retroposition can therefore be considered as a major pacemaker for evolution (including speciation). Retroposons, with their unique properties and actions, form the molecular basis of important evolutionary concepts, such as exaptation [Gould S.J. and Vrba E., Paleobiology 8, 4-15, 1982] and punctuated equilibrium [Eldredge N. and Gould S.J. in Schopf T.J.M. (ed). Models in Paleobiology. Freeman, Cooper, San Francisco, 1972, pp. 82-115].
Affiliation(s)
- J Brosius
- Institute for Experimental Pathology, ZMBE University of Münster, Germany.
|
26
|
Haviland MB, Kessling AM, Davignon J, Sing CF. Cladistic analysis of the apolipoprotein AI-CIII-AIV gene cluster using a healthy French Canadian sample. I. Haploid analysis. Ann Hum Genet 1995; 59:211-31. [PMID: 7625767 DOI: 10.1111/j.1469-1809.1995.tb00742.x] [Citation(s) in RCA: 11] [Impact Index Per Article: 0.4] [Indexed: 01/26/2023]
Abstract
A cladistic analysis was carried out to identify haplotypes hypothesized to differ for functional DNA sequence variations within the apolipoprotein (apo) AI-CIII-AIV gene cluster that affect plasma lipid, lipoprotein and apolipoprotein levels. A sample of unrelated healthy French Canadians was studied. First, a cladogram of the observed apo AI-CIII-AIV haplotypes was estimated. Then this cladogram was used to define a statistical analysis of the association between haplotype variation and variation in plasma lipid, lipoprotein and apolipoprotein levels. Three haplotypes were identified which were associated with small (5-12% of the total sum of squares) pleiotropic effects on plasma lipid, lipoprotein and apolipoprotein traits and these effects were context, i.e. gender, dependent.
Affiliation(s)
- M B Haviland
- Department of Human Genetics, University of Michigan, Ann Arbor 48109-0618, USA
|
27
|
Toman PD, de Crombrugghe B. The mouse type-III procollagen-encoding gene: genomic cloning and complete DNA sequence. Gene 1994; 147:161-8. [PMID: 7926795 DOI: 10.1016/0378-1119(94)90061-2] [Citation(s) in RCA: 13] [Impact Index Per Article: 0.4] [Indexed: 01/27/2023]
Abstract
Overlapping cosmid clones were isolated that covered the entire mouse type-III collagen-encoding gene (mCol3) locus including flanking sequences approximately 40 kb upstream and 20 kb downstream from the gene. This gene was characterized initially by restriction mapping and then followed by sequencing of 43.6 kb, including 5 kb upstream from the transcription start point (tsp) and all exons and introns of the entire gene. The optimal parameters for sequencing a gene of this size were determined by sequencing 5-10-kb fragments at different ratios of random and directed sequencing, and comparing their efficiency. Based on our experience for sequencing mCol3, we have estimated that the most cost-efficient method was to achieve a twofold redundancy in sequencing by using random DNA subclones as templates for sequencing prior to initiating directed DNA sequencing to close the gaps between contiguous regions. mCol3 spans 37.6 kb from the tsp to the single polyadenylation site and contains 51 exons. The overall structure of mCol3 is similar to that of other members of the fibrillar collagen-encoding gene family. Several repetitive elements were located within the gene boundaries. Based on the nucleotide (nt) sequence, the predicted sizes of the mouse type-III collagen (mCOL3) mRNA and polypeptide are 4767 nt and 1464 amino acids (aa), respectively. A comparison of mCOL3 versus the human type-III collagen (hCOL3) showed 91% identity at the aa level.
Affiliation(s)
- P D Toman
- Department of Molecular Genetics, University of Texas M.D. Anderson Cancer Center, Houston 77030
|
28
|
Potapenko AI, Khudolii GA, Akif'ev AP. The conserved fraction of repetitive sequences in the human and mammalian genome in health, in pathology, and for long-term mutagenic exposure. Bull Exp Biol Med 1994. [DOI: 10.1007/bf02444090] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 11/30/2022]
|
29
|
Abstract
Traditionally, many people doing research in molecular biology attribute coding properties to a given DNA sequence if this sequence contains an open reading frame for translation into a sequence of amino acids. This protein coding capability of DNA was detected about 30 years ago. The underlying genetic code is highly conserved and present in every biological species studied so far. Today, it is obvious that DNA has a much larger coding potential for other important tasks. Apart from coding for specific RNA molecules such as rRNA, snRNA and tRNA molecules, specific structural and sequence patterns of the DNA chain itself express distinct codes for the regulation and expression of its genetic activity. A chromatin code has been defined for phasing of the histone-octamer protein complex in the nucleosome. A translation frame code has been shown to exist that determines correct triplet counting at the ribosome during protein synthesis. A loop code seems to organize the single stranded interaction of the nascent RNA chain with proteins during the splicing process, and a splicing code phases successive 5' and 3' splicing sites. Most of these DNA codes are not exclusively based on the primary DNA sequence itself, but also seem to include specific features of the corresponding higher order structures. Based on the view that these various DNA codes are genetically instructive for specific molecular interactions or processes, important in the nucleus during interphase and during cell division, the coding capability of tandem repetitive DNA sequences has recently been reconsidered.
Affiliation(s)
- P Vogt
- Section Molecular Human Genetics, University of Heidelberg, Federal Republic of Germany
|
30
|
Abstract
Some evolutionary consequences of different rates and trends in DNA damage and repair are explained. Different types of DNA damaging agents cause nonrandom lesions along the DNA. The type of DNA sequence motifs to be preferentially attacked depends upon the chemical or physical nature of the assaulting agent and the DNA base composition. Higher-order chromatin structure, the nonrandom nucleosome positioning along the DNA, the absence of nucleosomes from the promoter regions of active genes, curved DNA, the presence of sequence-specific binding proteins, and the torsional strain on the DNA induced by an increased transcriptional activity all are expected to affect rates of damage of individual genes. Furthermore, potential Z-DNA, H-DNA, slippage, and cruciform structures in the regulatory region of some genes or in other genomic loci induced by torsional strain on the DNA are more prone to modification by genotoxic agents. A specific actively transcribed gene may be preferentially damaged over nontranscribed genes only in specific cell types that maintain this gene in active chromatin fractions because of (1) its decondensed chromatin structure, (2) torsional strain in its DNA, (3) absence of nucleosomes from its regulatory region, and (4) altered nucleosome structure in its coding sequence due to the presence of modified histones and HMG proteins. The situation in this regard of germ cell lineages is, of course, the only one to intervene in evolution. Most lesions in DNA such as those caused by UV or DNA alkylating agents tend to diminish the GC content of genomes. Thus, DNA sequences not bound by selective constraints, such as pseudogenes, will show an increase in their AT content during evolution as evidenced by experimental observations. On the other hand, transcriptionally active parts may be repaired at rates higher than inactive parts of the genome, and proliferating cells may display higher repair activities than quiescent cells. 
This might arise from a tight coupling of the repair process with both transcription and replication, all these processes taking place on the nuclear matrix. Repair activities differ greatly among species, and there is a good correlation between life span and repair among mammals. It is predicted that genes that are transcriptionally active in germ-cell lineages have a lower mutation rate than bulk DNA, a circumstance that is expected to be reflected in evolution. Exceptions to this rule might be genes containing potential Z-DNA, H-DNA, or cruciform structures in their coding or regulatory regions that appear to be refractory to repair. (ABSTRACT TRUNCATED AT 400 WORDS)
Affiliation(s)
- T Boulikas
- Linus Pauling Institute of Science and Medicine, Palo Alto, CA
|