1
Boi L. A reappraisal of the form: function problem-theory and phenomenology. Theory Biosci 2022; 141:73-103. [PMID: 35471494 DOI: 10.1007/s12064-022-00368-8] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 08/03/2021] [Accepted: 03/11/2022] [Indexed: 11/26/2022]
Abstract
This paper aims to demonstrate that some geometrical and topological transformations and operations serve not only as promoters of many specific genetic and cellular events in multicellular living organisms, but also as initiators of the organization and regulation of their functions. Thus, changes in the form and structure of macromolecular and cellular systems must be directly associated with their functions. There are specific classes of enzymes that manipulate the geometry and topology of complex DNA-protein structures, and thereby perform many important cellular processes, including segregation of daughter chromosomes, gene regulation, and DNA repair. We argue that form has an organizing power, hence a causal action, in the sense that it can induce functional events during different biological processes, at the supramolecular, cellular, and organismal levels of organization. Clearly, topological forms must be matched with specific kinetic and dynamical parameters to be functionally effective in living systems. This effectiveness is remarkably apparent, for example, in the regulation of genome functions and in cell activity. In more general terms, we try to show that the conformational plasticity of biological systems depends on different kinds of topological manipulations performed by specific families of enzymes. In doing so, these enzymes catalyze all those spatial and dynamical changes of biological structures that are suitable for the functions to be performed by the organism.
Affiliation(s)
- Luciano Boi
- École des Hautes Études en Sciences Sociales, Centre de Mathématiques (CAMS), 54, bd Raspail, 75006, Paris, France.
2
Abstract
The main thesis developed in this article is that the key feature of biological life is that a biological process can control and regulate other processes, and that it maintains this ability over time. This control can happen hierarchically and/or reciprocally, and it takes place in three-dimensional space. This implies that the information that a biological process has to utilize is only about the control, but not about the content of those processes. Those other processes can be vastly more complex than the controlling process itself, and in fact necessarily so. In particular, each biological process draws upon the complexity of its environment.
Affiliation(s)
- Jürgen Jost
- Max Planck Institute for Mathematics in the Sciences, Leipzig, Germany.
- Santa Fe Institute for the Sciences of Complexity, Santa Fe, New Mexico, USA.
3
Le Treut G, Képès F, Orland H. A Polymer Model for the Quantitative Reconstruction of Chromosome Architecture from HiC and GAM Data. Biophys J 2018; 115:2286-2294. [PMID: 30527448 DOI: 10.1016/j.bpj.2018.10.032] [Citation(s) in RCA: 18] [Impact Index Per Article: 3.0] [Received: 06/13/2018] [Revised: 10/03/2018] [Accepted: 10/26/2018] [Indexed: 01/03/2023] Open
Abstract
It is widely believed that the folding of the chromosome in the nucleus has a major effect on genetic expression. For example, coregulated genes in several species have been shown to colocalize in space despite being far away on the DNA sequence. In this manuscript, we present a new, to our knowledge, method to model the three-dimensional structure of the chromosome in live cells based on DNA-DNA interactions measured in high-throughput chromosome conformation capture experiments and genome architecture mapping. Our approach incorporates a polymer model and directly uses the contact probabilities measured in high-throughput chromosome conformation capture experiments and genome architecture mapping experiments rather than estimates of average distances between genomic loci. Specifically, we model the chromosome as a Gaussian polymer with harmonic interactions and extract the coupling coefficients best reproducing the experimental contact probabilities. In contrast to existing methods, we give an exact expression of the contact probabilities at thermodynamic equilibrium. The Gaussian effective model reconstructed with our method reproduces experimental contacts with high accuracy. We also show how Brownian dynamics simulations of our reconstructed Gaussian effective model can be used to study chromatin organization and possibly give some clue about its dynamics.
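As a rough illustration of the kind of calculation this abstract describes, the sketch below builds a Gaussian polymer with harmonic couplings and evaluates equilibrium contact probabilities. It assumes an energy of the form (1/2) Σ K_ij |r_i − r_j|² in units of k_B T and a hard distance threshold `xi` as the contact criterion. The function names, the toy coupling matrix, and the threshold are illustrative assumptions; the paper's own expression and its fitting procedure may differ.

```python
import math

def invert(A):
    """Invert a small square matrix by Gauss-Jordan elimination with partial pivoting."""
    n = len(A)
    M = [list(map(float, row)) + [float(i == j) for j in range(n)]
         for i, row in enumerate(A)]
    for c in range(n):
        p = max(range(c, n), key=lambda r: abs(M[r][c]))
        M[c], M[p] = M[p], M[c]
        piv = M[c][c]
        M[c] = [v / piv for v in M[c]]
        for r in range(n):
            if r != c:
                f = M[r][c]
                M[r] = [vr - f * vc for vr, vc in zip(M[r], M[c])]
    return [row[n:] for row in M]

def pair_variances(K):
    """Per-coordinate variance of r_i - r_j for couplings K (symmetric, connected)."""
    n = len(K)
    # The graph Laplacian of K is the per-coordinate precision matrix.
    lap = [[(sum(K[i]) if i == j else 0.0) - K[i][j] for j in range(n)]
           for i in range(n)]
    # Pin monomer 0 at the origin to remove the free translation mode.
    inv = invert([row[1:] for row in lap[1:]])
    cov = [[0.0] * n for _ in range(n)]
    for i in range(1, n):
        for j in range(1, n):
            cov[i][j] = inv[i - 1][j - 1]
    return [[cov[i][i] + cov[j][j] - 2.0 * cov[i][j] for j in range(n)]
            for i in range(n)]

def contact_probability(var, xi):
    """P(|r| < xi) for an isotropic 3D Gaussian with per-coordinate variance var."""
    if var <= 0.0:
        return 1.0
    s = xi / math.sqrt(var)
    return math.erf(s / math.sqrt(2.0)) - math.sqrt(2.0 / math.pi) * s * math.exp(-0.5 * s * s)

def chain_couplings(n, k):
    """Nearest-neighbour springs of stiffness k along a linear chain of n monomers."""
    K = [[0.0] * n for _ in range(n)]
    for i in range(n - 1):
        K[i][i + 1] = K[i + 1][i] = k
    return K

# Toy example: a 6-monomer chain plus one hypothetical long-range coupling.
K = chain_couplings(6, 3.0)
K[0][5] = K[5][0] = 1.5
var = pair_variances(K)
P = [[contact_probability(var[i][j], xi=1.0) for j in range(6)] for i in range(6)]
```

Fitting would then go in the opposite direction: adjust the entries of `K` until `P` matches the measured contact frequencies. That inverse step is not shown here.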
Affiliation(s)
- Guillaume Le Treut
- Department of Physics, University of California San Diego, La Jolla, California.
- François Képès
- Institute of Systems and Synthetic Biology, Genopole, CNRS, UEVE, Université Paris-Saclay, Évry, France
- Henri Orland
- Institut de Physique Théorique, CEA, CNRS-URA 2306, Gif-sur-Yvette, France; Beijing Computational Science Research Center, Beijing, China
4
From multiple pathogenicity islands to a unique organized pathogenicity archipelago. Sci Rep 2016; 6:27978. [PMID: 27302835 PMCID: PMC4908373 DOI: 10.1038/srep27978] [Citation(s) in RCA: 9] [Impact Index Per Article: 1.1] [Received: 01/22/2016] [Accepted: 05/25/2016] [Indexed: 12/24/2022] Open
Abstract
Pathogenicity islands are sets of successive genes in a genome that determine the virulence of a bacterium. In a growing number of studies, bacterial virulence appears to be determined by multiple islands scattered along the genome. This is the case in a family of seven plant pathogens and a human pathogen that, under KdgR regulation, massively secrete enzymes such as pectinases that degrade the plant cell wall. Here we show that their multiple pathogenicity islands together form a coherently organized, single “archipelago” at the genome scale. Furthermore, in half of the species, most genes encoding secreted pectinases are expressed from the same DNA strand (transcriptional co-orientation). This genome architecture favors DNA conformations that are conducive to the spatial co-localization of genes, sometimes complemented by co-orientation. As proteins tend to be synthesized close to their encoding genes in bacteria, we propose that this architecture would favor the efficient funneling of pectinases at convergent points within the cell. The underlying functional hypothesis is that this convergent funneling of the full blend of pectinases constitutes a crucial strategy for successful degradation of the plant cell wall. Altogether, our work provides a new approach to describe and predict, at the genome scale, the full virulence complement.
5
Bouyioukos C, Elati M, Képès F. Analysis tools for the interplay between genome layout and regulation. BMC Bioinformatics 2016; 17 Suppl 5:191. [PMID: 27294345 PMCID: PMC4905612 DOI: 10.1186/s12859-016-1047-0] [Citation(s) in RCA: 2] [Impact Index Per Article: 0.3] [Indexed: 11/22/2022] Open
Abstract
BACKGROUND Genome layout and gene regulation appear to be interdependent. Understanding this interdependence is key to exploring the dynamic nature of chromosome conformation and to engineering functional genomes. Evidence for non-random genome layout, defined as the relative positioning of either co-functional or co-regulated genes, stems from two main approaches. Firstly, the analysis of contiguous genome segments across species has highlighted the conservation of gene arrangement (synteny) along chromosomal regions. Secondly, the study of long-range interactions along a chromosome has emphasised regularities in the positioning of microbial genes that are co-regulated, co-expressed or evolutionarily correlated. While one-dimensional pattern analysis is a mature field, it is often powerless on biological datasets, which tend to be incomplete and partly incorrect. Moreover, there is a lack of comprehensive, user-friendly tools to systematically analyse, visualise, integrate and exploit regularities along genomes. RESULTS Here we present the Genome REgulatory and Architecture Tools SCAN (GREAT:SCAN) software for the systematic study of the interplay between genome layout and gene expression regulation. GREAT:SCAN is a collection of related and interconnected applications currently able to perform systematic analyses of genome regularities as well as to improve transcription factor binding sites (TFBS) and gene regulatory network predictions based on gene positional information. CONCLUSIONS We demonstrate the capabilities of these tools by studying, on the one hand, the regular patterns of genome layout in the major regulons of the bacterium Escherichia coli. On the other hand, we demonstrate the capabilities to improve TFBS prediction in microbes. Finally, we highlight, by visualisation of multivariate techniques, the interplay between position and sequence information for effective transcription regulation.
Affiliation(s)
- Costas Bouyioukos
- Institute of Systems and Synthetic Biology (iSSB), Genopole, CNRS, Université d’Évry Val d’Essonne, Évry, France
- Mohamed Elati
- Institute of Systems and Synthetic Biology (iSSB), Genopole, CNRS, Université d’Évry Val d’Essonne, Évry, France
- François Képès
- Institute of Systems and Synthetic Biology (iSSB), Genopole, CNRS, Université d’Évry Val d’Essonne, Évry, France
- Department of BioEngineering, Imperial College London, London, United Kingdom
6
Bouyioukos C, Bucchini F, Elati M, Képès F. GREAT: a web portal for Genome Regulatory Architecture Tools. Nucleic Acids Res 2016; 44:W77-82. [PMID: 27151196 PMCID: PMC4987929 DOI: 10.1093/nar/gkw384] [Citation(s) in RCA: 4] [Impact Index Per Article: 0.5] [Received: 02/23/2016] [Accepted: 04/26/2016] [Indexed: 11/15/2022] Open
Abstract
GREAT (Genome REgulatory Architecture Tools) is a novel web portal for tools designed to generate user-friendly and biologically useful analysis of genome architecture and regulation. The online tools of GREAT are freely accessible and compatible with essentially any operating system which runs a modern browser. GREAT is based on the analysis of genome layout -defined as the respective positioning of co-functional genes- and its relation with chromosome architecture and gene expression. GREAT tools allow users to systematically detect regular patterns along co-functional genomic features in an automatic way consisting of three individual steps and respective interactive visualizations. In addition to the complete analysis of regularities, GREAT tools enable the use of periodicity and position information for improving the prediction of transcription factor binding sites using a multi-view machine learning approach. The outcome of this integrative approach features a multivariate analysis of the interplay between the location of a gene and its regulatory sequence. GREAT results are plotted in web interactive graphs and are available for download either as individual plots, self-contained interactive pages or as machine readable tables for downstream analysis. The GREAT portal can be reached at the following URL https://absynth.issb.genopole.fr/GREAT and each individual GREAT tool is available for downloading.
Affiliation(s)
- Costas Bouyioukos
- iSSB, CNRS, Genopole, UEVE, Université Paris-Saclay, 5 rue Henri Desbruères, Évry 91030 Cedex, France
- François Bucchini
- iSSB, CNRS, Genopole, UEVE, Université Paris-Saclay, 5 rue Henri Desbruères, Évry 91030 Cedex, France
- Mohamed Elati
- iSSB, CNRS, Genopole, UEVE, Université Paris-Saclay, 5 rue Henri Desbruères, Évry 91030 Cedex, France
- François Képès
- iSSB, CNRS, Genopole, UEVE, Université Paris-Saclay, 5 rue Henri Desbruères, Évry 91030 Cedex, France
7
8
Schmidt HG, Sewitz S, Andrews SS, Lipkow K. An integrated model of transcription factor diffusion shows the importance of intersegmental transfer and quaternary protein structure for target site finding. PLoS One 2014; 9:e108575. [PMID: 25333780 PMCID: PMC4204827 DOI: 10.1371/journal.pone.0108575] [Citation(s) in RCA: 22] [Impact Index Per Article: 2.2] [Received: 06/29/2014] [Accepted: 08/30/2014] [Indexed: 11/30/2022] Open
Abstract
We present a computational model of transcription factor motion that explains both the observed rapid target finding of transcription factors, and how this motion influences protein and genome structure. Using the Smoldyn software, we modelled transcription factor motion arising from a combination of unrestricted 3D diffusion in the nucleoplasm, sliding along the DNA filament, and transferring directly between filament sections by intersegmental transfer. This presents a fine-grain picture of the way in which transcription factors find their targets two orders of magnitude faster than 3D diffusion alone allows. Eukaryotic genomes contain nucleosome free regions (NFRs) around the promoters; our model shows that the presence and size of these NFRs can be explained by their acting as antennas on which transcription factors slide to reach their targets. Additionally, our model shows that intersegmental transfer may have shaped the quaternary structure of transcription factors: sequence specific DNA binding proteins are unusually enriched in dimers and tetramers, perhaps because these allow intersegmental transfer, which accelerates target site finding. Finally, our model shows that a ‘hopping’ motion can emerge from 3D diffusion on small scales. This explains the apparently long sliding lengths that have been observed in vitro for some DNA binding proteins. Together, these results suggest that transcription factor diffusion dynamics help drive the evolution of protein and genome structure.
Affiliation(s)
- Hugo G. Schmidt
- Department of Biochemistry & Cambridge Systems Biology Centre, University of Cambridge, Cambridge, United Kingdom
- * E-mail: (HS); (KL)
- Sven Sewitz
- Department of Biochemistry & Cambridge Systems Biology Centre, University of Cambridge, Cambridge, United Kingdom
- Nuclear Dynamics Programme, The Babraham Institute, Cambridge, United Kingdom
- Steven S. Andrews
- Fred Hutchinson Cancer Research Center, Seattle, Washington, United States of America
- Karen Lipkow
- Department of Biochemistry & Cambridge Systems Biology Centre, University of Cambridge, Cambridge, United Kingdom
- Nuclear Dynamics Programme, The Babraham Institute, Cambridge, United Kingdom
- * E-mail: (HS); (KL)
9
Elati M, Nicolle R, Junier I, Fernández D, Fekih R, Font J, Képès F. PreCisIon: PREdiction of CIS-regulatory elements improved by gene's positION. Nucleic Acids Res 2012; 41:1406-15. [PMID: 23241390 PMCID: PMC3561985 DOI: 10.1093/nar/gks1286] [Citation(s) in RCA: 8] [Impact Index Per Article: 0.7] [Indexed: 01/28/2023] Open
Abstract
Conventional approaches to predict transcriptional regulatory interactions usually rely on the definition of a shared motif sequence on the target genes of a transcription factor (TF). These efforts have been frustrated by the limited availability and accuracy of TF binding site motifs, usually represented as position-specific scoring matrices, which may match large numbers of sites and produce an unreliable list of target genes. To improve the prediction of binding sites, we propose to additionally use the unrelated knowledge of the genome layout. Indeed, it has been shown that co-regulated genes tend to be either neighbors or periodically spaced along the whole chromosome. This study demonstrates that respective gene positioning carries significant information. This novel type of information is combined with traditional sequence information by a machine learning algorithm called PreCisIon. To optimize this combination, PreCisIon builds a strong gene target classifier by adaptively combining weak classifiers based on either local binding sequence or global gene position. This strategy generically paves the way to the optimized incorporation of any future advances in gene target prediction based on local sequence, genome layout or on novel criteria. With the current state of the art, PreCisIon consistently improves methods based on sequence information only. This is shown by implementing a cross-validation analysis of the 20 major TFs from two phylogenetically remote model organisms. For Bacillus subtilis and Escherichia coli, respectively, PreCisIon achieves on average an area under the receiver operating characteristic curve of 70 and 60%, a sensitivity of 80 and 70% and a specificity of 60 and 56%. The newly predicted gene targets are demonstrated to be functionally consistent with previously known targets, as assessed by analysis of Gene Ontology enrichment or of the relevant literature and databases.
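The abstract describes building a strong classifier by adaptively combining weak classifiers that look at either local sequence or global position. A minimal sketch of that idea follows, assuming generic AdaBoost with one-feature threshold stumps as a stand-in (the abstract does not say which boosting variant PreCisIon uses) and two hypothetical per-gene features, a motif score and a positional score. The toy data are invented for illustration only.

```python
import math

def stump_predict(row, feature, threshold, sign):
    """Weak classifier: vote `sign` if one feature reaches the threshold, else -sign."""
    return sign if row[feature] >= threshold else -sign

def train_stump(X, y, w):
    """Return (weighted_error, feature, threshold, sign) of the best single-feature stump."""
    best = None
    for feature in range(len(X[0])):
        for threshold in sorted({row[feature] for row in X}):
            for sign in (1, -1):
                err = sum(wi for wi, row, yi in zip(w, X, y)
                          if stump_predict(row, feature, threshold, sign) != yi)
                if best is None or err < best[0]:
                    best = (err, feature, threshold, sign)
    return best

def adaboost(X, y, rounds=3):
    """Combine stumps adaptively: misclassified genes gain weight each round."""
    n = len(X)
    w = [1.0 / n] * n
    model = []
    for _ in range(rounds):
        err, feature, threshold, sign = train_stump(X, y, w)
        err = min(max(err, 1e-10), 1.0 - 1e-10)  # keep the logarithm finite
        alpha = 0.5 * math.log((1.0 - err) / err)
        model.append((alpha, feature, threshold, sign))
        w = [wi * math.exp(-alpha * yi * stump_predict(row, feature, threshold, sign))
             for wi, row, yi in zip(w, X, y)]
        total = sum(w)
        w = [wi / total for wi in w]
    return model

def predict(model, row):
    """Weighted vote of all stumps; +1 means predicted target, -1 non-target."""
    score = sum(alpha * stump_predict(row, f, t, s) for alpha, f, t, s in model)
    return 1 if score >= 0 else -1

# Hypothetical genes described by (motif_score, positional_score).
X = [(0.9, 0.9), (0.9, 0.1), (0.1, 0.9), (0.1, 0.1)]
y = [1, 1, 1, -1]  # toy labels: target if either signal is strong
model = adaboost(X, y, rounds=3)
predictions = [predict(model, row) for row in X]
```

On this toy set no single stump labels all four genes correctly, whereas three boosting rounds do, because the later stumps draw on the feature that the earlier ones missed.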
Affiliation(s)
- Mohamed Elati
- Institute of Systems and Synthetic Biology, CNRS, University of Evry, Genopole, 91030 Evry, France.
10
Benza VG, Bassetti B, Dorfman KD, Scolari VF, Bromek K, Cicuta P, Lagomarsino MC. Physical descriptions of the bacterial nucleoid at large scales, and their biological implications. Rep Prog Phys 2012; 75:076602. [PMID: 22790781 DOI: 10.1088/0034-4885/75/7/076602] [Citation(s) in RCA: 52] [Impact Index Per Article: 4.3] [Indexed: 06/01/2023]
Abstract
Recent experimental and theoretical approaches have attempted to quantify the physical organization (compaction and geometry) of the bacterial chromosome with its complement of proteins (the nucleoid). The genomic DNA exists in a complex and dynamic protein-rich state, which is highly organized at various length scales. This has implications for modulating (when not directly enabling) the core biological processes of replication, transcription and segregation. We overview the progress in this area, driven in the last few years by new scientific ideas and new interdisciplinary experimental techniques, ranging from high space- and time-resolution microscopy to high-throughput genomics employing sequencing to map different aspects of the nucleoid-related interactome. The aim of this review is to present the wide spectrum of experimental and theoretical findings coherently, from a physics viewpoint. In particular, we highlight the role that statistical and soft condensed matter physics play in describing this system of fundamental biological importance, specifically reviewing classic and more modern tools from the theory of polymers. We also discuss some attempts toward unifying interpretations of the current results, pointing to possible directions for future investigation.
Affiliation(s)
- Vincenzo G Benza
- Dipartimento di Fisica e Matematica, Università dell'Insubria, Como, Italy
11
The layout of a bacterial genome. FEBS Lett 2012; 586:2043-8. [DOI: 10.1016/j.febslet.2012.03.051] [Citation(s) in RCA: 26] [Impact Index Per Article: 2.2] [Received: 03/02/2012] [Revised: 03/25/2012] [Accepted: 03/26/2012] [Indexed: 12/25/2022]
12
Junier I, Hérisson J, Képès F. Genomic organization of evolutionarily correlated genes in bacteria: limits and strategies. J Mol Biol 2012; 419:369-86. [PMID: 22446685 DOI: 10.1016/j.jmb.2012.03.009] [Citation(s) in RCA: 21] [Impact Index Per Article: 1.8] [Received: 12/19/2011] [Revised: 03/12/2012] [Accepted: 03/13/2012] [Indexed: 12/30/2022]
Abstract
The need for efficient molecular interplay in time and space within a cell imposes strong constraints that could be partially relaxed if relative gene positions along chromosomes were appropriate. Comparative genomics studies have demonstrated the short-scale conservation of gene proximity along bacterial chromosomes. Additionally, the long-range periodic positioning of evolutionarily correlated genes within Escherichia coli has recently been highlighted. To gain further insight into these different genetic organizations, we examined the compromise between chromosomal proximity and periodicity for all available eubacterial genomes by evaluating groups of evolutionarily correlated genes from a benchmark data set. In enterobacteria, strict chromosomal proximity is found to be limited to groups under 20 genes, whereas periodicity is significant in all groups over 50. The E. coli K12 genome bears 511 periodic genes (12% of the genome), whose orthologs are found to be periodic in all eubacterial phyla. These periodic genes predominantly function in macromolecular synthesis and spatial organization of cellular components. They are enriched in essential and housekeeping genes and tend to be constitutively expressed. On this basis, it is argued that chromosomal proximity and periodicity are ubiquitous complementary genomic strategies that favor the build-up of local concentrations of co-functional molecules. In particular, the periodic layout may facilitate chromosome folding to spatially organize the construction of major cell components. The transition at 20 genes is reminiscent of the size of the longest operons and of topological microdomains. The range for which DNA neighborhood optimizes biochemical interactions might therefore be defined by DNA topology.
Affiliation(s)
- Ivan Junier
- Epigenomics Project/Institute of Systems and Synthetic Biology, Genopole, CNRS, University of Evry, 91030 Evry, France.
13
Scolari VF, Bassetti B, Sclavi B, Lagomarsino MC. Gene clusters reflecting macrodomain structure respond to nucleoid perturbations. Mol Biosyst 2011; 7:878-88. [DOI: 10.1039/c0mb00213e] [Citation(s) in RCA: 24] [Impact Index Per Article: 1.8] [Indexed: 02/02/2023]
14
Mathelier A, Carbone A. Chromosomal periodicity and positional networks of genes in Escherichia coli. Mol Syst Biol 2010; 6:366. [PMID: 20461073 PMCID: PMC2890325 DOI: 10.1038/msb.2010.21] [Citation(s) in RCA: 27] [Impact Index Per Article: 1.9] [Received: 04/17/2009] [Accepted: 03/18/2010] [Indexed: 12/26/2022] Open
Abstract
Escherichia coli periodic gene distribution is identified for a periodic interval of 33 kb. Two positional networks of genes are discovered by studying gene periodic distribution: one is driven by metabolic genes and the other by genes involved in cellular processing and signaling. A functional core of Escherichia coli genes drives gene periodic distribution. A few chromosomal regions that preserve gene transcription profiles across environmental changes are identified. This single genome analysis approach can be taken as a footprint for a large-scale bacterial and archaeal periodic distribution analysis.
The structure of dynamic folds in microbial chromosomes is largely unknown. On the other hand, genes characterizing a functional core in Escherichia coli K12 are shown to be periodically distributed along the arcs, suggesting an encoded three-dimensional genomic organization that helps functional activities, among which are translation and, possibly, transcription. Core genes are expected to be either highly expressed or rapidly expressed when needed. Because of the E. coli K12 life mode, they are especially encoded at the genomic level, with a very biased codon composition, and as a consequence, they can, to some extent, be predicted in silico. On the basis of a computational method allowing the definition of a class of genes that are organism specific, we identify a pool of core genes, some of which are conserved across many species, some depend on the environmental living conditions of the organism, some are involved in the stress response, and others have no function identified yet. This set of predicted core genes covers roughly 10% of all genes in E. coli K12 and approximates well the class of experimentally known essential genes. An important property of core genes is that they cover the whole spectrum of microbial functions. This means that for any functional class of genes, some representative of the class belongs to the functional core. Consequently, we reasoned, the three-dimensional chromosomal arrangement of these genes may be important to fulfill basic functional responses. A strong periodic signal of 33 kb is detected, and the approach also shows that a periodic arrangement affects not only core genes but in fact all genes along the E. coli K12 chromosome, even if the signal is weaker. An analysis of functional classes of genes shows that they systematically organize into two independent positional gene networks, one driven by metabolic genes and the other by genes involved in cellular processing and signaling (Figure 5A).
We conclude that functional reasons justify periodic gene organization. To explore the functional basis of the distribution, we examined the relationships between the codon bias of E. coli K12 genes and transcriptomic data for a number of different growth conditions. We could identify in a very precise manner a few chromosomal regions that preserve gene transcription profiles across environmental changes. These regions present a profile of the expression levels for their genes that is periodic, with a period of 33 kb. These findings generate new questions on evolutionary pressures imposed on the chromosome and suggest a number of insights on chromosomal superhelicity that can lead to a precise conception of experiments and to hypotheses to be tested. The theoretical analysis of functional classes of genes involved in the periodic distribution, for instance, makes clear that metabolic genes and genes involved in translation are expected to be the most affected by a disruption of the periodic chromosomal arrangement. The methodological approach is based on single genome analysis. Given either core genes or genes organized in functional classes, we analyze the detailed distribution of distances between pairs of genes through a parameterized model based on signal processing and find that these groups of genes tend to be separated by a regular integral distance characterized by a periodic interval of 33 kb. The methodology can be applied to any set of genes and can be taken as a footprint for large-scale bacterial and archaeal analysis.
The structure of dynamic folds in microbial chromosomes is largely unknown. Here, we find that genes with a highly biased codon composition and characterizing a functional core in Escherichia coli K12 are periodically distributed along the arcs, suggesting an encoded three-dimensional genomic organization that helps functional activities, among which are translation and, possibly, transcription. This extends to functional classes of genes that are shown to systematically organize into two independent positional gene networks, one driven by metabolic genes and the other by genes involved in cellular processing and signaling. We conclude that functional reasons justify periodic gene organization. This finding generates new questions on evolutionary pressures imposed on the chromosome. Our methodological approach is based on single genome analysis. Given either core genes or genes organized in functional classes, we analyze the detailed distribution of distances between pairs of genes through a parameterized model based on signal processing and find that these groups of genes tend to be separated by a regular integral distance. The methodology can be applied to any set of genes and can be taken as a footprint for large-scale bacterial and archaeal analysis.
Affiliation(s)
- Anthony Mathelier
- UPMC Univ Paris 06, FRE3214, Génomique Analytique, 15 rue de l'Ecole de Médecine, Paris, France
15
Junier I, Hérisson J, Képès F. Periodic pattern detection in sparse boolean sequences. Algorithms Mol Biol 2010; 5:31. [PMID: 20831781 PMCID: PMC2949599 DOI: 10.1186/1748-7188-5-31] [Citation(s) in RCA: 17] [Impact Index Per Article: 1.2] [Received: 03/26/2010] [Accepted: 09/10/2010] [Indexed: 01/13/2023] Open
Abstract
Background The specific position of functionally related genes along the DNA has been shown to reflect the interplay between chromosome structure and genetic regulation. By investigating the statistical properties of the distances separating such genes, several studies have highlighted various periodic trends. In many cases, however, groups built up from co-functional or co-regulated genes are small and contain wrong information (data contamination), so that the statistics are hard to exploit. In addition, gene positions are not expected to satisfy a perfectly ordered pattern along the DNA. Within this scope, we present an algorithm that aims to highlight periodic patterns in sparse boolean sequences, i.e. sequences of the type 010011011010... where the ratio of the number of 1's (denoting here the transcription start of a gene) to 0's is small. Results The algorithm is particularly robust with respect to strong signal distortions such as the addition of 1's at arbitrary positions (contaminated data), the deletion of existing 1's in the sequence (missing data) and the presence of disorder in the position of the 1's (noise). This robustness property stems from an appropriate exploitation of the remarkable alignment properties of periodic points in solenoidal coordinates. Conclusions The efficiency of the algorithm is demonstrated in situations where standard Fourier-based spectral methods are poorly adapted. We also show how the proposed framework allows one to identify the 1's that participate in the periodic trends, i.e. how the framework allows one to allocate a positional score to genes, in the same spirit as the sequence score. The software is available for public use at http://www.issb.genopole.fr/MEGA/Softwares/iSSB_SolenoidalApplication.zip.
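The alignment property mentioned in the abstract can be illustrated with a much simpler stand-in than the published algorithm: wrap each position around a circle of circumference equal to a candidate period, and measure how tightly the resulting angles cluster. The sketch below uses the mean resultant length from circular statistics for that purpose. The function names, the toy positions, and the candidate range are assumptions made for illustration; this is not the authors' solenoidal-coordinate method or its scoring.

```python
import cmath
import math

def phase_alignment(positions, period):
    """Mean resultant length of positions wrapped on a circle of the given period.

    Returns 1.0 when every position falls at the same phase and values near 0
    when the phases are spread evenly around the circle.
    """
    total = sum(cmath.exp(2j * math.pi * (x % period) / period) for x in positions)
    return abs(total) / len(positions)

def best_period(positions, candidates):
    """Candidate period whose wrapped phases align best."""
    return max(candidates, key=lambda p: phase_alignment(positions, p))

def positional_scores(positions, period):
    """Cosine between each position's phase and the mean phase (1 = perfectly in phase)."""
    phases = [2.0 * math.pi * (x % period) / period for x in positions]
    mean = sum(cmath.exp(1j * ph) for ph in phases)
    mean_angle = cmath.phase(mean)
    return [math.cos(ph - mean_angle) for ph in phases]

# Toy data: 20 sites spaced every 33 units plus three contaminating sites.
positions = [33 * k for k in range(20)] + [17, 140, 251]
period = best_period(positions, range(20, 51))
scores = positional_scores(positions, period)
```

The per-position scores play a role loosely analogous to the positional score described in the abstract: sites that follow the periodic trend score close to 1, while contaminating sites score lower.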
16
Junier I, Martin O, Képès F. Spatial and topological organization of DNA chains induced by gene co-localization. PLoS Comput Biol 2010; 6:e1000678. [PMID: 20169181 PMCID: PMC2820526 DOI: 10.1371/journal.pcbi.1000678] [Citation(s) in RCA: 58] [Impact Index Per Article: 4.1] [Received: 07/28/2009] [Accepted: 01/12/2010] [Indexed: 12/22/2022] Open
Abstract
Transcriptional activity has been shown to relate to the organization of chromosomes in the eukaryotic nucleus and in the bacterial nucleoid. In particular, highly transcribed genes, RNA polymerases and transcription factors gather into discrete spatial foci called transcription factories. However, the mechanisms underlying the formation of these foci and the resulting topological order of the chromosome remain to be elucidated. Here we consider a thermodynamic framework based on a worm-like chain model of chromosomes where sparse designated sites along the DNA are able to interact whenever they are spatially close by. This is motivated by recurrent evidence that there exist physical interactions between genes that operate together. Three important results come out of this simple framework. First, the resulting formation of transcription foci can be viewed as a micro-phase separation of the interacting sites from the rest of the DNA. In this respect, a thermodynamic analysis suggests transcription factors to be appropriate candidates for mediating the physical interactions between genes. Next, numerical simulations of the polymer reveal a rich variety of phases that are associated with different topological orderings, each providing a way to increase the local concentrations of the interacting sites. Finally, the numerical results show that both one-dimensional clustering and periodic location of the binding sites along the DNA, which have been observed in several organisms, make the spatial co-localization of multiple families of genes particularly efficient.
The good operation of cells relies on a coordination between chromosome structure and genetic regulation which is yet to be understood. This can be seen in particular from the transcription machinery: in some eukaryotes and bacteria, transcription of highly active genes occurs within discrete foci called transcription factories, where RNA polymerases, transcription factors and their target genes co-localize. The mechanisms underlying the formation of these foci and the resulting topological structure of the chromosome remain to be elucidated. Here, we propose a thermodynamic framework based on a polymer description of DNA in which genes effectively interact through attractive forces in physical space. The formation of transcription foci then corresponds to a self-organizing process whereby the interacting genes and the non-interacting DNA form two phases that tend to separate. Numerical simulations of the model unveil a rich zoology of the topological ordering of DNA around the foci and show that regularities in the positions of the interacting genes make the spatial co-localization of multiple families of genes particularly efficient. Experimental testing of the predictions of our model should shed new light on the relation between transcriptional regulation and cellular conformations of chromosomes.
Affiliation(s)
- Ivan Junier
- Epigenomics Project, Genopole, CNRS UPS 3201, UniverSud Paris, University of Evry, Genopole Campus 1 - Genavenir 6, Evry, France
- Institut des Systèmes Complexes Paris Île-de-France, Paris, France
- Olivier Martin
- Université Paris-Sud, UMR 8626 LPTMS, F-91405, Orsay, France
- Université Paris-Sud, UMR 0320/UMR 8120 Génétique Végétale, Gif/Yvette, France
- François Képès
- Epigenomics Project, Genopole, CNRS UPS 3201, UniverSud Paris, University of Evry, Genopole Campus 1 - Genavenir 6, Evry, France
17
Scherrer K, Jost J. Gene and genon concept: coding versus regulation. A conceptual and information-theoretic analysis of genetic storage and expression in the light of modern molecular biology. Theory Biosci 2007; 126:65-113. [PMID: 18087760 PMCID: PMC2242853 DOI: 10.1007/s12064-007-0012-x] [Citation(s) in RCA: 56] [Impact Index Per Article: 3.3] [Received: 06/19/2007] [Accepted: 07/13/2007] [Indexed: 01/15/2023]
Abstract
We analyse here the definition of the gene in order to distinguish, on the basis of modern insight in molecular biology, what the gene is coding for, namely a specific polypeptide, and how its expression is realized and controlled. Before the coding role of the DNA was discovered, a gene was identified with a specific phenotypic trait, from Mendel through Morgan up to Benzer. Subsequently, however, molecular biologists ventured to define a gene at the level of the DNA sequence in terms of coding. As is becoming ever more evident, the relations between information stored at DNA level and functional products are very intricate, and the regulatory aspects are as important and essential as the information coding for products. This approach led, thus, to a conceptual hybrid that confused coding, regulation and functional aspects. In this essay, we develop a definition of the gene that once again starts from the functional aspect. A cellular function can be represented by a polypeptide or an RNA. In the case of the polypeptide, its biochemical identity is determined by the mRNA prior to translation, and that is where we locate the gene. The steps from specific, but possibly separated sequence fragments at DNA level to that final mRNA then can be analysed in terms of regulation. For that purpose, we coin the new term "genon". In that manner, we can clearly separate product and regulative information while keeping the fundamental relation between coding and function without the need to introduce a conceptual hybrid. In mRNA, the program regulating the expression of a gene is superimposed onto and added to the coding sequence in cis - we call it the genon. The complementary external control of a given mRNA by trans-acting factors is incorporated in its transgenon. A consequence of this definition is that, in eukaryotes, the gene is, in most cases, not yet present at DNA level. 
Rather, it is assembled by RNA processing, including differential splicing, from various pieces, as steered by the genon. It emerges finally as an uninterrupted nucleic acid sequence at mRNA level just prior to translation, in faithful correspondence with the amino acid sequence to be produced as a polypeptide. After translation, the genon has fulfilled its role and expires. The distinction between the protein coding information as materialised in the final polypeptide and the processing information represented by the genon allows us to set up a new information theoretic scheme. The standard sequence information determined by the genetic code expresses the relation between coding sequence and product. Backward analysis asks from which coding region in the DNA a given polypeptide originates. The (more interesting) forward analysis asks in how many polypeptides of how many different types a given DNA segment is expressed. This concerns the control of the expression process for which we have introduced the genon concept. Thus, the information theoretic analysis can capture the complementary aspects of coding and regulation, of gene and genon.
Affiliation(s)
- Klaus Scherrer
- Institut Jacques Monod, CNRS and Univ. Paris 7, 2, place Jussieu, 75251 Paris-Cedex 5, France
- Jürgen Jost
- Max Planck Institute for Mathematics in the Sciences MPI MIS, Inselstrasse 22, 04103 Leipzig, Germany
18
Abstract
A cell transmits to its progeny the activity level of many of its genes, not just their sequence. Just as the sequence may vary through a mutation, the gene activity level may change through an "epimutation" (an epigenetic modification), which is heritable and does not entail any concomitant genetic alteration. An epimutation can have important phenotypic consequences that may persist after the loss of the environmental conditions that triggered it. For instance, epimutations are responsible for the divergence between a neuron and an epithelial cell that both come from the same egg and contain the same genome complement. This phenotypic difference is much larger than the one between the neurons from two animal species with dissimilar genotypes, thereby underlining the importance of epimutations. Tradition opposes the genetic and epigenetic views, the latter often being equated with DNA methylation. However, epimutations display a rich spectrum of modes that can all fit in a single reference system based on correlated chemical, spatial and temporal scales. This reference system allows the integration of purely genetic mutations at one of its ends, thus paving the way to a new, gradual view that encompasses the genome and the epigenome. At the other end can be found two types of epimutations that are both wide-ranging in space and rapid in producing phenotypic alterations. First, long-range rearrangements of the three-dimensional structure of the chromosome may influence gene expression in a heritable fashion. Such rearrangements seem to result from the collective dynamics of DNA-related activities, particularly transcription. Second, heritable regulatory states, e.g. a differentiated state that results from tipping a regulatory "toggle switch", involve components that are distributed throughout the nucleus or the cytoplasm, and possibly all the way to the confines of the cell.
Affiliation(s)
- François Képès
- Programme d'Epigénomique, Genopole, ATelier de Génomique Cognitive, CNRS UMR 8071/Genopole, 93, rue Henri-Rochefort, 91000 Evry, France.