1
Diaz-Del-Pino S, Perez-Wohlfeil E, Trelles O. Unraveling Genome Evolution Throughout Visual Analysis: The XCout Portal. Bioinform Biol Insights 2021; 15:11779322211021422. [PMID: 34163150 PMCID: PMC8191064 DOI: 10.1177/11779322211021422] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 01/18/2021] [Accepted: 05/01/2021] [Indexed: 11/25/2022]
Abstract
Due to major breakthroughs in sequencing technologies throughout the last decades, the time and cost per sequencing experiment have fallen drastically, overcoming the data generation barrier of the early genomic era. Such a shift has encouraged the scientific community to develop new computational methods that are able to compare large genomic sequences, thus enabling large-scale studies of genome evolution. The field of comparative genomics has proven itself invaluable for studying the evolutionary mechanisms and the forces driving genome evolution. In this context, a full genome comparison study between 2 species requires a quadratic number of comparisons in terms of the number of sequences (around 400 chromosome comparisons in the case of mammalian genomes); however, when studying conserved syntenies or evolutionary rearrangements, many sequence comparisons can be skipped because not all of them contain significant signals. Subsequently, the scientific community has developed fast heuristics to perform multiple pairwise comparisons between large sequences to determine whether significant sets of conserved similarities exist. The data generation problem is no longer an issue, yet the limitations have shifted toward the analysis of such massive data. Therefore, we present XCout, a Web-based visual analytics application for multiple genome comparisons designed to improve the analysis of large-scale evolutionary studies using novel techniques in Web visualization. XCout makes it possible to work on hundreds of comparisons at once, thus reducing the time of the analysis by identifying significant signals between chromosomes across multiple species.
Among other features, XCout introduces several techniques to aid in the analysis of large-scale genome rearrangements, particularly (1) an interactive heatmap interface to display comparisons using automatic color scales based on similarity thresholds to ease detection at first sight, (2) an overlay system to detect individual signal contributions between chromosomes, (3) a tracking tool to trace conserved blocks across different species to perform evolutionary studies, and (4) a search engine to search annotations across different species.
Affiliation(s)
- Sergio Diaz-Del-Pino
- Computer Architecture Department, Instituto de Investigación Biomédica de Málaga (IBIMA), University of Malaga, Malaga, Spain
- Esteban Perez-Wohlfeil
- Computer Architecture Department, Instituto de Investigación Biomédica de Málaga (IBIMA), University of Malaga, Malaga, Spain
- Oswaldo Trelles
- Computer Architecture Department, Instituto de Investigación Biomédica de Málaga (IBIMA), University of Malaga, Malaga, Spain
2
Abstract
MOTIVATION: An important task in comparative genomics is to detect functional units by analyzing gene-context patterns. Colinear syntenic blocks (CSBs) are groups of genes that are consistently encoded in the same neighborhood and in the same order across a wide range of taxa. Such CSBs are likely essential for the regulation of gene expression in prokaryotes. Recent results indicate that colinearity can be conserved across multiple operons, thus motivating the discovery of multi-operon CSBs. This computational task raises scalability challenges in large datasets. RESULTS: We propose an efficient algorithm for the discovery of cross-strand multi-operon CSBs in large genomic datasets. The proposed algorithm uses match-point arithmetic, which is scalable for large datasets of microbial genomes in terms of running time and space requirements. The algorithm is implemented and incorporated into a tool with a graphical user interface, called CSBFinder-S. We applied CSBFinder-S to data mine 1485 prokaryotic genomes and analyzed the identified cross-strand CSBs. Our results indicate that most of the syntenic blocks are exclusively colinear. Additional results indicate that transcriptional regulation by overlapping transcriptional genes is abundant in bacteria. We demonstrate the utility of CSBFinder-S to identify common function of the gene-pair PulEF in multiple contexts, including Type 2 Secretion System, Type 4 Pilus System and DNA uptake machinery. AVAILABILITY AND IMPLEMENTATION: CSBFinder-S software and code are publicly available at https://github.com/dinasv/CSBFinder. SUPPLEMENTARY INFORMATION: Supplementary data are available at Bioinformatics online.
Affiliation(s)
- Dina Svetlitsky
- Department of Computer Science, Ben-Gurion University of the Negev, Beer-Sheva, Israel
- Tal Dagan
- Institute of Microbiology, Kiel University, Kiel 24118, Germany
- Michal Ziv-Ukelson
- Department of Computer Science, Ben-Gurion University of the Negev, Beer-Sheva, Israel
3
Svetlitsky D, Dagan T, Chalifa-Caspi V, Ziv-Ukelson M. CSBFinder: discovery of colinear syntenic blocks across thousands of prokaryotic genomes. Bioinformatics 2020; 35:1634-1643. [PMID: 30321308 DOI: 10.1093/bioinformatics/bty861] [Citation(s) in RCA: 6] [Impact Index Per Article: 1.5] [Received: 05/15/2018] [Revised: 09/06/2018] [Accepted: 10/14/2018] [Indexed: 01/12/2023]
Abstract
MOTIVATION: Identification of conserved syntenic blocks across microbial genomes is important for several problems in comparative genomics such as gene annotation, study of genome organization and evolution and prediction of gene interactions. Current tools for syntenic block discovery do not scale up to the large quantity of prokaryotic genomes available today. RESULTS: We present a novel methodology for the discovery, ranking and taxonomic distribution analysis of colinear syntenic blocks (CSBs), groups of genes that are consistently located close to each other, in the same order, across a wide range of taxa. We present an efficient algorithm that identifies CSBs in large genomic datasets. The algorithm is implemented and incorporated in a novel tool with a graphical user interface, denoted CSBFinder, that ranks the discovered CSBs according to a probabilistic score and clusters them to families according to their gene content similarity. We apply CSBFinder to data mine 1487 prokaryotic genomes including chromosomes and plasmids. For post-processing analysis, we generate heatmaps for visualizing the distribution of CSB family members across various taxa. We exemplify the utility of CSBFinder in operon prediction, in deciphering unknown gene function and in taxonomic analysis of colinear syntenic blocks. AVAILABILITY AND IMPLEMENTATION: CSBFinder software and code are publicly available at https://github.com/dinasv/CSBFinder. SUPPLEMENTARY INFORMATION: Supplementary data are available at Bioinformatics online.
Affiliation(s)
- Dina Svetlitsky
- Department of Computer Science, Ben-Gurion University of the Negev, Beer-Sheva, Israel
- Tal Dagan
- Institute of General Microbiology, Christian-Albrechts University Kiel, Kiel, Germany
- Vered Chalifa-Caspi
- Bioinformatics Core Facility, National Institute for Biotechnology in the Negev, Ben-Gurion University of the Negev, Beer-Sheva, Israel
- Michal Ziv-Ukelson
- Department of Computer Science, Ben-Gurion University of the Negev, Beer-Sheva, Israel
4
Danchin A, Sekowska A, Noria S. Functional Requirements in the Program and the Cell Chassis for Next-Generation Synthetic Biology. Synth Biol (Oxf) 2018. [DOI: 10.1002/9783527688104.ch5] [Citation(s) in RCA: 1] [Impact Index Per Article: 0.2] [Indexed: 11/08/2022]
Affiliation(s)
- Antoine Danchin
- Institute of Cardiometabolism and Nutrition; 47 boulevard de l'Hôpital Paris 75013 France
- Agnieszka Sekowska
- Institute of Cardiometabolism and Nutrition; 47 boulevard de l'Hôpital Paris 75013 France
- Stanislas Noria
- Fondation Fourmentin-Guilbert; 2 avenue du Pavé Neuf Noisy le Grand 93160 France
5
Cherniak C, Rodriguez-Esteban R. Body maps on the human genome. Mol Cytogenet 2013; 6:61. [PMID: 24354739 PMCID: PMC3905923 DOI: 10.1186/1755-8166-6-61] [Citation(s) in RCA: 2] [Impact Index Per Article: 0.2] [Received: 10/15/2013] [Accepted: 12/05/2013] [Indexed: 01/19/2023]
Abstract
BACKGROUND: Chromosomes have territories, or preferred locales, in the cell nucleus. When these sites are taken into account, some large-scale structure of the human genome emerges. RESULTS: The synoptic picture is that genes highly expressed in particular topologically compact tissues are not randomly distributed on the genome. Rather, such tissue-specific genes tend to map somatotopically onto the complete chromosome set. They seem to form a "genome homunculus": a multi-dimensional, genome-wide body representation extending across chromosome territories of the entire sperm cell nucleus. The antero-posterior axis of the body significantly corresponds to the head-tail axis of the nucleus, and the dorso-ventral body axis to the central-peripheral nucleus axis. CONCLUSIONS: This large-scale genomic structure includes thousands of genes. One rationale for a homuncular genome structure would be to minimize connection costs in genetic networks. Somatotopic maps in cerebral cortex have been reported for over a century.
Affiliation(s)
- Christopher Cherniak
- Committee for Philosophy and the Sciences, Department of Philosophy, University of Maryland, College Park, MD 20742, USA
- Raul Rodriguez-Esteban
- Committee for Philosophy and the Sciences, Department of Philosophy, University of Maryland, College Park, MD 20742, USA
6
Fritsche M, Li S, Heermann DW, Wiggins PA. A model for Escherichia coli chromosome packaging supports transcription factor-induced DNA domain formation. Nucleic Acids Res 2012; 40:972-80. [PMID: 21976727 PMCID: PMC3273793 DOI: 10.1093/nar/gkr779] [Citation(s) in RCA: 69] [Impact Index Per Article: 5.8] [Received: 06/23/2011] [Revised: 09/05/2011] [Accepted: 09/05/2011] [Indexed: 01/07/2023]
Abstract
What physical mechanism leads to organization of a highly condensed and confined circular chromosome? Computational modeling shows that confinement-induced organization is able to overcome the chromosome's propensity to mix by the formation of topological domains. The experimentally observed high precision of separate subcellular positioning of loci (located on different chromosomal domains) in Escherichia coli naturally emerges as a result of entropic demixing of such chromosomal loops. We propose one possible mechanism for organizing these domains: regulatory control defined by the underlying E. coli gene regulatory network requires the colocalization of transcription factor genes and target genes. Investigating this assumption, we find the DNA chain to self-organize into several topologically distinguishable domains where the interplay between the entropic repulsion of chromosomal loops and their compression due to the confining geometry induces an effective nucleoid filament-type of structure. Thus, we propose that the physical structure of the chromosome is a direct result of regulatory interactions. To reproduce the observed precise ordering of the chromosome, we estimate that the domain sizes are distributed between 10 and 700 kb, in agreement with the size of topological domains identified in the context of DNA supercoiling.
Affiliation(s)
- Miriam Fritsche
- Institute for Theoretical Physics, University of Heidelberg, Philosophenweg 19, D-69120 Heidelberg, Germany.
7
Danchin A. A challenge to vaccinology: living organisms trap information. Vaccine 2009; 27 Suppl 6:G13-6. [PMID: 20006133 PMCID: PMC7115390 DOI: 10.1016/j.vaccine.2009.10.071] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 06/26/2009] [Revised: 10/11/2009] [Accepted: 10/14/2009] [Indexed: 11/03/2022]
Abstract
Life couples reproduction of the cell machinery with replication of the genetic program. Both processes are linked to the expression of some information. Over time, reproduction can enhance the information of the machine. We show that accumulation of valuable information results from degradative processes required to make room for novel entities. Degradation systems act as Maxwell's demons, using energy not to make room per se, but to prevent degradation of what has some functional features. This myopic process will accumulate information, whatever its source, in a ratchet-like manner. The consequence is that genes acquired by horizontal transfer as well as viruses will tend to perpetuate in niches where they are functional, creating recurrent conditions for emergence of diseases.
Affiliation(s)
- Antoine Danchin
- CEA/Genoscope, Amabiotics, 2, rue Gaston Crémieux, 91057 Evry Cedex, France.
8
Abstract
Operons (clusters of co-regulated genes with related functions) are common features of bacterial genomes. More recently, functional gene clustering has been reported in eukaryotes, from yeasts to filamentous fungi, plants, and animals. Gene clusters can consist of paralogous genes that have most likely arisen by gene duplication. However, there are now many examples of eukaryotic gene clusters that contain functionally related but non-homologous genes and that represent functional gene organizations with operon-like features (physical clustering and co-regulation). These include gene clusters for use of different carbon and nitrogen sources in yeasts, for production of antibiotics, toxins, and virulence determinants in filamentous fungi, for production of defense compounds in plants, and for innate and adaptive immunity in animals (the major histocompatibility locus). The aim of this article is to review features of functional gene clusters in prokaryotes and eukaryotes and the significance of clustering for effective function.
Affiliation(s)
- Anne E Osbourn
- Department of Metabolic Biology, John Innes Centre, Colney Lane, Norwich NR4 7UH, UK.
9
Abstract
Many bacterial cellular processes interact intimately with the chromosome. Such interplay is the major driving force of genome structure or organization. Interactions take place at different scales (local for gene expression, global for replication) and lead to the differentiation of the chromosome into organizational units such as operons, replichores, or macrodomains. These processes are intermingled in the cell and create complex higher-level organizational features that are adaptive because they favor the interplay between the processes. The surprising result of selection for genome organization is that gene repertoires change much more quickly than chromosomal structure. Comparative genomics and experimental genomic manipulations are untangling the different cellular and evolutionary mechanisms causing such resilience to change. Since organization results from cellular processes, a better understanding of chromosome organization will help unravel the underlying cellular processes and their diversity.
Affiliation(s)
- Eduardo P C Rocha
- Institut Pasteur, Microbial Evolutionary Genomics, F-75015 Paris, France.
10
Danchin A. Bacteria as computers making computers. FEMS Microbiol Rev 2009; 33:3-26. [PMID: 19016882 PMCID: PMC2704931 DOI: 10.1111/j.1574-6976.2008.00137.x] [Citation(s) in RCA: 102] [Impact Index Per Article: 6.8] [Received: 05/15/2008] [Revised: 09/20/2008] [Accepted: 09/21/2008] [Indexed: 12/13/2022]
Abstract
Various efforts to integrate biological knowledge into networks of interactions have produced a lively microbial systems biology. Putting molecular biology and computer sciences in perspective, we review another trend in systems biology, in which recursivity and information replace the usual concepts of differential equations, feedback and feedforward loops and the like. Noting that the processes of gene expression separate the genome from the cell machinery, we analyse the role of the separation between machine and program in computers. However, computers do not make computers. For cells to make cells requires a specific organization of the genetic program, which we investigate using available knowledge. Microbial genomes are organized into a paleome (the name emphasizes the role of the corresponding functions from the time of the origin of life), comprising a constructor and a replicator, and a cenome (emphasizing community-relevant genes), made up of genes that permit life in a particular context. The cell duplication process supposes rejuvenation of the machine and replication of the program. The paleome also possesses genes that enable information to accumulate in a ratchet-like process down the generations. The systems biology must include the dynamics of information creation in its future developments.
Affiliation(s)
- Antoine Danchin
- Génétique des Génomes Bactériens, Institut Pasteur, Paris, France.
11
12
Watt RM, Wang J, Leong M, Kung HF, Cheah KS, Liu D, Danchin A, Huang JD. Visualizing the proteome of Escherichia coli: an efficient and versatile method for labeling chromosomal coding DNA sequences (CDSs) with fluorescent protein genes. Nucleic Acids Res 2007; 35:e37. [PMID: 17272300 PMCID: PMC1874593 DOI: 10.1093/nar/gkl1158] [Citation(s) in RCA: 11] [Impact Index Per Article: 0.6] [Indexed: 01/12/2023]
Abstract
To investigate the feasibility of conducting a genomic-scale protein labeling and localization study in Escherichia coli, a representative subset of 23 coding DNA sequences (CDSs) was selected for chromosomal tagging with one or more fluorescent protein genes (EGFP, EYFP, mRFP1, DsRed2). We used λ-Red recombination to precisely and efficiently position PCR-generated DNA targeting cassettes containing a fluorescent protein gene and an antibiotic resistance marker, at the C-termini of the CDSs of interest, creating in-frame fusions under the control of their native promoters. We incorporated cre/loxP and flpe/frt technology to enable multiple rounds of chromosomal tagging events to be performed sequentially with minimal disruption to the target locus, thus allowing sets of proteins to be co-localized within the cell. The visualization of labeled proteins in live E. coli cells using fluorescence microscopy revealed a striking variety of distributions including: membrane and nucleoid association, polar foci and diffuse cytoplasmic localization. Fifty of the fifty-two independent targeting experiments performed were successful, and 21 of the 23 selected CDSs could be fluorescently visualized. Our results show that E. coli has an organized and dynamic proteome, and demonstrate that this approach is applicable for tagging and (co-) localizing CDSs on a genome-wide scale.
Affiliation(s)
- Rory M. Watt, Jing Wang, Meikid Leong, Hsiang-fu Kung, Kathryn S.E. Cheah, Depei Liu, Antoine Danchin, Jian-Dong Huang
- Open Laboratory of Chemical Biology, The Institute of Molecular Technology for Drug Discovery and Synthesis, Department of Chemistry, The University of Hong Kong, Pokfulam Road, Hong Kong SAR, China; Department of Biochemistry, The University of Hong Kong, 3/F Laboratory Block, Faculty of Medicine Building, 21 Sassoon Road, Pokfulam, Hong Kong SAR, China; The Center for Emerging Infectious Diseases, Faculty of Medicine, Chinese University of Hong Kong, Shatin, N.T., Hong Kong SAR, China; National Laboratory of Medical Molecular Biology, Institute of Basic Medical Sciences, Chinese Academy of Medical Sciences (CAMS) & Peking Union Medical College (PUMC), Beijing 100005, P.R. China; Unité GGB, CNRS URA 2171, Institut Pasteur, 28 rue Dr. Roux, 75015 Paris, France; and HKU-Pasteur Research Centre, Dexter HC Man Building, 8, Sassoon Road, Pokfulam, Hong Kong SAR, China
- *To whom correspondence should be addressed. (+852) 2819 2810; (+852) 2855 1254
13
14
Allen TE, Price ND, Joyce AR, Palsson BØ. Long-range periodic patterns in microbial genomes indicate significant multi-scale chromosomal organization. PLoS Comput Biol 2006; 2:e2. [PMID: 16410829 PMCID: PMC1326223 DOI: 10.1371/journal.pcbi.0020002] [Citation(s) in RCA: 43] [Impact Index Per Article: 2.4] [Received: 07/25/2005] [Accepted: 12/07/2005] [Indexed: 01/02/2023]
Abstract
Genome organization can be studied through analysis of chromosome position-dependent patterns in sequence-derived parameters. A comprehensive analysis of such patterns in prokaryotic sequences and genome-scale functional data has yet to be performed. We detected spatial patterns in sequence-derived parameters for 163 chromosomes occurring in 135 bacterial and 16 archaeal organisms using wavelet analysis. Pattern strength was found to correlate with organism-specific features such as genome size, overall GC content, and the occurrence of known motility and chromosomal binding proteins. Given additional functional data for Escherichia coli, we found significant correlations among chromosome position dependent patterns in numerous properties, some of which are consistent with previously experimentally identified chromosome macrodomains. These results demonstrate that the large-scale organization of most sequenced genomes is significantly nonrandom, and, moreover, that this organization is likely linked to genome size, nucleotide composition, and information transfer processes. Constraints on genome evolution and design are thus not solely dependent upon information content, but also upon an intricate multi-parameter, multi-length-scale organization of the chromosome.
Affiliation(s)
- Timothy E Allen
- Department of Bioengineering, University of California San Diego, La Jolla, California, United States of America
- Nathan D Price
- Department of Bioengineering, University of California San Diego, La Jolla, California, United States of America
- Andrew R Joyce
- Bioinformatics Program, University of California San Diego, La Jolla, California, United States of America
- Bernhard Ø Palsson
- Department of Bioengineering, University of California San Diego, La Jolla, California, United States of America
- * To whom correspondence should be addressed. E-mail:
15
Abstract
A living cell is not an aggregate of molecules but an organized pattern, structured in space and in time. This article addresses some conceptual issues in the genesis of spatial architecture, including how molecules find their proper location in cell space, the origins of supramolecular order, the role of the genes, cell morphology, the continuity of cells, and the inheritance of order. The discussion is framed around a hierarchy of physiological processes that bridge the gap between nanometer-sized molecules and cells three to six orders of magnitude larger. Stepping stones include molecular self-organization, directional physiology, spatial markers, gradients, fields, and physical forces. The knowledge at hand leads to an unconventional interpretation of biological order. I have come to think of cells as self-organized systems composed of genetically specified elements plus heritable structures. The smallest self that can be fairly said to organize itself is the whole cell. If structure, form, and function are ever to be computed from data at a lower level, the starting point will be not the genome, but a spatially organized system of molecules. This conclusion invites us to reconsider our understanding of what genes do, what organisms are, and how living systems could have arisen on the early Earth.
Affiliation(s)
- Franklin M Harold
- Department of Microbiology, University of Washington, Seattle 98195, USA.
16
Carpentier AS, Torrésani B, Grossmann A, Hénaut A. Decoding the nucleoid organisation of Bacillus subtilis and Escherichia coli through gene expression data. BMC Genomics 2005; 6:84. [PMID: 15938745 PMCID: PMC1177944 DOI: 10.1186/1471-2164-6-84] [Citation(s) in RCA: 26] [Impact Index Per Article: 1.4] [Received: 04/04/2005] [Accepted: 06/06/2005] [Indexed: 11/25/2022]
Abstract
Background: Although the organisation of the bacterial chromosome is an area of active research, little is yet known on the subject. The difficulty lies in the fact that the system is dynamic and difficult to observe directly. The advent of massive hybridisation techniques opens the way to further studies of the chromosomal structure because the genes that are co-expressed, as identified by microarray experiments, probably share some spatial relationship. The use of several independent sets of gene expression data should make it possible to obtain an exhaustive view of gene co-expression and thus a more accurate image of the structure of the chromosome. Results: For both Bacillus subtilis and Escherichia coli the co-expression of genes varies as a function of the distance between the genes along the chromosome. The long-range correlations are surprising: the changes in the level of expression of any gene are correlated (positively or negatively) to the changes in the expression level of other genes located at well-defined long-range distances. This property is true for all the genes, regardless of their localisation on the chromosome. We also found short-range correlations, which suggest that the location of these co-expressed genes corresponds to DNA turns on the nucleoid surface (14–16 genes). Conclusion: The long-range correlations do not correspond to the domains so far identified in the nucleoid. We explain our results by a model of the nucleoid solenoid structure based on two types of spirals (short and long). The long spirals are uncoiled expressed DNA while the short ones correspond to coiled unexpressed DNA.
Affiliation(s)
- Anne-Sophie Carpentier
- Laboratoire Génome et Informatique, CNRS UMR 8116, Tour Evry2, 523 Place des Terrasses, 91034 Evry Cedex, France
- Bruno Torrésani
- CMI, Université de Provence, 39 rue Joliot-Curie, 13453 Marseille cedex 13, France
- Alex Grossmann
- Laboratoire Génome et Informatique, CNRS UMR 8116, Tour Evry2, 523 Place des Terrasses, 91034 Evry Cedex, France
- Alain Hénaut
- Laboratoire Génome et Informatique, CNRS UMR 8116, Tour Evry2, 523 Place des Terrasses, 91034 Evry Cedex, France
|
17
|
Price ND, Reed JL, Palsson BØ. Genome-scale models of microbial cells: evaluating the consequences of constraints. Nat Rev Microbiol 2004; 2:886-97. [PMID: 15494745 DOI: 10.1038/nrmicro1023] [Citation(s) in RCA: 686] [Impact Index Per Article: 34.3] [Indexed: 12/21/2022]
Abstract
Microbial cells operate under governing constraints that limit their range of possible functions. With the availability of annotated genome sequences, it has become possible to reconstruct genome-scale biochemical reaction networks for microorganisms. The imposition of governing constraints on a reconstructed biochemical network leads to the definition of achievable cellular functions. In recent years, a substantial and growing toolbox of computational analysis methods has been developed to study the characteristics and capabilities of microorganisms using a constraint-based reconstruction and analysis (COBRA) approach. This approach provides a biochemically and genetically consistent framework for the generation of hypotheses and the testing of functions of microbial cells.
Affiliation(s)
- Nathan D Price
- Department of Bioengineering, University of California, San Diego, La Jolla, California 92093, USA
|
18
|
Danchin A. The bag or the spindle: the cell factory at the time of systems' biology. Microb Cell Fact 2004; 3:13. [PMID: 15537427 PMCID: PMC534799 DOI: 10.1186/1475-2859-3-13] [Citation(s) in RCA: 7] [Impact Index Per Article: 0.4] [Received: 10/20/2004] [Accepted: 11/10/2004] [Indexed: 11/10/2022] Open
Abstract
Genome programs changed our view of bacteria as cell factories, by making them amenable to systematic rational improvement. As a first step, isolated genes (including those of the metagenome), or small gene clusters are improved and expressed in a variety of hosts. New techniques derived from functional genomics (transcriptome, proteome and metabolome studies) now allow users to shift from this single-gene approach to a more integrated view of the cell, where it is more and more considered as a factory. One can expect in the near future that bacteria will be entirely reprogrammed, and perhaps even created de novo from bits and pieces, to constitute man-made cell factories. This will require exploration of the landscape made of neighbourhoods of all the genes in the cell. Present work is already paving the way for that futuristic view of bacteria in industry.
Affiliation(s)
- Antoine Danchin
- Genetics of Bacterial Genomes, Institut Pasteur, 28, rue du Docteur Roux, 75724 Paris Cedex 15, France.
|
19
|
Tosato V, Gjuracic K, Vlahovicek K, Pongor S, Danchin A, Bruschi CV. The DNA secondary structure of the Bacillus subtilis genome. FEMS Microbiol Lett 2003; 218:23-30. [PMID: 12583893 DOI: 10.1111/j.1574-6968.2003.tb11493.x] [Citation(s) in RCA: 2] [Impact Index Per Article: 0.1] [Indexed: 11/30/2022] Open
Abstract
The entire genomic DNA sequence of the Gram-positive bacterium Bacillus subtilis reported in the SubtiList database has been subjected in this work to a complete bioinformatic analysis of the potential formation of secondary DNA structures such as hairpins and bending. The most significant of these structures have been mapped with respect to their genomic location and compared to those structures already known to have a physiological role, such as the rho-independent transcription terminators. The distribution of these structures along the bacterial chromosome shows two major features: (i) the concentration of the most curved DNA in the intergenic regions rather than within the ORFs, and (ii) a decreasing gradient of large hairpins from the origin towards the terC end of chromosomal DNA replication. Given the increasing biological relevance of secondary DNA structures, these findings should facilitate further studies on the evolution, dynamics and expression of the genetic information stored in bacterial genomes.
Affiliation(s)
- Valentina Tosato
- Microbiology Group, International Centre for Genetic Engineering and Biotechnology, AREA Science Park, Padriciano 99, 34012, Trieste, Italy
|
20
|
Qiu P, Benbow L, Liu S, Greene JR, Wang L. Analysis of a human brain transcriptome map. BMC Genomics 2002; 3:10. [PMID: 11955288 PMCID: PMC103672 DOI: 10.1186/1471-2164-3-10] [Citation(s) in RCA: 14] [Impact Index Per Article: 0.6] [Received: 12/21/2001] [Accepted: 04/16/2002] [Indexed: 11/10/2022] Open
Abstract
BACKGROUND Genome wide transcriptome maps can provide tools to identify candidate genes that are over-expressed or silenced in certain disease tissue and increase our understanding of the structure and organization of the genome. Expressed Sequence Tags (ESTs) from the public dbEST and proprietary Incyte LifeSeq databases were used to derive a transcript map in conjunction with the working draft assembly of the human genome sequence. RESULTS Examination of ESTs derived from brain tissues (excluding brain tumor tissues) suggests that these genes are distributed on chromosomes in a non-random fashion. Some regions on the genome are dense with brain-enriched genes while some regions lack brain-enriched genes, suggesting a significant correlation between distribution of genes along the chromosome and tissue type. ESTs from brain tumor tissues have also been mapped to the human genome working draft. We reveal that some regions enriched in brain genes show a significant decrease in gene expression in brain tumors, and, conversely that some regions lacking in brain genes show an increased level of gene expression in brain tumors. CONCLUSIONS This report demonstrates a novel approach for tissue specific transcriptome mapping using EST-based quantitative assessment.
Affiliation(s)
- Ping Qiu
- Bioinformatics Group and Human Genomic Research Department, Schering-Plough Research Institute, 2015 Galloping Hill Road, Kenilworth, New Jersey 07033, USA
- Lawrence Benbow
- Bioinformatics Group and Human Genomic Research Department, Schering-Plough Research Institute, 2015 Galloping Hill Road, Kenilworth, New Jersey 07033, USA
- Suxing Liu
- Tumor Biology Department, Schering-Plough Research Institute, 2015 Galloping Hill Road, Kenilworth, New Jersey 07033, USA
- Jonathan R Greene
- Bioinformatics Group and Human Genomic Research Department, Schering-Plough Research Institute, 2015 Galloping Hill Road, Kenilworth, New Jersey 07033, USA
- Luquan Wang
- Bioinformatics Group and Human Genomic Research Department, Schering-Plough Research Institute, 2015 Galloping Hill Road, Kenilworth, New Jersey 07033, USA
|
21
|
Yanai I, Mellor JC, DeLisi C. Identifying functional links between genes using conserved chromosomal proximity. Trends Genet 2002; 18:176-9. [PMID: 11932011 DOI: 10.1016/s0168-9525(01)02621-x] [Citation(s) in RCA: 52] [Impact Index Per Article: 2.4] [Indexed: 10/18/2022]
Abstract
Conservation of proximity of a pair of genes across multiple genomes generally indicates that their functions could be linked. Here, we present a systematic evaluation using 42 complete microbial genomes from 25 phylogenetic groups to test the reliability of this observation in predicting function for genes. We find a relationship between the number of phylogenetic groups in which a gene pair is proximate and the probability that the pair belongs to a common pathway. Our method produces 1586 links between ortholog families substantiated by observed proximity in genomes representing at least three phylogenetic groups. Of the pairs annotated in the KEGG database, 80% are in the same biological pathway in KEGG.
Affiliation(s)
- Itai Yanai
- Bioinformatics Graduate Program and Dept of Biomedical Engineering, Boston University, Boston, MA 02215, USA.
|
22
|
Coppée JY, Auger S, Turlin E, Sekowska A, Le Caer JP, Labas V, Vagner V, Danchin A, Martin-Verstraete I. Sulfur-limitation-regulated proteins in Bacillus subtilis: a two-dimensional gel electrophoresis study. Microbiology (Reading, England) 2001; 147:1631-1640. [PMID: 11390694 DOI: 10.1099/00221287-147-6-1631] [Citation(s) in RCA: 30] [Impact Index Per Article: 1.3] [Indexed: 11/18/2022]
Abstract
Little is known about the genes and enzymes involved in sulfur assimilation in Bacillus subtilis, or about the regulation of their expression or activity. To identify genes regulated by sulfur limitation, the authors used two-dimensional (2D) gel electrophoresis to compare the proteome of a wild-type strain grown with either sulfate or glutathione as sole sulfur source. A total of 15 proteins whose synthesis is modified under these two conditions were identified by matrix-assisted laser desorption/ionization time of flight (MALDI TOF) mass spectrometry. In the presence of sulfate, an increased amount of proteins involved in the metabolism of C1 units (SerA, GlyA, FolD) and in the biosynthesis of purines (PurQ, Xpt) and pyrimidines (Upp, PyrAA, PyrF) was observed. In the presence of glutathione, the synthesis of two uptake systems (DppE, SsuA), an oxygenase (SsuD), cysteine synthase (CysK) and two proteins of unknown function (YtmI, YurL) was increased. The changes in expression of the corresponding genes, in the presence of sulfate and glutathione, were monitored using slot-blot analyses and lacZ fusions. The ytmI gene is part of a locus of 12 genes which are co-regulated in response to sulfur availability. This putative operon is activated by a LysR-like regulator, YtlI. This is the first regulator involved in the control of expression in response to sulfur availability to be identified in B. subtilis.
Affiliation(s)
- Jean-Yves Coppée
- Unité de Régulation de l'Expression Génétique, Institut Pasteur, 28 rue du Docteur Roux, 75724 Paris Cedex 15, France
- Sandrine Auger
- Unité de Régulation de l'Expression Génétique, Institut Pasteur, 28 rue du Docteur Roux, 75724 Paris Cedex 15, France
- Evelyne Turlin
- Unité de Régulation de l'Expression Génétique, Institut Pasteur, 28 rue du Docteur Roux, 75724 Paris Cedex 15, France
- Agnieszka Sekowska
- Unité de Régulation de l'Expression Génétique, Institut Pasteur, 28 rue du Docteur Roux, 75724 Paris Cedex 15, France
- Jean-Pierre Le Caer
- Neurobiologie et Diversité Cellulaire, CNRS UMR 7637, Ecole Supérieure de Physique et Chimie Industrielles de la Ville de Paris, 10 rue Vauquelin, 75005 Paris, France
- Valérie Labas
- Neurobiologie et Diversité Cellulaire, CNRS UMR 7637, Ecole Supérieure de Physique et Chimie Industrielles de la Ville de Paris, 10 rue Vauquelin, 75005 Paris, France
- Antoine Danchin
- Unité de Régulation de l'Expression Génétique, Institut Pasteur, 28 rue du Docteur Roux, 75724 Paris Cedex 15, France
- Isabelle Martin-Verstraete
- Unité de Régulation de l'Expression Génétique, Institut Pasteur, 28 rue du Docteur Roux, 75724 Paris Cedex 15, France
|
23
|
Tamames J. Evolution of gene order conservation in prokaryotes. Genome Biol 2001; 2:RESEARCH0020. [PMID: 11423009 PMCID: PMC33396 DOI: 10.1186/gb-2001-2-6-research0020] [Citation(s) in RCA: 137] [Impact Index Per Article: 6.0] [Received: 03/07/2001] [Revised: 04/09/2001] [Accepted: 04/12/2001] [Indexed: 11/11/2022] Open
Abstract
BACKGROUND As more complete genomes are sequenced, conservation of gene order between different organisms is emerging as an informative property of the genomes. Conservation of gene order has been used for predicting function and functional interactions of proteins, as well as for studying the evolutionary relationships between genomes. The reasons for the maintenance of gene order are still not well understood, as the organization of the prokaryote genome into operons and lateral gene transfer cannot possibly account for all the instances of conservation found. Comprehensive studies of gene order are one way of elucidating the nature of these maintaining forces. RESULTS Gene order is extensively conserved between closely related species, but rapidly becomes less conserved among more distantly related organisms, probably in a cooperative fashion. This trend could be universal in prokaryotic genomes, as archaeal genomes are likely to behave similarly to bacterial genomes. Gene order conservation could therefore be used as a valid phylogenetic measure to study relationships between species. Even between very distant species, remnants of gene order conservation exist in the form of highly conserved clusters of genes. This suggests the existence of selective processes that maintain the organization of these regions. Because the clusters often span more than one operon, common regulation probably cannot be invoked as the cause of the maintenance of gene order. CONCLUSIONS Gene order conservation is a genomic measure that can be useful for studying relationships between prokaryotes and the evolutionary forces shaping their genomes. Gene organization is extensively conserved in some genomic regions, and further studies are needed to elucidate the reason for this conservation.
Affiliation(s)
- J Tamames
- Centro de Astrobiología, INTA/CSIC, Carretera de Ajalvir Km, 4, 28850 Torrejón de Ardoz, Madrid, Spain.
|
24
|
Sekowska A, Danchin A, Risler JL. Phylogeny of related functions: the case of polyamine biosynthetic enzymes. Microbiology (Reading, England) 2000; 146 (Pt 8):1815-1828. [PMID: 10931887 DOI: 10.1099/00221287-146-8-1815] [Citation(s) in RCA: 55] [Impact Index Per Article: 2.3] [Indexed: 11/18/2022]
Abstract
Genome annotation requires explicit identification of gene function. This task frequently uses protein sequence alignments with examples having a known function. Genetic drift, co-evolution of subunits in protein complexes and a variety of other constraints interfere with the relevance of alignments. Using a specific class of proteins, it is shown that a simple data analysis approach can help solve some of the problems posed. The origin of ureohydrolases has been explored by comparing sequence similarity trees, maximizing amino acid alignment conservation. The trees separate agmatinases from arginases but suggest the presence of unknown biases responsible for unexpected positions of some enzymes. Using factorial correspondence analysis, a distance tree between sequences was established, comparing regions with gaps in the alignments. The gap tree gives a consistent picture of functional kinship, perhaps reflecting some aspects of phylogeny, with a clear domain of enzymes encoding two types of ureohydrolases (agmatinases and arginases) and activities related to, but different from ureohydrolases. Several annotated genes appeared to correspond to a wrong assignment if the trees were significant. They were cloned and their products expressed and identified biochemically. This substantiated the validity of the gap tree. Its organization suggests a very ancient origin of ureohydrolases. Some enzymes of eukaryotic origin are spread throughout the arginase part of the trees: they might have been derived from the genes found in the early symbiotic bacteria that became the organelles. They were transferred to the nucleus when symbiotic genes had to escape Muller's ratchet. This work also shows that arginases and agmatinases share the same two manganese-ion-binding sites and exhibit only subtle differences that can be accounted for knowing the three-dimensional structure of arginases. In the absence of explicit biochemical data, extreme caution is needed when annotating genes having similarities to ureohydrolases.
Affiliation(s)
- Agnieszka Sekowska
- Hong Kong University Pasteur Research Centre, Dexter HC Man Building, 8 Sassoon Road, Pokfulam, Hong Kong
- Regulation of Gene Expression, Institut Pasteur, 28 rue du Docteur Roux, 75724 Paris Cedex 15, France
- Antoine Danchin
- Hong Kong University Pasteur Research Centre, Dexter HC Man Building, 8 Sassoon Road, Pokfulam, Hong Kong
- Regulation of Gene Expression, Institut Pasteur, 28 rue du Docteur Roux, 75724 Paris Cedex 15, France
- Jean-Loup Risler
- Genome and Informatics, Université de Versailles-Saint-Quentin, 45 Avenue des Etats Unis, 78035 Versailles Cedex, France
|
25
|
Rocha EP, Sekowska A, Danchin A. Sulphur islands in the Escherichia coli genome: markers of the cell's architecture? FEBS Lett 2000; 476:8-11. [PMID: 10878240 DOI: 10.1016/s0014-5793(00)01660-4] [Citation(s) in RCA: 23] [Impact Index Per Article: 1.0] [Indexed: 11/17/2022]
Abstract
Two highly contrasted images depict genomes: at first sight, genes appear to be distributed randomly along the chromosome. In contrast, their organisation into operons (or pathogenicity islands) suggests that, at least locally, related functions are in physical proximity. Analysis of the codon usage bias in orthologous genes in the genome of bacteria which diverged a long time ago suggested that some physical (architectural) selection pressure organised the distribution of genes along the chromosome. The metabolism of highly reactive species such as sulphur-containing molecules must be compartmentalised to escape the deleterious actions of diffusible reagents such as gases or radicals. We analysed the distribution of sulphur metabolism genes in the genome of Escherichia coli and found a number of them to be clustered into statistically significant islands. Another interesting feature of these genes is that the proteins they encode are significantly deprived of cysteine and methionine residues, as compared to the bulk proteins. We speculate that this clustering is associated to the organisation of sulphur metabolism proteins into islands where the sensitive sulphur-containing molecules are protected from reacting with elements in the environment such as dioxygen, nitric oxide or radicals.
Affiliation(s)
- E P Rocha
- Atelier de BioInformatique, 12, rue Cuvier, 75005 Paris, France
|