1
Muzyukina P, Soutourina O. CRISPR genotyping methods: Tracing the evolution from spoligotyping to machine learning. Biochimie 2024; 217:66-73. [PMID: 37506757 DOI: 10.1016/j.biochi.2023.07.017] [Citation(s) in RCA: 1] [Impact Index Per Article: 1.0] [Received: 05/01/2023] [Revised: 07/16/2023] [Accepted: 07/24/2023] [Indexed: 07/30/2023]
Abstract
CRISPR (clustered regularly interspaced short palindromic repeats)-Cas (CRISPR-associated) systems provide prokaryotes with adaptive immunity defenses against foreign genetic invaders. The identification of CRISPR-Cas function is among the most impactful discoveries of recent decades that have shaped the development of genome editing in various organisms paving the way for a plethora of promising applications in biotechnology and health. Even before the discovery of CRISPR-Cas biological role, the particular structure of CRISPR loci has been explored for epidemiological genotyping of bacterial pathogens. CRISPR-Cas loci are arranged in CRISPR arrays of mostly identical direct repeats intercalated with invader-derived spacers and an operon of cas genes encoding the Cas protein components. Each small CRISPR RNA (crRNA) encoded within the CRISPR array constitutes a key functional unit of this RNA-based CRISPR-Cas defense system guiding the Cas effector proteins toward the foreign nucleic acids for their destruction. The information acquired from prior invader encounters and stored within CRISPR arrays turns out to be extremely valuable in tracing the microevolution and epidemiology of major bacterial pathogens. We review here the history of CRISPR-based typing strategies highlighting the first PCR-based methods that have set the stage for recent developments of high-throughput sequencing and machine learning-based approaches. A great amount of whole genome sequencing and metagenomic data accumulated in recent years opens up new avenues for combining experimental and computational approaches of high-resolution CRISPR-based typing.
Affiliation(s)
- P Muzyukina
- Université Paris-Saclay, CEA, CNRS, Institute for Integrative Biology of the Cell (I2BC), 91198, Gif-sur-Yvette, France
- O Soutourina
- Université Paris-Saclay, CEA, CNRS, Institute for Integrative Biology of the Cell (I2BC), 91198, Gif-sur-Yvette, France; Institut Universitaire de France (IUF), Paris, France.
2
Khan F, Khan M, Iqbal N, Khan S, Muhammad Khan D, Khan A, Wei DQ. Prediction of Recombination Spots Using Novel Hybrid Feature Extraction Method via Deep Learning Approach. Front Genet 2020; 11:539227. [PMID: 33093842 PMCID: PMC7527634 DOI: 10.3389/fgene.2020.539227] [Citation(s) in RCA: 12] [Impact Index Per Article: 3.0] [Received: 03/10/2020] [Accepted: 08/13/2020] [Indexed: 01/20/2023] Open
Abstract
Meiotic recombination is the driving force of evolutionary development and an important source of genetic variation. Meiotic recombination does not take place randomly in a chromosome but occurs in some regions of the chromosome. Regions with a higher rate of meiotic recombination events are considered hotspots, and regions where recombination events are less frequent are called coldspots. Prediction of meiotic recombination spots provides useful information about the basic functionality of inheritance and genome diversity. This study proposes an intelligent computational predictor called iRSpots-DNN for the identification of recombination spots. The proposed predictor is based on a novel feature extraction method and an optimized deep neural network (DNN). The DNN was employed as a classification engine, whereas the novel feature extraction method was developed to extract meaningful features for the identification of hotspots and coldspots across the yeast genome. Unlike previous algorithms, the proposed feature extraction avoids bias among different selected features and preserves the sequence discriminant properties along with the sequence-structure information simultaneously. This study also considered other effective classifiers, namely support vector machine (SVM), K-nearest neighbor (KNN), and random forest (RF), to predict recombination spots. Experimental results on a benchmark dataset with 10-fold cross-validation showed that iRSpots-DNN achieved the highest accuracy, i.e., 95.81%. Additionally, the performance of the proposed iRSpots-DNN is significantly better than the existing predictors on a benchmark dataset. The relevant benchmark dataset and source code are freely available at: https://github.com/Fatima-Khan12/iRspot_DNN/tree/master/iRspot_DNN.
Affiliation(s)
- Fatima Khan
- Department of Computer Science, Abdul Wali Khan University Mardan, Mardan, Pakistan
- Mukhtaj Khan
- Department of Computer Science, Abdul Wali Khan University Mardan, Mardan, Pakistan
- Nadeem Iqbal
- Department of Computer Science, Abdul Wali Khan University Mardan, Mardan, Pakistan
- Salman Khan
- Department of Computer Science, Abdul Wali Khan University Mardan, Mardan, Pakistan
- Dost Muhammad Khan
- Department of Statistics, Abdul Wali Khan University Mardan, Mardan, Pakistan
- Abbas Khan
- Department of Bioinformatics and Biological Statistics, School of Life Sciences and Biotechnology, Shanghai Jiao Tong University, Shanghai, China
- Dong-Qing Wei
- Department of Bioinformatics and Biological Statistics, School of Life Sciences and Biotechnology, Shanghai Jiao Tong University, Shanghai, China; State Key Laboratory of Microbial Metabolism, Shanghai-Islamabad-Belgrade Joint Innovation Center on Antibacterial Resistances, Joint Laboratory of International Cooperation in Metabolic and Developmental Sciences, School of Life Sciences and Biotechnology, Shanghai Jiao Tong University, Ministry of Education, Shanghai, China; Peng Cheng Laboratory, Shenzhen, China
3
Prathiviraj R, Chellapandi P. Comparative genomic analysis reveals starvation survival systems in Methanothermobacter thermautotrophicus ΔH. Anaerobe 2020; 64:102216. [PMID: 32504807 DOI: 10.1016/j.anaerobe.2020.102216] [Citation(s) in RCA: 3] [Impact Index Per Article: 0.8] [Received: 10/09/2019] [Revised: 05/07/2020] [Accepted: 05/18/2020] [Indexed: 12/26/2022]
Abstract
Methanothermobacter thermautotrophicus ΔH (MTH) is a thermophilic hydrogenotrophic methanogenic archaeon capable of reducing CO2 with H2 to produce methane gas. It is a potential candidate for the biomethanation of CO2 and CO in anaerobic reactors and in the biogas upgrading process. However, systematic studies addressing its genome conservation and function remain scant. In this study, we have evaluated its evolutionary resemblance and metabolic discrepancy, particularly in starvation survival systems, by comparing the genomic contexts with Methanothermobacter marburgensis str. Marburg (MMG) and Methanobacterium formicicum DSM 1535 (MFO). The phylogenomic analysis of this study indicated that there was a strong phylogenomic signal among MTH, MMG, and MFO in the whole-genome tree. DNA replication machinery was conserved in the MTH genome and might have evolved at different evolution rates. Genome synteny analysis indicated that collinearity of either gene orders or gene families has to be maintained, with syntenic blocks located in the syntenic out-paralogs. A genome-wide metabolic analysis identified some unique putative metabolic subsystems in MTH, which are proposed to determine its growth characteristics in diverse environments. The MTH genome comprises 93 unique genes coding for starvation survival and stress-response proteins. These proteins confer its adaptation to nutritional deprivation and other abiotic stresses. MTH has a typical system to sustain its growth and cell viability during stable operation and recovery after prolonged starvation. Thus, the present work will provide an insight to improve the genome refinement and metabolic reconstruction in parallel to other closely related species.
Affiliation(s)
- R Prathiviraj
- Molecular Systems Engineering Lab, Department of Bioinformatics, School of Life Sciences, Bharathidasan University, Tiruchirappalli, 620024, Tamil Nadu, India
- P Chellapandi
- Molecular Systems Engineering Lab, Department of Bioinformatics, School of Life Sciences, Bharathidasan University, Tiruchirappalli, 620024, Tamil Nadu, India.
4
Yang H, Yang W, Dao FY, Lv H, Ding H, Chen W, Lin H. A comparison and assessment of computational method for identifying recombination hotspots in Saccharomyces cerevisiae. Brief Bioinform 2019; 21:1568-1580. [PMID: 31633777 DOI: 10.1093/bib/bbz123] [Citation(s) in RCA: 67] [Impact Index Per Article: 13.4] [Received: 01/13/2019] [Revised: 05/03/2019] [Accepted: 08/19/2019] [Indexed: 12/27/2022] Open
Abstract
Meiotic recombination, which is initiated by double-strand DNA breaks, is one of the most important driving forces of biological evolution. Recombination has important roles in genome diversity and evolution. This review first provides a comprehensive survey of the 15 computational methods developed for identifying recombination hotspots in Saccharomyces cerevisiae. These computational methods were discussed and compared in terms of underlying algorithms, extracted features, predictive capability and practical utility. Subsequently, a more objective benchmark data set was constructed to develop a new predictor iRSpot-Pse6NC2.0 (http://lin-group.cn/server/iRSpot-Pse6NC2.0). To further demonstrate the generalization ability of these methods, we compared iRSpot-Pse6NC2.0 with existing methods on chromosome XVI of S. cerevisiae. The results of the independent data set test demonstrated that the new predictor is superior to existing tools in the identification of recombination hotspots. iRSpot-Pse6NC2.0 will become an important tool for identifying recombination hotspots.
Affiliation(s)
- Hui Yang
- Key Laboratory for Neuro-Information of Ministry of Education, School of Life Science and Technology, Center for Informational Biology, University of Electronic Science and Technology of China, Chengdu 610054, China
- Wuritu Yang
- Development and Planning Department, Inner Mongolia University, Hohhot 010021, China
- Fu-Ying Dao
- Key Laboratory for Neuro-Information of Ministry of Education, School of Life Science and Technology, Center for Informational Biology, University of Electronic Science and Technology of China, Chengdu 610054, China
- Hao Lv
- Key Laboratory for Neuro-Information of Ministry of Education, School of Life Science and Technology, Center for Informational Biology, University of Electronic Science and Technology of China, Chengdu 610054, China
- Hui Ding
- Key Laboratory for Neuro-Information of Ministry of Education, School of Life Science and Technology, Center for Informational Biology, University of Electronic Science and Technology of China, Chengdu 610054, China
- Wei Chen
- Innovative Institute of Chinese Medicine and Pharmacy, Chengdu University of Traditional Chinese Medicine, Chengdu 611730, China
- Hao Lin
- Center for Genomics and Computational Biology, School of Life Sciences, North China University of Science and Technology, Tangshan 063000, China
5
Kelman LM, Kelman Z. Do Archaea Need an Origin of Replication? Trends Microbiol 2018; 26:172-174. [PMID: 29268981 DOI: 10.1016/j.tim.2017.12.001] [Citation(s) in RCA: 3] [Impact Index Per Article: 0.5] [Received: 11/01/2017] [Revised: 11/29/2017] [Accepted: 12/01/2017] [Indexed: 11/16/2022]
Abstract
Chromosomal DNA replication starts at a specific region called an origin of replication. Until recently, all organisms were thought to require origins to replicate their chromosomes. It was recently discovered that some archaeal species do not utilize origins of replication under laboratory growth conditions.
Affiliation(s)
- Lori M Kelman
- Program in Biotechnology, Montgomery College, 20200 Observation Drive, Germantown, MD 20876, USA
- Zvi Kelman
- Biomolecular Labeling Laboratory, National Institute of Standards and Technology and Institute for Bioscience and Biotechnology Research, University of Maryland, 9600 Gudelsky Drive, Rockville, MD 20850, USA.
6
Abstract
Codon usage depends on mutation bias, tRNA-mediated selection, and the need for high efficiency and accuracy in translation. One codon in a synonymous codon family is often strongly over-used, especially in highly expressed genes, which often leads to a high dN/dS ratio because dS is very small. Many different codon usage indices have been proposed to measure codon usage and codon adaptation. Sense codons can be misread by release factors and stop codons can be misread by tRNAs, which also contributes to codon usage in rare cases. This chapter outlines the conceptual framework of codon evolution, illustrates codon-specific and gene-specific codon usage indices, and presents their applications. A new index for codon adaptation that accounts for background mutation bias (Index of Translation Elongation) is presented and contrasted with the codon adaptation index (CAI), which does not consider background mutation bias. They are used to re-analyze data from a recent paper claiming that translation elongation efficiency matters little in protein production. The reanalysis disproves the claim.
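As an illustration of what a codon-specific usage index looks like, the sketch below computes relative synonymous codon usage (RSCU) for one synonymous family. RSCU is chosen here only as a widely used example; the abstract does not say which indices the chapter illustrates, and neither the Index of Translation Elongation nor CAI is reproduced. The function names and the toy sequence are hypothetical.

```python
from collections import Counter

def codon_counts(cds):
    """Count in-frame codons of a coding sequence (a trailing partial codon is ignored)."""
    cds = cds.upper()
    usable = len(cds) - len(cds) % 3
    return Counter(cds[i:i + 3] for i in range(0, usable, 3))

def rscu(cds, family):
    """Relative synonymous codon usage for the codons of one synonymous family.

    RSCU = observed count / mean count across the family, so 1.0 means
    no bias and values above 1.0 mark over-used codons.
    """
    counts = codon_counts(cds)
    total = sum(counts[c] for c in family)
    if total == 0:
        # The amino acid never occurs; report zeros rather than dividing by zero.
        return {c: 0.0 for c in family}
    expected = total / len(family)
    return {c: counts[c] / expected for c in family}

# The four alanine codons of the standard genetic code.
ALANINE = ["GCT", "GCC", "GCA", "GCG"]

# Toy sequence (hypothetical): GCT used twice, GCC once.
values = rscu("GCTGCTGCC", ALANINE)
```

In this toy case GCT is the over-used codon of its family, which is the kind of pattern the abstract describes for highly expressed genes.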
7
Al Maruf MA, Shatabda S. iRSpot-SF: Prediction of recombination hotspots by incorporating sequence based features into Chou's Pseudo components. Genomics 2018; 111:966-972. [PMID: 29935224 DOI: 10.1016/j.ygeno.2018.06.003] [Citation(s) in RCA: 30] [Impact Index Per Article: 5.0] [Received: 04/04/2018] [Revised: 06/09/2018] [Accepted: 06/13/2018] [Indexed: 11/28/2022]
Abstract
Recombination hotspots in a genome are unevenly distributed. Hotspots are regions in a genome that show higher rates of meiotic recombination. Computational methods for recombination hotspot prediction often use sophisticated features that are derived from physico-chemical or structure based properties of nucleotides. In this paper, we propose iRSpot-SF, which uses sequence based features that are computationally cheap to generate. Four feature groups are used in our method: k-mer composition, gapped k-mer composition, TF-IDF of k-mers and reverse complement k-mer composition. We have used recursive feature elimination to select 17 top features for hotspot prediction. Our analysis shows the superiority of gapped k-mer composition and reverse complement k-mer composition features over others. We have used SVM with RBF kernel as a classification algorithm. We have tested our algorithm on standard benchmark datasets. Compared to other methods, iRSpot-SF is able to produce significantly better results in terms of accuracy, Matthews correlation coefficient and sensitivity, which are 84.58%, 0.6941 and 84.57%, respectively. We have made our method readily available to use as a python based tool and made the datasets and source codes available at: https://github.com/abdlmaruf/iRSpot-SF. A web application based on iRSpot-SF has been developed and is freely available to use at: http://irspot.pythonanywhere.com/server.html.
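Three of the four feature groups named in the abstract can be sketched in a few lines. This is a minimal illustration under assumed definitions (normalized counts over the ACGT alphabet, a single gap between two nucleotides, and merging each k-mer with its reverse complement); it is not the authors' implementation, whose exact definitions are in the linked repository. All function names are hypothetical.

```python
from collections import Counter
from itertools import product

def kmer_composition(seq, k):
    """Normalized frequency of every k-mer over the ACGT alphabet."""
    seq = seq.upper()
    n_windows = len(seq) - k + 1
    counts = Counter(seq[i:i + k] for i in range(max(n_windows, 0)))
    total = max(n_windows, 1)
    kmers = ("".join(p) for p in product("ACGT", repeat=k))
    return {kmer: counts[kmer] / total for kmer in kmers}

def reverse_complement(seq):
    """Reverse complement of a DNA string."""
    return seq.upper().translate(str.maketrans("ACGT", "TGCA"))[::-1]

def gapped_kmer_composition(seq, gap):
    """Frequency of nucleotide pairs separated by `gap` skipped positions."""
    seq = seq.upper()
    n_pairs = len(seq) - gap - 1
    counts = Counter(seq[i] + seq[i + gap + 1] for i in range(max(n_pairs, 0)))
    total = max(n_pairs, 1)
    return {a + b: counts[a + b] / total for a in "ACGT" for b in "ACGT"}

def rc_kmer_composition(seq, k):
    """k-mer composition with each k-mer merged with its reverse complement."""
    merged = {}
    for kmer, freq in kmer_composition(seq, k).items():
        # Use the lexicographically smaller of the pair as the shared key.
        key = min(kmer, reverse_complement(kmer))
        merged[key] = merged.get(key, 0.0) + freq
    return merged
```

Feature vectors built this way could then be passed to a feature selector and an RBF-kernel SVM, as the abstract describes; that step is left out here.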
Affiliation(s)
- Md Abdullah Al Maruf
- Department of Computer Science and Engineering, United International University, Madani Avenue, Satarkul, Badda, Dhaka 1212, Bangladesh
- Swakkhar Shatabda
- Department of Computer Science and Engineering, United International University, Madani Avenue, Satarkul, Badda, Dhaka 1212, Bangladesh.
8
Arakawa K, Tomita M. The GC Skew Index: A Measure of Genomic Compositional Asymmetry and the Degree of Replicational Selection. Evol Bioinform Online 2017. [DOI: 10.1177/117693430700300006] [Citation(s) in RCA: 14] [Impact Index Per Article: 2.0] [Indexed: 11/17/2022] Open
Abstract
Circular bacterial chromosomes have highly polarized nucleotide composition in the two replichores, and this genomic strand asymmetry can be visualized using GC skew graphs. Here we propose and discuss the GC skew index (GCSI) for the quantification of genomic compositional skew, which combines a normalized measure of fast Fourier transform to capture the shape of the skew graph and Euclidean distance between the two vertices in a cumulative skew graph to represent the degree of skew. We calculated GCSI for all available bacterial genomes, and GCSI correlated well with the visibility of GC skew. This novel index is useful for estimating confidence levels for the prediction of replication origin and terminus by methods based on GC skew and for measuring the strength of replicational selection in a genome.
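The GC skew graphs that the index builds on can be sketched as follows. This is a minimal illustration of windowed GC skew, (G - C) / (G + C), and its cumulative sum; it does not compute GCSI itself, whose Fourier and distance terms are defined in the paper. The window size and function names are assumptions made for the example.

```python
def gc_skew(seq, window):
    """GC skew (G - C) / (G + C) in consecutive non-overlapping windows."""
    seq = seq.upper()
    skews = []
    for start in range(0, len(seq) - window + 1, window):
        chunk = seq[start:start + window]
        g, c = chunk.count("G"), chunk.count("C")
        # A window with no G or C has undefined skew; 0.0 is used as a neutral value.
        skews.append((g - c) / (g + c) if g + c else 0.0)
    return skews

def cumulative_skew(skews):
    """Running sum of the windowed skew values."""
    total, out = 0.0, []
    for s in skews:
        total += s
        out.append(total)
    return out

def skew_extremes(seq, window):
    """Window indices of the cumulative-skew minimum and maximum.

    Assumes the sequence is at least one window long.
    """
    cum = cumulative_skew(gc_skew(seq, window))
    lowest = min(range(len(cum)), key=cum.__getitem__)
    highest = max(range(len(cum)), key=cum.__getitem__)
    return lowest, highest
```

In skew-based methods the cumulative minimum and maximum are conventionally read as candidate positions for the replication origin and terminus. How far apart those two vertices lie is one of the two ingredients the abstract names for GCSI.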
Affiliation(s)
- Kazuharu Arakawa
- Institute for Advanced Biosciences, Keio University, Fujisawa, Kanagawa 252-8520, Japan
- Masaru Tomita
- Institute for Advanced Biosciences, Keio University, Fujisawa, Kanagawa 252-8520, Japan
9
Abstract
Many plasmids have been described in Euryarchaeota, one of the three major archaeal phyla, most of them in salt-loving haloarchaea and hyperthermophilic Thermococcales. These plasmids resemble bacterial plasmids in terms of size (from small plasmids encoding only one gene up to large megaplasmids) and replication mechanisms (rolling circle or theta). Some of them are related to viral genomes and form a more or less continuous sequence space including many integrated elements. Plasmids from Euryarchaeota have been useful for designing efficient genetic tools for these microorganisms. In addition, they have also been used to probe the topological state of plasmids in species with or without DNA gyrase and/or reverse gyrase. Plasmids from Euryarchaeota encode both DNA replication proteins recruited from their hosts and novel families of DNA replication proteins. Euryarchaeota form an interesting playground to test evolutionary hypotheses on the origin and evolution of viruses and plasmids, since a robust phylogeny is available for this phylum. Preliminary studies have shown that for different plasmid families, plasmids share a common gene pool and coevolve with their hosts. They are involved in gene transfer, mostly between plasmids and viruses present in closely related species, but rarely between cells from distantly related archaeal lineages. With few exceptions (e.g., plasmids carrying gas vesicle genes), most archaeal plasmids seem to be cryptic. Interestingly, plasmids and viral genomes have been detected in extracellular membrane vesicles produced by Thermococcales, suggesting that these vesicles could be involved in the transfer of viruses and plasmids between cells.
10
Cossu M, Da Cunha V, Toffano-Nioche C, Forterre P, Oberto J. Comparative genomics reveals conserved positioning of essential genomic clusters in highly rearranged Thermococcales chromosomes. Biochimie 2015; 118:313-21. [PMID: 26166067 PMCID: PMC4640148 DOI: 10.1016/j.biochi.2015.07.008] [Citation(s) in RCA: 13] [Impact Index Per Article: 1.4] [Received: 03/30/2015] [Accepted: 07/08/2015] [Indexed: 12/01/2022]
Abstract
The genomes of the 21 completely sequenced Thermococcales display a characteristic high level of rearrangements. As a result, the prediction of their origin and termination of replication on the sole basis of chromosomal DNA composition or skew is inoperative. Using a different approach based on biologically relevant sequences, we were able to determine oriC position in all 21 genomes. The position of dif, the site where chromosome dimers are resolved before DNA segregation, could be predicted in 19 genomes. Computation of the core genome uncovered a number of essential gene clusters with a remarkably stable chromosomal position across species, in sharp contrast with the scrambled nature of their genomes. The active chromosomal reorganization of numerous genes acquired by horizontal transfer, mainly from mobile elements, could explain this phenomenon. Thermococcales chromosomal landmarks were uncovered using biologically relevant sequences. Core genome procedures predict integration of mobile elements on Thermococcales chromosomes. Thermococcales genomes are highly rearranged but core cluster positions remain invariable. Thermococcales core genes are more expressed and predominantly encoded on the leading strand.
Affiliation(s)
- Matteo Cossu
- Institute of Integrative Cellular Biology, CEA, CNRS, Université Paris Sud, 91405 Orsay, France
- Violette Da Cunha
- Institute of Integrative Cellular Biology, CEA, CNRS, Université Paris Sud, 91405 Orsay, France
- Claire Toffano-Nioche
- Institute of Integrative Cellular Biology, CEA, CNRS, Université Paris Sud, 91405 Orsay, France
- Patrick Forterre
- Institute of Integrative Cellular Biology, CEA, CNRS, Université Paris Sud, 91405 Orsay, France
- Jacques Oberto
- Institute of Integrative Cellular Biology, CEA, CNRS, Université Paris Sud, 91405 Orsay, France
11
Abstract
DNA replication is essential for all life forms. Although the process is fundamentally conserved in the three domains of life, bioinformatic, biochemical, structural, and genetic studies have demonstrated that the process and the proteins involved in archaeal DNA replication are more similar to those in eukaryal DNA replication than in bacterial DNA replication, but have some archaeal-specific features. The archaeal replication system, however, is not monolithic, and there are some differences in the replication process between different species. In this review, the current knowledge of the mechanisms governing DNA replication in Archaea is summarized. The general features of the replication process as well as some of the differences are discussed.
Affiliation(s)
- Lori M Kelman
- Program in Biotechnology, Montgomery College, Germantown, Maryland 20876;
12
Abstract
Evolutionary selection for optimal genome preservation, replication, and expression should yield similar chromosome organizations in any type of cells. And yet, the chromosome organization is surprisingly different between eukaryotes and prokaryotes. The nuclear versus cytoplasmic accommodation of genetic material accounts for the distinct eukaryotic and prokaryotic modes of genome evolution, but it falls short of explaining the differences in the chromosome organization. I propose that the two distinct ways to organize chromosomes are driven by the differences between the global-consecutive chromosome cycle of eukaryotes and the local-concurrent chromosome cycle of prokaryotes. Specifically, progressive chromosome segregation in prokaryotes demands a single duplicon per chromosome, while other "precarious" features of the prokaryotic chromosomes can be viewed as compensations for this severe restriction.
13
Characterization of the replication initiator Orc1/Cdc6 from the Archaeon Picrophilus torridus. J Bacteriol 2013; 196:276-86. [PMID: 24187082 DOI: 10.1128/jb.01020-13] [Citation(s) in RCA: 8] [Impact Index Per Article: 0.7] [Indexed: 12/31/2022] Open
Abstract
Eukaryotic DNA replication is preceded by the assembly of prereplication complexes (pre-RCs) at or very near origins in G1 phase, which licenses origin firing in S phase. The archaeal DNA replication machinery broadly resembles the eukaryal apparatus, though simpler in form. The eukaryotic replication initiator origin recognition complex (ORC), which serially recruits Cdc6 and other pre-RC proteins, comprises six components, Orc1-6. In archaea, a single gene encodes a protein similar to both the eukaryotic Cdc6 and the Orc1 subunit of the eukaryotic ORC, with most archaea possessing one to three Orc1/Cdc6 orthologs. Genome sequence analysis of the extreme acidophile Picrophilus torridus revealed a single Orc1/Cdc6 (PtOrc1/Cdc6). Biochemical analyses show MBP-tagged PtOrc1/Cdc6 to preferentially bind ORB (origin recognition box) sequences. The protein hydrolyzes ATP in a DNA-independent manner, though DNA inhibits MBP-PtOrc1/Cdc6-mediated ATP hydrolysis. PtOrc1/Cdc6 exists in stable complex with PCNA in Picrophilus extracts, and MBP-PtOrc1/Cdc6 interacts directly with PCNA through a PIP box near its C terminus. Furthermore, PCNA stimulates MBP-PtOrc1/Cdc6-mediated ATP hydrolysis in a DNA-dependent manner. This is the first study reporting a direct interaction between Orc1/Cdc6 and PCNA in archaea. The bacterial initiator DnaA is converted from an active to an inactive form by ATP hydrolysis, a process greatly facilitated by the bacterial ortholog of PCNA, the β subunit of Pol III. The stimulation of PtOrc1/Cdc6-mediated ATP hydrolysis by PCNA and the conservation of PCNA-interacting protein motifs in several archaeal PCNAs suggest the possibility of a similar mechanism of regulation existing in archaea. This mechanism may involve other yet to be identified archaeal proteins.
14
Hyrien O, Rappailles A, Guilbaud G, Baker A, Chen CL, Goldar A, Petryk N, Kahli M, Ma E, d'Aubenton-Carafa Y, Audit B, Thermes C, Arneodo A. From simple bacterial and archaeal replicons to replication N/U-domains. J Mol Biol 2013; 425:4673-89. [PMID: 24095859 DOI: 10.1016/j.jmb.2013.09.021] [Citation(s) in RCA: 23] [Impact Index Per Article: 2.1] [Received: 08/12/2013] [Revised: 09/15/2013] [Accepted: 09/19/2013] [Indexed: 10/26/2022]
Abstract
The Replicon Theory proposed 50 years ago has proven to apply for replicons of the three domains of life. Here, we review our knowledge of genome organization into single and multiple replicons in bacteria, archaea and eukarya. Bacterial and archaeal replicator/initiator systems are quite specific and efficient, whereas eukaryotic replicons show degenerate specificity and efficiency, allowing for complex regulation of origin firing time. We expand on recent evidence that ~50% of the human genome is organized as ~1,500 megabase-sized replication domains with a characteristic parabolic (U-shaped) replication timing profile and linear (N-shaped) gradient of replication fork polarity. These N/U-domains correspond to self-interacting segments of the chromatin fiber bordered by open chromatin zones and replicate by cascades of origin firing initiating at their borders and propagating to their center, possibly by fork-stimulated initiation. The conserved occurrence of this replication pattern in the germline of mammals has resulted over evolutionary times in the formation of megabase-sized domains with an N-shaped nucleotide compositional skew profile due to replication-associated mutational asymmetries. Overall, these results reveal an evolutionarily conserved but developmentally plastic organization of replication that is driving mammalian genome evolution.
Affiliation(s)
- Olivier Hyrien
- Ecole Normale Supérieure, IBENS UMR8197 U1024, Paris 75005, France.
15
Abstract
The initiation of DNA replication represents a committing step to cell proliferation. Appropriate replication onset depends on multiprotein complexes that help properly distinguish origin regions, generate nascent replication bubbles, and promote replisome formation. This review describes initiation systems employed by bacteria, archaea, and eukaryotes, with a focus on comparing and contrasting molecular mechanisms among organisms. Although commonalities can be found in the functional domains and strategies used to carry out and regulate initiation, many key participants have markedly different activities and appear to have evolved convergently. Despite significant advances in the field, major questions still persist in understanding how initiation programs are executed at the molecular level.
Affiliation(s)
- Alessandro Costa
- Clare Hall Laboratories, London Research Institute, Cancer Research UK, Hertfordshire, EN6 3LD United Kingdom
- Iris V. Hood
- Department of Molecular and Cell Biology, California Institute for Quantitative Biosciences, University of California, Berkeley, California 94720
- James M. Berger
- Department of Molecular and Cell Biology, California Institute for Quantitative Biosciences, University of California, Berkeley, California 94720
16
Gao F, Luo H, Zhang CT. DoriC 5.0: an updated database of oriC regions in both bacterial and archaeal genomes. Nucleic Acids Res 2012; 41:D90-3. [PMID: 23093601 PMCID: PMC3531139 DOI: 10.1093/nar/gks990] [Citation(s) in RCA: 111] [Impact Index Per Article: 9.3] [Indexed: 11/15/2022] Open
Abstract
Replication of chromosomes is one of the central events in the cell cycle. Chromosome replication begins at specific sites, called origins of replication (oriCs), for all three domains of life. However, the origins of replication still remain unknown in a considerably large number of bacterial and archaeal genomes completely sequenced so far. The increasing availability of complete bacterial and archaeal genomes has created challenges and opportunities for identification of their oriCs in silico, as well as in vivo. Based on the Z-curve theory, we have developed a web-based system Ori-Finder to predict oriCs in bacterial genomes with high accuracy and reliability by taking advantage of comparative genomics, and the predicted oriC regions have been organized into an online database DoriC, which has been publicly available at http://tubic.tju.edu.cn/doric/ since 2007. Five years after we constructed DoriC, the number of bacterial genomes in the database has increased about 4-fold. Additionally, oriC regions in archaeal genomes identified by in vivo experiments, as well as in silico analyses, have also been added to the database. Consequently, the latest release of DoriC contains oriCs for >1500 bacterial genomes and 81 archaeal genomes.
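The Z-curve transform mentioned in the abstract maps a DNA sequence to three cumulative coordinates: purine versus pyrimidine (x), amino versus keto (y), and weak versus strong hydrogen bonding (z). The sketch below computes only these coordinates; it is not Ori-Finder, which combines them with further evidence such as comparative genomics. The function name is hypothetical, and skipping ambiguous bases is an assumption made for the example.

```python
def z_curve(seq):
    """Return the Z-curve points (x_n, y_n, z_n) for each prefix of the sequence.

    x_n = (A + G) - (C + T)   purine vs. pyrimidine
    y_n = (A + C) - (G + T)   amino vs. keto
    z_n = (A + T) - (G + C)   weak vs. strong hydrogen bonding
    where A, C, G, T are cumulative base counts up to position n.
    """
    counts = {"A": 0, "C": 0, "G": 0, "T": 0}
    points = []
    for base in seq.upper():
        if base not in counts:
            # Assumption: ambiguous bases such as N are skipped entirely.
            continue
        counts[base] += 1
        a, c, g, t = counts["A"], counts["C"], counts["G"], counts["T"]
        points.append((a + g - c - t, a + c - g - t, a + t - g - c))
    return points
```

Extremes in components derived from these coordinates are what Z-curve based origin prediction looks for, in the same spirit as cumulative GC skew.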
Affiliation(s)
- Feng Gao
- Department of Physics, Tianjin University, Tianjin 300072, China.
17
Xia X. DNA replication and strand asymmetry in prokaryotic and mitochondrial genomes. Curr Genomics 2012; 13:16-27. [PMID: 22942672 PMCID: PMC3269012 DOI: 10.2174/138920212799034776] [Citation(s) in RCA: 43] [Impact Index Per Article: 3.6] [Received: 07/07/2011] [Revised: 09/26/2011] [Accepted: 10/02/2011] [Indexed: 11/22/2022] Open
Abstract
Different patterns of strand asymmetry have been documented in a variety of prokaryotic genomes as well as mitochondrial genomes. Because different replication mechanisms often lead to different patterns of strand asymmetry, much can be learned of replication mechanisms by examining strand asymmetry. Here I summarize the diverse patterns of strand asymmetry among different taxonomic groups to suggest that (1) the single-origin replication may not be universal among bacterial species as the endosymbionts Wigglesworthia glossinidia, Wolbachia species, cyanobacterium Synechocystis 6803 and Mycoplasma pulmonis genomes all exhibit strand asymmetry patterns consistent with the multiple origins of replication, (2) different replication origins in some archaeal genomes leave quite different patterns of strand asymmetry, suggesting that different replication origins in the same genome may be differentially used, (3) mitochondrial genomes from representative vertebrate species share one strand asymmetry pattern consistent with the strand-displacement replication documented in mammalian mtDNA, suggesting that the mtDNA replication mechanism in mammals may be shared among all vertebrate species, and (4) mitochondrial genomes from primitive forms of metazoans such as the sponge and hydra (representing Porifera and Cnidaria, respectively), as well as those from plants, have strand asymmetry patterns similar to single-origin or multi-origin replications observed in prokaryotes and are drastically different from mitochondrial genomes from other metazoans. This may explain why sponge and hydra mitochondrial genomes, as well as plant mitochondrial genomes, evolves much slower than those from other metazoans.
Affiliation(s)
- Xuhua Xia
- Department of Biology and Center for Advanced Research in Environmental Genomics, University of Ottawa, 30 Marie Curie, P.O. Box 450, Station A, Ottawa, Ontario, Canada
|
18
|
Pelve EA, Lindås AC, Knöppel A, Mira A, Bernander R. Four chromosome replication origins in the archaeon Pyrobaculum calidifontis. Mol Microbiol 2012; 85:986-95. [DOI: 10.1111/j.1365-2958.2012.08155.x] [Citation(s) in RCA: 37] [Impact Index Per Article: 3.1] [Indexed: 01/04/2023]
|
19
|
Ishino Y, Ishino S. Rapid progress of DNA replication studies in Archaea, the third domain of life. SCIENCE CHINA-LIFE SCIENCES 2012; 55:386-403. [PMID: 22645083 DOI: 10.1007/s11427-012-4324-9] [Citation(s) in RCA: 28] [Impact Index Per Article: 2.3] [Received: 03/27/2012] [Accepted: 04/20/2012] [Indexed: 02/04/2023]
Abstract
Archaea, the third domain of life, are interesting organisms to study from the aspects of molecular and evolutionary biology. Archaeal cells have a unicellular ultrastructure without a nucleus, resembling bacterial cells, but the proteins involved in genetic information processing pathways, including DNA replication, transcription, and translation, share strong similarities with those of Eukaryota. Therefore, archaea provide useful model systems to understand the more complex mechanisms of genetic information processing in eukaryotic cells. Moreover, the hyperthermophilic archaea provide very stable proteins, which are especially useful for the isolation of replisomal multicomplexes, to analyze their structures and functions. This review focuses on the history, current status, and future directions of archaeal DNA replication studies.
Affiliation(s)
- Yoshizumi Ishino
- Department of Bioscience and Biotechnology, Graduate School of Bioresource and Bioenvironmental Sciences, Kyushu University, Fukuoka, Japan.
|
20
|
Rajewska M, Wegrzyn K, Konieczny I. AT-rich region and repeated sequences - the essential elements of replication origins of bacterial replicons. FEMS Microbiol Rev 2011; 36:408-34. [PMID: 22092310 DOI: 10.1111/j.1574-6976.2011.00300.x] [Citation(s) in RCA: 78] [Impact Index Per Article: 6.0] [Received: 03/15/2011] [Accepted: 07/07/2011] [Indexed: 11/27/2022] Open
Abstract
Repeated sequences are commonly present in the sites for DNA replication initiation in bacterial, archaeal, and eukaryotic replicons. Those motifs are usually the binding places for replication initiation proteins or replication regulatory factors. In prokaryotic replication origins, the most abundant repeated sequences are DnaA boxes which are the binding sites for chromosomal replication initiation protein DnaA, iterons which bind plasmid or phage DNA replication initiators, defined motifs for site-specific DNA methylation, and 13-nucleotide-long motifs of a not too well-characterized function, which are present within a specific region of replication origin containing higher than average content of adenine and thymine residues. In this review, we describe methods for identifying a replication origin based on the location of an AT-rich region and the arrangement of the origin's structural elements. We describe the regularity of the position and structure of the AT-rich regions in bacterial chromosomes and plasmids. The 13-nucleotide-long repeats present at the AT-rich region, as well as other motifs overlapping them, are highlighted as essential for DNA replication initiation, including origin opening, helicase loading and replication complex assembly. We also summarize the role of AT-rich region repeated sequences in DNA replication regulation.
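As a rough illustration of the idea of locating a region with higher than average adenine and thymine content, the sketch below slides a fixed window along a sequence and reports windows whose A+T fraction meets a cutoff. This is not the method of the review itself; the function name `at_rich_windows` and the window, step and threshold values are assumptions chosen only for the example.

```python
def at_rich_windows(seq, window=50, step=10, threshold=0.7):
    """Return (start, at_fraction) for each window whose A+T fraction
    is at least `threshold`. All parameter defaults are illustrative."""
    seq = seq.upper()
    hits = []
    for start in range(0, len(seq) - window + 1, step):
        chunk = seq[start:start + window]
        # Fraction of adenine and thymine residues in this window
        frac = (chunk.count("A") + chunk.count("T")) / window
        if frac >= threshold:
            hits.append((start, frac))
    return hits


# Toy sequence: a 50-nt AT-only block flanked by GC-only blocks
toy = "GC" * 50 + "AT" * 25 + "GC" * 50
print(at_rich_windows(toy, window=50, step=10, threshold=0.9))
```

On real genomes a composition scan like this would only shortlist candidate regions; the repeated motifs mentioned in the abstract (DnaA boxes, 13-nucleotide repeats) would still have to be checked separately.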
Affiliation(s)
- Magdalena Rajewska
- Department of Molecular and Cellular Biology, Intercollegiate Faculty of Biotechnology, University of Gdansk, Gdansk, Poland
|
21
|
Defining components of the chromosomal origin of replication of the hyperthermophilic archaeon Pyrococcus furiosus needed for construction of a stable replicating shuttle vector. Appl Environ Microbiol 2011; 77:6343-9. [PMID: 21784908 DOI: 10.1128/aem.05057-11] [Citation(s) in RCA: 19] [Impact Index Per Article: 1.5] [Indexed: 01/07/2023] Open
Abstract
We report the construction of a series of replicating shuttle vectors that consist of a low-copy-number cloning vector for Escherichia coli and functional components of the origin of replication (oriC) of the chromosome of the hyperthermophilic archaeon Pyrococcus furiosus. In the process of identifying the minimum replication origin sequence required for autonomous plasmid replication in P. furiosus, we discovered that several features of the origin predicted by bioinformatic analysis and in vitro binding studies were not essential for stable autonomous plasmid replication. A minimum region required to promote plasmid DNA replication was identified, and plasmids based on this sequence readily transformed P. furiosus. The plasmids replicated autonomously and existed in a single copy. In contrast to shuttle vectors based on a plasmid from the closely related hyperthermophile Pyrococcus abyssi for use in P. furiosus, plasmids based on the P. furiosus chromosomal origin were structurally unchanged after transformation and were stable without selection for more than 100 generations.
|
22
|
Dyall-Smith ML, Pfeiffer F, Klee K, Palm P, Gross K, Schuster SC, Rampp M, Oesterhelt D. Haloquadratum walsbyi: limited diversity in a global pond. PLoS One 2011; 6:e20968. [PMID: 21701686 PMCID: PMC3119063 DOI: 10.1371/journal.pone.0020968] [Citation(s) in RCA: 90] [Impact Index Per Article: 6.9] [Received: 03/28/2011] [Accepted: 05/14/2011] [Indexed: 12/03/2022] Open
Abstract
Background Haloquadratum walsbyi commonly dominates the microbial flora of hypersaline waters. Its cells are extremely fragile squares requiring >14%(w/v) salt for growth, properties that should limit its dispersal and promote geographical isolation and divergence. To assess this, the genome sequences of two isolates recovered from sites at near maximum distance on Earth were compared. Principal Findings Both chromosomes are 3.1 Mb in size, and 84% of each sequence was highly similar to the other (98.6% identity), comprising the core sequence. ORFs of this shared sequence were completely syntenic (conserved in genomic orientation and order), without inversion or rearrangement. Strain-specific insertions/deletions could be precisely mapped, often allowing the genetic events to be inferred. Many inferred deletions were associated with short direct repeats (4–20 bp). Deletion-coupled insertions are frequent, producing different sequences at identical positions. In cases where the inserted and deleted sequences are homologous, this leads to variant genes in a common syntenic background (as already described by others). Cas/CRISPR systems are present in C23T but have been lost in HBSQ001 except for a few spacer remnants. Numerous types of mobile genetic elements occur in both strains, most of which appear to be active, and with some specifically targeting others. Strain C23T carries two ∼6 kb plasmids that show similarity to halovirus His1 and to sequences nearby halovirus/plasmid gene clusters commonly found in haloarchaea. Conclusions Deletion-coupled insertions show that Hqr. walsbyi evolves by uptake and precise integration of foreign DNA, probably originating from close relatives. Change is also driven by mobile genetic elements but these do not by themselves explain the atypically low gene coding density found in this species. The remarkable genome conservation despite the presence of active systems for genome rearrangement implies both an efficient global dispersal system, and a high selective fitness for this species.
Affiliation(s)
- Mike L Dyall-Smith
- Department of Membrane Biochemistry, Max-Planck-Institute of Biochemistry, Martinsried, Germany.
|
23
|
Cortez D, Quevillon-Cheruel S, Gribaldo S, Desnoues N, Sezonov G, Forterre P, Serre MC. Evidence for a Xer/dif system for chromosome resolution in archaea. PLoS Genet 2010; 6:e1001166. [PMID: 20975945 PMCID: PMC2958812 DOI: 10.1371/journal.pgen.1001166] [Citation(s) in RCA: 41] [Impact Index Per Article: 2.9] [Received: 03/26/2010] [Accepted: 09/17/2010] [Indexed: 12/02/2022] Open
Abstract
Homologous recombination events between circular chromosomes, occurring during or after replication, can generate dimers that need to be converted to monomers prior to their segregation at cell division. In Escherichia coli, chromosome dimers are converted to monomers by two paralogous site-specific tyrosine recombinases of the Xer family (XerC/D). The Xer recombinases act at a specific dif site located in the replication termination region, assisted by the cell division protein FtsK. This chromosome resolution system has been predicted in most Bacteria and further characterized for some species. Archaea have circular chromosomes and an active homologous recombination system and should therefore resolve chromosome dimers. Most archaea harbour a single homologue of bacterial XerC/D proteins (XerA), but not of FtsK. Therefore, the role of XerA in chromosome resolution was unclear. Here, we have identified dif-like sites in archaeal genomes by using a combination of modeling and comparative genomics approaches. These sites are systematically located in replication termination regions. We validated our in silico prediction by showing that the XerA protein of Pyrococcus abyssi specifically recombines plasmids containing the predicted dif site in vitro. In contrast to the bacterial system, XerA can recombine dif sites in the absence of protein partners. Whereas Archaea and Bacteria use a completely different set of proteins for chromosome replication, our data strongly suggest that XerA is most likely used for chromosome resolution in Archaea.
Bacteria with circular chromosome and active homologous recombination systems have to resolve chromosomal dimers before segregation at cell division. In Escherichia coli, the Xer site-specific recombination system, composed of two recombinases and a specific chromosomal site (dif), is involved in the correct inheritance of the chromosome. The recombination event is tightly regulated by the chromosome translocase FtsK. This chromosome resolution system has been predicted in most bacteria and further characterized for some species. Intriguingly, most archaea possess a gene coding for a recombinase homologous to bacterial Xers, but none have homologues of the bacterial FtsK. We identified the specific target sites for archaeal Xer. This site, present in one copy per chromosome, is located in the replication termination region and shows sequence similarities with bacterial dif sites. In vitro, the archaeal Xer recombines this site in the absence of protein partner. It has been shown that DNA–related proteins from Archaea and Eukarya share a common origin, whereas their analogues in Bacteria have evolved independently. In this context, Eukarya and Archaea would represent sister groups. Therefore, the presence of a shared Xer-dif system between Bacteria and Archaea illustrates the complex origin of modern DNA genomes.
Affiliation(s)
- Diego Cortez
- Institut Pasteur, Unité Biologie Moléculaire du Gène chez les Extrêmophiles, Paris, France
- Sophie Quevillon-Cheruel
- Institut de Biochimie et de Biophysique Moléculaire et Cellulaire, UMR8619-CNRS, Université Paris-Sud 11, IFR115, Orsay, France
- Simonetta Gribaldo
- Institut Pasteur, Unité Biologie Moléculaire du Gène chez les Extrêmophiles, Paris, France
- Nicole Desnoues
- Institut Pasteur, Unité Biologie Moléculaire du Gène chez les Extrêmophiles, Paris, France
- Guennadi Sezonov
- Institut Pasteur, Unité Biologie Moléculaire du Gène chez les Extrêmophiles, Paris, France
- Université Pierre et Marie Curie, Paris, France
- Patrick Forterre
- Institut Pasteur, Unité Biologie Moléculaire du Gène chez les Extrêmophiles, Paris, France
- Institut de Génétique et Microbiologie, Université Paris-Sud 11, UMR8621-CNRS, IFR115, Orsay, France
- Marie-Claude Serre
- Institut de Génétique et Microbiologie, Université Paris-Sud 11, UMR8621-CNRS, IFR115, Orsay, France
|
24
|
von Jan M, Lapidus A, Del Rio TG, Copeland A, Tice H, Cheng JF, Lucas S, Chen F, Nolan M, Goodwin L, Han C, Pitluck S, Liolios K, Ivanova N, Mavromatis K, Ovchinnikova G, Chertkov O, Pati A, Chen A, Palaniappan K, Land M, Hauser L, Chang YJ, Jeffries CD, Saunders E, Brettin T, Detter JC, Chain P, Eichinger K, Huber H, Spring S, Rohde M, Göker M, Wirth R, Woyke T, Bristow J, Eisen JA, Markowitz V, Hugenholtz P, Kyrpides NC, Klenk HP. Complete genome sequence of Archaeoglobus profundus type strain (AV18). Stand Genomic Sci 2010; 2:327-46. [PMID: 21304717 PMCID: PMC3035285 DOI: 10.4056/sigs.942153] [Citation(s) in RCA: 18] [Impact Index Per Article: 1.3] [Indexed: 11/20/2022] Open
Abstract
Archaeoglobus profundus (Burggraf et al. 1990) is a hyperthermophilic archaeon in the euryarchaeal class Archaeoglobi, which is currently represented by the single family Archaeoglobaceae, containing six validly named species and two strains ascribed to the genus 'Geoglobus' which is taxonomically challenged as the corresponding type species has no validly published name. All members were isolated from marine hydrothermal habitats and are obligate anaerobes. Here we describe the features of the organism, together with the complete genome sequence and annotation. This is the second completed genome sequence of a member of the class Archaeoglobi. The 1,563,423 bp genome with its 1,858 protein-coding and 52 RNA genes is a part of the Genomic Encyclopedia of Bacteria and Archaea project.
|
25
|
Akita M, Adachi A, Takemura K, Yamagami T, Matsunaga F, Ishino Y. Cdc6/Orc1 from Pyrococcus furiosus may act as the origin recognition protein and Mcm helicase recruiter. Genes Cells 2010; 15:537-52. [PMID: 20384788 DOI: 10.1111/j.1365-2443.2010.01402.x] [Citation(s) in RCA: 10] [Impact Index Per Article: 0.7] [Indexed: 11/27/2022]
Abstract
Archaea have one or more Cdc6/Orc1 proteins, which share sequence similarities with eukaryotic Cdc6 and Orc1. These proteins are involved in the initiation process of DNA replication, although their specific function has not been elucidated, except for origin recognition and binding. We showed that the Cdc6/Orc1 protein from the hyperthermophilic archaeon Pyrococcus furiosus specifically binds to the oriC region in the whole genome. However, it remains unclear how this initiator protein specifically recognizes the oriC region and how the Mcm helicase is recruited to oriC. In the current study, we characterized the biochemical properties of Cdc6/Orc1 in P. furiosus. The ATPase activity of the Cdc6/Orc1 protein was completely suppressed by binding to DNA containing the origin recognition box (ORB). Limited proteolysis and DNase I-footprint experiments suggested that the Cdc6/Orc1 protein changes its conformation on the ORB sequence in the presence of ATP. This conformational change may have an unknown, important function in the initiation process. Results from an in vitro recruiting assay indicated that Mcm is recruited onto the oriC region in a Cdc6/Orc1-dependent, but not ATP-dependent, manner. However, some other function is required for the functional loading of this helicase to start the unwinding of the replication fork DNA.
Affiliation(s)
- Masaki Akita
- Graduate School of Bioresource and Bioenvironmental Sciences, Kyushu University, 6-10-1 Hakozaki, Higashi-ku, Fukuoka, Fukuoka 812-8581, Japan
|
26
|
Matsunaga F, Takemura K, Akita M, Adachi A, Yamagami T, Ishino Y. Localized melting of duplex DNA by Cdc6/Orc1 at the DNA replication origin in the hyperthermophilic archaeon Pyrococcus furiosus. Extremophiles 2009; 14:21-31. [PMID: 19787415 DOI: 10.1007/s00792-009-0284-9] [Citation(s) in RCA: 13] [Impact Index Per Article: 0.9] [Received: 06/23/2009] [Accepted: 09/14/2009] [Indexed: 10/20/2022]
Abstract
The initiation step is a key process to regulate the frequency of DNA replication. Although recent studies in Archaea defined the origin of DNA replication (oriC) and the Cdc6/Orc1 homolog as an origin recognition protein, the location and mechanism of duplex opening have remained unclear. We have found that Cdc6/Orc1 binds to oriC and unwinds duplex DNA in the hyperthermophilic archaeon Pyrococcus furiosus, by means of a P1 endonuclease assay. A primer extension analysis further revealed that this localized unwinding occurs in the oriC region at a specific site, which is 12-bp long and rich in adenine and thymine. This site is different from the predicted duplex unwinding element (DUE) that we reported previously. We also discovered that Cdc6/Orc1 induces topological changes in supercoiled oriC DNA, and that this process is dependent on the AAA+ domain. These results indicate that topological alterations of oriC DNA by Cdc6/Orc1 introduce a single-stranded region at the 12-mer site, that could possibly serve as an entry point for Mcm helicase.
Affiliation(s)
- Fujihiko Matsunaga
- Department of Genetic Resources Technology, Graduate School of Bioresource and Bioenvironmental Sciences, Kyushu University, 6-10-1 Hakozaki, Higashi-ku, Fukuoka, Fukuoka 812-8581, Japan
|
27
|
Wigley DB. ORC proteins: marking the start. Curr Opin Struct Biol 2009; 19:72-8. [PMID: 19217277 DOI: 10.1016/j.sbi.2008.12.010] [Citation(s) in RCA: 26] [Impact Index Per Article: 1.7] [Received: 10/08/2008] [Accepted: 12/19/2008] [Indexed: 11/27/2022]
Abstract
The DNA replication apparatus of archaea is more closely related to that of eukaryotes than eubacteria. Furthermore, recent work has shown that archaea, like eukaryotes, have multiple replication origins. Biochemical data are starting to reveal how archaeal origin binding proteins recognise and remodel origin DNA sequences. Crystal structures of archaeal replication origin binding proteins complexed with their DNA targets revealed details of how they interact with origins and showed that they introduce significant deformations of the DNA. Although these recent advances provide insight about the initial interactions of proteins at archaeal replication origins, the molecular mechanisms of origin assembly and firing still remain elusive.
Affiliation(s)
- Dale B Wigley
- Cancer Research UK Clare Hall Laboratories, The London Research Institute, Blanche Lane, South Mimms, Potters Bar, Herts, UK.
|
28
|
Podar M, Anderson I, Makarova KS, Elkins JG, Ivanova N, Wall MA, Lykidis A, Mavromatis K, Sun H, Hudson ME, Chen W, Deciu C, Hutchison D, Eads JR, Anderson A, Fernandes F, Szeto E, Lapidus A, Kyrpides NC, Saier MH, Richardson PM, Rachel R, Huber H, Eisen JA, Koonin EV, Keller M, Stetter KO. A genomic analysis of the archaeal system Ignicoccus hospitalis-Nanoarchaeum equitans. Genome Biol 2008; 9:R158. [PMID: 19000309 PMCID: PMC2614490 DOI: 10.1186/gb-2008-9-11-r158] [Citation(s) in RCA: 89] [Impact Index Per Article: 5.6] [Received: 09/05/2008] [Revised: 10/21/2008] [Accepted: 11/10/2008] [Indexed: 01/03/2023] Open
Abstract
Sequencing of the complete genome of Ignicoccus hospitalis gives insight into its association with another species of Archaea, Nanoarchaeum equitans. Background The relationship between the hyperthermophiles Ignicoccus hospitalis and Nanoarchaeum equitans is the only known example of a specific association between two species of Archaea. Little is known about the mechanisms that enable this relationship. Results We sequenced the complete genome of I. hospitalis and found it to be the smallest among independent, free-living organisms. A comparative genomic reconstruction suggests that the I. hospitalis lineage has lost most of the genes associated with a heterotrophic metabolism that is characteristic of most of the Crenarchaeota. A streamlined genome is also suggested by a low frequency of paralogs and fragmentation of many operons. However, this process appears to be partially balanced by lateral gene transfer from archaeal and bacterial sources. Conclusions A combination of genomic and cellular features suggests highly efficient adaptation to the low energy yield of sulfur-hydrogen respiration and efficient inorganic carbon and nitrogen assimilation. Evidence of lateral gene exchange between N. equitans and I. hospitalis indicates that the relationship has impacted both genomes. This association is the simplest symbiotic system known to date and a unique model for studying mechanisms of interspecific relationships at the genomic and metabolic levels.
Affiliation(s)
- Mircea Podar
- Biosciences Division, Oak Ridge National Laboratory, 1 Bethel Valley Rd, Oak Ridge, TN 37831, USA.
|
29
|
Duderstadt KE, Berger JM. AAA+ ATPases in the initiation of DNA replication. Crit Rev Biochem Mol Biol 2008; 43:163-87. [PMID: 18568846 DOI: 10.1080/10409230802058296] [Citation(s) in RCA: 67] [Impact Index Per Article: 4.2] [Indexed: 10/21/2022]
Abstract
All cellular organisms and many viruses rely on large, multi-subunit molecular machines, termed replisomes, to ensure that genetic material is accurately duplicated for transmission from one generation to the next. Replisome assembly is facilitated by dedicated initiator proteins, which serve to both recognize replication origins and recruit requisite replisomal components to the DNA in a cell-cycle coordinated manner. Exactly how initiators accomplish this task, and the extent to which initiator mechanisms are conserved among different organisms, have remained outstanding issues. Recent structural and biochemical findings have revealed that all cellular initiators, as well as the initiators of certain classes of double-stranded DNA viruses, possess a common adenine nucleotide-binding fold belonging to the ATPases Associated with various cellular Activities (AAA+) family. This review focuses on how the AAA+ domain has been recruited and adapted to control the initiation of DNA replication, and how the use of this ATPase module underlies a common set of initiator assembly states and functions. How biochemical and structural properties correlate with initiator activity, and how species-specific modifications give rise to unique initiator functions, are also discussed.
Affiliation(s)
- Karl E Duderstadt
- Department Molecular and Cell Biology and Biophysics Graduate Group, California Institute for Quantitative Biology, University of California, Berkeley, California 94720-3220, USA.
|
30
|
Sernova NV, Gelfand MS. Identification of replication origins in prokaryotic genomes. Brief Bioinform 2008; 9:376-91. [PMID: 18660512 DOI: 10.1093/bib/bbn031] [Citation(s) in RCA: 47] [Impact Index Per Article: 2.9] [Indexed: 11/12/2022] Open
Abstract
The availability of hundreds of complete bacterial genomes has created both new challenges and new opportunities for bioinformatics. In the area of statistical analysis of genomic sequences, the studies of nucleotide compositional bias and gene bias between strands and replichores paved the way for the development of tools for prediction of bacterial replication origins. Only a few (about 20) origin regions for eubacteria and archaea have been proven experimentally. One reason for that may be that this is now considered essentially a bioinformatics problem, where predictions are sufficiently reliable not to run labor-intensive experiments, unless specifically needed. Here we describe the main existing approaches to the identification of replication origin (oriC) and termination (terC) loci in prokaryotic chromosomes and characterize a number of computational tools based on various skew types and other types of evidence. We also classify the eubacterial and archaeal chromosomes by predictability of their replication origins using skew plots. Finally, we discuss possible combined approaches to the identification of the oriC sites that may be used to improve the prediction tools, in particular, the analysis of DnaA binding sites using the comparative genomic methods.
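A minimal sketch of one widely used skew type, the cumulative GC skew, may help make the approach concrete. It assumes the common heuristic that, in many bacterial chromosomes, the cumulative (G − C)/(G + C) curve reaches its minimum near oriC and its maximum near terC; this does not hold for every genome, and the function names and window size below are hypothetical choices for illustration, not any specific tool described in the review.

```python
def cumulative_gc_skew(seq, window=1000):
    """Return (window_starts, cumulative_skew) over non-overlapping windows."""
    seq = seq.upper()
    starts, cumulative = [], []
    total = 0.0
    for start in range(0, len(seq) - window + 1, window):
        chunk = seq[start:start + window]
        g, c = chunk.count("G"), chunk.count("C")
        # GC skew of this window; defined as 0 when the window has no G or C
        total += (g - c) / (g + c) if (g + c) else 0.0
        starts.append(start)
        cumulative.append(total)
    return starts, cumulative


def predict_ori_ter(seq, window=1000):
    """Guess (oriC, terC) coordinates as the ends of the windows where the
    cumulative skew is lowest and highest, respectively (heuristic only)."""
    starts, cumulative = cumulative_gc_skew(seq, window)
    if not cumulative:
        raise ValueError("sequence is shorter than one window")
    ori_idx = min(range(len(cumulative)), key=cumulative.__getitem__)
    ter_idx = max(range(len(cumulative)), key=cumulative.__getitem__)
    return starts[ori_idx] + window, starts[ter_idx] + window


# Toy input: a C-rich stretch followed by a G-rich stretch
print(predict_ori_ter("C" * 3000 + "G" * 3000, window=1000))
```

Genomes with multiple origins, as reported for several archaea in this list, would be expected to show several local extrema rather than a single minimum, so a single-extremum picker like this one is at best a first pass.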
Affiliation(s)
- Natalia V Sernova
- Institute for Information Transmission Problems (Kharkevich Institute), Russian Academy of Sciences, Bolshoi Karetny pereulok, 19, Moscow, 127994, Russia
|
31
|
Berthon J, Cortez D, Forterre P. Genomic context analysis in Archaea suggests previously unrecognized links between DNA replication and translation. Genome Biol 2008; 9:R71. [PMID: 18400081 PMCID: PMC2643942 DOI: 10.1186/gb-2008-9-4-r71] [Citation(s) in RCA: 20] [Impact Index Per Article: 1.3] [Received: 12/21/2007] [Revised: 02/22/2008] [Accepted: 04/09/2008] [Indexed: 11/05/2022] Open
Abstract
Specific functional interactions of proteins involved in DNA replication and/or DNA repair or transcription might occur in Archaea, suggesting a previously unrecognized regulatory network coupling DNA replication and translation, which might also exist in Eukarya. Background Comparative analysis of genomes is valuable to explore evolution of genomes, deduce gene functions, or predict functional linking between proteins. Here, we have systematically analyzed the genomic environment of all known DNA replication genes in 27 archaeal genomes to infer new connections for DNA replication proteins from conserved genomic associations. Results Two distinct sets of DNA replication genes frequently co-localize in archaeal genomes: the first includes the genes for PCNA, the small subunit of the DNA primase (PriS), and Gins15; the second comprises the genes for MCM and Gins23. Other genomic associations of genes encoding proteins involved in informational processes that may be functionally relevant at the cellular level have also been noted; in particular, the association between the genes for PCNA, transcription factor S, and NudF. Surprisingly, a conserved cluster of genes coding for proteins involved in translation or ribosome biogenesis (S27E, L44E, aIF-2 alpha, Nop10) is almost systematically contiguous to the group of genes coding for PCNA, PriS, and Gins15. The functional relevance of this cluster encoding proteins conserved in Archaea and Eukarya is strongly supported by statistical analysis. Interestingly, the gene encoding the S27E protein, also known as metallopanstimulin 1 (MPS-1) in human, is overexpressed in multiple cancer cell lines. Conclusion Our genome context analysis suggests specific functional interactions for proteins involved in DNA replication between each other or with proteins involved in DNA repair or transcription. Furthermore, it suggests a previously unrecognized regulatory network coupling DNA replication and translation in Archaea that may also exist in Eukarya.
Affiliation(s)
- Jonathan Berthon
- Univ. Paris-Sud 11, CNRS, UMR8621, Institut de Génétique et Microbiologie, 91405 Orsay CEDEX, France.
|
32
|
Abstract
To date, methanogens are the only group within the archaea where firing DNA replication origins have not been demonstrated in vivo. In the present study we show that a previously identified cluster of ORB (origin recognition box) sequences do indeed function as an origin of replication in vivo in the archaeon Methanothermobacter thermautotrophicus. Although the consensus sequence of ORBs in M. thermautotrophicus is somewhat conserved when compared with ORB sequences in other archaea, the Cdc6-1 protein from M. thermautotrophicus (termed MthCdc6-1) displays sequence-specific binding that is selective for the MthORB sequence and does not recognize ORBs from other archaeal species. Stabilization of in vitro MthORB DNA binding by MthCdc6-1 requires additional conserved sequences 3' to those originally described for M. thermautotrophicus. By testing synthetic sequences bearing mutations in the MthORB consensus sequence, we show that Cdc6/ORB binding is critically dependent on the presence of an invariant guanine found in all archaeal ORB sequences. Mutation of a universally conserved arginine residue in the recognition helix of the winged helix domain of archaeal Cdc6-1 shows that specific origin sequence recognition is dependent on the interaction of this arginine residue with the invariant guanine. Recognition of a mutated origin sequence can be achieved by mutation of the conserved arginine residue to a lysine or glutamine residue. Thus despite a number of differences in protein and DNA sequences between species, the mechanism of origin recognition and binding appears to be conserved throughout the archaea.
|
33
|
Touchon M, Rocha EPC. From GC skews to wavelets: a gentle guide to the analysis of compositional asymmetries in genomic data. Biochimie 2007; 90:648-59. [PMID: 17988781 DOI: 10.1016/j.biochi.2007.09.015] [Citation(s) in RCA: 32] [Impact Index Per Article: 1.9] [Received: 06/18/2007] [Accepted: 09/21/2007] [Indexed: 12/29/2022]
Abstract
Compositional asymmetries are pervasive in DNA sequences. They are the result of the asymmetric interactions between DNA and cellular mechanisms such as replication and transcription. Here, we review many of the methods that have been proposed over the years to analyse compositional asymmetries in DNA sequences. Among these we list GC skews, oligonucleotide skews and wavelets, which among other uses have been extensively employed to delimit origins and termini of replication in genomes. We also review the use of multivariate methods, such as factorial correspondence analysis, discriminant analysis and analysis of variance, which allow assigning compositional strand asymmetries to the different biological processes shaping sequence composition. Finally, we review methods that have been used to infer substitution matrices and allow understanding the mutational processes underlying strand asymmetry. We focus on replication asymmetries because they have been more thoroughly studied, but the methods may be adapted, and often are, to other problems. Although strand asymmetry has been studied more frequently through compositional skews of nucleotides or oligonucleotides, we recall that, depending on the goal of the analysis, other methods may be more appropriate to answer certain biological questions. We also refer to programs freely available to analyse strand asymmetry.
Affiliation(s)
- Marie Touchon
- Atelier de Bioinformatique, Université Pierre et Marie Curie-Paris 6, Paris, France
34
Affiliation(s)
- Roxana E Georgescu
- Laboratory of DNA Replication, Howard Hughes Medical Institute, Rockefeller University, New York, NY 10065, USA
35
Gaudier M, Schuwirth BS, Westcott SL, Wigley DB. Structural basis of DNA replication origin recognition by an ORC protein. Science 2007; 317:1213-6. [PMID: 17761880 DOI: 10.1126/science.1143664] [Citation(s) in RCA: 118] [Impact Index Per Article: 6.9] [Indexed: 11/02/2022]
Abstract
DNA replication in archaea and in eukaryotes shares many similarities. We report the structure of an archaeal origin recognition complex protein, ORC1, bound to an origin recognition box, a DNA sequence that is found in multiple copies at replication origins. DNA binding is mediated principally by a C-terminal winged helix domain that inserts deeply into the major and minor grooves, widening them both. However, additional DNA contacts are made with the N-terminal AAA+ domain, which inserts into the minor groove at a characteristic G-rich sequence, inducing a 35° bend in the duplex and providing directionality to the binding site. Both contact regions also induce substantial unwinding of the DNA. The structure provides insight into the initial step in assembly of a replication origin and recruitment of minichromosome maintenance (MCM) helicase to that origin.
Affiliation(s)
- Martin Gaudier
- Cancer Research UK Clare Hall Laboratories, London Research Institute, Blanche Lane, South Mimms, Potters Bar, Herts EN6 3LD, UK
36
Matsunaga F, Glatigny A, Mucchielli-Giorgi MH, Agier N, Delacroix H, Marisa L, Durosay P, Ishino Y, Aggerbeck L, Forterre P. Genomewide and biochemical analyses of DNA-binding activity of Cdc6/Orc1 and Mcm proteins in Pyrococcus sp. Nucleic Acids Res 2007; 35:3214-22. [PMID: 17452353 PMCID: PMC1904270 DOI: 10.1093/nar/gkm212] [Citation(s) in RCA: 28] [Impact Index Per Article: 1.6] [Indexed: 12/02/2022] Open
Abstract
The origin of DNA replication (oriC) of the hyperthermophilic archaeon Pyrococcus abyssi contains multiple ORB and mini-ORB repeats that show sequence similarities to other archaeal ORBs (origin recognition boxes). Using chromatin immunoprecipitation coupled with hybridization on a whole-genome microarray (ChIP-chip), we report here that the binding of Cdc6/Orc1 to a 5 kb region containing oriC in vivo was highly specific in both exponential and stationary phases. The oriC region is practically the sole binding site for Cdc6/Orc1, thereby distinguishing oriC in the 1.8 Mbp genome. We found that the 5 kb region contains a previously unnoticed cluster of ORB and mini-ORB repeats in the gene encoding the small subunit (dp1) of DNA polymerase II (PolD). ChIP and gel retardation analyses further revealed that Cdc6/Orc1 specifically binds both of the ORB clusters in oriC and dp1. The organization of the ORB clusters in dp1 and oriC is conserved during evolution in the order Thermococcales, suggesting a role in the initiation of DNA replication. Our ChIP-chip analysis also revealed that Mcm alters its binding specificity to the oriC region according to the growth phase, consistent with its role as a licensing factor.
Affiliation(s)
- Fujihiko Matsunaga, Annie Glatigny, Marie-Hélène Mucchielli-Giorgi, Nicolas Agier, Hervé Delacroix, Laetitia Marisa, Patrice Durosay, Yoshizumi Ishino, Lawrence Aggerbeck, Patrick Forterre*
- Institut de Génétique et Microbiologie, UMR8621, Bât. 409, Université Paris-Sud, 91405 Orsay Cedex, France, Department of Genetic Resources Technology, Faculty of Agriculture, Kyushu University, Fukuoka 812-8581, Japan and Gif/Orsay DNA Microarray Platform (GODMAP), Centre de Génétique Moléculaire UPR2167, Centre National de la Recherche Scientifique, 91198 Gif-sur-Yvette, Associated with the Université Pierre et Marie Curie-Paris 6, Paris F-75005, France
- *To whom correspondence should be addressed. Tel: +33 1 69 157489; Fax: +33 1 69 157808;
37
Norais C, Hawkins M, Hartman AL, Eisen JA, Myllykallio H, Allers T. Genetic and physical mapping of DNA replication origins in Haloferax volcanii. PLoS Genet 2007; 3:e77. [PMID: 17511521 PMCID: PMC1868953 DOI: 10.1371/journal.pgen.0030077] [Citation(s) in RCA: 111] [Impact Index Per Article: 6.5] [Received: 12/01/2006] [Accepted: 03/05/2007] [Indexed: 11/18/2022] Open
Abstract
The halophilic archaeon Haloferax volcanii has a multireplicon genome, consisting of a main chromosome, three secondary chromosomes, and a plasmid. Genes for the initiator protein Cdc6/Orc1, which are commonly located adjacent to archaeal origins of DNA replication, are found on all replicons except plasmid pHV2. However, prediction of DNA replication origins in H. volcanii is complicated by the fact that this species has no less than 14 cdc6/orc1 genes. We have used a combination of genetic, biochemical, and bioinformatic approaches to map DNA replication origins in H. volcanii. Five autonomously replicating sequences were found adjacent to cdc6/orc1 genes and replication initiation point mapping was used to confirm that these sequences function as bidirectional DNA replication origins in vivo. Pulsed field gel analyses revealed that cdc6/orc1-associated replication origins are distributed not only on the main chromosome (2.9 Mb) but also on pHV1 (86 kb), pHV3 (442 kb), and pHV4 (690 kb) replicons. Gene inactivation studies indicate that linkage of the initiator gene to the origin is not required for replication initiation, and genetic tests with autonomously replicating plasmids suggest that the origin located on pHV1 and pHV4 may be dominant to the principal chromosomal origin. The replication origins we have identified appear to show a functional hierarchy or differential usage, which might reflect the different replication requirements of their respective chromosomes. We propose that duplication of H. volcanii replication origins was a prerequisite for the multireplicon structure of this genome, and that this might provide a means for chromosome-specific replication control under certain growth conditions. Our observations also suggest that H. volcanii is an ideal organism for studying how replication of four replicons is regulated in the context of the archaeal cell cycle.
Affiliation(s)
- Cédric Norais
- Institut de Génétique et Microbiologie, Université Paris-Sud, Orsay, France
- CNRS, UMR8621, Orsay, France
- Michelle Hawkins
- Institute of Genetics, University of Nottingham, Nottingham, United Kingdom
- Amber L Hartman
- Johns Hopkins University, Baltimore, Maryland, United States of America
- Jonathan A Eisen
- The Institute for Genomic Research, Rockville, Maryland, United States of America
- Hannu Myllykallio
- Institut de Génétique et Microbiologie, Université Paris-Sud, Orsay, France
- CNRS, UMR8621, Orsay, France
- * To whom correspondence should be addressed. E-mail: (HM); (TA)
- Thorsten Allers
- Institute of Genetics, University of Nottingham, Nottingham, United Kingdom
- * To whom correspondence should be addressed. E-mail: (HM); (TA)
38
Arakawa K, Saito R, Tomita M. Noise-reduction filtering for accurate detection of replication termini in bacterial genomes. FEBS Lett 2006; 581:253-8. [PMID: 17188685 DOI: 10.1016/j.febslet.2006.12.021] [Citation(s) in RCA: 13] [Impact Index Per Article: 0.7] [Received: 09/11/2006] [Revised: 12/05/2006] [Accepted: 12/08/2006] [Indexed: 11/27/2022]
Abstract
Bacterial chromosomes are highly polarized in their nucleotide composition through mutational selection related to replication. Using compositional skews such as the GC skew, replication origin and terminus can be predicted in silico by observing the shift points. However, the genome sequence is affected by myriad functional requirements and selection on numerous subgenomic features, and elimination of this "noise" should lead to better predictions. Here, we present a noise-reduction approach that uses low-pass filtering through Fast Fourier transform coupled with cumulative skew graphs. It increases the prediction accuracy of the replication termini compared with previously documented methods based on genomic base composition.
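The general idea of low-pass filtering a skew profile before building the cumulative graph can be sketched as follows. This is only a minimal illustration under stated assumptions: a naive O(n²) discrete Fourier transform stands in for the Fast Fourier transform the abstract mentions, the cutoff `keep` (number of low frequencies retained) is a hypothetical parameter, and none of the function names or values come from the paper, whose actual filter settings are not given in the abstract.

```python
import cmath
from itertools import accumulate


def dft(values):
    """Naive discrete Fourier transform (stand-in for an FFT routine)."""
    n = len(values)
    return [
        sum(values[t] * cmath.exp(-2j * cmath.pi * k * t / n) for t in range(n))
        for k in range(n)
    ]


def inverse_dft(coeffs):
    """Inverse transform; returns the real part of each reconstructed sample."""
    n = len(coeffs)
    return [
        sum(coeffs[k] * cmath.exp(2j * cmath.pi * k * t / n) for k in range(n)).real / n
        for t in range(n)
    ]


def low_pass(values, keep):
    """Zero every component above `keep` cycles per sequence (hypothetical cutoff)."""
    n = len(values)
    coeffs = dft(values)
    for k in range(n):
        # Index k and n - k describe the same frequency for real-valued input.
        if min(k, n - k) > keep:
            coeffs[k] = 0
    return inverse_dft(coeffs)


def filtered_cumulative_skew(skews, keep):
    """Cumulative sum of the low-pass filtered windowed skew values."""
    return list(accumulate(low_pass(skews, keep)))


# Toy skew profile: a slow strand switch plus rapidly alternating "noise".
base = [-1.0] * 4 + [1.0] * 4
noisy = [b + (0.5 if i % 2 == 0 else -0.5) for i, b in enumerate(base)]
smooth = low_pass(noisy, keep=3)
curve = filtered_cumulative_skew(noisy, keep=3)
```

In this toy case the alternating component sits entirely at the highest frequency, so the filter recovers the underlying profile; for genome-scale data an FFT implementation such as `numpy.fft` would be the practical choice.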
Affiliation(s)
- Kazuharu Arakawa
- Institute for Advanced Biosciences, Keio University, Fujisawa 252-8520, Japan
39
Rocha EPC, Touchon M, Feil EJ. Similar compositional biases are caused by very different mutational effects. Genome Res 2006; 16:1537-47. [PMID: 17068325 PMCID: PMC1665637 DOI: 10.1101/gr.5525106] [Citation(s) in RCA: 71] [Impact Index Per Article: 3.9] [Indexed: 11/24/2022]
Abstract
Compositional replication strand bias, commonly referred to as GC skew, is present in many genomes of prokaryotes, eukaryotes, and viruses. Although cytosine deamination in ssDNA (resulting in C→T changes on the leading strand) is often invoked as its major cause, the precise contributions of this and other substitution types are currently unknown. It is also unclear if the underlying mutational asymmetries are the same among taxa, are stable over time, or how close the observed biases are to mutational equilibrium. We analyzed nearly neutral sites of seven taxa each with between three and six complete bacterial genomes, and inferred the substitution spectra of fourfold degenerate positions in nonhighly expressed genes. Using a bootstrap procedure, we extracted compositional biases associated with replication and identified the significant asymmetries. Although all taxa showed an overrepresentation of G relative to C on the leading strand (and imbalances between A and T), widely variable substitution asymmetries are noted. Surprisingly, all substitution types show significant asymmetry in at least one taxon, but none were universally biased in all taxa. Notably, in the two most biased genomes, A→G, rather than C→T, shapes the compositional bias. Given the variability in these biases, we propose that the process is multifactorial. Finally, we also find that most genomes are not at compositional equilibrium, and suggest that mutational-based heterotachy is deeply imprinted in the history of biological macromolecules. This shows that similar compositional biases associated with the same essential well-conserved process, replication, do not reflect similar mutational processes in different genomes, and that caution is required in inferring the roles of specific mutational biases on the basis of contemporary patterns of sequence composition.
Affiliation(s)
- Eduardo P C Rocha
- Unité Génétique des Génomes Bactériens, URA 2171, Institut Pasteur, 75015 Paris, France.
40
Maeder DL, Anderson I, Brettin TS, Bruce DC, Gilna P, Han CS, Lapidus A, Metcalf WW, Saunders E, Tapia R, Sowers KR. The Methanosarcina barkeri genome: comparative analysis with Methanosarcina acetivorans and Methanosarcina mazei reveals extensive rearrangement within methanosarcinal genomes. J Bacteriol 2006; 188:7922-31. [PMID: 16980466 PMCID: PMC1636319 DOI: 10.1128/jb.00810-06] [Citation(s) in RCA: 126] [Impact Index Per Article: 7.0] [Indexed: 11/20/2022] Open
Abstract
We report here a comparative analysis of the genome sequence of Methanosarcina barkeri with those of Methanosarcina acetivorans and Methanosarcina mazei. The genome of M. barkeri is distinguished by having an organization that is well conserved with respect to the other Methanosarcina spp. in the region proximal to the origin of replication, with interspecies gene similarities as high as 95%. However, it is disordered and marked by increased transposase frequency and decreased gene synteny and gene density in the distal semigenome. Of the 3,680 open reading frames (ORFs) in M. barkeri, 746 had homologs with better than 80% identity to both M. acetivorans and M. mazei, while 128 nonhypothetical ORFs were unique (nonorthologous) among these species, including a complete formate dehydrogenase operon, genes required for N-acetylmuramic acid synthesis, a 14-gene gas vesicle cluster, and a bacterial-like P450-specific ferredoxin reductase cluster not previously observed or characterized for this genus. A cryptic 36-kbp plasmid sequence that contains an orc1 gene flanked by a presumptive origin of replication consisting of 38 tandem repeats of a 143-nucleotide motif was detected in M. barkeri. Three-way comparison of these genomes reveals differing mechanisms for the accrual of changes. Elongation of the relatively large M. acetivorans genome is the result of uniformly distributed multiple gene scale insertions and duplications, while the M. barkeri genome is characterized by localized inversions associated with the loss of gene content. In contrast, the short M. mazei genome most closely approximates the putative ancestral organizational state of these species.
Affiliation(s)
- Dennis L Maeder
- Center of Marine Biotechnology, University of Maryland Biotechnology Institute, 701 E. Pratt Street, Baltimore, MD 21202, USA
41
Ozaki S, Fujimitsu K, Kurumizaka H, Katayama T. The DnaA homolog of the hyperthermophilic eubacterium Thermotoga maritima forms an open complex with a minimal 149-bp origin region in an ATP-dependent manner. Genes Cells 2006; 11:425-38. [PMID: 16611245 DOI: 10.1111/j.1365-2443.2006.00950.x] [Citation(s) in RCA: 29] [Impact Index Per Article: 1.6] [Indexed: 11/28/2022]
Abstract
In Escherichia coli, ATP-DnaA, but not ADP-DnaA, forms an initiation complex that undergoes site-specific duplex DNA unwinding, open complex formation. However, it remains unclear how highly the ATP-dependent activation of the initiation factor is conserved in evolution. The hyperthermophile Thermotoga maritima is one of the most ancient eubacteria in evolution. Here, we show that the DnaA homolog (tmaDnaA) of this bacterium forms open complexes with the predicted origin region (tma-oriC) in vitro. TmaDnaA has a strong and specific affinity for ATP/ADP as well as for 12-mer repeating sequences within the tma-oriC. Unlike ADP-tmaDnaA, ATP-tmaDnaA is highly cooperative in DNA binding and forms open complexes in a manner that depends on temperature and the superhelical tension of the tma-oriC-bearing plasmid. The minimal tma-oriC required for unwinding is a 149-bp region containing five repeats of the 12-mer sequence and two AT-rich 9-mer repeats. TmaDnaA-binding to the 12-mer motif provokes DNA bending. The 9-mer region is the duplex-unwinding site. The tmaDnaA-binding and unwinding motifs of tma-oriC share sequence homology with corresponding archaeal and eukaryotic sequences. These findings suggest that the ATP-dependent molecular switch of the initiator and the mechanisms in the replication initiation complex are highly conserved in eubacterial evolution.
Affiliation(s)
- Shogo Ozaki
- Department of Molecular Biology, Graduate School of Pharmaceutical Sciences, Kyushu University, 3-1-1 Maidashi, Higashi-ku, Fukuoka 812-8582, Japan
42
Grainge I, Gaudier M, Schuwirth BS, Westcott SL, Sandall J, Atanassova N, Wigley DB. Biochemical analysis of a DNA replication origin in the archaeon Aeropyrum pernix. J Mol Biol 2006; 363:355-69. [PMID: 16978641 DOI: 10.1016/j.jmb.2006.07.076] [Citation(s) in RCA: 42] [Impact Index Per Article: 2.3] [Received: 04/28/2006] [Revised: 07/26/2006] [Accepted: 07/26/2006] [Indexed: 11/29/2022]
Abstract
We have characterised the interaction of the Aeropyrum pernix origin recognition complex proteins (ORC1 and ORC2) with DNA using DNase I footprinting. Each protein binds upstream of its respective gene. However, ORC1 protein alone interacts more tightly with an additional region containing multiple origin recognition box (ORB) sites that we show to be a replication origin. At this origin, there are four ORB elements disposed either side of an A+T-rich region. An ORC1 protein dimer binds at each of these ORB sites. Once all four ORB sites have bound ORC1 protein, there is a transition to a higher-order assembly with a defined alteration in topology and superhelicity. Furthermore, after this transition, the A+T-rich region becomes sensitive to digestion by DNase I and P1 nuclease, revealing that the transition promotes distortion of the DNA in this region, presumably as a prelude to loading of MCM helicase.
Affiliation(s)
- Ian Grainge
- Cancer Research UK, Clare Hall Laboratories, The London Research Institute, Blanche Lane, South Mimms, Potters Bar, Herts EN6 3LD, UK
43
Kasiviswanathan R, Shin JH, Kelman Z. DNA binding by the Methanothermobacter thermautotrophicus Cdc6 protein is inhibited by the minichromosome maintenance helicase. J Bacteriol 2006; 188:4577-80. [PMID: 16740965 PMCID: PMC1482948 DOI: 10.1128/jb.00168-06] [Citation(s) in RCA: 16] [Impact Index Per Article: 0.9] [Indexed: 11/20/2022] Open
Abstract
The Cdc6 proteins from the archaeon Methanothermobacter thermautotrophicus were previously shown to bind double-stranded DNA. It is shown here that the proteins also bind single-stranded DNA. Using minichromosome maintenance (MCM) helicase mutant proteins unable to bind DNA, it was found that the interaction of MCM with Cdc6 inhibits the DNA binding activity of Cdc6.
Affiliation(s)
- Rajesh Kasiviswanathan
- University of Maryland Biotechnology Institute, Center for Advanced Research in Biotechnology, 9600 Gudelsky Drive, Rockville, MD 20850, USA
44
Yamashiro K, Yokobori SI, Oshima T, Yamagishi A. Structural analysis of the plasmid pTA1 isolated from the thermoacidophilic archaeon Thermoplasma acidophilum. Extremophiles 2006; 10:327-35. [PMID: 16493526 DOI: 10.1007/s00792-005-0502-z] [Citation(s) in RCA: 10] [Impact Index Per Article: 0.6] [Received: 09/02/2005] [Accepted: 12/04/2005] [Indexed: 11/25/2022]
Abstract
Thermoplasma acidophilum is a thermoacidophilic archaeon that grows optimally at pH 1.8 and 56 °C and has no cell wall. Plasmid pTA1 was found in some strains of the species. We sequenced plasmid pTA1 and analyzed the open reading frames (ORFs). pTA1 was found to be a circular DNA molecule of 15,723 bp. Eighteen ORFs were found; none of the gene products except ORF1 had sequence similarity to known proteins. ORF1 showed similarity to Cdc6, which is involved in genome-replication initiation in Eukarya and Archaea. T. acidophilum has two Cdc6 homologues in the genome. Among all of the Cdc6 family proteins, the homologue found in pTA1 is most similar to Tvo3, one of the three Cdc6 homologues found in the genome of Thermoplasma volcanium. The phylogenetic analysis suggested that plasmid pTA1 possibly originated from the chromosomal DNA of Thermoplasma.
Affiliation(s)
- Kan Yamashiro
- Department of Molecular Biology, Tokyo University of Pharmacy and Life Science, 1432-1 Horinouchi, Hachioji, Tokyo 192-0392, Japan
45
Worning P, Jensen LJ, Hallin PF, Staerfeldt HH, Ussery DW. Origin of replication in circular prokaryotic chromosomes. Environ Microbiol 2006; 8:353-61. [PMID: 16423021 DOI: 10.1111/j.1462-2920.2005.00917.x] [Citation(s) in RCA: 87] [Impact Index Per Article: 4.8] [Indexed: 11/27/2022]
Abstract
To predict origins of replication in prokaryotic chromosomes, we analyse the leading and lagging strands of 200 chromosomes for differences in oligomer composition and show that these correlate strongly with taxonomic grouping, lifestyle and molecular details of the replication process. While all bacteria have a preference for Gs over Cs on the leading strand, we discover that the direction of the A/T skew is determined by the polymerase-alpha subunit that replicates the leading strand. The strength of the strand bias varies greatly between both phyla and environments and appears to correlate with growth rate. Finally we observe much greater diversity of skew among archaea than among bacteria. We have developed a program that accurately locates the origins of replication by measuring the differences between leading and lagging strand of all oligonucleotides up to 8 bp in length. The program and results for all publicly available genomes are available from http://www.cbs.dtu.dk/services/GenomeAtlas/suppl/origin.
Affiliation(s)
- Peder Worning
- Biological Sciences, AstraZeneca R and D Lund, Sweden
46
Kasiviswanathan R, Shin JH, Kelman Z. Interactions between the archaeal Cdc6 and MCM proteins modulate their biochemical properties. Nucleic Acids Res 2005; 33:4940-50. [PMID: 16150924 PMCID: PMC1201339 DOI: 10.1093/nar/gki807] [Citation(s) in RCA: 36] [Impact Index Per Article: 1.9] [Indexed: 12/12/2022] Open
Abstract
The origin recognition complex, Cdc6 and the minichromosome maintenance (MCM) complex play essential roles in the initiation of eukaryotic DNA replication. Homologs of these proteins may play similar roles in archaeal replication initiation. While the interactions among the eukaryotic initiation proteins are well documented, the protein-protein interactions between the archaeal proteins have not yet been determined. Here, an extensive structural and functional analysis of the interactions between the Methanothermobacter thermautotrophicus MCM and the two Cdc6 proteins (Cdc6-1 and -2) identified in the organism is described. The main contact between Cdc6 and MCM occurs via the N-terminal portion of the MCM protein. It was found that Cdc6-MCM interaction, but not Cdc6-DNA binding, plays the predominant role in regulating MCM helicase activity. In addition, the data showed that the interactions with MCM modulate the autophosphorylation of Cdc6-1 and -2. The results also suggest that MCM and DNA may compete for Cdc6-1 protein binding. The implications of these observations for the initiation of archaeal DNA replication are discussed.
Affiliation(s)
- Zvi Kelman
- To whom correspondence should be addressed. Tel: +1 240 314 6294; Fax: +1 240 314 6255;
47
Abstract
Replication of DNA is essential for the propagation of life. It is somewhat surprising then that, despite the vital nature of this process, cellular organisms show a great deal of variety in the mechanisms that they employ to ensure appropriate genome duplication. This diversity is manifested along classical evolutionary lines, with distinct combinations of replicon architecture and replication proteins being found in the three domains of life: the Bacteria, the Eukarya and the Archaea. Furthermore, although there are mechanistic parallels, even within a given domain of life, the way origins of replication are defined shows remarkable variation.
Affiliation(s)
- Nicholas P Robinson
- MRC Cancer Cell Unit, Hutchison MRC Research Centre, Hills Road, Cambridge, UK
48
Zhang R, Zhang CT. Identification of replication origins in archaeal genomes based on the Z-curve method. ARCHAEA-AN INTERNATIONAL MICROBIOLOGICAL JOURNAL 2005; 1:335-46. [PMID: 15876567 PMCID: PMC2685548 DOI: 10.1155/2005/509646] [Citation(s) in RCA: 79] [Impact Index Per Article: 4.2] [Indexed: 11/18/2022]
Abstract
The Z-curve is a three-dimensional curve that constitutes a unique representation of a DNA sequence, i.e., both the Z-curve and the given DNA sequence can be uniquely reconstructed from the other. We employed Z-curve analysis to identify one replication origin in the Methanocaldococcus jannaschii genome, two replication origins in the Halobacterium species NRC-1 genome and one replication origin in the Methanosarcina mazei genome. One of the predicted replication origins of Halobacterium species NRC-1 is the same as a replication origin later identified by in vivo experiments. The Z-curve analysis of the Sulfolobus solfataricus P2 genome suggested the existence of three replication origins, which is also consistent with later experimental results. This review aims to summarize applications of the Z-curve in identifying replication origins of archaeal genomes, and to provide clues about the locations of as yet unidentified replication origins of the Aeropyrum pernix K1, Methanococcus maripaludis S2, Picrophilus torridus DSM 9790 and Pyrobaculum aerophilum str. IM2 genomes.
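The claim that the curve and the sequence determine each other can be made concrete with a small sketch. It uses the usual Z-curve definition, in which the three coordinates after n bases are cumulative differences: purines minus pyrimidines (x), amino minus keto bases (y), and weak minus strong hydrogen-bonding bases (z). The function names are illustrative assumptions, and ambiguous bases such as N are deliberately not handled.

```python
# Per-base step in (x, y, z):
#   x: purine (A, G) vs pyrimidine (C, T)
#   y: amino (A, C) vs keto (G, T)
#   z: weak H-bond (A, T) vs strong H-bond (C, G)
STEPS = {
    "A": (1, 1, 1),
    "C": (-1, 1, -1),
    "G": (1, -1, -1),
    "T": (-1, -1, 1),
}
BASES = {step: base for base, step in STEPS.items()}


def z_curve(seq):
    """Return the list of cumulative (x, y, z) points, one per base."""
    x = y = z = 0
    points = []
    for base in seq.upper():
        dx, dy, dz = STEPS[base]  # raises KeyError on ambiguous bases
        x, y, z = x + dx, y + dy, z + dz
        points.append((x, y, z))
    return points


def sequence_from_z_curve(points):
    """Recover the DNA sequence from consecutive differences of the curve."""
    prev = (0, 0, 0)
    bases = []
    for point in points:
        step = tuple(p - q for p, q in zip(point, prev))
        bases.append(BASES[step])
        prev = point
    return "".join(bases)


curve = z_curve("ACGT")
restored = sequence_from_z_curve(z_curve("GATTACA"))
```

Because each base maps to a distinct step, the round trip is lossless; origin prediction then looks for extremes in individual components of the curve rather than in the sequence itself.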
Affiliation(s)
- Ren Zhang
- Department of Epidemiology and Biostatistics, Tianjin Cancer Institute and Hospital, Tianjin 300060, China
- Chun-Ting Zhang
- Department of Physics, Tianjin University, Tianjin 300072, China
- Corresponding author
49
Capaldi SA, Berger JM. Biochemical characterization of Cdc6/Orc1 binding to the replication origin of the euryarchaeon Methanothermobacter thermoautotrophicus. Nucleic Acids Res 2004; 32:4821-32. [PMID: 15358831 PMCID: PMC519113 DOI: 10.1093/nar/gkh819] [Citation(s) in RCA: 48] [Impact Index Per Article: 2.4] [Indexed: 11/14/2022] Open
Abstract
Archaeal cell division cycle protein 6 (Cdc6)/Origin Replication Complex subunit 1 (Orc1) proteins share sequence homology with eukaryotic DNA replication initiation factors but are also structurally similar to the bacterial initiator DnaA. To better understand whether Cdc6/Orc1 functions in a eukaryotic or bacterial-like manner, we have characterized the interaction of two Cdc6/Orc1 paralogs (mthCdc6-1 and mthCdc6-2) with the replication origin from Methanothermobacter thermoautotrophicus. We show that while both proteins display a low affinity for a small dsDNA of random sequence, mthCdc6-1 binds tightly to a short duplex containing a single copy of a 13 bp sequence that is repeated throughout the origin. Surprisingly, sequence comparisons show that this 13 bp sequence is a minimized version of the Origin Recognition Box element found in many euryarchaeotal origins. Analysis of mthCdc6-1 mutants demonstrates that the helix-turn-helix motif in the winged-helix domain mediates the interaction with this sequence. Association of both mthCdc6/Orc1 paralogs with the duplex containing the minimized Origin Recognition Box fits to an independent binding sites model, but their interaction with longer DNA ligands is cooperative. Together, our data provide the first detailed biophysical characterization of the association of an archaeal DNA replication initiator with its origin. Our observations also indicate that the origin-binding properties of Cdc6/Orc1 proteins closely resemble those of bacterial DnaA.
Affiliation(s)
- Stephanie A Capaldi
- Department of Molecular and Cell Biology, 227 Hildebrand Hall #3206, University of California Berkeley, Berkeley, CA 94720, USA
50
Contursi P, Pisani FM, Grigoriev A, Cannio R, Bartolucci S, Rossi M. Identification and autonomous replication capability of a chromosomal replication origin from the archaeon Sulfolobus solfataricus. Extremophiles 2004; 8:385-91. [PMID: 15480865 DOI: 10.1007/s00792-004-0399-y] [Citation(s) in RCA: 20] [Impact Index Per Article: 1.0] [Received: 11/18/2003] [Accepted: 05/10/2004] [Indexed: 11/29/2022]
Abstract
Here, we describe the identification of a chromosomal DNA replication origin (oriC) from the hyperthermophilic archaeon Sulfolobus solfataricus (subdomain of Crenarchaeota). By means of a cumulative GC-skew analysis of the Sulfolobus genome sequence, a candidate oriC was mapped within a 1.12-kb region located between the two divergently transcribed MCM- and cdc6-like genes. We demonstrated that plasmids containing the Sulfolobus oriC sequence and a hygromycin-resistance selectable marker were maintained in an episomal state in transformed S. solfataricus cells under selective pressure. The proposed location of the origin was confirmed by 2-D gel electrophoresis experiments. This is the first report on the functional cloning of a chromosomal oriC from an archaeon and represents an important step toward the reconstitution of an archaeal in vitro DNA replication system.
Affiliation(s)
- Patrizia Contursi
- Dipartimento di Chimica Biologica, Università degli Studi di Napoli, Via Mezzocannone, 16, 80134, Napoli, Italy