1
Liu Y, Wu X, d'Aubenton-Carafa Y, Thermes C, Chen CL. OKseqHMM: a genome-wide replication fork directionality analysis toolkit. Nucleic Acids Res 2023; 51:e22. [PMID: 36629249 PMCID: PMC9976876 DOI: 10.1093/nar/gkac1239] [Citation(s) in RCA: 1] [Impact Index Per Article: 1.0] [Received: 05/09/2022] [Revised: 12/06/2022] [Accepted: 12/19/2022] [Indexed: 01/12/2023] Open
Abstract
During each cell division, tens of thousands of DNA replication origins are co-ordinately activated to ensure the complete duplication of the human genome. However, replication fork progression can be challenged by many factors, including co-directional and head-on transcription-replication conflicts (TRC). Head-on TRCs are more dangerous for genome integrity. To study the direction of replication fork movement and TRCs, we developed a bioinformatics toolkit called OKseqHMM (https://github.com/CL-CHEN-Lab/OK-Seq, https://doi.org/10.5281/zenodo.7428883). Then, we used OKseqHMM to analyse a large number of datasets obtained by Okazaki fragment sequencing to directly measure the genome-wide replication fork directionality (RFD) and to accurately predict replication initiation and termination at a fine resolution in organisms including yeast, mouse and human. We also successfully applied our analysis to other genome-wide sequencing techniques that also contain RFD information (e.g. eSPAN, TrAEL-seq). Our toolkit can be used to predict replication initiation and fork progression direction genome-wide in a wide range of cell models and growth conditions. Comparing the replication and transcription directions allows identifying loci at risk of TRCs, particularly head-on TRCs, and investigating their role in genome instability by checking DNA damage data, which is of prime importance for human health.
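The comparison of replication and transcription directions mentioned in this abstract can be sketched in a few lines. This is a minimal illustration, not code from OKseqHMM: the function name `classify_trc`, the `min_abs_rfd` cutoff and the sign convention (positive RFD meaning predominantly rightward-moving forks) are assumptions made for the example.

```python
def classify_trc(gene_strand, mean_rfd, min_abs_rfd=0.2):
    """Label a gene by the orientation of transcription relative to replication.

    gene_strand: '+' (transcribed left to right) or '-' (right to left).
    mean_rfd: mean replication fork directionality over the gene, in [-1, 1];
        assumed convention: positive means mostly rightward-moving forks.
    min_abs_rfd: hypothetical cutoff below which fork direction is treated as mixed.
    """
    if gene_strand not in ('+', '-'):
        raise ValueError("gene_strand must be '+' or '-'")
    if abs(mean_rfd) < min_abs_rfd:
        return 'undetermined'
    fork_rightward = mean_rfd > 0
    transcription_rightward = gene_strand == '+'
    # Same direction: co-directional conflict; opposite direction: head-on conflict.
    return 'co-directional' if fork_rightward == transcription_rightward else 'head-on'


# Toy input: (gene name, strand, mean RFD over the gene body).
genes = [('geneA', '+', 0.6), ('geneB', '-', 0.5), ('geneC', '+', 0.05)]
labels = {name: classify_trc(strand, rfd) for name, strand, rfd in genes}
```

Genes labelled `head-on` would then be the candidates to cross-check against DNA damage data, as the abstract suggests.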
Affiliation(s)
- Yaqun Liu
  - Institut Curie, Université PSL, Sorbonne Université, CNRS UMR3244, Dynamics of Genetic Information, 75005 Paris, France
- Xia Wu
  - Institut Curie, Université PSL, Sorbonne Université, CNRS UMR3244, Dynamics of Genetic Information, 75005 Paris, France
- Yves d'Aubenton-Carafa
  - Université Paris-Saclay, CEA, CNRS, Institute for Integrative Biology of the Cell (I2BC), 91198 Gif-sur-Yvette, France
- Claude Thermes
  - Université Paris-Saclay, CEA, CNRS, Institute for Integrative Biology of the Cell (I2BC), 91198 Gif-sur-Yvette, France
- Chun-Long Chen
  - Institut Curie, Université PSL, Sorbonne Université, CNRS UMR3244, Dynamics of Genetic Information, 75005 Paris, France
2
Genome-wide measurement of DNA replication fork directionality and quantification of DNA replication initiation and termination with Okazaki fragment sequencing. Nat Protoc 2023; 18:1260-1295. [PMID: 36653528 DOI: 10.1038/s41596-022-00793-5] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 03/23/2021] [Accepted: 11/09/2022] [Indexed: 01/19/2023]
Abstract
Studying the dynamics of genome replication in mammalian cells has been historically challenging. To reveal the location of replication initiation and termination in the human genome, we developed Okazaki fragment sequencing (OK-seq), a quantitative approach based on the isolation and strand-specific sequencing of Okazaki fragments, the lagging strand replication intermediates. OK-seq quantitates the proportion of leftward- and rightward-oriented forks at every genomic locus and reveals the location and efficiency of replication initiation and termination events. Here we provide the detailed experimental procedures for performing OK-seq in unperturbed cultured human cells and budding yeast and the bioinformatics pipelines for data processing and computation of replication fork directionality. Furthermore, we present the analytical approach based on a hidden Markov model, which allows automated detection of ascending, descending and flat replication fork directionality segments revealing the zones of replication initiation, termination and unidirectional fork movement across the entire genome. These tools are essential for the accurate interpretation of human and yeast replication programs. The experiments and the data processing can be accomplished within six days. Besides revealing the genome replication program in fine detail, OK-seq has been instrumental in numerous studies unravelling mechanisms of genome stability, epigenome maintenance and genome evolution.
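As a rough illustration of the quantities described in this abstract, the sketch below computes replication fork directionality (RFD) per genomic window and labels the change between neighbouring windows. It assumes the usual definition RFD = (R − L)/(R + L), where R and L are the counts of rightward- and leftward-moving forks inferred from strand-specific Okazaki fragment reads. The protocol's segmentation uses a hidden Markov model; the simple difference threshold used here (`flat_tol`, a hypothetical parameter) is a stand-in and not the published method.

```python
def rfd_profile(rightward, leftward):
    """RFD per window = (R - L) / (R + L); None where no fork was counted."""
    profile = []
    for r, l in zip(rightward, leftward):
        total = r + l
        profile.append((r - l) / total if total else None)
    return profile


def label_slopes(profile, flat_tol=0.05):
    """Label each step between adjacent windows by the sign of the RFD change."""
    labels = []
    for a, b in zip(profile, profile[1:]):
        if a is None or b is None:
            labels.append('unknown')
        elif b - a > flat_tol:
            labels.append('ascending')   # candidate initiation zone
        elif a - b > flat_tol:
            labels.append('descending')  # candidate termination zone
        else:
            labels.append('flat')        # candidate unidirectional fork movement
    return labels


# Toy fork counts per window (illustrative numbers only).
rightward = [10, 30, 50, 50, 20]
leftward = [40, 30, 10, 10, 40]
profile = rfd_profile(rightward, leftward)
labels = label_slopes(profile)
```

With real data, the windows would be much noisier than in this toy input, which is one reason a probabilistic segmentation such as an HMM is preferable to a fixed threshold.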
3
Theulot B, Lacroix L, Arbona JM, Millot GA, Jean E, Cruaud C, Pellet J, Proux F, Hennion M, Engelen S, Lemainque A, Audit B, Hyrien O, Le Tallec B. Genome-wide mapping of individual replication fork velocities using nanopore sequencing. Nat Commun 2022; 13:3295. [PMID: 35676270 PMCID: PMC9177527 DOI: 10.1038/s41467-022-31012-0] [Citation(s) in RCA: 9] [Impact Index Per Article: 4.5] [Received: 03/03/2022] [Accepted: 05/26/2022] [Indexed: 01/27/2023] Open
Abstract
Little is known about replication fork velocity variations along eukaryotic genomes, since reference techniques to determine fork speed either provide no sequence information or suffer from low throughput. Here we present NanoForkSpeed, a nanopore sequencing-based method to map and extract the velocity of individual forks detected as tracks of the thymidine analogue bromodeoxyuridine incorporated during a brief pulse-labelling of asynchronously growing cells. NanoForkSpeed retrieves previous Saccharomyces cerevisiae mean fork speed estimates (≈2 kb/min) in the BT1 strain exhibiting highly efficient bromodeoxyuridine incorporation and wild-type growth, and precisely quantifies speed changes in cells with altered replisome progression or exposed to hydroxyurea. The positioning of >125,000 fork velocities provides a genome-wide map of fork progression based on individual fork rates, showing a uniform fork speed across yeast chromosomes except for a marked slowdown at known pausing sites.
Affiliation(s)
- Bertrand Theulot
  - Institut de Biologie de l'Ecole Normale Supérieure (IBENS), Ecole Normale Supérieure, CNRS, INSERM, Université PSL, 46 rue d'Ulm, F-75005, Paris, France
  - Sorbonne Université, Collège Doctoral, F-75005, Paris, France
- Laurent Lacroix
  - Institut de Biologie de l'Ecole Normale Supérieure (IBENS), Ecole Normale Supérieure, CNRS, INSERM, Université PSL, 46 rue d'Ulm, F-75005, Paris, France
- Jean-Michel Arbona
  - Laboratoire de Biologie et Modélisation de la Cellule, Ecole Normale Supérieure de Lyon, CNRS, UMR5239, INSERM, U1293, Université Claude Bernard Lyon 1, 46 allée d'Italie, F-69364, Lyon, France
- Gael A Millot
  - Institut Pasteur, Université Paris Cité, Bioinformatics and Biostatistics Hub, F-75015, Paris, France
- Etienne Jean
  - Institut de Biologie de l'Ecole Normale Supérieure (IBENS), Ecole Normale Supérieure, CNRS, INSERM, Université PSL, 46 rue d'Ulm, F-75005, Paris, France
- Corinne Cruaud
  - Genoscope, Institut de biologie François-Jacob, Commissariat à l'Energie Atomique (CEA), Université Paris-Saclay, Evry, France
- Jade Pellet
  - Institut de Biologie de l'Ecole Normale Supérieure (IBENS), Ecole Normale Supérieure, CNRS, INSERM, Université PSL, 46 rue d'Ulm, F-75005, Paris, France
- Florence Proux
  - Institut de Biologie de l'Ecole Normale Supérieure (IBENS), Ecole Normale Supérieure, CNRS, INSERM, Université PSL, 46 rue d'Ulm, F-75005, Paris, France
- Magali Hennion
  - Université Paris Cité, Epigenetics and Cell Fate, UMR7216, CNRS, Paris, 75013, France
- Stefan Engelen
  - Génomique Métabolique, Genoscope, Institut François Jacob, CEA, CNRS, Univ. Evry, Université Paris-Saclay, 91057, Evry, France
- Arnaud Lemainque
  - Genoscope, Institut de biologie François-Jacob, Commissariat à l'Energie Atomique (CEA), Université Paris-Saclay, Evry, France
- Benjamin Audit
  - ENSL, CNRS, Laboratoire de physique, F-69342, Lyon, France
- Olivier Hyrien
  - Institut de Biologie de l'Ecole Normale Supérieure (IBENS), Ecole Normale Supérieure, CNRS, INSERM, Université PSL, 46 rue d'Ulm, F-75005, Paris, France
- Benoît Le Tallec
  - Institut de Biologie de l'Ecole Normale Supérieure (IBENS), Ecole Normale Supérieure, CNRS, INSERM, Université PSL, 46 rue d'Ulm, F-75005, Paris, France
4
Organization of DNA Replication Origin Firing in Xenopus Egg Extracts: The Role of Intra-S Checkpoint. Genes (Basel) 2021; 12:genes12081224. [PMID: 34440398 PMCID: PMC8394201 DOI: 10.3390/genes12081224] [Citation(s) in RCA: 3] [Impact Index Per Article: 1.0] [Received: 07/02/2021] [Revised: 08/02/2021] [Accepted: 08/04/2021] [Indexed: 12/11/2022] Open
Abstract
During cell division, the duplication of the genome starts at multiple positions called replication origins. Origin firing requires the interaction of rate-limiting factors with potential origins during the S(ynthesis)-phase of the cell cycle. Origins fire as synchronous clusters, a pattern proposed to be regulated by the intra-S checkpoint. By modelling the replication patterns of single DNA molecules from Xenopus sperm chromatin replicated in egg extracts under unchallenged, checkpoint-inhibited and Chk1-overexpressing conditions, we demonstrate that quantitative modelling of the data requires: (1) a segmentation of the genome into regions of low and high probability of origin firing; (2) that regions with high probability of origin firing escape intra-S checkpoint regulation; and (3) that the variability of the rate of DNA synthesis close to replication forks be taken into account in order to describe the dynamics of replication origin firing. This model implies that the observed origin clustering emerges from the apparent synchrony of origin firing in regions with high probability of origin firing, and it challenges the assumption that the intra-S checkpoint is the main regulator of origin clustering.
5
Blin M, Lacroix L, Petryk N, Jaszczyszyn Y, Chen CL, Hyrien O, Le Tallec B. DNA molecular combing-based replication fork directionality profiling. Nucleic Acids Res 2021; 49:e69. [PMID: 33836085 PMCID: PMC8266662 DOI: 10.1093/nar/gkab219] [Citation(s) in RCA: 5] [Impact Index Per Article: 1.7] [Received: 12/23/2020] [Revised: 03/15/2021] [Accepted: 03/19/2021] [Indexed: 01/05/2023] Open
Abstract
The replication strategy of metazoan genomes is still unclear, mainly because definitive maps of replication origins are missing. High-throughput methods are based on population average and thus may exclusively identify efficient initiation sites, whereas inefficient origins go undetected. Single-molecule analyses of specific loci can detect both common and rare initiation events along the targeted regions. However, these usually concentrate on positioning individual events, which only gives an overview of the replication dynamics. Here, we computed the replication fork directionality (RFD) profiles of two large genes in different transcriptional states in chicken DT40 cells, namely untranscribed and transcribed DMD and CCSER1 expressed at WT levels or overexpressed, by aggregating hundreds of oriented replication tracks detected on individual DNA fibres stretched by molecular combing. These profiles reconstituted RFD domains composed of zones of initiation flanking a zone of termination originally observed in mammalian genomes and were highly consistent with independent population-averaging profiles generated by Okazaki fragment sequencing. Importantly, we demonstrate that inefficient origins do not appear as detectable RFD shifts, explaining why dispersed initiation has remained invisible to population-based assays. Our method can both generate quantitative profiles and identify discrete events, thereby constituting a comprehensive approach to study metazoan genome replication.
Affiliation(s)
- Marion Blin
  - Département de Gastro-entérologie, pôle MAD, Assistance Publique des Hôpitaux de Marseille, Centre Hospitalier Universitaire de Marseille, Marseille, France
- Laurent Lacroix
  - Institut de Biologie de l’Ecole Normale Supérieure (IBENS), Ecole Normale Supérieure, CNRS, INSERM, Université PSL, 46 rue d’Ulm, F-75005 Paris, France
- Nataliya Petryk
  - Institut de Biologie de l’Ecole Normale Supérieure (IBENS), Ecole Normale Supérieure, CNRS, INSERM, Université PSL, 46 rue d’Ulm, F-75005 Paris, France
  - Université Paris-Saclay, CEA, CNRS, Institute for Integrative Biology of the Cell (I2BC), F-91198 Gif-sur-Yvette, France
- Yan Jaszczyszyn
  - Université Paris-Saclay, CEA, CNRS, Institute for Integrative Biology of the Cell (I2BC), F-91198 Gif-sur-Yvette, France
- Chun-Long Chen
  - Institut Curie, Université PSL, Sorbonne Université, CNRS UMR3244, F-75005 Paris, France
- Olivier Hyrien
  - Institut de Biologie de l’Ecole Normale Supérieure (IBENS), Ecole Normale Supérieure, CNRS, INSERM, Université PSL, 46 rue d’Ulm, F-75005 Paris, France
- Benoît Le Tallec
  - Institut de Biologie de l’Ecole Normale Supérieure (IBENS), Ecole Normale Supérieure, CNRS, INSERM, Université PSL, 46 rue d’Ulm, F-75005 Paris, France
6
Kirstein N, Buschle A, Wu X, Krebs S, Blum H, Kremmer E, Vorberg IM, Hammerschmidt W, Lacroix L, Hyrien O, Audit B, Schepers A. Human ORC/MCM density is low in active genes and correlates with replication time but does not delimit initiation zones. eLife 2021; 10:62161. [PMID: 33683199 PMCID: PMC7993996 DOI: 10.7554/elife.62161] [Citation(s) in RCA: 21] [Impact Index Per Article: 7.0] [Received: 08/16/2020] [Accepted: 03/05/2021] [Indexed: 12/22/2022] Open
Abstract
Eukaryotic DNA replication initiates during S phase from origins that have been licensed in the preceding G1 phase. Here, we compare ChIP-seq profiles of the licensing factors Orc2, Orc3, Mcm3, and Mcm7 with gene expression, replication timing, and fork directionality profiles obtained by RNA-seq, Repli-seq, and OK-seq. Both the origin recognition complex (ORC) and the minichromosome maintenance complex (MCM) are significantly and homogeneously depleted from transcribed genes, enriched at gene promoters, and more abundant in early- than in late-replicating domains. Surprisingly, after controlling for these variables, no difference in ORC/MCM density is detected between initiation zones, termination zones, unidirectionally replicating regions, and randomly replicating regions. Therefore, ORC/MCM density correlates with replication timing but does not solely regulate the probability of replication initiation. Interestingly, H4K20me3, a histone modification proposed to facilitate late origin licensing, was enriched in late-replicating initiation zones and gene deserts of stochastic replication fork direction. We discuss potential mechanisms specifying when and where replication initiates in human cells.
Affiliation(s)
- Nina Kirstein
  - Research Unit Gene Vectors, Helmholtz Zentrum München (GmbH), German Research Center for Environmental Health, Munich, Germany
- Alexander Buschle
  - Research Unit Gene Vectors, Helmholtz Zentrum München (GmbH), German Research Center for Environmental Health and German Center for Infection Research (DZIF), Munich, Germany
- Xia Wu
  - Institut de Biologie de l'ENS (IBENS), Département de Biologie, Ecole Normale Supérieure, CNRS, Inserm, PSL Research University, Paris, France
- Stefan Krebs
  - Laboratory for Functional Genome Analysis (LAFUGA), Gene Center of the Ludwig-Maximilians Universität (LMU) München, Munich, Germany
- Helmut Blum
  - Laboratory for Functional Genome Analysis (LAFUGA), Gene Center of the Ludwig-Maximilians Universität (LMU) München, Munich, Germany
- Elisabeth Kremmer
  - Institute for Molecular Immunology, Monoclonal Antibody Core Facility, Helmholtz Zentrum München, German Research Center for Environmental Health (GmbH), Neuherberg, Germany
- Ina M Vorberg
  - German Center for Neurodegenerative Diseases (DZNE e.V.), Bonn, Germany
  - Rheinische Friedrich-Wilhelms-Universität Bonn, Bonn, Germany
- Wolfgang Hammerschmidt
  - Research Unit Gene Vectors, Helmholtz Zentrum München (GmbH), German Research Center for Environmental Health and German Center for Infection Research (DZIF), Munich, Germany
- Laurent Lacroix
  - Institut de Biologie de l'ENS (IBENS), Département de Biologie, Ecole Normale Supérieure, CNRS, Inserm, PSL Research University, Paris, France
- Olivier Hyrien
  - Institut de Biologie de l'ENS (IBENS), Département de Biologie, Ecole Normale Supérieure, CNRS, Inserm, PSL Research University, Paris, France
- Benjamin Audit
  - Univ Lyon, ENS de Lyon, Univ. Claude Bernard, CNRS, Laboratoire de Physique, 69342 Lyon, France
- Aloys Schepers
  - Research Unit Gene Vectors, Helmholtz Zentrum München (GmbH), German Research Center for Environmental Health, Munich, Germany
7
Hennion M, Arbona JM, Lacroix L, Cruaud C, Theulot B, Le Tallec B, Proux F, Wu X, Novikova E, Engelen S, Lemainque A, Audit B, Hyrien O. FORK-seq: replication landscape of the Saccharomyces cerevisiae genome by nanopore sequencing. Genome Biol 2020; 21:125. [PMID: 32456659 PMCID: PMC7251829 DOI: 10.1186/s13059-020-02013-3] [Citation(s) in RCA: 28] [Impact Index Per Article: 7.0] [Received: 07/31/2019] [Accepted: 04/10/2020] [Indexed: 12/17/2022] Open
Abstract
Genome replication mapping methods profile cell populations, masking cell-to-cell heterogeneity. Here, we describe FORK-seq, a nanopore sequencing method to map replication of single DNA molecules at 200-nucleotide resolution. By quantifying BrdU incorporation along pulse-chased replication intermediates from Saccharomyces cerevisiae, we orient 58,651 replication tracks reproducing population-based replication directionality profiles and map 4964 and 4485 individual initiation and termination events, respectively. Although most events cluster at known origins and fork merging zones, 9% and 18% of initiation and termination events, respectively, occur at many locations previously missed. Thus, FORK-seq reveals the full extent of cell-to-cell heterogeneity in DNA replication.
Affiliation(s)
- Magali Hennion
  - Institut de Biologie de l’Ecole Normale Supérieure (IBENS), Ecole Normale Supérieure, CNRS, INSERM, Université PSL, 46 rue d’Ulm, Paris, 75005 France
  - Current address: Epigenetics and Cell Fate Center, CNRS, Université de Paris, 35 rue Hélène Brion, Paris, 75013 France
- Jean-Michel Arbona
  - Université de Lyon, ENS de Lyon, Université Claude Bernard Lyon 1, CNRS, Laboratoire de Physique, Lyon, 69342 France
  - Current address: Laboratory of Biology and Modelling of the Cell, Université de Lyon, ENS de Lyon, Université Claude Bernard, CNRS UMR 5239, INSERM U1210, 46 Allée d’Italie Site Jacques Monod, Lyon, 69007 France
- Laurent Lacroix
  - Institut de Biologie de l’Ecole Normale Supérieure (IBENS), Ecole Normale Supérieure, CNRS, INSERM, Université PSL, 46 rue d’Ulm, Paris, 75005 France
- Corinne Cruaud
  - Commissariat à l’Energie Atomique et aux Energies Alternatives (CEA), Institut de biologie François-Jacob, Genoscope, 2 rue Gaston Crémieux, Evry, 91057 France
- Bertrand Theulot
  - Institut de Biologie de l’Ecole Normale Supérieure (IBENS), Ecole Normale Supérieure, CNRS, INSERM, Université PSL, 46 rue d’Ulm, Paris, 75005 France
- Benoît Le Tallec
  - Institut de Biologie de l’Ecole Normale Supérieure (IBENS), Ecole Normale Supérieure, CNRS, INSERM, Université PSL, 46 rue d’Ulm, Paris, 75005 France
- Florence Proux
  - Institut de Biologie de l’Ecole Normale Supérieure (IBENS), Ecole Normale Supérieure, CNRS, INSERM, Université PSL, 46 rue d’Ulm, Paris, 75005 France
- Xia Wu
  - Institut de Biologie de l’Ecole Normale Supérieure (IBENS), Ecole Normale Supérieure, CNRS, INSERM, Université PSL, 46 rue d’Ulm, Paris, 75005 France
- Elizaveta Novikova
  - Institut de Biologie de l’Ecole Normale Supérieure (IBENS), Ecole Normale Supérieure, CNRS, INSERM, Université PSL, 46 rue d’Ulm, Paris, 75005 France
- Stefan Engelen
  - Commissariat à l’Energie Atomique et aux Energies Alternatives (CEA), Institut de biologie François-Jacob, Genoscope, 2 rue Gaston Crémieux, Evry, 91057 France
- Arnaud Lemainque
  - Commissariat à l’Energie Atomique et aux Energies Alternatives (CEA), Institut de biologie François-Jacob, Genoscope, 2 rue Gaston Crémieux, Evry, 91057 France
- Benjamin Audit
  - Université de Lyon, ENS de Lyon, Université Claude Bernard Lyon 1, CNRS, Laboratoire de Physique, Lyon, 69342 France
- Olivier Hyrien
  - Institut de Biologie de l’Ecole Normale Supérieure (IBENS), Ecole Normale Supérieure, CNRS, INSERM, Université PSL, 46 rue d’Ulm, Paris, 75005 France
8
Harmanci A, Harmanci AS, Swaminathan J, Gopalakrishnan V. EpiSAFARI: sensitive detection of valleys in epigenetic signals for enhancing annotations of functional elements. Bioinformatics 2019; 36:1014-1021. [PMID: 31501853 PMCID: PMC7703766 DOI: 10.1093/bioinformatics/btz702] [Citation(s) in RCA: 1] [Impact Index Per Article: 0.2] [Received: 03/15/2019] [Revised: 07/22/2019] [Accepted: 09/05/2019] [Indexed: 01/31/2023] Open
Abstract
MOTIVATION: Functional genomics experiments generate genome-wide signal profiles that are dense information sources for annotating regulatory elements. These profiles measure epigenetic activity at nucleotide resolution, and they exhibit distinctive patterns as they fluctuate along the genome. The most notable of these are the valley patterns that are prevalent in assays such as ChIP sequencing and bisulfite sequencing. The genomic positions of valleys pinpoint the locations of cis-regulatory elements such as enhancers and insulators. Systematic identification of the valleys provides novel information for delineating the annotation of regulatory elements. Nevertheless, valleys are not reported by the majority of analysis pipelines. RESULTS: We describe EpiSAFARI, a computational method for sensitive detection of valleys from diverse types of epigenetic profiles. EpiSAFARI employs a novel smoothing method for decreasing noise in signal profiles and accounts for technical factors such as sparse signals, mappability and nucleotide content. In performance comparisons, EpiSAFARI performs favorably in terms of accuracy. The histone modification valleys detected by EpiSAFARI exhibit high conservation and transcription factor binding, and they are enriched in nascent transcription. In addition, large clusters of histone valleys are found to be enriched at the promoters of developmentally associated genes. Differential histone valleys exhibit concordance with differential DNase signal at cell line-specific valleys. DNA methylation valleys exhibit elevated conservation and high transcription factor binding. Specifically, we observed enriched binding of transcription factors associated with chromatin structure around methyl-valleys. AVAILABILITY AND IMPLEMENTATION: EpiSAFARI is publicly available at https://github.com/harmancilab/EpiSAFARI. SUPPLEMENTARY INFORMATION: Supplementary data are available at Bioinformatics online.
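To make the notion of a "valley" in a signal profile concrete, here is a deliberately simple sketch: smooth the signal, then report local minima that are low relative to the highest signal on both sides. It is not EpiSAFARI's algorithm. A plain moving average stands in for the paper's smoothing method, and the names `moving_average`, `find_valleys` and the `max_ratio` cutoff are hypothetical. Mappability and nucleotide content, which the abstract says the tool accounts for, are ignored here.

```python
def moving_average(signal, half_window=1):
    """Smooth a signal with a centred moving average (window shrinks at the edges)."""
    smoothed = []
    for i in range(len(signal)):
        lo = max(0, i - half_window)
        hi = min(len(signal), i + half_window + 1)
        smoothed.append(sum(signal[lo:hi]) / (hi - lo))
    return smoothed


def find_valleys(signal, max_ratio=0.5):
    """Return indices of local minima that are deep relative to both flanks.

    Position i is reported when signal[i] is a local minimum and is at most
    max_ratio times the highest value found on its left and on its right.
    """
    valleys = []
    for i in range(1, len(signal) - 1):
        is_local_min = signal[i] <= signal[i - 1] and signal[i] < signal[i + 1]
        if not is_local_min:
            continue
        left_peak = max(signal[:i])
        right_peak = max(signal[i + 1:])
        if signal[i] <= max_ratio * min(left_peak, right_peak):
            valleys.append(i)
    return valleys


# Toy profile: two peaks of signal separated by a dip at index 6.
raw = [1, 2, 9, 10, 9, 2, 1, 2, 9, 10, 9, 2, 1]
smoothed = moving_average(raw, half_window=1)
valleys = find_valleys(smoothed)
```

Comparing each minimum against the global maximum on either side is a crude choice; a practical detector would restrict the search to a bounded flanking window.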
Affiliation(s)
- Akdes Serin Harmanci
  - School of Biomedical Informatics, Center for Systems Medicine, University of Texas Health Science Center, Houston, TX 77030, USA
- Vidya Gopalakrishnan
  - Department of Pediatrics, USA
  - Department of Molecular and Cellular Oncology, USA
  - Brain Tumor Center, USA
  - Center for Cancer Epigenetics, University of Texas, M.D. Anderson Cancer Center, Houston, TX 77030, USA
  - M.D. Anderson UTHealth Graduate School of Biomedical Sciences, Houston, TX 77030, USA
9
Wu X, Kabalane H, Kahli M, Petryk N, Laperrousaz B, Jaszczyszyn Y, Drillon G, Nicolini FE, Perot G, Robert A, Fund C, Chibon F, Xia R, Wiels J, Argoul F, Maguer-Satta V, Arneodo A, Audit B, Hyrien O. Developmental and cancer-associated plasticity of DNA replication preferentially targets GC-poor, lowly expressed and late-replicating regions. Nucleic Acids Res 2019; 46:10157-10172. [PMID: 30189101 PMCID: PMC6212843 DOI: 10.1093/nar/gky797] [Citation(s) in RCA: 18] [Impact Index Per Article: 3.6] [Received: 08/06/2018] [Accepted: 08/24/2018] [Indexed: 01/08/2023] Open
Abstract
The spatiotemporal program of metazoan DNA replication is regulated during development and altered in cancers. We have generated novel OK-seq, Repli-seq and RNA-seq data to compare the DNA replication and gene expression programs of twelve cancer and non-cancer human cell types. Changes in replication fork directionality (RFD) determined by OK-seq are widespread but more frequent within GC-poor isochores and largely disconnected from transcription changes. Cancer cell RFD profiles cluster with non-cancer cells of similar developmental origin but not with different cancer types. Importantly, recurrent RFD changes are detected in specific tumour progression pathways. Using a model for establishment and early progression of chronic myeloid leukemia (CML), we identify 1027 replication initiation zones (IZs) that progressively change efficiency during long-term expression of the BCR-ABL1 oncogene, and that are downregulated twice as often as they are upregulated. Prolonged expression of BCR-ABL1 results in targeting of new IZs and accentuation of previous efficiency changes. Targeted IZs are predominantly located in GC-poor, late-replicating gene deserts and frequently silenced in late CML. Prolonged expression of BCR-ABL1 results in massive deletion of GC-poor, late-replicating DNA sequences enriched in origin silencing events. We conclude that BCR-ABL1 expression progressively affects replication and stability of GC-poor, late-replicating regions during CML progression.
Affiliation(s)
- Xia Wu
  - Institut de Biologie de l'École Normale Supérieure (IBENS), Département de Biologie, Ecole Normale Supérieure, CNRS, Inserm, PSL Research University, F-75005 Paris, France
  - Physics Department, East China Normal University, Shanghai, China
- Hadi Kabalane
  - Univ Lyon, ENS de Lyon, Univ Claude Bernard Lyon 1, CNRS, Laboratoire de Physique, F-69342 Lyon, France
- Malik Kahli
  - Institut de Biologie de l'École Normale Supérieure (IBENS), Département de Biologie, Ecole Normale Supérieure, CNRS, Inserm, PSL Research University, F-75005 Paris, France
- Nataliya Petryk
  - Institut de Biologie de l'École Normale Supérieure (IBENS), Département de Biologie, Ecole Normale Supérieure, CNRS, Inserm, PSL Research University, F-75005 Paris, France
- Bastien Laperrousaz
  - Univ Lyon, ENS de Lyon, Univ Claude Bernard Lyon 1, CNRS, Laboratoire de Physique, F-69342 Lyon, France
  - CNRS UMR5286, INSERM U1052, Centre de Recherche en Cancérologie de Lyon, F- 69008 Lyon, France
- Yan Jaszczyszyn
  - Institute for Integrative Biology of the Cell (I2BC), CEA, CNRS, Université Paris-Sud, Université Paris-Saclay, Gif-sur-Yvette, France
- Guenola Drillon
  - Univ Lyon, ENS de Lyon, Univ Claude Bernard Lyon 1, CNRS, Laboratoire de Physique, F-69342 Lyon, France
- Frank-Emmanuel Nicolini
  - CNRS UMR5286, INSERM U1052, Centre de Recherche en Cancérologie de Lyon, F- 69008 Lyon, France
  - Centre Léon Bérard, F-69008 Lyon, France
- Gaëlle Perot
  - INSERM U1218, Institut Bergonié, F-33000 Bordeaux, France
- Aude Robert
  - UMR 8126, Université Paris-Sud Paris-Saclay, CNRS, Institut Gustave Roussy, Villejuif, France
- Cédric Fund
  - École Normale Supérieure, PSL Research University, CNRS, Inserm, IBENS, Plateforme Génomique, 75005 Paris, France
- Ruohong Xia
  - Physics Department, East China Normal University, Shanghai, China
- Joëlle Wiels
  - UMR 8126, Université Paris-Sud Paris-Saclay, CNRS, Institut Gustave Roussy, Villejuif, France
- Françoise Argoul
  - LOMA, Université de Bordeaux, CNRS, UMR 5798, F-33405 Talence, France
- Véronique Maguer-Satta
  - CNRS UMR5286, INSERM U1052, Centre de Recherche en Cancérologie de Lyon, F- 69008 Lyon, France
- Alain Arneodo
  - LOMA, Université de Bordeaux, CNRS, UMR 5798, F-33405 Talence, France
- Benjamin Audit
  - Univ Lyon, ENS de Lyon, Univ Claude Bernard Lyon 1, CNRS, Laboratoire de Physique, F-69342 Lyon, France
- Olivier Hyrien
  - Institut de Biologie de l'École Normale Supérieure (IBENS), Département de Biologie, Ecole Normale Supérieure, CNRS, Inserm, PSL Research University, F-75005 Paris, France
10
Attuel G, Gerasimova-Chechkina E, Argoul F, Yahia H, Arneodo A. Multifractal Desynchronization of the Cardiac Excitable Cell Network During Atrial Fibrillation. II. Modeling. Front Physiol 2019; 10:480. [PMID: 31105585 PMCID: PMC6492055 DOI: 10.3389/fphys.2019.00480] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 11/03/2018] [Accepted: 04/05/2019] [Indexed: 11/13/2022] Open
Abstract
In a companion paper (I. Multifractal analysis of clinical data), we used a wavelet-based multiscale analysis to reveal and quantify the multifractal intermittent nature of the cardiac impulse energy in the low frequency range ≲ 2Hz during atrial fibrillation (AF). It demarcated two distinct areas within the coronary sinus (CS) with regionally stable multifractal spectra likely corresponding to different anatomical substrates. The electrical activity also showed no sign of the kind of temporal correlations typical of cascading processes across scales, thereby indicating that the multifractal scaling is carried by variations in the large amplitude oscillations of the recorded bipolar electric potential. In the present study, to account for these observations, we explore the role of the kinetics of gap junction channels (GJCs), in dynamically creating a new kind of imbalance between depolarizing and repolarizing currents. We propose a one-dimensional (1D) spatial model of a denervated myocardium, where the coupling of cardiac cells fails to synchronize the network of cardiac cells because of abnormal transjunctional capacitive charging of GJCs. We show that this non-ohmic nonlinear conduction 1D modeling accounts quantitatively well for the "multifractal random noise" dynamics of the electrical activity experimentally recorded in the left atrial posterior wall area. We further demonstrate that the multifractal properties of the numerical impulse energy are robust to changes in the model parameters.
Affiliation(s)
- Guillaume Attuel
  - Geometry and Statistics in Acquisition Data, Centre de Recherche INRIA, Talence, France
- Françoise Argoul
  - Laboratoire Ondes et Matières d'Aquitaine, Université de Bordeaux, UMR 5798, CNRS, Talence, France
- Hussein Yahia
  - Geometry and Statistics in Acquisition Data, Centre de Recherche INRIA, Talence, France
- Alain Arneodo
  - Laboratoire Ondes et Matières d'Aquitaine, Université de Bordeaux, UMR 5798, CNRS, Talence, France
Collapse
|
11
|
Zhang X, Pan W. Exon prediction based on multiscale products of a genomic-inspired multiscale bilateral filtering. PLoS One 2019; 14:e0205050. [PMID: 30897105 PMCID: PMC6428306 DOI: 10.1371/journal.pone.0205050] [Citation(s) in RCA: 1] [Impact Index Per Article: 0.2] [Received: 09/13/2018] [Accepted: 03/05/2019] [Indexed: 11/21/2022] Open
Abstract
Multiscale signal processing techniques such as wavelet filtering have proved to be particularly successful in predicting exon sequences. Traditional wavelet predictor is domain filtering, and enforces exon features by weighting nucleotide values with coefficients. Such a measure performs linear filtering and is not suitable for preserving the short coding exons and the exon-intron boundaries. This paper describes a prediction framework that is capable of non-linearly processing DNA sequences while achieving high prediction rates. There are two key contributions. The first is the introduction of a genomic-inspired multiscale bilateral filtering (MSBF) which exploits both weighting coefficients in the spatial domain and nucleotide similarity in the range. Similarly to wavelet transform, the MSBF is also defined as a weighted sum of nucleotides. The difference is that the MSBF takes into account the variation of nucleotides at a specific codon position. The second contribution is the exploitation of inter-scale correlation in MSBF domain to find the inter-scale dependency on the differences between the exon signal and the background noise. This favourite property is used to sharp the important structures while weakening noise. Three benchmark data sets have been used in the evaluation of considered methods. By comparison with four existing techniques, the prediction results demonstrate that: the proposed method reveals at least improvement of 4.1%, 50.5%, 25.6%, 2.5%, 10.8%, 15.5%, 11.1%, 12.3%, 9.2% and 2.4% on the exons length of 1–24, 25–49, 50–74, 75–99, 100–124, 125–149, 150–174, 175–199, 200–299 and 300–300+, respectively. The MSBF of its nonlinear nature is good at energy compaction, which makes it capable of locating the sharp variations around short exons. The direct scale multiplication of coefficients at several adjacent scales obviously enhanced exon features while the noise contents were suppressed. 
We show that the non-linear nature and correlation-based property achieved in proposed predictor is greater than that for traditional filtering, which leads to better exon prediction performance. There are some possible applications of this predictor. Its good localization and protection of sharp variations will make the predictor be suitable to perform fault diagnosis of aero-engine.
Affiliation(s)
- Xiaolei Zhang
- College of Air Traffic Management, Civil Aviation Flight University of China, Guanghan, P.R. China
- Weijun Pan
- College of Air Traffic Management, Civil Aviation Flight University of China, Guanghan, P.R. China
|
12
|
Ciardo D, Goldar A, Marheineke K. On the Interplay of the DNA Replication Program and the Intra-S Phase Checkpoint Pathway. Genes (Basel) 2019; 10:E94. [PMID: 30700024 PMCID: PMC6410103 DOI: 10.3390/genes10020094] [Citation(s) in RCA: 21] [Impact Index Per Article: 4.2] [Received: 01/09/2019] [Revised: 01/23/2019] [Accepted: 01/25/2019] [Indexed: 12/12/2022] Open
Abstract
DNA replication in eukaryotes is achieved by the activation of multiple replication origins which needs to be precisely coordinated in space and time. This spatio-temporal replication program is regulated by many factors to maintain genome stability, which is frequently threatened through stresses of exogenous or endogenous origin. Intra-S phase checkpoints monitor the integrity of DNA synthesis and are activated when replication forks are stalled. Their activation leads to the stabilization of forks, to the delay of the replication program by the inhibition of late firing origins, and the delay of G2/M phase entry. In some cell cycles during early development these mechanisms are less efficient in order to allow rapid cell divisions. In this article, we will review our current knowledge of how the intra-S phase checkpoint regulates the replication program in budding yeast and metazoan models, including early embryos with rapid S phases. We sum up current models on how the checkpoint can inhibit origin firing in some genomic regions, but allow dormant origin activation in other regions. Finally, we discuss how numerical and theoretical models can be used to connect the multiple different actors into a global process and to extract general rules.
Affiliation(s)
- Diletta Ciardo
- Institute for Integrative Biology of the Cell (I2BC), CEA, CNRS, Univ. Paris-Sud, Université Paris-Saclay, 91198 Gif-sur-Yvette CEDEX, France.
|
13
|
Attuel G, Gerasimova-Chechkina E, Argoul F, Yahia H, Arneodo A. Multifractal Desynchronization of the Cardiac Excitable Cell Network During Atrial Fibrillation. I. Multifractal Analysis of Clinical Data. Front Physiol 2018; 8:1139. [PMID: 29632492 PMCID: PMC5880174 DOI: 10.3389/fphys.2017.01139] [Citation(s) in RCA: 2] [Impact Index Per Article: 0.3] [Received: 10/05/2017] [Accepted: 12/24/2017] [Indexed: 12/19/2022] Open
Abstract
Atrial fibrillation (AF) is a cardiac arrhythmia characterized by rapid and irregular atrial electrical activity with a high clinical impact on stroke incidence. Best available therapeutic strategies combine pharmacological and surgical means. But when successful, they do not always prevent long-term relapses. Initial success becomes all the more tricky to achieve as the arrhythmia maintains itself and the pathology evolves into sustained or chronic AF. This raises the open crucial issue of deciphering the mechanisms that govern the onset of AF as well as its perpetuation. In this study, we develop a wavelet-based multi-scale strategy to analyze the electrical activity of human hearts recorded by catheter electrodes, positioned in the coronary sinus (CS), during episodes of AF. We compute the so-called multifractal spectra using two variants of the wavelet transform modulus maxima method, the moment (partition function) method and the magnitude cumulant method. Application of these methods to long time series recorded in a patient with chronic AF provides quantitative evidence of the multifractal intermittent nature of the electric energy of passing cardiac impulses at low frequencies, i.e., for times (≳ 0.5 s) longer than the mean interbeat (≃ 10⁻¹ s). We also report the results of a two-point magnitude correlation analysis which infers the absence of a multiplicative time-scale structure underlying multifractal scaling. The electric energy dynamics looks like a "multifractal white noise" with quadratic (log-normal) multifractal spectra. These observations challenge concepts of functional reentrant circuits in mechanistic theories of AF, still leaving open the role of the autonomic nervous system (ANS).
A transition is indeed observed in the computed multifractal spectra which group according to two distinct areas, consistently with the anatomical substrate binding to the CS, namely the left atrial posterior wall, and the ligament of Marshall which is innervated by the ANS. In a companion paper (II. Modeling), we propose a mathematical model of a denervated heart where the kinetics of gap junction conductance alone induces a desynchronization of the myocardial excitable cells, accounting for the multifractal spectra found experimentally in the left atrial posterior wall area.
Affiliation(s)
- Guillaume Attuel
- Geometry and Statistics in Acquisition Data, Centre de Recherche INRIA, Talence, France
- Francoise Argoul
- Laboratoire Ondes et Matières d'Aquitaine, Université de Bordeaux, Centre National de la Recherche Scientifique, UMR 5798, Talence, France
- Hussein Yahia
- Geometry and Statistics in Acquisition Data, Centre de Recherche INRIA, Talence, France
- Alain Arneodo
- Laboratoire Ondes et Matières d'Aquitaine, Université de Bordeaux, Centre National de la Recherche Scientifique, UMR 5798, Talence, France
|
14
|
Reinhart M, Cardoso MC. A journey through the microscopic ages of DNA replication. Protoplasma 2017; 254:1151-1162. [PMID: 27943022 PMCID: PMC5376393 DOI: 10.1007/s00709-016-1058-8] [Citation(s) in RCA: 3] [Impact Index Per Article: 0.4] [Received: 11/03/2016] [Accepted: 12/01/2016] [Indexed: 06/06/2023]
Abstract
Scientific discoveries and technological advancements are inseparable but not always take place in a coherent chronological manner. In the next, we will provide a seemingly unconnected and serendipitous series of scientific facts that, in the whole, converged to unveil DNA and its duplication. We will not cover here the many and fundamental contributions from microbial genetics and in vitro biochemistry. Rather, in this journey, we will emphasize the interplay between microscopy development culminating on super resolution fluorescence microscopy (i.e., nanoscopy) and digital image analysis and its impact on our understanding of DNA duplication. We will interlace the journey with landmark concepts and experiments that have brought the cellular DNA replication field to its present state.
Affiliation(s)
- Marius Reinhart
- Cell Biology and Epigenetics, Department of Biology, Technische Universität Darmstadt, Schnittspahnstrasse 10, 64287, Darmstadt, Germany
- M Cristina Cardoso
- Cell Biology and Epigenetics, Department of Biology, Technische Universität Darmstadt, Schnittspahnstrasse 10, 64287, Darmstadt, Germany.
|
15
|
Boulos RE, Tremblay N, Arneodo A, Borgnat P, Audit B. Multi-scale structural community organisation of the human genome. BMC Bioinformatics 2017; 18:209. [PMID: 28399820 PMCID: PMC5387268 DOI: 10.1186/s12859-017-1616-x] [Citation(s) in RCA: 10] [Impact Index Per Article: 1.4] [Received: 08/23/2016] [Accepted: 03/28/2017] [Indexed: 12/11/2022] Open
Abstract
BACKGROUND Structural interaction frequency matrices between all genome loci are now experimentally achievable thanks to high-throughput chromosome conformation capture technologies. This ensues a new methodological challenge for computational biology which consists in objectively extracting from these data the structural motifs characteristic of genome organisation. RESULTS We deployed the fast multi-scale community mining algorithm based on spectral graph wavelets to characterise the networks of intra-chromosomal interactions in human cell lines. We observed that there exist structural domains of all sizes up to chromosome length and demonstrated that the set of structural communities forms a hierarchy of chromosome segments. Hence, at all scales, chromosome folding predominantly involves interactions between neighbouring sites rather than the formation of links between distant loci. CONCLUSIONS Multi-scale structural decomposition of human chromosomes provides an original framework to question structural organisation and its relationship to functional regulation across the scales. By construction the proposed methodology is independent of the precise assembly of the reference genome and is thus directly applicable to genomes whose assembly is not fully determined.
Affiliation(s)
- Rasha E Boulos
- Univ Lyon, Ens de Lyon, Univ Claude Bernard Lyon 1, CNRS, Laboratoire de Physique, F-69342, Lyon, France. Present address: Montpellier Cancer Institute (ICM), Montpellier Cancer Research Institute (IRCM) Inserm U1194, University of Montpellier, Montpellier, France
- Nicolas Tremblay
- Univ Lyon, Ens de Lyon, Univ Claude Bernard Lyon 1, CNRS, Laboratoire de Physique, F-69342, Lyon, France. Present address: CNRS, GIPSA-lab, Grenoble, France
- Alain Arneodo
- Univ Lyon, Ens de Lyon, Univ Claude Bernard Lyon 1, CNRS, Laboratoire de Physique, F-69342, Lyon, France. Present address: LOMA, Université de Bordeaux, CNRS, UMR 5798, 51 Cours de la Libération, Talence, 33405, France
- Pierre Borgnat
- Univ Lyon, Ens de Lyon, Univ Claude Bernard Lyon 1, CNRS, Laboratoire de Physique, F-69342, Lyon, France
- Benjamin Audit
- Univ Lyon, Ens de Lyon, Univ Claude Bernard Lyon 1, CNRS, Laboratoire de Physique, F-69342, Lyon, France.
|
16
|
Gerasimova-Chechkina E, Toner B, Marin Z, Audit B, Roux SG, Argoul F, Khalil A, Gileva O, Naimark O, Arneodo A. Comparative Multifractal Analysis of Dynamic Infrared Thermograms and X-Ray Mammograms Enlightens Changes in the Environment of Malignant Tumors. Front Physiol 2016; 7:336. [PMID: 27555823 PMCID: PMC4977307 DOI: 10.3389/fphys.2016.00336] [Citation(s) in RCA: 8] [Impact Index Per Article: 1.0] [Received: 06/06/2016] [Accepted: 07/20/2016] [Indexed: 01/07/2023] Open
Abstract
There is growing evidence that the microenvironment surrounding a tumor plays a special role in cancer development and cancer therapeutic resistance. Tumors arise from the dysregulation and alteration of both the malignant cells and their environment. By providing tumor-repressing signals, the microenvironment can impose and sustain normal tissue architecture. Once tissue homeostasis is lost, the altered microenvironment can create a niche favoring the tumorigenic transformation process. A major challenge in early breast cancer diagnosis is thus to show that these physiological and architectural alterations can be detected with currently used screening techniques. In a recent study, we used a 1D wavelet-based multi-scale method to analyze breast skin temperature temporal fluctuations collected with an IR thermography camera in patients with breast cancer. This study reveals that the multifractal complexity of temperature fluctuations superimposed on cardiogenic and vasomotor perfusion oscillations observed in healthy breasts is lost in malignant tumor foci in cancerous breasts. Here we use a 2D wavelet-based multifractal method to analyze the spatial fluctuations of breast density in the X-ray mammograms of the same panel of patients. As compared to the long-range correlations and anti-correlations in roughness fluctuations, respectively observed in dense and fatty breast areas, some significant change in the nature of breast density fluctuations with some clear loss of correlations is detected in the neighborhood of malignant tumors. This attests to some architectural disorganization that may deeply affect heat transfer and related thermomechanics in breast tissues, corroborating the change to homogeneous monofractal temperature fluctuations recorded in cancerous breasts with the IR camera. These results open new perspectives in computer-aided methods to assist in early breast cancer diagnosis.
Affiliation(s)
- Brian Toner
- CompuMAINE Laboratory, Department of Mathematics and Statistics, University of Maine, Orono, ME, USA
- Zach Marin
- CompuMAINE Laboratory, Department of Mathematics and Statistics, University of Maine, Orono, ME, USA
- Benjamin Audit
- Université Lyon, Ecole Normale Supérieure de Lyon, Université Claude Bernard Lyon 1, Centre National de la Recherche Scientifique, Laboratoire de Physique, Lyon, France
- Stephane G Roux
- Université Lyon, Ecole Normale Supérieure de Lyon, Université Claude Bernard Lyon 1, Centre National de la Recherche Scientifique, Laboratoire de Physique, Lyon, France
- Francoise Argoul
- Université Lyon, Ecole Normale Supérieure de Lyon, Université Claude Bernard Lyon 1, Centre National de la Recherche Scientifique, Laboratoire de Physique, Lyon, France; Laboratoire Ondes et Matière d'Aquitaine, Centre National de la Recherche Scientifique, Université de Bordeaux, UMR 5798, Talence, France
- Andre Khalil
- CompuMAINE Laboratory, Department of Mathematics and Statistics, University of Maine, Orono, ME, USA
- Olga Gileva
- Department of Therapeutic and Propedeutic Dentistry, Perm State Medical University, Perm, Russia
- Oleg Naimark
- Laboratory of Physical Foundation of Strength, Institute of Continuous Media Mechanics UB RAS, Perm, Russia
- Alain Arneodo
- Université Lyon, Ecole Normale Supérieure de Lyon, Université Claude Bernard Lyon 1, Centre National de la Recherche Scientifique, Laboratoire de Physique, Lyon, France; Laboratoire Ondes et Matière d'Aquitaine, Centre National de la Recherche Scientifique, Université de Bordeaux, UMR 5798, Talence, France
|
17
|
Petryk N, Kahli M, d'Aubenton-Carafa Y, Jaszczyszyn Y, Shen Y, Silvain M, Thermes C, Chen CL, Hyrien O. Replication landscape of the human genome. Nat Commun 2016; 7:10208. [PMID: 26751768 PMCID: PMC4729899 DOI: 10.1038/ncomms10208] [Citation(s) in RCA: 201] [Impact Index Per Article: 25.1] [Received: 01/22/2015] [Accepted: 11/13/2015] [Indexed: 12/21/2022] Open
Abstract
Despite intense investigation, human replication origins and termini remain elusive. Existing data have shown strong discrepancies. Here we sequenced highly purified Okazaki fragments from two cell types and, for the first time, quantitated replication fork directionality and delineated initiation and termination zones genome-wide. Replication initiates stochastically, primarily within non-transcribed, broad (up to 150 kb) zones that often abut transcribed genes, and terminates dispersively between them. Replication fork progression is significantly co-oriented with the transcription. Initiation and termination zones are frequently contiguous, sometimes separated by regions of unidirectional replication. Initiation zones are enriched in open chromatin and enhancer marks, even when not flanked by genes, and often border 'topologically associating domains' (TADs). Initiation zones are enriched in origin recognition complex (ORC)-binding sites and better align to origins previously mapped using bubble-trap than λ-exonuclease. This novel panorama of replication reveals how chromatin and transcription modulate the initiation process to create cell-type-specific replication programs. The physical origin and termination sites of DNA replication in human cells have remained elusive. Here the authors use Okazaki fragment sequencing to reveal global replication patterns and show how chromatin and transcription modulate the process.
Affiliation(s)
- Nataliya Petryk
- Ecole Normale Supérieure, Institut de Biologie de l'ENS (IBENS), and Inserm U1024, and CNRS UMR 8197, 46 rue d'Ulm, Paris F-75005, France; Institute for Integrative Biology of the Cell (I2BC), CEA, CNRS, Université Paris-Sud, UMR 9198, FRC 3115, Avenue de la Terrasse, Bâtiment 24, Gif-sur-Yvette, Paris F-91198, France
- Malik Kahli
- Ecole Normale Supérieure, Institut de Biologie de l'ENS (IBENS), and Inserm U1024, and CNRS UMR 8197, 46 rue d'Ulm, Paris F-75005, France
- Yves d'Aubenton-Carafa
- Institute for Integrative Biology of the Cell (I2BC), CEA, CNRS, Université Paris-Sud, UMR 9198, FRC 3115, Avenue de la Terrasse, Bâtiment 24, Gif-sur-Yvette, Paris F-91198, France
- Yan Jaszczyszyn
- Institute for Integrative Biology of the Cell (I2BC), CEA, CNRS, Université Paris-Sud, UMR 9198, FRC 3115, Avenue de la Terrasse, Bâtiment 24, Gif-sur-Yvette, Paris F-91198, France
- Yimin Shen
- Institute for Integrative Biology of the Cell (I2BC), CEA, CNRS, Université Paris-Sud, UMR 9198, FRC 3115, Avenue de la Terrasse, Bâtiment 24, Gif-sur-Yvette, Paris F-91198, France
- Maud Silvain
- Institute for Integrative Biology of the Cell (I2BC), CEA, CNRS, Université Paris-Sud, UMR 9198, FRC 3115, Avenue de la Terrasse, Bâtiment 24, Gif-sur-Yvette, Paris F-91198, France
- Claude Thermes
- Institute for Integrative Biology of the Cell (I2BC), CEA, CNRS, Université Paris-Sud, UMR 9198, FRC 3115, Avenue de la Terrasse, Bâtiment 24, Gif-sur-Yvette, Paris F-91198, France
- Chun-Long Chen
- Institute for Integrative Biology of the Cell (I2BC), CEA, CNRS, Université Paris-Sud, UMR 9198, FRC 3115, Avenue de la Terrasse, Bâtiment 24, Gif-sur-Yvette, Paris F-91198, France
- Olivier Hyrien
- Ecole Normale Supérieure, Institut de Biologie de l'ENS (IBENS), and Inserm U1024, and CNRS UMR 8197, 46 rue d'Ulm, Paris F-75005, France
|
18
|
Zhang X, Shen Z, Zhang G, Shen Y, Chen M, Zhao J, Wu R. Short Exon Detection via Wavelet Transform Modulus Maxima. PLoS One 2016; 11:e0163088. [PMID: 27635656 PMCID: PMC5026382 DOI: 10.1371/journal.pone.0163088] [Citation(s) in RCA: 2] [Impact Index Per Article: 0.3] [Received: 06/03/2016] [Accepted: 09/04/2016] [Indexed: 02/05/2023] Open
Abstract
The detection of short exons is a challenging open problem in the field of bioinformatics. Due to the fact that the weakness of existing model-independent methods lies in their inability to reliably detect small exons, a model-independent method based on the singularity detection with wavelet transform modulus maxima has been developed for detecting short coding sequences (exons) in eukaryotic DNA sequences. In the analysis of our method, the local maxima can capture and characterize singularities of short exons, which helps to yield significant patterns that are rarely observed with the traditional methods. In order to get some information about singularities on the differences between the exon signal and the background noise, the noise level is estimated by filtering the genomic sequence through a notch filter. Meanwhile, a fast method based on a piecewise cubic Hermite interpolating polynomial is applied to reconstruct the wavelet coefficients for improving the computational efficiency. In addition, the output measure of a paired-numerical representation calculated in both forward and reverse directions is used to incorporate a useful DNA structural property. The performances of our approach and other techniques are evaluated on two benchmark data sets. Experimental results demonstrate that the proposed method outperforms all assessed model-independent methods for detecting short exons in terms of evaluation metrics.
Affiliation(s)
- Xiaolei Zhang
- Shantou University Medical College, Shantou, P.R. China
- Zhiwei Shen
- Department of Radiology, Second Affiliated Hospital of Shantou University Medical College, Shantou, P.R. China
- Guishan Zhang
- College of Engineering, Shantou University, Shantou, P.R. China
- Yuanyu Shen
- Department of Radiology, Second Affiliated Hospital of Shantou University Medical College, Shantou, P.R. China
- Miaomiao Chen
- Department of Radiology, Second Affiliated Hospital of Shantou University Medical College, Shantou, P.R. China
- Jiaxiang Zhao
- College of Electronic Information and Optical Engineering, Nankai University, Tianjin, P.R. China
- Renhua Wu
- Department of Radiology, Second Affiliated Hospital of Shantou University Medical College, Shantou, P.R. China
|
19
|
Boulos RE, Drillon G, Argoul F, Arneodo A, Audit B. Structural organization of human replication timing domains. FEBS Lett 2015; 589:2944-57. [PMID: 25912651 DOI: 10.1016/j.febslet.2015.04.015] [Citation(s) in RCA: 28] [Impact Index Per Article: 3.1] [Received: 02/23/2015] [Revised: 04/09/2015] [Accepted: 04/10/2015] [Indexed: 12/16/2022]
Abstract
Recent analysis of genome-wide epigenetic modification data, mean replication timing (MRT) profiles and chromosome conformation data in mammals have provided increasing evidence that flexibility in replication origin usage is regulated locally by the epigenetic landscape and over larger genomic distances by the 3D chromatin architecture. Here, we review the recent results establishing some link between replication domains and chromatin structural domains in pluripotent and various differentiated cell types in human. We reconcile the originally proposed dichotomic picture of early and late constant timing regions that replicate by multiple rather synchronous origins in separated nuclear compartments of open and closed chromatins, with the U-shaped MRT domains bordered by "master" replication origins specified by a localized (∼200-300 kb) zone of open and transcriptionally active chromatin from which a replication wave likely initiates and propagates toward the domain center via a cascade of origin firing. We discuss the relationships between these MRT domains, topologically associated domains and lamina-associated domains. This review sheds a new light on the epigenetically regulated global chromatin reorganization that underlies the loss of pluripotency and the determination of differentiation properties.
Affiliation(s)
- Rasha E Boulos
- Université de Lyon, F-69000 Lyon, France; Laboratoire de Physique, CNRS UMR5672, Ecole Normale Supérieure de Lyon, F-69007 Lyon, France
- Guénola Drillon
- Université de Lyon, F-69000 Lyon, France; Laboratoire de Physique, CNRS UMR5672, Ecole Normale Supérieure de Lyon, F-69007 Lyon, France
- Françoise Argoul
- Université de Lyon, F-69000 Lyon, France; Laboratoire de Physique, CNRS UMR5672, Ecole Normale Supérieure de Lyon, F-69007 Lyon, France
- Alain Arneodo
- Université de Lyon, F-69000 Lyon, France; Laboratoire de Physique, CNRS UMR5672, Ecole Normale Supérieure de Lyon, F-69007 Lyon, France
- Benjamin Audit
- Université de Lyon, F-69000 Lyon, France; Laboratoire de Physique, CNRS UMR5672, Ecole Normale Supérieure de Lyon, F-69007 Lyon, France.
|
20
|
Drillon G, Audit B, Argoul F, Arneodo A. Ubiquitous human 'master' origins of replication are encoded in the DNA sequence via a local enrichment in nucleosome excluding energy barriers. J Phys Condens Matter 2015; 27:064102. [PMID: 25563930 DOI: 10.1088/0953-8984/27/6/064102] [Citation(s) in RCA: 7] [Impact Index Per Article: 0.8] [Indexed: 06/04/2023]
Abstract
As the elementary building block of eukaryotic chromatin, the nucleosome is at the heart of the compromise between the necessity of compacting DNA in the cell nucleus and the required accessibility to regulatory proteins. The recent availability of genome-wide experimental maps of nucleosome positions for many different organisms and cell types has provided an unprecedented opportunity to elucidate to what extent the DNA sequence conditions the primary structure of chromatin and in turn participates in the chromatin-mediated regulation of nuclear functions, such as gene expression and DNA replication. In this study, we use in vivo and in vitro genome-wide nucleosome occupancy data together with the set of nucleosome-free regions (NFRs) predicted by a physical model of nucleosome formation based on sequence-dependent bending properties of the DNA double-helix, to investigate the role of intrinsic nucleosome occupancy in the regulation of the replication spatio-temporal programme in human. We focus our analysis on the so-called replication U/N-domains that were shown to cover about half of the human genome in the germline (skew-N domains) as well as in embryonic stem cells, somatic and HeLa cells (mean replication timing U-domains). The 'master' origins of replication (MaOris) that border these megabase-sized U/N-domains were found to be specified by a few hundred kb wide regions that are hyper-sensitive to DNase I cleavage, hypomethylated, and enriched in epigenetic marks involved in transcription regulation, the hallmarks of localized open chromatin structures. Here we show that replication U/N-domain borders that are conserved in all considered cell lines have an environment highly enriched in nucleosome-excluding-energy barriers, suggesting that these ubiquitous MaOris have been selected during evolution. 
In contrast, MaOris that are cell-type-specific are mainly regulated epigenetically and are no longer favoured by a local abundance of intrinsic NFRs encoded in the DNA sequence. At the smaller few hundred bp scale of gene promoters, CpG-rich promoters of housekeeping genes found nearby ubiquitous MaOris as well as CpG-poor promoters of tissue-specific genes found nearby cell-type-specific MaOris, both correspond to in vivo NFRs that are not coded as nucleosome-excluding-energy barriers. Whereas the former promoters are likely to correspond to high occupancy transcription factor binding regions, the latter are an illustration that gene regulation in human is typically cell-type-specific.
Affiliation(s)
- Guénola Drillon
- Université de Lyon, F-69000 Lyon, France. Laboratoire de Physique, CNRS UMR 5672, École Normale Supérieure de Lyon, F-69007 Lyon, France
|
21
|
Embryonic stem cell specific "master" replication origins at the heart of the loss of pluripotency. PLoS Comput Biol 2015; 11:e1003969. [PMID: 25658386 PMCID: PMC4319821 DOI: 10.1371/journal.pcbi.1003969] [Citation(s) in RCA: 19] [Impact Index Per Article: 2.1] [Received: 02/10/2014] [Accepted: 10/06/2014] [Indexed: 11/29/2022] Open
Abstract
Epigenetic regulation of the replication program during mammalian cell differentiation remains poorly understood. We performed an integrative analysis of eleven genome-wide epigenetic profiles at 100 kb resolution of Mean Replication Timing (MRT) data in six human cell lines. Compared to the organization in four chromatin states shared by the five somatic cell lines, embryonic stem cell (ESC) line H1 displays (i) a gene-poor but highly dynamic chromatin state (EC4) associated with histone variant H2AZ rather than a HP1-associated heterochromatin state (C4) and (ii) a mid-S accessible chromatin state with bivalent gene marks instead of a polycomb-repressed heterochromatin state. Plastic MRT regions (≲ 20% of the genome) are predominantly localized at the borders of U-shaped timing domains. Whereas somatic-specific U-domain borders are gene-dense GC-rich regions, 31.6% of H1-specific U-domain borders are early EC4 regions enriched in pluripotency transcription factors NANOG and OCT4 despite being GC poor and gene deserts. Silencing of these ESC-specific “master” replication initiation zones during differentiation corresponds to a loss of H2AZ and an enrichment in the H3K9me3 mark characteristic of late replicating C4 heterochromatin. These results shed new light on the epigenetically regulated global chromatin reorganization that underlies the loss of pluripotency and lineage commitment.

During development, embryonic stem cells (ESCs) enter a program of cell differentiation eventually leading to all the necessary differentiated cell types. Understanding the mechanisms responsible for the underlying modifications of the gene expression program is of fundamental importance, as it will likely have a strong impact on the development of regenerative medicine. We show that besides some epigenetic regulation, ubiquitous master replication origins at replication timing U-domain borders shared by 6 human cell types are transcriptionally active open chromatin regions specified by a local enrichment in nucleosome-free regions encoded in the DNA sequence, suggesting that they have been selected during evolution. In contrast, ESC-specific master replication origins bear a unique epigenetic signature (enrichment in CTCF, H2AZ, NANOG, OCT4, …) likely contributing to maintaining ESC chromatin in a highly dynamic and accessible state that is refractory to polycomb and HP1 heterochromatin spreading. These ESC-specific master origins thus appear as key genomic regions where epigenetic control of chromatin organization is at play to maintain pluripotency of stem cell lineages and to guide lineage commitment to somatic cell types.
|
22
|
Richard CD, Tanenbaum A, Audit B, Arneodo A, Khalil A, Frankel WN. SWDreader: a wavelet-based algorithm using spectral phase to characterize spike-wave morphological variation in genetic models of absence epilepsy. J Neurosci Methods 2014; 242:127-40. [PMID: 25549550 DOI: 10.1016/j.jneumeth.2014.12.016] [Citation(s) in RCA: 17] [Impact Index Per Article: 1.7] [Received: 07/02/2014] [Revised: 12/17/2014] [Accepted: 12/19/2014] [Indexed: 12/12/2022]
Abstract
BACKGROUND: Spike-wave discharges (SWD) found in neuroelectrical recordings are pathognomonic to absence epilepsy. The characteristic spike-wave morphology of the spike-wave complex (SWC) constituents of SWDs can be mathematically described by a subset of possible spectral power and phase values. Morlet wavelet transform (MWT) generates time-frequency representations well-suited to identifying this SWC-associated subset. NEW METHOD: MWT decompositions of SWDs reveal spectral power concentrated at harmonic frequencies. The phase relationships underlying SWC morphology were identified by calculating the differences between phase values at SWD fundamental frequency from the 2nd, 3rd, and 4th harmonics, then using the three phase differences as coordinates to generate a density distribution in a {360°×360°×360°} phase difference space. Strain-specific density distributions were generated from SWDs of mice carrying the Gria4, Gabrg2, or Scn8a mutations to determine whether SWC morphological variants reliably mapped to the same regions of the distribution, and if distribution values could be used to detect SWD. COMPARISON WITH EXISTING METHODS: To the best of our knowledge, this algorithm is the first to employ spectral phase to quantify SWC morphology, making it possible to computationally distinguish SWC morphological subtypes and detect SWDs. RESULTS/CONCLUSIONS: Proof-of-concept testing of the SWDfinder algorithm shows: (1) a major pattern of variation in SWC morphology maps to one axis of the phase difference distribution, (2) variability between the strain-specific distributions reflects differences in the proportions of SWC subtypes generated during SWD, and (3) regularities in the spectral power and phase profiles of SWCs can be used to detect waveforms possessing SWC-like morphology.
Affiliation(s)
- C D Richard
- The Jackson Laboratory, Bar Harbor, ME 04609 USA; Graduate School for Biomedical Sciences and Engineering, University of Maine, Orono, ME 04469 USA.
- A Tanenbaum
- Department of Neurology, School of Medicine, Washington University, St. Louis, MO 63130 USA; CompuMAINE Lab, Department of Mathematics, University of Maine, Orono, ME 04469 USA
- B Audit
- Laboratoire de Physique, CNRS UMR 5672, Université de Lyon, École Normale Supérieure de Lyon, F-69007 Lyon, France
- A Arneodo
- Laboratoire de Physique, CNRS UMR 5672, Université de Lyon, École Normale Supérieure de Lyon, F-69007 Lyon, France
- A Khalil
- The Jackson Laboratory, Bar Harbor, ME 04609 USA; Graduate School for Biomedical Sciences and Engineering, University of Maine, Orono, ME 04469 USA; CompuMAINE Lab, Department of Mathematics, University of Maine, Orono, ME 04469 USA
- W N Frankel
- The Jackson Laboratory, Bar Harbor, ME 04609 USA; Graduate School for Biomedical Sciences and Engineering, University of Maine, Orono, ME 04469 USA; Tufts University School of Medicine, Sackler School, Boston, MA 02111 USA
|
23
|
Zaghloul L, Drillon G, Boulos RE, Argoul F, Thermes C, Arneodo A, Audit B. Large replication skew domains delimit GC-poor gene deserts in human. Comput Biol Chem 2014; 53 Pt A:153-65. [PMID: 25224847 DOI: 10.1016/j.compbiolchem.2014.08.020] [Citation(s) in RCA: 5] [Impact Index Per Article: 0.5] [Accepted: 07/11/2014] [Indexed: 01/25/2023]
Abstract
Besides their large-scale organization in isochores, mammalian genomes display megabase-sized regions, spanning both genes and intergenes, where the strand nucleotide composition asymmetry decreases linearly, possibly due to replication activity. These so-called skew-N domains cover about a third of the human genome and are bordered by two skew upward jumps that were hypothesized to compose a subset of "master" replication origins active in the germline. Skew-N domains were shown to exhibit a particular gene organization. Genes with CpG-rich promoters likely expressed in the germline are overrepresented near the master replication origins, with large genes being co-oriented with replication fork progression, which suggests some coordination of replication and transcription. In this study, we describe another skew structure that covers ∼13% of the human genome and that is bordered by putative master replication origins similar to the ones flanking skew-N domains. These skew-split-N domains have a shape reminiscent of an N, but split in half, leaving in the center a region of null skew whose length increases with domain size. These central regions (median size ∼860 kb) have a homogeneous composition, i.e. both a null and constant skew and a constant and low GC content. They correspond to heterochromatin gene deserts found in low-GC isochores with an average gene density of 0.81 promoters/Mb as compared to 7.73 promoters/Mb genome wide. The analysis of epigenetic marks and replication timing data confirms that, in these late replicating heterochromatic regions, the initiation of replication is likely to be random. This contrasts with the transcriptionally active euchromatin state found around the bordering well-positioned master replication origins. Altogether skew-N domains and skew-split-N domains cover about 50% of the human genome.
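The abstract above does not spell out how the compositional skew is computed. The following is a minimal sketch, assuming the definition commonly used in this literature, S = (T − A)/(T + A) + (G − C)/(G + C), evaluated in non-overlapping windows. The function names and the windowing scheme are illustrative assumptions, not the authors' code.

```python
def skew(seq):
    """Total skew S = S_TA + S_GC of one sequence window.

    Assumed definition (not stated in the abstract):
    S_TA = (T - A) / (T + A), S_GC = (G - C) / (G + C).
    Bases other than A, C, G, T (e.g. N) are ignored.
    """
    seq = seq.upper()
    a, c, g, t = (seq.count(base) for base in "ACGT")
    s_ta = (t - a) / (t + a) if (t + a) else 0.0
    s_gc = (g - c) / (g + c) if (g + c) else 0.0
    return s_ta + s_gc


def windowed_skew(seq, window):
    """Skew profile over consecutive non-overlapping windows."""
    return [
        skew(seq[start:start + window])
        for start in range(0, len(seq) - window + 1, window)
    ]
```

Under this definition, a profile that decreases linearly between two upward jumps would trace the N shape described for skew-N domains, and a flat stretch near zero in the middle would correspond to the null-skew center of a skew-split-N domain.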
Affiliation(s)
- Lamia Zaghloul
- Université de Lyon, F-69000 Lyon, France; Laboratoire de Physique, CNRS UMR 5672, Ecole Normale Supérieure de Lyon, F-69007 Lyon, France
- Guénola Drillon
- Université de Lyon, F-69000 Lyon, France; Laboratoire de Physique, CNRS UMR 5672, Ecole Normale Supérieure de Lyon, F-69007 Lyon, France
- Rasha E Boulos
- Université de Lyon, F-69000 Lyon, France; Laboratoire de Physique, CNRS UMR 5672, Ecole Normale Supérieure de Lyon, F-69007 Lyon, France
- Françoise Argoul
- Université de Lyon, F-69000 Lyon, France; Laboratoire de Physique, CNRS UMR 5672, Ecole Normale Supérieure de Lyon, F-69007 Lyon, France
- Claude Thermes
- Centre de Génétique Moléculaire, CNRS UPR 3404, Gif-sur-Yvette, France
- Alain Arneodo
- Université de Lyon, F-69000 Lyon, France; Laboratoire de Physique, CNRS UMR 5672, Ecole Normale Supérieure de Lyon, F-69007 Lyon, France
- Benjamin Audit
- Université de Lyon, F-69000 Lyon, France; Laboratoire de Physique, CNRS UMR 5672, Ecole Normale Supérieure de Lyon, F-69007 Lyon, France.
|
24
|
Mukhopadhyay R, Lajugie J, Fourel N, Selzer A, Schizas M, Bartholdy B, Mar J, Lin CM, Martin MM, Ryan M, Aladjem MI, Bouhassira EE. Allele-specific genome-wide profiling in human primary erythroblasts reveal replication program organization. PLoS Genet 2014; 10:e1004319. [PMID: 24787348 PMCID: PMC4006724 DOI: 10.1371/journal.pgen.1004319] [Citation(s) in RCA: 41] [Impact Index Per Article: 4.1] [Received: 10/09/2013] [Accepted: 03/10/2014] [Indexed: 11/19/2022] Open
Abstract
We have developed a new approach to characterize allele-specific timing of DNA replication genome-wide in human primary basophilic erythroblasts. We show that the two chromosome homologs replicate at the same time in about 88% of the genome and that large structural variants are preferentially associated with asynchronous replication. We identified about 600 megabase-sized asynchronously replicated domains in two tested individuals. The longest asynchronously replicated domains are enriched in imprinted genes, suggesting that structural variants and parental imprinting are two causes of replication asynchrony in the human genome. Biased chromosome X inactivation in one of the two individuals tested was another source of detectable replication asynchrony. Analysis of high-resolution TimEX profiles revealed small variations termed timing ripples, which were undetected in previous, lower resolution analyses. Timing ripples reflect highly reproducible variations of the timing of replication in the 100 kb-range that exist within the well-characterized megabase-sized replication timing domains. These ripples correspond to clusters of origins of replication that we detected using novel nascent strands DNA profiling methods. Analysis of the distribution of replication origins revealed dramatic differences in initiation of replication frequencies during S phase and a strong association, in both synchronous and asynchronous regions, between origins of replication and three genomic features: G-quadruplexes, CpG Islands and transcription start sites. The frequency of initiation in asynchronous regions was similar in the two homologs. Asynchronous regions were richer in origins of replication than synchronous regions.

DNA replication in mammalian cells proceeds according to a distinct order. Genes that are expressed tend to replicate before genes that are not expressed. We report here that we have developed a method to measure the timing of replication of the maternal and paternal chromosomes separately. We found that the paternal and maternal chromosomes replicate at exactly the same time in the large majority of the genome and that the 12% of the genome that replicated asynchronously was enriched in imprinted genes and in structural variants. Previous experiments have shown that chromosomes could be divided into replication timing domains that are a few hundred thousand to a few megabases in size. We show here that these domains can be divided into sub-domains defined by ripples in the timing profile. These ripples corresponded to clusters of origins of replication. Finally, we show that the frequency of initiation in asynchronous regions was similar in the two homologs.
Affiliation(s)
- Rituparna Mukhopadhyay
- Department of Cell Biology, Albert Einstein College of Medicine, Bronx, New York, United States of America
- Julien Lajugie
- Department of Cell Biology, Albert Einstein College of Medicine, Bronx, New York, United States of America
- Nicolas Fourel
- Department of Cell Biology, Albert Einstein College of Medicine, Bronx, New York, United States of America
- Ari Selzer
- Department of Cell Biology, Albert Einstein College of Medicine, Bronx, New York, United States of America
- Michael Schizas
- Department of Cell Biology, Albert Einstein College of Medicine, Bronx, New York, United States of America
- Boris Bartholdy
- Department of Cell Biology, Albert Einstein College of Medicine, Bronx, New York, United States of America
- Jessica Mar
- Department of Systems and Computational Biology, Albert Einstein College of Medicine, Bronx, New York, United States of America
- Chii Mei Lin
- Laboratory of Molecular Pharmacology, Center for Cancer Research, National Cancer Institute, Bethesda, Maryland, United States of America
- Melvenia M. Martin
- Laboratory of Molecular Pharmacology, Center for Cancer Research, National Cancer Institute, Bethesda, Maryland, United States of America
- Michael Ryan
- Laboratory of Molecular Pharmacology, Center for Cancer Research, National Cancer Institute, Bethesda, Maryland, United States of America
- Mirit I. Aladjem
- Laboratory of Molecular Pharmacology, Center for Cancer Research, National Cancer Institute, Bethesda, Maryland, United States of America
- Eric E. Bouhassira
- Department of Cell Biology, Albert Einstein College of Medicine, Bronx, New York, United States of America
|
25
|
The spatiotemporal program of DNA replication is associated with specific combinations of chromatin marks in human cells. PLoS Genet 2014; 10:e1004282. [PMID: 24785686 PMCID: PMC4006723 DOI: 10.1371/journal.pgen.1004282] [Citation(s) in RCA: 95] [Impact Index Per Article: 9.5] [Received: 08/13/2013] [Accepted: 02/18/2014] [Indexed: 11/19/2022] Open
Abstract
The duplication of mammalian genomes is under the control of a spatiotemporal program that orchestrates the positioning and the timing of firing of replication origins. The molecular mechanisms coordinating the activation of about predicted origins remain poorly understood, partly due to the intrinsic rarity of replication bubbles, making it difficult to purify short nascent strands (SNS). The precise identification of origins based on the high-throughput sequencing of SNS constitutes a new methodological challenge. We propose a new statistical method with a controlled resolution, adapted to the detection of replication origins from SNS data. We detected an average of 80,000 replication origins in different cell lines. To evaluate the consistency between different protocols, we compared SNS detections with bubble trapping detections. This comparison demonstrated a good agreement between genome-wide methods, with 65% of SNS-detected origins validated by bubble trapping, and 44% of bubble trapping origins validated by SNS origins, when compared at the same resolution. We investigated the interplay between the spatial and the temporal programs of replication at fine scales. We show that most of the origins detected in regions replicated in early S phase are shared by all the cell lines investigated whereas cell-type-specific origins tend to be replicated in late S phase. We shed a new light on the key role of CpG islands, by showing that 80% of the origins associated with CGIs are constitutive. Our results further show that at least 76% of CGIs are origins of replication. The analysis of associations with chromatin marks at different timing of cell division revealed new potential epigenetic regulators driving the spatiotemporal activity of replication origins. 
We highlight the potential role of H4K20me1 and H3K27me3, the coupling of which is correlated with increased efficiency of replication origins, clearly identifying those marks as potential key regulators of replication origins.

Replication is the mechanism by which genomes are duplicated into two exact copies. Genomic stability is under the control of a spatiotemporal program that orchestrates both the positioning and the timing of firing of about 50,000 replication starting points, also called replication origins. Replication bubbles found at origins have been very difficult to map due to their short lifespan. Moreover, with the flood of data characterizing new sequencing technologies, the precise statistical analysis of replication data has become an additional challenge. We propose a new method to map replication origins on the human genome, and we assess the reliability of our findings using experimental validation and comparison with origin maps obtained by bubble trapping. This fine mapping then allowed us to identify potential regulators of the replication dynamics. Our study highlights the key role of CpG Islands and identifies new potential epigenetic regulators (methylation of lysine 20 on histone H4, and tri-methylation of lysine 27 on histone H3) whose coupling is correlated with an increase in the efficiency of replication origins, suggesting those marks as potential key regulators of replication. Overall, our study defines new potentially important pathways that might regulate the sequential firing of origins during genome duplication.
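The abstract reports two different agreement figures (65% of SNS-detected origins validated by bubble trapping, 44% in the other direction). The sketch below illustrates why such a comparison is asymmetric: each percentage is normalized by a different call set. It is an illustrative assumption of how an overlap-based validation can be computed, using hypothetical half-open `(start, end)` intervals on a single chromosome, and is not the statistical method of the paper.

```python
def fraction_validated(query, reference):
    """Fraction of query intervals that overlap at least one reference interval.

    Intervals are half-open (start, end) pairs on the same chromosome.
    This quadratic scan is only meant for small illustrative inputs.
    """
    if not query:
        return 0.0
    hits = sum(
        any(q_start < r_end and r_start < q_end for r_start, r_end in reference)
        for q_start, q_end in query
    )
    return hits / len(query)


# Hypothetical toy call sets, not data from the study.
sns_calls = [(0, 10), (20, 30), (40, 50), (100, 110)]
bubble_calls = [(5, 25), (45, 60), (200, 300), (400, 500)]

sns_validated = fraction_validated(sns_calls, bubble_calls)      # 3 of 4
bubble_validated = fraction_validated(bubble_calls, sns_calls)   # 2 of 4
```

Because one wide interval in one set can validate several narrow intervals in the other, the two fractions generally differ, which is why the abstract specifies that the comparison was made at the same resolution.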
|
26
|
Shavit Y, Lio' P. Combining a wavelet change point and the Bayes factor for analysing chromosomal interaction data. Mol Biosyst 2014; 10:1576-85. [PMID: 24710657 DOI: 10.1039/c4mb00142g] [Citation(s) in RCA: 27] [Impact Index Per Article: 2.7] [Indexed: 02/02/2023]
Abstract
Over the past few decades we have witnessed great efforts to understand the cellular function at the cytoplasm level. Nowadays there is a growing interest in understanding the relationship between function and structure at the nuclear, chromosomal and sub-chromosomal levels. Data on chromosomal interactions that are now becoming available in unprecedented resolution and scale open the way to address this challenge. Consequently, there is a growing need for new methods and tools that will transform these data into knowledge and insights. Here, we have developed all the steps required for the analysis of chromosomal interaction data (Hi-C data). The result is a methodology which combines a wavelet change point with the Bayes factor for useful correction, segmentation and comparison of Hi-C data. We further developed chromoR, an R package that implements the methods presented here. The chromoR package provides researchers with a means to analyse chromosomal interaction data using statistical bioinformatics, offering a new and comprehensive solution to this task.
Affiliation(s)
- Yoli Shavit
- Computer Laboratory, University of Cambridge, Cambridge, CB3 0FD, UK.
|
27
|
Baker A, Bechhoefer J. Inferring the spatiotemporal DNA replication program from noisy data. Phys Rev E Stat Nonlin Soft Matter Phys 2014; 89:032703. [PMID: 24730871 DOI: 10.1103/physreve.89.032703] [Citation(s) in RCA: 5] [Impact Index Per Article: 0.5] [Received: 12/13/2013] [Indexed: 06/03/2023]
Abstract
We generalize a stochastic model of DNA replication to the case where replication-origin-initiation rates vary locally along the genome and with time. Using this generalized model, we address the inverse problem of inferring initiation rates from experimental data concerning replication in cell populations. Previous work based on curve fitting depended on arbitrarily chosen functional forms for the initiation rate, with free parameters that were constrained by the data. We introduce a nonparametric method of inference that is based on Gaussian process regression. The method replaces specific assumptions about the functional form of the initiation rate with more general prior expectations about the smoothness of variation of this rate, along the genome and in time. Using this inference method, we recover, with high precision, simulated replication schemes from noisy data that are typical of current experiments.
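To make the idea of replacing a fixed functional form with a smoothness prior concrete, here is a minimal sketch of generic one-dimensional Gaussian process regression with a squared-exponential kernel. It is not the authors' replication model: the kernel choice, the hyperparameter values and the function names are assumptions made for illustration only.

```python
import numpy as np


def rbf_kernel(x1, x2, length_scale, variance):
    """Squared-exponential covariance between two sets of 1-D positions."""
    diff = x1[:, None] - x2[None, :]
    return variance * np.exp(-0.5 * (diff / length_scale) ** 2)


def gp_posterior_mean(x_train, y_train, x_test,
                      length_scale=1.0, variance=1.0, noise=0.1):
    """Posterior mean of a zero-mean GP given noisy observations.

    The length scale encodes the prior expectation of smoothness;
    `noise` is the assumed standard deviation of the measurement error.
    """
    k_train = rbf_kernel(x_train, x_train, length_scale, variance)
    k_train += noise ** 2 * np.eye(len(x_train))
    k_cross = rbf_kernel(x_test, x_train, length_scale, variance)
    weights = np.linalg.solve(k_train, y_train)
    return k_cross @ weights
```

The only modelling input is how quickly the underlying function is expected to vary (the length scale), which mirrors the abstract's point that prior expectations about smoothness can stand in for an arbitrarily chosen parametric curve.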
Affiliation(s)
- A Baker
- Department of Physics, Simon Fraser University, Burnaby, British Columbia, Canada V5A 1S6
- J Bechhoefer
- Department of Physics, Simon Fraser University, Burnaby, British Columbia, Canada V5A 1S6
|
28
|
Julienne H, Zoufir A, Audit B, Arneodo A. Human genome replication proceeds through four chromatin states. PLoS Comput Biol 2013; 9:e1003233. [PMID: 24130466 PMCID: PMC3794905 DOI: 10.1371/journal.pcbi.1003233] [Citation(s) in RCA: 47] [Impact Index Per Article: 4.3] [Received: 04/26/2013] [Accepted: 08/06/2013] [Indexed: 12/26/2022] Open
Abstract
Advances in genomic studies have led to significant progress in understanding the epigenetically controlled interplay between chromatin structure and nuclear functions. Epigenetic modifications were shown to play a key role in transcription regulation and genome activity during development and differentiation or in response to the environment. Paradoxically, the molecular mechanisms that regulate the initiation and the maintenance of the spatio-temporal replication program in higher eukaryotes, and in particular their links to epigenetic modifications, still remain elusive. By integrative analysis of the genome-wide distributions of thirteen epigenetic marks in the human cell line K562, at the 100 kb resolution of corresponding mean replication timing (MRT) data, we identify four major groups of chromatin marks with shared features. These states have different MRT: from early to late replicating, replication proceeds through a transcriptionally active euchromatin state (C1), a repressive type of chromatin (C2) associated with polycomb complexes, a silent state (C3) not enriched in any available marks, and a gene-poor HP1-associated heterochromatin state (C4). When mapping these chromatin states inside the megabase-sized U-domains (U-shaped MRT profile) covering about 50% of the human genome, we reveal that the associated replication fork polarity gradient corresponds to a directional path across the four chromatin states, from C1 at U-domain borders followed by C2, C3 and C4 at centers. Analysis of the other genome half is consistent with early and late replication loci occurring in separate compartments: the former correspond to gene-rich, high-GC domains of intermingled chromatin states C1 and C2, whereas the latter correspond to gene-poor, low-GC domains of alternating chromatin states C3 and C4 or long C4 domains. This new segmentation sheds new light on the epigenetic regulation of the spatio-temporal replication program in human and provides a framework for further studies in different cell types, in both health and disease.

Previous studies revealed spatially coherent and biologically meaningful chromatin mark combinations in human cells. Here, we analyze thirteen epigenetic mark maps in the human cell line K562 at 100 kb resolution of MRT data. The complexity of epigenetic data is reduced to four chromatin states that display remarkable similarities with those reported in fly, worm and plants. These states have different MRT: (C1) is transcriptionally active, early replicating, enriched in CTCF; (C2) is Polycomb repressed, mid-S replicating; (C3) lacks marks and replicates late; and (C4) is a late-replicating gene-poor HP1 repressed heterochromatin state. When mapping these states inside the 876 replication U-domains of K562, the replication fork polarity gradient observed in these U-domains comes along with a remarkable epigenetic organization from C1 at U-domain borders to C2, C3 and ultimately C4 at centers. The remaining genome half displays early replicating, gene rich and high GC domains of intermingled C1 and C2 states segregating from late replicating, gene poor and low GC domains of concatenated C3 and/or C4 states. This constitutes the first evidence of epigenetic compartmentalization of the human genome into replication domains likely corresponding to autonomous units in the 3D chromatin architecture.
Affiliation(s)
- Hanna Julienne
- Université de Lyon, Lyon, France
- Laboratoire de Physique, CNRS UMR 5672, Ecole Normale Supérieure de Lyon, Lyon, France
- Azedine Zoufir
- Université de Lyon, Lyon, France
- Laboratoire de Physique, CNRS UMR 5672, Ecole Normale Supérieure de Lyon, Lyon, France
- Benjamin Audit
- Université de Lyon, Lyon, France
- Laboratoire de Physique, CNRS UMR 5672, Ecole Normale Supérieure de Lyon, Lyon, France
- Alain Arneodo
- Université de Lyon, Lyon, France
- Laboratoire de Physique, CNRS UMR 5672, Ecole Normale Supérieure de Lyon, Lyon, France
|
29
|
Nguyen N, Vo A, Won KJ. A wavelet-based method to exploit epigenomic language in the regulatory region. Bioinformatics 2013; 30:908-14. [PMID: 24096080 DOI: 10.1093/bioinformatics/btt467] [Citation(s) in RCA: 10] [Impact Index Per Article: 0.9] [Indexed: 12/19/2022]
Abstract
MOTIVATION: Epigenetic landscapes in the regulatory regions reflect binding condition of transcription factors and their co-factors. Identifying epigenetic condition and its variation is important in understanding condition-specific gene regulation. Computational approaches to explore complex multi-dimensional landscapes are needed. RESULTS: To study epigenomic condition for gene regulation, we developed a method, AWNFR, to classify epigenomic landscapes based on the detected epigenomic landscapes. Assuming mixture of Gaussians for a nucleosome, the proposed method captures the shape of histone modification and identifies potential regulatory regions in the wavelet domain. For accuracy estimation as well as enhanced computational speed, we developed a novel algorithm based on down-sampling operation and footprint in wavelet. We showed the algorithmic advantages of AWNFR using the simulated data. AWNFR identified regulatory regions more effectively and accurately than the previous approaches with the epigenome data in mouse embryonic stem cells and human lung fibroblast cells (IMR90). Based on the detected epigenomic landscapes, AWNFR classified epigenomic status and studied epigenomic codes. We studied co-occurring histone marks and showed that AWNFR captures the epigenomic variation across time. AVAILABILITY AND IMPLEMENTATION: The source code and supplemental document of AWNFR are available at http://wonk.med.upenn.edu/AWNFR.
Affiliation(s)
- Nha Nguyen
- Department of Genetics, Institute for Diabetes, Obesity and Metabolism, School of Medicine, University of Pennsylvania, Philadelphia, PA 19104, USA and Center for Neurosciences, The Feinstein Institute for Medical Research, Manhasset, NY 11030, USA
|
30
|
Hyrien O, Rappailles A, Guilbaud G, Baker A, Chen CL, Goldar A, Petryk N, Kahli M, Ma E, d'Aubenton-Carafa Y, Audit B, Thermes C, Arneodo A. From simple bacterial and archaeal replicons to replication N/U-domains. J Mol Biol 2013; 425:4673-89. [PMID: 24095859 DOI: 10.1016/j.jmb.2013.09.021] [Citation(s) in RCA: 23] [Impact Index Per Article: 2.1] [Received: 08/12/2013] [Revised: 09/15/2013] [Accepted: 09/19/2013] [Indexed: 10/26/2022]
Abstract
The Replicon Theory proposed 50 years ago has proven to apply for replicons of the three domains of life. Here, we review our knowledge of genome organization into single and multiple replicons in bacteria, archaea and eukarya. Bacterial and archaeal replicator/initiator systems are quite specific and efficient, whereas eukaryotic replicons show degenerate specificity and efficiency, allowing for complex regulation of origin firing time. We expand on recent evidence that ~50% of the human genome is organized as ~1,500 megabase-sized replication domains with a characteristic parabolic (U-shaped) replication timing profile and linear (N-shaped) gradient of replication fork polarity. These N/U-domains correspond to self-interacting segments of the chromatin fiber bordered by open chromatin zones and replicate by cascades of origin firing initiating at their borders and propagating to their center, possibly by fork-stimulated initiation. The conserved occurrence of this replication pattern in the germline of mammals has resulted over evolutionary times in the formation of megabase-sized domains with an N-shaped nucleotide compositional skew profile due to replication-associated mutational asymmetries. Overall, these results reveal an evolutionarily conserved but developmentally plastic organization of replication that is driving mammalian genome evolution.
Affiliation(s)
- Olivier Hyrien
- Ecole Normale Supérieure, IBENS UMR8197 U1024, Paris 75005, France.
|
31
|
Boulos RE, Arneodo A, Jensen P, Audit B. Revealing long-range interconnected hubs in human chromatin interaction data using graph theory. Phys Rev Lett 2013; 111:118102. [PMID: 24074120 DOI: 10.1103/physrevlett.111.118102] [Citation(s) in RCA: 25] [Impact Index Per Article: 2.3] [Received: 03/19/2013] [Indexed: 06/02/2023]
Abstract
We use graph theory to analyze chromatin interaction (Hi-C) data in the human genome. We show that a key functional feature of the genome--"master" replication origins--corresponds to DNA loci of maximal network centrality. These loci form a set of interconnected hubs both within chromosomes and between different chromosomes. Our results open the way to a fruitful use of graph theory concepts to decipher DNA structural organization in relation to genome functions such as replication and transcription. This quantitative information should prove useful to discriminate between possible polymer models of nuclear organization.
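The abstract refers to loci of "maximal network centrality" without naming the centrality measure. As a hedged illustration of the general idea, the sketch below treats a symmetric contact matrix as a weighted graph and computes eigenvector centrality by power iteration. The choice of eigenvector centrality, the function name and the toy matrix are assumptions for illustration, not the measure or data used in the paper.

```python
import numpy as np


def eigenvector_centrality(adjacency, n_iter=200, tol=1e-10):
    """Eigenvector centrality of a symmetric, non-negative contact matrix.

    Power iteration is run on (A + I): the shift leaves the eigenvectors
    unchanged but prevents oscillation on bipartite graphs.
    """
    adjacency = np.asarray(adjacency, dtype=float)
    n = adjacency.shape[0]
    shifted = adjacency + np.eye(n)
    scores = np.ones(n) / np.sqrt(n)
    for _ in range(n_iter):
        updated = shifted @ scores
        updated /= np.linalg.norm(updated)
        if np.linalg.norm(updated - scores) < tol:
            return updated
        scores = updated
    return scores
```

On a contact map binned along the genome, the bins with the highest scores would be the candidate hubs; for a toy star-shaped graph, the central node receives the highest score.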
Affiliation(s)
- R E Boulos
- Université de Lyon, F-69000 Lyon, France and Laboratoire de Physique, ENS de Lyon, CNRS UMR5672, F-69007 Lyon, France
|
32
|
Julienne H, Zoufir A, Audit B, Arneodo A. Epigenetic regulation of the human genome: coherence between promoter activity and large-scale chromatin environment. Front Life Sci 2013. [DOI: 10.1080/21553769.2013.832706] [Citation(s) in RCA: 4] [Impact Index Per Article: 0.4] [Indexed: 10/26/2022]
|