1
Jaksik R, Wheeler DA, Kimmel M. Detection and characterization of constitutive replication origins defined by DNA polymerase epsilon. BMC Biol 2023; 21:41. [PMID: 36829160 PMCID: PMC9960419 DOI: 10.1186/s12915-023-01527-z] [Citation(s) in RCA: 2] [Impact Index Per Article: 2.0] [Received: 03/15/2022] [Accepted: 01/24/2023] [Indexed: 02/26/2023]
Abstract
BACKGROUND Although the process of DNA replication is mechanistically highly conserved, the location of origins of replication (ORI) may vary from one tissue to the next, or between rounds of replication in eukaryotes, suggesting flexibility in the choice of locations to initiate replication. Lists of human ORI therefore vary widely in number and location, and there are currently no methods available to compare them. Here, we propose a method for detecting ORI based on somatic mutation patterns generated by the mutator phenotype of damaged DNA polymerase epsilon (POLE). RESULTS We report the genome-wide localization of constitutive ORI in POLE-mutated human tumors using whole genome sequencing data. Mutations accumulated over many rounds of replication in unsynchronized dividing cell populations in tumors make it possible to identify constitutive origins, which we show are shared with high fidelity between individuals and tumor types. Using a Smith-Waterman-like dynamic programming approach, we compared replication origin positions obtained from multiple methods. The comparison allowed us to define a consensus set of replication origins, identified consistently by multiple ORI detection methods. Many DNA features co-localized with the consensus set of ORI, including chromatin loop anchors, G-quadruplexes, S/MARs, and CpGs. Among all features, the H2A.Z histone exhibited the most significant association. CONCLUSIONS Our results show that mutation-based detection of replication origins is a viable approach to determining their location and associated sequence features.
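The abstract does not describe the scoring scheme behind the "Smith-Waterman-like" comparison, so the following is only a minimal sketch of how a local-alignment recurrence could be applied to two sorted lists of origin positions. The function name `align_origins`, the distance tolerance, and the match, mismatch and gap scores are all illustrative assumptions, not the authors' method.

```python
from typing import List, Tuple


def align_origins(
    a: List[int],
    b: List[int],
    tolerance: int = 1000,   # assumed: max distance (bp) for two origins to "match"
    match: float = 2.0,      # assumed reward for a matched pair
    mismatch: float = -1.0,  # assumed penalty for pairing distant origins
    gap: float = -1.0,       # assumed penalty for an origin present in one list only
) -> Tuple[float, List[Tuple[int, int]]]:
    """Smith-Waterman-style local alignment of two sorted position lists.

    Returns the best local score and the matched (a, b) position pairs
    along the best-scoring path.
    """
    n, m = len(a), len(b)
    H = [[0.0] * (m + 1) for _ in range(n + 1)]
    best, best_cell = 0.0, (0, 0)

    def score(i: int, j: int) -> float:
        return match if abs(a[i - 1] - b[j - 1]) <= tolerance else mismatch

    # Fill the score matrix; scores are floored at 0 (local alignment).
    for i in range(1, n + 1):
        for j in range(1, m + 1):
            H[i][j] = max(
                0.0,
                H[i - 1][j - 1] + score(i, j),
                H[i - 1][j] + gap,
                H[i][j - 1] + gap,
            )
            if H[i][j] > best:
                best, best_cell = H[i][j], (i, j)

    # Trace back from the best cell until the score drops to 0.
    pairs: List[Tuple[int, int]] = []
    i, j = best_cell
    while i > 0 and j > 0 and H[i][j] > 0:
        s = score(i, j)
        if H[i][j] == H[i - 1][j - 1] + s:
            if s == match:
                pairs.append((a[i - 1], b[j - 1]))
            i, j = i - 1, j - 1
        elif H[i][j] == H[i - 1][j] + gap:
            i -= 1
        else:
            j -= 1
    pairs.reverse()
    return best, pairs


# Hypothetical origin positions (bp) from two detection methods.
method_a = [1000, 5000, 9000, 20000]
method_b = [1200, 5100, 14000, 20500]
best_score, matched = align_origins(method_a, method_b, tolerance=500)
print(best_score, matched)
```

Origins that fall within the tolerance in both lists end up in `matched`, which is one simple way a consensus set could be assembled; origins seen by only one method are absorbed as mismatches or gaps.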
Affiliation(s)
- Roman Jaksik
- Department of Systems Biology and Engineering and Biotechnology Centre, Silesian University of Technology, Gliwice, Poland.
- David A. Wheeler
- Human Genome Sequencing Centre, Baylor College of Medicine, Houston, TX, USA (grid.39382.33; 0000 0001 2160 926X); Present Address: Clinical Genomics Group, Department of Computational Biology, St Jude Children’s Research Hospital, Memphis, TN 38103, USA (grid.240871.8; 0000 0001 0224 711X)
- Marek Kimmel
- Department of Systems Biology and Engineering and Biotechnology Centre, Silesian University of Technology, Gliwice, Poland (grid.6979.1; 0000 0001 2335 3149); Department of Statistics, Rice University, Houston, TX, USA (grid.21940.3e; 0000 0004 1936 8278); Department of Bioengineering, Rice University, Houston, TX, USA (grid.21940.3e; 0000 0004 1936 8278)
2
Blin M, Lacroix L, Petryk N, Jaszczyszyn Y, Chen CL, Hyrien O, Le Tallec B. DNA molecular combing-based replication fork directionality profiling. Nucleic Acids Res 2021; 49:e69. [PMID: 33836085 PMCID: PMC8266662 DOI: 10.1093/nar/gkab219] [Citation(s) in RCA: 5] [Impact Index Per Article: 1.7] [Received: 12/23/2020] [Revised: 03/15/2021] [Accepted: 03/19/2021] [Indexed: 01/05/2023]
Abstract
The replication strategy of metazoan genomes is still unclear, mainly because definitive maps of replication origins are missing. High-throughput methods are based on population average and thus may exclusively identify efficient initiation sites, whereas inefficient origins go undetected. Single-molecule analyses of specific loci can detect both common and rare initiation events along the targeted regions. However, these usually concentrate on positioning individual events, which only gives an overview of the replication dynamics. Here, we computed the replication fork directionality (RFD) profiles of two large genes in different transcriptional states in chicken DT40 cells, namely untranscribed and transcribed DMD and CCSER1 expressed at WT levels or overexpressed, by aggregating hundreds of oriented replication tracks detected on individual DNA fibres stretched by molecular combing. These profiles reconstituted RFD domains composed of zones of initiation flanking a zone of termination originally observed in mammalian genomes and were highly consistent with independent population-averaging profiles generated by Okazaki fragment sequencing. Importantly, we demonstrate that inefficient origins do not appear as detectable RFD shifts, explaining why dispersed initiation has remained invisible to population-based assays. Our method can both generate quantitative profiles and identify discrete events, thereby constituting a comprehensive approach to study metazoan genome replication.
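To make the aggregation step concrete, the sketch below computes a binned RFD profile from oriented tracks using the usual definition RFD = (R - L) / (R + L), where R and L are the numbers of rightward- and leftward-moving forks covering a bin. The track tuple layout, the function name `rfd_profile` and the bin size are assumptions made for illustration; this is not the authors' pipeline.

```python
from typing import List, Optional, Tuple

# (start, end, direction): direction is +1 for a rightward fork, -1 for leftward.
Track = Tuple[int, int, int]


def rfd_profile(
    tracks: List[Track],
    region_start: int,
    region_end: int,
    bin_size: int,
) -> List[Optional[float]]:
    """Aggregate oriented tracks into a binned RFD profile.

    Each bin gets (R - L) / (R + L); bins with no coverage get None.
    """
    n_bins = (region_end - region_start + bin_size - 1) // bin_size
    right = [0] * n_bins
    left = [0] * n_bins

    for start, end, direction in tracks:
        # Clip the track to the region of interest (half-open intervals).
        lo = max(start, region_start)
        hi = min(end, region_end)
        if lo >= hi:
            continue
        first = (lo - region_start) // bin_size
        last = (hi - 1 - region_start) // bin_size
        for b in range(first, last + 1):
            if direction > 0:
                right[b] += 1
            else:
                left[b] += 1

    profile: List[Optional[float]] = []
    for r, l in zip(right, left):
        total = r + l
        profile.append((r - l) / total if total else None)
    return profile


# Hypothetical tracks on a 200 bp toy region, 50 bp bins.
toy_tracks = [(0, 100, +1), (50, 150, -1), (100, 200, -1)]
print(rfd_profile(toy_tracks, 0, 200, 50))
```

In such a profile, an ascending RFD segment corresponds to a zone of net initiation and a descending segment to a zone of net termination, which is how the RFD domains described in the abstract are read.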
Affiliation(s)
- Marion Blin
- Département de Gastro-entérologie, pôle MAD, Assistance Publique des Hôpitaux de Marseille, Centre Hospitalier Universitaire de Marseille, Marseille, France
- Laurent Lacroix
- Institut de Biologie de l’Ecole Normale Supérieure (IBENS), Ecole Normale Supérieure, CNRS, INSERM, Université PSL, 46 rue d’Ulm, F-75005 Paris, France
- Nataliya Petryk
- Institut de Biologie de l’Ecole Normale Supérieure (IBENS), Ecole Normale Supérieure, CNRS, INSERM, Université PSL, 46 rue d’Ulm, F-75005 Paris, France
- Université Paris-Saclay, CEA, CNRS, Institute for Integrative Biology of the Cell (I2BC), F-91198 Gif-sur-Yvette, France
- Yan Jaszczyszyn
- Université Paris-Saclay, CEA, CNRS, Institute for Integrative Biology of the Cell (I2BC), F-91198 Gif-sur-Yvette, France
- Chun-Long Chen
- Institut Curie, Université PSL, Sorbonne Université, CNRS UMR3244, F-75005 Paris, France
- Olivier Hyrien
- Institut de Biologie de l’Ecole Normale Supérieure (IBENS), Ecole Normale Supérieure, CNRS, INSERM, Université PSL, 46 rue d’Ulm, F-75005 Paris, France
- Benoît Le Tallec
- Institut de Biologie de l’Ecole Normale Supérieure (IBENS), Ecole Normale Supérieure, CNRS, INSERM, Université PSL, 46 rue d’Ulm, F-75005 Paris, France
3
Pongor LS, Gross JM, Vera Alvarez R, Murai J, Jang SM, Zhang H, Redon C, Fu H, Huang SY, Thakur B, Baris A, Marino-Ramirez L, Landsman D, Aladjem MI, Pommier Y. BAMscale: quantification of next-generation sequencing peaks and generation of scaled coverage tracks. Epigenetics Chromatin 2020; 13:21. [PMID: 32321568 PMCID: PMC7175505 DOI: 10.1186/s13072-020-00343-x] [Citation(s) in RCA: 25] [Impact Index Per Article: 6.3] [Received: 01/10/2020] [Accepted: 04/11/2020] [Indexed: 12/12/2022]
Abstract
Background Next-generation sequencing allows genome-wide analysis of changes in chromatin states and gene expression. Data analysis for these increasingly used methods requires either multiple analysis steps or extensive computational time. We sought to develop a tool for rapid quantification of sequencing peaks from diverse experimental sources and an efficient method to produce coverage tracks for accurate visualization that can be intuitively displayed and interpreted by experimentalists with minimal bioinformatics background. We demonstrate its strength and usability by integrating data from several types of sequencing approaches. Results We have developed BAMscale, a one-step tool that processes a wide set of sequencing datasets. To demonstrate the usefulness of BAMscale, we analyzed multiple sequencing datasets from chromatin immunoprecipitation sequencing data (ChIP-seq), chromatin state change data (assay for transposase-accessible chromatin using sequencing: ATAC-seq, DNA double-strand break mapping sequencing: END-seq), DNA replication data (Okazaki fragments sequencing: OK-seq, nascent-strand sequencing: NS-seq, single-cell replication timing sequencing: scRepli-seq) and RNA-seq data. The outputs consist of raw and normalized peak scores (multiple normalizations) in text format and scaled bigWig coverage tracks that are directly accessible to data visualization programs. BAMscale also includes a visualization module facilitating direct, on-demand quantitative peak comparisons that can be used by experimentalists. Our tool can effectively analyze large sequencing datasets (~100 Gb) in minutes, outperforming currently available tools. Conclusions BAMscale accurately quantifies and normalizes identified peaks directly from BAM files, and creates coverage tracks for visualization in genome browsers. BAMscale can be implemented for a wide set of methods for calculating coverage tracks, including ChIP-seq and ATAC-seq, as well as methods that currently require specialized, separate tools for analyses, such as splice-aware RNA-seq, END-seq and OK-seq for which no dedicated software is available. BAMscale is freely available on GitHub (https://github.com/ncbi/BAMscale).
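The abstract mentions "multiple normalizations" of peak scores without naming them. As a point of reference, the sketch below shows two widely used library-size normalizations, counts per million (CPM) and reads per kilobase per million (RPKM). The function names and the toy peak values are assumptions for illustration; this is not BAMscale's code or command-line interface.

```python
def counts_per_million(read_count: int, total_reads: int) -> float:
    """Scale a raw peak read count by library size (reads per million)."""
    if total_reads <= 0:
        raise ValueError("total_reads must be positive")
    return read_count * 1e6 / total_reads


def rpkm(read_count: int, peak_length_bp: int, total_reads: int) -> float:
    """Reads per kilobase of peak per million mapped reads."""
    if peak_length_bp <= 0 or total_reads <= 0:
        raise ValueError("peak_length_bp and total_reads must be positive")
    return read_count * 1e9 / (peak_length_bp * total_reads)


# Hypothetical peaks: (name, length in bp, raw read count), 1,000,000 mapped reads.
peaks = [("peak_1", 1000, 100), ("peak_2", 500, 100)]
library_size = 1_000_000
for name, length, count in peaks:
    print(name, counts_per_million(count, library_size), rpkm(count, length, library_size))
```

CPM makes peak scores comparable between libraries of different depth, while RPKM additionally corrects for peak length, so two peaks with the same raw count but different widths receive different scores.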
Affiliation(s)
- Lorinc S Pongor
- Developmental Therapeutics Branch and Laboratory of Molecular Pharmacology, Center for Cancer Research, National Cancer Institute, NIH, 37 Convent Dr, Bethesda, MD, 20892, USA
- Jacob M Gross
- Developmental Therapeutics Branch and Laboratory of Molecular Pharmacology, Center for Cancer Research, National Cancer Institute, NIH, 37 Convent Dr, Bethesda, MD, 20892, USA
- Roberto Vera Alvarez
- Computational Biology Branch, National Center for Biotechnology Information, National Library of Medicine, NIH, 8600 Rockville Pike, Bethesda, MD, 20892, USA
- Junko Murai
- Developmental Therapeutics Branch and Laboratory of Molecular Pharmacology, Center for Cancer Research, National Cancer Institute, NIH, 37 Convent Dr, Bethesda, MD, 20892, USA
- Sang-Min Jang
- Developmental Therapeutics Branch and Laboratory of Molecular Pharmacology, Center for Cancer Research, National Cancer Institute, NIH, 37 Convent Dr, Bethesda, MD, 20892, USA
- Hongliang Zhang
- Developmental Therapeutics Branch and Laboratory of Molecular Pharmacology, Center for Cancer Research, National Cancer Institute, NIH, 37 Convent Dr, Bethesda, MD, 20892, USA
- Christophe Redon
- Developmental Therapeutics Branch and Laboratory of Molecular Pharmacology, Center for Cancer Research, National Cancer Institute, NIH, 37 Convent Dr, Bethesda, MD, 20892, USA
- Haiqing Fu
- Developmental Therapeutics Branch and Laboratory of Molecular Pharmacology, Center for Cancer Research, National Cancer Institute, NIH, 37 Convent Dr, Bethesda, MD, 20892, USA
- Shar-Yin Huang
- Developmental Therapeutics Branch and Laboratory of Molecular Pharmacology, Center for Cancer Research, National Cancer Institute, NIH, 37 Convent Dr, Bethesda, MD, 20892, USA
- Bhushan Thakur
- Developmental Therapeutics Branch and Laboratory of Molecular Pharmacology, Center for Cancer Research, National Cancer Institute, NIH, 37 Convent Dr, Bethesda, MD, 20892, USA
- Adrian Baris
- Developmental Therapeutics Branch and Laboratory of Molecular Pharmacology, Center for Cancer Research, National Cancer Institute, NIH, 37 Convent Dr, Bethesda, MD, 20892, USA
- Leonardo Marino-Ramirez
- Computational Biology Branch, National Center for Biotechnology Information, National Library of Medicine, NIH, 8600 Rockville Pike, Bethesda, MD, 20892, USA
- David Landsman
- Computational Biology Branch, National Center for Biotechnology Information, National Library of Medicine, NIH, 8600 Rockville Pike, Bethesda, MD, 20892, USA
- Mirit I Aladjem
- Developmental Therapeutics Branch and Laboratory of Molecular Pharmacology, Center for Cancer Research, National Cancer Institute, NIH, 37 Convent Dr, Bethesda, MD, 20892, USA
- Yves Pommier
- Developmental Therapeutics Branch and Laboratory of Molecular Pharmacology, Center for Cancer Research, National Cancer Institute, NIH, 37 Convent Dr, Bethesda, MD, 20892, USA