1. Liu J, Zhou D. Minimum Functional Length Analysis of K-Mer Based on BPNN. IEEE/ACM Transactions on Computational Biology and Bioinformatics 2022; 19:2920-2925. [PMID: 34310316 DOI: 10.1109/tcbb.2021.3098512] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 06/13/2023]
Abstract
The BP neural network (BPNN), a multilayer feed-forward network, can achieve deep cognition of target data and high output accuracy. However, no study has yet applied BPNN to k-mers. In the present study, a BPNN was trained and tested on binary classification data for each classification mode. All k-mers were divided into two categories, either according to their X + Y content or completely at random. The results showed that 1) for the X + Y content mode, the accuracy of k-mer classification was 100 percent whether k ≤ 6 or k ≥ 7; 2) for the completely random mode, the accuracy was 100 percent for k ≤ 6, but for k ≥ 7 it was below 100 percent and decreased gradually as k increased, approaching 50 percent. The k-mers with k ≥ 7 should be the basic functional fragments of nucleic acid and perform basic nucleic acid functions in the DNA sequence. The k-mers with k ≤ 6 should be the basic component fragments of nucleic acid and no longer perform basic nucleic acid functions.
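The two labelling modes described in this abstract can be illustrated with a minimal Python sketch. It is not the authors' code: the abstract does not say which bases X and Y stand for, so G + C is assumed here, and the "more than half" threshold, the function names and the random seed are all hypothetical choices made for illustration. The sketch only builds the labelled k-mer sets; it does not train a network.

```python
import itertools
import random


def all_kmers(k, alphabet="ACGT"):
    """Enumerate every k-mer over the alphabet (4**k strings for DNA)."""
    return ["".join(p) for p in itertools.product(alphabet, repeat=k)]


def label_by_content(kmer, bases="GC"):
    """Content mode: class 1 if more than half of the positions are in `bases`.

    Both the choice of G + C and the threshold are assumptions.
    """
    count = sum(kmer.count(b) for b in bases)
    return 1 if 2 * count > len(kmer) else 0


def label_randomly(kmers, seed=0):
    """Random mode: assign each k-mer to class 0 or 1 independently of its sequence."""
    rng = random.Random(seed)
    return {kmer: rng.randint(0, 1) for kmer in kmers}


for k in (6, 7):
    kmers = all_kmers(k)
    content_labels = {kmer: label_by_content(kmer) for kmer in kmers}
    random_labels = label_randomly(kmers)
    print(k, len(kmers))
```

In the content mode the label is a simple function of base composition, whereas in the random mode it carries no sequence signal, so a classifier can only fit it by memorising individual k-mers, and the number of k-mers quadruples with each increment of k.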

2. Ho AT, Hurst LD. Unusual mammalian usage of TGA stop codons reveals that sequence conservation need not imply purifying selection. PLoS Biol 2022; 20:e3001588. [PMID: 35550630 PMCID: PMC9129041 DOI: 10.1371/journal.pbio.3001588] [Citation(s) in RCA: 5] [Impact Index Per Article: 1.7] [Received: 02/17/2022] [Revised: 05/24/2022] [Accepted: 04/20/2022] [Indexed: 11/18/2022] Open
Abstract
The assumption that conservation of sequence implies the action of purifying selection is central to diverse methodologies to infer functional importance. GC-biased gene conversion (gBGC), a meiotic mismatch repair bias strongly favouring GC over AT, can in principle mimic the action of selection, this being thought to be especially important in mammals. As mutation is GC→AT biased, to demonstrate that gBGC does indeed cause false signals requires evidence that an AT-rich residue is selectively optimal compared to its more GC-rich allele, while showing also that the GC-rich alternative is conserved. We propose that mammalian stop codon evolution provides a robust test case. Although in most taxa TAA is the optimal stop codon, TGA is both abundant and conserved in mammalian genomes. We show that this mammalian exceptionalism is well explained by gBGC mimicking purifying selection and that TAA is the selectively optimal codon. Supportive of gBGC, we observe (i) TGA usage trends are consistent at the focal stop codon and elsewhere (in UTR sequences); (ii) that higher TGA usage and higher TAA→TGA substitution rates are predicted by a high recombination rate; and (iii) across species the difference in TAA <-> TGA substitution rates between GC-rich and GC-poor genes is largest in genomes that possess higher between-gene GC variation. TAA optimality is supported both by enrichment in highly expressed genes and trends associated with effective population size. High TGA usage and high TAA→TGA rates in mammals are thus consistent with gBGC’s predicted ability to “drive” deleterious mutations and supports the hypothesis that sequence conservation need not be indicative of purifying selection. A general trend for GC-rich trinucleotides to reside at frequencies far above their mutational equilibrium in high recombining domains supports the generality of these results.
Affiliation(s)
- Alexander Thomas Ho
  - Milner Centre for Evolution, University of Bath, Bath, United Kingdom

3. Gnan S, Matelot M, Weiman M, Arnaiz O, Guérin F, Sperling L, Bétermier M, Thermes C, Chen CL, Duharcourt S. GC content, but not nucleosome positioning, directly contributes to intron splicing efficiency in Paramecium. Genome Res 2022; 32:699-709. [PMID: 35264448 PMCID: PMC8997360 DOI: 10.1101/gr.276125.121] [Citation(s) in RCA: 4] [Impact Index Per Article: 1.3] [Received: 08/20/2021] [Accepted: 02/14/2022] [Indexed: 11/24/2022]
Abstract
Eukaryotic genes are interrupted by introns that must be accurately spliced from mRNA precursors. With an average length of 25 nt, the more than 90,000 introns of Paramecium tetraurelia stand among the shortest introns reported in eukaryotes. The mechanisms specifying the correct recognition of these tiny introns remain poorly understood. Splicing can occur cotranscriptionally, and it has been proposed that chromatin structure might influence splice site recognition. To investigate the roles of nucleosome positioning in intron recognition, we determined the nucleosome occupancy along the P. tetraurelia genome. We show that P. tetraurelia displays a regular nucleosome array with a nucleosome repeat length of ∼151 bp, among the smallest periodicities reported. Our analysis has revealed that introns are frequently associated with inter-nucleosomal DNA, pointing to an evolutionary constraint favoring introns at the AT-rich nucleosome edge sequences. Using accurate splicing efficiency data from cells depleted for nonsense-mediated decay effectors, we show that introns located at the edge of nucleosomes display higher splicing efficiency than those at the center. However, multiple regression analysis indicates that the low GC content of introns, rather than nucleosome positioning, is associated with high splicing efficiency. Our data reveal a complex link between GC content, nucleosome positioning, and intron evolution in Paramecium.
Affiliation(s)
- Stefano Gnan
  - Institut Curie, Université PSL, Sorbonne Université, CNRS UMR3244, Dynamics of Genetic Information, Paris, 75005 France
- Mélody Matelot
  - Université Paris Cité, CNRS, Institut Jacques Monod, F-75013 Paris, France
- Marion Weiman
  - Université Paris-Saclay, CEA, CNRS, Institute for Integrative Biology of the Cell (I2BC), 91198 Gif-sur-Yvette, France
- Olivier Arnaiz
  - Université Paris-Saclay, CEA, CNRS, Institute for Integrative Biology of the Cell (I2BC), 91198 Gif-sur-Yvette, France
- Frédéric Guérin
  - Université Paris Cité, CNRS, Institut Jacques Monod, F-75013 Paris, France
- Linda Sperling
  - Université Paris-Saclay, CEA, CNRS, Institute for Integrative Biology of the Cell (I2BC), 91198 Gif-sur-Yvette, France
- Mireille Bétermier
  - Université Paris-Saclay, CEA, CNRS, Institute for Integrative Biology of the Cell (I2BC), 91198 Gif-sur-Yvette, France
- Claude Thermes
  - Université Paris-Saclay, CEA, CNRS, Institute for Integrative Biology of the Cell (I2BC), 91198 Gif-sur-Yvette, France
- Chun-Long Chen
  - Institut Curie, Université PSL, Sorbonne Université, CNRS UMR3244, Dynamics of Genetic Information, Paris, 75005 France
- Sandra Duharcourt
  - Université Paris Cité, CNRS, Institut Jacques Monod, F-75013 Paris, France

4. Barbier J, Vaillant C, Volff JN, Brunet FG, Audit B. Coupling between Sequence-Mediated Nucleosome Organization and Genome Evolution. Genes (Basel) 2021; 12:851. [PMID: 34205881 PMCID: PMC8228248 DOI: 10.3390/genes12060851] [Citation(s) in RCA: 3] [Impact Index Per Article: 0.8] [Received: 05/01/2021] [Revised: 05/27/2021] [Accepted: 05/27/2021] [Indexed: 12/12/2022] Open
Abstract
The nucleosome is a major modulator of DNA accessibility to other cellular factors. Nucleosome positioning has a critical importance in regulating cell processes such as transcription, replication, recombination or DNA repair. The DNA sequence has an influence on the position of nucleosomes on genomes, although other factors are also implicated, such as ATP-dependent remodelers or competition of the nucleosome with DNA binding proteins. Different sequence motifs can promote or inhibit the nucleosome formation, thus influencing the accessibility to the DNA. Sequence-encoded nucleosome positioning having functional consequences on cell processes can then be selected or counter-selected during evolution. We review the interplay between sequence evolution and nucleosome positioning evolution. We first focus on the different ways to encode nucleosome positions in the DNA sequence, and to which extent these mechanisms are responsible of genome-wide nucleosome positioning in vivo. Then, we discuss the findings about selection of sequences for their nucleosomal properties. Finally, we illustrate how the nucleosome can directly influence sequence evolution through its interactions with DNA damage and repair mechanisms. This review aims to provide an overview of the mutual influence of sequence evolution and nucleosome positioning evolution, possibly leading to complex evolutionary dynamics.
Affiliation(s)
- Jérémy Barbier
  - Institut de Génomique Fonctionnelle de Lyon, Univ Lyon, CNRS UMR 5242, Ecole Normale Supérieure de Lyon, Univ Claude Bernard Lyon 1, F-69364 Lyon, France; (J.B.); (F.G.B.)
  - Laboratoire de Physique, Univ Lyon, ENS de Lyon, CNRS, F-69342 Lyon, France
- Cédric Vaillant
  - Laboratoire de Physique, Univ Lyon, ENS de Lyon, CNRS, F-69342 Lyon, France
- Jean-Nicolas Volff
  - Institut de Génomique Fonctionnelle de Lyon, Univ Lyon, CNRS UMR 5242, Ecole Normale Supérieure de Lyon, Univ Claude Bernard Lyon 1, F-69364 Lyon, France; (J.B.); (F.G.B.)
  - Correspondence: (J.-N.V.); (B.A.)
- Frédéric G. Brunet
  - Institut de Génomique Fonctionnelle de Lyon, Univ Lyon, CNRS UMR 5242, Ecole Normale Supérieure de Lyon, Univ Claude Bernard Lyon 1, F-69364 Lyon, France; (J.B.); (F.G.B.)
- Benjamin Audit
  - Laboratoire de Physique, Univ Lyon, ENS de Lyon, CNRS, F-69342 Lyon, France
  - Correspondence: (J.-N.V.); (B.A.)

5. Li Y, Shen QS, Peng Q, Ding W, Zhang J, Zhong X, An NA, Ji M, Zhou WZ, Li CY. Polyadenylation-related isoform switching in human evolution revealed by full-length transcript structure. Brief Bioinform 2021; 22:6273384. [PMID: 33973996 PMCID: PMC8574621 DOI: 10.1093/bib/bbab157] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 01/27/2021] [Revised: 03/22/2021] [Accepted: 04/04/2021] [Indexed: 11/26/2022] Open
Abstract
Rhesus macaque is a unique nonhuman primate model for human evolutionary and translational study, but the error-prone gene models critically limit its applications. Here, we de novo defined full-length macaque gene models based on single molecule, long-read transcriptome sequencing in four macaque tissues (frontal cortex, cerebellum, heart and testis). Overall, 8 588 227 poly(A)-bearing complementary DNA reads with a mean length of 14 106 nt were generated to compile the backbone of macaque transcripts, with the fine-scale structures further refined by RNA sequencing and cap analysis gene expression sequencing data. In total, 51 605 macaque gene models were accurately defined, covering 89.7% of macaque or 75.7% of human orthologous genes. Based on the full-length gene models, we performed a human–macaque comparative analysis on polyadenylation (PA) regulation. Using macaque and mouse as outgroup species, we identified 79 distal PA events newly originated in humans and found that the strengthening of the distal PA sites, rather than the weakening of the proximal sites, predominantly contributes to the origination of these human-specific isoforms. Notably, these isoforms are selectively constrained in general and contribute to the temporospatially specific reduction of gene expression, through the tinkering of previously existed mechanisms of nuclear retention and microRNA (miRNA) regulation. Overall, the protocol and resource highlight the application of bioinformatics in integrating multilayer genomics data to provide an intact reference for model animal studies, and the isoform switching detected may constitute a hitherto underestimated regulatory layer in shaping the human-specific transcriptome and phenotypic changes.
Affiliation(s)
- Yumei Li
  - Laboratory of Bioinformatics and Genomic Medicine, Institute of Molecular Medicine, Peking University, Beijing, China
- Qing Sunny Shen
  - Laboratory of Bioinformatics and Genomic Medicine, Institute of Molecular Medicine, Peking University, Beijing, China
- Qi Peng
  - Laboratory of Bioinformatics and Genomic Medicine, Institute of Molecular Medicine, Peking University, Beijing, China
  - College of Future Technology, Peking University, Beijing, China
- Wanqiu Ding
  - Laboratory of Bioinformatics and Genomic Medicine, Institute of Molecular Medicine, Peking University, Beijing, China
  - College of Future Technology, Peking University, Beijing, China
- Jie Zhang
  - Laboratory of Bioinformatics and Genomic Medicine, Institute of Molecular Medicine, Peking University, Beijing, China
  - College of Future Technology, Peking University, Beijing, China
- Xiaoming Zhong
  - Laboratory of Bioinformatics and Genomic Medicine, Institute of Molecular Medicine, Peking University, Beijing, China
- Ni A An
  - Laboratory of Bioinformatics and Genomic Medicine, Institute of Molecular Medicine, Peking University, Beijing, China
  - College of Future Technology, Peking University, Beijing, China
- Mingjun Ji
  - Laboratory of Bioinformatics and Genomic Medicine, Institute of Molecular Medicine, Peking University, Beijing, China
  - College of Future Technology, Peking University, Beijing, China
- Wei-Zhen Zhou
  - State Key Laboratory of Cardiovascular Disease, Fuwai Hospital, Beijing, China
- Chuan-Yun Li
  - Laboratory of Bioinformatics and Genomic Medicine, Institute of Molecular Medicine, Peking University, Beijing, China
  - College of Future Technology, Peking University, Beijing, China

6. Espiritu D, Gribkova AK, Gupta S, Shaytan AK, Panchenko AR. Molecular Mechanisms of Oncogenesis through the Lens of Nucleosomes and Histones. J Phys Chem B 2021; 125:3963-3976. [PMID: 33769808 DOI: 10.1021/acs.jpcb.1c00694] [Citation(s) in RCA: 14] [Impact Index Per Article: 3.5] [Indexed: 12/12/2022]
Abstract
At the cellular level, cancer is the disease of both the genome and the epigenome, and the interplay between genetic mutations and epigenetic states may occur at the level of elementary chromatin units, the nucleosomes. They are formed by a segment of DNA wrapped around an octamer of histone proteins. In this review, we survey various mechanisms of cancer etiology and progression mediated by histones and nucleosomes. In particular, we discuss the effects of mutations in histones, changes in their expression and slicing on epigenetic dysregulation and carcinogenesis. The links between cancer phenotypes and differential expression of histone variants and isoforms are summarized. Finally, we discourse the geometric and steric effects of DNA compaction in nucleosomes on DNA mutation rate, interactions with transcription factors, including pioneer transcription factors, and prospects of cancer cells' genome and epigenome editing.
Affiliation(s)
- Daniel Espiritu
  - Department of Pathology and Molecular Medicine, School of Medicine, Queen's University, Kingston, Ontario, Canada
- Anna K Gribkova
  - Department of Biology, Lomonosov Moscow State University, 1-12 Leninskie Gory, Moscow, 119991, Russia
  - Sirius University of Science and Technology, 1 Olympic Avenue, Sochi, 354340, Russia
- Shubhangi Gupta
  - Department of Pathology and Molecular Medicine, School of Medicine, Queen's University, Kingston, Ontario, Canada
- Alexey K Shaytan
  - Department of Biology, Lomonosov Moscow State University, 1-12 Leninskie Gory, Moscow, 119991, Russia
  - Sirius University of Science and Technology, 1 Olympic Avenue, Sochi, 354340, Russia
  - Bioinformatics Lab, Faculty of Computer Science, HSE University, 11 Pokrovsky Boulevard, Moscow, 109028, Russia
- Anna R Panchenko
  - Department of Pathology and Molecular Medicine, School of Medicine, Queen's University, Kingston, Ontario, Canada
  - Ontario Institute of Cancer Research, Toronto, Ontario, Canada

7. Lei H, Tao K. Somatic mutations in colorectal cancer are associated with the epigenetic modifications. J Cell Mol Med 2020; 24:11828-11836. [PMID: 32865336 PMCID: PMC7579689 DOI: 10.1111/jcmm.15799] [Citation(s) in RCA: 3] [Impact Index Per Article: 0.6] [Received: 02/15/2020] [Revised: 07/22/2020] [Accepted: 08/09/2020] [Indexed: 01/13/2023] Open
Abstract
Colorectal cancer (CRC) mostly arises from progressive accumulation of somatic mutations within cells. Most commonly mutated genes like TP53, APC and KRAS can promote survival and proliferation of cancer cells. Although the molecular alterations and landscape of some specific mutations in CRC are well known, the presence of a somatic mutation signature related to genomic regions and epigenetic markers remain unclear. To find the signatures from a random distribution of somatic mutations in CRCs, we carried out enrichment analysis in different genomic regions and identified peaks of epigenetic markers. We validated that the mutation frequency in miRNA is dramatically higher than in flanking genomic regions. Moreover, we observed that somatic mutations in CRC and colon cancer cell lines are significantly enriched in CTCF binding sites. We also found these mutations are enriched for H3K27me3 in both normal sigmoid colon and colon cancer cell lines. Taken together, our findings suggest that there are some common somatic mutations signatures which provide new directions to study CRC.
Affiliation(s)
- Hongwei Lei
  - Department of Gastrointestinal Surgery, Union Hospital, Tongji Medical College, Huazhong University of Science and Technology, Wuhan, China
- Kaixiong Tao
  - Department of Gastrointestinal Surgery, Union Hospital, Tongji Medical College, Huazhong University of Science and Technology, Wuhan, China

8. Feng JX, Riddle NC. Epigenetics and genome stability. Mamm Genome 2020; 31:181-195. [PMID: 32296924 DOI: 10.1007/s00335-020-09836-2] [Citation(s) in RCA: 16] [Impact Index Per Article: 3.2] [Received: 11/24/2019] [Accepted: 04/07/2020] [Indexed: 12/19/2022]
Abstract
Maintaining genome stability is essential to an organism's health and survival. Breakdown of the mechanisms protecting the genome and the resulting genome instability are an important aspect of the aging process and have been linked to diseases such as cancer. Thus, a large network of interconnected pathways is responsible for ensuring genome integrity in the face of the continuous challenges that induce DNA damage. While these pathways are diverse, epigenetic mechanisms play a central role in many of them. DNA modifications, histone variants and modifications, chromatin structure, and non-coding RNAs all carry out a variety of functions to ensure that genome stability is maintained. Epigenetic mechanisms ensure the functions of centromeres and telomeres that are essential for genome stability. Epigenetic mechanisms also protect the genome from the invasion by transposable elements and contribute to various DNA repair pathways. In this review, we highlight the integral role of epigenetic mechanisms in the maintenance of genome stability and draw attention to issues in need of further study.
Affiliation(s)
- Justina X Feng
  - Department of Biology, The University of Alabama at Birmingham, Birmingham, AL, USA
- Nicole C Riddle
  - Department of Biology, The University of Alabama at Birmingham, Birmingham, AL, USA

9. Li C, Luscombe NM. Nucleosome positioning stability is a modulator of germline mutation rate variation across the human genome. Nat Commun 2020; 11:1363. [PMID: 32170069 PMCID: PMC7070026 DOI: 10.1038/s41467-020-15185-0] [Citation(s) in RCA: 16] [Impact Index Per Article: 3.2] [Received: 10/18/2019] [Accepted: 02/23/2020] [Indexed: 02/08/2023] Open
Abstract
Nucleosome organization has been suggested to affect local mutation rates in the genome. However, the lack of de novo mutation and high-resolution nucleosome data has limited the investigation of this hypothesis. Additionally, analyses using indirect mutation rate measurements have yielded contradictory and potentially confounding results. Here, we combine data on >300,000 human de novo mutations with high-resolution nucleosome maps and find substantially elevated mutation rates around translationally stable (‘strong’) nucleosomes. We show that the mutational mechanisms affected by strong nucleosomes are low-fidelity replication, insufficient mismatch repair and increased double-strand breaks. Strong nucleosomes preferentially locate within young SINE/LINE transposons, suggesting that when subject to increased mutation rates, transposons are then more rapidly inactivated. Depletion of strong nucleosomes in older transposons suggests frequent positioning changes during evolution. The findings have important implications for human genetics and genome evolution.

Nucleosome organization has been suggested to affect local mutation rates in the genome. Here, the authors analyse data on >300,000 human de novo mutations and high-resolution nucleosome maps and provide evidence that nucleosome positioning stability modulates germline mutation rate variation across the human genome.
Affiliation(s)
- Cai Li
  - The Francis Crick Institute, London, NW1 1AT, UK
  - School of Life Sciences, Sun Yat-sen University, Guangzhou, 510275, China
- Nicholas M Luscombe
  - The Francis Crick Institute, London, NW1 1AT, UK
  - Okinawa Institute of Science & Technology Graduate University, Okinawa, 904-0495, Japan
  - UCL Genetics Institute, University College London, London, WC1E 6BT, UK

10. Gonzalez-Perez A, Sabarinathan R, Lopez-Bigas N. Local Determinants of the Mutational Landscape of the Human Genome. Cell 2019; 177:101-114. [PMID: 30901533 DOI: 10.1016/j.cell.2019.02.051] [Citation(s) in RCA: 109] [Impact Index Per Article: 21.8] [Received: 12/07/2018] [Revised: 02/13/2019] [Accepted: 02/26/2019] [Indexed: 12/19/2022]
Abstract
Large-scale chromatin features, such as replication time and accessibility influence the rate of somatic and germline mutations at the megabase scale. This article reviews how local chromatin structures -e.g., DNA wrapped around nucleosomes, transcription factors bound to DNA- affect the mutation rate at a local scale. It dissects how the interaction of some mutagenic agents and/or DNA repair systems with these local structures influence the generation of mutations. We discuss how this local mutation rate variability affects our understanding of the evolution of the genomic sequence, and the study of the evolution of organisms and tumors.
Affiliation(s)
- Abel Gonzalez-Perez
  - Institute for Research in Biomedicine (IRB Barcelona), The Barcelona Institute of Science and Technology, Baldiri Reixac, 10, 08028 Barcelona, Spain
  - Research Program on Biomedical Informatics, Universitat Pompeu Fabra, Barcelona, Catalonia, Spain
- Radhakrishnan Sabarinathan
  - National Centre for Biological Sciences, Tata Institute of Fundamental Research, Bangalore 560065, India
- Nuria Lopez-Bigas
  - Institute for Research in Biomedicine (IRB Barcelona), The Barcelona Institute of Science and Technology, Baldiri Reixac, 10, 08028 Barcelona, Spain
  - Research Program on Biomedical Informatics, Universitat Pompeu Fabra, Barcelona, Catalonia, Spain
  - Institució Catalana de Recerca i Estudis Avançats (ICREA), Barcelona, Spain

11. Drinnenberg IA, Berger F, Elsässer SJ, Andersen PR, Ausió J, Bickmore WA, Blackwell AR, Erwin DH, Gahan JM, Gaut BS, Harvey ZH, Henikoff S, Kao JY, Kurdistani SK, Lemos B, Levine MT, Luger K, Malik HS, Martín-Durán JM, Peichel CL, Renfree MB, Rutowicz K, Sarkies P, Schmitz RJ, Technau U, Thornton JW, Warnecke T, Wolfe KH. EvoChromo: towards a synthesis of chromatin biology and evolution. Development 2019; 146:dev178962. [PMID: 31558570 DOI: 10.1242/dev.178962] [Citation(s) in RCA: 12] [Impact Index Per Article: 2.0] [Indexed: 12/22/2022]
Abstract
Over the past few years, interest in chromatin and its evolution has grown. To further advance these interests, we organized a workshop with the support of The Company of Biologists to debate the current state of knowledge regarding the origin and evolution of chromatin. This workshop led to prospective views on the development of a new field of research that we term 'EvoChromo'. In this short Spotlight article, we define the breadth and expected impact of this new area of scientific inquiry on our understanding of both chromatin and evolution.
Affiliation(s)
- Ines A Drinnenberg
  - Institut Curie, Paris Sciences et Lettres Research University, Centre National de la Recherche Scientifique UMR 3664, Paris 75005, France
- Frédéric Berger
  - Gregor Mendel Institute, Austrian Academy of Sciences, Vienna BioCenter, Dr. Bohr-Gasse 3, 1030 Vienna, Austria
- Simon J Elsässer
  - Science for Life Laboratory, Division of Translational Medicine and Chemical Biology, Department of Medical Biochemistry and Biophysics, Karolinska Institutet, Stockholm 17177, Sweden
- Peter R Andersen
  - Institute of Molecular Biotechnology of the Austrian Academy of Sciences (IMBA), Vienna BioCenter (VBC), Dr. Bohrgasse 3, 1030 Vienna, Austria
- Juan Ausió
  - Department of Biochemistry and Microbiology, University of Victoria, Victoria, BC, V8W 3P6, Canada
- Wendy A Bickmore
  - Medical Research Council Human Genetics Unit, Institute of Genetics and Molecular Medicine, University of Edinburgh, Edinburgh EH4 2XU, UK
- Douglas H Erwin
  - Department of Paleobiology, MRC-121, National Museum of Natural History, Washington, DC 20013-7012, USA
- James M Gahan
  - Sars Centre for Marine Molecular Biology, University of Bergen, Thormøhlensgt. 55, 5008 Bergen, Norway
- Brandon S Gaut
  - Department of Ecology and Evolutionary Biology, University of California, Irvine, Irvine, CA 92697, USA
- Zachary H Harvey
  - Department of Chemical and Systems Biology, Stanford University School of Medicine, Stanford, CA 94305, USA
- Steven Henikoff
  - Division of Basic Sciences and Howard Hughes Medical Institute, Fred Hutchinson Cancer Research Center, Seattle, WA 98109, USA
- Joyce Y Kao
  - Center for Genomics and Systems Biology, New York University, 12 Waverly Place, New York, NY 10003, USA
  - Institute of Molecular Systems Biology, Swiss Federal Institute of Technology (ETH) Zürich, Otto-Stern-Weg 3, 8093 Zürich, Switzerland
- Siavash K Kurdistani
  - Department of Biological Chemistry, David Geffen School of Medicine, University of California Los Angeles, Los Angeles, CA 90095, USA
- Bernardo Lemos
  - Program in Molecular and Integrative Physiological Sciences, Department of Environmental Health, Harvard T. H. Chan School of Public Health, Boston, MA 02115, USA
- Mia T Levine
  - Department of Biology, Epigenetics Institute, University of Pennsylvania, Philadelphia, PA 19104, USA
- Karolin Luger
  - Howard Hughes Medical Institute and Department of Biochemistry, CU Boulder, Boulder, CO 80303, USA
- Harmit S Malik
  - Division of Basic Sciences and Howard Hughes Medical Institute, Fred Hutchinson Cancer Research Center, Seattle, WA 98109, USA
- José M Martín-Durán
  - Queen Mary University of London, School of Biological and Chemical Sciences, Mile End Road, London E1 4NS, UK
- Catherine L Peichel
  - Institute of Ecology and Evolution, University of Bern, 3012 Bern, Switzerland
- Marilyn B Renfree
  - School of BioSciences, The University of Melbourne, Melbourne, 3010 VIC, Australia
- Kinga Rutowicz
  - Institute of Plant and Microbial Biology, Zurich-Basel Plant Science Center, University of Zurich, 8092 Zürich, Switzerland
- Peter Sarkies
  - MRC London Institute of Medical Sciences and Institute of Clinical Sciences, Imperial College London, Du Cane Road, London W12 0NN, UK
- Robert J Schmitz
  - Department of Genetics, University of Georgia, Athens, GA 30602, USA
- Ulrich Technau
  - Department for Molecular Evolution and Development, Centre of Organismal Systems Biology, University of Vienna, Vienna A-1090, Austria
- Joseph W Thornton
  - Department of Human Genetics, and Department of Ecology and Evolution, University of Chicago, Chicago, IL 60637 USA
- Tobias Warnecke
  - MRC London Institute of Medical Sciences and Institute of Clinical Sciences, Imperial College London, Du Cane Road, London W12 0NN, UK
- Kenneth H Wolfe
  - Conway Institute and School of Medicine, University College Dublin, Dublin 4, Ireland

12. Somatic and Germline Mutation Periodicity Follow the Orientation of the DNA Minor Groove around Nucleosomes. Cell 2018; 175:1074-1087.e18. [PMID: 30388444 DOI: 10.1016/j.cell.2018.10.004] [Citation(s) in RCA: 73] [Impact Index Per Article: 12.2] [Received: 06/12/2018] [Revised: 08/27/2018] [Accepted: 10/01/2018] [Indexed: 12/11/2022]
Abstract
Mutation rates along the genome are highly variable and influenced by several chromatin features. Here, we addressed how nucleosomes, the most pervasive chromatin structure in eukaryotes, affect the generation of mutations. We discovered that within nucleosomes, the somatic mutation rate across several tumor cohorts exhibits a strong 10 base pair (bp) periodicity. This periodic pattern tracks the alternation of the DNA minor groove facing toward and away from the histones. The strength and phase of the mutation rate periodicity are determined by the mutational processes active in tumors. We uncovered similar periodic patterns in the genetic variation among human and Arabidopsis populations, also detectable in their divergence from close species, indicating that the same principles underlie germline and somatic mutation rates. We propose that differential DNA damage and repair processes dependent on the minor groove orientation in nucleosome-bound DNA contribute to the 10-bp periodicity in AT/CG content in eukaryotic genomes.

13.
Abstract
Nucleosomal modifications have been implicated in fundamental epigenetic regulation, whereas the roles of nucleosome binding in shaping changes through evolution remain to be addressed. Here we performed a comparative study to clarify the roles of nucleosome occupancy in exon origination. By profiling a high-resolution, cross-species mononucleosome landscape for mammalian tissues, we found nucleosome occupancy profiles are conserved across tissues and species. Further, through a phylogenetic approach, we found that the feature of differential nucleosome occupancy appears prior to the origination of new exons and, presumably, facilitates the origin of new exons by increasing the splice strength of the ancestral nonexonic regions through driving a local difference in GC content, which suggests the function of nucleosome binding in exonization.

Nucleosomal modifications have been implicated in fundamental epigenetic regulation, but the roles of nucleosome occupancy in shaping changes through evolution remain to be addressed. Here we present high-resolution nucleosome occupancy profiles for multiple tissues derived from human, macaque, tree shrew, mouse, and pig. Genome-wide comparison reveals conserved nucleosome occupancy profiles across both different species and tissue types. Notably, we found significantly higher levels of nucleosome occupancy in exons than in introns, a pattern correlated with the different exon–intron GC content. We then determined whether this biased occupancy may play roles in the origination of new exons through evolution, rather than being a downstream effect of exonization, through a comparative approach to sequentially trace the order of the exonization and biased nucleosome binding. By identifying recently evolved exons in human but not in macaque using matched RNA sequencing, we found that higher exonic nucleosome occupancy also existed in macaque regions orthologous to these exons. Presumably, such biased nucleosome occupancy facilitates the origination of new exons by increasing the splice strength of the ancestral nonexonic regions through driving a local difference in GC content. These data thus support a model that sites bound by nucleosomes are more likely to evolve into exons, which we term the “nucleosome-first” model.
|
14
|
Abrahams L, Hurst LD. Adenine Enrichment at the Fourth CDS Residue in Bacterial Genes Is Consistent with Error Proofing for +1 Frameshifts. Mol Biol Evol 2018; 34:3064-3080. [PMID: 28961919 PMCID: PMC5850271 DOI: 10.1093/molbev/msx223] [Citation(s) in RCA: 6] [Impact Index Per Article: 0.9] [Indexed: 12/21/2022] Open
Abstract
Beyond selection for optimal protein functioning, coding sequences (CDSs) are under selection at the RNA and DNA levels. Here, we identify a possible signature of “dual-coding,” namely extensive adenine (A) enrichment at bacterial CDS fourth sites. In 99.07% of studied bacterial genomes, fourth site A use is greater than expected given genomic A-starting codon use. Arguing for nucleotide level selection, A-starting serine and arginine second codons are heavily utilized when compared with their non-A starting synonyms. Several models have the ability to explain some of this trend. In part, A-enrichment likely reduces 5′ mRNA stability, promoting translation initiation. However T/U, which may also reduce stability, is avoided. Further, +1 frameshifts on the initiating ATG encode a stop codon (TGA) provided A is the fourth residue, acting either as a frameshift “catch and destroy” or a frameshift stop and adjust mechanism and hence implicated in translation initiation. Consistent with both, genomes lacking TGA stop codons exhibit weaker fourth site A-enrichment. Sequences lacking a Shine–Dalgarno sequence and those without upstream leader genes, that may be more error prone during initiation, have greater utilization of A, again suggesting a role in initiation. The frameshift correction model is consistent with the notion that many genomic features are error-mitigation factors and provides the first evidence for site-specific out of frame stop codon selection. We conjecture that the NTG universal start codon may have evolved as a consequence of TGA being a stop codon and the ability of NTGA to rapidly terminate or adjust a ribosome.
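The frameshift argument in this abstract can be checked with a few lines of code: after a +1 slip on the initiating ATG, the ribosome reads positions 2 to 4, which spell TGA only when the fourth residue is A. The sketch below is a minimal illustration under that reading. The helper names and the example sequences are hypothetical, and the stop codon set assumes the standard genetic code.

```python
# Stop codons of the standard genetic code (an assumption; some genomes reassign TGA).
STOP_CODONS = {"TAA", "TAG", "TGA"}

def plus_one_codon(cds: str) -> str:
    """First codon read after a +1 slip on the start codon (positions 2-4 of the CDS)."""
    return cds[1:4].upper()

def plus_one_start_is_stop(cds: str) -> bool:
    """True if a +1 frameshift on the start codon immediately meets a stop codon."""
    return plus_one_codon(cds) in STOP_CODONS

# Try each possible fourth residue after an ATG start (toy sequences).
for fourth in "ACGT":
    cds = "ATG" + fourth + "CTAAA"
    print(fourth, plus_one_codon(cds), plus_one_start_is_stop(cds))
```

Only the A case yields an immediate out-of-frame stop, which is the property the abstract links to fourth-site A enrichment.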
Affiliation(s)
- Liam Abrahams
- Department of Biology and Biochemistry, The Milner Centre for Evolution, University of Bath, Bath, United Kingdom
| | - Laurence D Hurst
- Department of Biology and Biochemistry, The Milner Centre for Evolution, University of Bath, Bath, United Kingdom
| |
|
15
|
Brunet FG, Audit B, Drillon G, Argoul F, Volff JN, Arneodo A. Evidence for DNA Sequence Encoding of an Accessible Nucleosomal Array across Vertebrates. Biophys J 2018; 114:2308-2316. [PMID: 29580552 PMCID: PMC6028776 DOI: 10.1016/j.bpj.2018.02.025] [Citation(s) in RCA: 4] [Impact Index Per Article: 0.6] [Received: 12/18/2017] [Revised: 02/07/2018] [Accepted: 02/20/2018] [Indexed: 12/15/2022] Open
Abstract
Nucleosome-depleted regions around which nucleosomes order following the "statistical" positioning scenario were recently shown to be encoded in the DNA sequence in human. This intrinsic nucleosomal ordering strongly correlates with oscillations in the local GC content as well as with the interspecies and intraspecies mutation profiles, revealing the existence of both positive and negative selection. In this letter, we show that these predicted nucleosome inhibitory energy barriers (NIEBs) with compacted neighboring nucleosomes are indeed ubiquitous to all vertebrates tested. These 1 kb-sized chromatin patterns are widely distributed along vertebrate chromosomes, overall covering more than a third of the genome. We have previously observed in human deviations from neutral evolution at these genome-wide distributed regions, which we interpreted as a possible indication of the selection of an open, accessible, and dynamic nucleosomal array to constitutively facilitate the epigenetic regulation of nuclear functions in a cell-type-specific manner. As a first, very appealing observation supporting this hypothesis, we report evidence of a strong association between NIEB borders and the poly(A) tails of Alu sequences in human. These results suggest that NIEBs provide adequate chromatin patterns favorable to the integration of Alu retrotransposons and, more generally to various transposable elements in the genomes of primates and other vertebrates.
Affiliation(s)
- Frédéric G Brunet
- Institut de Génomique Fonctionnelle de Lyon, Univ Lyon, CNRS UMR 5242, Ecole Normale Supérieure de Lyon, Univ Claude Bernard Lyon 1, Lyon, France
| | - Benjamin Audit
- Univ Lyon, ENS de Lyon, Univ Claude Bernard Lyon 1, CNRS Laboratoire de Physique, Lyon, France
| | - Guénola Drillon
- Univ Lyon, ENS de Lyon, Univ Claude Bernard Lyon 1, CNRS Laboratoire de Physique, Lyon, France
| | - Françoise Argoul
- Univ Lyon, ENS de Lyon, Univ Claude Bernard Lyon 1, CNRS Laboratoire de Physique, Lyon, France; LOMA, Université de Bordeaux, CNRS UMR 5798, Talence, France
| | - Jean-Nicolas Volff
- Institut de Génomique Fonctionnelle de Lyon, Univ Lyon, CNRS UMR 5242, Ecole Normale Supérieure de Lyon, Univ Claude Bernard Lyon 1, Lyon, France
| | - Alain Arneodo
- Univ Lyon, ENS de Lyon, Univ Claude Bernard Lyon 1, CNRS Laboratoire de Physique, Lyon, France; LOMA, Université de Bordeaux, CNRS UMR 5798, Talence, France.
| |
|
16
|
García A, González S, Antequera F. Nucleosomal organization and DNA base composition patterns. Nucleus 2017. [PMID: 28635365 PMCID: PMC5703254 DOI: 10.1080/19491034.2017.1337611] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 11/23/2022] Open
Abstract
Nucleosomes are the basic units of chromatin. They compact the genome inside the nucleus and regulate the access of proteins to DNA. In the yeast genome, most nucleosomes occupy well-defined positions, which are maintained under many different physiological situations and genetic backgrounds. Although several short sequence elements have been described that favor or reduce the affinity between histones and DNA, the extent to which the DNA sequence affects nucleosome positioning in the genomic context remains unclear. Recent analyses indicate that the base composition pattern of mononucleosomal DNA differs among species, and that the same sequence elements have a different impact on nucleosome positioning in different genomes despite the high level of phylogenetic conservation of histones. These studies have also shown that the DNA sequence contributes to nucleosome positioning to the point that it is possible to design synthetic DNA molecules capable of generating regular and species-specific nucleosomal patterns in vivo.
Affiliation(s)
- Alicia García
- Instituto de Biología Funcional y Genómica, Consejo Superior de Investigaciones Científicas (CSIC)/Universidad de Salamanca, Salamanca, Spain
| | - Sara González
- Instituto de Biología Funcional y Genómica, Consejo Superior de Investigaciones Científicas (CSIC)/Universidad de Salamanca, Salamanca, Spain
| | - Francisco Antequera
- Instituto de Biología Funcional y Genómica, Consejo Superior de Investigaciones Científicas (CSIC)/Universidad de Salamanca, Salamanca, Spain
| |
|
17
|
Lim ET, Uddin M, De Rubeis S, Chan Y, Kamumbu AS, Zhang X, D'Gama A, Kim SN, Hill RS, Goldberg AP, Poultney C, Minshew NJ, Kushima I, Aleksic B, Ozaki N, Parellada M, Arango C, Penzol MJ, Carracedo A, Kolevzon A, Hultman CM, Weiss LA, Fromer M, Chiocchetti AG, Freitag CM, Church GM, Scherer SW, Buxbaum JD, Walsh CA. Rates, distribution and implications of postzygotic mosaic mutations in autism spectrum disorder. Nat Neurosci 2017; 20:1217-1224. [PMID: 28714951 PMCID: PMC5672813 DOI: 10.1038/nn.4598] [Citation(s) in RCA: 185] [Impact Index Per Article: 23.1] [Received: 11/10/2016] [Accepted: 05/20/2017] [Indexed: 12/17/2022]
Abstract
We systematically analyzed postzygotic mutations (PZMs) in whole-exome sequences from the largest collection of trios (5,947) with autism spectrum disorder (ASD) available, including 282 unpublished trios, and performed resequencing using multiple independent technologies. We identified 7.5% of de novo mutations as PZMs, 83.3% of which were not described in previous studies. Damaging, nonsynonymous PZMs within critical exons of prenatally expressed genes were more common in ASD probands than controls (P < 1 × 10⁻⁶), and genes carrying these PZMs were enriched for expression in the amygdala (P = 5.4 × 10⁻³). Two genes (KLF16 and MSANTD2) were significantly enriched for PZMs genome-wide, and other PZMs involved genes (SCN2A, HNRNPU and SMARCA4) whose mutation is known to cause ASD or other neurodevelopmental disorders. PZMs constitute a significant proportion of de novo mutations and contribute importantly to ASD risk.
Affiliation(s)
- Elaine T. Lim
- Division of Genetics and Genomics, Manton Center for Orphan Disease Research and Howard Hughes Medical Institute, Boston Children's Hospital, Boston, MA 02115, USA
- Departments of Genetics, Pediatrics and Neurology, Harvard Medical School, Boston, MA 02115, USA
- Program in Medical and Population Genetics, Broad Institute of MIT and Harvard, Cambridge, MA 02138, USA
- Wyss Institute for Biologically Inspired Engineering, Harvard University, Boston, MA 02115, USA
| | - Mohammed Uddin
- Mohammed Bin Rashid University of Medicine and Health Sciences, College of Medicine, Dubai, UAE
| | - Silvia De Rubeis
- Seaver Autism Center for Research and Treatment, Icahn School of Medicine at Mount Sinai, New York, New York 10029, USA
- Department of Psychiatry, Icahn School of Medicine at Mount Sinai, New York, New York 10029, USA
| | - Yingleong Chan
- Departments of Genetics, Pediatrics and Neurology, Harvard Medical School, Boston, MA 02115, USA
- Program in Medical and Population Genetics, Broad Institute of MIT and Harvard, Cambridge, MA 02138, USA
- Wyss Institute for Biologically Inspired Engineering, Harvard University, Boston, MA 02115, USA
| | - Anne S. Kamumbu
- Division of Genetics and Genomics, Manton Center for Orphan Disease Research and Howard Hughes Medical Institute, Boston Children's Hospital, Boston, MA 02115, USA
- Departments of Genetics, Pediatrics and Neurology, Harvard Medical School, Boston, MA 02115, USA
- Program in Medical and Population Genetics, Broad Institute of MIT and Harvard, Cambridge, MA 02138, USA
| | - Xiaochang Zhang
- Division of Genetics and Genomics, Manton Center for Orphan Disease Research and Howard Hughes Medical Institute, Boston Children's Hospital, Boston, MA 02115, USA
- Departments of Genetics, Pediatrics and Neurology, Harvard Medical School, Boston, MA 02115, USA
- Program in Medical and Population Genetics, Broad Institute of MIT and Harvard, Cambridge, MA 02138, USA
| | - Alissa D'Gama
- Division of Genetics and Genomics, Manton Center for Orphan Disease Research and Howard Hughes Medical Institute, Boston Children's Hospital, Boston, MA 02115, USA
- Departments of Genetics, Pediatrics and Neurology, Harvard Medical School, Boston, MA 02115, USA
- Program in Medical and Population Genetics, Broad Institute of MIT and Harvard, Cambridge, MA 02138, USA
| | - Sonia N. Kim
- Division of Genetics and Genomics, Manton Center for Orphan Disease Research and Howard Hughes Medical Institute, Boston Children's Hospital, Boston, MA 02115, USA
- Departments of Genetics, Pediatrics and Neurology, Harvard Medical School, Boston, MA 02115, USA
- Program in Medical and Population Genetics, Broad Institute of MIT and Harvard, Cambridge, MA 02138, USA
| | - Robert Sean Hill
- Division of Genetics and Genomics, Manton Center for Orphan Disease Research and Howard Hughes Medical Institute, Boston Children's Hospital, Boston, MA 02115, USA
- Departments of Genetics, Pediatrics and Neurology, Harvard Medical School, Boston, MA 02115, USA
- Program in Medical and Population Genetics, Broad Institute of MIT and Harvard, Cambridge, MA 02138, USA
| | - Arthur P. Goldberg
- Seaver Autism Center for Research and Treatment, Icahn School of Medicine at Mount Sinai, New York, New York 10029, USA
- Department of Psychiatry, Icahn School of Medicine at Mount Sinai, New York, New York 10029, USA
| | - Christopher Poultney
- Seaver Autism Center for Research and Treatment, Icahn School of Medicine at Mount Sinai, New York, New York 10029, USA
- Department of Psychiatry, Icahn School of Medicine at Mount Sinai, New York, New York 10029, USA
| | - Nancy J. Minshew
- Department of Psychiatry, Center For Excellence in Autism Research, University of Pittsburgh, Pittsburgh, PA, USA
| | - Itaru Kushima
- Department of Psychiatry, Nagoya University Graduate School of Medicine, Nagoya 466-8550, Japan
| | - Branko Aleksic
- Department of Psychiatry, Nagoya University Graduate School of Medicine, Nagoya 466-8550, Japan
| | - Norio Ozaki
- Department of Psychiatry, Nagoya University Graduate School of Medicine, Nagoya 466-8550, Japan
| | - Mara Parellada
- Child and Adolescent Psychiatry Department, Hospital General Universitario Gregorio Marañón, School of Medicine, Universidad Complutense, IiSGM, CIBERSAM, Madrid 28007, Spain
| | - Celso Arango
- Child and Adolescent Psychiatry Department, Hospital General Universitario Gregorio Marañón, School of Medicine, Universidad Complutense, IiSGM, CIBERSAM, Madrid 28007, Spain
| | - Maria J. Penzol
- Child and Adolescent Psychiatry Department, Hospital General Universitario Gregorio Marañón, IiSGM, CIBERSAM, Madrid 28007, Spain
| | - Angel Carracedo
- Grupo de Medicina Xenomica, Universidade de Santiago de Compostela, Centro Nacional de Genotipado-Plataforma de Recursos Biomoleculares y Bioinformaticos-Instituto de Salud Carlos III (CeGen-PRB2-ISCIII), Santiago de Compostela 15782, Spain
- Grupo de Medicina Xenomica, CIBERER, Fundacion Publica Galega de Medicina Xenomica-SERGAS, Santiago de Compostela 15782, Spain
- Center of Excellence in Genomic Medicine Research, King Abdulaziz University, Jeddah, Saudi Arabia
| | - Alexander Kolevzon
- Seaver Autism Center for Research and Treatment, Icahn School of Medicine at Mount Sinai, New York, NY 10029, USA
- Department of Psychiatry, Icahn School of Medicine at Mount Sinai, New York, NY 10029, USA
- Friedman Brain Institute, Icahn School of Medicine at Mount Sinai, New York, NY 10029, USA
- The Mindich Child Health and Development Institute, Icahn School of Medicine at Mount Sinai, New York, NY 10029, USA
- Department of Pediatrics, Icahn School of Medicine at Mount Sinai, New York, NY 10029, USA
| | - Christina M. Hultman
- Department of Medical Epidemiology and Biostatistics, Karolinska Institutet, Stockholm, Sweden
| | - Lauren A. Weiss
- Department of Psychiatry and Institute for Human Genetics, University of California, San Francisco, San Francisco, CA 94143, USA
| | - Menachem Fromer
- Seaver Autism Center for Research and Treatment, Icahn School of Medicine at Mount Sinai, New York, New York 10029, USA
- Department of Psychiatry, Icahn School of Medicine at Mount Sinai, New York, New York 10029, USA
- Department of Genetics and Genomic Sciences, Icahn School of Medicine at Mount Sinai, New York, New York 10029, USA
| | - Andreas G. Chiocchetti
- Department of Child and Adolescent Psychiatry, Psychosomatics and Psychotherapy, Autism Research and Intervention Center of Excellence, University Hospital Frankfurt, Goethe University, Frankfurt am Main 60528, Germany
| | - Christine M. Freitag
- Department of Child and Adolescent Psychiatry, Psychosomatics and Psychotherapy, Autism Research and Intervention Center of Excellence, University Hospital Frankfurt, Goethe University, Frankfurt am Main 60528, Germany
| | - George M. Church
- Departments of Genetics, Pediatrics and Neurology, Harvard Medical School, Boston, MA 02115, USA
- Program in Medical and Population Genetics, Broad Institute of MIT and Harvard, Cambridge, MA 02138, USA
| | - Stephen W. Scherer
- The Centre for Applied Genomics, The Hospital for Sick Children, Toronto, Ontario, Canada
- Program in Genetics and Genome Biology (GGB), The Hospital for Sick Children, Toronto, Ontario, Canada
- Department of Molecular Genetics, University of Toronto, Toronto, Ontario, Canada
- McLaughlin Centre, University of Toronto, Toronto, Ontario, Canada
| | - Joseph D. Buxbaum
- Seaver Autism Center for Research and Treatment, Icahn School of Medicine at Mount Sinai, New York, New York 10029, USA
- Department of Psychiatry, Icahn School of Medicine at Mount Sinai, New York, New York 10029, USA
- Friedman Brain Institute, Icahn School of Medicine at Mount Sinai, New York, New York, USA
- The Mindich Child Health and Development Institute, Icahn School of Medicine at Mount Sinai, New York, New York, USA
| | - Christopher A. Walsh
- Division of Genetics and Genomics, Manton Center for Orphan Disease Research and Howard Hughes Medical Institute, Boston Children's Hospital, Boston, MA 02115, USA
- Departments of Genetics, Pediatrics and Neurology, Harvard Medical School, Boston, MA 02115, USA
- Program in Medical and Population Genetics, Broad Institute of MIT and Harvard, Cambridge, MA 02138, USA
| |
|
18
|
Xiong J, Gao S, Dui W, Yang W, Chen X, Taverna SD, Pearlman RE, Ashlock W, Miao W, Liu Y. Dissecting relative contributions of cis- and trans-determinants to nucleosome distribution by comparing Tetrahymena macronuclear and micronuclear chromatin. Nucleic Acids Res 2016; 44:10091-10105. [PMID: 27488188 PMCID: PMC5137421 DOI: 10.1093/nar/gkw684] [Citation(s) in RCA: 14] [Impact Index Per Article: 1.6] [Received: 05/24/2016] [Revised: 07/21/2016] [Accepted: 07/24/2016] [Indexed: 02/06/2023] Open
Abstract
The ciliate protozoan Tetrahymena thermophila contains two types of structurally and functionally differentiated nuclei: the transcriptionally active somatic macronucleus (MAC) and the transcriptionally silent germ-line micronucleus (MIC). Here, we demonstrate that MAC features well-positioned nucleosomes downstream of transcription start sites and flanking splice sites. Transcription-associated trans-determinants promote nucleosome positioning in MAC. By contrast, nucleosomes in MIC are dramatically delocalized. Nucleosome occupancy in MAC and MIC are nonetheless highly correlated with each other, as well as with in vitro reconstitution and predictions based upon DNA sequence features, revealing unexpectedly strong contributions from cis-determinants. In particular, well-positioned nucleosomes are often matched with GC content oscillations. As many nucleosomes are coordinately accommodated by both cis- and trans-determinants, we propose that their distribution is shaped by the impact of these nucleosomes on the mutational and transcriptional landscape, and driven by evolutionary selection.
Affiliation(s)
- Jie Xiong
- Department of Pathology, University of Michigan, Ann Arbor, MI 48109, USA; Key Laboratory of Aquatic Biodiversity and Conservation, Institute of Hydrobiology, Chinese Academy of Sciences, Wuhan 430072, China; These authors contributed equally to this work as first authors
| | - Shan Gao
- Institute of Evolution & Marine Biodiversity, Ocean University of China, Qingdao 266003, China; Laboratory for Marine Biology and Biotechnology, Qingdao National Laboratory for Marine Science and Technology, Qingdao 266003, China; These authors contributed equally to this work as first authors
| | - Wen Dui
- Department of Pathology, University of Michigan, Ann Arbor, MI 48109, USA
| | - Wentao Yang
- Key Laboratory of Aquatic Biodiversity and Conservation, Institute of Hydrobiology, Chinese Academy of Sciences, Wuhan 430072, China
| | - Xiao Chen
- Institute of Evolution & Marine Biodiversity, Ocean University of China, Qingdao 266003, China
| | - Sean D. Taverna
- Department of Pharmacology and Molecular Sciences and The Center for Epigenetics, Johns Hopkins University School of Medicine, Baltimore, MD 21205, USA
| | - Ronald E. Pearlman
- Department of Biology, York University, Toronto, Ontario M3J 1P3, Canada
| | - Wendy Ashlock
- Department of Biology, York University, Toronto, Ontario M3J 1P3, Canada
| | - Wei Miao
- Key Laboratory of Aquatic Biodiversity and Conservation, Institute of Hydrobiology, Chinese Academy of Sciences, Wuhan 430072, China; Correspondence may also be addressed to Wei Miao.
| | - Yifan Liu
- Department of Pathology, University of Michigan, Ann Arbor, MI 48109, USA; To whom correspondence should be addressed. Tel: +1 734 6154239;
| |
|
19
|
González S, García A, Vázquez E, Serrano R, Sánchez M, Quintales L, Antequera F. Nucleosomal signatures impose nucleosome positioning in coding and noncoding sequences in the genome. Genome Res 2016; 26:1532-1543. [PMID: 27662899 PMCID: PMC5088595 DOI: 10.1101/gr.207241.116] [Citation(s) in RCA: 16] [Impact Index Per Article: 1.8] [Received: 03/18/2016] [Accepted: 09/19/2016] [Indexed: 12/18/2022]
Abstract
In the yeast genome, a large proportion of nucleosomes occupy well-defined and stable positions. While the contribution of chromatin remodelers and DNA binding proteins to maintain this organization is well established, the relevance of the DNA sequence to nucleosome positioning in the genome remains controversial. Through quantitative analysis of nucleosome positioning, we show that sequence changes distort the nucleosomal pattern at the level of individual nucleosomes in three species of Schizosaccharomyces and in Saccharomyces cerevisiae. This effect is equally detected in transcribed and nontranscribed regions, suggesting the existence of sequence elements that contribute to positioning. To identify such elements, we incorporated information from nucleosomal signatures into artificial synthetic DNA molecules and found that they generated regular nucleosomal arrays indistinguishable from those of endogenous sequences. Strikingly, this information is species-specific and can be combined with coding information through the use of synonymous codons such that genes from one species can be engineered to adopt the nucleosomal organization of another. These findings open the possibility of designing coding and noncoding DNA molecules capable of directing their own nucleosomal organization.
Affiliation(s)
- Sara González
- Instituto de Biología Funcional y Genómica, Consejo Superior de Investigaciones Científicas (CSIC)/Universidad de Salamanca, 37007 Salamanca, Spain
| | - Alicia García
- Instituto de Biología Funcional y Genómica, Consejo Superior de Investigaciones Científicas (CSIC)/Universidad de Salamanca, 37007 Salamanca, Spain
| | - Enrique Vázquez
- Instituto de Biología Funcional y Genómica, Consejo Superior de Investigaciones Científicas (CSIC)/Universidad de Salamanca, 37007 Salamanca, Spain
| | - Rebeca Serrano
- Instituto de Biología Funcional y Genómica, Consejo Superior de Investigaciones Científicas (CSIC)/Universidad de Salamanca, 37007 Salamanca, Spain
| | - Mar Sánchez
- Instituto de Biología Funcional y Genómica, Consejo Superior de Investigaciones Científicas (CSIC)/Universidad de Salamanca, 37007 Salamanca, Spain
| | - Luis Quintales
- Instituto de Biología Funcional y Genómica, Consejo Superior de Investigaciones Científicas (CSIC)/Universidad de Salamanca, 37007 Salamanca, Spain; Departamento de Informática y Automática, Universidad de Salamanca/Facultad de Ciencias, 37007 Salamanca, Spain
| | - Francisco Antequera
- Instituto de Biología Funcional y Genómica, Consejo Superior de Investigaciones Científicas (CSIC)/Universidad de Salamanca, 37007 Salamanca, Spain
| |
|
20
|
Drillon G, Audit B, Argoul F, Arneodo A. Evidence of selection for an accessible nucleosomal array in human. BMC Genomics 2016; 17:526. [PMID: 27472913 PMCID: PMC4966569 DOI: 10.1186/s12864-016-2880-2] [Citation(s) in RCA: 21] [Impact Index Per Article: 2.3] [Received: 12/14/2015] [Accepted: 07/04/2016] [Indexed: 11/13/2022] Open
Abstract
BACKGROUND: Recently, a physical model of nucleosome formation based on sequence-dependent bending properties of the DNA double-helix has been used to reveal some enrichment of nucleosome-inhibiting energy barriers (NIEBs) nearby ubiquitous human "master" replication origins. Here we use this model to predict the existence of about 1.6 million NIEBs over the 22 human autosomes. RESULTS: We show that these high energy barriers of mean size 153 bp correspond to nucleosome-depleted regions (NDRs) in vitro, as expected, but also in vivo. On either side of these NIEBs, we observe, in vivo and in vitro, a similar compacted nucleosome ordering, suggesting an absence of chromatin remodeling. This nucleosomal ordering strongly correlates with oscillations of the GC content as well as with the interspecies and intraspecies mutation profiles along these regions. Comparison of these divergence rates reveals the existence of both positive and negative selections linked to nucleosome positioning around these intrinsic NDRs. Overall, these NIEBs and neighboring nucleosomes cover 37.5% of the human genome where nucleosome occupancy is stably encoded in the DNA sequence. These 1 kb-sized regions of intrinsic nucleosome positioning are equally found in GC-rich and GC-poor isochores, in early and late replicating regions, in intergenic and genic regions but not at gene promoters. CONCLUSION: The source of selection pressure on the NIEBs has yet to be resolved in future work. One possible scenario is that these widely distributed chromatin patterns have been selected in human to impair the condensation of the nucleosomal array into the 30 nm chromatin fiber, so as to facilitate the epigenetic regulation of nuclear functions in a cell-type-specific manner.
Affiliation(s)
- Guénola Drillon
- Univ Lyon, Ens de Lyon, Univ Claude Bernard Lyon 1, CNRS, Laboratoire de Physique, Lyon, F-69342 France
| | - Benjamin Audit
- Univ Lyon, Ens de Lyon, Univ Claude Bernard Lyon 1, CNRS, Laboratoire de Physique, Lyon, F-69342 France
| | - Françoise Argoul
- Univ Lyon, Ens de Lyon, Univ Claude Bernard Lyon 1, CNRS, Laboratoire de Physique, Lyon, F-69342 France
- LOMA, Université de Bordeaux, CNRS, UMR 5798, 51 Cours de le Libération, Talence, F-33405 France
| | - Alain Arneodo
- Univ Lyon, Ens de Lyon, Univ Claude Bernard Lyon 1, CNRS, Laboratoire de Physique, Lyon, F-69342 France
- LOMA, Université de Bordeaux, CNRS, UMR 5798, 51 Cours de le Libération, Talence, F-33405 France
| |
|
21
|
Zhang W, Li Y, Kulik M, Tiedemann RL, Robertson KD, Dalton S, Zhao S. Nucleosome positioning changes during human embryonic stem cell differentiation. Epigenetics 2016; 11:426-37. [PMID: 27088311 PMCID: PMC4939925 DOI: 10.1080/15592294.2016.1176649] [Citation(s) in RCA: 6] [Impact Index Per Article: 0.7] [Received: 11/23/2015] [Revised: 02/10/2016] [Accepted: 03/26/2016] [Indexed: 10/21/2022] Open
Abstract
Nucleosomes are the basic unit of chromatin. Nucleosome positioning (NP) plays a key role in transcriptional regulation and other biological processes. To better understand NP we used MNase-seq to investigate changes that occur as human embryonic stem cells (hESCs) transition to nascent mesoderm and then to smooth muscle cells (SMCs). Compared to differentiated cell derivatives, nucleosome occupancy at promoters and other notable genic sites, such as exon/intron junctions and adjacent regions, in hESCs shows a stronger correlation with transcript abundance and is less influenced by sequence content. Upon hESC differentiation, genes being silenced, but not genes being activated, display a substantial change in nucleosome occupancy at their promoters. Genome-wide, we detected a shift of NP to regions of higher G+C content as hESCs differentiate to SMCs. Notably, genomic regions with higher nucleosome occupancy harbor twice as many G↔C changes but fewer than half A↔T changes, compared to regions with lower nucleosome occupancy. Finally, our analysis indicates that the hESC genome is not rearranged and has a sequence mutation rate resembling normal human genomes. Our study reveals another unique feature of hESC chromatin, and sheds light on the relationship between nucleosome occupancy and sequence G+C content.
Affiliation(s)
- Wenjuan Zhang
- Department of Biochemistry and Molecular Biology, Institute of Bioinformatics, University of Georgia, Athens, GA, USA
| | - Yaping Li
- Department of Biochemistry and Molecular Biology, Institute of Bioinformatics, University of Georgia, Athens, GA, USA
| | - Michael Kulik
- Department of Biochemistry and Molecular Biology, Institute of Bioinformatics, University of Georgia, Athens, GA, USA
| | - Rochelle L. Tiedemann
- Department of Molecular Pharmacology and Experimental Therapeutics, Mayo Clinic, Rochester, MN, USA
| | - Keith D. Robertson
- Department of Molecular Pharmacology and Experimental Therapeutics, Mayo Clinic, Rochester, MN, USA
| | - Stephen Dalton
- Department of Biochemistry and Molecular Biology, Institute of Bioinformatics, University of Georgia, Athens, GA, USA
| | - Shaying Zhao
- Department of Biochemistry and Molecular Biology, Institute of Bioinformatics, University of Georgia, Athens, GA, USA
| |
|
22
|
Gouda N, Shiwa Y, Akashi M, Yoshikawa H, Kasahara K, Furusawa M. Distribution of human single-nucleotide polymorphisms is approximated by the power law and represents a fractal structure. Genes Cells 2016; 21:396-407. [PMID: 27030000 DOI: 10.1111/gtc.12344] [Citation(s) in RCA: 5] [Impact Index Per Article: 0.6] [Received: 04/10/2015] [Accepted: 12/12/2015] [Indexed: 01/31/2023]
Abstract
Single-nucleotide polymorphisms (SNPs) are one of the main causes of evolution. The distribution of human SNPs, which were examined in detail genomewide, was analyzed. Three discrete databases of human SNPs were used for this analysis, and similar results were obtained from these databases. It was found that the distribution of the distance between SNPs was approximated by the power law, and the shape of the regions including SNPs had the so-called fractal structure. Although the reason why the distribution of SNPs obeys such a certain law of physics is unclear, a speculation was attempted in connection with the three-dimensional structure of human chromatin which has a fractal structure.
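One simple way to test whether a distance distribution is approximately power-law, as claimed above, is to fit a straight line to the counts on log-log axes: if count ≈ c · d^(-alpha), the slope of log(count) against log(d) is -alpha. The sketch below does this with ordinary least squares. It is an illustrative assumption, not the procedure used in the paper, and the function name and toy histogram are hypothetical.

```python
import math

def fit_power_law(distances, counts):
    """Least-squares fit of log(count) = log(c) - alpha * log(distance).

    Returns (alpha, c). Assumes all distances and counts are positive.
    """
    xs = [math.log(d) for d in distances]
    ys = [math.log(c) for c in counts]
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    return -slope, math.exp(intercept)

# Toy histogram that follows count = 1000 * d**-2 exactly (synthetic, not SNP data).
toy_distances = [1, 2, 4, 8, 16]
toy_counts = [1000 * d ** -2.0 for d in toy_distances]
alpha, scale = fit_power_law(toy_distances, toy_counts)
print(alpha, scale)
```

A straight-line fit on log-log axes is only a first check. For real inter-SNP distances, a maximum-likelihood estimate and a comparison with alternative heavy-tailed distributions would likely be more reliable.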
Affiliation(s)
- Norio Gouda
- Department of Systems Medicine, Sakaguchi Laboratory, Keio University School of Medicine, 35 Shinanomachi, Shinjuku, Tokyo, 160-8582, Japan
| | - Yuh Shiwa
- Genome Research Center, NODAI Research Institute, Tokyo University of Agriculture, 1-1-1 Sakuragaoka, Setagaya-ku, Tokyo, 156-8502, Japan
| | - Motohiro Akashi
- Department of Bioscience, Tokyo University of Agriculture, 1-1-1 Sakuragaoka, Setagaya-ku, Tokyo, 156-8502, Japan
| | - Hirofumi Yoshikawa
- Genome Research Center, NODAI Research Institute, Tokyo University of Agriculture, 1-1-1 Sakuragaoka, Setagaya-ku, Tokyo, 156-8502, Japan.,Department of Bioscience, Tokyo University of Agriculture, 1-1-1 Sakuragaoka, Setagaya-ku, Tokyo, 156-8502, Japan
| | - Ken Kasahara
- Chitose Laboratory Corp., Biotechnology Research Center, 907 Nogawa, Miyamae-ku, Kawasaki, 216-0001, Japan
| | - Mitsuru Furusawa
- Chitose Laboratory Corp., Biotechnology Research Center, 907 Nogawa, Miyamae-ku, Kawasaki, 216-0001, Japan
| |
Collapse
|
23
|
Quintales L, Soriano I, Vázquez E, Segurado M, Antequera F. A species-specific nucleosomal signature defines a periodic distribution of amino acids in proteins. Open Biol 2016; 5:140218. [PMID: 25854683 PMCID: PMC4422121 DOI: 10.1098/rsob.140218] [Citation(s) in RCA: 8] [Impact Index Per Article: 0.9] [Indexed: 12/20/2022]
Abstract
Nucleosomes are the basic structural units of chromatin. Most of the yeast genome is organized in a pattern of positioned nucleosomes that is stably maintained under a wide range of physiological conditions. In this work, we have searched for sequence determinants associated with positioned nucleosomes in four species of fission and budding yeasts. We show that mononucleosomal DNA follows a highly structured base composition pattern, which differs among species despite the high degree of histone conservation. These nucleosomal signatures are present in transcribed and non-transcribed regions across the genome. In the case of open reading frames, they correctly predict the relative distribution of codons on mononucleosomal DNA, and they also determine a periodicity in the average distribution of amino acids along the proteins. These results establish a direct and species-specific connection between the position of each codon around the histone octamer and protein composition.
Affiliation(s)
- Luis Quintales
- Instituto de Biología Funcional y Genómica, Consejo Superior de Investigaciones Científicas (CSIC)/Universidad de Salamanca, Campus Miguel de Unamuno, 37007 Salamanca, Spain
- Ignacio Soriano
- Instituto de Biología Funcional y Genómica, Consejo Superior de Investigaciones Científicas (CSIC)/Universidad de Salamanca, Campus Miguel de Unamuno, 37007 Salamanca, Spain
- Enrique Vázquez
- Instituto de Biología Funcional y Genómica, Consejo Superior de Investigaciones Científicas (CSIC)/Universidad de Salamanca, Campus Miguel de Unamuno, 37007 Salamanca, Spain
- Mónica Segurado
- Instituto de Biología Funcional y Genómica, Consejo Superior de Investigaciones Científicas (CSIC)/Universidad de Salamanca, Campus Miguel de Unamuno, 37007 Salamanca, Spain
- Francisco Antequera
- Instituto de Biología Funcional y Genómica, Consejo Superior de Investigaciones Científicas (CSIC)/Universidad de Salamanca, Campus Miguel de Unamuno, 37007 Salamanca, Spain
|
24
|
Glastad KM, Goodisman MAD, Yi SV, Hunt BG. Effects of DNA Methylation and Chromatin State on Rates of Molecular Evolution in Insects. G3 (Bethesda) 2015; 6:357-63. [PMID: 26637432 PMCID: PMC4751555 DOI: 10.1534/g3.115.023499] [Citation(s) in RCA: 11] [Impact Index Per Article: 1.1] [Received: 10/06/2015] [Accepted: 11/30/2015] [Indexed: 01/03/2023]
Abstract
Epigenetic information is widely appreciated for its role in gene regulation in eukaryotic organisms. However, epigenetic information can also influence genome evolution. Here, we investigate the effects of epigenetic information on gene sequence evolution in two disparate insects: the fly Drosophila melanogaster, which lacks substantial DNA methylation, and the ant Camponotus floridanus, which possesses a functional DNA methylation system. We found that DNA methylation was positively correlated with the synonymous substitution rate in C. floridanus, suggesting a key effect of DNA methylation on patterns of gene evolution. However, our data suggest the link between DNA methylation and elevated rates of synonymous substitution was explained, in large part, by the targeting of DNA methylation to genes with signatures of transcriptionally active chromatin, rather than the mutational effect of DNA methylation itself. This phenomenon may be explained by an elevated mutation rate for genes residing in transcriptionally active chromatin, or by increased structural constraints on genes in inactive chromatin. This result highlights the importance of chromatin structure as the primary epigenetic driver of genome evolution in insects. Overall, our study demonstrates how different epigenetic systems contribute to variation in the rates of coding sequence evolution.
Affiliation(s)
- Karl M Glastad
- School of Biology, Georgia Institute of Technology, Atlanta, Georgia 30332
- Soojin V Yi
- School of Biology, Georgia Institute of Technology, Atlanta, Georgia 30332
- Brendan G Hunt
- Department of Entomology, University of Georgia, Griffin, Georgia 30223
|
25
|
Nakatani Y, Mello CC, Hashimoto SI, Shimada A, Nakamura R, Tsukahara T, Qu W, Yoshimura J, Suzuki Y, Sugano S, Takeda H, Fire A, Morishita S. Associations between nucleosome phasing, sequence asymmetry, and tissue-specific expression in a set of inbred Medaka species. BMC Genomics 2015; 16:978. [PMID: 26584643 PMCID: PMC4653950 DOI: 10.1186/s12864-015-2198-5] [Citation(s) in RCA: 4] [Impact Index Per Article: 0.4] [Received: 09/17/2015] [Accepted: 11/07/2015] [Indexed: 11/10/2022]
Abstract
BACKGROUND: Transcription start sites (TSSs) with pronounced and phased nucleosome arrays downstream and nucleosome-depleted regions upstream of TSSs are observed in various species. RESULTS: We have characterized sequence variation and expression properties of this set of TSSs (which we call "Nucleocyclic TSSs") using germline and somatic cells of three medaka (Oryzias latipes) inbred isolates from different locations. We found nucleocyclic TSSs in medaka to be associated with higher gene expression and characterized by a clear boundary in sequence composition with potentially-nucleosome-destabilizing A/T-enrichment upstream (p < 10^-60) and nucleosome-accommodating C/G-enrichment downstream (p < 10^-40) that was highly conserved from an ancestor. A substantial genetic distance between the strains facilitated the in-depth analysis of patterns of fixed mutations, revealing a localization-specific equilibrium between the rates of distinct mutation categories that would serve to maintain the conserved sequence anisotropy around TSSs. Downstream of nucleocyclic TSSs, C to T, T to C, and other mutation rates on the sense strand increased around first nucleosome dyads and decreased around first linkers, which contrasted with genomewide mutational patterns around nucleosomes (p < 5 %). C to T rates are higher than G to A rates around nucleosome associated with germline nucleocyclic TSS sites (p < 5 %), potentially due to the asymmetric effect of transcription-coupled repair. CONCLUSIONS: Our results demonstrate an atypical evolutionary process surrounding nucleocyclic TSSs.
Affiliation(s)
- Yoichiro Nakatani
- Department of Computational Biology and Medical Sciences, Graduate School of Frontier Sciences, The University of Tokyo, Kashiwa, 277-0882, Japan.
- Cecilia C Mello
- Department of Pathology, School of Medicine, Stanford University, Stanford, CA, 94305-5324, USA.
- Shin-Ichi Hashimoto
- Graduate School of Medical Sciences, Kanazawa University, Kanazawa, 920-1192, Japan.
- Atsuko Shimada
- Department of Biological Sciences, Graduate School of Science, The University of Tokyo, Tokyo, 113-0033, Japan.
- Ryohei Nakamura
- Department of Biological Sciences, Graduate School of Science, The University of Tokyo, Tokyo, 113-0033, Japan.
- Tatsuya Tsukahara
- Department of Biological Sciences, Graduate School of Science, The University of Tokyo, Tokyo, 113-0033, Japan.
- Wei Qu
- Department of Computational Biology and Medical Sciences, Graduate School of Frontier Sciences, The University of Tokyo, Kashiwa, 277-0882, Japan.
- Jun Yoshimura
- Department of Computational Biology and Medical Sciences, Graduate School of Frontier Sciences, The University of Tokyo, Kashiwa, 277-0882, Japan.
- Yutaka Suzuki
- Department of Medical Genome Sciences, Graduate School of Frontier Sciences, The University of Tokyo, Tokyo, 108-8639, Japan.
- Sumio Sugano
- Department of Medical Genome Sciences, Graduate School of Frontier Sciences, The University of Tokyo, Tokyo, 108-8639, Japan.
- Hiroyuki Takeda
- Department of Biological Sciences, Graduate School of Science, The University of Tokyo, Tokyo, 113-0033, Japan.
- Andrew Fire
- Departments of Pathology and Genetics, School of Medicine, Stanford University, Stanford, CA, 94305-5324, USA.
- Shinichi Morishita
- Department of Computational Biology and Medical Sciences, Graduate School of Frontier Sciences, The University of Tokyo, Kashiwa, 277-0882, Japan.
|
26
|
Abstract
DNA damage is a constant threat to cells, causing cytotoxicity as well as inducing genetic alterations. The steady-state abundance of DNA lesions in a cell is minimized by a variety of DNA repair mechanisms, including DNA strand break repair, mismatch repair, nucleotide excision repair, base excision repair, and ribonucleotide excision repair. The efficiencies and mechanisms by which these pathways remove damage from chromosomes have been primarily characterized by investigating the processing of lesions at defined genomic loci, among bulk genomic DNA, on episomal DNA constructs, or using in vitro substrates. However, the structure of a chromosome is heterogeneous, consisting of heavily protein-bound heterochromatic regions, open regulatory regions, actively transcribed genes, and even areas of transient single stranded DNA. Consequently, DNA repair pathways function in a much more diverse set of chromosomal contexts than can be readily assessed using previous methods. Recent efforts to develop whole genome maps of DNA damage, repair processes, and even mutations promise to greatly expand our understanding of DNA repair and mutagenesis. Here we review the current efforts to utilize whole genome maps of DNA damage and mutation to understand how different chromosomal contexts affect DNA excision repair pathways.
Affiliation(s)
- John J Wyrick
- School of Molecular Biosciences, Washington State University, Pullman, WA 99164, USA; Center for Reproductive Biology, Washington State University, Pullman, WA 99164, USA.
- Steven A Roberts
- School of Molecular Biosciences, Washington State University, Pullman, WA 99164, USA.
|
27
|
Vázquez E, Antequera F. Replication dynamics in fission and budding yeasts through DNA polymerase tracking. Bioessays 2015; 37:1067-73. [PMID: 26293347 PMCID: PMC5054902 DOI: 10.1002/bies.201500072] [Citation(s) in RCA: 2] [Impact Index Per Article: 0.2] [Indexed: 01/01/2023]
Abstract
The dynamics of eukaryotic DNA polymerases has been difficult to establish because of the difficulty of tracking them along the chromosomes during DNA replication. Recent work has addressed this problem in the yeasts Schizosaccharomyces pombe and Saccharomyces cerevisiae through the engineering of replicative polymerases to render them prone to incorporating ribonucleotides at high rates. Their use as tracers of the passage of each polymerase has provided a picture of unprecedented resolution of the organization of replicons and replication origins in the two yeasts and has uncovered important differences between them. Additional studies have found an overlapping distribution of DNA polymorphisms and the junctions of Okazaki fragments along mononucleosomal DNA. This sequence instability is caused by the premature release of polymerase δ and the retention of non proof‐read DNA tracts replicated by polymerase α. The possible implementation of these new experimental approaches in multicellular organisms opens the door to the analysis of replication dynamics under a broad range of genetic backgrounds and physiological or pathological conditions.
Affiliation(s)
- Enrique Vázquez
- Instituto de Biología, Funcional y Genómica (IBFG), Consejo Superior de Investigaciones Científicas (CSIC), Universidad de Salamanca, Campus Miguel de Unamuno, Salamanca, Spain
- Francisco Antequera
- Instituto de Biología, Funcional y Genómica (IBFG), Consejo Superior de Investigaciones Científicas (CSIC), Universidad de Salamanca, Campus Miguel de Unamuno, Salamanca, Spain
|
28
|
Zhang T, Zhang W, Jiang J. Genome-Wide Nucleosome Occupancy and Positioning and Their Impact on Gene Expression and Evolution in Plants. Plant Physiol 2015; 168:1406-16. [PMID: 26143253 PMCID: PMC4528733 DOI: 10.1104/pp.15.00125] [Citation(s) in RCA: 76] [Impact Index Per Article: 7.6] [Received: 01/27/2015] [Accepted: 07/03/2015] [Indexed: 05/03/2023]
Abstract
The fundamental unit of chromatin is the nucleosome that consists of a protein octamer composed of the four core histones (Hs; H3, H4, H2A, and H2B) wrapped by 147 bp of DNA. Nucleosome occupancy and positioning have proven to be dynamic and have a critical impact on expression, regulation, and evolution of eukaryotic genes. We developed nucleosome occupancy and positioning data sets using leaf tissue of rice (Oryza sativa) and both leaf and flower tissues of Arabidopsis (Arabidopsis thaliana). We show that model plant and animal species share the fundamental characteristics associated with nucleosome dynamics. Only 12% and 16% of the Arabidopsis and rice genomes, respectively, were occupied by well-positioned nucleosomes. The cores of positioned nucleosomes were enriched with G/C dinucleotides and showed a lower C→T mutation rate than the linker sequences. We discovered that nucleosomes associated with heterochromatic regions were more spaced with longer linkers than those in euchromatic regions in both plant species. Surprisingly, different nucleosome densities were found to be associated with chromatin in leaf and flower tissues in Arabidopsis. We show that deep MNase-seq data sets can be used to map nucleosome occupancy of specific genomic loci and reveal gene expression patterns correlated with chromatin dynamics in plant genomes.
Affiliation(s)
- Tao Zhang
- Department of Horticulture, University of Wisconsin, Madison, Wisconsin 53706 (T.Z., W.Z., J.J.); and National Key Laboratory for Crop Genetics and Germplasm Enhancement, Nanjing Agricultural University, Nanjing 210095, China (W.Z.)
- Wenli Zhang
- Department of Horticulture, University of Wisconsin, Madison, Wisconsin 53706 (T.Z., W.Z., J.J.); and National Key Laboratory for Crop Genetics and Germplasm Enhancement, Nanjing Agricultural University, Nanjing 210095, China (W.Z.)
- Jiming Jiang
- Department of Horticulture, University of Wisconsin, Madison, Wisconsin 53706 (T.Z., W.Z., J.J.); and National Key Laboratory for Crop Genetics and Germplasm Enhancement, Nanjing Agricultural University, Nanjing 210095, China (W.Z.)
|
29
|
Abstract
Species survival depends on the faithful replication of genetic information, which is continually monitored and maintained by DNA repair pathways that correct replication errors and the thousands of lesions that arise daily from the inherent chemical lability of DNA and the effects of genotoxic agents. Nonetheless, neutrally evolving DNA (not under purifying selection) accumulates base substitutions with time (the neutral mutation rate). Thus, repair processes are not 100% efficient. The neutral mutation rate varies both between and within chromosomes. For example it is 10-50 fold higher at CpGs than at non-CpG positions. Interestingly, the neutral mutation rate at non-CpG sites is positively correlated with CpG content. Although the basis of this correlation was not immediately apparent, some bioinformatic results were consistent with the induction of non-CpG mutations by DNA repair at flanking CpG sites. Recent studies with a model system showed that in vivo repair of preformed lesions (mismatches, abasic sites, single stranded nicks) can in fact induce mutations in flanking DNA. Mismatch repair (MMR) is an essential component for repair-induced mutations, which can occur as distant as 5 kb from the introduced lesions. Most, but not all, mutations involved the C of TpCpN (G of NpGpA) which is the target sequence of the C-preferring single-stranded DNA specific APOBEC deaminases. APOBEC-mediated mutations are not limited to our model system: Recent studies by others showed that some tumors harbor mutations with the same signature, as can intermediates in RNA-guided endonuclease-mediated genome editing. APOBEC deaminases participate in normal physiological functions such as generating mutations that inactivate viruses or endogenous retrotransposons, or that enhance immunoglobulin diversity in B cells. The recruitment of normally physiological error-prone processes during DNA repair would have important implications for disease, aging and evolution. 
This perspective briefly reviews both the bioinformatic and biochemical literature relevant to repair-induced mutagenesis and discusses future directions required to understand the mechanistic basis of this process.
Affiliation(s)
- Jia Chen
- School of Life Science and Technology, ShanghaiTech University, Building 8, 319 Yueyang Road, Shanghai 200031, China
- Anthony V Furano
- Section on Genomic Structure and Function, Laboratory of Cell and Molecular Biology, National Institute of Diabetes and Digestive and Kidney Diseases, National Institutes of Health, Building 8, Room 203, 8 Center Drive, MSC 0830, Bethesda, MD 20892-0830, USA.
|
30
|
Makova KD, Hardison RC. The effects of chromatin organization on variation in mutation rates in the genome. Nat Rev Genet 2015; 16:213-23. [PMID: 25732611 PMCID: PMC4500049 DOI: 10.1038/nrg3890] [Citation(s) in RCA: 160] [Impact Index Per Article: 16.0] [Indexed: 12/12/2022]
Abstract
The variation in local rates of mutations can affect both the evolution of genes and their function in normal and cancer cells. Deciphering the molecular determinants of this variation will be aided by the elucidation of distinct types of mutations, as they differ in regional preferences and in associations with genomic features. Chromatin organization contributes to regional variation in mutation rates, but its contribution differs among mutation types. In both germline and somatic mutations, base substitutions are more abundant in regions of closed chromatin, perhaps reflecting error accumulation late in replication. By contrast, a distinctive mutational state with very high levels of insertions and deletions (indels) and substitutions is enriched in regions of open chromatin. These associations indicate an intricate interplay between the nucleotide sequence of DNA and its dynamic packaging into chromatin, and have important implications for current biomedical research. This Review focuses on recent studies showing associations between chromatin state and mutation rates, including pairwise and multivariate investigations of germline and somatic (particularly cancer) mutations.
Affiliation(s)
- Kateryna D Makova
- Department of Biology, Huck Institute for Genome Sciences, The Pennsylvania State University, University Park, State College, Pennsylvania 16802, USA
- Ross C Hardison
- Department of Biochemistry and Molecular Biology, Huck Institute for Genome Sciences, The Pennsylvania State University, University Park, State College, Pennsylvania 16802, USA
|
31
|
Drillon G, Audit B, Argoul F, Arneodo A. Ubiquitous human 'master' origins of replication are encoded in the DNA sequence via a local enrichment in nucleosome excluding energy barriers. J Phys Condens Matter 2015; 27:064102. [PMID: 25563930 DOI: 10.1088/0953-8984/27/6/064102] [Citation(s) in RCA: 7] [Impact Index Per Article: 0.7] [Indexed: 06/04/2023]
Abstract
As the elementary building block of eukaryotic chromatin, the nucleosome is at the heart of the compromise between the necessity of compacting DNA in the cell nucleus and the required accessibility to regulatory proteins. The recent availability of genome-wide experimental maps of nucleosome positions for many different organisms and cell types has provided an unprecedented opportunity to elucidate to what extent the DNA sequence conditions the primary structure of chromatin and in turn participates in the chromatin-mediated regulation of nuclear functions, such as gene expression and DNA replication. In this study, we use in vivo and in vitro genome-wide nucleosome occupancy data together with the set of nucleosome-free regions (NFRs) predicted by a physical model of nucleosome formation based on sequence-dependent bending properties of the DNA double-helix, to investigate the role of intrinsic nucleosome occupancy in the regulation of the replication spatio-temporal programme in human. We focus our analysis on the so-called replication U/N-domains that were shown to cover about half of the human genome in the germline (skew-N domains) as well as in embryonic stem cells, somatic and HeLa cells (mean replication timing U-domains). The 'master' origins of replication (MaOris) that border these megabase-sized U/N-domains were found to be specified by a few hundred kb wide regions that are hyper-sensitive to DNase I cleavage, hypomethylated, and enriched in epigenetic marks involved in transcription regulation, the hallmarks of localized open chromatin structures. Here we show that replication U/N-domain borders that are conserved in all considered cell lines have an environment highly enriched in nucleosome-excluding-energy barriers, suggesting that these ubiquitous MaOris have been selected during evolution. 
In contrast, MaOris that are cell-type-specific are mainly regulated epigenetically and are no longer favoured by a local abundance of intrinsic NFRs encoded in the DNA sequence. At the smaller few hundred bp scale of gene promoters, CpG-rich promoters of housekeeping genes found nearby ubiquitous MaOris as well as CpG-poor promoters of tissue-specific genes found nearby cell-type-specific MaOris, both correspond to in vivo NFRs that are not coded as nucleosome-excluding-energy barriers. Whereas the former promoters are likely to correspond to high occupancy transcription factor binding regions, the latter are an illustration that gene regulation in human is typically cell-type-specific.
Affiliation(s)
- Guénola Drillon
- Université de Lyon, F-69000 Lyon, France. Laboratoire de Physique, CNRS UMR 5672, École Normale Supérieure de Lyon, F-69007 Lyon, France
|
32
|
Lagging-strand replication shapes the mutational landscape of the genome. Nature 2015; 518:502-506. [PMID: 25624100 PMCID: PMC4374164 DOI: 10.1038/nature14183] [Citation(s) in RCA: 184] [Impact Index Per Article: 18.4] [Received: 10/14/2014] [Accepted: 01/05/2015] [Indexed: 12/21/2022]
Abstract
The origin of mutations is central to understanding evolution and of key relevance to health. Variation occurs non-randomly across the genome, and mechanisms for this remain to be defined. Here, we report that the 5′-ends of Okazaki fragments have significantly elevated levels of nucleotide substitution, indicating a replicative origin for such mutations. With a novel method, emRiboSeq, we map the genome-wide contribution of polymerases, and show that despite Okazaki fragment processing, DNA synthesised by error-prone Pol-α is retained in vivo, comprising ~1.5% of the mature genome. We propose that DNA-binding proteins that rapidly re-associate post-replication act as partial barriers to Pol-δ mediated displacement of Pol-α synthesised DNA, resulting in incorporation of such Pol-α tracts and elevated mutation rates at specific sites. We observe a mutational cost to chromatin and regulatory protein binding, resulting in mutation hotspots at regulatory elements, with signatures of this process detectable in both yeast and humans.
|
33
|
Chromatin structure is distinct between coding and non-coding single nucleotide polymorphisms. BMC Mol Biol 2014; 15:22. [PMID: 25282079 PMCID: PMC4193957 DOI: 10.1186/1471-2199-15-22] [Citation(s) in RCA: 3] [Impact Index Per Article: 0.3] [Received: 01/30/2014] [Accepted: 09/26/2014] [Indexed: 12/20/2022]
Abstract
Background: Previous studies suggested that nucleosomes are enriched with single nucleotide polymorphisms (SNPs) in humans and that the occurrence of mutations is closely associated with CpG dinucleotides. We aimed to determine if the chromatin organization is genomic locus specific around SNPs, and if newly occurring mutations are associated with SNPs. Results: Here, we classified SNPs according to their loci and investigated chromatin organization in both CD4+ T cells and lymphoblastoid cells in humans. We calculated the SNP frequency around somatic mutations. The results indicated that nucleosome occupancy is different around SNP sites in different genomic loci. Coding SNPs are mainly enriched at nucleosomes and associated with repressed histone modifications (HMs) and DNA methylation. Contrastingly, intron SNPs occur in nucleosome-depleted regions and lack HMs. Interestingly, risk-associated non-coding SNPs are also enriched at nucleosomes with HMs but associated with low GC-content and low DNA methylation level. The base-transversion allele frequency is significantly low in coding-synonymous SNPs (P < 10^-11). Another finding is that at the -1 and +1 positions relative to the somatic mutation sites, the SNP frequency was significantly higher (P < 3.2 × 10^-5). Conclusions: The results suggested chromatin structure is different around coding SNPs and non-coding SNPs. New mutations tend to occur at the -1 and +1 positions immediately adjacent to the SNPs.
|
34
|
Abstract
Mutational heterogeneity must be taken into account when reconstructing evolutionary histories, calibrating molecular clocks, and predicting links between genes and disease. Selective pressures and various DNA transactions have been invoked to explain the heterogeneous distribution of genetic variation between species, within populations, and in tissue-specific tumors. To examine relationships between such heterogeneity and variations in leading- and lagging-strand replication fidelity and mismatch repair, we accumulated 40,000 spontaneous mutations in eight diploid yeast strains in the absence of selective pressure. We found that replicase error rates vary by fork direction, coding state, nucleosome proximity, and sequence context. Further, error rates and DNA mismatch repair efficiency both vary by mismatch type, responsible polymerase, replication time, and replication origin proximity. Mutation patterns implicate replication infidelity as one driver of variation in somatic and germline evolution, suggest mechanisms of mutual modulation of genome stability and composition, and predict future observations in specific cancers.
|
35
|
Babbitt GA, Alawad MA, Schulze KV, Hudson AO. Synonymous codon bias and functional constraint on GC3-related DNA backbone dynamics in the prokaryotic nucleoid. Nucleic Acids Res 2014; 42:10915-26. [PMID: 25200075 PMCID: PMC4176184 DOI: 10.1093/nar/gku811] [Citation(s) in RCA: 18] [Impact Index Per Article: 1.6] [Indexed: 01/01/2023]
Abstract
While mRNA stability has been demonstrated to control rates of translation, generating both global and local synonymous codon biases in many unicellular organisms, this explanation cannot adequately explain why codon bias strongly tracks neighboring intergene GC content; suggesting that structural dynamics of DNA might also influence codon choice. Because minor groove width is highly governed by 3-base periodicity in GC, the existence of triplet-based codons might imply a functional role for the optimization of local DNA molecular dynamics via GC content at synonymous sites (≈GC3). We confirm a strong association between GC3-related intrinsic DNA flexibility and codon bias across 24 different prokaryotic multiple whole-genome alignments. We develop a novel test of natural selection targeting synonymous sites and demonstrate that GC3-related DNA backbone dynamics have been subject to moderate selective pressure, perhaps contributing to our observation that many genes possess extreme DNA backbone dynamics for their given protein space. This dual function of codons may impose universal functional constraints affecting the evolution of synonymous and non-synonymous sites. We propose that synonymous sites may have evolved as an 'accessory' during an early expansion of a primordial genetic code, allowing for multiplexed protein coding and structural dynamic information within the same molecular context.
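For readers unfamiliar with the GC3 measure mentioned in this abstract, here is a minimal sketch of how it is commonly computed: the fraction of G or C bases at the third position of each codon in an in-frame coding sequence. The function name `gc3` and the decision to ignore a trailing partial codon are illustrative assumptions, not the authors' pipeline.

```python
def gc3(cds: str) -> float:
    """Return the fraction of G/C bases at third codon positions.

    `cds` is assumed to be an in-frame coding sequence that starts at the
    first base of a codon; a trailing partial codon, if any, is ignored.
    """
    seq = cds.upper()
    usable = len(seq) - len(seq) % 3   # length covered by complete codons
    third = seq[2:usable:3]            # every third base, starting at index 2
    if not third:
        raise ValueError("sequence is shorter than one codon")
    return sum(base in "GC" for base in third) / len(third)
```

For example, `gc3("ATGGCCAAA")` looks at the third bases G, C and A of the codons ATG, GCC and AAA. Third positions are often, but not always, synonymous, which is why the abstract treats GC3 only as an approximation of GC content at synonymous sites.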
Affiliation(s)
- Gregory A Babbitt
- Thomas H. Gosnell School of Life Sciences, Rochester Institute of Technology, Rochester NY, USA 14623
- Mohammed A Alawad
- B. Thomas Golisano College of Computing and Information Sciences, Rochester Institute of Technology, Rochester NY, USA 14623
- Katharina V Schulze
- Department of Molecular and Human Genetics, Baylor College of Medicine, Houston TX, USA 77030
- André O Hudson
- Thomas H. Gosnell School of Life Sciences, Rochester Institute of Technology, Rochester NY, USA 14623
|
36
|
Nucleosomes shape DNA polymorphism and divergence. PLoS Genet 2014; 10:e1004457. [PMID: 24991813 PMCID: PMC4081404 DOI: 10.1371/journal.pgen.1004457] [Citation(s) in RCA: 29] [Impact Index Per Article: 2.6] [Received: 07/19/2013] [Accepted: 05/12/2014] [Indexed: 11/30/2022]
Abstract
An estimated 80% of genomic DNA in eukaryotes is packaged as nucleosomes, which, together with the remaining interstitial linker regions, generate higher order chromatin structures [1]. Nucleosome sequences isolated from diverse organisms exhibit ∼10 bp periodic variations in AA, TT and GC dinucleotide frequencies. These sequence elements generate intrinsically curved DNA and help establish the histone-DNA interface. We investigated an important unanswered question concerning the interplay between chromatin organization and genome evolution: do the DNA sequence preferences inherent to the highly conserved histone core exert detectable natural selection on genomic divergence and polymorphism? To address this hypothesis, we isolated nucleosomal DNA sequences from Drosophila melanogaster embryos and examined the underlying genomic variation within and between species. We found that divergence along the D. melanogaster lineage is periodic across nucleosome regions with base changes following preferred nucleotides, providing new evidence for systematic evolutionary forces in the generation and maintenance of nucleosome-associated dinucleotide periodicities. Further, Single Nucleotide Polymorphism (SNP) frequency spectra show striking periodicities across nucleosomal regions, paralleling divergence patterns. Preferred alleles occur at higher frequencies in natural populations, consistent with a central role for natural selection. These patterns are stronger for nucleosomes in introns than in intergenic regions, suggesting selection is stronger in transcribed regions where nucleosomes undergo more displacement, remodeling and functional modification. In addition, we observe a large-scale (∼180 bp) periodic enrichment of AA/TT dinucleotides associated with nucleosome occupancy, while GC dinucleotide frequency peaks in linker regions. Divergence and polymorphism data also support a role for natural selection in the generation and maintenance of these super-nucleosomal patterns. Our results demonstrate that nucleosome-associated sequence periodicities are under selective pressure, implying that structural interactions between nucleosomes and DNA sequence shape sequence evolution, particularly in introns.
In eukaryotic cells, the majority of DNA is packaged in nucleosomes comprised of ∼147 bp of DNA wound tightly around the highly conserved histone octamer. Nucleosomal DNA from diverse organisms shows an anti-correlated ∼10 bp periodicity of AT-rich and GC-rich dinucleotides. These sequence features influence DNA bending and shape, facilitating structural interactions. We asked whether natural selection mediated through the periodic sequence preferences of nucleosomes shapes the evolution of non-protein-coding regions of D. melanogaster by examining the inter- and intra-species genomic variation relative to these fundamental chromatin building blocks. The sequence changes across nucleosome-bound regions on the melanogaster lineage mirror the observed nucleosome dinucleotide periodicities. Importantly, we show that the frequencies of polymorphisms in natural populations vary across these regions, paralleling divergence, with higher frequencies of preferred alleles. These patterns are most evident for intronic regions and indicate that non-protein coding regions are evolving toward sequences that facilitate the canonical association with the histone core. This result is consistent with the hypothesis that interactions between DNA and the core have systematic impacts on function that are subject to natural selection and are not solely due to mutational bias. These ubiquitous interactions with the histone core partially account for the evolutionary constraint observed in unannotated genomic regions, and may drive broad changes in base composition.
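As a rough illustration of the ∼10 bp dinucleotide periodicity described in this abstract, the sketch below builds a positional AA/TT frequency profile from aligned, equal-length sequences and picks the period with the strongest Fourier component. It is not the authors' pipeline: the function names, the candidate period range (7 to 15 bp, chosen to exclude harmonics such as 5 and 20 bp), and the toy input are all assumptions made for illustration.

```python
import math


def dinucleotide_profile(seqs, dinucs=("AA", "TT")):
    """Fraction of sequences with one of `dinucs` starting at each position."""
    length = min(len(s) for s in seqs) - 1  # last start position for a dinucleotide
    profile = []
    for i in range(length):
        hits = sum(1 for s in seqs if s[i:i + 2] in dinucs)
        profile.append(hits / len(seqs))
    return profile


def power_at_period(profile, period):
    """Squared magnitude of the mean-centred Fourier component at `period` (in bp)."""
    mean = sum(profile) / len(profile)
    re = sum((p - mean) * math.cos(2 * math.pi * i / period)
             for i, p in enumerate(profile))
    im = sum((p - mean) * math.sin(2 * math.pi * i / period)
             for i, p in enumerate(profile))
    return (re * re + im * im) / len(profile)


def dominant_period(profile, candidates=range(7, 16)):
    """Candidate period (hypothetical range) with the largest spectral power."""
    return max(candidates, key=lambda p: power_at_period(profile, p))


def toy_nucleosome(length=147, spacing=10):
    """Synthetic sequence: 'AA' every `spacing` bp on a background of 'C'."""
    seq = ["C"] * length
    for i in range(0, length - 1, spacing):
        seq[i] = seq[i + 1] = "A"
    return "".join(seq)


toy_seqs = [toy_nucleosome() for _ in range(3)]
toy_profile = dinucleotide_profile(toy_seqs)
toy_period = dominant_period(toy_profile)
```

On real data the profile would be computed from dyad-aligned nucleosomal reads, and the spectrum would be far noisier than in this toy comb signal.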
|
37
|
Li MJ, Yan B, Sham PC, Wang J. Exploring the function of genetic variants in the non-coding genomic regions: approaches for identifying human regulatory variants affecting gene expression. Brief Bioinform 2014; 16:393-412. [PMID: 24916300 DOI: 10.1093/bib/bbu018] [Citation(s) in RCA: 45] [Impact Index Per Article: 4.1] [Received: 01/22/2014] [Accepted: 04/23/2014] [Indexed: 12/13/2022] Open
Abstract
Understanding the genetic basis of human traits/diseases and the underlying mechanisms of how these traits/diseases are affected by genetic variations is critical for public health. Current genome-wide functional genomics data have uncovered a large number of functional elements in the noncoding regions of the human genome, providing new opportunities to study regulatory variants (RVs). RVs play important roles in transcription factor binding, chromatin states and epigenetic modifications. Here, we systematically review an array of methods currently used to map RVs as well as the computational approaches for annotating and interpreting their regulatory effects, with emphasis on regulatory single-nucleotide polymorphisms. We also briefly introduce experimental methods to validate these functional RVs.
|
38
|
Juan D, Rico D, Marques-Bonet T, Fernández-Capetillo Ó, Valencia A. Late-replicating CNVs as a source of new genes. Biol Open 2013; 2:1402-11. [PMID: 24285712 PMCID: PMC3863426 DOI: 10.1242/bio.20136924] [Citation(s) in RCA: 9] [Impact Index Per Article: 0.8] [Received: 10/15/2013] [Accepted: 10/23/2013] [Indexed: 01/09/2023] Open
Abstract
Asynchronous replication of the genome has been associated with different rates of point mutation and copy number variation (CNV) in human populations. Here, our aim was to investigate whether the bias in the generation of CNV that is associated with DNA replication timing might have conditioned the birth of new protein-coding genes during evolution. We show that genes that were duplicated during primate evolution are more commonly found among the human genes located in late-replicating CNV regions. We traced the relationship between replication timing and the evolutionary age of duplicated genes. Strikingly, we found that there is a significant enrichment of evolutionarily younger duplicates in late-replicating regions of the human and mouse genomes. Indeed, the presence of duplicates in late-replicating regions gradually decreases as the evolutionary time since duplication extends. Our results suggest that the accumulation of recent duplications in late-replicating CNV regions is an active process influencing genome evolution.
Affiliation(s)
- David Juan
  - Structural Biology and BioComputing Programme, Spanish National Cancer Research Center (CNIO), Melchor Fernández Almagro 3, 28029 Madrid, Spain
- Daniel Rico
  - Structural Biology and BioComputing Programme, Spanish National Cancer Research Center (CNIO), Melchor Fernández Almagro 3, 28029 Madrid, Spain
- Tomas Marques-Bonet
  - Institut Catala de Recerca i Estudis Avancats (ICREA) and Institut de Biologia Evolutiva (UPF/CSIC), Dr Aiguader 88, PRBB, 08003 Barcelona, Spain
- Óscar Fernández-Capetillo
  - Structural Biology and BioComputing Programme, Spanish National Cancer Research Center (CNIO), Melchor Fernández Almagro 3, 28029 Madrid, Spain
- Alfonso Valencia
  - Structural Biology and BioComputing Programme, Spanish National Cancer Research Center (CNIO), Melchor Fernández Almagro 3, 28029 Madrid, Spain
|
39
|
Warnecke T, Becker EA, Facciotti MT, Nislow C, Lehner B. Conserved substitution patterns around nucleosome footprints in eukaryotes and Archaea derive from frequent nucleosome repositioning through evolution. PLoS Comput Biol 2013; 9:e1003373. [PMID: 24278010 PMCID: PMC3836710 DOI: 10.1371/journal.pcbi.1003373] [Citation(s) in RCA: 11] [Impact Index Per Article: 0.9] [Received: 07/17/2013] [Accepted: 10/13/2013] [Indexed: 11/21/2022] Open
Abstract
Nucleosomes, the basic repeat units of eukaryotic chromatin, have been suggested to influence the evolution of eukaryotic genomes, both by altering the propensity of DNA to mutate and by selection acting to maintain or exclude nucleosomes in particular locations. Contrary to the popular idea that nucleosomes are unique to eukaryotes, histone proteins have also been discovered in some archaeal genomes. Archaeal nucleosomes, however, are quite unlike their eukaryotic counterparts in many respects, including their assembly into tetramers (rather than octamers) from histone proteins that lack N- and C-terminal tails. Here, we show that despite these fundamental differences the association between nucleosome footprints and sequence evolution is strikingly conserved between humans and the model archaeon Haloferax volcanii. In light of this finding we examine whether selection or mutation can explain concordant substitution patterns in the two kingdoms. Unexpectedly, we find that neither the mutation nor the selection model is sufficient to explain the observed association between nucleosomes and sequence divergence. Instead, we demonstrate that nucleosome-associated substitution patterns are more consistent with a third model where sequence divergence results in frequent repositioning of nucleosomes during evolution. Indeed, we show that nucleosome repositioning is both necessary and largely sufficient to explain the association between current nucleosome positions and biased substitution patterns. This finding highlights the importance of considering the direction of causality between genetic and epigenetic change.
Genome sequences as well as epigenetic states, such as DNA methylation or nucleosome binding patterns, change during evolution. But what is the causal relationship between the two? We already know that nucleotide variation within and between species is distributed unevenly around nucleosome footprints, but does this mean that sequence evolution follows a biased course because the presence of nucleosomes affects mutation and DNA repair dynamics? Or is it, in fact, the other way around, i.e. changes happen at the DNA level and prompt shifts in nucleosome positioning? To investigate the direction of causality in genetic versus epigenetic evolution, we analyze substitution patterns in eukaryotes as well as the archaeon Haloferax volcanii in the context of genome-wide nucleosome binding maps. We demonstrate that the relationship between nucleosome positions and between-species divergence patterns, strikingly similar in eukaryotes and archaea, can be explained in large part by nucleosomes shifting positions in response to substitution, although both mutation and selection biases might still exist. Our results illustrate that it is important to consider the direction of causality between epigenetic and genetic change when analyzing patterns of sequence divergence and using sequence conservation to infer selection on epigenetic states.
Affiliation(s)
- Tobias Warnecke
  - Bioinformatics and Genomics Program, Centre for Genomic Regulation (CRG) and UPF, Barcelona, Spain
  - Universitat Pompeu Fabra (UPF), Barcelona, Spain
- Erin A. Becker
  - Microbiology Graduate Group, University of California, Davis, Davis, California, United States of America
- Marc T. Facciotti
  - Microbiology Graduate Group, University of California, Davis, Davis, California, United States of America
  - Department of Biomedical Engineering, University of California, Davis, Davis, California, United States of America
  - Genome Center, University of California, Davis, Davis, California, United States of America
- Corey Nislow
  - Department of Pharmaceutical Sciences, University of British Columbia, Vancouver, British Columbia, Canada
- Ben Lehner
  - Universitat Pompeu Fabra (UPF), Barcelona, Spain
  - EMBL-CRG Systems Biology Unit, Centre for Genomic Regulation (CRG), Barcelona, Spain
  - Institució Catalana de Recerca i Estudis Avançats, Centre for Genomic Regulation (CRG) and UPF, Barcelona, Spain
|
40
|
Abstract
Histone-DNA complexes, so-called nucleosomes, are the building blocks of DNA packaging in eukaryotic cells. The histone-binding affinity of a local DNA segment depends on its elastic properties and determines its accessibility within the nucleus, which plays an important role in the regulation of gene expression. Here, we derive a fitness landscape for intergenic DNA segments in yeast as a function of two molecular phenotypes: their elasticity-dependent histone affinity and their coverage with transcription factor binding sites. This landscape reveals substantial selection against nucleosome formation over a wide range of both phenotypes. We use it as the core component of a quantitative evolutionary model for intergenic DNA segments. This model consistently predicts the observed diversity of histone affinities within wild Saccharomyces paradoxus populations, as well as the affinity divergence between neighboring Saccharomyces species. Our analysis establishes histone binding and transcription factor binding as two separable modes of sequence evolution, each of which is a direct target of natural selection.
|
41
|
Strong purifying selection at synonymous sites in D. melanogaster. PLoS Genet 2013; 9:e1003527. [PMID: 23737754 PMCID: PMC3667748 DOI: 10.1371/journal.pgen.1003527] [Citation(s) in RCA: 146] [Impact Index Per Article: 12.2] [Received: 01/24/2013] [Accepted: 04/08/2013] [Indexed: 11/19/2022] Open
Abstract
Synonymous sites are generally assumed to be subject to weak selective constraint. For this reason, they are often neglected as a possible source of important functional variation. We use site frequency spectra from deep population sequencing data to show that, contrary to this expectation, 22% of four-fold synonymous (4D) sites in Drosophila melanogaster evolve under very strong selective constraint while few, if any, appear to be under weak constraint. Linking polymorphism with divergence data, we further find that the fraction of synonymous sites exposed to strong purifying selection is higher for those positions that show slower evolution on the Drosophila phylogeny. The function underlying the inferred strong constraint appears to be separate from splicing enhancers, nucleosome positioning, and the translational optimization generating canonical codon bias. The fraction of synonymous sites under strong constraint within a gene correlates well with gene expression, particularly in the mid-late embryo, pupae, and adult developmental stages. Genes enriched in strongly constrained synonymous sites tend to be particularly functionally important and are often involved in key developmental pathways. Given that the observed widespread constraint acting on synonymous sites is likely not limited to Drosophila, the role of synonymous sites in genetic disease and adaptation should be reevaluated.
|
42
|
Kenigsberg E, Tanay A. Drosophila functional elements are embedded in structurally constrained sequences. PLoS Genet 2013; 9:e1003512. [PMID: 23750124 PMCID: PMC3671938 DOI: 10.1371/journal.pgen.1003512] [Citation(s) in RCA: 10] [Impact Index Per Article: 0.8] [Received: 07/31/2012] [Accepted: 03/04/2013] [Indexed: 12/22/2022] Open
Abstract
Modern functional genomics uncovered numerous functional elements in metazoan genomes. Nevertheless, only a small fraction of the typical non-exonic genome contains elements that code for function directly. On the other hand, a much larger fraction of the genome is associated with significant evolutionary constraints, suggesting that much of the non-exonic genome is weakly functional. Here we show that in flies, local (30–70 bp) conserved sequence elements that are associated with multiple regulatory functions serve as focal points to a pattern of punctuated regional increase in G/C nucleotide frequencies. We show that this pattern, which covers a region tenfold larger than the conserved elements themselves, is an evolutionary consequence of a shift in the balance between gain and loss of G/C nucleotides and that it is correlated with nucleosome occupancy across multiple classes of epigenetic state. Evidence for compensatory evolution and analysis of SNP allele frequencies show that the evolutionary regime underlying this balance shift is likely to be non-neutral. These data suggest that current gaps in our understanding of genome function and evolutionary dynamics are explicable by a model of sparse sequence elements directly encoding for function, embedded into structural sequences that help to define the local and global epigenomic context of such functional elements.
A key challenge in functional genomics is to predict evolutionary dynamics from functional annotation of the genome and vice versa. Modern epigenomic studies helped assign function to numerous new sequence elements, but left most of the genome essentially uncharacterized. Evolutionary genomics, on the other hand, consistently suggests that a much larger fraction of the un-annotated genome evolves under selective pressure. We hypothesize that this function-selection gap can be attributed to sequences that facilitate the physical organization of functional elements, such as transcription factor binding sites, within chromosomes. We exemplify this by studying in detail the sequences embedding small conserved elements (CEs) in Drosophila. We show that, while CEs have typically high AT content, high GC content levels around them are maintained by a non-neutral evolutionary balance between gain and loss of GC nucleotides. This non-uniform pattern is highly correlated with nucleosome organization around CEs, potentially imposing an evolutionary constraint on as much as one quarter of the genome. We suggest this can at least partly explain the above function-selection gap. Weak evolutionary constraints on “structural” sequences (at scales ranging from one nucleosome to recently described multi-megabase topological domains) may affect genome evolution just like structural motifs shape protein evolution.
Affiliation(s)
- Ephraim Kenigsberg
  - Department of Computer Science and Applied Mathematics and Department of Biological Regulation, Weizmann Institute, Rehovot, Israel
- Amos Tanay
  - Department of Computer Science and Applied Mathematics and Department of Biological Regulation, Weizmann Institute, Rehovot, Israel
|
43
|
Affiliation(s)
- James G. D. Prendergast
  - MRC Institute of Genetics and Molecular Medicine, University of Edinburgh, Western General Hospital, Edinburgh, United Kingdom
- Colin A. Semple
  - MRC Institute of Genetics and Molecular Medicine, University of Edinburgh, Western General Hospital, Edinburgh, United Kingdom
|
44
|
Chambers EV, Bickmore WA, Semple CA. Divergence of mammalian higher order chromatin structure is associated with developmental loci. PLoS Comput Biol 2013; 9:e1003017. [PMID: 23592965 PMCID: PMC3617018 DOI: 10.1371/journal.pcbi.1003017] [Citation(s) in RCA: 34] [Impact Index Per Article: 2.8] [Received: 10/08/2012] [Accepted: 02/18/2013] [Indexed: 02/03/2023] Open
Abstract
Several recent studies have examined different aspects of mammalian higher order chromatin structure - replication timing, lamina association and Hi-C inter-locus interactions - and have suggested that most of these features of genome organisation are conserved over evolution. However, the extent of evolutionary divergence in higher order structure has not been rigorously measured across the mammalian genome, and until now little has been known about the characteristics of any divergent loci present. Here, we generate a dataset combining multiple measurements of chromatin structure and organisation over many embryonic cell types for both human and mouse that, for the first time, allows a comprehensive assessment of the extent of structural divergence between mammalian genomes. Comparison of orthologous regions confirms that all measurable facets of higher order structure are conserved between human and mouse, across the vast majority of the detectably orthologous genome. This broad similarity is observed in spite of many loci possessing cell type specific structures. However, we also identify hundreds of regions (from 100 Kb to 2.7 Mb in size) showing consistent evidence of divergence between these species, constituting at least 10% of the orthologous mammalian genome and encompassing many hundreds of human and mouse genes. These regions show unusual shifts in human GC content, are unevenly distributed across both genomes, and are enriched in human subtelomeric regions. Divergent regions are also relatively enriched for genes showing divergent expression patterns between human and mouse ES cells, implying these regions cause divergent regulation. Particular divergent loci are strikingly enriched in genes implicated in vertebrate development, suggesting important roles for structural divergence in the evolution of mammalian developmental programmes. These data suggest that, though relatively rare in the mammalian genome, divergence in higher order chromatin structure has played important roles during evolution.
Affiliation(s)
- Emily V. Chambers
  - MRC Human Genetics Unit, MRC Institute of Genetics and Molecular Medicine, University of Edinburgh, Western General Hospital, Edinburgh, United Kingdom
- Wendy A. Bickmore
  - MRC Human Genetics Unit, MRC Institute of Genetics and Molecular Medicine, University of Edinburgh, Western General Hospital, Edinburgh, United Kingdom
- Colin A. Semple
  - MRC Human Genetics Unit, MRC Institute of Genetics and Molecular Medicine, University of Edinburgh, Western General Hospital, Edinburgh, United Kingdom
|
45
|
H2A.Z nucleosome positioning has no impact on genetic variation in Drosophila genome. PLoS One 2013; 8:e58295. [PMID: 23472174 PMCID: PMC3589275 DOI: 10.1371/journal.pone.0058295] [Citation(s) in RCA: 1] [Impact Index Per Article: 0.1] [Received: 11/16/2012] [Accepted: 02/01/2013] [Indexed: 11/20/2022] Open
Abstract
Nucleosome occupancy results in complex sequence variation rate heterogeneity by either increasing mutation rate or inhibiting DNA repair in yeast, fish, and human. The H2A.Z nucleosome is extensively involved in gene transcription activation and regulation. To test whether the H2A.Z nucleosome has a similar impact on sequence variability in the Drosophila genome, we profiled the H2A.Z nucleosome occupancy and sequence variation rate at gene ends and splicing sites. Consistent with previous studies, H2A.Z nucleosome positioning helps to demarcate the borders of exons. Nucleosome occupancy is anticorrelated with sequence divergence rate in the regions flanking transcription start sites and splicing sites. However, there is no rate heterogeneity between the linker DNA and H2A.Z nucleosomal DNA, regardless of nucleosome occupancy, fuzziness, positioning in promoter, coding, and intergenic regions, or gene age (young or old). The rate at intergenic nucleosomes and the flanking linker regions is, however, higher than that at the genic counterparts. Further analyses found that the high sequence divergence rate in the promoter regions, which are usually nucleosome-depleted, likely results from the high mutation rate in the enriched tandem repeats. Interestingly, within nucleosomes spanning splicing sites, sequence variability of nucleosomal DNA significantly increases from the end within exons to the other end protruding into introns. The relaxed functional constraint in introns contributes to the high rate of nucleosomal DNA residing in introns while the strict functional constraint in exons maintains the low rate of nucleosomal DNA residing in exons. Taken together, H2A.Z nucleosome occupancy has no effect on sequence variability of the Drosophila genome, which is likely determined by local sequence composition and the concomitant selection pressure.
Collapse
|
46
|
Levitsky VG, Babenko VN, Vershinin AV. The roles of the monomer length and nucleotide context of plant tandem repeats in nucleosome positioning. J Biomol Struct Dyn 2013; 32:115-26. [PMID: 23384242 DOI: 10.1080/07391102.2012.755796] [Citation(s) in RCA: 8] [Impact Index Per Article: 0.7] [Indexed: 02/04/2023]
Abstract
Similar to regularly spaced nucleosomes in chromatin, long tandem DNA arrays are composed of regularly alternating monomers that have almost identical primary DNA structures. Such a similarity in the structural organization makes these arrays especially interesting for studying the role of intrinsic DNA preferences in nucleosome positioning. We have studied the nucleosome formation potential of DNA tandem repeat families with different monomer lengths (ML). In total, 165 plant tandem repeat families from the PlantSat database (http://w3lamc.umbr.cas.cz/PlantSat/) were divided into two classes based on the number of nucleosome repeats in one DNA monomer. For predicting nucleosome formation potential, we developed the Phase method, which combines the advantages of multiple bioinformatics models. The Phase method was able to distinguish interfamily differences and intrafamily monomer variation and identify the influence of nucleotide context on nucleosome formation potential. Three main types of nucleosome arrangement in DNA tandem repeat arrays--regular, partially regular (partial), and flexible--were distinguished among a great variety of Phase profiles. The regular type, in which all nucleosomes of the monomer array are positioned in a context-dependent manner, is the most representative type of the class 1 families, with ML equal to or a multiple of the nucleosome repeat length (NRL). In the partially regular type, nucleotide context influences the positioning of only a subset of nucleosomes. The influence of the nucleotide context on nucleosome positioning has the least effect in the flexible type, which contains the greatest number of families (65). The majority of these families belong to class 2 and have nonmultiple ML to NRL ratios.
Affiliation(s)
- Victor G Levitsky
  - Laboratory of Molecular Genetics Systems, Institute of Cytology and Genetics, Novosibirsk, 630090, Russia
|
47
|
Warnecke T, Supek F, Lehner B. Nucleoid-associated proteins affect mutation dynamics in E. coli in a growth phase-specific manner. PLoS Comput Biol 2012; 8:e1002846. [PMID: 23284284 PMCID: PMC3527292 DOI: 10.1371/journal.pcbi.1002846] [Citation(s) in RCA: 19] [Impact Index Per Article: 1.5] [Received: 08/15/2012] [Accepted: 11/03/2012] [Indexed: 02/06/2023] Open
Abstract
The binding of proteins can shield DNA from mutagenic processes but also interfere with efficient repair. How the presence of DNA-binding proteins shapes intra-genomic differences in mutability and, ultimately, sequence variation in natural populations, however, remains poorly understood. In this study, we examine sequence evolution in Escherichia coli in relation to the binding of four abundant nucleoid-associated proteins: Fis, H-NS, IhfA, and IhfB. We find that, for a subset of mutations, protein occupancy is associated with both increased and decreased mutability in the underlying sequence depending on when the protein is bound during the bacterial growth cycle. On average, protein-bound DNA exhibits reduced mutability compared to protein-free DNA. However, this net protective effect is weak and can be abolished or even reversed during stages of colony growth where binding coincides – and hence likely interferes with – DNA repair activity. We suggest that the four nucleoid-associated proteins analyzed here have played a minor but significant role in patterning extant sequence variation in E. coli.
Mutations can be more or less likely to occur depending on whether DNA is naked or bound by proteins. On the one hand, DNA-binding proteins can shield the DNA from certain mutagenic processes. On the other hand, the very same proteins can interfere with efficient DNA repair. In this study, we reconstruct the history of mutations across 54 E. coli genomes and ask whether mutation risk is higher or lower in regions occupied by proteins that help organize bacterial DNA into chromatin. Intriguingly, we find that the effect of binding depends on its timing. When we consider genomic regions bound during stationary phase, we observe that binding is associated with lower mutation risk for some mutation classes compared to naked DNA, albeit weakly. However, when binding occurs during exponential phase, bound regions actually experience more mutations on average. We argue that this is because, during exponential phase, the major effect of binding is that it interferes with efficient DNA repair, whereas in stationary phase – when many repair pathways are inactive – the protective effect of binding dominates. Our results suggest that the four DNA-binding proteins considered here have a small but significant growth phase-specific effect on mutation dynamics in E. coli.
Affiliation(s)
- Tobias Warnecke
  - Bioinformatics and Genomics Program, Centre for Genomic Regulation (CRG), Barcelona, Spain.
|
48
|
Predicting nucleosome binding motif set and analyzing their distributions around functional sites of human genes. Chromosome Res 2012; 20:685-98. [DOI: 10.1007/s10577-012-9305-0] [Citation(s) in RCA: 7] [Impact Index Per Article: 0.5] [Received: 04/23/2012] [Revised: 07/13/2012] [Accepted: 07/17/2012] [Indexed: 01/30/2023]
|
49
|
Prendergast JGD, Tong P, Hay DC, Farrington SM, Semple CAM. A genome-wide screen in human embryonic stem cells reveals novel sites of allele-specific histone modification associated with known disease loci. Epigenetics Chromatin 2012; 5:6. [PMID: 22607690 PMCID: PMC3438052 DOI: 10.1186/1756-8935-5-6] [Citation(s) in RCA: 13] [Impact Index Per Article: 1.0] [Received: 01/23/2012] [Accepted: 04/10/2012] [Indexed: 12/12/2022] Open
Abstract
BACKGROUND: Chromatin structure at a given site can differ between chromosome copies in a cell, and such imbalances in chromatin structure have been shown to be important in understanding the molecular mechanisms controlling several disease loci. Human genetic variation, DNA methylation, and disease have been intensely studied, uncovering many sites of allele-specific DNA methylation (ASM). However, little is known about the genome-wide occurrence of sites of allele-specific histone modification (ASHM) and their relationship to human disease. The aim of this study was to investigate the extent and characteristics of sites of ASHM in human embryonic stem cells (hESCs).
RESULTS: Using a statistically rigorous protocol, we investigated the genomic distribution of ASHM in hESCs, and their relationship to sites of allele-specific expression (ASE) and DNA methylation. We found that, although they were rare, sites of ASHM were substantially enriched at loci displaying ASE. Many were also found at known imprinted regions, hence sites of ASHM are likely to be better markers of imprinted regions than sites of ASM. We also found that sites of ASHM and ASE in hESCs colocalize at risk loci for developmental syndromes mediated by deletions, providing insights into the etiology of these disorders.
CONCLUSION: These results demonstrate the potential importance of ASHM patterns in the interpretation of disease loci, and the protocol described provides a basis for similar studies of ASHM in other cell types to further our understanding of human disease susceptibility.
Affiliation(s)
- James G D Prendergast
  - MRC Human Genetics Unit, MRC Institute of Genetics and Molecular Medicine, University of Edinburgh, Western General Hospital, Crewe Road, Edinburgh, EH4 2XU, UK
- Pin Tong
  - UCD Conway Institute for Biomolecular and Biomedical Research, Dublin, Ireland
- David C Hay
  - MRC Centre for Regenerative Medicine, University of Edinburgh, 49 Little France Crescent, Edinburgh, EH16 4SB, UK
- Susan M Farrington
  - MRC Human Genetics Unit, MRC Institute of Genetics and Molecular Medicine, University of Edinburgh, Western General Hospital, Crewe Road, Edinburgh, EH4 2XU, UK
- Colin A M Semple
  - MRC Human Genetics Unit, MRC Institute of Genetics and Molecular Medicine, University of Edinburgh, Western General Hospital, Crewe Road, Edinburgh, EH4 2XU, UK
|
50
|
Amit M, Donyo M, Hollander D, Goren A, Kim E, Gelfman S, Lev-Maor G, Burstein D, Schwartz S, Postolsky B, Pupko T, Ast G. Differential GC content between exons and introns establishes distinct strategies of splice-site recognition. Cell Rep 2012; 1:543-56. [PMID: 22832277 DOI: 10.1016/j.celrep.2012.03.013] [Citation(s) in RCA: 219] [Impact Index Per Article: 16.8] [Received: 12/27/2011] [Revised: 03/07/2012] [Accepted: 03/30/2012] [Indexed: 12/12/2022] Open
Abstract
During evolution, segments of homeothermic genomes underwent a GC content increase. Our analyses reveal that two exon-intron architectures have evolved from an ancestral state of low GC content exons flanked by short introns with a lower GC content. One group underwent a GC content elevation that abolished the differential exon-intron GC content, with introns remaining short. The other group retained the overall low GC content as well as the differential exon-intron GC content, and is associated with longer introns. We show that differential exon-intron GC content regulates exon inclusion level in this group, in which disease-associated mutations often lead to exon skipping. This group's exons also display higher nucleosome occupancy compared to flanking introns and exons of the other group, thus "marking" them for spliceosomal recognition. Collectively, our results reveal that differential exon-intron GC content is a previously unidentified determinant of exon selection and argue that the two GC content architectures reflect the two mechanisms by which splicing signals are recognized: exon definition and intron definition.
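To make the notion of "differential exon-intron GC content" concrete, the following is a minimal sketch that computes it as the exon's GC fraction minus the GC fraction of its pooled flanking introns. The abstract does not give the exact definition the authors used, so this formula, the function names, and the toy sequences are assumptions for illustration only.

```python
def gc_content(seq):
    """Fraction of G and C bases in a nucleotide sequence (case-insensitive)."""
    seq = seq.upper()
    if not seq:
        raise ValueError("empty sequence")
    return (seq.count("G") + seq.count("C")) / len(seq)


def exon_intron_gc_differential(upstream_intron, exon, downstream_intron):
    """GC(exon) minus GC(pooled flanking introns); an assumed, simplified definition."""
    flank = upstream_intron + downstream_intron
    return gc_content(exon) - gc_content(flank)


# Toy sequences (hypothetical): a GC-rich exon between AT-rich introns.
toy_exon_gc = gc_content("GCGCATGC")
toy_differential = exon_intron_gc_differential("ATATGC", "GCGCATGC", "TTTTAA")
```

A clearly positive value corresponds to the architecture in which exons stand out from low-GC introns; a value near zero corresponds to the group in which the differential was abolished.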
Affiliation(s)
- Maayan Amit
  - Department of Human Molecular Genetics and Biochemistry, Sackler Faculty of Medicine, Tel-Aviv University, Ramat Aviv 69978, Israel
|