1
Khandia R, Pandey MK, Garg R, Khan AA, Baklanov I, Alanazi AM, Nepali P, Gurjar P, Choudhary OP. Molecular insights into codon usage analysis of mitochondrial fission and fusion gene: relevance to neurodegenerative diseases. Ann Med Surg (Lond) 2024; 86:1416-1425. [PMID: 38463054 PMCID: PMC10923317 DOI: 10.1097/ms9.0000000000001725] [Received: 09/14/2023] [Accepted: 01/05/2024] [Indexed: 03/12/2024]
Abstract
Mitochondrial dysfunction is the leading cause of neurodegenerative disorders like Alzheimer's disease and Parkinson's disease. The mitochondrion is a highly dynamic organelle that continuously undergoes fission and fusion to distribute its components evenly and to maintain proper shape, number, and bioenergetic functionality. A set of genes governs the process of fission and fusion: OPA1, Mfn1, and Mfn2 govern fusion, while Drp1, Fis1, MIEF1, and MIEF2 control fission. Determination of specific molecular patterns of transcripts of these genes revealed the impact of compositional constraints on selecting optimal codons. AGA and CCA codons were over-represented, and CCC, GTC, TTC, GGG, and ACG were under-represented in the fusion gene set. In contrast, CTG was over-represented, and GCG, CCG, and TCG were under-represented in the fission gene set. Hydropathicity analysis revealed non-polar protein products of both fission and fusion gene set transcripts. AGA codon repeats are an integral part of the translational regulation machinery and show a distinct pattern of over-representation and under-representation in different transcripts within the gene sets, suggesting that a selective translational force precisely controls the occurrence of the codon. Five of the six synonymous codons encoding leucine were used differently in the two gene sets. Hence, the forces regulating the occurrence of AGA and the five synonymous leucine-encoding codons suggest translational selection. A correlation of mutational bias with gene expression, codon bias, GRAVY, and AROMA signifies selection pressure in both gene sets, while the correlation of compositional bias with gene expression, codon bias, protein properties, and minimum free energy signifies the presence of compositional constraints. More than 25% of codons of both gene sets showed a significant difference in codon usage. The overall analysis sheds light on the molecular features of gene sets involved in fission and fusion.
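As an illustration of how over- and under-representation of codons such as those named in the abstract above is typically scored, relative synonymous codon usage (RSCU) can be sketched as follows. This is a minimal sketch and not the authors' pipeline: the function name `rscu`, the two-family codon table, and the toy sequence are assumptions introduced here for illustration only.

```python
from collections import Counter

# Synonymous codon families for two amino acids only (standard genetic code);
# a full analysis would cover every degenerate amino acid.
SYNONYMOUS = {
    "Leu": ["TTA", "TTG", "CTT", "CTC", "CTA", "CTG"],
    "Arg": ["CGT", "CGC", "CGA", "CGG", "AGA", "AGG"],
}

def rscu(cds):
    """Return RSCU per codon: observed count divided by the mean count of its synonymous family."""
    usable = len(cds) - len(cds) % 3  # ignore a trailing partial codon
    codons = [cds[i:i + 3] for i in range(0, usable, 3)]
    counts = Counter(codons)
    values = {}
    for family in SYNONYMOUS.values():
        total = sum(counts[c] for c in family)
        if total == 0:
            continue  # amino acid absent from this sequence
        expected = total / len(family)
        for codon in family:
            values[codon] = counts[codon] / expected
    return values

# Hypothetical toy coding sequence (not from the paper):
# 4 x CTG, 1 x TTA, 1 x CTC (leucine) and 3 x AGA (arginine).
toy = "CTG" * 4 + "TTA" + "CTC" + "AGA" * 3
scores = rscu(toy)
```

An RSCU of 1 means a codon is used as often as expected under equal usage of its synonyms; values well above or below 1 indicate over- or under-representation. The cut-offs applied in the study are not given in the abstract, so none are hard-coded here.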
Affiliation(s)
- Megha Katare Pandey
  - Translational Medicine Center, All India Institute of Medical Sciences, Bhopal
- Azmat Ali Khan
  - Pharmaceutical Biotechnology Laboratory, Department of Pharmaceutical Chemistry, College of Pharmacy, King Saud University, Riyadh, Saudi Arabia
- Igor Baklanov
  - Department of Philosophy, North Caucasus Federal University, Stavropol, Russia
- Amer M. Alanazi
  - Pharmaceutical Biotechnology Laboratory, Department of Pharmaceutical Chemistry, College of Pharmacy, King Saud University, Riyadh, Saudi Arabia
- Prakash Nepali
  - Government Medical Officer, Bhimad Primary Health Care Center, Government of Nepal, Tanahun, Nepal
- Pankaj Gurjar
  - Centre for Global Health Research, Saveetha Medical College and Hospital, Saveetha Institute of Medical and Technical Sciences, Saveetha University, Chennai, Tamil Nadu, India
  - Department of Science and Engineering, Novel Global Community Educational Foundation, Hebersham, NSW, Australia
- Om Prakash Choudhary
  - Department of Veterinary Anatomy, College of Veterinary Science, Guru Angad Dev Veterinary and Animal Sciences University (GADVASU), Rampura Phul, Bathinda, Punjab, India
2
Findlay SD, Romo L, Burge CB. Quantifying negative selection in human 3' UTRs uncovers constrained targets of RNA-binding proteins. Nat Commun 2024; 15:85. [PMID: 38168060 PMCID: PMC10762232 DOI: 10.1038/s41467-023-44456-9] [Received: 12/13/2022] [Accepted: 12/14/2023] [Indexed: 01/05/2024]
Abstract
Many non-coding variants associated with phenotypes occur in 3' untranslated regions (3' UTRs), and may affect interactions with RNA-binding proteins (RBPs) to regulate gene expression post-transcriptionally. However, identifying functional 3' UTR variants has proven difficult. We use allele frequencies from the Genome Aggregation Database (gnomAD) to identify classes of 3' UTR variants under strong negative selection in humans. We develop intergenic mutability-adjusted proportion singleton (iMAPS), a generalized measure related to MAPS, to quantify negative selection in non-coding regions. This approach, in conjunction with in vitro and in vivo binding data, identifies precise RBP binding sites, miRNA target sites, and polyadenylation signals (PASs) under strong selection. For each class of sites, we identify thousands of gnomAD variants under selection comparable to missense coding variants, and find that sites in core 3' UTR regions upstream of the most-used PAS are under strongest selection. Together, this work improves our understanding of selection on human genes and validates approaches for interpreting genetic variants in human 3' UTRs.
Affiliation(s)
- Scott D Findlay
  - Department of Biology, Massachusetts Institute of Technology, Cambridge, MA, 02142, USA
- Lindsay Romo
  - Department of Biology, Massachusetts Institute of Technology, Cambridge, MA, 02142, USA
  - Boston Children's Hospital, Boston, MA, 02115, USA
- Christopher B Burge
  - Department of Biology, Massachusetts Institute of Technology, Cambridge, MA, 02142, USA
3
Yıldırım B, Vogl C. Purifying selection against spurious splicing signals contributes to the base composition evolution of the polypyrimidine tract. J Evol Biol 2023; 36:1295-1312. [PMID: 37564008 PMCID: PMC10946897 DOI: 10.1111/jeb.14205] [Received: 03/28/2023] [Revised: 05/31/2023] [Accepted: 06/15/2023] [Indexed: 08/12/2023]
Abstract
Among eukaryotes, the major spliceosomal pathway is highly conserved. While long introns may contain additional regulatory sequences, the ones in short introns seem to be nearly exclusively related to splicing. Although these regulatory sequences involved in splicing are well-characterized, little is known about their evolution. At the 3' end of introns, the splice signal nearly universally contains the dimer AG, which consists of purines, and the polypyrimidine tract upstream of this 3' splice signal is characterized by over-representation of pyrimidines. If the over-representation of pyrimidines in the polypyrimidine tract is also due to avoidance of a premature splicing signal, we hypothesize that AG should be the most under-represented dimer. Through the use of DNA-strand asymmetry patterns, we confirm this prediction in fruit flies of the genus Drosophila and by comparing the asymmetry patterns to a presumably neutrally evolving region, we quantify the selection strength acting on each motif. Moreover, our inference and simulation method revealed that the best explanation for the base composition evolution of the polypyrimidine tract is the joint action of purifying selection against a spurious 3' splice signal and the selection for pyrimidines. Patterns of asymmetry in other eukaryotes indicate that avoidance of premature splicing similarly affects the nucleotide composition in their polypyrimidine tracts.
Affiliation(s)
- Burçin Yıldırım
  - Department of Biomedical Sciences, Vetmeduni Vienna, Vienna, Austria
  - Vienna Graduate School of Population Genetics, Vienna, Austria
- Claus Vogl
  - Department of Biomedical Sciences, Vetmeduni Vienna, Vienna, Austria
  - Vienna Graduate School of Population Genetics, Vienna, Austria
4
Moeckel C, Zaravinos A, Georgakopoulos-Soares I. Strand Asymmetries Across Genomic Processes. Comput Struct Biotechnol J 2023; 21:2036-2047. [PMID: 36968020 PMCID: PMC10030826 DOI: 10.1016/j.csbj.2023.03.007] [Received: 01/18/2023] [Revised: 03/08/2023] [Accepted: 03/08/2023] [Indexed: 03/12/2023]
Abstract
Across biological systems, a number of genomic processes, including transcription, replication, DNA repair, and transcription factor binding, display intrinsic directionalities. These directionalities are reflected in the asymmetric distribution of nucleotides, motifs, genes, transposon integration sites, and other functional elements across the two complementary strands. Strand asymmetries, including GC skews and mutational biases, have shaped the nucleotide composition of diverse organisms. The investigation of strand asymmetries often serves as a method to understand underlying biological mechanisms, including protein binding preferences, transcription factor interactions, retrotransposition, DNA damage and repair preferences, transcription-replication collisions, and mutagenesis mechanisms. Research into this subject also enables the identification of functional genomic sites, such as replication origins and transcription start sites. Improvements in our ability to detect and quantify DNA strand asymmetries will provide insights into diverse functionalities of the genome, the contribution of different mutational mechanisms in germline and somatic mutagenesis, and our knowledge of genome instability and evolution, which all have significant clinical implications in human disease, including cancer. In this review, we describe key developments that have been made across the field of genomic strand asymmetries, as well as the discovery of associated mechanisms.
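The GC skew mentioned in the abstract above is conventionally computed as (G - C) / (G + C) over windows along one strand. The following is a minimal sketch of that calculation; the function name `gc_skew`, the non-overlapping window scheme, and the toy sequence are assumptions chosen for illustration and are not taken from the review.

```python
def gc_skew(seq, window=4):
    """Return GC skew, (G - C) / (G + C), for consecutive non-overlapping windows."""
    seq = seq.upper()
    skews = []
    for start in range(0, len(seq) - window + 1, window):
        chunk = seq[start:start + window]
        g = chunk.count("G")
        c = chunk.count("C")
        # A window without G or C has no defined skew; report 0.0 by convention here.
        skews.append((g - c) / (g + c) if (g + c) else 0.0)
    return skews

# Hypothetical toy sequence: G-rich start, AT-only middle, C-rich end.
toy = "GGGCAATTCCCG"
skews = gc_skew(toy, window=4)
```

On real genomes the window would span kilobases rather than four bases, and a sign change in the skew along the chromosome is the kind of signal used to locate features such as replication origins.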
Affiliation(s)
- Camille Moeckel
  - Institute for Personalized Medicine, Department of Biochemistry and Molecular Biology, The Pennsylvania State University College of Medicine, Hershey, PA, USA
- Apostolos Zaravinos
  - Department of Life Sciences, European University Cyprus, Diogenis Str., 6, Nicosia 2404, Cyprus
  - Cancer Genetics, Genomics and Systems Biology laboratory, Basic and Translational Cancer Research Center (BTCRC), Nicosia 1516, Cyprus
  - Corresponding author at: Department of Life Sciences, European University Cyprus, Diogenis Str., 6, Nicosia 2404, Cyprus.
- Ilias Georgakopoulos-Soares
  - Institute for Personalized Medicine, Department of Biochemistry and Molecular Biology, The Pennsylvania State University College of Medicine, Hershey, PA, USA
  - Corresponding author.
5
Genome-wide measurement of DNA replication fork directionality and quantification of DNA replication initiation and termination with Okazaki fragment sequencing. Nat Protoc 2023; 18:1260-1295. [PMID: 36653528 DOI: 10.1038/s41596-022-00793-5] [Received: 03/23/2021] [Accepted: 11/09/2022] [Indexed: 01/19/2023]
Abstract
Studying the dynamics of genome replication in mammalian cells has been historically challenging. To reveal the location of replication initiation and termination in the human genome, we developed Okazaki fragment sequencing (OK-seq), a quantitative approach based on the isolation and strand-specific sequencing of Okazaki fragments, the lagging strand replication intermediates. OK-seq quantitates the proportion of leftward- and rightward-oriented forks at every genomic locus and reveals the location and efficiency of replication initiation and termination events. Here we provide the detailed experimental procedures for performing OK-seq in unperturbed cultured human cells and budding yeast and the bioinformatics pipelines for data processing and computation of replication fork directionality. Furthermore, we present the analytical approach based on a hidden Markov model, which allows automated detection of ascending, descending and flat replication fork directionality segments revealing the zones of replication initiation, termination and unidirectional fork movement across the entire genome. These tools are essential for the accurate interpretation of human and yeast replication programs. The experiments and the data processing can be accomplished within six days. Besides revealing the genome replication program in fine detail, OK-seq has been instrumental in numerous studies unravelling mechanisms of genome stability, epigenome maintenance and genome evolution.
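The abstract above describes quantifying the proportion of leftward- and rightward-oriented forks at each locus. One way to sketch such a replication fork directionality (RFD) score is as a normalized difference between the two counts per genomic bin. This is an illustrative assumption, not the protocol's published pipeline: the function name `fork_directionality`, the bin counts, and the choice to report empty bins as `None` are hypothetical, and the mapping from sequenced strand to fork direction is taken as already done.

```python
def fork_directionality(right_counts, left_counts):
    """Return (R - L) / (R + L) per bin, where R and L are Okazaki-fragment
    counts attributed to rightward- and leftward-moving forks."""
    rfd = []
    for r, l in zip(right_counts, left_counts):
        total = r + l
        # Bins without reads carry no directionality information.
        rfd.append((r - l) / total if total else None)
    return rfd

# Hypothetical per-bin counts along a chromosome segment.
right = [90, 60, 50, 20, 0]
left = [10, 40, 50, 80, 0]
rfd = fork_directionality(right, left)
```

A score of +1 would mean all forks move rightward and -1 all leftward. The ascending, descending and flat segments that the abstract assigns to initiation, termination and unidirectional fork movement are segments of a profile of this kind, which the authors detect with a hidden Markov model rather than by inspection.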
6
Miller MB, Huang AY, Kim J, Zhou Z, Kirkham SL, Maury EA, Ziegenfuss JS, Reed HC, Neil JE, Rento L, Ryu SC, Ma CC, Luquette LJ, Ames HM, Oakley DH, Frosch MP, Hyman BT, Lodato MA, Lee EA, Walsh CA. Somatic genomic changes in single Alzheimer's disease neurons. Nature 2022; 604:714-722. [PMID: 35444284 PMCID: PMC9357465 DOI: 10.1038/s41586-022-04640-1] [Received: 06/12/2020] [Accepted: 03/14/2022] [Indexed: 02/02/2023]
Abstract
Dementia in Alzheimer's disease progresses alongside neurodegeneration [1-4], but the specific events that cause neuronal dysfunction and death remain poorly understood. During normal ageing, neurons progressively accumulate somatic mutations [5] at rates similar to those of dividing cells [6,7], which suggests that genetic factors, environmental exposures or disease states might influence this accumulation [5]. Here we analysed single-cell whole-genome sequencing data from 319 neurons from the prefrontal cortex and hippocampus of individuals with Alzheimer's disease and neurotypical control individuals. We found that somatic DNA alterations increase in individuals with Alzheimer's disease, with distinct molecular patterns. Normal neurons accumulate mutations primarily in an age-related pattern (signature A), which closely resembles 'clock-like' mutational signatures that have been previously described in healthy and cancerous cells [6-10]. In neurons affected by Alzheimer's disease, additional DNA alterations are driven by distinct processes (signature C) that highlight C>A and other specific nucleotide changes. These changes potentially implicate nucleotide oxidation [4,11], which we show is increased in Alzheimer's-disease-affected neurons in situ. Expressed genes exhibit signature-specific damage, and mutations show a transcriptional strand bias, which suggests that transcription-coupled nucleotide excision repair has a role in the generation of mutations. The alterations in Alzheimer's disease affect coding exons and are predicted to create dysfunctional genetic knockout cells and proteostatic stress. Our results suggest that known pathogenic mechanisms in Alzheimer's disease may lead to genomic damage to neurons that can progressively impair function. The aberrant accumulation of DNA alterations in neurodegeneration provides insight into the cascade of molecular and cellular events that occurs in the development of Alzheimer's disease.
Affiliation(s)
- Michael B Miller
  - Division of Neuropathology, Department of Pathology, Brigham and Women's Hospital, Harvard Medical School, Boston, MA, USA
  - Division of Genetics and Genomics, Manton Center for Orphan Diseases, Boston Children's Hospital, Boston, MA, USA
  - Broad Institute of MIT and Harvard, Cambridge, MA, USA
  - Department of Pediatrics, Harvard Medical School, Boston, MA, USA
- August Yue Huang
  - Division of Genetics and Genomics, Manton Center for Orphan Diseases, Boston Children's Hospital, Boston, MA, USA
  - Broad Institute of MIT and Harvard, Cambridge, MA, USA
  - Department of Pediatrics, Harvard Medical School, Boston, MA, USA
- Junho Kim
  - Division of Genetics and Genomics, Manton Center for Orphan Diseases, Boston Children's Hospital, Boston, MA, USA
  - Broad Institute of MIT and Harvard, Cambridge, MA, USA
  - Department of Pediatrics, Harvard Medical School, Boston, MA, USA
  - Department of Biological Sciences, Sungkyunkwan University, Suwon, South Korea
- Zinan Zhou
  - Division of Genetics and Genomics, Manton Center for Orphan Diseases, Boston Children's Hospital, Boston, MA, USA
  - Department of Pediatrics, Harvard Medical School, Boston, MA, USA
- Samantha L Kirkham
  - Division of Genetics and Genomics, Manton Center for Orphan Diseases, Boston Children's Hospital, Boston, MA, USA
  - Department of Pediatrics, Harvard Medical School, Boston, MA, USA
- Eduardo A Maury
  - Division of Genetics and Genomics, Manton Center for Orphan Diseases, Boston Children's Hospital, Boston, MA, USA
  - Broad Institute of MIT and Harvard, Cambridge, MA, USA
  - Department of Pediatrics, Harvard Medical School, Boston, MA, USA
  - Bioinformatics and Integrative Genomics Program, Harvard-MIT MD-PhD Program, Harvard Medical School, Boston, MA, USA
- Jennifer S Ziegenfuss
  - Department of Molecular, Cell and Cancer Biology, University of Massachusetts Chan Medical School, Worcester, MA, USA
- Hannah C Reed
  - Division of Genetics and Genomics, Manton Center for Orphan Diseases, Boston Children's Hospital, Boston, MA, USA
  - Department of Pediatrics, Harvard Medical School, Boston, MA, USA
  - Allegheny College, Meadville, PA, USA
- Jennifer E Neil
  - Division of Genetics and Genomics, Manton Center for Orphan Diseases, Boston Children's Hospital, Boston, MA, USA
  - Department of Pediatrics, Harvard Medical School, Boston, MA, USA
  - Howard Hughes Medical Institute, Boston, MA, USA
- Lariza Rento
  - Division of Genetics and Genomics, Manton Center for Orphan Diseases, Boston Children's Hospital, Boston, MA, USA
  - Department of Pediatrics, Harvard Medical School, Boston, MA, USA
  - Howard Hughes Medical Institute, Boston, MA, USA
- Steven C Ryu
  - Division of Genetics and Genomics, Manton Center for Orphan Diseases, Boston Children's Hospital, Boston, MA, USA
  - Department of Pediatrics, Harvard Medical School, Boston, MA, USA
- Chanthia C Ma
  - Division of Genetics and Genomics, Manton Center for Orphan Diseases, Boston Children's Hospital, Boston, MA, USA
  - Department of Pediatrics, Harvard Medical School, Boston, MA, USA
- Lovelace J Luquette
  - Department of Biomedical Informatics, Harvard Medical School, Boston, MA, USA
- Heather M Ames
  - Department of Pathology, University of Maryland School of Medicine, Baltimore, MD, USA
- Derek H Oakley
  - Department of Pathology, Harvard Medical School, Massachusetts General Hospital, Boston, MA, USA
- Matthew P Frosch
  - Department of Pathology, Harvard Medical School, Massachusetts General Hospital, Boston, MA, USA
  - Department of Neurology, Harvard Medical School, Massachusetts General Hospital, Boston, MA, USA
- Bradley T Hyman
  - Department of Neurology, Harvard Medical School, Massachusetts General Hospital, Boston, MA, USA
- Michael A Lodato
  - Division of Genetics and Genomics, Manton Center for Orphan Diseases, Boston Children's Hospital, Boston, MA, USA
  - Department of Pediatrics, Harvard Medical School, Boston, MA, USA
  - Department of Molecular, Cell and Cancer Biology, University of Massachusetts Chan Medical School, Worcester, MA, USA
- Eunjung Alice Lee
  - Division of Genetics and Genomics, Manton Center for Orphan Diseases, Boston Children's Hospital, Boston, MA, USA
  - Broad Institute of MIT and Harvard, Cambridge, MA, USA
  - Department of Pediatrics, Harvard Medical School, Boston, MA, USA
- Christopher A Walsh
  - Division of Genetics and Genomics, Manton Center for Orphan Diseases, Boston Children's Hospital, Boston, MA, USA
  - Broad Institute of MIT and Harvard, Cambridge, MA, USA
  - Department of Pediatrics, Harvard Medical School, Boston, MA, USA
  - Howard Hughes Medical Institute, Boston, MA, USA
  - Department of Neurology, Harvard Medical School, Boston, MA, USA
7
Khandia R, Ali Khan A, Alexiou A, Povetkin SN, Nikolaevna VM. Codon Usage Analysis of Pro-Apoptotic Bim Gene Isoforms. J Alzheimers Dis 2022; 86:1711-1725. [DOI: 10.3233/jad-215691] [Indexed: 12/23/2022]
Abstract
Background: Bim is a Bcl-2 homology 3 (BH3)-only protein, a member of a group of pro-apoptotic proteins involved in physiological and pathological conditions. Both the overexpression and under-expression of Bim protein are associated with disease, and various isoforms of Bim protein are present with differential apoptotic potential. Objective: The present study attempted to envisage the association of various molecular signatures with the codon choices of Bim isoforms. Methods: Molecular signatures like composition, codon usage, nucleotide skews, the free energy of mRNA transcript, physical properties of proteins, codon adaptation index, relative synonymous codon usage, and dinucleotide odds ratio were determined and analyzed for their associations with codon choices of the Bim gene. Results: Skew analysis of the Bim gene indicated a preference for C over G, A, and T, and a preference for G over T and A. An increase in C content at the first and third codon position increased gene expression while it decreased at the second codon position. Compositional constraints on nucleotide C at all three codon positions affected gene expression. The analysis revealed an exceptionally high usage of CpC dinucleotide in all the envisaged 31 isoforms of Bim. We correlated it with the requirement of rapid demethylation machinery to fine-tune Bim gene expression. Also, mutational pressure played a dominant role in shaping codon usage bias in Bim isoforms. Conclusion: An exceptionally high usage of CpC dinucleotide in all the envisaged 31 isoforms of Bim indicates a high-order selectional force to fine-tune Bim gene expression.
Affiliation(s)
- Rekha Khandia
  - Department of Biochemistry and Genetics, Barkatullah University, Bhopal, India
- Azmat Ali Khan
  - Pharmaceutical Biotechnology Laboratory, Department of Pharmaceutical Chemistry, College of Pharmacy, King Saud University, Riyadh, Saudi Arabia
- Athanasios Alexiou
  - Novel Global Community Educational Foundation, Australia & AFNP Med, Austria
8
St Germain C, Zhao H, Sinha V, Sanz LA, Chédin F, Barlow J. OUP accepted manuscript. Nucleic Acids Res 2022; 50:2051-2073. [PMID: 35100392 PMCID: PMC8887484 DOI: 10.1093/nar/gkac035] [Received: 04/05/2021] [Revised: 01/05/2022] [Accepted: 01/14/2022] [Indexed: 11/13/2022]
Abstract
Conflicts between transcription and replication machinery are a potent source of replication stress and genome instability; however, no technique currently exists to identify endogenous genomic locations prone to transcription–replication interactions. Here, we report a novel method to identify genomic loci prone to transcription–replication interactions termed transcription–replication immunoprecipitation on nascent DNA sequencing, TRIPn-Seq. TRIPn-Seq employs the sequential immunoprecipitation of RNA polymerase 2 phosphorylated at serine 5 (RNAP2s5) followed by enrichment of nascent DNA previously labeled with bromodeoxyuridine. Using TRIPn-Seq, we mapped 1009 unique transcription–replication interactions (TRIs) in mouse primary B cells characterized by a bimodal pattern of RNAP2s5, bidirectional transcription, an enrichment of RNA:DNA hybrids, and a high probability of forming G-quadruplexes. TRIs are highly enriched at transcription start sites and map to early replicating regions. TRIs exhibit enhanced Replication Protein A association and TRI-associated genes exhibit higher replication fork termination than control transcription start sites, two marks of replication stress. TRIs colocalize with double-strand DNA breaks, are enriched for deletions, and accumulate mutations in tumors. We propose that replication stress at TRIs induces mutations potentially contributing to age-related disease, as well as tumor formation and development.
Affiliation(s)
- Commodore P St Germain
  - Department of Microbiology and Molecular Genetics, University of California Davis, One Shields Avenue, Davis, CA 95616, USA
  - School of Mathematics and Science, Solano Community College, 4000 Suisun Valley Road, Fairfield, CA 94534, USA
- Hongchang Zhao
  - Department of Microbiology and Molecular Genetics, University of California Davis, One Shields Avenue, Davis, CA 95616, USA
- Vrishti Sinha
  - Department of Microbiology and Molecular Genetics, University of California Davis, One Shields Avenue, Davis, CA 95616, USA
- Lionel A Sanz
  - Department of Molecular and Cellular Biology, University of California Davis, One Shields Avenue, Davis, CA 95616, USA
- Frédéric Chédin
  - Department of Molecular and Cellular Biology, University of California Davis, One Shields Avenue, Davis, CA 95616, USA
- Jacqueline H Barlow
  - To whom correspondence should be addressed. Tel: +1 530 752 9529; Fax: +1 530 752 9014;
9
Rosendahl Huber A, Van Hoeck A, Van Boxtel R. The Mutagenic Impact of Environmental Exposures in Human Cells and Cancer: Imprints Through Time. Front Genet 2021; 12:760039. [PMID: 34745228 PMCID: PMC8565797 DOI: 10.3389/fgene.2021.760039] [Received: 08/17/2021] [Accepted: 10/05/2021] [Indexed: 12/25/2022]
Abstract
During life, the DNA of our cells is continuously exposed to external damaging processes. Despite the activity of various repair mechanisms, DNA damage eventually results in the accumulation of mutations in the genomes of our cells. Oncogenic mutations are at the root of carcinogenesis, and carcinogenic agents are often highly mutagenic. Over the past decade, whole genome sequencing data of healthy and tumor tissues have revealed how cells in our body gradually accumulate mutations because of exposure to various mutagenic processes. Dissection of mutation profiles based on the type and context specificities of the altered bases has revealed a variety of signatures that reflect past exposure to environmental mutagens, ranging from chemotherapeutic drugs to genotoxic gut bacteria. In this review, we discuss the latest knowledge on somatic mutation accumulation in human cells, and how environmental mutagenic factors further shape the mutation landscapes of tissues. In addition, not all carcinogenic agents induce mutations, which may point to alternative tumor-promoting mechanisms, such as altered clonal selection dynamics. In short, we provide an overview of how environmental factors induce mutations in the DNA of our healthy cells and how this contributes to carcinogenesis. A better understanding of how environmental mutagens shape the genomes of our cells can help to identify potential preventable causes of cancer.
Affiliation(s)
- Axel Rosendahl Huber
  - Princess Máxima Center for Pediatric Oncology, Utrecht, Netherlands
  - Oncode Institute, Utrecht, Netherlands
- Arne Van Hoeck
  - Oncode Institute, Utrecht, Netherlands
  - Center for Molecular Medicine, University Medical Centre Utrecht, Utrecht, Netherlands
- Ruben Van Boxtel
  - Princess Máxima Center for Pediatric Oncology, Utrecht, Netherlands
  - Oncode Institute, Utrecht, Netherlands
10
Lal A, Liu K, Tibshirani R, Sidow A, Ramazzotti D. De novo mutational signature discovery in tumor genomes using SparseSignatures. PLoS Comput Biol 2021; 17:e1009119. [PMID: 34181655 PMCID: PMC8270462 DOI: 10.1371/journal.pcbi.1009119] [Received: 07/08/2020] [Revised: 07/09/2021] [Accepted: 05/27/2021] [Indexed: 11/18/2022]
Abstract
Cancer is the result of mutagenic processes that can be inferred from tumor genomes by analyzing rate spectra of point mutations, or "mutational signatures". Here we present SparseSignatures, a novel framework to extract signatures from somatic point mutation data. Our approach incorporates a user-specified background signature, employs regularization to reduce noise in non-background signatures, uses cross-validation to identify the number of signatures, and is scalable to large datasets. We show that SparseSignatures outperforms current state-of-the-art methods on simulated data using a variety of standard metrics. We then apply SparseSignatures to whole genome sequences of pancreatic and breast tumors, discovering well-differentiated signatures that are linked to known mutagenic mechanisms and are strongly associated with patient clinical features.
Affiliation(s)
- Avantika Lal
  - Department of Pathology, Stanford University, Stanford, California, United States of America
- Keli Liu
  - Department of Statistics, Stanford University, Stanford, California, United States of America
- Robert Tibshirani
  - Department of Statistics, Stanford University, Stanford, California, United States of America
  - Department of Biomedical Data Science, Stanford University, Stanford, California, United States of America
- Arend Sidow
  - Department of Pathology, Stanford University, Stanford, California, United States of America
  - Department of Genetics, Stanford University, Stanford, California, United States of America
- Daniele Ramazzotti
  - Department of Pathology, Stanford University, Stanford, California, United States of America
  - Department of Computer Science, Stanford University, Stanford, California, United States of America
11
Manrubia S, Cuesta JA, Aguirre J, Ahnert SE, Altenberg L, Cano AV, Catalán P, Diaz-Uriarte R, Elena SF, García-Martín JA, Hogeweg P, Khatri BS, Krug J, Louis AA, Martin NS, Payne JL, Tarnowski MJ, Weiß M. From genotypes to organisms: State-of-the-art and perspectives of a cornerstone in evolutionary dynamics. Phys Life Rev 2021; 38:55-106. [PMID: 34088608 DOI: 10.1016/j.plrev.2021.03.004] [Received: 11/24/2020] [Accepted: 03/01/2021] [Indexed: 12/21/2022]
Abstract
Understanding how genotypes map onto phenotypes, fitness, and eventually organisms is arguably the next major missing piece in a fully predictive theory of evolution. We refer to this generally as the problem of the genotype-phenotype map. Though we are still far from achieving a complete picture of these relationships, our current understanding of simpler questions, such as the structure induced in the space of genotypes by sequences mapped to molecular structures, has revealed important facts that deeply affect the dynamical description of evolutionary processes. Empirical evidence supporting the fundamental relevance of features such as phenotypic bias is mounting as well, while the synthesis of conceptual and experimental progress leads to questioning current assumptions on the nature of evolutionary dynamics, with cancer progression models or synthetic biology approaches being notable examples. This work delves with a critical and constructive attitude into our current knowledge of how genotypes map onto molecular phenotypes and organismal functions, and discusses theoretical and empirical avenues to broaden and improve this comprehension. As a final goal, this community should aim at deriving an updated picture of evolutionary processes soundly relying on the structural properties of genotype spaces, as revealed by modern techniques of molecular and functional analysis.
Affiliation(s)
- Susanna Manrubia
- Department of Systems Biology, Centro Nacional de Biotecnología (CSIC), Madrid, Spain; Grupo Interdisciplinar de Sistemas Complejos (GISC), Madrid, Spain.
- José A Cuesta
- Grupo Interdisciplinar de Sistemas Complejos (GISC), Madrid, Spain; Departamento de Matemáticas, Universidad Carlos III de Madrid, Leganés, Spain; Instituto de Biocomputación y Física de Sistemas Complejos (BiFi), Universidad de Zaragoza, Spain; UC3M-Santander Big Data Institute (IBiDat), Getafe, Madrid, Spain
- Jacobo Aguirre
- Grupo Interdisciplinar de Sistemas Complejos (GISC), Madrid, Spain; Centro de Astrobiología, CSIC-INTA, ctra. de Ajalvir km 4, 28850 Torrejón de Ardoz, Madrid, Spain
- Sebastian E Ahnert
- Department of Chemical Engineering and Biotechnology, University of Cambridge, Philippa Fawcett Drive, Cambridge CB3 0AS, UK; The Alan Turing Institute, British Library, 96 Euston Road, London NW1 2DB, UK
- Alejandro V Cano
- Institute of Integrative Biology, ETH Zurich, Zurich, Switzerland; Swiss Institute of Bioinformatics, Lausanne, Switzerland
- Pablo Catalán
- Grupo Interdisciplinar de Sistemas Complejos (GISC), Madrid, Spain; Departamento de Matemáticas, Universidad Carlos III de Madrid, Leganés, Spain
- Ramon Diaz-Uriarte
- Department of Biochemistry, Universidad Autónoma de Madrid, Madrid, Spain; Instituto de Investigaciones Biomédicas "Alberto Sols" (UAM-CSIC), Madrid, Spain
- Santiago F Elena
- Instituto de Biología Integrativa de Sistemas, I(2)SysBio (CSIC-UV), València, Spain; The Santa Fe Institute, Santa Fe, NM, USA
- Paulien Hogeweg
- Theoretical Biology and Bioinformatics Group, Utrecht University, the Netherlands
- Bhavin S Khatri
- The Francis Crick Institute, London, UK; Department of Life Sciences, Imperial College London, London, UK
- Joachim Krug
- Institute for Biological Physics, University of Cologne, Köln, Germany
- Ard A Louis
- Rudolf Peierls Centre for Theoretical Physics, University of Oxford, Oxford, UK
- Nora S Martin
- Theory of Condensed Matter Group, Cavendish Laboratory, University of Cambridge, Cambridge, UK; Sainsbury Laboratory, University of Cambridge, Cambridge, UK
- Joshua L Payne
- Institute of Integrative Biology, ETH Zurich, Zurich, Switzerland; Swiss Institute of Bioinformatics, Lausanne, Switzerland
- Marcel Weiß
- Theory of Condensed Matter Group, Cavendish Laboratory, University of Cambridge, Cambridge, UK; Sainsbury Laboratory, University of Cambridge, Cambridge, UK
12
Mordstein C, Cano L, Morales AC, Young B, Ho AT, Rice AM, Liss M, Hurst LD, Kudla G. Transcription, mRNA export and immune evasion shape the codon usage of viruses. Genome Biol Evol 2021; 13:6275682. [PMID: 33988683 PMCID: PMC8410142 DOI: 10.1093/gbe/evab106] [Citation(s) in RCA: 21] [Impact Index Per Article: 7.0] [Accepted: 05/10/2021] [Indexed: 12/15/2022] Open
Abstract
The nucleotide composition, dinucleotide composition, and codon usage of many viruses differ from those of their hosts. These differences arise because viruses are subject to unique mutation and selection pressures that do not apply to host genomes; however, the molecular mechanisms that underlie these evolutionary forces are unclear. Here, we analysed the patterns of codon usage in 1,520 vertebrate-infecting viruses, focusing on parameters known to be under selection and associated with gene regulation. We find that GC content, dinucleotide content, and splicing and m6A modification-related sequence motifs are associated with the type of genetic material (DNA or RNA), strandedness, and replication compartment of viruses. In an experimental follow-up, we find that the effects of GC content on gene expression depend on whether the genetic material is delivered to the cell as DNA or mRNA, whether it is transcribed by endogenous or exogenous RNA polymerase, and whether transcription takes place in the nucleus or cytoplasm. Our results suggest that viral codon usage cannot be explained by a simple adaptation to the codon usage of the host; instead, it reflects the combination of multiple selective and mutational pressures, including the need for efficient transcription, export, and immune evasion.
Affiliation(s)
- Christine Mordstein
- MRC Human Genetics Unit, Institute for Genetics and Molecular Medicine, The University of Edinburgh, Edinburgh, UK; The Milner Centre for Evolution, Department of Biology and Biochemistry, University of Bath, Bath, BA2 7AY, UK
- Laura Cano
- MRC Human Genetics Unit, Institute for Genetics and Molecular Medicine, The University of Edinburgh, Edinburgh, UK
- Atahualpa Castillo Morales
- The Milner Centre for Evolution, Department of Biology and Biochemistry, University of Bath, Bath, BA2 7AY, UK
- Bethan Young
- MRC Human Genetics Unit, Institute for Genetics and Molecular Medicine, The University of Edinburgh, Edinburgh, UK; The Milner Centre for Evolution, Department of Biology and Biochemistry, University of Bath, Bath, BA2 7AY, UK
- Alexander T Ho
- The Milner Centre for Evolution, Department of Biology and Biochemistry, University of Bath, Bath, BA2 7AY, UK
- Alan M Rice
- The Milner Centre for Evolution, Department of Biology and Biochemistry, University of Bath, Bath, BA2 7AY, UK
- Michael Liss
- Thermo Fisher Scientific, GENEART GmbH, Regensburg, Germany
- Laurence D Hurst
- The Milner Centre for Evolution, Department of Biology and Biochemistry, University of Bath, Bath, BA2 7AY, UK
- Grzegorz Kudla
- MRC Human Genetics Unit, Institute for Genetics and Molecular Medicine, The University of Edinburgh, Edinburgh, UK
13
Roberts BS, Partridge EC, Moyers BA, Agarwal V, Newberry KM, Martin BK, Shendure J, Myers RM, Cooper GM. Genome-wide strand asymmetry in massively parallel reporter activity favors genic strands. Genome Res 2021; 31:866-876. [PMID: 33879525 PMCID: PMC8092006 DOI: 10.1101/gr.270751.120] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 08/26/2020] [Accepted: 02/18/2021] [Indexed: 11/24/2022]
Abstract
Massively parallel reporter assays (MPRAs) are useful tools to characterize regulatory elements in human genomes. An aspect of MPRAs that is not typically the focus of analysis is their intrinsic ability to differentiate activity levels for a given sequence element when placed in both of its possible orientations relative to the reporter construct. Here, we describe pervasive strand asymmetry of MPRA signals in data sets from multiple reporter configurations in both published and newly reported data. These effects are reproducible across different cell types and in different treatments within a cell type and are observed both within and outside of annotated regulatory elements. From elements in gene bodies, MPRA strand asymmetry favors the sense strand, suggesting that function related to endogenous transcription is driving the phenomenon. Similarly, we find that within Alu mobile element insertions, strand asymmetry favors the transcribed strand of the ancestral retrotransposon. The effect is consistent across the multiplicity of Alu elements in human genomes and is more pronounced in less diverged Alu elements. We find sequence features driving MPRA strand asymmetry and show its prediction from sequence alone. We see some evidence for RNA stabilization and transcriptional activation mechanisms and hypothesize that the effect is driven by natural selection favoring efficient transcription. Our results indicate that strand asymmetry is a pervasive and reproducible feature in MPRA data. More importantly, the fact that MPRA asymmetry favors naturally transcribed strands suggests that it stems from preserved biological functions that have a substantial, global impact on gene and genome evolution.
Affiliation(s)
- Brian S Roberts
- HudsonAlpha Institute for Biotechnology, Huntsville, Alabama 35806, USA; Department of Biological Sciences, The University of Alabama in Huntsville, Huntsville, Alabama 35899, USA
- Bryan A Moyers
- HudsonAlpha Institute for Biotechnology, Huntsville, Alabama 35806, USA
- Vikram Agarwal
- Calico Life Sciences LLC, South San Francisco, California 94080, USA
- Beth K Martin
- Department of Genome Sciences, University of Washington, Seattle, Washington 98195, USA
- Jay Shendure
- Department of Genome Sciences, University of Washington, Seattle, Washington 98195, USA; Howard Hughes Medical Institute, Seattle, Washington 98195, USA; Brotman Baty Institute for Precision Medicine, University of Washington, Seattle, Washington 98195, USA
- Richard M Myers
- HudsonAlpha Institute for Biotechnology, Huntsville, Alabama 35806, USA
- Gregory M Cooper
- HudsonAlpha Institute for Biotechnology, Huntsville, Alabama 35806, USA
14
Kumar N, Kaushik R, Tennakoon C, Uversky VN, Longhi S, Zhang KYJ, Bhatia S. Insights into the evolutionary forces that shape the codon usage in the viral genome segments encoding intrinsically disordered protein regions. Brief Bioinform 2021; 22:6231751. [PMID: 33866372 DOI: 10.1093/bib/bbab145] [Citation(s) in RCA: 6] [Impact Index Per Article: 2.0] [Received: 01/08/2021] [Revised: 03/17/2021] [Accepted: 03/26/2021] [Indexed: 12/22/2022] Open
Abstract
Intrinsically disordered regions/proteins (IDRs) are abundant across all the domains of life, where they perform important regulatory roles and supplement the biological functions of structured proteins/regions (SRs). Despite the multifunctionality of IDRs, several questions about the evolution of viral genomic regions encoding IDRs in diverse viral proteins remain unanswered. To fill this gap, we benchmarked the findings of two of the most widely used and reliable intrinsic disorder prediction algorithms (IUPred2A and ESpritz) against a dataset of 6108 reference viral proteomes to unravel the multifaceted evolutionary forces that shape the codon usage in the viral genomic regions encoding IDRs and SRs. We found persuasive evidence that natural selection predominantly governs the evolution of codon usage in regions encoding IDRs in most of the viruses. In addition, we confirm not only that codon usage in regions encoding IDRs is less optimized for the protein synthesis machinery (transfer RNA pool) of their host than that in regions encoding SRs, but also that the selective constraints imposed by codon bias sustain this reduced optimization in IDRs. Our analysis also establishes that IDRs in viruses are likely to tolerate more translational errors than SRs. All these findings hold true irrespective of the disorder prediction algorithm used to classify IDRs. In conclusion, our study offers a novel perspective on the evolution of viral IDRs and the evolutionary adaptability to multiple taxonomically divergent hosts.
Affiliation(s)
- Naveen Kumar
- Diagnostic & Vaccine Group, ICAR-National Institute of High Security Animal Diseases, Bhopal 462022, India
- Rahul Kaushik
- Laboratory for Structural Bioinformatics, Center for Biosystems Dynamics Research, RIKEN, 1-7-22 Suehiro, Yokohama, Kanagawa 230-0045, Japan
- Vladimir N Uversky
- Department of Molecular Medicine, Morsani College of Medicine, University of South Florida, Tampa, FL 33612, USA; Institute for Biological Instrumentation of the Russian Academy of Sciences, Federal Research Center 'Pushchino Scientific Center for Biological Research of the Russian Academy of Sciences', Moscow region, Pushchino 142290, Russia
- Sonia Longhi
- Aix-Marseille Université and CNRS, Laboratoire Architecture et Fonction des Macromolecules Biologiques (AFMB), UMR 7257, Marseille, France
- Kam Y J Zhang
- Laboratory for Structural Bioinformatics, Center for Biosystems Dynamics Research, RIKEN, 1-7-22 Suehiro, Yokohama, Kanagawa 230-0045, Japan
- Sandeep Bhatia
- Diagnostic & Vaccine Group, ICAR-National Institute of High Security Animal Diseases, Bhopal 462022, India
15
Yan Y, Li Z, Li Y, Wu Z, Yang R. Correlated Evolution of Large DNA Fragments in the 3D Genome of Arabidopsis thaliana. Mol Biol Evol 2021; 37:1621-1636. [PMID: 32044988 DOI: 10.1093/molbev/msaa031] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 11/13/2022] Open
Abstract
In eukaryotes, the three-dimensional (3D) conformation of the genome is far from random, and this nonrandom chromatin organization is strongly correlated with gene expression and protein function, which are two critical determinants of the selective constraints and evolutionary rates of genes. However, whether genes and other elements that are located close to each other in the 3D genome evolve in a coordinated way has not been investigated in any organism. To address this question, we constructed chromatin interaction networks (CINs) in Arabidopsis thaliana based on high-throughput chromosome conformation capture data and demonstrated that adjacent large DNA fragments in the CIN indeed exhibit more similar levels of polymorphism and evolutionary rates than random fragment pairs. Using simulations that account for the linear distance between fragments, we proved that the 3D chromosomal organization plays a role in the observed correlated evolution. Spatially interacting fragments also exhibit more similar mutation rates and functional constraints in both coding and noncoding regions than the random expectations, indicating that the correlated evolution between 3D neighbors is a result of combined evolutionary forces. A collection of 39 genomic and epigenomic features can explain much of the variance in genetic diversity and evolutionary rates across the genome. Moreover, features that have a greater effect on the evolution of regional sequences tend to show higher similarity between neighboring fragments in the CIN, suggesting a pivotal role of epigenetic modifications and chromatin organization in determining the correlated evolution of large DNA fragments in the 3D genome.
Affiliation(s)
- Yubin Yan
- College of Life Sciences, Northwest A&F University, Yangling, Shaanxi, China
- Zhaohong Li
- College of Life Sciences, Northwest A&F University, Yangling, Shaanxi, China
- Ye Li
- College of Life Sciences, Northwest A&F University, Yangling, Shaanxi, China
- Zefeng Wu
- College of Life Sciences, Northwest A&F University, Yangling, Shaanxi, China
- Ruolin Yang
- College of Life Sciences, Northwest A&F University, Yangling, Shaanxi, China
16
Heilbrun EE, Merav M, Adar S. Exons and introns exhibit transcriptional strand asymmetry of dinucleotide distribution, damage formation and DNA repair. NAR Genom Bioinform 2021; 3:lqab020. [PMID: 33817640 PMCID: PMC8002178 DOI: 10.1093/nargab/lqab020] [Citation(s) in RCA: 3] [Impact Index Per Article: 1.0] [Received: 11/16/2020] [Revised: 02/24/2021] [Accepted: 03/22/2021] [Indexed: 12/29/2022] Open
Abstract
Recent cancer sequencing efforts have uncovered asymmetry in DNA damage-induced mutagenesis between the transcribed and non-transcribed strands of genes. Here, we investigate the major type of damage induced by ultraviolet (UV) radiation, the cyclobutane pyrimidine dimers (CPDs), which are formed primarily in TT dinucleotides. We reveal that a transcriptional asymmetry already exists at the level of TT dinucleotide frequency and therefore also in CPD damage formation. This asymmetry is conserved in vertebrates and invertebrates and is completely reversed between introns and exons. We show that the asymmetry in introns is linked to the transcription process itself and is also found in enhancer elements. In contrast, the asymmetry in exons is not correlated with transcription and is associated with codon usage preferences. By reanalyzing nucleotide excision repair and normalizing repair to the underlying TT frequencies, we show that repair of CPDs is more efficient in exons than in introns, contributing to the maintenance and integrity of coding regions. Our results highlight the importance of considering the primary sequence of the DNA in determining DNA damage sensitivity and mutagenic potential.
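As an illustration of what a TT-frequency strand asymmetry means, the sketch below counts TT dinucleotides on a given strand and on its complement (where each TT appears as AA on the given strand) and reports a normalized difference. The function names and the score `(TT - AA) / (TT + AA)` are assumptions chosen for illustration; the abstract does not state how the authors quantified the asymmetry.

```python
def count_overlapping(seq, motif):
    """Count occurrences of motif in seq, allowing overlaps (TTT holds two TTs)."""
    width = len(motif)
    return sum(1 for i in range(len(seq) - width + 1) if seq[i:i + width] == motif)


def tt_strand_asymmetry(strand):
    """Hypothetical score in [-1, 1]; positive means more TT on the given strand."""
    strand = strand.upper()
    tt_here = count_overlapping(strand, "TT")
    # A TT on the complementary strand reads as AA on the given strand.
    tt_complement = count_overlapping(strand, "AA")
    total = tt_here + tt_complement
    if total == 0:
        return 0.0  # no TT on either strand, so no asymmetry to report
    return (tt_here - tt_complement) / total
```

Applied to the coding strand of an exon and of an intron from the same gene, scores of opposite sign would correspond to the reversal between introns and exons that the abstract reports.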
Affiliation(s)
- Elisheva E Heilbrun
- Department of Microbiology and Molecular Genetics, Institute for Medical Research Israel Canada, Faculty of Medicine, Hebrew University of Jerusalem, Ein Kerem, Jerusalem 91120, Israel
- May Merav
- Department of Microbiology and Molecular Genetics, Institute for Medical Research Israel Canada, Faculty of Medicine, Hebrew University of Jerusalem, Ein Kerem, Jerusalem 91120, Israel
- Sheera Adar
- Department of Microbiology and Molecular Genetics, Institute for Medical Research Israel Canada, Faculty of Medicine, Hebrew University of Jerusalem, Ein Kerem, Jerusalem 91120, Israel
17
Georgakopoulos-Soares I, Mouratidis I, Parada GE, Matharu N, Hemberg M, Ahituv N. Asymmetron: a toolkit for the identification of strand asymmetry patterns in biological sequences. Nucleic Acids Res 2021; 49:e4. [PMID: 33211865 PMCID: PMC7797064 DOI: 10.1093/nar/gkaa1052] [Citation(s) in RCA: 2] [Impact Index Per Article: 0.7] [Received: 08/19/2020] [Revised: 10/15/2020] [Accepted: 10/20/2020] [Indexed: 11/23/2022] Open
Abstract
DNA strand asymmetries can have a major effect on several biological functions, including replication, transcription and transcription factor binding. As such, DNA strand asymmetries and mutational strand bias can provide information about biological function. However, a versatile tool to explore this does not exist. Here, we present Asymmetron, a user-friendly computational tool that performs statistical analysis and visualizations for the evaluation of strand asymmetries. Asymmetron takes as input DNA features provided with strand annotation and outputs strand asymmetries for consecutive occurrences of a single DNA feature or between pairs of features. We illustrate the use of Asymmetron by identifying transcriptional and replicative strand asymmetries of germline structural variant breakpoints. We also show that the orientation of the binding sites of 45% of human transcription factors analyzed has a significant DNA strand bias in transcribed regions, which is also corroborated in ChIP-seq analyses and is likely associated with transcription. In summary, we provide a novel tool to assess DNA strand asymmetries and show how it can be used to derive new insights across a variety of biological disciplines.
Affiliation(s)
- Ilias Georgakopoulos-Soares
- Department of Bioengineering and Therapeutic Sciences, University of California San Francisco, San Francisco, CA, USA
- Institute for Human Genetics, University of California San Francisco, San Francisco, CA, USA
- Ioannis Mouratidis
- Aristotle University of Thessaloniki, Department of Mathematics, Thessaloniki, GR, Greece
- Guillermo E Parada
- Wellcome Sanger Institute, Wellcome Genome Campus, Hinxton, CB10 1SA, UK
- Wellcome Trust Cancer Research UK Gurdon Institute, University of Cambridge, Tennis Court Road, Cambridge CB2 1QN, UK
- Navneet Matharu
- Department of Bioengineering and Therapeutic Sciences, University of California San Francisco, San Francisco, CA, USA
- Institute for Human Genetics, University of California San Francisco, San Francisco, CA, USA
- Innovative Genomics Institute, University of California San Francisco, San Francisco, CA, USA
- Martin Hemberg
- Wellcome Sanger Institute, Wellcome Genome Campus, Hinxton, CB10 1SA, UK
- Wellcome Trust Cancer Research UK Gurdon Institute, University of Cambridge, Tennis Court Road, Cambridge CB2 1QN, UK
- Nadav Ahituv
- Department of Bioengineering and Therapeutic Sciences, University of California San Francisco, San Francisco, CA, USA
- Institute for Human Genetics, University of California San Francisco, San Francisco, CA, USA
18
Deb B, Uddin A, Chakraborty S. Composition, codon usage pattern, protein properties, and influencing factors in the genomes of members of the family Anelloviridae. Arch Virol 2021; 166:461-474. [PMID: 33392821 PMCID: PMC7779081 DOI: 10.1007/s00705-020-04890-2] [Citation(s) in RCA: 13] [Impact Index Per Article: 4.3] [Received: 04/01/2020] [Accepted: 10/02/2020] [Indexed: 01/31/2023]
Abstract
The present study was carried out on 62 genome sequences of members of the family Anelloviridae, as there have been no reports of genome analysis of these DNA viruses using a bioinformatics approach. The genes were found to be rich in AC content with low codon usage bias (CUB). Relative synonymous codon usage (RSCU) values identified the preferred codons for each amino acid in the family. The codon AGA was overrepresented, while the codons TCG, TTG, CGG, CGT, ACG, GCG and GAT were underrepresented in all of the genomes. A significant correlation was found between the effective number of codons (ENC) and base constraints, indicating that compositional properties might have influenced the CUB. A highly significant correlation was observed between the overall base content and the base content at the third codon position, indicating that mutations might have affected the CUB. A highly significant positive correlation was observed between GC12 and GC3 (r = 0.904, p < 0.01), which indicated that directional mutation pressure influenced all three codon positions. A neutrality plot revealed that the contribution of mutation and natural selection in determining the CUB was 58.6% and 41.4%, respectively.
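The RSCU values mentioned in this abstract follow a standard definition: the observed count of a codon divided by the count expected if all synonymous codons for that amino acid were used equally. A minimal sketch is given below, assuming the standard genetic code and an in-frame coding sequence; the function and variable names are illustrative and not taken from the study.

```python
from collections import Counter, defaultdict

# Standard genetic code, codons enumerated in T, C, A, G order at each position.
BASES = "TCAG"
AMINO = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODONS = [a + b + c for a in BASES for b in BASES for c in BASES]
CODON_TABLE = dict(zip(CODONS, AMINO))


def rscu(cds):
    """Return {codon: RSCU} for every amino acid observed in the sequence."""
    cds = cds.upper().replace("U", "T")
    usable = len(cds) - len(cds) % 3  # ignore a trailing partial codon
    counts = Counter(cds[i:i + 3] for i in range(0, usable, 3))

    families = defaultdict(list)  # amino acid -> its synonymous codons
    for codon, amino_acid in CODON_TABLE.items():
        if amino_acid != "*":  # stop codons are left out
            families[amino_acid].append(codon)

    values = {}
    for family in families.values():
        total = sum(counts[codon] for codon in family)
        if total == 0:
            continue  # amino acid absent, so RSCU is undefined
        expected = total / len(family)  # count under equal synonymous usage
        for codon in family:
            values[codon] = counts[codon] / expected
    return values
```

For the toy sequence `AGAAGAAGACGT` (three AGA and one CGT, all arginine), the six arginine codons share four observations, so AGA scores 4.5 and CGT scores 1.5, while the unused arginine codons score 0. Values above 1 indicate a codon used more often than expected, which is the sense in which AGA is described as overrepresented.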
Affiliation(s)
- Bornali Deb
- Department of Biotechnology, Assam University, Silchar, Assam 788150 India
- Arif Uddin
- Department of Zoology, Moinul Hoque Choudhury Memorial Science College, Algapur, Hailakandi, Assam 788150 India
19
Barbhuiya PA, Uddin A, Chakraborty S. Codon usage pattern and evolutionary forces of mitochondrial ND genes among orders of class Amphibia. J Cell Physiol 2020; 236:2850-2868. [PMID: 32960450 DOI: 10.1002/jcp.30050] [Citation(s) in RCA: 4] [Impact Index Per Article: 1.0] [Received: 02/18/2020] [Revised: 08/07/2020] [Accepted: 08/31/2020] [Indexed: 12/18/2022]
Abstract
In this study, we used a bioinformatics approach to analyze the nucleotide composition and pattern of synonymous codon usage in mitochondrial ND genes in three amphibian groups, that is, orders Anura, Caudata, and Gymnophiona, to identify commonalities and differences in codon usage, as no such work has been reported yet. The high value of the effective number of codons revealed that the codon usage bias (CUB) was low in mitochondrial ND genes among the orders. Nucleotide composition analysis suggested that for each gene, the compositional features differed among Anura, Caudata, and Gymnophiona and the GC content was lower than the AT content. Furthermore, a highly significant difference (p < .05) in GC content was found in each gene among the orders. The heat map showed contrasting patterns of codon usage among different ND genes. The regression of GC12 on GC3 suggested a narrow range of GC3 distribution, and some points were located on the diagonal, indicating that both mutation pressure and natural selection might influence the CUB. Moreover, the slope of the regression line was less than 0.5 in all ND genes among orders, indicating that natural selection might have played the dominant role whereas mutation pressure played a minor role in shaping the CUB of ND genes across orders.
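The regression of GC12 on GC3 described here (a neutrality plot) can be sketched as follows. GC12 is the mean GC fraction at the first and second codon positions and GC3 is the GC fraction at the third. By the usual convention, which is assumed here rather than taken from the abstract, the least-squares slope is read as the relative contribution of mutation pressure and one minus the slope as that of selection. Function names are illustrative.

```python
def gc_by_position(cds):
    """Return (GC12, GC3) as fractions for an in-frame coding sequence."""
    cds = cds.upper()
    usable = len(cds) - len(cds) % 3  # ignore a trailing partial codon
    codons = [cds[i:i + 3] for i in range(0, usable, 3)]
    if not codons:
        raise ValueError("sequence has no complete codon")

    def gc_fraction(position):
        return sum(codon[position] in "GC" for codon in codons) / len(codons)

    gc12 = (gc_fraction(0) + gc_fraction(1)) / 2
    return gc12, gc_fraction(2)


def neutrality_slope(sequences):
    """Least-squares slope of GC12 (y) regressed on GC3 (x), one point per sequence."""
    points = [gc_by_position(seq) for seq in sequences]
    xs = [gc3 for _, gc3 in points]
    ys = [gc12 for gc12, _ in points]
    mean_x = sum(xs) / len(xs)
    mean_y = sum(ys) / len(ys)
    sxx = sum((x - mean_x) ** 2 for x in xs)
    if sxx == 0:
        raise ValueError("GC3 does not vary across sequences; slope is undefined")
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    return sxy / sxx
```

Under that convention, a slope below 0.5, as reported for the ND genes, attributes less than half of the codon usage bias to mutation pressure and the remainder to selection.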
Affiliation(s)
- Arif Uddin
- Department of Zoology, Moinul Hoque Choudhury Memorial Science College, Hailakandi, Assam, India
20
Subramanian H, Gatenby RA. Evolutionary advantage of anti-parallel strand orientation of duplex DNA. Sci Rep 2020; 10:9883. [PMID: 32555277 PMCID: PMC7303137 DOI: 10.1038/s41598-020-66705-3] [Citation(s) in RCA: 3] [Impact Index Per Article: 0.8] [Received: 01/01/2020] [Accepted: 05/22/2020] [Indexed: 11/09/2022] Open
Abstract
DNA in all living systems shares common properties that are remarkably well suited to its function, suggesting refinement by evolution. However, DNA also shares some counter-intuitive properties which confer no obvious benefit, such as strand directionality and anti-parallel strand orientation, which together result in the complicated lagging strand replication. The evolutionary dynamics that led to these properties of DNA remain unknown, but their universality suggests that they confer an as yet unknown selective advantage to DNA. In this article, we identify an evolutionary advantage of anti-parallel strand orientation of duplex DNA, within a given set of plausible premises. The advantage stems from the increased rate of replication, achieved by dividing the DNA into predictable, independently and simultaneously replicating segments, as opposed to sequentially replicating the entire DNA, thereby parallelizing the replication process. We show that anti-parallel strand orientation is essential for such a replicative organization of DNA, given our premises, the most important of which is the assumption of the presence of sequence-dependent asymmetric cooperativity in DNA.
Affiliation(s)
- Robert A Gatenby
- Integrated Mathematical Oncology Department, Cancer Biology and Evolution Program, H. Lee Moffitt Cancer Center and Research Institute, 12902, USF Magnolia Dr, Tampa, Florida, USA
21
Ueberham U, Arendt T. Genomic Indexing by Somatic Gene Recombination of mRNA/ncRNA - Does It Play a Role in Genomic Mosaicism, Memory Formation, and Alzheimer's Disease? Front Genet 2020; 11:370. [PMID: 32411177 PMCID: PMC7200996 DOI: 10.3389/fgene.2020.00370] [Citation(s) in RCA: 4] [Impact Index Per Article: 1.0] [Received: 01/06/2020] [Accepted: 03/25/2020] [Indexed: 12/26/2022] Open
Abstract
Recent evidence indicates that genomic individuality of neurons, characterized by DNA-content variation, is a common if not universal phenomenon in the human brain that occurs naturally but can also show aberrancies that have been linked to the pathomechanism of Alzheimer’s disease and related neurodegenerative disorders. Etiologically, this genomic mosaic has been suggested to arise from defects of cell cycle regulation that may occur either during brain development or in the mature brain after terminal differentiation of neurons. Here, we aim to draw attention towards another mechanism that can give rise to genomic individuality of neurons, with far-reaching consequences. This mechanism has its origin in the transcriptome rather than in replication defects of the genome, i.e., somatic gene recombination of RNA. We continue to develop the concept that somatic gene recombination of RNA provides a physiological process that, through integration of intronless mRNA/ncRNA into the genome, allows a particular functional state at the level of the individual neuron to be indexed. By insertion of defined RNAs in a somatic recombination process, the presence of specific mRNA transcripts within a definite temporal context can be “frozen” and can serve as an index that can be recalled at any later point in time. This allows information related to a specific neuronal state of differentiation and/or activity relevant to a memory trace to be fixed. We suggest that this process is used throughout the lifetime of each neuron and might have both advantageous and deleterious consequences.
Affiliation(s)
- Uwe Ueberham
- Paul Flechsig Institute for Brain Research, University of Leipzig, Leipzig, Germany
- Thomas Arendt
- Paul Flechsig Institute for Brain Research, University of Leipzig, Leipzig, Germany
22
Evolutionary Forces and Codon Bias in Different Flavors of Intrinsic Disorder in the Human Proteome. J Mol Evol 2019; 88:164-178. [DOI: 10.1007/s00239-019-09921-4] [Citation(s) in RCA: 20] [Impact Index Per Article: 4.0] [Received: 06/10/2019] [Accepted: 11/26/2019] [Indexed: 12/22/2022]
23
Zhou C, Sun Y, Yan R, Liu Y, Zuo E, Gu C, Han L, Wei Y, Hu X, Zeng R, Li Y, Zhou H, Guo F, Yang H. Off-target RNA mutation induced by DNA base editing and its elimination by mutagenesis. Nature 2019; 571:275-278. [PMID: 31181567 DOI: 10.1038/s41586-019-1314-0] [Citation(s) in RCA: 301] [Impact Index Per Article: 60.2] [Received: 04/05/2019] [Accepted: 05/30/2019] [Indexed: 12/21/2022]
Abstract
Recently developed DNA base editing methods enable the direct generation of desired point mutations in genomic DNA without generating any double-strand breaks [1-3], but the issue of off-target edits has limited the application of these methods. Although several previous studies have evaluated off-target mutations in genomic DNA [4-8], it is now clear that the deaminases that are integral to commonly used DNA base editors often bind to RNA [9-13]. For example, the cytosine deaminase APOBEC1, which is used in cytosine base editors (CBEs), targets both DNA and RNA [12], and the adenine deaminase TadA, which is used in adenine base editors (ABEs), induces site-specific inosine formation on RNA [9,11]. However, any potential RNA mutations caused by DNA base editors have not been evaluated. Adeno-associated viruses are the most common delivery system for gene therapies that involve DNA editing; these viruses can sustain long-term gene expression in vivo, so the extent of potential RNA mutations induced by DNA base editors is of great concern [14-16]. Here we quantitatively evaluated RNA single nucleotide variations (SNVs) that were induced by CBEs or ABEs. Both the cytosine base editor BE3 and the adenine base editor ABE7.10 generated tens of thousands of off-target RNA SNVs. Subsequently, by engineering deaminases, we found that three CBE variants and one ABE variant showed a reduction in off-target RNA SNVs to the baseline while maintaining efficient DNA on-target activity. This study reveals a previously overlooked aspect of off-target effects in DNA editing and also demonstrates that such effects can be eliminated by engineering deaminases.
Affiliation(s)
- Changyang Zhou
- Institute of Neuroscience, State Key Laboratory of Neuroscience, Key Laboratory of Primate Neurobiology, CAS Center for Excellence in Brain Science and Intelligence Technology, Shanghai Research Center for Brain Science and Brain-Inspired Intelligence, Shanghai Institutes for Biological Sciences, Chinese Academy of Sciences, Shanghai, China; College of Life Sciences, University of Chinese Academy of Sciences, Beijing, China
| | - Yidi Sun
- College of Life Sciences, University of Chinese Academy of Sciences, Beijing, China.,CAS Key Laboratory of Systems Biology, CAS Center for Excellence in Molecular Cell Science, Institute of Biochemistry and Cell Biology, Shanghai Institutes for Biological Sciences, Chinese Academy of Sciences, Shanghai, China.,Bio-Med Big Data Center, Key Laboratory of Computational Biology, CAS-MPG Partner Institute for Computational Biology, Shanghai Institute of Nutrition and Health, Shanghai Institutes for Biological Sciences, University of Chinese Academy of Sciences, Chinese Academy of Sciences, Shanghai, China
| | - Rui Yan
- Center for Translational Medicine, Ministry of Education Key Laboratory of Birth Defects and Related Diseases of Women and Children, Department of Obstetrics and Gynecology, West China Second University Hospital, College of Life Sciences, Sichuan University, Chengdu, China
| | - Yajing Liu
- College of Life Sciences, University of Chinese Academy of Sciences, Beijing, China.,School of Life Science and Technology, Shanghai Tech University, Shanghai, China
| | - Erwei Zuo
- Institute of Neuroscience, State Key Laboratory of Neuroscience, Key Laboratory of Primate Neurobiology, CAS Center for Excellence in Brain Science and Intelligence Technology, Shanghai Research Center for Brain Science and Brain-Inspired Intelligence, Shanghai Institutes for Biological Sciences, Chinese Academy of Sciences, Shanghai, China.,Center for Animal Genomics, Agricultural Genome Institute at Shenzhen, Chinese Academy of Agricultural Sciences, Shenzhen, China
| | - Chan Gu
- Center for Translational Medicine, Ministry of Education Key Laboratory of Birth Defects and Related Diseases of Women and Children, Department of Obstetrics and Gynecology, West China Second University Hospital, College of Life Sciences, Sichuan University, Chengdu, China
| | - Linxiao Han
- Institute of Neuroscience, State Key Laboratory of Neuroscience, Key Laboratory of Primate Neurobiology, CAS Center for Excellence in Brain Science and Intelligence Technology, Shanghai Research Center for Brain Science and Brain-Inspired Intelligence, Shanghai Institutes for Biological Sciences, Chinese Academy of Sciences, Shanghai, China
| | - Yu Wei
- Institute of Neuroscience, State Key Laboratory of Neuroscience, Key Laboratory of Primate Neurobiology, CAS Center for Excellence in Brain Science and Intelligence Technology, Shanghai Research Center for Brain Science and Brain-Inspired Intelligence, Shanghai Institutes for Biological Sciences, Chinese Academy of Sciences, Shanghai, China
| | - Xinde Hu
- Institute of Neuroscience, State Key Laboratory of Neuroscience, Key Laboratory of Primate Neurobiology, CAS Center for Excellence in Brain Science and Intelligence Technology, Shanghai Research Center for Brain Science and Brain-Inspired Intelligence, Shanghai Institutes for Biological Sciences, Chinese Academy of Sciences, Shanghai, China.,College of Life Sciences, University of Chinese Academy of Sciences, Beijing, China
| | - Rong Zeng
- CAS Key Laboratory of Systems Biology, CAS Center for Excellence in Molecular Cell Science, Institute of Biochemistry and Cell Biology, Shanghai Institutes for Biological Sciences, Chinese Academy of Sciences, Shanghai, China.,School of Life Science and Technology, Shanghai Tech University, Shanghai, China
| | - Yixue Li
- Center for Translational Medicine, Ministry of Education Key Laboratory of Birth Defects and Related Diseases of Women and Children, Department of Obstetrics and Gynecology, West China Second University Hospital, College of Life Sciences, Sichuan University, Chengdu, China. .,School of Life Science and Technology, Shanghai Tech University, Shanghai, China. .,Shanghai Jiao Tong University, Fudan University, Shanghai Academy of Science & Technology, Shanghai, China.
| | - Haibo Zhou
- Institute of Neuroscience, State Key Laboratory of Neuroscience, Key Laboratory of Primate Neurobiology, CAS Center for Excellence in Brain Science and Intelligence Technology, Shanghai Research Center for Brain Science and Brain-Inspired Intelligence, Shanghai Institutes for Biological Sciences, Chinese Academy of Sciences, Shanghai, China.
| | - Fan Guo
- Center for Translational Medicine, Ministry of Education Key Laboratory of Birth Defects and Related Diseases of Women and Children, Department of Obstetrics and Gynecology, West China Second University Hospital, College of Life Sciences, Sichuan University, Chengdu, China.
| | - Hui Yang
- Institute of Neuroscience, State Key Laboratory of Neuroscience, Key Laboratory of Primate Neurobiology, CAS Center for Excellence in Brain Science and Intelligence Technology, Shanghai Research Center for Brain Science and Brain-Inspired Intelligence, Shanghai Institutes for Biological Sciences, Chinese Academy of Sciences, Shanghai, China.
| |
Collapse
|
24
|
Sultana T, van Essen D, Siol O, Bailly-Bechet M, Philippe C, Zine El Aabidine A, Pioger L, Nigumann P, Saccani S, Andrau JC, Gilbert N, Cristofari G. The Landscape of L1 Retrotransposons in the Human Genome Is Shaped by Pre-insertion Sequence Biases and Post-insertion Selection. Mol Cell 2019; 74:555-570.e7. [PMID: 30956044 DOI: 10.1016/j.molcel.2019.02.036] [Citation(s) in RCA: 88] [Impact Index Per Article: 17.6] [Received: 08/01/2018] [Revised: 01/28/2019] [Accepted: 02/25/2019] [Indexed: 01/10/2023]
Abstract
L1 retrotransposons are transposable elements and major contributors of genetic variation in humans. Where L1 integrates into the genome can directly impact human evolution and disease. Here, we experimentally induced L1 retrotransposition in cells and mapped integration sites at nucleotide resolution. At local scales, L1 integration is mostly restricted by genome sequence biases and the specificity of the L1 machinery. At regional scales, L1 shows a broad capacity for integration into all chromatin states, in contrast to other known mobile genetic elements. However, integration is influenced by the replication timing of target regions, suggesting a link to host DNA replication. The distribution of new L1 integrations differs from those of preexisting L1 copies, which are significantly reshaped by natural selection. Our findings reveal that the L1 machinery has evolved to efficiently target all genomic regions and underline a predominant role for post-integrative processes on the distribution of endogenous L1 elements.
Affiliation(s)
- Tania Sultana
- Université Côte d'Azur, Inserm, CNRS, IRCAN, Nice, France
- Oliver Siol
- Institut de Génétique Humaine, University of Montpellier, CNRS, Montpellier, France
- Amal Zine El Aabidine
- Institut de Génétique Moléculaire de Montpellier, University of Montpellier, CNRS, Montpellier, France
- Léo Pioger
- Institut de Génétique Moléculaire de Montpellier, University of Montpellier, CNRS, Montpellier, France
- Pilvi Nigumann
- Université Côte d'Azur, Inserm, CNRS, IRCAN, Nice, France
- Simona Saccani
- Université Côte d'Azur, Inserm, CNRS, IRCAN, Nice, France
- Jean-Christophe Andrau
- Institut de Génétique Moléculaire de Montpellier, University of Montpellier, CNRS, Montpellier, France
- Nicolas Gilbert
- Institut de Génétique Humaine, University of Montpellier, CNRS, Montpellier, France; Institut de Médecine Régénératrice et de Biothérapie, Inserm U1183, CHU Montpellier, Montpellier, France
|
25
|
Uddin A, Paul N, Chakraborty S. The codon usage pattern of genes involved in ovarian cancer. Ann N Y Acad Sci 2019; 1440:67-78. [PMID: 30843242 DOI: 10.1111/nyas.14019] [Citation(s) in RCA: 21] [Impact Index Per Article: 4.2] [Received: 09/10/2018] [Revised: 01/04/2019] [Accepted: 01/14/2019] [Indexed: 12/20/2022]
Abstract
In this study, we analyzed the compositional dynamics and codon usage pattern of genes involved in ovarian cancer (OC) using a computational method. Mutations in specific genes are associated with OC, and some genes are risk factors for progression of OC, but no work has been reported yet on the codon usage pattern of genes involved in OC. Nucleotide composition analysis of OC-related genes suggested that the overall GC content was higher than AT content; that is, the genes were GC rich. The improved effective number of codons indicated that the overall extent of codon usage bias of genes involved in OC was low. The codons AGC, CTG, ATC, ACC, GTG, and GCC were overrepresented, while the codons TCG, TTA, CTA, CCG, CAA, CGT, ATA, ACG, GTA, GTT, GCG, and GGT were underrepresented in the genes. Correspondence analysis suggested that the codon usage pattern was different in different genes. A highly significant correlation was observed between GC12 and GC3 (r = 0.587, P < 0.01) of genes, suggesting that directional mutation affected the three codon positions. Our report on the codon usage pattern of genes involved in OC includes a new perspective for elucidating the mechanisms of biased usage of synonymous codons, as well as providing useful clues for molecular genetic engineering.
Affiliation(s)
- Arif Uddin
- Department of Zoology, Moinul Hoque Choudhury Memorial Science College, Assam, India
- Nirmal Paul
- Department of Biotechnology, Assam University, Assam, India
|
26
|
Uddin A, Mazumder TH, Chakraborty S. Understanding molecular biology of codon usage in mitochondrial complex IV genes of electron transport system: Relevance to mitochondrial diseases. J Cell Physiol 2018; 234:6397-6413. [DOI: 10.1002/jcp.27375] [Citation(s) in RCA: 12] [Impact Index Per Article: 2.0] [Received: 06/14/2018] [Accepted: 08/17/2018] [Indexed: 12/17/2022]
Affiliation(s)
- Arif Uddin
- Department of Zoology, Moinul Hoque Choudhury Memorial Science College, Hailakandi, Assam, India
|
27
|
Camiolo S, Toome-Heller M, Aime MC, Haridas S, Grigoriev IV, Porceddu A, Mannazzu I. An analysis of codon bias in six red yeast species. Yeast 2018; 36:53-64. [PMID: 30264407 DOI: 10.1002/yea.3359] [Citation(s) in RCA: 7] [Impact Index Per Article: 1.2] [Received: 06/08/2018] [Revised: 09/10/2018] [Accepted: 09/23/2018] [Indexed: 11/11/2022] Open
Abstract
Red yeasts, primarily species of Rhodotorula, Sporobolomyces, and other genera of Pucciniomycotina, are traditionally considered proficient systems for lipid and terpene production, and only recently have also gained consideration for the production of a wider range of molecules of biotechnological potential. Improvements of transgene delivery protocols and regulated gene expression systems have been proposed, but a dearth of information on compositional and/or structural features of genes has prevented transgene sequence optimization efforts for high expression levels. Here, the codon compositional features of genes in six red yeast species were characterized, and the impact that evolutionary forces may have played in shaping this compositional bias was dissected by using several computational approaches. Results obtained are compatible with the hypothesis that mutational bias, although playing a significant role, cannot alone explain synonymous codon usage bias of genes. Nevertheless, several lines of evidence indicated a role for translational selection in driving the synonymous codons that allow high expression efficiency. These optimal synonymous codons are identified for each of the six species analyzed. Moreover, the presence of intragenic patterns of codon usage, which are thought to facilitate polyribosome formation, was highlighted. The information presented should be taken into consideration for transgene design for optimal expression in red yeast species.
Affiliation(s)
- Salvatore Camiolo
- Dipartimento di Agraria, Università degli Studi di Sassari, Sassari, Italy
- Merje Toome-Heller
- Department of Botany and Plant Pathology, Purdue University, West Lafayette, Indiana, USA
- M Catherine Aime
- Department of Botany and Plant Pathology, Purdue University, West Lafayette, Indiana, USA
- Sajeet Haridas
- US Department of Energy Joint Genome Institute, Walnut Creek, California, USA
- Igor V Grigoriev
- US Department of Energy Joint Genome Institute, Walnut Creek, California, USA
- Andrea Porceddu
- Dipartimento di Agraria, Università degli Studi di Sassari, Sassari, Italy
- Ilaria Mannazzu
- Dipartimento di Agraria, Università degli Studi di Sassari, Sassari, Italy
|
28
|
Thornlow BP, Hough J, Roger JM, Gong H, Lowe TM, Corbett-Detig RB. Transfer RNA genes experience exceptionally elevated mutation rates. Proc Natl Acad Sci U S A 2018; 115:8996-9001. [PMID: 30127029 PMCID: PMC6130373 DOI: 10.1073/pnas.1801240115] [Citation(s) in RCA: 25] [Impact Index Per Article: 4.2] [Indexed: 02/03/2023] Open
Abstract
Transfer RNAs (tRNAs) are a central component for the biological synthesis of proteins, and they are among the most highly conserved and frequently transcribed genes in all living things. Despite their clear significance for fundamental cellular processes, the forces governing tRNA evolution are poorly understood. We present evidence that transcription-associated mutagenesis and strong purifying selection are key determinants of patterns of sequence variation within and surrounding tRNA genes in humans and diverse model organisms. Remarkably, the mutation rate at broadly expressed cytosolic tRNA loci is likely between 7 and 10 times greater than the nuclear genome average. Furthermore, evolutionary analyses provide strong evidence that tRNA genes, but not their flanking sequences, experience strong purifying selection acting against this elevated mutation rate. We also find a strong correlation between tRNA expression levels and the mutation rates in their immediate flanking regions, suggesting a simple method for estimating individual tRNA gene activity. Collectively, this study illuminates the extreme competing forces in tRNA gene evolution and indicates that mutations at tRNA loci contribute disproportionately to mutational load and have unexplored fitness consequences in human populations.
Affiliation(s)
- Bryan P Thornlow
- Department of Biomolecular Engineering, University of California, Santa Cruz, CA 95064
- Josh Hough
- Department of Biomolecular Engineering, University of California, Santa Cruz, CA 95064
- Jacquelyn M Roger
- Department of Biomolecular Engineering, University of California, Santa Cruz, CA 95064
- Henry Gong
- Department of Biomolecular Engineering, University of California, Santa Cruz, CA 95064
- Todd M Lowe
- Department of Biomolecular Engineering, University of California, Santa Cruz, CA 95064
- Genomics Institute, University of California, Santa Cruz, CA 95064
- Russell B Corbett-Detig
- Department of Biomolecular Engineering, University of California, Santa Cruz, CA 95064
- Genomics Institute, University of California, Santa Cruz, CA 95064
|
29
|
Lodato MA, Rodin RE, Bohrson CL, Coulter ME, Barton AR, Kwon M, Sherman MA, Vitzthum CM, Luquette LJ, Yandava CN, Yang P, Chittenden TW, Hatem NE, Ryu SC, Woodworth MB, Park PJ, Walsh CA. Aging and neurodegeneration are associated with increased mutations in single human neurons. Science 2018; 359:555-559. [PMID: 29217584 PMCID: PMC5831169 DOI: 10.1126/science.aao4426] [Citation(s) in RCA: 370] [Impact Index Per Article: 61.7] [Received: 07/21/2017] [Accepted: 11/22/2017] [Indexed: 12/27/2022]
Abstract
It has long been hypothesized that aging and neurodegeneration are associated with somatic mutation in neurons; however, methodological hurdles have prevented testing this hypothesis directly. We used single-cell whole-genome sequencing to perform genome-wide somatic single-nucleotide variant (sSNV) identification on DNA from 161 single neurons from the prefrontal cortex and hippocampus of 15 normal individuals (aged 4 months to 82 years), as well as 9 individuals affected by early-onset neurodegeneration due to genetic disorders of DNA repair (Cockayne syndrome and xeroderma pigmentosum). sSNVs increased approximately linearly with age in both areas (with a higher rate in hippocampus) and were more abundant in neurodegenerative disease. The accumulation of somatic mutations with age, which we term genosenium, shows age-related, region-related, and disease-related molecular signatures and may be important in other human age-associated conditions.
Affiliation(s)
- Michael A Lodato
- Division of Genetics and Genomics, Manton Center for Orphan Disease, and Howard Hughes Medical Institute, Boston Children's Hospital, Boston, MA, USA
- Departments of Neurology and Pediatrics, Harvard Medical School, Boston, MA, USA
- Broad Institute of MIT and Harvard, Cambridge, MA, USA
- Rachel E Rodin
- Division of Genetics and Genomics, Manton Center for Orphan Disease, and Howard Hughes Medical Institute, Boston Children's Hospital, Boston, MA, USA
- Departments of Neurology and Pediatrics, Harvard Medical School, Boston, MA, USA
- Broad Institute of MIT and Harvard, Cambridge, MA, USA
- Program in Neuroscience and Harvard/MIT MD-PHD Program, Harvard Medical School, Boston, MA, USA
- Craig L Bohrson
- Department of Biomedical Informatics, Harvard Medical School, Boston, MA, USA
- Michael E Coulter
- Division of Genetics and Genomics, Manton Center for Orphan Disease, and Howard Hughes Medical Institute, Boston Children's Hospital, Boston, MA, USA
- Departments of Neurology and Pediatrics, Harvard Medical School, Boston, MA, USA
- Broad Institute of MIT and Harvard, Cambridge, MA, USA
- Program in Neuroscience and Harvard/MIT MD-PHD Program, Harvard Medical School, Boston, MA, USA
- Alison R Barton
- Department of Biomedical Informatics, Harvard Medical School, Boston, MA, USA
- Minseok Kwon
- Department of Biomedical Informatics, Harvard Medical School, Boston, MA, USA
- Maxwell A Sherman
- Department of Biomedical Informatics, Harvard Medical School, Boston, MA, USA
- Carl M Vitzthum
- Department of Biomedical Informatics, Harvard Medical School, Boston, MA, USA
- Lovelace J Luquette
- Department of Biomedical Informatics, Harvard Medical School, Boston, MA, USA
- Chandri N Yandava
- Computational Statistics and Bioinformatics Group, Advanced Artificial Intelligence Research Laboratory, WuXi NextCODE, Cambridge, MA, USA
- Pengwei Yang
- Computational Statistics and Bioinformatics Group, Advanced Artificial Intelligence Research Laboratory, WuXi NextCODE, Cambridge, MA, USA
- Thomas W Chittenden
- Computational Statistics and Bioinformatics Group, Advanced Artificial Intelligence Research Laboratory, WuXi NextCODE, Cambridge, MA, USA
- Department of Biological Engineering, Massachusetts Institute of Technology, Cambridge, MA, USA
- Division of Genetics and Genomics, Boston Children's Hospital, Harvard Medical School, Boston, MA, USA
- Nicole E Hatem
- Division of Genetics and Genomics, Manton Center for Orphan Disease, and Howard Hughes Medical Institute, Boston Children's Hospital, Boston, MA, USA
- Departments of Neurology and Pediatrics, Harvard Medical School, Boston, MA, USA
- Broad Institute of MIT and Harvard, Cambridge, MA, USA
- Steven C Ryu
- Division of Genetics and Genomics, Manton Center for Orphan Disease, and Howard Hughes Medical Institute, Boston Children's Hospital, Boston, MA, USA
- Departments of Neurology and Pediatrics, Harvard Medical School, Boston, MA, USA
- Broad Institute of MIT and Harvard, Cambridge, MA, USA
- Mollie B Woodworth
- Division of Genetics and Genomics, Manton Center for Orphan Disease, and Howard Hughes Medical Institute, Boston Children's Hospital, Boston, MA, USA
- Departments of Neurology and Pediatrics, Harvard Medical School, Boston, MA, USA
- Broad Institute of MIT and Harvard, Cambridge, MA, USA
- Peter J Park
- Department of Biomedical Informatics, Harvard Medical School, Boston, MA, USA
- Division of Genetics, Brigham and Women's Hospital, Boston, MA, USA
- Christopher A Walsh
- Division of Genetics and Genomics, Manton Center for Orphan Disease, and Howard Hughes Medical Institute, Boston Children's Hospital, Boston, MA, USA
- Departments of Neurology and Pediatrics, Harvard Medical School, Boston, MA, USA
- Broad Institute of MIT and Harvard, Cambridge, MA, USA
|
30
|
Bergman J, Betancourt AJ, Vogl C. Transcription-Associated Compositional Skews in Drosophila Genes. Genome Biol Evol 2018; 10:269-275. [PMID: 29036491 PMCID: PMC5786239 DOI: 10.1093/gbe/evx200] [Citation(s) in RCA: 4] [Impact Index Per Article: 0.7] [Accepted: 09/25/2017] [Indexed: 12/23/2022] Open
Abstract
In many organisms, local deviations from Chargaff's second parity rule are observed around replication and transcription start sites and within intron sequences. Here, we use expression data as well as a whole-genome data set of nearly 200 haplotypes to investigate such compositional skews in Drosophila melanogaster genes. We find a positive correlation between compositional skew and gene expression, comparable in strength to similar correlations between expression levels and genome-wide sequence features. This correlation is relatively stronger for germline, compared with somatic expression, consistent with the process of transcription-associated mutation bias. We also inferred mutation rates from alleles segregating at low frequencies in short introns, and show that, whereas the overall GC content of short introns does not conform to the equilibrium expectation, the level of the observed deviation from the second parity rule is generally consistent with the inferred rates.
Affiliation(s)
- Juraj Bergman
- Institut für Populationsgenetik, Vetmeduni Vienna, Wien, Austria
- Vienna Graduate School of Population Genetics, Vetmeduni Vienna, Wien, Austria
- Andrea J Betancourt
- Institut für Populationsgenetik, Vetmeduni Vienna, Wien, Austria
- Present address: Institute of Integrative Biology, University of Liverpool, Liverpool, United Kingdom
- Claus Vogl
- Institut für Tierzucht und Genetik, Vetmeduni Vienna, Wien, Austria
|
31
|
Boulianne B, Feldhahn N. Transcribing malignancy: transcription-associated genomic instability in cancer. Oncogene 2017; 37:971-981. [DOI: 10.1038/onc.2017.402] [Citation(s) in RCA: 15] [Impact Index Per Article: 2.1] [Received: 07/13/2017] [Revised: 09/12/2017] [Accepted: 09/12/2017] [Indexed: 12/17/2022]
|
32
|
Vitelli V, Galbiati A, Iannelli F, Pessina F, Sharma S, d'Adda di Fagagna F. Recent Advancements in DNA Damage-Transcription Crosstalk and High-Resolution Mapping of DNA Breaks. Annu Rev Genomics Hum Genet 2017; 18:87-113. [PMID: 28859573 DOI: 10.1146/annurev-genom-091416-035314] [Citation(s) in RCA: 32] [Impact Index Per Article: 4.6] [Indexed: 11/09/2022]
Abstract
Until recently, DNA damage arising from physiological DNA metabolism was considered a detrimental by-product for cells. However, an increasing amount of evidence has shown that DNA damage could have a positive role in transcription activation. In particular, DNA damage has been detected in transcriptional elements following different stimuli. These physiological DNA breaks are thought to be instrumental for the correct expression of genomic loci through different mechanisms. In this regard, although a plethora of methods are available to precisely map transcribed regions and transcription start sites, commonly used techniques for mapping DNA breaks lack sufficient resolution and sensitivity to draw a robust correlation between DNA damage generation and transcription. Recently, however, several methods have been developed to map DNA damage at single-nucleotide resolution, thus providing a new set of tools to correlate DNA damage and transcription. Here, we review how DNA damage can positively regulate transcription initiation, the current techniques for mapping DNA breaks at high resolution, and how these techniques can benefit future studies of DNA damage and transcription.
Affiliation(s)
- Valerio Vitelli
- FIRC Institute of Molecular Oncology (IFOM), Milan 20139, Italy
- Fabio Iannelli
- FIRC Institute of Molecular Oncology (IFOM), Milan 20139, Italy
- Fabio Pessina
- FIRC Institute of Molecular Oncology (IFOM), Milan 20139, Italy
- Sheetal Sharma
- FIRC Institute of Molecular Oncology (IFOM), Milan 20139, Italy
- Fabrizio d'Adda di Fagagna
- FIRC Institute of Molecular Oncology (IFOM), Milan 20139, Italy; Istituto di Genetica Molecolare, Consiglio Nazionale delle Ricerche (CNR), Pavia 27100, Italy
|
33
|
Chen C, Qi H, Shen Y, Pickrell J, Przeworski M. Contrasting Determinants of Mutation Rates in Germline and Soma. Genetics 2017; 207:255-267. [PMID: 28733365 PMCID: PMC5586376 DOI: 10.1534/genetics.117.1114] [Citation(s) in RCA: 17] [Impact Index Per Article: 2.4] [Received: 03/15/2017] [Accepted: 07/01/2017] [Indexed: 12/13/2022] Open
Abstract
Recent studies of somatic and germline mutations have led to the identification of a number of factors that influence point mutation rates, including CpG methylation, expression levels, replication timing, and GC content. Intriguingly, some of the effects appear to differ between soma and germline: in particular, whereas mutation rates have been reported to decrease with expression levels in tumors, no clear effect has been detected in the germline. Distinct approaches were taken to analyze the data, however, so it is hard to know whether these apparent differences are real. To enable a cleaner comparison, we considered a statistical model in which the mutation rate of a coding region is predicted by GC content, expression levels, replication timing, and two histone repressive marks. We applied this model to both a set of germline mutations identified in exomes and to exonic somatic mutations in four types of tumors. Most determinants of mutations are shared: notably, we detected an effect of expression levels on both germline and somatic mutation rates. Moreover, in all tissues considered, higher expression levels are associated with greater strand asymmetry of mutations. However, mutation rates increase with expression levels in testis (and, more tentatively, in ovary), whereas they decrease with expression levels in somatic tissues. This contrast points to differences in damage or repair rates during transcription in soma and germline.
Affiliation(s)
- Chen Chen
- Department of Biological Sciences, Columbia University, New York, New York 10025
- New York Genome Center, New York, New York 10013
- Hongjian Qi
- Department of Systems Biology, Columbia University Medical Center, New York, New York 10032
- Department of Applied Physics and Applied Mathematics, Columbia University, New York, New York 10025
- Yufeng Shen
- Department of Systems Biology, Columbia University Medical Center, New York, New York 10032
- Department of Biomedical Informatics, Columbia University, New York, New York 10025
- Joseph Pickrell
- Department of Biological Sciences, Columbia University, New York, New York 10025
- New York Genome Center, New York, New York 10013
- Molly Przeworski
- Department of Biological Sciences, Columbia University, New York, New York 10025
- Department of Systems Biology, Columbia University Medical Center, New York, New York 10032
|
34
|
Tang D, Lam C, Louie S, Hoi KH, Shaw D, Yim M, Snedecor B, Misaghi S. Supplementation of Nucleosides During Selection can Reduce Sequence Variant Levels in CHO Cells Using GS/MSX Selection System. Biotechnol J 2017; 13. [DOI: 10.1002/biot.201700335] [Citation(s) in RCA: 2] [Impact Index Per Article: 0.3] [Received: 05/05/2017] [Revised: 06/13/2017] [Indexed: 12/20/2022]
Affiliation(s)
- Danming Tang
- Department of Early Stage Cell Culture, Genentech, Inc., South San Francisco, CA, USA
- Cynthia Lam
- Department of Early Stage Cell Culture, Genentech, Inc., South San Francisco, CA, USA
- Salina Louie
- Department of Early Stage Cell Culture, Genentech, Inc., South San Francisco, CA, USA
- Kam Hon Hoi
- Department of Antibody Engineering, Genentech, Inc., South San Francisco, CA, USA
- David Shaw
- Department of Early Stage Cell Culture, Genentech, Inc., South San Francisco, CA, USA
- Mandy Yim
- Department of Early Stage Cell Culture, Genentech, Inc., South San Francisco, CA, USA
- Brad Snedecor
- Department of Early Stage Cell Culture, Genentech, Inc., South San Francisco, CA, USA
- Shahram Misaghi
- Department of Early Stage Cell Culture, Genentech, Inc., South San Francisco, CA, USA
|
35
|
Abstract
Males and females exhibit highly dimorphic phenotypes, particularly in their gonads, which is believed to be driven largely by differential gene expression. Typically, the protein sequences of genes upregulated in males, or male-biased genes, evolve rapidly as compared to female-biased and unbiased genes. To date, the specific study of gonad-biased genes remains uncommon in metazoans. Here, we identified and studied a total of 2927, 2013, and 4449 coding sequences (CDS) with ovary-biased, testis-biased, and unbiased expression, respectively, in the yellow fever mosquito Aedes aegypti. The results showed that ovary-biased and unbiased CDS had higher nonsynonymous to synonymous substitution rates (dN/dS) and lower optimal codon usage (those codons that promote efficient translation) than testis-biased genes. Further, we observed higher dN/dS in ovary-biased genes than in testis-biased genes, even for genes coexpressed in nonsexual (embryo) tissues. Ovary-specific genes evolved exceptionally fast, as compared to testis- or embryo-specific genes, and exhibited higher frequency of positive selection. Genes with ovary expression were preferentially involved in olfactory binding and reception. We hypothesize that at least two potential mechanisms could explain rapid evolution of ovary-biased genes in this mosquito: (1) the evolutionary rate of ovary-biased genes may be accelerated by sexual selection (including female-female competition or male-mate choice) affecting olfactory genes during female swarming by males, and/or by adaptive evolution of olfactory signaling within the female reproductive system (e.g., sperm-ovary signaling); and/or (2) testis-biased genes may exhibit decelerated evolutionary rates due to the formation of mating plugs in the female after copulation, which limits male-male sperm competition.
|
36
|
Lynch M, Ackerman MS, Gout JF, Long H, Sung W, Thomas WK, Foster PL. Genetic drift, selection and the evolution of the mutation rate. Nat Rev Genet 2017; 17:704-714. [PMID: 27739533 DOI: 10.1038/nrg.2016.104] [Citation(s) in RCA: 438] [Impact Index Per Article: 62.6] [Indexed: 01/23/2023]
Abstract
As one of the few cellular traits that can be quantified across the tree of life, DNA-replication fidelity provides an excellent platform for understanding fundamental evolutionary processes. Furthermore, because mutation is the ultimate source of all genetic variation, clarifying why mutation rates vary is crucial for understanding all areas of biology. A potentially revealing hypothesis for mutation-rate evolution is that natural selection primarily operates to improve replication fidelity, with the ultimate limits to what can be achieved set by the power of random genetic drift. This drift-barrier hypothesis is consistent with comparative measures of mutation rates, provides a simple explanation for the existence of error-prone polymerases and yields a formal counter-argument to the view that selection fine-tunes gene-specific mutation rates.
Affiliation(s)
- Michael Lynch
  - Department of Biology, Indiana University, Bloomington, Indiana 47401, USA
- Matthew S Ackerman
  - Department of Biology, Indiana University, Bloomington, Indiana 47401, USA
- Jean-Francois Gout
  - Department of Biology, Indiana University, Bloomington, Indiana 47401, USA
- Hongan Long
  - Department of Biology, Indiana University, Bloomington, Indiana 47401, USA
- Way Sung
  - Department of Biology, Indiana University, Bloomington, Indiana 47401, USA
- W Kelley Thomas
  - Department of Molecular, Cellular, and Biomedical Sciences, University of New Hampshire, Durham, New Hampshire 03824, USA
- Patricia L Foster
  - Department of Biology, Indiana University, Bloomington, Indiana 47401, USA
|
37
|
Mathieson I, Reich D. Differences in the rare variant spectrum among human populations. PLoS Genet 2017; 13:e1006581. [PMID: 28146552 PMCID: PMC5310914 DOI: 10.1371/journal.pgen.1006581] [Citation(s) in RCA: 72] [Impact Index Per Article: 10.3] [Received: 07/12/2016] [Revised: 02/15/2017] [Accepted: 01/12/2017] [Indexed: 12/30/2022] Open
Abstract
Mutations occur at vastly different rates across the genome and across populations, leading to differences in the spectrum of segregating polymorphisms. Here, we investigate variation in the rare variant spectrum in a sample of human genomes representing all major world populations. We find at least two distinct signatures of variation. One, consistent with a previously reported signature, is characterized by an increased rate of TCC>TTC mutations in people from Western Eurasia and South Asia, likely related to differences in the rate, or efficiency of repair, of damage due to deamination of methylated guanine. We describe the geographic extent of this signature and show that it is detectable in the genomes of ancient, but not archaic, humans. The second signature is private to certain Native American populations, and is concentrated at CpG sites. We show that this signature is not driven by differences in the CpG mutation rate, but is a result of the fact that highly mutable CpG sites are more likely to undergo multiple independent mutations across human populations, and the spectrum of such mutations is highly sensitive to recent demography. Both of these effects dramatically affect the spectrum of rare variants across human populations, and should be taken into account when using mutational clocks to make inference about demography.
Affiliation(s)
- Iain Mathieson
  - Department of Genetics, Harvard Medical School, Boston, Massachusetts, United States of America
- David Reich
  - Department of Genetics, Harvard Medical School, Boston, Massachusetts, United States of America
  - Broad Institute of MIT and Harvard, Cambridge, Massachusetts, United States of America
  - Howard Hughes Medical Institute, Harvard Medical School, Boston, Massachusetts, United States of America
|
38
|
Seplyarskiy VB, Andrianova MA, Bazykin GA. APOBEC3A/B-induced mutagenesis is responsible for 20% of heritable mutations in the TpCpW context. Genome Res 2016; 27:175-184. [PMID: 27940951 PMCID: PMC5287224 DOI: 10.1101/gr.210336.116] [Citation(s) in RCA: 19] [Impact Index Per Article: 2.4] [Received: 05/24/2016] [Accepted: 12/01/2016] [Indexed: 12/18/2022]
Abstract
APOBEC3A/B cytidine deaminase is responsible for the majority of cancerous mutations in a large fraction of cancer samples. However, its role in heritable mutagenesis remains very poorly understood. Recent studies have demonstrated that both in yeast and in human cancerous cells, most APOBEC3A/B-induced mutations occur on the lagging strand during replication and on the nontemplate strand of transcribed regions. Here, we use data on rare human polymorphisms, interspecies divergence, and de novo mutations to study germline mutagenesis and to analyze mutations at nucleotide contexts prone to attack by APOBEC3A/B. We show that such mutations occur preferentially on the lagging strand and on nontemplate strands of transcribed regions. Moreover, we demonstrate that APOBEC3A/B-like mutations tend to produce strand-coordinated clusters, which are also biased toward the lagging strand. Finally, we show that the mutation rate is increased 3' of C→G mutations to a greater extent than 3' of C→T mutations, suggesting pervasive trans-lesion bypass of the APOBEC3A/B-induced damage. Our study demonstrates that 20% of C→T and C→G mutations in the TpCpW context (where W denotes A or T) that segregate as polymorphisms in the human population, or 1.4% of all heritable mutations, are attributable to APOBEC3A/B activity.
Affiliation(s)
- Vladimir B Seplyarskiy
  - Institute for Information Transmission Problems of the Russian Academy of Sciences (Kharkevich Institute), Moscow 127994, Russia
  - Pirogov Russian National Research Medical University, Moscow 117997, Russia
- Maria A Andrianova
  - Institute for Information Transmission Problems of the Russian Academy of Sciences (Kharkevich Institute), Moscow 127994, Russia
  - Pirogov Russian National Research Medical University, Moscow 117997, Russia
  - Lomonosov Moscow State University, Moscow 119234, Russia
- Georgii A Bazykin
  - Institute for Information Transmission Problems of the Russian Academy of Sciences (Kharkevich Institute), Moscow 127994, Russia
  - Pirogov Russian National Research Medical University, Moscow 117997, Russia
  - Lomonosov Moscow State University, Moscow 119234, Russia
  - Skolkovo Institute of Science and Technology, Skolkovo 143026, Russia
|
39
|
Harpak A, Bhaskar A, Pritchard JK. Mutation Rate Variation is a Primary Determinant of the Distribution of Allele Frequencies in Humans. PLoS Genet 2016; 12:e1006489. [PMID: 27977673 PMCID: PMC5157949 DOI: 10.1371/journal.pgen.1006489] [Citation(s) in RCA: 36] [Impact Index Per Article: 4.5] [Received: 04/12/2016] [Accepted: 11/16/2016] [Indexed: 01/06/2023] Open
Abstract
The site frequency spectrum (SFS) has long been used to study demographic history and natural selection. Here, we extend this summary by examining the SFS conditional on the alleles found at the same site in other species. We refer to this extension as the "phylogenetically-conditioned SFS" or cSFS. Using recent large-sample data from the Exome Aggregation Consortium (ExAC), combined with primate genome sequences, we find that human variants that occurred independently in closely related primate lineages are at higher frequencies in humans than variants with parallel substitutions in more distant primates. We show that this effect is largely due to sites with elevated mutation rates causing significant departures from the widely-used infinite sites mutation model. Our analysis also suggests substantial variation in mutation rates even among mutations involving the same nucleotide changes. In summary, we show that variable mutation rates are key determinants of the SFS in humans.
Affiliation(s)
- Arbel Harpak
  - Department of Biology, Stanford University, Stanford, California, United States of America
- Anand Bhaskar
  - Department of Genetics, Stanford University, Stanford, California, United States of America
  - Howard Hughes Medical Institute, Stanford University, Stanford, California, United States of America
- Jonathan K. Pritchard
  - Department of Biology, Stanford University, Stanford, California, United States of America
  - Department of Genetics, Stanford University, Stanford, California, United States of America
  - Howard Hughes Medical Institute, Stanford University, Stanford, California, United States of America
|
40
|
Acuna-Hidalgo R, Veltman JA, Hoischen A. New insights into the generation and role of de novo mutations in health and disease. Genome Biol 2016; 17:241. [PMID: 27894357 PMCID: PMC5125044 DOI: 10.1186/s13059-016-1110-1] [Citation(s) in RCA: 276] [Impact Index Per Article: 34.5] [Indexed: 12/12/2022] Open
Abstract
Aside from inheriting half of the genome of each of our parents, we are born with a small number of novel mutations that occurred during gametogenesis and postzygotically. Recent genome and exome sequencing studies of parent-offspring trios have provided the first insights into the number and distribution of these de novo mutations in health and disease, pointing to risk factors that increase their number in the offspring. De novo mutations have been shown to be a major cause of severe early-onset genetic disorders such as intellectual disability, autism spectrum disorder, and other developmental diseases. In fact, the occurrence of novel mutations in each generation explains why these reproductively lethal disorders continue to occur in our population. Recent studies have also shown that de novo mutations are predominantly of paternal origin and that their number increases with advanced paternal age. Here, we review the recent literature on de novo mutations, covering their detection, biological characterization, and medical impact.
Affiliation(s)
- Rocio Acuna-Hidalgo
  - Department of Human Genetics, Radboud Institute for Molecular Life Sciences, Radboud University Medical Center, Geert Grooteplein 10, 6525 GA, Nijmegen, The Netherlands
- Joris A Veltman
  - Department of Human Genetics, Donders Institute of Neuroscience, Radboud University Medical Center, Geert Grooteplein 10, 6525 GA, Nijmegen, The Netherlands
  - Department of Clinical Genetics, GROW - School for Oncology and Developmental Biology, Maastricht University Medical Centre, Universiteitssingel 50, 6229 ER, Maastricht, The Netherlands
- Alexander Hoischen
  - Department of Human Genetics, Donders Institute of Neuroscience, Radboud University Medical Center, Geert Grooteplein 10, 6525 GA, Nijmegen, The Netherlands
|
41
|
Whittle CA, Extavour CG. Expression-Linked Patterns of Codon Usage, Amino Acid Frequency, and Protein Length in the Basally Branching Arthropod Parasteatoda tepidariorum. Genome Biol Evol 2016; 8:2722-36. [PMID: 27017527 PMCID: PMC5630913 DOI: 10.1093/gbe/evw068] [Citation(s) in RCA: 16] [Impact Index Per Article: 2.0] [Indexed: 12/28/2022] Open
Abstract
Spiders belong to the Chelicerata, the most basally branching arthropod subphylum. The common house spider, Parasteatoda tepidariorum, is an emerging model and provides a valuable system to address key questions in molecular evolution in an arthropod system that is distinct from traditionally studied insects. Here, we provide evidence suggesting that codon usage, amino acid frequency, and protein lengths are each influenced by expression-mediated selection in P. tepidariorum. First, highly expressed genes exhibited preferential usage of T3 codons in this spider, suggestive of selection. Second, genes with elevated transcription favored amino acids with low or intermediate size/complexity (S/C) scores (glycine and alanine) and disfavored those with large S/C scores (such as cysteine), consistent with the minimization of biosynthesis costs of abundant proteins. Third, we observed a negative correlation between expression level and coding sequence length. Together, we conclude that protein-coding genes exhibit signals of expression-related selection in this emerging, noninsect, arthropod model.
Affiliation(s)
- Carrie A Whittle
  - Department of Organismic and Evolutionary Biology, Harvard University
- Cassandra G Extavour
  - Department of Organismic and Evolutionary Biology, Harvard University
  - Department of Molecular and Cellular Biology, Harvard University
|
42
|
Shporer S, Chor B, Rosset S, Horn D. Inversion symmetry of DNA k-mer counts: validity and deviations. BMC Genomics 2016; 17:696. [PMID: 27580854 PMCID: PMC5006273 DOI: 10.1186/s12864-016-3012-8] [Citation(s) in RCA: 14] [Impact Index Per Article: 1.8] [Received: 04/13/2016] [Accepted: 08/11/2016] [Indexed: 01/25/2023] Open
Abstract
Background: The generalization of the second Chargaff rule states that counts of any string of nucleotides of length k on a single chromosomal strand equal the counts of its inverse (reverse-complement) k-mer. This Inversion Symmetry (IS) holds for many species, both eukaryotes and prokaryotes, for ranges of k which may vary from 7 to 10 as chromosomal lengths vary from 2 Mbp to 200 Mbp. The existence of IS has been demonstrated in the literature, and other pair-wise candidate symmetries (e.g. reverse or complement) have been ruled out. Results: Studying IS in the human genome, we find that IS holds up to k = 10. It holds for complete chromosomes, also after applying the low complexity mask. We introduce a numerical IS criterion, and define the k-limit, KL, as the highest k for which this criterion is valid. We demonstrate that chromosomes of different species, as well as different human chromosomal sections, follow a universal logarithmic dependence of KL ~ 0.7 ln(L), where L is the length of the chromosome. We introduce a statistical IS-Poisson model that allows us to apply confidence measures to our numerical findings. We find good agreement for large k, where the variance of the Poisson distribution determines the outcome of the analysis. This model predicts the observed logarithmic increase of KL with length. The model allows us to conclude that for low k, e.g. k = 1 where IS becomes the 2nd Chargaff rule, IS violation, although extremely small, is significant. Studying this violation we come up with an unexpected observation for human chromosomes, finding a meaningful correlation with the excess of genes on particular strands. Conclusions: Our IS-Poisson model agrees well with genomic data, and accounts for the universal behavior of k-limits. For low k we point out minute, yet significant, deviations from the model, including excess of counts of nucleotides T vs A and G vs C on positive strands of human chromosomes. Interestingly, this correlates with a significant (but small) excess of genes on the same positive strands.
Affiliation(s)
- Sagi Shporer
  - Blavatnik School of Computer Science, Tel Aviv University, Tel Aviv, 69978, Israel
- Benny Chor
  - Blavatnik School of Computer Science, Tel Aviv University, Tel Aviv, 69978, Israel
- Saharon Rosset
  - Sackler School of Mathematical Sciences, Tel Aviv University, Tel Aviv, 69978, Israel
- David Horn
  - Sackler School of Physics and Astronomy, Tel Aviv University, Tel Aviv, 69978, Israel
|
43
|
Uddin A, Chakraborty S. Codon usage trend in mitochondrial CYB gene. Gene 2016; 586:105-14. [DOI: 10.1016/j.gene.2016.04.005] [Citation(s) in RCA: 36] [Impact Index Per Article: 4.5] [Received: 08/04/2015] [Revised: 03/11/2016] [Accepted: 04/02/2016] [Indexed: 11/25/2022]
|
44
|
Affiliation(s)
- Hélène Gaillard
  - Centro Andaluz de Biología Molecular y Medicina Regenerativa (CABIMER), Universidad de Sevilla, Sevilla 41092, Spain
- Andrés Aguilera
  - Centro Andaluz de Biología Molecular y Medicina Regenerativa (CABIMER), Universidad de Sevilla, Sevilla 41092, Spain
|
45
|
Kim J, Mouw KW, Polak P, Braunstein LZ, Kamburov A, Kwiatkowski DJ, Rosenberg JE, Van Allen EM, D'Andrea A, Getz G. Somatic ERCC2 mutations are associated with a distinct genomic signature in urothelial tumors. Nat Genet 2016; 48:600-606. [PMID: 27111033 PMCID: PMC4936490 DOI: 10.1038/ng.3557] [Citation(s) in RCA: 277] [Impact Index Per Article: 34.6] [Received: 07/08/2015] [Accepted: 04/01/2016] [Indexed: 12/17/2022]
Abstract
Alterations in DNA repair pathways are common in tumors and can result in characteristic mutational signatures; however, a specific mutational signature associated with somatic alterations in the nucleotide-excision repair (NER) pathway has not yet been identified. Here we examine the mutational processes operating in urothelial cancer, a tumor type in which the core NER gene ERCC2 is significantly mutated. Analysis of three independent urothelial tumor cohorts demonstrates a strong association between somatic ERCC2 mutations and the activity of a mutational signature characterized by a broad spectrum of base changes. In addition, we note an association between the activity of this signature and smoking that is independent of ERCC2 mutation status, providing genomic evidence of tobacco-related mutagenesis in urothelial cancer. Together, these analyses identify an NER-related mutational signature and highlight the related roles of DNA damage and subsequent DNA repair in shaping tumor mutational landscape.
Affiliation(s)
- Jaegil Kim
  - Broad Institute of MIT and Harvard, Cambridge, MA, USA
- Kent W Mouw
  - Department of Radiation Oncology, Brigham & Women's Hospital, Dana-Farber Cancer Institute, Boston, MA, USA
  - Harvard Medical School, Boston, MA, USA
- Paz Polak
  - Broad Institute of MIT and Harvard, Cambridge, MA, USA
  - Harvard Medical School, Boston, MA, USA
  - Department of Pathology, Massachusetts General Hospital, Boston, MA, USA
  - Cancer Center, Massachusetts General Hospital, Boston, MA, USA
- Lior Z Braunstein
  - Broad Institute of MIT and Harvard, Cambridge, MA, USA
  - Harvard Medical School, Boston, MA, USA
- Atanas Kamburov
  - Broad Institute of MIT and Harvard, Cambridge, MA, USA
  - Harvard Medical School, Boston, MA, USA
  - Department of Pathology, Massachusetts General Hospital, Boston, MA, USA
  - Cancer Center, Massachusetts General Hospital, Boston, MA, USA
- David J Kwiatkowski
  - Harvard Medical School, Boston, MA, USA
  - Division of Pulmonary Medicine, Brigham & Women's Hospital, Boston, MA, USA
- Jonathan E Rosenberg
  - Genitourinary Oncology Service, Department of Medicine, Memorial Sloan Kettering Cancer Center, New York, NY, USA
- Eliezer M Van Allen
  - Broad Institute of MIT and Harvard, Cambridge, MA, USA
  - Harvard Medical School, Boston, MA, USA
  - Department of Medical Oncology, Dana-Farber Cancer Institute, Boston, MA, USA
- Alan D'Andrea
  - Department of Radiation Oncology, Brigham & Women's Hospital, Dana-Farber Cancer Institute, Boston, MA, USA
  - Harvard Medical School, Boston, MA, USA
  - Center for DNA Damage and Repair, Dana-Farber Cancer Institute, Boston, MA, USA
- Gad Getz
  - Broad Institute of MIT and Harvard, Cambridge, MA, USA
  - Harvard Medical School, Boston, MA, USA
  - Department of Pathology, Massachusetts General Hospital, Boston, MA, USA
  - Cancer Center, Massachusetts General Hospital, Boston, MA, USA
|
46
|
Mutation Detection in an Antibody-Producing Chinese Hamster Ovary Cell Line by Targeted RNA Sequencing. Biomed Res Int 2016; 2016:8356435. [PMID: 27088091 PMCID: PMC4818804 DOI: 10.1155/2016/8356435] [Citation(s) in RCA: 1] [Impact Index Per Article: 0.1] [Received: 11/18/2015] [Revised: 02/04/2016] [Accepted: 02/21/2016] [Indexed: 01/04/2023]
Abstract
Chinese hamster ovary (CHO) cells have been used widely in the pharmaceutical industry for production of biological therapeutics including monoclonal antibodies (mAb). The integrity of the gene of interest and the accuracy of the relay of genetic information impact product quality and patient safety. Here we employed next-generation sequencing, particularly RNA-seq, and developed a method to systematically analyze the mutation rate of the mRNA of CHO cell lines producing a mAb. The effect of an extended culturing period to mimic the scale of cell expansion in a manufacturing process and varying selection pressure in the cell culture were also closely examined.
|
47
|
Haradhvala NJ, Polak P, Stojanov P, Covington KR, Shinbrot E, Hess JM, Rheinbay E, Kim J, Maruvka YE, Braunstein LZ, Kamburov A, Hanawalt PC, Wheeler DA, Koren A, Lawrence MS, Getz G. Mutational Strand Asymmetries in Cancer Genomes Reveal Mechanisms of DNA Damage and Repair. Cell 2016; 164:538-49. [PMID: 26806129 DOI: 10.1016/j.cell.2015.12.050] [Citation(s) in RCA: 288] [Impact Index Per Article: 36.0] [Received: 11/13/2015] [Revised: 12/21/2015] [Accepted: 12/24/2015] [Indexed: 12/20/2022]
Abstract
Mutational processes constantly shape the somatic genome, leading to immunity, aging, cancer, and other diseases. When cancer is the outcome, we are afforded a glimpse into these processes by the clonal expansion of the malignant cell. Here, we characterize a less explored layer of the mutational landscape of cancer: mutational asymmetries between the two DNA strands. Analyzing whole-genome sequences of 590 tumors from 14 different cancer types, we reveal widespread asymmetries across mutagenic processes, with transcriptional ("T-class") asymmetry dominating UV-, smoking-, and liver-cancer-associated mutations and replicative ("R-class") asymmetry dominating POLE-, APOBEC-, and MSI-associated mutations. We report a striking phenomenon of transcription-coupled damage (TCD) on the non-transcribed DNA strand and provide evidence that APOBEC mutagenesis occurs on the lagging-strand template during DNA replication. As more genomes are sequenced, studying and classifying their asymmetries will illuminate the underlying biological mechanisms of DNA damage and repair.
Affiliation(s)
- Nicholas J Haradhvala
  - Massachusetts General Hospital Cancer Center and Department of Pathology, 55 Fruit Street, Boston, MA 02114, USA
  - Broad Institute of Harvard and MIT, 415 Main Street, Cambridge, MA 02142, USA
- Paz Polak
  - Massachusetts General Hospital Cancer Center and Department of Pathology, 55 Fruit Street, Boston, MA 02114, USA
  - Broad Institute of Harvard and MIT, 415 Main Street, Cambridge, MA 02142, USA
  - Harvard Medical School, 25 Shattuck Street, Boston, MA 02115, USA
- Petar Stojanov
  - Carnegie Mellon University School of Computer Science, 5000 Forbes Avenue, Pittsburgh, PA 15213, USA
- Kyle R Covington
  - Baylor College of Medicine, 1 Baylor Plaza, Houston, TX 77030, USA
- Eve Shinbrot
  - Baylor College of Medicine, 1 Baylor Plaza, Houston, TX 77030, USA
- Julian M Hess
  - Broad Institute of Harvard and MIT, 415 Main Street, Cambridge, MA 02142, USA
- Esther Rheinbay
  - Massachusetts General Hospital Cancer Center and Department of Pathology, 55 Fruit Street, Boston, MA 02114, USA
  - Broad Institute of Harvard and MIT, 415 Main Street, Cambridge, MA 02142, USA
- Jaegil Kim
  - Broad Institute of Harvard and MIT, 415 Main Street, Cambridge, MA 02142, USA
- Yosef E Maruvka
  - Massachusetts General Hospital Cancer Center and Department of Pathology, 55 Fruit Street, Boston, MA 02114, USA
  - Broad Institute of Harvard and MIT, 415 Main Street, Cambridge, MA 02142, USA
- Lior Z Braunstein
  - Broad Institute of Harvard and MIT, 415 Main Street, Cambridge, MA 02142, USA
- Atanas Kamburov
  - Massachusetts General Hospital Cancer Center and Department of Pathology, 55 Fruit Street, Boston, MA 02114, USA
  - Broad Institute of Harvard and MIT, 415 Main Street, Cambridge, MA 02142, USA
  - Harvard Medical School, 25 Shattuck Street, Boston, MA 02115, USA
- Philip C Hanawalt
  - Stanford University Department of Biology, 450 Serra Mall, Stanford, CA 94305, USA
- David A Wheeler
  - Baylor College of Medicine, 1 Baylor Plaza, Houston, TX 77030, USA
- Amnon Koren
  - Broad Institute of Harvard and MIT, 415 Main Street, Cambridge, MA 02142, USA
  - Cornell University Department of Molecular Biology and Genetics, 526 Campus Road, Ithaca, NY 14853, USA
- Michael S Lawrence
  - Broad Institute of Harvard and MIT, 415 Main Street, Cambridge, MA 02142, USA
- Gad Getz
  - Massachusetts General Hospital Cancer Center and Department of Pathology, 55 Fruit Street, Boston, MA 02114, USA
  - Broad Institute of Harvard and MIT, 415 Main Street, Cambridge, MA 02142, USA
  - Harvard Medical School, 25 Shattuck Street, Boston, MA 02115, USA
|
48
|
Gao Z, Wyman MJ, Sella G, Przeworski M. Interpreting the Dependence of Mutation Rates on Age and Time. PLoS Biol 2016; 14:e1002355. [PMID: 26761240 PMCID: PMC4711947 DOI: 10.1371/journal.pbio.1002355] [Citation(s) in RCA: 74] [Impact Index Per Article: 9.3] [Received: 08/31/2015] [Accepted: 12/11/2015] [Indexed: 01/06/2023] Open
Abstract
Mutations can originate from the chance misincorporation of nucleotides during DNA replication or from DNA lesions that arise between replication cycles and are not repaired correctly. We introduce a model that relates the source of mutations to their accumulation with cell divisions, providing a framework for understanding how mutation rates depend on sex, age, and cell division rate. We show that the accrual of mutations should track cell divisions not only when mutations are replicative in origin but also when they are non-replicative and repaired efficiently. One implication is that observations from diverse fields that to date have been interpreted as pointing to a replicative origin of most mutations could instead reflect the accumulation of mutations arising from endogenous reactions or exogenous mutagens. We further find that only mutations that arise from inefficiently repaired lesions will accrue according to absolute time; thus, unless life history traits co-vary, the phylogenetic “molecular clock” should not be expected to run steadily across species. Modeling how the source of mutations relates to their rate of accumulation with age, sex, and number of cell divisions helps to explain perplexing observations about germline and somatic mutations. We relate how mutations arise to how they accumulate in different sexes, with age and with cell division. This model provides a single framework within which to interpret emerging results from evolutionary biology, human genetics, and cancer genetics. We show that the accrual of mutations should track cell divisions not only when mutations originate during DNA replication but also when they arise through non-replicative mechanisms and are repaired efficiently. This realization means that previous observations of correlations between mutation and cell division rates actually provide little support to the commonly held belief that most germline and somatic mutations arise from replication errors. 
We further find that only mutations that arise from inefficiently repaired lesions will accrue according to absolute time; thus, without covariation in life history traits, the phylogenetic “molecular clock” should not be expected to run at constant rates across species.
Affiliation(s)
- Ziyue Gao
  - Committee on Genetics, Genomics and Systems Biology, University of Chicago, Chicago, Illinois, United States of America
- Minyoung J. Wyman
  - Department of Biological Sciences, Columbia University, New York, New York, United States of America
- Guy Sella
  - Department of Biological Sciences, Columbia University, New York, New York, United States of America
- Molly Przeworski
  - Department of Biological Sciences, Columbia University, New York, New York, United States of America
  - Department of Systems Biology, Columbia University, New York, New York, United States of America
|
49
|
Nakatani Y, Mello CC, Hashimoto SI, Shimada A, Nakamura R, Tsukahara T, Qu W, Yoshimura J, Suzuki Y, Sugano S, Takeda H, Fire A, Morishita S. Associations between nucleosome phasing, sequence asymmetry, and tissue-specific expression in a set of inbred Medaka species. BMC Genomics 2015; 16:978. [PMID: 26584643 PMCID: PMC4653950 DOI: 10.1186/s12864-015-2198-5] [Citation(s) in RCA: 4] [Impact Index Per Article: 0.4] [Received: 09/17/2015] [Accepted: 11/07/2015] [Indexed: 11/10/2022] Open
Abstract
BACKGROUND: Transcription start sites (TSSs) with pronounced and phased nucleosome arrays downstream and nucleosome-depleted regions upstream of TSSs are observed in various species. RESULTS: We have characterized sequence variation and expression properties of this set of TSSs (which we call "Nucleocyclic TSSs") using germline and somatic cells of three medaka (Oryzias latipes) inbred isolates from different locations. We found nucleocyclic TSSs in medaka to be associated with higher gene expression and characterized by a clear boundary in sequence composition with potentially nucleosome-destabilizing A/T-enrichment upstream (p < 10^-60) and nucleosome-accommodating C/G-enrichment downstream (p < 10^-40) that was highly conserved from an ancestor. A substantial genetic distance between the strains facilitated the in-depth analysis of patterns of fixed mutations, revealing a localization-specific equilibrium between the rates of distinct mutation categories that would serve to maintain the conserved sequence anisotropy around TSSs. Downstream of nucleocyclic TSSs, C to T, T to C, and other mutation rates on the sense strand increased around first nucleosome dyads and decreased around first linkers, which contrasted with genomewide mutational patterns around nucleosomes (p < 5%). C to T rates are higher than G to A rates around nucleosomes associated with germline nucleocyclic TSS sites (p < 5%), potentially due to the asymmetric effect of transcription-coupled repair. CONCLUSIONS: Our results demonstrate an atypical evolutionary process surrounding nucleocyclic TSSs.
Affiliation(s)
- Yoichiro Nakatani
- Department of Computational Biology and Medical Sciences, Graduate School of Frontier Sciences, The University of Tokyo, Kashiwa, 277-0882, Japan.
- Cecilia C Mello
- Department of Pathology, School of Medicine, Stanford University, Stanford, CA, 94305-5324, USA.
- Shin-Ichi Hashimoto
- Graduate School of Medical Sciences, Kanazawa University, Kanazawa, 920-1192, Japan.
- Atsuko Shimada
- Department of Biological Sciences, Graduate School of Science, The University of Tokyo, Tokyo, 113-0033, Japan.
- Ryohei Nakamura
- Department of Biological Sciences, Graduate School of Science, The University of Tokyo, Tokyo, 113-0033, Japan.
- Tatsuya Tsukahara
- Department of Biological Sciences, Graduate School of Science, The University of Tokyo, Tokyo, 113-0033, Japan.
- Wei Qu
- Department of Computational Biology and Medical Sciences, Graduate School of Frontier Sciences, The University of Tokyo, Kashiwa, 277-0882, Japan.
- Jun Yoshimura
- Department of Computational Biology and Medical Sciences, Graduate School of Frontier Sciences, The University of Tokyo, Kashiwa, 277-0882, Japan.
- Yutaka Suzuki
- Department of Medical Genome Sciences, Graduate School of Frontier Sciences, The University of Tokyo, Tokyo, 108-8639, Japan.
- Sumio Sugano
- Department of Medical Genome Sciences, Graduate School of Frontier Sciences, The University of Tokyo, Tokyo, 108-8639, Japan.
- Hiroyuki Takeda
- Department of Biological Sciences, Graduate School of Science, The University of Tokyo, Tokyo, 113-0033, Japan.
- Andrew Fire
- Departments of Pathology and Genetics, School of Medicine, Stanford University, Stanford, CA, 94305-5324, USA.
- Shinichi Morishita
- Department of Computational Biology and Medical Sciences, Graduate School of Frontier Sciences, The University of Tokyo, Kashiwa, 277-0882, Japan.
|
50
|
Price N, Graur D. Are Synonymous Sites in Primates and Rodents Functionally Constrained? J Mol Evol 2015; 82:51-64. [PMID: 26563252 DOI: 10.1007/s00239-015-9719-3] [Citation(s) in RCA: 5] [Impact Index Per Article: 0.6] [Received: 01/07/2015] [Accepted: 11/04/2015] [Indexed: 11/28/2022]
Abstract
It has been claimed that synonymous sites in mammals are under selective constraint. Furthermore, in many studies the selective constraint at such sites in primates was claimed to be more stringent than that in rodents. Given the larger effective population sizes in rodents than in primates, the theoretical expectation is that selection in rodents would be more effective than that in primates. To resolve this contradiction between expectations and observations, we used processed pseudogenes as a model for strict neutral evolution, and estimated selective constraint on synonymous sites using the rate of substitution at pseudosynonymous and pseudononsynonymous sites in pseudogenes as the neutral expectation. After controlling for the effects of GC content, our results were similar to those from previous studies, i.e., synonymous sites in primates exhibited evidence for higher selective constraint than those in rodents. Specifically, our results indicated that in primates up to 24% of synonymous sites could be under purifying selection, while in rodents synonymous sites evolved neutrally. To further control for shifts in GC content, we estimated selective constraint at fourfold degenerate sites using a maximum parsimony approach. This allowed us to estimate selective constraint using mutational patterns that cause a shift in GC content (GT ↔ TG, CT ↔ TC, GA ↔ AG, and CA ↔ AC) and ones that do not (AT ↔ TA and CG ↔ GC). Using this approach, we found that synonymous sites evolve neutrally in both primates and rodents. Apparent deviations from neutrality were caused by a higher rate of C → A and C → T mutations in pseudogenes. Such differences are most likely caused by the shift in GC content experienced by pseudogenes. We conclude that previous estimates according to which 20-40% of synonymous sites in primates were under selective constraint were most likely artifacts of the biased pattern of mutation.
Affiliation(s)
- Nicholas Price
- Department of Bioagricultural Sciences and Pest Management, Colorado State University, Fort Collins, CO, 80523, USA.
- Dan Graur
- Department of Biology and Biochemistry, University of Houston, Houston, TX, 77204-5001, USA.
|