1
Luo N, Huang Q, Dong L, Liu W, Song J, Sun H, Wu H, Gao Y, Yi C. Near-cognate tRNAs increase the efficiency and precision of pseudouridine-mediated readthrough of premature termination codons. Nat Biotechnol 2025; 43:114-123. [PMID: 38448662 DOI: 10.1038/s41587-024-02165-8] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 07/13/2023] [Accepted: 02/02/2024] [Indexed: 03/08/2024]
Abstract
Programmable RNA pseudouridylation has emerged as a new type of RNA base editor to suppress premature termination codons (PTCs) that can lead to truncated and nonfunctional proteins. However, current methods to correct disease-associated PTCs suffer from low efficiency and limited precision. Here we develop RESTART v3, which uses near-cognate tRNAs to improve the readthrough efficiency of pseudouridine-modified PTCs. We show an average of ~5-fold (range: 2.1- to 9.5-fold) higher editing efficiency than RESTART v2 in cultured cells and achieve functional PTC readthrough in disease cell models of cystic fibrosis and Hurler syndrome. Furthermore, RESTART v3 enables accurate incorporation of the original amino acid for nearly half of the PTC sites, considering the naturally occurring frequencies of sense-to-nonsense codons, without affecting normal termination codons. Although off-target sites were detected, we did not observe changes to the coding information or the expression level of transcripts, and the overall natural tRNA abundance remained constant.
Affiliation(s)
- Nan Luo
  - State Key Laboratory of Protein and Plant Gene Research, School of Life Sciences, Peking University, Beijing, China
- Qiang Huang
  - State Key Laboratory of Protein and Plant Gene Research, School of Life Sciences, Peking University, Beijing, China
- Liting Dong
  - State Key Laboratory of Protein and Plant Gene Research, School of Life Sciences, Peking University, Beijing, China
  - Peking-Tsinghua Center for Life Sciences, Peking University, Beijing, China
- Wenqing Liu
  - School of Life Sciences, Tsinghua University, Beijing, China
  - Tsinghua-Peking Joint Center for Life Sciences, Tsinghua University, Beijing, China
- Jinghui Song
  - State Key Laboratory of Protein and Plant Gene Research, School of Life Sciences, Peking University, Beijing, China
- Hanxiao Sun
  - State Key Laboratory of Protein and Plant Gene Research, School of Life Sciences, Peking University, Beijing, China
- Hao Wu
  - State Key Laboratory of Protein and Plant Gene Research, School of Life Sciences, Peking University, Beijing, China
  - Peking-Tsinghua Center for Life Sciences, Peking University, Beijing, China
  - Academy for Advanced Interdisciplinary Studies, Peking University, Beijing, China
- Yuan Gao
  - Modit Therapeutics Beijing Limited, K115 Beijing ATLATL International Innovation Platform, Beijing, China
- Chengqi Yi
  - State Key Laboratory of Protein and Plant Gene Research, School of Life Sciences, Peking University, Beijing, China
  - Peking-Tsinghua Center for Life Sciences, Peking University, Beijing, China
  - Department of Chemical Biology and Synthetic and Functional Biomolecules Center, College of Chemistry and Molecular Engineering, Peking University, Beijing, China
  - Beijing Advanced Center of RNA Biology (BEACON), Peking University, Beijing, China
2
Wang Y, Chen KP. C and G are frequently mutated into T and A in coding regions of human genes. Mol Genet Genomics 2024; 299:23. [PMID: 38431687 DOI: 10.1007/s00438-024-02118-5] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 10/05/2023] [Accepted: 01/24/2024] [Indexed: 03/05/2024]
Abstract
Nucleotide mutations in human genes have long been a hot subject for study because some of them may lead to severe human diseases. Understanding the general mutational process and evolutionary trend of human genes could help answer such questions as why certain diseases occur and what challenges we face in protecting human health. In this study, we performed a statistical analysis of 89,895 single-nucleotide variations identified in coding regions of 18,339 human genes. The results show that C and G are frequently mutated into T and A in human genes. C/G (C or G)-to-T/A mutations lead to a reduction of hydrogen bonds in double-stranded DNA because C-G and T-A base pairs are maintained by three and two hydrogen bonds, respectively. C-to-T and G-to-A mutations occur predominantly in human genes because they not only reduce hydrogen bonds but are also transition mutations. Reduction of hydrogen bonds could reduce energy consumption not only in separating the double strands of mutated DNA for transcription and replication but also in disrupting the stem-loop structure of mutated mRNA for translation. We therefore propose that reducing hydrogen bonds (and thus energy consumption in gene expression) is one of the driving forces for nucleotide mutation. Moreover, codon mutation is positively correlated with codon content, suggesting that most mutations are not targeted at changing any specific codons (amino acids) but merely reduce hydrogen bonds. Our study provides an example of utilizing single-nucleotide variation data to infer the evolutionary trend of human genes, which can serve as a reference for similar studies in other organisms.
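The two properties the abstract attributes to C-to-T and G-to-A changes (fewer hydrogen bonds, and transition rather than transversion) can be checked with a few lines of Python. This is only an illustrative sketch, not the authors' analysis; the function name `classify_snv` and the lookup tables are assumptions introduced here.

```python
# Watson-Crick hydrogen bonds per base pair, as stated in the abstract:
# C-G pairs are held by three bonds, T-A pairs by two.
H_BONDS = {"A": 2, "T": 2, "C": 3, "G": 3}
PURINES = {"A", "G"}


def classify_snv(ref: str, alt: str) -> tuple[str, int]:
    """Return (substitution class, change in hydrogen bonds) for ref -> alt.

    Hypothetical helper for illustration only.
    """
    if ref == alt or ref not in H_BONDS or alt not in H_BONDS:
        raise ValueError("ref and alt must be two different DNA bases")
    # A transition stays within purines or within pyrimidines;
    # a transversion crosses between the two classes.
    same_ring_class = (ref in PURINES) == (alt in PURINES)
    kind = "transition" if same_ring_class else "transversion"
    return kind, H_BONDS[alt] - H_BONDS[ref]


for ref, alt in [("C", "T"), ("G", "A"), ("C", "A"), ("G", "T")]:
    print(f"{ref}>{alt}", classify_snv(ref, alt))
```

All four C/G-to-T/A changes lose one hydrogen bond, but only C>T and G>A are also transitions, which is the combination the abstract singles out.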
Affiliation(s)
- Yong Wang
  - School of Food and Biological Engineering, Jiangsu University, 301 Xuefu Road, Zhenjiang, 212013, China
- Ke-Ping Chen
  - School of Life Sciences, Jiangsu University, 301 Xuefu Road, Zhenjiang, 212013, China
3
Trexler M, Bányai L, Kerekes K, Patthy L. Evolution of termination codons of proteins and the TAG-TGA paradox. Sci Rep 2023; 13:14294. [PMID: 37653005 PMCID: PMC10471768 DOI: 10.1038/s41598-023-41410-z] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 04/17/2023] [Accepted: 08/25/2023] [Indexed: 09/02/2023]
Abstract
In most eukaryotes and prokaryotes, TGA is used at a significantly higher frequency than TAG as the termination codon of protein-coding genes. Although this phenomenon was recognized several years ago, there is no generally accepted explanation for the TAG-TGA paradox. Our analyses of human mutation data revealed that, out of the eighteen sense codons that can give rise to a nonsense codon by single base substitution, the CGA codon is exceptional: it gives rise to the TGA stop codon at an order of magnitude higher rate than the other codons. Here we propose that the TAG-TGA paradox is due to methylation and hypermutability of CpG dinucleotides. In harmony with this explanation, we show that the coding genomes of organisms with strong CpG methylation have a significant bias for TGA, whereas those from organisms that lack CpG methylation use TGA and TAG termination codons with similar probability.
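The count of eighteen sense codons can be reproduced by brute-force enumeration under the standard genetic code. The sketch below is an illustration rather than the authors' pipeline; the names `stops_reachable` and `one_step` are assumptions introduced here.

```python
from itertools import product

BASES = "ACGT"
STOP_CODONS = {"TAA", "TAG", "TGA"}  # standard genetic code


def stops_reachable(codon: str) -> set[str]:
    """Stop codons reachable from `codon` by exactly one base substitution."""
    reachable = set()
    for pos, base in product(range(3), BASES):
        if base != codon[pos]:
            mutant = codon[:pos] + base + codon[pos + 1:]
            if mutant in STOP_CODONS:
                reachable.add(mutant)
    return reachable


# All 61 sense codons, then the subset that is one substitution from a stop.
sense_codons = [
    "".join(c) for c in product(BASES, repeat=3) if "".join(c) not in STOP_CODONS
]
one_step = {c: stops_reachable(c) for c in sense_codons if stops_reachable(c)}

print(len(one_step))      # 18
print(one_step["CGA"])    # {'TGA'}
```

CGA is the only one of these codons whose route to a stop (C-to-T at the first position, giving TGA) falls on the C of a CpG dinucleotide contained within the codon itself, which is consistent with the CpG-hypermutability explanation proposed in the abstract.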
Affiliation(s)
- Mária Trexler
  - Institute of Enzymology, Research Centre for Natural Sciences, Budapest, 1117, Hungary
- László Bányai
  - Institute of Enzymology, Research Centre for Natural Sciences, Budapest, 1117, Hungary
- Krisztina Kerekes
  - Institute of Enzymology, Research Centre for Natural Sciences, Budapest, 1117, Hungary
- László Patthy
  - Institute of Enzymology, Research Centre for Natural Sciences, Budapest, 1117, Hungary
4
Nov Y. Learning Context-Dependent DNA Mutation Patterns in Error-Prone Polymerase Chain Reaction. Biochemistry 2023; 62:345-350. [PMID: 36153985 DOI: 10.1021/acs.biochem.2c00292] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 02/02/2023]
Abstract
We present a novel statistical learning method for studying context-dependent error rates in error-prone polymerase chain reaction (PCR) experiments. We demonstrate the method by applying it to error-prone PCR sequencing data and show how it may be exploited to improve the evolvability of genes in protein engineering.
Affiliation(s)
- Yuval Nov
  - Department of Statistics, University of Haifa, Haifa 3498838, Israel
5
Tenaillon O, Matic I. L’impact des mutations neutres sur l’évolvabilité et l’évolution des génomes [The impact of neutral mutations on genome evolvability and evolution]. Med Sci (Paris) 2022; 38:777-785. [DOI: 10.1051/medsci/2022122] [Citation(s) in RCA: 1] [Impact Index Per Article: 0.3] [Indexed: 11/07/2022]
Abstract
Beneficial mutations with strong effects are rare, and deleterious mutations are eliminated by natural selection. Most of the mutations that accumulate in genomes therefore have very weak or no selective effects; they are called neutral mutations. Over the past two decades, it has been shown that mutations, even when they have no effect on the fitness of organisms, affect their evolvability by giving access to new phenotypes through mutations that arise later and that would not otherwise have been available. In addition to this effect, many neutral mutations, regardless of their selective effects, can affect the mutability of neighboring DNA sequences and modulate the efficiency of homologous recombination. Such mutations do not change the spectrum of accessible phenotypes but rather the rate at which new phenotypes are produced, a process that has long-term and potentially also short-term consequences, in connection with the emergence of cancers.
6
Lai D, Gade M, Yang E, Koh HY, Lu J, Walley NM, Buckley AF, Sands TT, Akman CI, Mikati MA, McKhann GM, Goldman JE, Canoll P, Alexander AL, Park KL, Von Allmen GK, Rodziyevska O, Bhattacharjee MB, Lidov HGW, Vogel H, Grant GA, Porter BE, Poduri AH, Crino PB, Heinzen EL. Somatic variants in diverse genes leads to a spectrum of focal cortical malformations. Brain 2022; 145:2704-2720. [PMID: 35441233 PMCID: PMC9612793 DOI: 10.1093/brain/awac117] [Citation(s) in RCA: 44] [Impact Index Per Article: 14.7] [Received: 12/16/2021] [Revised: 02/19/2022] [Accepted: 03/13/2022] [Indexed: 11/14/2022]
Abstract
Post-zygotically acquired genetic variants, or somatic variants, that arise during cortical development have emerged as important causes of focal epilepsies, particularly those due to malformations of cortical development. Pathogenic somatic variants have been identified in many genes within the PI3K-AKT-mTOR-signalling pathway in individuals with hemimegalencephaly and focal cortical dysplasia (type II), and more recently in SLC35A2 in individuals with focal cortical dysplasia (type I) or non-dysplastic epileptic cortex. Given the expanding role of somatic variants across different brain malformations, we sought to delineate the landscape of somatic variants in a large cohort of patients who underwent epilepsy surgery with hemimegalencephaly or focal cortical dysplasia. We evaluated samples from 123 children with hemimegalencephaly (n = 16), focal cortical dysplasia type I and related phenotypes (n = 48), focal cortical dysplasia type II (n = 44), or focal cortical dysplasia type III (n = 15). We performed high-depth exome sequencing in brain tissue-derived DNA from each case and identified somatic single nucleotide, indel and large copy number variants. In 75% of individuals with hemimegalencephaly and 29% with focal cortical dysplasia type II, we identified pathogenic variants in PI3K-AKT-mTOR pathway genes. Four of 48 cases with focal cortical dysplasia type I (8%) had a likely pathogenic variant in SLC35A2. While no other gene had multiple disease-causing somatic variants across the focal cortical dysplasia type I cohort, four individuals in this group had a single pathogenic or likely pathogenic somatic variant in CASK, KRAS, NF1 and NIPBL, genes previously associated with neurodevelopmental disorders. 
No rare pathogenic or likely pathogenic somatic variants in any neurological disease genes like those identified in the focal cortical dysplasia type I cohort were found in 63 neurologically normal controls (P = 0.017), suggesting a role for these novel variants. We also identified a somatic loss-of-function variant in the known epilepsy gene, PCDH19, present in a small number of alleles in the dysplastic tissue from a female patient with focal cortical dysplasia IIIa with hippocampal sclerosis. In contrast to focal cortical dysplasia type II, neither focal cortical dysplasia type I nor III had somatic variants in genes that converge on a unifying biological pathway, suggesting greater genetic heterogeneity compared to type II. Importantly, we demonstrate that focal cortical dysplasia types I, II and III are associated with somatic gene variants across a broad range of genes, many associated with epilepsy in clinical syndromes caused by germline variants, as well as including some not previously associated with radiographically evident cortical brain malformations.
Affiliation(s)
- Dulcie Lai
  - Division of Pharmacology and Experimental Therapeutics, Eshelman School of Pharmacy, University of North Carolina at Chapel Hill, Chapel Hill, NC 27599, USA
- Meethila Gade
  - Division of Pharmacology and Experimental Therapeutics, Eshelman School of Pharmacy, University of North Carolina at Chapel Hill, Chapel Hill, NC 27599, USA
- Edward Yang
  - Department of Radiology, Boston Children's Hospital, Harvard Medical School, Boston, MA 02115, USA
- Hyun Yong Koh
  - Division of Epilepsy and Clinical Neurophysiology, Department of Neurology, Boston Children's Hospital, Boston, MA 02115, USA
  - Epilepsy Genetics Program, Department of Neurology, Boston Children's Hospital and Harvard Medical School, Boston, MA 02115, USA
- Jinfeng Lu
  - Division of Pharmacology and Experimental Therapeutics, Eshelman School of Pharmacy, University of North Carolina at Chapel Hill, Chapel Hill, NC 27599, USA
- Nicole M Walley
  - Division of Medical Genetics, Department of Pediatrics, Duke University School of Medicine, Durham, NC 27710, USA
- Anne F Buckley
  - Department of Pathology, Duke University Medical Center, Durham, NC 27710, USA
- Tristan T Sands
  - Institute for Genomic Medicine, Columbia University Medical Center, New York, NY 10032, USA
  - Department of Neurology, Columbia University Medical Center, New York, NY 10032, USA
- Cigdem I Akman
  - Department of Neurology, Columbia University Medical Center, New York, NY 10032, USA
- Mohamad A Mikati
  - Department of Neurobiology, Duke University, Durham, NC 27708, USA
  - Division of Pediatric Neurology, Duke University Medical Center, Durham, NC 27710, USA
- Guy M McKhann
  - Department of Neurosurgery, Columbia University, New York Presbyterian Hospital, New York, NY 10032, USA
- James E Goldman
  - Department of Pathology and Cell Biology, Columbia University, New York, NY 10032, USA
- Peter Canoll
  - Department of Pathology and Cell Biology, Columbia University, New York, NY 10032, USA
- Allyson L Alexander
  - Department of Neurosurgery, University of Colorado School of Medicine, Aurora, CO 80045, USA
- Kristen L Park
  - Department of Pediatrics and Neurology, University of Colorado School of Medicine, Aurora, CO 80045, USA
- Gretchen K Von Allmen
  - Department of Neurology, McGovern Medical School, Houston, TX 77030, USA
  - Division of Child Neurology, Department of Pediatrics, McGovern Medical School, Houston, TX 77030, USA
- Olga Rodziyevska
  - Division of Child Neurology, Department of Pediatrics, McGovern Medical School, Houston, TX 77030, USA
- Hart G W Lidov
  - Department of Pathology, Boston Children's Hospital, Harvard Medical School, Boston, MA 02115, USA
- Hannes Vogel
  - Department of Pathology, Stanford University, School of Medicine, Stanford, CA 94305, USA
- Gerald A Grant
  - Department of Neurosurgery, Lucile Packard Children's Hospital at Stanford, School of Medicine, Stanford, CA 94305, USA
- Brenda E Porter
  - Department of Neurology and Neurological Sciences, Stanford University, School of Medicine, Stanford, CA 94305, USA
- Annapurna H Poduri
  - Division of Epilepsy and Clinical Neurophysiology, Department of Neurology, Boston Children's Hospital, Boston, MA 02115, USA
  - Epilepsy Genetics Program, Department of Neurology, Boston Children's Hospital and Harvard Medical School, Boston, MA 02115, USA
- Peter B Crino
  - Department of Neurology, University of Maryland School of Medicine, Baltimore, MD 21201, USA
- Erin L Heinzen
  - Division of Pharmacology and Experimental Therapeutics, Eshelman School of Pharmacy, University of North Carolina at Chapel Hill, Chapel Hill, NC 27599, USA
  - Department of Genetics, School of Medicine, University of North Carolina at Chapel Hill, Chapel Hill, NC 27599, USA
7
Shepherd MJ, Horton JS, Taylor TB. A near-deterministic mutational hotspot in Pseudomonas fluorescens is constructed by multiple interacting genomic features. Mol Biol Evol 2022; 39:msac132. [PMID: 35707979 PMCID: PMC9234803 DOI: 10.1093/molbev/msac132] [Citation(s) in RCA: 4] [Impact Index Per Article: 1.3] [Received: 04/05/2022] [Revised: 05/27/2022] [Accepted: 06/06/2022] [Indexed: 01/12/2023]
Abstract
Mutation, whilst stochastic, is frequently biased toward certain loci. When combined with selection, this results in highly repeatable and predictable evolutionary outcomes. Immotile variants of the bacterium Pseudomonas fluorescens (SBW25) possess a 'mutational hotspot' that facilitates repeated occurrences of an identical de novo single nucleotide polymorphism when re-evolving motility, where ≥95% of independent lines fix the mutation ntrB A289C. Identifying hotspots of similar potency in other genes and genomic backgrounds would prove valuable for predictive evolutionary models, but to do so we must understand the genomic features that enable such a hotspot to form. Here we reveal that genomic location, local nucleotide sequence, gene strandedness and presence of mismatch repair proteins operate in combination to facilitate the formation of this mutational hotspot. Our study therefore provides a framework for utilising genomic features to predict and identify hotspot positions capable of enforcing near-deterministic evolution.
Affiliation(s)
- M J Shepherd
  - Milner Centre for Evolution, Department of Biology & Biochemistry, University of Bath, Claverton Down, Bath BA2 7AY, United Kingdom
- J S Horton
  - Milner Centre for Evolution, Department of Biology & Biochemistry, University of Bath, Claverton Down, Bath BA2 7AY, United Kingdom
- T B Taylor
  - Milner Centre for Evolution, Department of Biology & Biochemistry, University of Bath, Claverton Down, Bath BA2 7AY, United Kingdom
8
Li B, Roden DM, Capra JA. The 3D mutational constraint on amino acid sites in the human proteome. Nat Commun 2022; 13:3273. [PMID: 35672414 PMCID: PMC9174330 DOI: 10.1038/s41467-022-30936-x] [Citation(s) in RCA: 2] [Impact Index Per Article: 0.7] [Received: 11/05/2021] [Accepted: 05/19/2022] [Indexed: 12/16/2022]
Abstract
Quantification of the tolerance of protein sites to genetic variation has become a cornerstone of variant interpretation. We hypothesize that the constraint on missense variation at individual amino acid sites is largely shaped by direct interactions with 3D neighboring sites. To quantify this constraint, we introduce a framework called COntact Set MISsense tolerance (or COSMIS) and comprehensively map the landscape of 3D mutational constraint on 6.1 million amino acid sites covering 16,533 human proteins. We show that 3D mutational constraint is pervasive and that the level of constraint is strongly associated with disease relevance both at the site and the protein level. We demonstrate that COSMIS performs significantly better at variant interpretation tasks than other population-based constraint metrics while also providing structural insight into the functional roles of constrained sites. We anticipate that COSMIS will facilitate the interpretation of protein-coding variation in evolution and prioritization of sites for mechanistic investigation.
Affiliation(s)
- Bian Li
  - Department of Biological Sciences, Vanderbilt University, Nashville, TN, 37203, USA
  - Department of Medicine, Vanderbilt University Medical Center, Nashville, TN, 37232, USA
- Dan M Roden
  - Department of Medicine, Vanderbilt University Medical Center, Nashville, TN, 37232, USA
  - Departments of Pharmacology and Biomedical Informatics, Vanderbilt University Medical Center, Nashville, TN, 37232, USA
- John A Capra
  - Department of Biological Sciences, Vanderbilt University, Nashville, TN, 37203, USA
  - Bakar Computational Health Sciences Institute and Department of Epidemiology and Biostatistics, University of California, San Francisco, CA, 94143, USA
9
Abstract
How do mutational biases influence the process of adaptation? A common assumption is that selection alone determines the course of adaptation from abundant preexisting variation. Yet, theoretical work shows broad conditions under which the mutation rate to a given type of variant strongly influences its probability of contributing to adaptation. Here we introduce a statistical approach to analyzing how mutation shapes protein sequence adaptation. Using large datasets from three different species, we show that the mutation spectrum has a proportional influence on the types of changes fixed in adaptation. We also show via computer simulations that a variety of factors can influence how closely the spectrum of adaptive substitutions reflects the spectrum of variants introduced by mutation. Evolutionary adaptation often occurs by the fixation of beneficial mutations. This mode of adaptation can be characterized quantitatively by a spectrum of adaptive substitutions, i.e., a distribution for types of changes fixed in adaptation. Recent work establishes that the changes involved in adaptation reflect common types of mutations, raising the question of how strongly the mutation spectrum shapes the spectrum of adaptive substitutions. We address this question with a codon-based model for the spectrum of adaptive amino acid substitutions, applied to three large datasets covering thousands of amino acid changes identified in natural and experimental adaptation in Saccharomyces cerevisiae, Escherichia coli, and Mycobacterium tuberculosis. Using species-specific mutation spectra based on prior knowledge, we find that the mutation spectrum has a proportional influence on the spectrum of adaptive substitutions in all three species. Indeed, we find that by inferring the mutation rates that best explain the spectrum of adaptive substitutions, we can accurately recover the species-specific mutation spectra. 
However, we also find that the predictive power of the model differs substantially between the three species. To better understand these differences, we use population simulations to explore the factors that influence how closely the spectrum of adaptive substitutions mirrors the mutation spectrum. The results show that the influence of the mutation spectrum decreases with increasing mutational supply (Nμ) and that predictive power is strongly affected by the number and diversity of beneficial mutations.
10
Context dependency of nucleotide probabilities and variants in human DNA. BMC Genomics 2022; 23:87. [PMID: 35100973 PMCID: PMC8802520 DOI: 10.1186/s12864-021-08246-1] [Citation(s) in RCA: 2] [Impact Index Per Article: 0.7] [Received: 08/31/2021] [Accepted: 12/10/2021] [Indexed: 12/20/2022]
Abstract
Background: Genomic DNA has been shaped by mutational processes through evolution. The cellular machinery for error correction and repair has left its marks in the nucleotide composition along with structural and functional constraints. Therefore, the probability of observing a base in a certain position in the human genome is highly context-dependent. Results: Here we develop context-dependent nucleotide models. We first investigate models of nucleotides conditioned on sequence context. We develop a bidirectional Markov model that uses an average of the probability from a Markov model applied to both strands of the sequence and thus depends on up to 14 bases to each side of the nucleotide. We show how the genome predictability varies across different types of genomic regions. Surprisingly, this model can predict a base from its context with an average of more than 50% accuracy. For somatic variants we show a tendency towards higher probability for the variant base than for the reference base. Inspired by DNA substitution models, we develop a model of mutability that estimates a mutation matrix (called the alpha matrix) on top of the nucleotide distribution. The alpha matrix can be estimated from a much smaller context than the nucleotide model, but the final model will still depend on the full context of the nucleotide model. With the bidirectional Markov model of order 14 and an alpha matrix dependent on just one base to each side, we obtain a model that compares well with a model of mutability that estimates mutation probabilities directly conditioned on three nucleotides to each side. For somatic variants in particular, our model fits better than the simpler model. Interestingly, the model is not very sensitive to the size of the context for the alpha matrix. Conclusions: Our study found strong context dependencies of nucleotides in the human genome. The best model uses a context of 14 nucleotides to each side.
Based on these models, a substitution model was constructed that separates into the context model and a matrix dependent on a small context. The model fit somatic variants particularly well.
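One way to read the "bidirectional Markov model" described above is as the average of two order-k predictions for the same position: one from the k bases to its left on the given strand, and one from the k bases to its right, read on the reverse-complement strand. The sketch below follows that reading with a toy order of 2 instead of 14. It is an assumption-laden illustration, not the authors' implementation; the add-one smoothing, the function names and the toy sequence are all introduced here.

```python
from collections import defaultdict

BASES = "ACGT"
COMPLEMENT = str.maketrans("ACGT", "TGCA")


def revcomp(seq: str) -> str:
    """Reverse complement of a DNA string."""
    return seq.translate(COMPLEMENT)[::-1]


def train(seqs, k):
    """Count (k-base context -> next base) on both strands of each sequence."""
    counts = defaultdict(lambda: dict.fromkeys(BASES, 0))
    for seq in seqs:
        for strand in (seq, revcomp(seq)):
            for i in range(k, len(strand)):
                counts[strand[i - k:i]][strand[i]] += 1
    return counts


def cond_prob(counts, context, base, pseudo=1.0):
    """P(base | context) with add-`pseudo` smoothing (an assumption here)."""
    row = counts.get(context, dict.fromkeys(BASES, 0))
    return (row[base] + pseudo) / (sum(row.values()) + 4 * pseudo)


def bidirectional_probs(counts, seq, i, k):
    """Average the forward-strand and reverse-strand predictions for seq[i]."""
    if i < k or i + k >= len(seq):
        raise ValueError("position needs k bases of context on each side")
    left = seq[i - k:i]
    # On the reverse strand, the bases to the right of i precede the
    # complement of seq[i], so they act as its Markov context there.
    right_rc = revcomp(seq[i + 1:i + 1 + k])
    probs = {}
    for b in BASES:
        p_fwd = cond_prob(counts, left, b)
        p_rev = cond_prob(counts, right_rc, b.translate(COMPLEMENT))
        probs[b] = 0.5 * (p_fwd + p_rev)
    return probs


toy = "ACGT" * 4                      # hypothetical training sequence
model = train([toy], k=2)
probs = bidirectional_probs(model, toy, i=6, k=2)
print(max(probs, key=probs.get), probs)
```

Because each of the two conditional distributions sums to one, their average is again a probability distribution over the four bases, so the result can be used directly to score a reference base against a variant base.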
11
Menzies GE, Prior IA, Brancale A, Reed SH, Lewis PD. Carcinogen-induced DNA structural distortion differences in the RAS gene isoforms; the importance of local sequence. BMC Chem 2021; 15:51. [PMID: 34521464 PMCID: PMC8439098 DOI: 10.1186/s13065-021-00777-8] [Citation(s) in RCA: 4] [Impact Index Per Article: 1.0] [Received: 05/05/2021] [Accepted: 08/17/2021] [Indexed: 11/12/2022]
Abstract
Background: Local sequence context is known to have an impact on the mutational pattern seen in cancer. The RAS genes and a smoking carcinogen, benzo[a]pyrene diol epoxide (BPDE), have been utilised to explore these context effects. BPDE is known to form an adduct at the guanines in a number of RAS gene sites: KRAS codons 12, 13 and 14, NRAS codon 12, and HRAS codons 12 and 14. Results: Molecular modelling techniques, along with multivariate analysis, have been utilised to determine the sequence-influenced differences between BPDE-adducted RAS gene sequences as well as the local distortion caused by the adducts. Conclusions: We conclude that G:C > T:A mutations at KRAS codon 12 in the tumours of lung cancer patients (who smoke), proposed to be predominantly caused by BPDE, are due to the interaction of the methyl group at the C5 position of the thymine base in the KRAS sequence with the BPDE carcinogen investigated, which causes increased distortion. We further suggest that methylated cytosine would have a similar effect, showing the importance of methylation in cancer development.
Affiliation(s)
- Georgina E Menzies
  - School of Biosciences and Dementia Research Institute at Cardiff, Cardiff University, Cardiff, CF10 3NX, UK
- Ian A Prior
  - Department of Cellular and Molecular Physiology, Institute of Translational Medicine, University of Liverpool, Liverpool, L69 3BX, UK
- Andrea Brancale
  - School of Pharmacy and Pharmaceutical Sciences, Cardiff University, Cardiff, CF10 3NB, UK
- Simon H Reed
  - Division of Cancer and Genetics, School of Medicine, Cardiff University, Cardiff, CF14 4XN, UK
- Paul D Lewis
  - School of Management, Swansea University Bay Campus, Swansea, SA1 8EN, UK
12
Ling G, Miller D, Nielsen R, Stern A. A Bayesian Framework for Inferring the Influence of Sequence Context on Point Mutations. Mol Biol Evol 2020; 37:893-903. [PMID: 31651955 PMCID: PMC7038660 DOI: 10.1093/molbev/msz248] [Citation(s) in RCA: 5] [Impact Index Per Article: 1.0] [Indexed: 01/14/2023]
Abstract
The probability of point mutations is expected to be highly influenced by the flanking nucleotides that surround them, known as the sequence context. This phenomenon may be mainly attributed to the enzyme that modifies or mutates the genetic material, because most enzymes tend to have specific sequence contexts that dictate their activity. Here, we develop a statistical model that allows for the detection and evaluation of the effects of different sequence contexts on mutation rates from deep population sequencing data. This task is computationally challenging, as the complexity of the model increases exponentially as the context size increases. We established our novel Bayesian method based on sparse model selection methods, with the leading assumption that the number of actual sequence contexts that directly influence mutation rates is minuscule compared with the number of possible sequence contexts. We show that our method is highly accurate on simulated data using pentanucleotide contexts, even when accounting for noisy data. We next analyze empirical population sequencing data from polioviruses and HIV-1 and detect a significant enrichment in sequence contexts associated with deamination by the cellular deaminases ADAR 1/2 and APOBEC3G, respectively. In the current era, where next-generation sequencing data are highly abundant, our approach can be used on any population sequencing data to reveal context-dependent base alterations and may assist in the discovery of novel mutable sites or editing sites.
Affiliation(s)
- Guy Ling
  - School of Molecular Cell Biology and Biotechnology, Tel-Aviv University, Tel-Aviv, Israel
- Danielle Miller
  - School of Molecular Cell Biology and Biotechnology, Tel-Aviv University, Tel-Aviv, Israel
- Rasmus Nielsen
  - Department of Integrative Biology, University of California, Berkeley, Berkeley, CA
  - Department of Statistics, University of California, Berkeley, Berkeley, CA
  - Center for Computational Biology at UC Berkeley (CCB), Berkeley, CA
- Adi Stern
  - School of Molecular Cell Biology and Biotechnology, Tel-Aviv University, Tel-Aviv, Israel
  - Edmond J. Safra Center for Bioinformatics at Tel Aviv University, Tel-Aviv, Israel
13
Liu J, Robinson-Rechavi M. Robust inference of positive selection on regulatory sequences in the human brain. Sci Adv 2020; 6:eabc9863. [PMID: 33246961 PMCID: PMC7695467 DOI: 10.1126/sciadv.abc9863] [Citation(s) in RCA: 13] [Impact Index Per Article: 2.6] [Received: 05/25/2020] [Accepted: 10/16/2020] [Indexed: 05/07/2023]
Abstract
A longstanding hypothesis is that divergence between humans and chimpanzees might have been driven more by regulatory level adaptations than by protein sequence adaptations. This has especially been suggested for regulatory adaptations in the evolution of the human brain. We present a new method to detect positive selection on transcription factor binding sites on the basis of measuring predicted affinity change with a machine learning model of binding. Unlike other methods, this approach requires neither defining a priori neutral sites nor detecting accelerated evolution, thus removing major sources of bias. We scanned the signals of positive selection for CTCF binding sites in 29 human and 11 mouse tissues or cell types. We found that human brain-related cell types have the highest proportion of positive selection. This result is consistent with the view that adaptive evolution to gene regulation has played an important role in evolution of the human brain.
Affiliation(s)
- Jialin Liu
- Department of Ecology and Evolution, University of Lausanne, 1015 Lausanne, Switzerland.
- Swiss Institute of Bioinformatics, 1015 Lausanne, Switzerland
| | - Marc Robinson-Rechavi
- Department of Ecology and Evolution, University of Lausanne, 1015 Lausanne, Switzerland.
- Swiss Institute of Bioinformatics, 1015 Lausanne, Switzerland
|
14
|
Simon H, Huttley G. Quantifying Influences on Intragenomic Mutation Rate. G3 (Bethesda, Md.) 2020; 10:2641-2652. [PMID: 32527747 PMCID: PMC7407452 DOI: 10.1534/g3.120.401335] [Citation(s) in RCA: 4] [Impact Index Per Article: 0.8] [Received: 04/28/2020] [Accepted: 05/28/2020] [Indexed: 12/14/2022]
Abstract
We report work to quantify the impact on the probability of human genome polymorphism both of recombination and of sequence context at different scales. We use population-based analyses of data on human genetic variants obtained from the public Ensembl database. For recombination, we calculate the variance due to recombination and the probability that a recombination event causes a mutation. We employ novel statistical procedures to take account of the spatial auto-correlation of recombination and mutation rates along the genome. Our results support the view that genomic diversity in recombination hotspots arises largely from a direct effect of recombination on mutation rather than predominantly from the effect of selective sweeps. We also use the statistic of variance due to context to compare the effect on the probability of polymorphism of contexts of various sizes. We find that when the 12 point mutations are considered separately, variance due to context increases significantly as we move from 3-mer to 5-mer and from 5-mer to 7-mer contexts. However, when all mutations are considered in aggregate, these differences are outweighed by the effect of interaction between the central base and its immediate neighbors. This interaction is itself dominated by the transition mutations, including, but not limited to, the CpG effect. We also demonstrate strand-asymmetry of contextual influence in intronic regions, which is hypothesized to be a result of transcription coupled DNA repair. We consider the extent to which the measures we have used can be used to meaningfully compare the relative magnitudes of the impact of recombination and context on mutation.
Affiliation(s)
- Helmut Simon
- Research School of Biology, the Australian National University
| | - Gavin Huttley
- Research School of Biology, the Australian National University
|
15
|
Wang Y, Mao JM, Wang GD, Luo ZP, Yang L, Yao Q, Chen KP. Human SARS-CoV-2 has evolved to reduce CG dinucleotide in its open reading frames. Sci Rep 2020; 10:12331. [PMID: 32704018 PMCID: PMC7378049 DOI: 10.1038/s41598-020-69342-y] [Citation(s) in RCA: 29] [Impact Index Per Article: 5.8] [Received: 03/12/2020] [Accepted: 07/09/2020] [Indexed: 12/21/2022]
Abstract
The outbreak of COVID-19 poses a great threat to human health. Its causative agent is a severe acute respiratory syndrome-related coronavirus which has been officially named SARS-CoV-2. Here we report the discovery of extremely low CG abundance in its open reading frames. We found that CG reduction in SARS-CoV-2 is achieved mainly through mutating C/G into A/T, and CG is the best target for mutation. Meanwhile, the 5'-untranslated region of SARS-CoV-2 has high CG content and is capable of forming an internal ribosome entry site (IRES) to recruit host ribosomes for translating its RNA. These features allow SARS-CoV-2 to reproduce efficiently in host cells, because less energy is consumed in disrupting the stem-loops formed by its genomic RNA. Notably, genomes of cellular organisms also have very low CG abundance, suggesting that mutating C/G into A/T occurs universally in all life forms. Moreover, CG is the dinucleotide related to CpG islands, mutational hotspots and single nucleotide polymorphisms in cellular organisms. The relationship between these features is worthy of further investigation.
Affiliation(s)
- Yong Wang
- School of Food and Biological Engineering, Jiangsu University, 301 Xuefu Road, Zhenjiang, 212013, China.
| | - Jun-Ming Mao
- Institute of Life Sciences, Jiangsu University, 301 Xuefu Road, Zhenjiang, 212013, China
| | - Guang-Dong Wang
- Institute of Life Sciences, Jiangsu University, 301 Xuefu Road, Zhenjiang, 212013, China
| | - Zhi-Peng Luo
- School of Food and Biological Engineering, Jiangsu University, 301 Xuefu Road, Zhenjiang, 212013, China
| | - Liu Yang
- School of Food and Biological Engineering, Jiangsu University, 301 Xuefu Road, Zhenjiang, 212013, China
| | - Qin Yao
- Institute of Life Sciences, Jiangsu University, 301 Xuefu Road, Zhenjiang, 212013, China
| | - Ke-Ping Chen
- Institute of Life Sciences, Jiangsu University, 301 Xuefu Road, Zhenjiang, 212013, China
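The "low CG abundance" reported in the abstract above is usually quantified as an observed/expected ratio: the frequency of the CG dinucleotide divided by the product of the C and G base frequencies. The sketch below illustrates that common measure only. It is not taken from the paper, the function name `dinucleotide_odds_ratio` is hypothetical, and the authors may have used a different statistic or normalisation.

```python
from collections import Counter


def dinucleotide_odds_ratio(seq: str, dinuc: str = "CG") -> float:
    """Observed/expected frequency of a dinucleotide on a single strand.

    Hypothetical helper for illustration; values below 1 indicate depletion.
    """
    seq = seq.upper().replace("U", "T")  # accept RNA input as well
    n = len(seq)
    if n < 2:
        raise ValueError("sequence must contain at least two bases")
    base_counts = Counter(seq)
    # Count overlapping occurrences of the dinucleotide.
    observed = sum(1 for i in range(n - 1) if seq[i:i + 2] == dinuc)
    f_xy = observed / (n - 1)
    f_x = base_counts[dinuc[0]] / n
    f_y = base_counts[dinuc[1]] / n
    if f_x == 0 or f_y == 0:
        raise ValueError("a constituent base is absent from the sequence")
    return f_xy / (f_x * f_y)
```

A ratio near 1 means the dinucleotide occurs about as often as the base composition predicts. Applying the function to each open reading frame of a genome would give a per-ORF depletion profile.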
|
16
|
Cross E, Duncan-Flavell PJ, Howarth RJ, Crooks RO, Thomas NS, Bunyan DJ. Screening of a large PAX6 cohort identified many novel variants and emphasises the importance of the paired and homeobox domains. Eur J Med Genet 2020; 63:103940. [DOI: 10.1016/j.ejmg.2020.103940] [Citation(s) in RCA: 6] [Impact Index Per Article: 1.2] [Received: 09/27/2019] [Revised: 12/20/2019] [Accepted: 04/23/2020] [Indexed: 12/21/2022]
|
17
|
Decoding whole-genome mutational signatures in 37 human pan-cancers by denoising sparse autoencoder neural network. Oncogene 2020; 39:5031-5041. [PMID: 32528130 PMCID: PMC7334101 DOI: 10.1038/s41388-020-1343-z] [Citation(s) in RCA: 8] [Impact Index Per Article: 1.6] [Received: 12/02/2019] [Revised: 05/19/2020] [Accepted: 05/29/2020] [Indexed: 12/28/2022]
Abstract
Millions of somatic mutations have recently been discovered in cancer genomes. These mutations occur due to internal and external mutagenic forces. Decoding the mutational processes by examining their unique patterns has successfully revealed many known and novel signatures from whole exome data, but many still remain undiscovered. Here, we developed a deep learning approach, DeepMS, to decompose mutational signatures using 52,671,908 somatic mutations from 2780 highly curated cancer genomes with whole genome sequencing (WGS) in 37 cancer types/subtypes. With rigorous model training and comparison, we characterized 54 signatures for single base substitutions (SBSs), 11 for doublet base substitutions (DBSs) and 16 for small insertions and deletions (Indels). Compared to previous methods, DeepMS discovered 37 SBS, 5 DBS and 9 Indel new signatures, many of which represent associations with DNA mismatch or base excision repair and cisplatin therapy mechanisms. We further developed a regression-based model to estimate the correlation between signatures and clinical and demographic phenotypes. As the first deep learning model applied to WGS somatic mutational profiles, DeepMS enables us to identify more comprehensive context-based mutational signatures than traditional NMF approaches. Our work substantially expands the landscape of the naturally occurring mutational signatures in cancer genomes, and provides new insights into cancer biology.
|
18
|
Abstract
Beneficial mutations are rare and deleterious mutations are purged by natural selection. As a result, the vast majority of mutations that accumulate in genomes belong to the class of neutral mutations. Over the last two decades, neutral mutations, despite their null effect on fitness, have been shown to affect evolvability by providing access to new phenotypes through subsequent mutations that would not have been available otherwise. Here we propose that in addition, many mutations - independent of their selective effects - can affect the mutability of neighboring DNA sequences and modulate the efficacy of homologous recombination. Such mutations do not change the spectrum of accessible phenotypes, but rather the rate at which new phenotypes will be produced. Therefore, neutral mutations that accumulate in genomes have an important long-term impact on the evolutionary fate of genomes.
|
19
|
Baez-Ortega A, Gori K. Computational approaches for discovery of mutational signatures in cancer. Brief Bioinform 2019; 20:77-88. [PMID: 28968631 DOI: 10.1093/bib/bbx082] [Citation(s) in RCA: 29] [Impact Index Per Article: 4.8] [Received: 04/27/2017] [Indexed: 01/07/2023]
Abstract
The accumulation of somatic mutations in a genome is the result of the activity of one or more mutagenic processes, each of which leaves its own imprint. The study of these DNA fingerprints, termed mutational signatures, holds important potential for furthering our understanding of the causes and evolution of cancer, and can provide insights of relevance for cancer prevention and treatment. In this review, we focus our attention on the mathematical models and computational techniques that have driven recent advances in the field.
Affiliation(s)
| | - Kevin Gori
- Transmissible Cancer Group, University of Cambridge
|
20
|
Rogozin IB, Pavlov YI, Goncearenco A, De S, Lada AG, Poliakov E, Panchenko AR, Cooper DN. Mutational signatures and mutable motifs in cancer genomes. Brief Bioinform 2019; 19:1085-1101. [PMID: 28498882 DOI: 10.1093/bib/bbx049] [Citation(s) in RCA: 23] [Impact Index Per Article: 3.8] [Received: 01/25/2017] [Indexed: 12/22/2022]
Abstract
Cancer is a genetic disorder, meaning that a plethora of different mutations, whether somatic or germ line, underlie the etiology of the 'Emperor of Maladies'. Point mutations, chromosomal rearrangements and copy number changes, whether they have occurred spontaneously in predisposed individuals or have been induced by intrinsic or extrinsic (environmental) mutagens, lead to the activation of oncogenes and inactivation of tumor suppressor genes, thereby promoting malignancy. This scenario has now been recognized and experimentally confirmed in a wide range of different contexts. Over the past decade, a surge in available sequencing technologies has allowed the sequencing of whole genomes from liquid malignancies and solid tumors belonging to different types and stages of cancer, giving birth to the new field of cancer genomics. One of the most striking discoveries has been that cancer genomes are highly enriched with mutations of specific kinds. It has been suggested that these mutations can be classified into 'families' based on their mutational signatures. A mutational signature may be regarded as a type of base substitution (e.g. C:G to T:A) within a particular context of neighboring nucleotide sequence (the bases upstream and/or downstream of the mutation). These mutational signatures, supplemented by mutable motifs (a wider mutational context), promise to help us to understand the nature of the mutational processes that operate during tumor evolution because they represent the footprints of interactions between DNA, mutagens and the enzymes of the repair/replication/modification pathways.
Affiliation(s)
- Igor B Rogozin
- National Center for Biotechnology Information, National Library of Medicine, National Institutes of Health, USA
| | - Youri I Pavlov
- Eppley Institute for Cancer Research, University of Nebraska Medical Center, USA
| | | | | | - Artem G Lada
- Department Microbiology and Molecular Genetics, University of California, Davis, USA
| | - Eugenia Poliakov
- Laboratory of Retinal Cell and Molecular Biology, National Eye Institute, National Institutes of Health, USA
| | - Anna R Panchenko
- National Center for Biotechnology Information, National Institutes of Health, USA
|
21
|
Chen X, Deng S, Xu H, Hou D, Hu P, Yang Y, Wen J, Deng H, Yuan L. Novel and Recurring NOTCH3 Mutations in Two Chinese Patients with CADASIL. Neurodegener Dis 2019; 19:35-42. [PMID: 31212292 DOI: 10.1159/000500166] [Citation(s) in RCA: 5] [Impact Index Per Article: 0.8] [Received: 01/24/2019] [Accepted: 04/05/2019] [Indexed: 11/19/2022]
Abstract
BACKGROUND Cerebral autosomal-dominant arteriopathy with subcortical infarcts and leukoencephalopathy (CADASIL) is an autosomal-dominant, inherited, systemic, vascular disorder primarily involving the small arteries. It is characterized by migraine, recurrent ischemic strokes, cognitive decline, and dementia. Mutations in the Notch receptor 3 gene (NOTCH3) and the HtrA serine peptidase 1 gene (HTRA1) are 2 genetic causes for CADASIL. The NOTCH3 gene, located on chromosome 19p13.12, is the most common disease-causing gene in CADASIL. OBJECTIVE To investigate genetic causes in 2 unrelated Han-Chinese patients with presentations strongly suggestive of CADASIL. METHODS Exome sequencing was performed on both patients and potential pathogenic mutations were validated by Sanger sequencing. RESULTS This study reports on 2 unrelated Han-Chinese patients with presentations strongly suggestive of CADASIL, identifying that NOTCH3 mutations were the genetic cause. A common mutation, c.268C>T (p.Arg90Cys), and a novel mutation, c.331G>T (p.Gly111Cys) in the NOTCH3 gene, were detected and confirmed in the patients, respectively, and were predicted to be deleterious based on bioinformation analyses. CONCLUSIONS We identified 2 NOTCH3 mutations as likely genetic causes for CADASIL in these 2 patients. Our findings broaden the mutational spectrum of the NOTCH3 gene accountable for CADASIL. Clinical manifestations supplemented with molecular genetic analyses are critical for accurate diagnosis, the provision of genetic counseling, and the development of therapies for CADASIL.
Affiliation(s)
- Xiangyu Chen
- Center for Experimental Medicine, the Third Xiangya Hospital, Central South University, Changsha, China
| | - Sheng Deng
- Center for Experimental Medicine, the Third Xiangya Hospital, Central South University, Changsha, China.,Department of Pharmacy, Xiangya Hospital, Central South University, Changsha, China
| | - Hongbo Xu
- Center for Experimental Medicine, the Third Xiangya Hospital, Central South University, Changsha, China
| | - Deren Hou
- Department of Neurology, the Third Xiangya Hospital, Central South University, Changsha, China
| | - Pengzhi Hu
- Department of Radiology, the Third Xiangya Hospital, Central South University, Changsha, China
| | - Yan Yang
- Department of Neurology, the Third Xiangya Hospital, Central South University, Changsha, China
| | - Jie Wen
- Center for Experimental Medicine, the Third Xiangya Hospital, Central South University, Changsha, China
| | - Hao Deng
- Center for Experimental Medicine, the Third Xiangya Hospital, Central South University, Changsha, China.,Department of Neurology, the Third Xiangya Hospital, Central South University, Changsha, China
| | - Lamei Yuan
- Center for Experimental Medicine, the Third Xiangya Hospital, Central South University, Changsha, China
|
22
|
Potapova NA, Andrianova MA, Bazykin GA, Kondrashov AS. Are Nonsense Alleles of Drosophila melanogaster Genes under Any Selection? Genome Biol Evol 2018; 10:1012-1018. [PMID: 29425311 PMCID: PMC5888714 DOI: 10.1093/gbe/evy032] [Citation(s) in RCA: 3] [Impact Index Per Article: 0.4] [Accepted: 02/06/2018] [Indexed: 12/03/2022]
Abstract
A gene which carries a bona fide loss-of-function mutation effectively becomes a functionless pseudogene, free from selective constraint. However, there are a number of molecular mechanisms that may lead to at least a partial preservation of the function of genes carrying even drastic alleles. We performed a direct measurement of the strength of negative selection acting on nonsense alleles of protein-coding genes in the Zambian population of Drosophila melanogaster. Within those exons that carry nonsense mutations, negative selection, assayed by the ratio of missense over synonymous nucleotide diversity levels, appears to be absent, consistent with total loss of function. In other exons of nonsense alleles, negative selection was deeply relaxed but likely not completely absent, and the per site number of missense alleles declined significantly with the distance from the premature stop codon. This pattern may be due to alternative splicing which preserves function of some isoforms of nonsense alleles of genes.
Affiliation(s)
- Nadezhda A Potapova
- Institute of Information Transmission Problems (Kharkevich Institute) of the Russian Academy of Sciences, Moscow, Russia.,Department of Bioengineering and Bioinformatics, Lomonosov Moscow State University, Moscow, Russia
| | - Maria A Andrianova
- Institute of Information Transmission Problems (Kharkevich Institute) of the Russian Academy of Sciences, Moscow, Russia.,Skolkovo Institute of Science and Technology, Moscow, Russia
| | - Georgii A Bazykin
- Institute of Information Transmission Problems (Kharkevich Institute) of the Russian Academy of Sciences, Moscow, Russia.,Skolkovo Institute of Science and Technology, Moscow, Russia
| | - Alexey S Kondrashov
- Department of Bioengineering and Bioinformatics, Lomonosov Moscow State University, Moscow, Russia.,University of Michigan, Ann Arbor, USA
|
23
|
Růžička M, Kulhánek P, Radová L, Čechová A, Špačková N, Fajkusová L, Réblová K. DNA mutation motifs in the genes associated with inherited diseases. PLoS One 2017; 12:e0182377. [PMID: 28767725 PMCID: PMC5540541 DOI: 10.1371/journal.pone.0182377] [Citation(s) in RCA: 16] [Impact Index Per Article: 2.0] [Received: 03/27/2017] [Accepted: 07/17/2017] [Indexed: 11/18/2022]
Abstract
Mutations in human genes can be responsible for inherited genetic disorders and cancer. Mutations can arise due to environmental factors or spontaneously. It has been shown that certain DNA sequences are more prone to mutate. These sites are termed hotspots and exhibit a higher mutation frequency than expected by chance. In contrast, DNA sequences with lower mutation frequencies than expected by chance are termed coldspots. Mutation hotspots are usually derived from a mutation spectrum, which reflects a particular population where an effect of a common ancestor plays a role. To detect coldspots/hotspots unaffected by population bias, we analysed the presence of germline mutations obtained from the HGMD database in the 5-nucleotide segments repeatedly occurring in genes associated with common inherited disorders, in particular, the PAH, LDLR, CFTR, F8, and F9 genes. Statistically significant sequences (mutational motifs) rarely associated with mutations (coldspots) and frequently associated with mutations (hotspots) exhibited characteristic sequence patterns, e.g. coldspots contained purine tracts while hotspots showed alternating purine-pyrimidine bases, often with the presence of a CpG dinucleotide. Using molecular dynamics simulations and free energy calculations, we analysed the global bending properties of two selected coldspots and two hotspots with a G/T mismatch. We observed that the coldspots were inherently more flexible than the hotspots. We assume that this property might be critical for effective mismatch repair, as DNA with a mutation recognized by the MutSα protein is noticeably bent.
Affiliation(s)
- Michal Růžička
- CEITEC—Central European Institute of Technology, Masaryk University, Kamenice 5, Brno, Czech Republic
- Department of Condensed Matter Physics, Faculty of Science, Masaryk University, Kotlářská 2, Brno, Czech Republic
| | - Petr Kulhánek
- CEITEC—Central European Institute of Technology, Masaryk University, Kamenice 5, Brno, Czech Republic
- National Centre for Biomolecular Research, Faculty of Science, Masaryk University, Kamenice 5, Brno, Czech Republic
| | - Lenka Radová
- CEITEC—Central European Institute of Technology, Masaryk University, Kamenice 5, Brno, Czech Republic
| | - Andrea Čechová
- CEITEC—Central European Institute of Technology, Masaryk University, Kamenice 5, Brno, Czech Republic
| | - Naďa Špačková
- Department of Condensed Matter Physics, Faculty of Science, Masaryk University, Kotlářská 2, Brno, Czech Republic
| | - Lenka Fajkusová
- Centre of Molecular Biology and Gene Therapy, University Hospital Brno and Masaryk University, Jihlavská 20, Brno, Czech Republic
| | - Kamila Réblová
- CEITEC—Central European Institute of Technology, Masaryk University, Kamenice 5, Brno, Czech Republic
|
24
|
Zhou Z, Zou Y, Liu G, Zhou J, Wu J, Zhao S, Su Z, Gu X. Mutation-profile-based methods for understanding selection forces in cancer somatic mutations: a comparative analysis. Oncotarget 2017; 8:58835-58846. [PMID: 28938601 PMCID: PMC5601697 DOI: 10.18632/oncotarget.19371] [Citation(s) in RCA: 10] [Impact Index Per Article: 1.3] [Received: 10/26/2016] [Accepted: 07/12/2017] [Indexed: 12/30/2022]
Abstract
Human genes exhibit different effects on fitness in cancer and normal cells. Here, we present an evolutionary approach to measure the selection pressure on human genes, using the well-known ratio of the nonsynonymous to synonymous substitution rate in both cancer genomes (CN/CS) and normal populations (pN/pS). A new mutation-profile-based method that adopts sample-specific mutation rate profiles instead of conventional substitution models was developed. We found that cancer-specific selection pressure is quite different from the selection pressure at the species and population levels. Both the relaxation of purifying selection on passenger mutations and the positive selection of driver mutations may contribute to the increased CN/CS values of human genes in cancer genomes compared with the pN/pS values in human populations. The CN/CS values also contribute to the improved classification of cancer genes and a better understanding of the onco-functionalization of cancer genes during oncogenesis. The use of our computational pipeline to identify cancer-specific positively and negatively selected genes may provide useful information for understanding the evolution of cancers and identifying possible targets for therapeutic intervention.
Affiliation(s)
- Zhan Zhou
- Ministry of Education Key Laboratory of Contemporary Anthropology, School of Life Sciences, Fudan University, Shanghai, China.,College of Pharmaceutical Sciences, Zhejiang University, Hangzhou, China
| | - Yangyun Zou
- Ministry of Education Key Laboratory of Contemporary Anthropology, School of Life Sciences, Fudan University, Shanghai, China
| | - Gangbiao Liu
- Ministry of Education Key Laboratory of Contemporary Anthropology, School of Life Sciences, Fudan University, Shanghai, China
| | - Jingqi Zhou
- Ministry of Education Key Laboratory of Contemporary Anthropology, School of Life Sciences, Fudan University, Shanghai, China
| | - Jingcheng Wu
- College of Pharmaceutical Sciences, Zhejiang University, Hangzhou, China
| | - Shimin Zhao
- State Key Laboratory of Genetic Engineering, Collaborative Innovation Center of Genetics and Development, School of Life Sciences, Fudan University, Shanghai, China
| | - Zhixi Su
- Ministry of Education Key Laboratory of Contemporary Anthropology, School of Life Sciences, Fudan University, Shanghai, China
| | - Xun Gu
- Department of Genetics, Development and Cell Biology, Program of Bioinformatics and Computational Biology, Iowa State University, Ames, Iowa, USA.,Ministry of Education Key Laboratory of Contemporary Anthropology, School of Life Sciences, Fudan University, Shanghai, China
|
25
|
Vieitez I, Gallano P, González-Quereda L, Borrego S, Marcos I, Millán J, Jairo T, Prior C, Molano J, Trujillo-Tiebas M, Gallego-Merlo J, García-Barcina M, Fenollar M, Navarro C. Mutational spectrum of Duchenne muscular dystrophy in Spain: study of 284 cases. Neurología (English Edition) 2017. [DOI: 10.1016/j.nrleng.2015.12.004] [Citation(s) in RCA: 2] [Impact Index Per Article: 0.3] [Indexed: 12/20/2022]
|
26
|
Romanov GA, Sukhoverov VS. Arginine CGA codons as a source of nonsense mutations: a possible role in multivariant gene expression, control of mRNA quality, and aging. Mol Genet Genomics 2017; 292:1013-1026. [DOI: 10.1007/s00438-017-1328-y] [Citation(s) in RCA: 6] [Impact Index Per Article: 0.8] [Received: 03/22/2017] [Accepted: 05/11/2017] [Indexed: 12/21/2022]
|
27
|
Husain N, Yuan Q, Yen YC, Pletnikova O, Sally DQ, Worley P, Bichler Z, Shawn Je H. TRIAD3/RNF216 mutations associated with Gordon Holmes syndrome lead to synaptic and cognitive impairments via Arc misregulation. Aging Cell 2017; 16:281-292. [PMID: 27995769 PMCID: PMC5334534 DOI: 10.1111/acel.12551] [Citation(s) in RCA: 33] [Impact Index Per Article: 4.1] [Accepted: 10/27/2016] [Indexed: 12/26/2022]
Abstract
Multiple loss-of-function mutations in TRIAD3 (a.k.a. RNF216) have recently been identified in patients suffering from Gordon Holmes syndrome (GHS), characterized by cognitive decline, dementia, and movement disorders. TRIAD3A is an E3 ubiquitin ligase that recognizes and facilitates the ubiquitination of its target for degradation by the ubiquitin-proteasome system (UPS). Here, we demonstrate that two of these missense substitutions in TRIAD3 (R660C and R694C) could not regulate the degradation of their neuronal target, activity-regulated cytoskeletal-associated protein (Arc/Arg 3.1), whose expression is critical for synaptic plasticity and memory. The synaptic deficits due to the loss of endogenous TRIAD3A could not be rescued by TRIAD3A harboring GHS-associated missense mutations. Moreover, we demonstrate that the loss of endogenous TRIAD3A in the mouse hippocampal CA1 region led to deficits in spatial learning and memory. Finally, we show that these missense mutations abolished the interaction of TRIAD3A with Arc, disrupting Arc ubiquitination, and consequently Arc degradation. Our current findings of Arc misregulation by TRIAD3A variants suggest that loss-of-function mutations in TRIAD3A may contribute to dementia observed in patients with GHS driven by dysfunctional UPS components, leading to cognitive impairments through the synaptic protein Arc.
Affiliation(s)
- Nilofer Husain
- Signature Program in Neuroscience and Behavioral Disorders; Duke-NUS Medical School Singapore; 8 College Road Singapore 169857 Singapore
| | - Qiang Yuan
- Signature Program in Neuroscience and Behavioral Disorders; Duke-NUS Medical School Singapore; 8 College Road Singapore 169857 Singapore
| | - Yi-Chun Yen
- Signature Program in Neuroscience and Behavioral Disorders; Duke-NUS Medical School Singapore; 8 College Road Singapore 169857 Singapore
| | - Olga Pletnikova
- Department of Pathology; Johns Hopkins University School of Medicine; Baltimore MD 21205 USA
| | - Dong Qianying Sally
- Behavioral Neuroscience Laboratory; National Neuroscience Institute; 11 Jalan Tan Tock Seng 308433 Singapore Singapore
| | - Paul Worley
- Solomon H. Snyder Department of Neuroscience; Johns Hopkins University School of Medicine; Baltimore MD 21205 USA
| | - Zoë Bichler
- Signature Program in Neuroscience and Behavioral Disorders; Duke-NUS Medical School Singapore; 8 College Road Singapore 169857 Singapore
- Behavioral Neuroscience Laboratory; National Neuroscience Institute; 11 Jalan Tan Tock Seng 308433 Singapore Singapore
| | - H. Shawn Je
- Signature Program in Neuroscience and Behavioral Disorders; Duke-NUS Medical School Singapore; 8 College Road Singapore 169857 Singapore
- Department of Physiology; Yong Loo Lin School of Medicine; National University of Singapore; Singapore 117597 Singapore
|
28
|
Nellen RGL, Steijlen PM, van Steensel MAM, Vreeburg M, Frank J, van Geel M. Mendelian Disorders of Cornification Caused by Defects in Intracellular Calcium Pumps: Mutation Update and Database for Variants in ATP2A2 and ATP2C1 Associated with Darier Disease and Hailey-Hailey Disease. Hum Mutat 2017; 38:343-356. [PMID: 28035777 DOI: 10.1002/humu.23164] [Citation(s) in RCA: 40] [Impact Index Per Article: 5.0] [Received: 02/02/2016] [Revised: 10/30/2016] [Accepted: 12/26/2016] [Indexed: 12/22/2022]
Abstract
The two disorders of cornification associated with mutations in genes coding for intracellular calcium pumps are Darier disease (DD) and Hailey-Hailey disease (HHD). DD is caused by mutations in the ATP2A2 gene, whereas the ATP2C1 gene is associated with HHD. Both are inherited as autosomal-dominant traits. DD is mainly defined by warty papules in seborrheic and flexural areas, whereas the major symptoms of HHD are vesicles and erosions in flexural skin. Both phenotypes are highly variable. In 12%-40% of DD patients and 12%-55% of HHD patients, no mutations in ATP2A2 or ATP2C1 are found. We provide a comprehensive review of clinical variability in DD and HHD and a review of all reported mutations in ATP2A2 and ATP2C1. Having the entire spectrum of ATP2A2 and ATP2C1 variants allows us to address the question of a genotype-phenotype correlation, which has not been settled unequivocally in DD and HHD. We created a database for all mutations in ATP2A2 and ATP2C1 using the Leiden Open Variation Database (LOVD v3.0), for variants reported in the literature and future inclusions. This data may be of use as a reference tool in further research on treatment of DD and HHD.
Affiliation(s)
- Ruud G L Nellen
- Departments of Dermatology, Maastricht University Medical Centre, Maastricht, The Netherlands.,GROW Research School for Oncology and Developmental Biology, Maastricht University Medical Centre, Maastricht, The Netherlands
| | - Peter M Steijlen
- Departments of Dermatology, Maastricht University Medical Centre, Maastricht, The Netherlands.,GROW Research School for Oncology and Developmental Biology, Maastricht University Medical Centre, Maastricht, The Netherlands
| | - Maurice A M van Steensel
- Departments of Dermatology, Maastricht University Medical Centre, Maastricht, The Netherlands.,GROW Research School for Oncology and Developmental Biology, Maastricht University Medical Centre, Maastricht, The Netherlands.,Clinical Genetics, Maastricht University Medical Centre, Maastricht, The Netherlands
| | - Maaike Vreeburg
- Clinical Genetics, Maastricht University Medical Centre, Maastricht, The Netherlands
| | -
- Departments of Dermatology, Maastricht University Medical Centre, Maastricht, The Netherlands
| | - Jorge Frank
- Department of Dermatology, Venereology and Allergology, University Medical Center Göttingen, Göttingen, Germany
| | - Michel van Geel
- Departments of Dermatology, Maastricht University Medical Centre, Maastricht, The Netherlands.,GROW Research School for Oncology and Developmental Biology, Maastricht University Medical Centre, Maastricht, The Netherlands.,Clinical Genetics, Maastricht University Medical Centre, Maastricht, The Netherlands
|
29
|
Statistical Methods for Identifying Sequence Motifs Affecting Point Mutations. Genetics 2016; 205:843-856. [PMID: 27974498 DOI: 10.1534/genetics.116.195677] [Citation(s) in RCA: 25] [Impact Index Per Article: 2.8] [Received: 09/14/2016] [Accepted: 12/01/2016] [Indexed: 11/18/2022] Open
Abstract
Mutation processes differ between types of point mutation, genomic locations, cells, and biological species. For some point mutations, specific neighboring bases are known to be mechanistically influential. Beyond these cases, numerous questions remain unresolved, including: what are the sequence motifs that affect point mutations? How large are the motifs? Are they strand symmetric? And, do they vary between samples? We present new log-linear models that allow explicit examination of these questions, along with sequence logo style visualization to enable identifying specific motifs. We demonstrate the performance of these methods by analyzing mutation processes in human germline and malignant melanoma. We recapitulate the known CpG effect, and identify novel motifs, including a highly significant motif associated with A→G mutations. We show that major effects of neighbors on germline mutation lie within [Formula: see text] of the mutating base. Models are also presented for contrasting the entire mutation spectra (the distribution of the different point mutations). We show the spectra vary significantly between autosomes and X-chromosome, with a difference in T→C transition dominating. Analyses of malignant melanoma confirmed reported characteristic features of this cancer, including statistically significant strand asymmetry, and markedly different neighboring influences. The methods we present are made freely available as a Python library https://bitbucket.org/pycogent3/mutationmotif.
|
30
|
Vieitez I, Gallano P, González-Quereda L, Borrego S, Marcos I, Millán JM, Jairo T, Prior C, Molano J, Trujillo-Tiebas MJ, Gallego-Merlo J, García-Barcina M, Fenollar M, Navarro C. Mutational spectrum of Duchenne muscular dystrophy in Spain: Study of 284 cases. Neurologia 2016; 32:377-385. [PMID: 26968818 DOI: 10.1016/j.nrl.2015.12.009] [Citation(s) in RCA: 15] [Impact Index Per Article: 1.7] [Received: 12/01/2015] [Revised: 12/09/2015] [Accepted: 12/12/2015] [Indexed: 10/22/2022] Open
Abstract
INTRODUCTION: Duchenne muscular dystrophy (DMD) is a severe X-linked recessive neuromuscular disease that affects one in 3500 live-born males. The total absence of dystrophin observed in DMD patients is generally caused by mutations that disrupt the reading frame of the DMD gene, and about 80% of cases harbour deletions or duplications of one or more exons. METHODS: We reviewed 284 cases of males with a genetic diagnosis of DMD between 2007 and 2014. These patients were selected from 8 Spanish reference hospitals representing most areas of Spain. Multiplex PCR, MLPA, and sequencing were performed to identify mutations. RESULTS: Most of these DMD patients present large deletions (46.1%) or large duplications (19.7%) in the dystrophin gene. The remaining 34.2% correspond to point mutations, and half of these correspond to nonsense mutations. In this study we identified 23 new mutations in DMD: 7 large deletions and 16 point mutations. CONCLUSIONS: The algorithm for genetic diagnosis applied by the participating centres is the most appropriate for genotyping patients with DMD. The genetic specificity of different therapies currently being developed emphasises the importance of identifying the mutation appearing in each patient; 38.7% of the cases in this series are eligible to participate in current clinical trials.
Affiliation(s)
- I Vieitez
- Grupo de Patología Neonatal y Pediátrica, Enfermedades raras, Instituto de Investigación Biomédica de Ourense-Pontevedra-Vigo (IBI), Vigo, España; Complexo Hospitalario Universitario de Vigo (CHUVI), SERGAS, Vigo, España
| | - P Gallano
- Departamento de Genética, Hospital de la Santa Creu i Sant Pau, Barcelona, España; CIBERER (Centro de Investigación Biomédica en Red de Enfermedades Raras), Instituto de Salud Carlos III, Madrid, España
| | - L González-Quereda
- Departamento de Genética, Hospital de la Santa Creu i Sant Pau, Barcelona, España; CIBERER (Centro de Investigación Biomédica en Red de Enfermedades Raras), Instituto de Salud Carlos III, Madrid, España
| | - S Borrego
- Departamento de Genética, Reproducción y Medicina fetal, Instituto de Biomedicina de Sevilla, Hospital Universitario Virgen del Rocío/CSIC/Universidad de Sevilla, Sevilla, España; CIBERER (Centro de Investigación Biomédica en Red de Enfermedades Raras), Instituto de Salud Carlos III, Madrid, España
| | - I Marcos
- Departamento de Genética, Reproducción y Medicina fetal, Instituto de Biomedicina de Sevilla, Hospital Universitario Virgen del Rocío/CSIC/Universidad de Sevilla, Sevilla, España; CIBERER (Centro de Investigación Biomédica en Red de Enfermedades Raras), Instituto de Salud Carlos III, Madrid, España
| | - J M Millán
- Unidad de Genética y Diagnóstico Prenatal, Hospital Universitario La Fe, Valencia, España; CIBERER (Centro de Investigación Biomédica en Red de Enfermedades Raras), Instituto de Salud Carlos III, Madrid, España
| | - T Jairo
- Unidad de Genética y Diagnóstico Prenatal, Hospital Universitario La Fe, Valencia, España; CIBERER (Centro de Investigación Biomédica en Red de Enfermedades Raras), Instituto de Salud Carlos III, Madrid, España
| | - C Prior
- Instituto de Genética Médica y Molecular (INGEMM), Hospital Universitario La Paz, Madrid, España
| | - J Molano
- Instituto de Genética Médica y Molecular (INGEMM), Hospital Universitario La Paz, Madrid, España
| | - M J Trujillo-Tiebas
- Departamento de Genética, Hospital Universitario Fundación Jiménez Díaz, Madrid, España; CIBERER (Centro de Investigación Biomédica en Red de Enfermedades Raras), Instituto de Salud Carlos III, Madrid, España
| | - J Gallego-Merlo
- Departamento de Genética, Hospital Universitario Fundación Jiménez Díaz, Madrid, España; CIBERER (Centro de Investigación Biomédica en Red de Enfermedades Raras), Instituto de Salud Carlos III, Madrid, España
| | - M García-Barcina
- Unidad de Genética, Hospital Universitario de Basurto, Vizcaya, España
| | - M Fenollar
- Sección de Genética Clínica, Servicio de Análisis Clínicos, Hospital Clínico San Carlos, Madrid, España
| | - C Navarro
- Complexo Hospitalario Universitario de Vigo (CHUVI), SERGAS, Vigo, España.
|
31
|
Ramakodi MP, Kulathinal RJ, Chung Y, Serebriiskii I, Liu JC, Ragin CC. Ancestral-derived effects on the mutational landscape of laryngeal cancer. Genomics 2015; 107:76-82. [PMID: 26721311 DOI: 10.1016/j.ygeno.2015.12.004] [Citation(s) in RCA: 10] [Impact Index Per Article: 1.0] [Received: 08/17/2015] [Revised: 11/26/2015] [Accepted: 12/21/2015] [Indexed: 10/22/2022]
Abstract
Laryngeal cancer disproportionately affects African-Americans compared with European-Americans. Here, we analyze the genome-wide somatic point mutations from the tumors of 13 African-Americans and 57 European-Americans from TCGA to differentiate between environmental and ancestrally-inherited factors. The mean number of mutations differed between African-Americans (151.31) and European-Americans (277.63). Other differences in the overall mutational landscape between African-Americans and European-Americans were also found. The frequencies of C>A and C>G substitutions were significantly different between the two populations (p-value < 0.05). Context nucleotide signatures for some mutation types differ significantly between these two populations. Thus, the context nucleotide signatures, along with other factors, could be related to the observed mutational landscape differences between the two races. Finally, we show that mutated genes associated with these mutational differences differ between the two populations. Thus, at the molecular level, race appears to be a factor in the progression of laryngeal cancer, with ancestral genomic signatures best explaining these differences.
Affiliation(s)
- Meganathan P Ramakodi
- Cancer Prevention and Control Program, Fox Chase Cancer Center-Temple Health, Philadelphia, PA 19111, USA; Department of Biology, Temple University, Philadelphia, PA 19122, USA; African-Caribbean Cancer Consortium; Center for Computational Genetics and Genomics, Temple University, Philadelphia, PA 19122, USA
| | - Rob J Kulathinal
- Department of Biology, Temple University, Philadelphia, PA 19122, USA; African-Caribbean Cancer Consortium; Center for Computational Genetics and Genomics, Temple University, Philadelphia, PA 19122, USA
| | - Yujin Chung
- Department of Biology, Temple University, Philadelphia, PA 19122, USA; Center for Computational Genetics and Genomics, Temple University, Philadelphia, PA 19122, USA
| | - Ilya Serebriiskii
- Developmental Therapeutics, Fox Chase Cancer Center- Temple Health, Philadelphia, PA 19111, USA; Kazan Federal University, Kazan, Russia
| | - Jeffrey C Liu
- Cancer Prevention and Control Program, Fox Chase Cancer Center-Temple Health, Philadelphia, PA 19111, USA; Department of Otolaryngology - Head and Neck Surgery, Temple University School of Medicine, Philadelphia, PA 19140, USA
| | - Camille C Ragin
- Cancer Prevention and Control Program, Fox Chase Cancer Center-Temple Health, Philadelphia, PA 19111, USA; African-Caribbean Cancer Consortium; Department of Otolaryngology - Head and Neck Surgery, Temple University School of Medicine, Philadelphia, PA 19140, USA; College of Public Health, Temple University, Philadelphia, PA 19122, USA.
|
32
|
Shiraishi Y, Tremmel G, Miyano S, Stephens M. A Simple Model-Based Approach to Inferring and Visualizing Cancer Mutation Signatures. PLoS Genet 2015; 11:e1005657. [PMID: 26630308 PMCID: PMC4667891 DOI: 10.1371/journal.pgen.1005657] [Citation(s) in RCA: 106] [Impact Index Per Article: 10.6] [Received: 05/24/2015] [Accepted: 10/19/2015] [Indexed: 01/08/2023] Open
Abstract
Recent advances in sequencing technologies have enabled the production of massive amounts of data on somatic mutations from cancer genomes. These data have led to the detection of characteristic patterns of somatic mutations or “mutation signatures” at an unprecedented resolution, with the potential for new insights into the causes and mechanisms of tumorigenesis. Here we present new methods for modelling, identifying and visualizing such mutation signatures. Our methods greatly simplify mutation signature models compared with existing approaches, reducing the number of parameters by orders of magnitude even while increasing the contextual factors (e.g. the number of flanking bases) that are accounted for. This improves both sensitivity and robustness of inferred signatures. We also provide a new intuitive way to visualize the signatures, analogous to the use of sequence logos to visualize transcription factor binding sites. We illustrate our new method on somatic mutation data from urothelial carcinoma of the upper urinary tract, and a larger dataset from 30 diverse cancer types. The results illustrate several important features of our methods, including the ability of our new visualization tool to clearly highlight the key features of each signature, the improved robustness of signature inferences from small sample sizes, and more detailed inference of signature characteristics such as strand biases and sequence context effects at the base two positions 5′ to the mutated site. The overall framework of our work is based on probabilistic models that are closely connected with “mixed-membership models” which are widely used in population genetic admixture analysis, and in machine learning for document clustering. We argue that recognizing these relationships should help improve understanding of mutation signature extraction problems, and suggests ways to further improve the statistical methods. 
Our methods are implemented in an R package pmsignature (https://github.com/friend1ws/pmsignature) and a web application available at https://friend1ws.shinyapps.io/pmsignature_shiny/.
Somatic (non-inherited) mutations are acquired throughout our lives in cells throughout our body. These mutations can be caused, for example, by DNA replication errors or exposure to environmental mutagens such as tobacco smoke. Some of these mutations can lead to cancer. Different cancers, and even different instances of the same cancer, can show different distinctive patterns of somatic mutations. These distinctive patterns have become known as “mutation signatures”. For example, C > A mutations are frequent in lung cancers whereas C > T and CC > TT mutations are frequent in skin cancers. Each mutation signature may be associated with a specific kind of carcinogen, such as tobacco smoke or ultraviolet light. Identifying mutation signatures therefore has the potential to identify new carcinogens, and yield new insights into the mechanisms and causes of cancer. In this paper, we introduce new statistical tools for tackling this important problem. These tools provide more robust and interpretable mutation signatures compared to previous approaches, as we demonstrate by applying them to large-scale cancer genomic data.
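The claim that the parameter count falls by orders of magnitude while more flanking bases are considered can be illustrated with a small counting sketch. It assumes one reading of the abstract: a conventional signature assigns a separate probability to every combination of substitution type and flanking context, whereas a simplified signature treats the substitution type and each flanking position as independent factors. The function names are hypothetical and are not part of pmsignature, and strand bias is ignored.

```python
# Illustrative parameter counting for one mutation signature.
# Assumption: 6 strand-collapsed substitution types and 4 possible bases
# at each flanking position; strand-bias parameters are not counted.

def full_model_params(n_flank: int, n_subs: int = 6, n_bases: int = 4) -> int:
    """Free parameters when every (substitution, context) pair has its own probability."""
    n_categories = n_subs * n_bases ** (2 * n_flank)
    return n_categories - 1  # probabilities sum to one


def independent_model_params(n_flank: int, n_subs: int = 6, n_bases: int = 4) -> int:
    """Free parameters when the substitution and each flanking position are independent factors."""
    return (n_subs - 1) + 2 * n_flank * (n_bases - 1)


if __name__ == "__main__":
    for n_flank in (1, 2, 3):
        print(
            f"{n_flank} flanking base(s) per side: "
            f"full={full_model_params(n_flank)}, "
            f"independent={independent_model_params(n_flank)}"
        )
```

Under these assumptions, one flanking base per side gives 95 versus 11 free parameters, and two flanking bases per side give 1535 versus 17, which is the kind of reduction the abstract describes.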
Affiliation(s)
- Yuichi Shiraishi
- Laboratory of DNA Information Analysis, Human Genome Center, Institute of Medical Science, The University of Tokyo, Tokyo, Japan
| | - Georg Tremmel
- Laboratory of DNA Information Analysis, Human Genome Center, Institute of Medical Science, The University of Tokyo, Tokyo, Japan
| | - Satoru Miyano
- Laboratory of DNA Information Analysis, Human Genome Center, Institute of Medical Science, The University of Tokyo, Tokyo, Japan
| | - Matthew Stephens
- Department of Human Genetics, University of Chicago, Chicago, Illinois, United States of America.,Department of Statistics, University of Chicago, Chicago, Illinois, United States of America
|
33
|
Fauth C, Steindl K, Toutain A, Farrell S, Witsch-Baumgartner M, Karall D, Joset P, Böhm S, Baumer A, Maier O, Zschocke J, Weksberg R, Marshall CR, Rauch A. A recurrent germline mutation in the PIGA gene causes Simpson-Golabi-Behmel syndrome type 2. Am J Med Genet A 2015; 170A:392-402. [PMID: 26545172 DOI: 10.1002/ajmg.a.37452] [Citation(s) in RCA: 31] [Impact Index Per Article: 3.1] [Received: 08/15/2015] [Accepted: 10/15/2015] [Indexed: 11/10/2022]
Abstract
Hypomorphic germline mutations in the PIGA (phosphatidylinositol glycan class A) gene recently were recognized as the cause of a clinically heterogeneous spectrum of X-linked disorders including (i) early onset epileptic encephalopathy with severe muscular hypotonia, dysmorphism, multiple congenital anomalies, and early death ("MCAHS2"), (ii) neurodegenerative encephalopathy with systemic iron overload (ferro-cerebro-cutaneous syndrome, "FCCS"), and (iii) intellectual disability and seizures without dysmorphism. Previous studies showed that the recurrent PIGA germline mutation c.1234C>T (p.Arg412*) leads to a clinical phenotype at the most severe end of the spectrum associated with early infantile lethality. We identified three additional individuals from two unrelated families with the same PIGA mutation. Major clinical findings include early onset intractable epileptic encephalopathy with a burst-suppression pattern on EEG, generalized muscular hypotonia, structural brain abnormalities, macrocephaly and increased birth weight, joint contractures, coarse facial features, widely spaced eyes, a short nose with anteverted nares, gingival overgrowth, a wide mouth, short limbs with short distal phalanges, and a small penis. Based on the phenotypic overlap with Simpson-Golabi-Behmel syndrome type 2 (SGBS2), we hypothesized that both disorders might have the same underlying cause. We were able to confirm the same c.1234C>T (p.Arg412*) mutation in the DNA sample from an affected fetus of the original family affected with SGBS2. We conclude that the recurrent PIGA germline mutation c.1234C>T leads to a recognizable clinical phenotype with a poor prognosis and is the cause of SGBS2.
Affiliation(s)
- Christine Fauth
- Division of Human Genetics, Department of Medical Genetics, Molecular and Clinical Pharmacology, Medical University of Innsbruck, Innsbruck, Austria
| | - Katharina Steindl
- Institute of Medical Genetics, University of Zürich, Schlieren-Zürich, Switzerland
| | - Annick Toutain
- Department of Genetics, Tours University Hospital, Tours, France
| | - Sandra Farrell
- Department of Laboratory Medicine and Genetics, Trillium Health Partners, Credit Valley Hospital, Mississauga, Ontario, Canada
| | - Martina Witsch-Baumgartner
- Division of Human Genetics, Department of Medical Genetics, Molecular and Clinical Pharmacology, Medical University of Innsbruck, Innsbruck, Austria
| | - Daniela Karall
- Clinic for Pediatrics I, Inherited Metabolic Disorders, Medical University of Innsbruck, Innsbruck, Austria
| | - Pascal Joset
- Institute of Medical Genetics, University of Zürich, Schlieren-Zürich, Switzerland
| | - Sebastian Böhm
- Children's Hospital of Eastern Switzerland, St. Gallen, Switzerland
| | - Alessandra Baumer
- Institute of Medical Genetics, University of Zürich, Schlieren-Zürich, Switzerland
| | - Oliver Maier
- Children's Hospital of Eastern Switzerland, St. Gallen, Switzerland
| | - Johannes Zschocke
- Division of Human Genetics, Department of Medical Genetics, Molecular and Clinical Pharmacology, Medical University of Innsbruck, Innsbruck, Austria
| | - Rosanna Weksberg
- Division of Clinical and Metabolic Genetics, The Hospital for Sick Children, Toronto, Ontario, Canada.,Genetics and Genome Biology Program, The Hospital for Sick Children, Toronto, Ontario, Canada.,Institute of Medical Science and Department of Pediatrics, University of Toronto, Toronto, Ontario, Canada
| | - Christian R Marshall
- Department of Pediatric Laboratory Medicine, The Hospital for Sick Children, Toronto, Ontario, Canada.,The Centre for Applied Genomics, Genetics and Genome Biology, The Hospital for Sick Children, Toronto, Ontario, Canada
| | - Anita Rauch
- Institute of Medical Genetics, University of Zürich, Schlieren-Zürich, Switzerland
|
34
|
Kopp N, Climer S, Dougherty JD. Moving from capstones toward cornerstones: successes and challenges in applying systems biology to identify mechanisms of autism spectrum disorders. Front Genet 2015; 6:301. [PMID: 26500678 PMCID: PMC4595802 DOI: 10.3389/fgene.2015.00301] [Citation(s) in RCA: 14] [Impact Index Per Article: 1.4] [Received: 06/29/2015] [Accepted: 09/11/2015] [Indexed: 11/28/2022] Open
Abstract
The substantial progress in the last few years toward uncovering genetic causes and risk factors for autism spectrum disorders (ASDs) has opened new experimental avenues for identifying the underlying neurobiological mechanism of the condition. The bounty of genetic findings has led to a variety of data-driven exploratory analyses aimed at deriving new insights about the shared features of these genes. These approaches leverage data from a variety of different sources such as co-expression in transcriptomic studies, protein–protein interaction networks, gene ontologies (GOs) annotations, or multi-level combinations of all of these. Here, we review the recurrent themes emerging from these analyses and highlight some of the challenges going forward. Themes include findings that ASD associated genes discovered by a variety of methods have been shown to contain disproportionate amounts of neurite outgrowth/cytoskeletal, synaptic, and more recently Wnt-related and chromatin modifying genes. Expression studies have highlighted a disproportionate expression of ASD gene sets during mid fetal cortical development, particularly for rare variants, with multiple analyses highlighting the striatum and cortical projection and interneurons as well. While these explorations have highlighted potentially interesting relationships among these ASD-related genes, there are challenges in how to best transition these insights into empirically testable hypotheses. Nonetheless, defining shared molecular or cellular pathology downstream of the diverse genes associated with ASDs could provide the cornerstones needed to build toward broadly applicable therapeutic approaches.
Affiliation(s)
- Nathan Kopp
- Department of Genetics, School of Medicine, Washington University in St. Louis, St. Louis MO, USA ; Department of Psychiatry, School of Medicine, Washington University in St. Louis, St. Louis MO, USA
| | - Sharlee Climer
- Department of Computer Science and Engineering, Washington University in St. Louis, St. Louis MO, USA
| | - Joseph D Dougherty
- Department of Genetics, School of Medicine, Washington University in St. Louis, St. Louis MO, USA ; Department of Psychiatry, School of Medicine, Washington University in St. Louis, St. Louis MO, USA
|
35
|
Ware JS, Samocha KE, Homsy J, Daly MJ. Interpreting de novo Variation in Human Disease Using denovolyzeR. Curr Protoc Hum Genet 2015; 87:7.25.1-7.25.15. [PMID: 26439716 DOI: 10.1002/0471142905.hg0725s87] [Citation(s) in RCA: 57] [Impact Index Per Article: 5.7] [Indexed: 11/09/2022]
Abstract
Spontaneously arising (de novo) genetic variants are important in human disease, yet every individual carries many such variants, with a median of 1 de novo variant affecting the protein-coding portion of the genome. A recently described mutational model provides a powerful framework for the robust statistical evaluation of such coding variants, enabling the interpretation of de novo variation in human disease. Here we describe a new open-source software package, denovolyzeR, that implements this model and provides tools for the analysis of de novo coding sequence variants.
Affiliation(s)
- James S Ware
- Department of Genetics, Harvard Medical School, Boston, Massachusetts.,Broad Institute of MIT and Harvard, Cambridge, Massachusetts.,Analytical and Translational Genetics Unit, Massachusetts General Hospital and Harvard Medical School, Boston, Massachusetts.,NIHR Cardiovascular Biomedical Research Unit at Royal Brompton Hospital and Imperial College London, London, United Kingdom
| | - Kaitlin E Samocha
- Department of Genetics, Harvard Medical School, Boston, Massachusetts.,Broad Institute of MIT and Harvard, Cambridge, Massachusetts.,Analytical and Translational Genetics Unit, Massachusetts General Hospital and Harvard Medical School, Boston, Massachusetts
| | - Jason Homsy
- Department of Genetics, Harvard Medical School, Boston, Massachusetts.,Cardiovascular Research Center, Massachusetts General Hospital, Boston, Massachusetts
| | - Mark J Daly
- Department of Genetics, Harvard Medical School, Boston, Massachusetts.,Broad Institute of MIT and Harvard, Cambridge, Massachusetts.,Analytical and Translational Genetics Unit, Massachusetts General Hospital and Harvard Medical School, Boston, Massachusetts
|
36
|
He F, Jacobson A. Nonsense-Mediated mRNA Decay: Degradation of Defective Transcripts Is Only Part of the Story. Annu Rev Genet 2015; 49:339-66. [PMID: 26436458 DOI: 10.1146/annurev-genet-112414-054639] [Citation(s) in RCA: 200] [Impact Index Per Article: 20.0] [Indexed: 12/11/2022]
Abstract
Nonsense-mediated mRNA decay (NMD) is a eukaryotic surveillance mechanism that monitors cytoplasmic mRNA translation and targets mRNAs undergoing premature translation termination for rapid degradation. From yeasts to humans, activation of NMD requires the function of the three conserved Upf factors: Upf1, Upf2, and Upf3. Here, we summarize the progress in our understanding of the molecular mechanisms of NMD in several model systems and discuss recent experiments that address the roles of Upf1, the principal regulator of NMD, in the initial targeting and final degradation of NMD-susceptible mRNAs. We propose a unified model for NMD in which the Upf factors provide several functions during premature termination, including the stimulation of release factor activity and the dissociation and recycling of ribosomal subunits. In this model, the ultimate degradation of the mRNA is the last step in a complex premature termination process.
Affiliation(s)
- Feng He
- Department of Microbiology and Physiological Systems, University of Massachusetts Medical School, Worcester, Massachusetts 01655
| | - Allan Jacobson
- Department of Microbiology and Physiological Systems, University of Massachusetts Medical School, Worcester, Massachusetts 01655
|
37
|
Menzies GE, Reed SH, Brancale A, Lewis PD. Base damage, local sequence context and TP53 mutation hotspots: a molecular dynamics study of benzo[a]pyrene induced DNA distortion and mutability. Nucleic Acids Res 2015; 43:9133-46. [PMID: 26400171 PMCID: PMC4627081 DOI: 10.1093/nar/gkv910] [Citation(s) in RCA: 12] [Impact Index Per Article: 1.2] [Received: 05/27/2015] [Accepted: 08/26/2015] [Indexed: 12/22/2022] Open
Abstract
The mutational pattern for the TP53 tumour suppressor gene in lung tumours differs from that in other cancer types by having a higher frequency of G:C>T:A transversions. The aetiology of this differing mutation pattern is still unknown. Benzo[a]pyrene diol epoxide (BPDE) is a potent cigarette smoke carcinogen that forms guanine adducts at TP53 CpG mutation hotspot sites including codons 157, 158, 245, 248 and 273. We performed molecular modelling of BPDE-adducted TP53 duplex sequences to determine the degree of local distortion caused by adducts, which could influence the efficiency of nucleotide excision repair. We show that BPDE-adducted codon 157 has greater structural distortion than other TP53 G:C>T:A hotspot sites and that sequence context more distal to adjacent bases must influence local distortion. Using TP53 trinucleotide mutation signatures for lung cancer in smokers and non-smokers, we further show that codons 157 and 273 have the highest mutation probability in smokers. Combining this information with adduct structural data, we predict that G:C>T:A mutations at codon 157 in lung tumours of smokers are predominantly caused by BPDE. Our results provide insight into how different DNA sequence contexts show variability in DNA distortion at mutagen adduct sites that could compromise DNA repair at well-characterized cancer-related mutation hotspots.
Affiliation(s)
- Georgina E Menzies
- Institute of Life Science, Swansea University School of Medicine, Swansea University, SA2 8PP, UK
| | - Simon H Reed
- Institute of Cancer & Genetics, School of Medicine, Cardiff University, CF14 4XN, UK
| | - Andrea Brancale
- School of Pharmacy and Pharmacology, Cardiff University, CF10 3NB, UK
| | - Paul D Lewis
- Institute of Life Science, Swansea University School of Medicine, Swansea University, SA2 8PP, UK
|
38
|
DMD Mutations in 576 Dystrophinopathy Families: A Step Forward in Genotype-Phenotype Correlations. PLoS One 2015; 10:e0135189. [PMID: 26284620 PMCID: PMC4540588 DOI: 10.1371/journal.pone.0135189] [Citation(s) in RCA: 104] [Impact Index Per Article: 10.4] [Received: 05/11/2015] [Accepted: 07/17/2015] [Indexed: 11/19/2022] Open
Abstract
Recent advances in molecular therapies for Duchenne muscular dystrophy (DMD) require precise genetic diagnosis because most therapeutic strategies are mutation-specific. To understand more about the genotype-phenotype correlations of the DMD gene we performed a comprehensive analysis of the DMD mutational spectrum in a large series of families. Here we provide the clinical, pathological and genetic features of 576 dystrophinopathy patients. DMD gene analysis was performed using the MLPA technique and whole gene sequencing in blood DNA and muscle cDNA. The impact of the DNA variants on mRNA splicing and protein functionality was evaluated by in silico analysis using computational algorithms. DMD mutations were detected in 576 unrelated dystrophinopathy families by combining the analysis of exonic copies and the analysis of small mutations. We found that 471 of these mutations were large intragenic rearrangements. Of the 576 families, 406 (70.5%) carried exonic deletions, 64 (11.1%) carried exonic duplications, and one (0.2%) carried a complex deletion/duplication rearrangement. Small mutations were identified in 105 cases (18.2%), most being nonsense/frameshift types (75.2%). Mutations in splice sites, however, were relatively frequent (20%). In total, 276 mutations were identified, 85 of which have not been previously described. The diagnostic algorithm used proved to be accurate for the molecular diagnosis of dystrophinopathies. The reading frame rule was fulfilled in 90.4% of DMD patients and in 82.4% of Becker muscular dystrophy (BMD) patients, with significant differences between the mutation types. We found that 58% of DMD patients would be eligible for single exon-skipping trials, 63% could benefit from multiexon-skipping strategies directed against exons 45 to 55, and 14% could benefit from PTC therapy. A detailed analysis of missense mutations provided valuable information about their impact on the protein structure.
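The percentages reported for the mutation classes are easiest to interpret once the denominator is made explicit. The short check below, using only the counts quoted in the abstract, suggests that each percentage is a share of all 576 families rather than of the 471 large rearrangements; the variable names are illustrative.

```python
# Counts quoted in the abstract; the grouping into a dict is illustrative.
TOTAL_FAMILIES = 576

counts = {
    "exonic deletions": 406,
    "exonic duplications": 64,
    "complex deletion/duplication": 1,
    "small mutations": 105,
}

# The four classes account for every family in the series.
assert sum(counts.values()) == TOTAL_FAMILIES

# Share of each class, expressed as a percentage of all families.
percentages = {
    name: round(100 * n / TOTAL_FAMILIES, 1) for name, n in counts.items()
}

# Large intragenic rearrangements are the first three classes combined.
large_rearrangements = TOTAL_FAMILIES - counts["small mutations"]

if __name__ == "__main__":
    for name, pct in percentages.items():
        print(f"{name}: {pct}%")
    print(f"large rearrangements: {large_rearrangements}")
```

With 576 as the denominator the computed shares match the reported 70.5%, 11.1%, 0.2% and 18.2%, and the three rearrangement classes sum to the reported 471.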
|
39
|
Plyler ZE, Hill AE, McAtee CW, Cui X, Moseley LA, Sorscher EJ. SNP Formation Bias in the Murine Genome Provides Evidence for Parallel Evolution. Genome Biol Evol 2015; 7:2506-19. [PMID: 26253317 PMCID: PMC4607513 DOI: 10.1093/gbe/evv150] [Citation(s) in RCA: 4] [Impact Index Per Article: 0.4] [Indexed: 11/12/2022] Open
Abstract
In this study, we show novel DNA motifs that promote single nucleotide polymorphism (SNP) formation and are conserved among exons, introns, and intergenic DNA from mice (Sanger Mouse Genomes Project), human genes (1000 Genomes), and tumor-specific somatic mutations (data from TCGA). We further characterize SNPs likely to be very recent in origin (i.e., formed in otherwise congenic mice) and show enrichment for both synonymous and parallel DNA variants occurring under circumstances not attributable to purifying selection. The findings provide insight regarding SNP contextual bias and eukaryotic codon usage as strategies that favor long-term exonic stability. The study also furnishes new information concerning rates of murine genomic evolution and features of DNA mutagenesis (at the time of SNP formation) that should be viewed as "adaptive."
Affiliation(s)
| | - Aubrey E Hill
- Department of Computer and Information Sciences, University of Alabama at Birmingham
| | - Christopher W McAtee
- Gregory Fleming James Cystic Fibrosis Research Center, University of Alabama at Birmingham
| | - Xiangqin Cui
- Department of Biostatistics, University of Alabama at Birmingham
| | - Leah A Moseley
- Gregory Fleming James Cystic Fibrosis Research Center, University of Alabama at Birmingham
| | - Eric J Sorscher
- Department of Pediatrics, Emory University School of Medicine
|
40
|
Temiz NA, Donohue DE, Bacolla A, Vasquez KM, Cooper DN, Mudunuri U, Ivanic J, Cer RZ, Yi M, Stephens RM, Collins JR, Luke BT. The somatic autosomal mutation matrix in cancer genomes. Hum Genet 2015; 134:851-64. [PMID: 26001532 PMCID: PMC4495249 DOI: 10.1007/s00439-015-1566-1] [Citation(s) in RCA: 12] [Impact Index Per Article: 1.2] [Received: 02/17/2015] [Accepted: 05/12/2015] [Indexed: 01/26/2023]
Abstract
DNA damage in somatic cells originates from both environmental and endogenous sources, giving rise to mutations through multiple mechanisms. When these mutations affect the function of critical genes, cancer may ensue. Although identifying genomic subsets of mutated genes may inform therapeutic options, a systematic survey of tumor mutational spectra is required to improve our understanding of the underlying mechanisms of mutagenesis involved in cancer etiology. Recent studies have presented genome-wide sets of somatic mutations as a 96-element vector, a procedure that only captures the immediate neighbors of the mutated nucleotide. Herein, we present a 32 × 12 mutation matrix that captures the nucleotide pattern two nucleotides upstream and downstream of the mutation. A somatic autosomal mutation matrix (SAMM) was constructed from tumor-specific mutations derived from each of 909 individual cancer genomes harboring a total of 10,681,843 single-base substitutions. In addition, mechanistic template mutation matrices (MTMMs) representing oxidative DNA damage, ultraviolet-induced DNA damage, (5m)CpG deamination, and APOBEC-mediated cytosine mutation, are presented. MTMMs were mapped to the individual tumor SAMMs to determine the maximum contribution of each mutational mechanism to the overall mutation pattern. A Manhattan distance across all SAMM elements between any two tumor genomes was used to determine their relative distance. Employing this metric, 89.5% of all tumor genomes were found to have a nearest neighbor from the same tissue of origin. When a distance-dependent 6-nearest neighbor classifier was used, 10.4% of the SAMMs had an Undetermined tissue of origin, and 92.2% of the remaining SAMMs were assigned to the correct tissue of origin. Thus, although tumors from different tissues may have similar mutation patterns, their SAMMs often display signatures that are characteristic of specific tissues.
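The distance metric described in this abstract can be sketched in a few lines. This is a minimal illustration, assuming each tumor is represented by an equally shaped matrix of mutation counts or frequencies; the small 2 × 3 toy matrices stand in for the 32 × 12 SAMMs, the tissue labels and values are invented, and the single-nearest-neighbor lookup is a simplification of the 6-nearest neighbor classifier the authors report.

```python
def manhattan_distance(a, b):
    """Sum of absolute element-wise differences between two equally shaped matrices."""
    return sum(
        abs(x - y)
        for row_a, row_b in zip(a, b)
        for x, y in zip(row_a, row_b)
    )


def nearest_neighbor_tissue(query, references):
    """Return the tissue label of the reference matrix closest to the query.

    references is a list of (tissue_label, matrix) pairs.
    """
    label, _ = min(references, key=lambda item: manhattan_distance(query, item[1]))
    return label


# Toy data, invented for illustration only.
references = [
    ("lung", [[5, 1, 0], [2, 1, 1]]),
    ("skin", [[1, 6, 1], [0, 1, 1]]),
]
query = [[4, 2, 0], [2, 1, 1]]

print(manhattan_distance(query, references[0][1]))  # distance to the "lung" matrix
print(nearest_neighbor_tissue(query, references))
```

Whether matrices should be normalized before comparison is not stated in the abstract, so that choice is left open here.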
Affiliation(s)
- Nuri A. Temiz
- In Silico Research Centers of Excellence, Advanced Biomedical Computing Center, Information Systems Program, Frederick National Laboratory for Cancer Research, Leidos Biomedical Research Inc., P.O. Box B, Frederick, MD 21702 USA
- Masonic Cancer Center, University of Minnesota, 2-120 CCRB, 2231 6th St SE, Minneapolis, MN 55455 USA
| | - Duncan E. Donohue
- In Silico Research Centers of Excellence, Advanced Biomedical Computing Center, Information Systems Program, Frederick National Laboratory for Cancer Research, Leidos Biomedical Research Inc., P.O. Box B, Frederick, MD 21702 USA
- US Army Medical Research and Material Command, 568 Doughten Dr., Fort Detrick, Frederick, MD 21702 USA
| | - Albino Bacolla
- In Silico Research Centers of Excellence, Advanced Biomedical Computing Center, Information Systems Program, Frederick National Laboratory for Cancer Research, Leidos Biomedical Research Inc., P.O. Box B, Frederick, MD 21702 USA
- Division of Pharmacology and Toxicology, The University of Texas at Austin, Austin, TX 78723 USA
| | - Karen M. Vasquez
- Division of Pharmacology and Toxicology, The University of Texas at Austin, Austin, TX 78723 USA
| | - David N. Cooper
- Institute of Medical Genetics, School of Medicine, Cardiff University, Cardiff, CF14 4XN UK
| | - Uma Mudunuri
- In Silico Research Centers of Excellence, Advanced Biomedical Computing Center, Information Systems Program, Frederick National Laboratory for Cancer Research, Leidos Biomedical Research Inc., P.O. Box B, Frederick, MD 21702 USA
| | - Joseph Ivanic
- In Silico Research Centers of Excellence, Advanced Biomedical Computing Center, Information Systems Program, Frederick National Laboratory for Cancer Research, Leidos Biomedical Research Inc., P.O. Box B, Frederick, MD 21702 USA
| | - Regina Z. Cer
- In Silico Research Centers of Excellence, Advanced Biomedical Computing Center, Information Systems Program, Frederick National Laboratory for Cancer Research, Leidos Biomedical Research Inc., P.O. Box B, Frederick, MD 21702 USA
- Naval Medical Research Center-Frederick, 8400 Research Plaza, Fort Detrick, Frederick, MD 21702 USA
| | - Ming Yi
- In Silico Research Centers of Excellence, Advanced Biomedical Computing Center, Information Systems Program, Frederick National Laboratory for Cancer Research, Leidos Biomedical Research Inc., P.O. Box B, Frederick, MD 21702 USA
| | - Robert M. Stephens
- In Silico Research Centers of Excellence, Advanced Biomedical Computing Center, Information Systems Program, Frederick National Laboratory for Cancer Research, Leidos Biomedical Research Inc., P.O. Box B, Frederick, MD 21702 USA
| | - Jack R. Collins
- In Silico Research Centers of Excellence, Advanced Biomedical Computing Center, Information Systems Program, Frederick National Laboratory for Cancer Research, Leidos Biomedical Research Inc., P.O. Box B, Frederick, MD 21702 USA
| | - Brian T. Luke
- In Silico Research Centers of Excellence, Advanced Biomedical Computing Center, Information Systems Program, Frederick National Laboratory for Cancer Research, Leidos Biomedical Research Inc., P.O. Box B, Frederick, MD 21702 USA
|
41
|
Bladen CL, Salgado D, Monges S, Foncuberta ME, Kekou K, Kosma K, Dawkins H, Lamont L, Roy AJ, Chamova T, Guergueltcheva V, Chan S, Korngut L, Campbell C, Dai Y, Wang J, Barišić N, Brabec P, Lahdetie J, Walter MC, Schreiber-Katz O, Karcagi V, Garami M, Viswanathan V, Bayat F, Buccella F, Kimura E, Koeks Z, van den Bergen JC, Rodrigues M, Roxburgh R, Lusakowska A, Kostera-Pruszczyk A, Zimowski J, Santos R, Neagu E, Artemieva S, Rasic VM, Vojinovic D, Posada M, Bloetzer C, Jeannet PY, Joncourt F, Díaz-Manera J, Gallardo E, Karaduman AA, Topaloğlu H, El Sherif R, Stringer A, Shatillo AV, Martin AS, Peay HL, Bellgard MI, Kirschner J, Flanigan KM, Straub V, Bushby K, Verschuuren J, Aartsma-Rus A, Béroud C, Lochmüller H. The TREAT-NMD DMD Global Database: analysis of more than 7,000 Duchenne muscular dystrophy mutations. Hum Mutat 2015; 36:395-402. [PMID: 25604253 PMCID: PMC4405042 DOI: 10.1002/humu.22758] [Citation(s) in RCA: 487] [Impact Index Per Article: 48.7] [Received: 11/18/2014] [Accepted: 01/13/2015] [Indexed: 12/22/2022]
Abstract
Analyzing the type and frequency of patient-specific mutations that give rise to Duchenne muscular dystrophy (DMD) is an invaluable tool for diagnostics, basic scientific research, trial planning, and improved clinical care. Locus-specific databases allow for the collection, organization, storage, and analysis of genetic variants of disease. Here, we describe the development and analysis of the TREAT-NMD DMD Global database (http://umd.be/TREAT_DMD/). We analyzed genetic data for 7,149 DMD mutations held within the database. A total of 5,682 large mutations were observed (80% of total mutations), of which 4,894 (86%) were deletions (1 exon or larger) and 784 (14%) were duplications (1 exon or larger). There were 1,445 small mutations (smaller than 1 exon, 20% of all mutations), of which 358 (25%) were small deletions and 132 (9%) small insertions and 199 (14%) affected the splice sites. Point mutations totalled 756 (52% of small mutations) with 726 (50%) nonsense mutations and 30 (2%) missense mutations. Finally, 22 (0.3%) mid-intronic mutations were observed. In addition, mutations were identified within the database that would potentially benefit from novel genetic therapies for DMD including stop codon read-through therapies (10% of total mutations) and exon skipping therapy (80% of deletions and 55% of total mutations).
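The counts quoted in this abstract can be cross-checked with simple arithmetic. The sketch below uses only the numbers stated above; the variable names and the one-decimal rounding are choices made here for illustration. Two observations follow from the figures themselves: the three top-level classes sum to the stated total, while the listed deletions and duplications sum to 5,678, so four of the 5,682 large mutations are not itemized in the abstract.

```python
def pct(part, whole):
    """Percentage of `whole` represented by `part`, rounded to one decimal."""
    return round(100 * part / whole, 1)


total = 7149
large, small, mid_intronic = 5682, 1445, 22
large_deletions, large_duplications = 4894, 784
small_classes = {
    "small deletions": 358,
    "small insertions": 132,
    "splice site": 199,
    "point mutations": 756,
}
nonsense, missense = 726, 30

print("large / total:", pct(large, total))            # reported as 80%
print("small / total:", pct(small, total))            # reported as 20%
print("deletions / large:", pct(large_deletions, large))      # reported as 86%
print("duplications / large:", pct(large_duplications, large))  # reported as 14%
for name, count in small_classes.items():
    print(name, "/ small:", pct(count, small))
print("nonsense / small:", pct(nonsense, small))      # reported as 50%
print("missense / small:", pct(missense, small))      # reported as 2%
```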
Affiliation(s)
- Catherine L Bladen
- The John Walton Muscular Dystrophy Research Centre, MRC Centre for Neuromuscular Diseases Institute of Genetic Medicine, University of Newcastle, Central Parkway, Newcastle upon Tyne, UK
|
42
|
Cheung PPH, Rogozin IB, Choy KT, Ng HY, Peiris JSM, Yen HL. Comparative mutational analyses of influenza A viruses. RNA (New York, N.Y.) 2015; 21:36-47. [PMID: 25404565 PMCID: PMC4274636 DOI: 10.1261/rna.045369.114] [Citation(s) in RCA: 13] [Impact Index Per Article: 1.3] [Indexed: 05/12/2023]
Abstract
The error-prone RNA-dependent RNA polymerase (RdRP) and external selective pressures are the driving forces for RNA viral diversity. When confounded by selective pressures, it is difficult to assess if influenza A viruses (IAV) that have a wide host range possess comparable or distinct spontaneous mutational frequency in their RdRPs. We used in-depth bioinformatics analyses to assess the spontaneous mutational frequencies of two RdRPs derived from human seasonal (A/Wuhan/359/95; Wuhan) and H5N1 (A/Vietnam/1203/04; VN1203) viruses using the mini-genome system with a common firefly luciferase reporter serving as the template. High-fidelity reverse transcriptase was applied to generate high-quality mutational spectra which allowed us to assess and compare the mutational frequencies and mutable motifs along a target sequence of the two RdRPs of two different subtypes. We observed correlated mutational spectra (τ correlation P < 0.0001), comparable mutational frequencies (H3N2:5.8 ± 0.9; H5N1:6.0 ± 0.5), and discovered a highly mutable motif "(A)AAG" for both Wuhan and VN1203 RdRPs. Results were then confirmed with two recombinant A/Puerto Rico/8/34 (PR8) viruses that possess RdRP derived from Wuhan or VN1203 (RG-PR8×Wuhan(PB2, PB1, PA, NP) and RG-PR8×VN1203(PB2, PB1, PA, NP)). Applying novel bioinformatics analysis on influenza mutational spectra, we provide a platform for a comprehensive analysis of the spontaneous mutation spectra for an RNA virus.
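Locating candidate occurrences of the "(A)AAG" motif in a template sequence can be sketched as follows. This is only an illustration of motif scanning, not the bioinformatics pipeline used in the study: it assumes the notation means an AAG core that may be preceded by an additional A, and the example sequence is invented rather than taken from the luciferase reporter.

```python
def find_aag_motifs(seq):
    """Return (start_index, preceded_by_a) for every AAG occurrence in seq.

    start_index is the 0-based position of the first A of the AAG core;
    preceded_by_a is True when the base just upstream is also an A.
    """
    seq = seq.upper()
    hits = []
    start = seq.find("AAG")
    while start != -1:
        preceded_by_a = start > 0 and seq[start - 1] == "A"
        hits.append((start, preceded_by_a))
        start = seq.find("AAG", start + 1)
    return hits


# Invented example sequence, for illustration only.
print(find_aag_motifs("TTAAGCAAAGG"))
```

Mutation counts per site could then be compared between motif and non-motif positions, which is the kind of question a mutable-motif analysis asks.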
Affiliation(s)
- Peter Pak-Hang Cheung
- School of Public Health, Li Ka Shing Faculty of Medicine, The University of Hong Kong, Pokfulam, Hong Kong
| | - Igor B Rogozin
- National Center for Biotechnology Information, National Library of Medicine, National Institutes of Health, Bethesda, Maryland 20894-6075, USA
| | - Ka-Tim Choy
- School of Public Health, Li Ka Shing Faculty of Medicine, The University of Hong Kong, Pokfulam, Hong Kong
| | - Hoi Yee Ng
- School of Public Health, Li Ka Shing Faculty of Medicine, The University of Hong Kong, Pokfulam, Hong Kong
| | - Joseph Sriyal Malik Peiris
- School of Public Health, Li Ka Shing Faculty of Medicine, The University of Hong Kong, Pokfulam, Hong Kong
| | - Hui-Ling Yen
- School of Public Health, Li Ka Shing Faculty of Medicine, The University of Hong Kong, Pokfulam, Hong Kong
|
43
|
Niyazoglu M, Sayitoglu M, Firtina S, Hatipoglu E, Gazioglu N, Kadioglu P. Familial acromegaly due to aryl hydrocarbon receptor-interacting protein (AIP) gene mutation in a Turkish cohort. Pituitary 2014; 17:220-6. [PMID: 23743763 DOI: 10.1007/s11102-013-0493-1] [Citation(s) in RCA: 9] [Impact Index Per Article: 0.8] [Indexed: 11/29/2022]
Abstract
Aryl hydrocarbon receptor-interacting protein (AIP) is associated with 15-20% of familial isolated pituitary adenomas and 50-80% of cases with AIP mutation exhibit a somatotropinoma. Herein we report clinical characteristics of a large family where AIP R304X variants have been identified. AIP mutation analysis was performed on a large (n = 52) Turkish family across six generations. Sella MRIs of 30 family members were obtained. Basal pituitary hormone levels were evaluated in 13 family members harboring an AIP mutation. Thirteen of 52 family members (25%) were found to have a heterozygous nonsense germline R304X mutation in the AIP gene. Seven of the 13 mutation carriers (53.8%) had current or previous history of pituitary adenoma. Of these 7 mutation carriers, all but one had somatotropinoma/somatolactotropinoma (85.7% of the pituitary adenomas). Of the 6 acromegaly patients with AIP mutation (F/M: 3/3) the mean age at diagnosis of acromegaly was 32 ± 10.3 years while the mean age of symptom onset was 24.8 ± 9.9 years. Three of the six (50%) acromegaly cases with AIP mutation within the family presented with a macroadenoma and none presented with gigantism. Biochemical disease control was achieved in 66.6% (4/6) of the mutation carriers with acromegaly after a mean follow-up period of 18.6 ± 17.6 years. Common phenotypic characteristics of familial pituitary adenoma or somatotropinoma due to AIP mutation vary between families or even between individuals within a family.
Affiliation(s)
- Mutlu Niyazoglu
- Division of Endocrinology and Metabolism, Department of Internal Medicine, Cerrahpasa Medical Faculty, Istanbul University, Istanbul, Turkey
|
44
|
Nonsense-mediated decay in genetic disease: friend or foe? Mutation Research-Reviews in Mutation Research 2014; 762:52-64. [PMID: 25485595 DOI: 10.1016/j.mrrev.2014.05.001] [Citation(s) in RCA: 154] [Impact Index Per Article: 14.0] [Received: 11/01/2013] [Revised: 05/02/2014] [Accepted: 05/03/2014] [Indexed: 12/11/2022]
Abstract
Eukaryotic cells utilize various RNA quality control mechanisms to ensure high fidelity of gene expression, thus protecting against the accumulation of nonfunctional RNA and the subsequent production of abnormal peptides. Messenger RNAs (mRNAs) are largely responsible for protein production, and mRNA quality control is particularly important for protecting the cell against the downstream effects of genetic mutations. Nonsense-mediated decay (NMD) is an evolutionarily conserved mRNA quality control system in all eukaryotes that degrades transcripts containing premature termination codons (PTCs). By degrading these aberrant transcripts, NMD acts to prevent the production of truncated proteins that could otherwise harm the cell through various insults, such as dominant negative effects or the ER stress response. Although NMD functions to protect the cell against the deleterious effects of aberrant mRNA, there is a growing body of evidence that mutation-, codon-, gene-, cell-, and tissue-specific differences in NMD efficiency can alter the underlying pathology of genetic disease. In addition, the protective role that NMD plays in genetic disease can undermine current therapeutic strategies aimed at increasing the production of full-length functional protein from genes harboring nonsense mutations. Here, we review the normal function of this RNA surveillance pathway and how it is regulated, provide current evidence for the role that it plays in modulating genetic disease phenotypes, and how NMD can be used as a therapeutic target.
|
45
|
Erickson RP, Mitchison NA. The low frequency of recessive disease: insights from ENU mutagenesis, severity of disease phenotype, GWAS associations, and demography: an analytical review. J Appl Genet 2014; 55:319-27. [PMID: 24652618 DOI: 10.1007/s13353-014-0203-3] [Citation(s) in RCA: 8] [Impact Index Per Article: 0.7] [Received: 01/09/2014] [Revised: 02/25/2014] [Accepted: 02/26/2014] [Indexed: 11/26/2022]
Abstract
A survey of a select panel of 14 genetic diseases with mixed inheritance confirms that, while autosomal recessive (AR) disease genes are more numerous than autosomal dominant (AD) or X-linked (XL) ones, they make a smaller average contribution to disease. Data collected from N-ethyl-N-nitrosourea (ENU) mutagenesis studies show a similar excess of AR mutations. The smaller AR contribution may partially reflect disease severity, but only in the comparison of AR with AD mutations. On the contrary, XL mutations for the 14 diseases are generally more severe. Genome-wide association studies (GWAS) data provide fresh insight into the shortage, with a limited negative selection effect mediated by the pleiotropic expression of recessive disease genes in other deleterious phenotypes. Genomic data provide further evidence of purging selection in a past European population bottleneck followed by a dramatic population explosion, now more clearly associated with past climate change. We consider these likely to be the main factors responsible for the low AR to AD/XL inheritance ratio.
Affiliation(s)
- Robert P Erickson
- Department of Pediatrics, University of Arizona, Tucson, AZ, 85724, USA
|
46
|
Towers RE, Murgiano L, Millar DS, Glen E, Topf A, Jagannathan V, Drögemüller C, Goodship JA, Clarke AJ, Leeb T. A nonsense mutation in the IKBKG gene in mares with incontinentia pigmenti. PLoS One 2013; 8:e81625. [PMID: 24324710 PMCID: PMC3852476 DOI: 10.1371/journal.pone.0081625] [Citation(s) in RCA: 16] [Impact Index Per Article: 1.3] [Received: 08/21/2013] [Accepted: 10/25/2013] [Indexed: 11/19/2022] Open
Abstract
Ectodermal dysplasias (EDs) are a large and heterogeneous group of hereditary disorders characterized by abnormalities in structures of ectodermal origin. Incontinentia pigmenti (IP) is an ED characterized by skin lesions evolving over time, as well as dental, nail, and ocular abnormalities. Due to X-linked dominant inheritance IP symptoms can only be seen in female individuals while affected males die during development in utero. We observed a family of horses, in which several mares developed signs of a skin disorder reminiscent of human IP. Cutaneous manifestations in affected horses included the development of pruritic, exudative lesions soon after birth. These developed into wart-like lesions and areas of alopecia with occasional wooly hair re-growth. Affected horses also had streaks of darker and lighter coat coloration from birth. The observation that only females were affected together with a high number of spontaneous abortions suggested an X-linked dominant mechanism of transmission. Using next generation sequencing we sequenced the whole genome of one affected mare. We analyzed the sequence data for non-synonymous variants in candidate genes and found a heterozygous nonsense variant in the X-chromosomal IKBKG gene (c.184C>T; p.Arg62*). Mutations in IKBKG were previously reported to cause IP in humans and the homologous p.Arg62* variant has already been observed in a human IP patient. The comparative data thus strongly suggest that this is also the causative variant for the observed IP in horses. To our knowledge this is the first large animal model for IP.
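The variant notation c.184C>T; p.Arg62* can be unpacked with a short worked example. The sketch assumes the standard genetic code and standard 1-based coding-sequence numbering; the helper names are illustrative. It shows that coding position 184 is the first base of codon 62 and that, among the six arginine codons, only CGA becomes a stop codon (TGA) through a C>T change at that position.

```python
STOP_CODONS = {"TAA", "TAG", "TGA"}
ARG_CODONS = ["CGT", "CGC", "CGA", "CGG", "AGA", "AGG"]


def codon_number_and_offset(cds_position):
    """Map a 1-based coding position to (1-based codon number, 0-based offset in codon)."""
    return (cds_position - 1) // 3 + 1, (cds_position - 1) % 3


def arg_codons_becoming_stop(offset, ref, alt):
    """List (arginine codon, mutated codon) pairs where ref>alt at offset creates a stop."""
    results = []
    for codon in ARG_CODONS:
        if codon[offset] != ref:
            continue
        mutated = codon[:offset] + alt + codon[offset + 1:]
        if mutated in STOP_CODONS:
            results.append((codon, mutated))
    return results


codon, offset = codon_number_and_offset(184)
print(codon, offset)
print(arg_codons_becoming_stop(offset, "C", "T"))
```

The deduced reference codon CGA contains a CpG dinucleotide, which is consistent with the general observation that C>T transitions at CpG sites are a common source of nonsense variants.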
Affiliation(s)
- Rachel E. Towers
- Institute of Medical Genetics, Cardiff University, Cardiff, United Kingdom
- Institute of Genetic Medicine, Newcastle University, Newcastle upon Tyne, United Kingdom
| | - Leonardo Murgiano
- Institute of Genetics, Vetsuisse Faculty, University of Bern, Bern, Switzerland
- DermFocus, Vetsuisse Faculty, University of Bern, Bern, Switzerland
| | - David S. Millar
- Institute of Medical Genetics, Cardiff University, Cardiff, United Kingdom
| | - Elise Glen
- Institute of Genetic Medicine, Newcastle University, Newcastle upon Tyne, United Kingdom
| | - Ana Topf
- Institute of Genetic Medicine, Newcastle University, Newcastle upon Tyne, United Kingdom
| | - Vidhya Jagannathan
- Institute of Genetics, Vetsuisse Faculty, University of Bern, Bern, Switzerland
- DermFocus, Vetsuisse Faculty, University of Bern, Bern, Switzerland
| | - Cord Drögemüller
- Institute of Genetics, Vetsuisse Faculty, University of Bern, Bern, Switzerland
- DermFocus, Vetsuisse Faculty, University of Bern, Bern, Switzerland
| | - Judith A. Goodship
- Institute of Genetic Medicine, Newcastle University, Newcastle upon Tyne, United Kingdom
| | - Angus J. Clarke
- Institute of Medical Genetics, Cardiff University, Cardiff, United Kingdom
| | - Tosso Leeb
- Institute of Genetics, Vetsuisse Faculty, University of Bern, Bern, Switzerland
- DermFocus, Vetsuisse Faculty, University of Bern, Bern, Switzerland
|
47
|
Molecular genetic epidemiology of human diseases: from patterns to predictions. Hum Genet 2013; 133:425-30. [PMID: 24241280 DOI: 10.1007/s00439-013-1396-y] [Citation(s) in RCA: 9] [Impact Index Per Article: 0.8] [Received: 11/01/2013] [Accepted: 11/07/2013] [Indexed: 10/26/2022]
Abstract
Databases of disease-associated or disease-causing mutations allow the study, not only of the molecular mechanisms underlying the primary lesions at the DNA level, but also of the functional consequences of mutation at the phenotypic level. The Human Gene Mutation Database (HGMD) and the bioinformatics analyses of its content provide an illustrative example of this indirect approach to molecular genetic epidemiology. In fact, the Bayesian type of reasoning underlying previous scientific analyses of HGMD data is also reflected in current software tools used to predict the likely disease relevance of a newly detected genetic variant. After a brief summary of the past scientific utility of HGMD, we therefore briefly review three representative and commonly used examples of these tools, namely SIFT, PolyPhen-2 and NNSplice.
|
48
|
Fan X, Yoshida Y, Honda S, Matsumoto M, Sawada Y, Hattori M, Hisanaga S, Hiwa R, Nakamura F, Tomomori M, Miyagawa S, Fujimaru R, Yamada H, Sawai T, Ikeda Y, Iwata N, Uemura O, Matsukuma E, Aizawa Y, Harada H, Wada H, Ishikawa E, Ashida A, Nangaku M, Miyata T, Fujimura Y. Analysis of genetic and predisposing factors in Japanese patients with atypical hemolytic uremic syndrome. Mol Immunol 2013; 54:238-46. [DOI: 10.1016/j.molimm.2012.12.006] [Citation(s) in RCA: 56] [Impact Index Per Article: 4.7] [Received: 11/05/2012] [Accepted: 12/09/2012] [Indexed: 11/24/2022]
|
49
|
Zhang R, Yap VB. Context-dependent substitution models for circular DNA. Infection Genetics and Evolution 2013; 18:362-6. [PMID: 23499773 DOI: 10.1016/j.meegid.2013.03.001] [Citation(s) in RCA: 1] [Impact Index Per Article: 0.1] [Received: 06/14/2012] [Revised: 02/25/2013] [Accepted: 03/02/2013] [Indexed: 11/30/2022]
Abstract
The most general context-dependent Markov substitution process, where each substitution event involves only one site and substitution rates depend on the whole sequence, is presented for the first time. The focus is on circular DNA sequences, where the problem of specifying the behaviour of the first and last sites in a linear sequence does not arise. Important special cases include (1) the established models where each site behaves independently, (2) models which are increasingly applied to non-coding DNA, where each site depends on only the immediate neighbouring sites, and (3) models where each site depends on two closest neighbours on both sides, such as the codon models. These special cases are classified and illustrated by published models. It is shown that the existing codon substitution models mix up the mutation and selection processes, rendering the substitution rates challenging to interpret. The classification suggests the study of a more interpretable codon model, where the mutation and selection processes are clearly delineated. Furthermore, this model allows a natural accommodation of possibly different selection pressures in overlapping reading frames, which may contribute to furthering the understanding of viral diseases. Also included are brief discussions on the stationary distribution of a context-dependent substitution process and a simple recipe for simulating it on a computer.
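One way to simulate a context-dependent substitution process on a circular sequence is a Gillespie-style event loop, sketched below. This is not the recipe given in the paper, whose details are not reproduced in the abstract; it is a generic illustration of special case (2), where each site's rates depend only on its immediate neighbours. The rate function, the `cpg_boost` parameter, and all numeric values are assumptions chosen for the example. Indexing modulo the sequence length is what makes the sequence circular, so the first and last sites need no special treatment.

```python
import random

BASES = "ACGT"


def substitution_rates(seq, i, base_rate=1.0, cpg_boost=10.0):
    """Return {new_base: rate} for site i, given its two circular neighbours."""
    n = len(seq)
    left, cur, right = seq[(i - 1) % n], seq[i], seq[(i + 1) % n]
    rates = {b: base_rate for b in BASES if b != cur}
    # Toy context effect (an assumption): faster C>T and G>A within a CG pair.
    if cur == "C" and right == "G":
        rates["T"] *= cpg_boost
    if cur == "G" and left == "C":
        rates["A"] *= cpg_boost
    return rates


def simulate(seq, t_end, rng):
    """Evolve seq for t_end time units; each event changes exactly one site."""
    seq = list(seq)
    t = 0.0
    while True:
        events = [
            (i, b, r)
            for i in range(len(seq))
            for b, r in substitution_rates(seq, i).items()
        ]
        weights = [r for _, _, r in events]
        # Waiting time to the next substitution anywhere in the sequence.
        t += rng.expovariate(sum(weights))
        if t >= t_end:
            return "".join(seq)
        # Pick which substitution happens, proportionally to its rate.
        i, new_base, _ = rng.choices(events, weights=weights)[0]
        seq[i] = new_base


print(simulate("ACGT" * 5, 2.0, random.Random(0)))
```

Rates are recomputed after every event because a substitution at one site changes the context, and therefore the rates, of its neighbours.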
Affiliation(s)
- Rongli Zhang
- Department of Statistics and Applied Probability, National University of Singapore, Block S16 Level 7, 6 Science Drive 2, Singapore 117546, Singapore
|
50
|
Early results of sarcomeric gene screening from the Egyptian National BA-HCM Program. J Cardiovasc Transl Res 2012; 6:65-80. [PMID: 23233322 PMCID: PMC3546296 DOI: 10.1007/s12265-012-9425-0] [Citation(s) in RCA: 26] [Impact Index Per Article: 2.0] [Received: 09/03/2012] [Accepted: 11/07/2012] [Indexed: 02/01/2023]
Abstract
The present study comprised sarcomeric genotyping of the three most commonly involved sarcomeric genes: MYBPC3, MYH7, and TNNT2 in 192 unrelated Egyptian hypertrophic cardiomyopathy (HCM) index patients. Mutations were detected in 40 % of cases. Presence of positive family history was significantly (p = 0.002) associated with a higher genetic positive yield (49/78, 62.8 %). The majority of the detected mutations in the three sarcomeric genes were novel (40/62, 65 %) and mostly private (47/62, 77 %). Single nucleotide substitution was the most frequently detected mutation type (51/62, 82 %). Over three quarters of these substitutions (21/27, 78 %) involved CpG dinucleotide sites and resulted from C > T or G > A transition in the three analyzed genes, highlighting the significance of CpG high mutability within the sarcomeric genes examined. This study could aid in global comparative studies in different ethnic populations and constitutes an important step in the evolution of the integrated clinical, translational, and basic science HCM program.
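Whether a substitution is a CpG transition of the kind described here can be checked from the reference sequence context. The sketch below assumes the common definition: a C>T change at a C followed by G, or the equivalent G>A change at a G preceded by C (the same event seen on the opposite strand). The function name and the short test sequences are invented for illustration.

```python
def is_cpg_transition(ref_seq, pos, alt):
    """Return True if changing ref_seq[pos] to alt is a C>T or G>A transition at a CpG site.

    pos is a 0-based index into ref_seq, read 5' to 3' on the reported strand.
    """
    ref = ref_seq[pos]
    if ref == "C" and alt == "T":
        return pos + 1 < len(ref_seq) and ref_seq[pos + 1] == "G"
    if ref == "G" and alt == "A":
        return pos > 0 and ref_seq[pos - 1] == "C"
    return False


print(is_cpg_transition("ACGT", 1, "T"))  # C followed by G, C>T
print(is_cpg_transition("ACAT", 1, "T"))  # C not followed by G
```

Applied to a list of detected substitutions, the proportion returning True would correspond to the CpG fraction reported in the abstract.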
|