51
Bueno-Martínez E, Sanoguera-Miralles L, Valenzuela-Palomo A, Lorca V, Gómez-Sanz A, Carvalho S, Allen J, Infante M, Pérez-Segura P, Lázaro C, Easton DF, Devilee P, Vreeswijk MPG, de la Hoya M, Velasco EA. RAD51D Aberrant Splicing in Breast Cancer: Identification of Splicing Regulatory Elements and Minigene-Based Evaluation of 53 DNA Variants. Cancers (Basel) 2021; 13:2845. [PMID: 34200360 PMCID: PMC8201001 DOI: 10.3390/cancers13112845] [Citation(s) in RCA: 10] [Impact Index Per Article: 3.3] [Received: 04/16/2021] [Revised: 06/01/2021] [Accepted: 06/03/2021] [Indexed: 12/18/2022] Open
Abstract
RAD51D loss-of-function variants increase lifetime risk of breast and ovarian cancer. Splicing disruption is a frequent pathogenic mechanism associated with variants in susceptibility genes. Herein, we have assessed the splicing and clinical impact of splice-site and exonic splicing enhancer (ESE) variants identified through the study of ~113,000 women of the BRIDGES cohort. A RAD51D minigene with exons 2-9 was constructed in splicing vector pSAD. Eleven BRIDGES splice-site variants (selected by MaxEntScan) were introduced into the minigene by site-directed mutagenesis and tested in MCF-7 cells. The 11 variants disrupted splicing, collectively generating 25 different aberrant transcripts. All variants but one produced negligible levels (<3.4%) of the full-length (FL) transcript. In addition, ESE elements of the alternative exon 3 were mapped by testing four overlapping exonic microdeletions (≥30-bp), revealing an ESE-rich interval (c.202_235del) with critical sequences for exon 3 recognition that might have been affected by germline variants. Next, 26 BRIDGES variants and 16 artificial exon 3 single-nucleotide substitutions were also assayed. Thirty variants impaired splicing with variable amounts (0-65.1%) of the FL transcript, although only c.202G>A demonstrated a complete aberrant splicing pattern without the FL transcript. On the other hand, c.214T>C increased efficiency of exon 3 recognition, so only the FL transcript was detected (100%). In conclusion, 41 RAD51D spliceogenic variants (28 of which were from the BRIDGES cohort) were identified by minigene assays. We show that minigene-based mapping of ESEs is a powerful approach for identifying ESE hotspots and ESE-disrupting variants. Finally, we have classified nine variants as likely pathogenic according to ACMG/AMP-based guidelines, highlighting the complex relationship between splicing alterations and variant interpretation.
Affiliation(s)
- Elena Bueno-Martínez
  - Splicing and Genetic Susceptibility to Cancer Laboratory, Unidad de Excelencia Instituto de Biología y Genética Molecular, Consejo Superior de Investigaciones Científicas (CSIC-UVa), 47003 Valladolid, Spain; (E.B.-M.); (L.S.-M.); (A.V.-P.)
- Lara Sanoguera-Miralles
  - Splicing and Genetic Susceptibility to Cancer Laboratory, Unidad de Excelencia Instituto de Biología y Genética Molecular, Consejo Superior de Investigaciones Científicas (CSIC-UVa), 47003 Valladolid, Spain; (E.B.-M.); (L.S.-M.); (A.V.-P.)
- Alberto Valenzuela-Palomo
  - Splicing and Genetic Susceptibility to Cancer Laboratory, Unidad de Excelencia Instituto de Biología y Genética Molecular, Consejo Superior de Investigaciones Científicas (CSIC-UVa), 47003 Valladolid, Spain; (E.B.-M.); (L.S.-M.); (A.V.-P.)
- Víctor Lorca
  - Molecular Oncology Laboratory CIBERONC, IdISSC (Instituto de Investigación Sanitaria del Hospital Clínico San Carlos), Hospital Clinico San Carlos, 28040 Madrid, Spain; (V.L.); (A.G.-S.); (P.P.-S.)
- Alicia Gómez-Sanz
  - Molecular Oncology Laboratory CIBERONC, IdISSC (Instituto de Investigación Sanitaria del Hospital Clínico San Carlos), Hospital Clinico San Carlos, 28040 Madrid, Spain; (V.L.); (A.G.-S.); (P.P.-S.)
- Sara Carvalho
  - Centre for Cancer Genetic Epidemiology, Department of Public Health and Primary Care, University of Cambridge, Cambridge CB1 8RN, UK; (S.C.); (J.A.); (D.F.E.)
- Jamie Allen
  - Centre for Cancer Genetic Epidemiology, Department of Public Health and Primary Care, University of Cambridge, Cambridge CB1 8RN, UK; (S.C.); (J.A.); (D.F.E.)
- Mar Infante
  - Cancer Genetics, Unidad de Excelencia Instituto de Biología y Genética Molecular (CSIC-UVa), 47003 Valladolid, Spain
- Pedro Pérez-Segura
  - Molecular Oncology Laboratory CIBERONC, IdISSC (Instituto de Investigación Sanitaria del Hospital Clínico San Carlos), Hospital Clinico San Carlos, 28040 Madrid, Spain; (V.L.); (A.G.-S.); (P.P.-S.)
- Conxi Lázaro
  - Hereditary Cancer Program, Catalan Institute of Oncology, IDIBELL and CIBERONC, 08908 Hospitalet de Llobregat, Spain
- Douglas F. Easton
  - Centre for Cancer Genetic Epidemiology, Department of Public Health and Primary Care, University of Cambridge, Cambridge CB1 8RN, UK; (S.C.); (J.A.); (D.F.E.)
- Peter Devilee
  - Department of Human Genetics, Leiden University Medical Center, 2300RC Leiden, The Netherlands; (P.D.); (M.P.G.V.)
- Maaike P. G. Vreeswijk
  - Department of Human Genetics, Leiden University Medical Center, 2300RC Leiden, The Netherlands; (P.D.); (M.P.G.V.)
- Miguel de la Hoya
  - Molecular Oncology Laboratory CIBERONC, IdISSC (Instituto de Investigación Sanitaria del Hospital Clínico San Carlos), Hospital Clinico San Carlos, 28040 Madrid, Spain; (V.L.); (A.G.-S.); (P.P.-S.)
- Eladio A. Velasco
  - Splicing and Genetic Susceptibility to Cancer Laboratory, Unidad de Excelencia Instituto de Biología y Genética Molecular, Consejo Superior de Investigaciones Científicas (CSIC-UVa), 47003 Valladolid, Spain; (E.B.-M.); (L.S.-M.); (A.V.-P.)
52
Hotspot exons are common targets of splicing perturbations. Nat Commun 2021; 12:2756. [PMID: 33980843 PMCID: PMC8115636 DOI: 10.1038/s41467-021-22780-2] [Citation(s) in RCA: 9] [Impact Index Per Article: 3.0] [Received: 12/03/2019] [Accepted: 02/24/2021] [Indexed: 11/08/2022] Open
Abstract
High-throughput splicing assays have demonstrated that many exonic variants can disrupt splicing; however, splice-disrupting variants distribute non-uniformly across genes. We propose the existence of exons that are particularly susceptible to splice-disrupting variants, which we refer to as hotspot exons. Hotspot exons are also more susceptible to splicing perturbation through drug treatment and knock-down of RNA-binding proteins. We develop a classifier for exonic splice-disrupting variants and use it to infer hotspot exons. We estimate that 1400 exons in the human genome are hotspots. Using panels of splicing reporters, we demonstrate how the ability of an exon to tolerate a mutation is inversely proportional to the strength of its neighboring splice sites. Splicing-disrupting mutations are linked to diseases. By employing a machine learning approach, the authors show that certain exons, termed hotspot exons, are enriched for splicing-disruption variants and susceptible to splicing perturbations.
53
Routh S, Acharyya A, Dhar R. A two-step PCR assembly for construction of gene variants across large mutational distances. Biol Methods Protoc 2021; 6:bpab007. [PMID: 33928191 PMCID: PMC8062255 DOI: 10.1093/biomethods/bpab007] [Citation(s) in RCA: 2] [Impact Index Per Article: 0.7] [Received: 01/13/2021] [Revised: 03/09/2021] [Accepted: 04/01/2021] [Indexed: 11/14/2022] Open
Abstract
Construction of empirical fitness landscapes has transformed our understanding of genotype-phenotype relationships across genes. However, most empirical fitness landscapes have been constrained to the local genotype neighbourhood of a gene primarily due to our limited ability to systematically construct genotypes that differ by a large number of mutations. Although a few methods have been proposed in the literature, these techniques are complex owing to several steps of construction or contain a large number of amplification cycles that increase chances of non-specific mutations. A few other described methods require amplification of the whole vector, thereby increasing the chances of vector backbone mutations that can have unintended consequences for study of fitness landscapes. Thus, this has substantially constrained us from traversing large mutational distances in the genotype network, thereby limiting our understanding of the interactions between multiple mutations and the role these interactions play in evolution of novel phenotypes. In the current work, we present a simple but powerful approach that allows us to systematically and accurately construct gene variants at large mutational distances. Our approach relies on building-up small fragments containing targeted mutations in the first step followed by assembly of these fragments into the complete gene fragment by polymerase chain reaction (PCR). We demonstrate the utility of our approach by constructing variants that differ by up to 11 mutations in a model gene. Our work thus provides an accurate method for construction of multi-mutant variants of genes and therefore will transform the studies of empirical fitness landscapes by enabling exploration of genotypes that are far away from a starting genotype.
Affiliation(s)
- Shreya Routh
  - Department of Biotechnology, Indian Institute of Technology Kharagpur, Kharagpur 721302, West Bengal, India
- Anamika Acharyya
  - Department of Biotechnology, Indian Institute of Technology Kharagpur, Kharagpur 721302, West Bengal, India
- Riddhiman Dhar
  - Department of Biotechnology, Indian Institute of Technology Kharagpur, Kharagpur 721302, West Bengal, India
54
Cheng J, Çelik MH, Kundaje A, Gagneur J. MTSplice predicts effects of genetic variants on tissue-specific splicing. Genome Biol 2021; 22:94. [PMID: 33789710 PMCID: PMC8011109 DOI: 10.1186/s13059-021-02273-7] [Citation(s) in RCA: 17] [Impact Index Per Article: 5.7] [Received: 06/06/2020] [Accepted: 01/14/2021] [Indexed: 12/20/2022] Open
Abstract
We develop the free and open-source model Multi-tissue Splicing (MTSplice) to predict the effects of genetic variants on splicing of cassette exons in 56 human tissues. MTSplice combines MMSplice, which models constitutive regulatory sequences, with a new neural network that models tissue-specific regulatory sequences. MTSplice outperforms MMSplice on predicting tissue-specific variations associated with genetic variants in most tissues of the GTEx dataset, with largest improvements on brain tissues. Furthermore, MTSplice predicts that autism-associated de novo mutations are enriched for variants affecting splicing specifically in the brain. We foresee that MTSplice will aid interpreting variants associated with tissue-specific disorders.
Affiliation(s)
- Jun Cheng
  - Department of Informatics, Technical University of Munich, Boltzmannstraße, Garching, 85748, Germany
- Muhammed Hasan Çelik
  - Department of Informatics, Technical University of Munich, Boltzmannstraße, Garching, 85748, Germany
- Anshul Kundaje
  - Department of Computer Science, Stanford University, Stanford, CA, USA
  - Department of Genetics, Stanford University, Stanford, CA, USA
- Julien Gagneur
  - Department of Informatics, Technical University of Munich, Boltzmannstraße, Garching, 85748, Germany
  - Institute of Computational Biology, Helmholtz Zentrum München, Neuherberg, Germany
  - Institute of Human Genetics, Klinikum rechts der Isar, Technical University of Munich, Munich, Germany
55
Královičová J, Borovská I, Pengelly R, Lee E, Abaffy P, Šindelka R, Grutzner F, Vořechovský I. Restriction of an intron size en route to endothermy. Nucleic Acids Res 2021; 49:2460-2487. [PMID: 33550394 PMCID: PMC7969005 DOI: 10.1093/nar/gkab046] [Citation(s) in RCA: 5] [Impact Index Per Article: 1.7] [Received: 08/11/2020] [Revised: 01/11/2021] [Accepted: 01/15/2021] [Indexed: 11/15/2022] Open
Abstract
Ca2+-insensitive and -sensitive E1 subunits of the 2-oxoglutarate dehydrogenase complex (OGDHC) regulate tissue-specific NADH and ATP supply by mutually exclusive OGDH exons 4a and 4b. Here we show that their splicing is enforced by distant lariat branch points (dBPs) located near the 5' splice site of the intervening intron. dBPs restrict the intron length and prevent transposon insertions, which can introduce or eliminate dBP competitors. The size restriction was imposed by a single dominant dBP in anamniotes that expanded into a conserved constellation of four dBP adenines in amniotes. The amniote clusters exhibit taxon-specific usage of individual dBPs, reflecting accessibility of their extended motifs within a stable RNA hairpin rather than U2 snRNA:dBP base-pairing. The dBP expansion took place in early terrestrial species and was followed by a uridine enrichment of large downstream polypyrimidine tracts in mammals. The dBP-protected megatracts permit reciprocal regulation of exon 4a and 4b by uridine-binding proteins, including TIA-1/TIAR and PUF60, which promote U1 and U2 snRNP recruitment to the 5' splice site and BP, respectively, but do not significantly alter the relative dBP usage. We further show that codons for residues critically contributing to protein binding sites for Ca2+ and other divalent metals confer the exon inclusion order that mirrors the Irving-Williams affinity series, linking the evolution of auxiliary splicing motifs in exons to metallome constraints. Finally, we hypothesize that the dBP-driven selection for Ca2+-dependent ATP provision by E1 facilitated evolution of endothermy by optimizing the aerobic scope in target tissues.
Affiliation(s)
- Jana Královičová
  - University of Southampton, Faculty of Medicine, HDH, Southampton SO16 6YD, UK
  - Slovak Academy of Sciences, Centre for Biosciences, 840 05 Bratislava, Slovak Republic
- Ivana Borovská
  - Slovak Academy of Sciences, Centre for Biosciences, 840 05 Bratislava, Slovak Republic
- Reuben Pengelly
  - University of Southampton, Faculty of Medicine, HDH, Southampton SO16 6YD, UK
- Eunice Lee
  - School of Biological Sciences, University of Adelaide, Adelaide 5005, SA, Australia
- Pavel Abaffy
  - Czech Academy of Sciences, Institute of Biotechnology, 25250 Vestec, Czech Republic
- Radek Šindelka
  - Czech Academy of Sciences, Institute of Biotechnology, 25250 Vestec, Czech Republic
- Frank Grutzner
  - School of Biological Sciences, University of Adelaide, Adelaide 5005, SA, Australia
- Igor Vořechovský
  - University of Southampton, Faculty of Medicine, HDH, Southampton SO16 6YD, UK
56
Holcomb D, Alexaki A, Hernandez N, Hunt R, Laurie K, Kames J, Hamasaki-Katagiri N, Komar AA, DiCuccio M, Kimchi-Sarfaty C. Gene variants of coagulation related proteins that interact with SARS-CoV-2. PLoS Comput Biol 2021; 17:e1008805. [PMID: 33730015 PMCID: PMC8007013 DOI: 10.1371/journal.pcbi.1008805] [Citation(s) in RCA: 8] [Impact Index Per Article: 2.7] [Received: 08/03/2020] [Revised: 03/29/2021] [Accepted: 02/15/2021] [Indexed: 12/30/2022] Open
Abstract
Thrombosis is a recognized complication of Coronavirus disease of 2019 (COVID-19) and is often associated with poor prognosis. There is a well-recognized link between coagulation and inflammation, however, the extent of thrombotic events associated with COVID-19 warrants further investigation. Poly(A) Binding Protein Cytoplasmic 4 (PABPC4), Serine/Cysteine Proteinase Inhibitor Clade G Member 1 (SERPING1) and Vitamin K epOxide Reductase Complex subunit 1 (VKORC1), which are all proteins linked to coagulation, have been shown to interact with SARS proteins. We computationally examined the interaction of these with SARS-CoV-2 proteins and, in the case of VKORC1, we describe its binding to ORF7a in detail. We examined the occurrence of variants of each of these proteins across populations and interrogated their potential contribution to COVID-19 severity. Potential mechanisms, by which some of these variants may contribute to disease, are proposed. Some of these variants are prevalent in minority groups that are disproportionally affected by severe COVID-19. Therefore, we are proposing that further investigation around these variants may lead to better understanding of disease pathogenesis in minority groups and more informed therapeutic approaches.
Affiliation(s)
- David Holcomb
  - Center for Biologics Evaluation and Research, Office of Tissues and Advanced Therapies, Division of Plasma Protein Therapeutics, Food and Drug Administration, Silver Spring, Maryland, United States of America
- Aikaterini Alexaki
  - Center for Biologics Evaluation and Research, Office of Tissues and Advanced Therapies, Division of Plasma Protein Therapeutics, Food and Drug Administration, Silver Spring, Maryland, United States of America
- Nancy Hernandez
  - Center for Biologics Evaluation and Research, Office of Tissues and Advanced Therapies, Division of Plasma Protein Therapeutics, Food and Drug Administration, Silver Spring, Maryland, United States of America
- Ryan Hunt
  - Center for Biologics Evaluation and Research, Office of Tissues and Advanced Therapies, Division of Plasma Protein Therapeutics, Food and Drug Administration, Silver Spring, Maryland, United States of America
- Kyle Laurie
  - Center for Biologics Evaluation and Research, Office of Tissues and Advanced Therapies, Division of Plasma Protein Therapeutics, Food and Drug Administration, Silver Spring, Maryland, United States of America
- Jacob Kames
  - Center for Biologics Evaluation and Research, Office of Tissues and Advanced Therapies, Division of Plasma Protein Therapeutics, Food and Drug Administration, Silver Spring, Maryland, United States of America
- Nobuko Hamasaki-Katagiri
  - Center for Biologics Evaluation and Research, Office of Tissues and Advanced Therapies, Division of Plasma Protein Therapeutics, Food and Drug Administration, Silver Spring, Maryland, United States of America
- Anton A. Komar
  - Center for Gene Regulation in Health and Disease, Department of Biological, Geological and Environmental Sciences, Cleveland State University, Cleveland, Ohio, United States of America
- Michael DiCuccio
  - National Center of Biotechnology Information, National Institutes of Health, Bethesda, Maryland, United States of America
- Chava Kimchi-Sarfaty
  - Center for Biologics Evaluation and Research, Office of Tissues and Advanced Therapies, Division of Plasma Protein Therapeutics, Food and Drug Administration, Silver Spring, Maryland, United States of America
57
Rentzsch P, Schubach M, Shendure J, Kircher M. CADD-Splice-improving genome-wide variant effect prediction using deep learning-derived splice scores. Genome Med 2021; 13:31. [PMID: 33618777 PMCID: PMC7901104 DOI: 10.1186/s13073-021-00835-9] [Citation(s) in RCA: 343] [Impact Index Per Article: 114.3] [Received: 06/29/2020] [Accepted: 01/20/2021] [Indexed: 02/08/2023] Open
Abstract
Background: Splicing of genomic exons into mRNAs is a critical prerequisite for the accurate synthesis of human proteins. Genetic variants impacting splicing underlie a substantial proportion of genetic disease, but are challenging to identify beyond those occurring at donor and acceptor dinucleotides. To address this, various methods aim to predict variant effects on splicing. Recently, deep neural networks (DNNs) have been shown to achieve better results in predicting splice variants than other strategies.
Methods: It has been unclear how best to integrate such process-specific scores into genome-wide variant effect predictors. Here, we use a recently published experimental data set to compare several machine learning methods that score variant effects on splicing. We integrate the best of those approaches into general variant effect prediction models and observe the effect on classification of known pathogenic variants.
Results: We integrate two specialized splicing scores into CADD (Combined Annotation Dependent Depletion; cadd.gs.washington.edu), a widely used tool for genome-wide variant effect prediction that we previously developed to weight and integrate diverse collections of genomic annotations. With this new model, CADD-Splice, we show that inclusion of splicing DNN effect scores substantially improves predictions across multiple variant categories, without compromising overall performance.
Conclusions: While splice effect scores show superior performance on splice variants, specialized predictors cannot compete with other variant scores in general variant interpretation, as the latter account for nonsense and missense effects that do not alter splicing. Although only shown here for splice scores, we believe that the applied approach will generalize to other specific molecular processes, providing a path for the further improvement of genome-wide variant effect prediction.
Supplementary Information: The online version contains supplementary material available at 10.1186/s13073-021-00835-9.
Affiliation(s)
- Philipp Rentzsch
  - Charité - Universitätsmedizin Berlin, 10117, Berlin, Germany
  - Berlin Institute of Health (BIH), 10178, Berlin, Germany
- Max Schubach
  - Charité - Universitätsmedizin Berlin, 10117, Berlin, Germany
  - Berlin Institute of Health (BIH), 10178, Berlin, Germany
- Jay Shendure
  - Brotman Baty Institute for Precision Medicine, University of Washington, Seattle, WA, 98195, USA
  - Department of Genome Sciences, University of Washington, Seattle, WA, 98195, USA
- Martin Kircher
  - Charité - Universitätsmedizin Berlin, 10117, Berlin, Germany
  - Berlin Institute of Health (BIH), 10178, Berlin, Germany
58
Jung H, Lee KS, Choi JK. Comprehensive characterisation of intronic mis-splicing mutations in human cancers. Oncogene 2021; 40:1347-1361. [PMID: 33420369 PMCID: PMC7892346 DOI: 10.1038/s41388-020-01614-3] [Citation(s) in RCA: 26] [Impact Index Per Article: 8.7] [Received: 06/16/2020] [Revised: 11/23/2020] [Accepted: 12/10/2020] [Indexed: 12/21/2022]
Abstract
Previous studies studying mis-splicing mutations were based on exome data and thus our current knowledge is largely limited to exons and the canonical splice sites. To comprehensively characterise intronic mis-splicing mutations, we analysed 1134 pan-cancer whole genomes and transcriptomes together with 3022 normal control samples. The ratio-based splicing analysis resulted in 678 somatic intronic mutations, with 46% residing in deep introns. Among the 309 deep intronic single nucleotide variants, 245 altered core splicing codes, with 38% activating cryptic splice sites, 12% activating cryptic polypyrimidine tracts, and 36% and 12% disrupting authentic polypyrimidine tracts and branchpoints, respectively. All the intronic cryptic splice sites were created at pre-existing GT/AG dinucleotides or by GC-to-GT conversion. Notably, 85 deep intronic mutations indicated gain of splicing enhancers or loss of splicing silencers. We found that 64 tumour suppressors were affected by intronic mutations and blood cancers showed higher proportion of deep intronic mutations. In particular, a telomere maintenance gene, POT1, was recurrently mis-spliced by deep intronic mutations in blood cancers. We validated a pseudoexon activation involving a splicing silencer in POT1 by CRISPR/Cas9. Our results shed light on previously unappreciated mechanisms by which noncoding mutations acting on splicing codes in deep introns contribute to tumourigenesis.
Affiliation(s)
- Hyunchul Jung
  - Department of Bio and Brain Engineering, KAIST, Daejeon, 34141, Republic of Korea
  - Cancer Ageing and Somatic Mutation Programme, Wellcome Sanger Institute, Cambridge, UK
- Kang Seon Lee
  - Department of Bio and Brain Engineering, KAIST, Daejeon, 34141, Republic of Korea
- Jung Kyoon Choi
  - Department of Bio and Brain Engineering, KAIST, Daejeon, 34141, Republic of Korea
  - Penta Medix Co., Ltd., Seongnam-si, Gyeongi-do, 13449, Republic of Korea
59
Saint-Martin C, Cauchois-Le Mière M, Rex E, Soukarieh O, Arnoux JB, Buratti J, Bouvet D, Frébourg T, Gaildrat P, Shyng SL, Bellanné-Chantelot C, Martins A. Functional characterization of ABCC8 variants of unknown significance based on bioinformatics predictions, splicing assays, and protein analyses: Benefits for the accurate diagnosis of congenital hyperinsulinism. Hum Mutat 2021; 42:408-420. [PMID: 33410562 DOI: 10.1002/humu.24164] [Citation(s) in RCA: 2] [Impact Index Per Article: 0.7] [Received: 08/12/2020] [Revised: 12/06/2020] [Accepted: 12/31/2020] [Indexed: 12/20/2022]
Abstract
ABCC8 encodes the SUR1 subunit of the β-cell ATP-sensitive potassium channel whose loss of function causes congenital hyperinsulinism (CHI). Molecular diagnosis is critical for optimal management of CHI patients. Unfortunately, assessing the impact of ABCC8 variants on RNA splicing remains very challenging as this gene is poorly expressed in leukocytes. Here, we performed bioinformatics analysis and cell-based minigene assays to assess the impact on splicing of 13 ABCC8 variants identified in 20 CHI patients. Next, channel properties of SUR1 proteins expected to originate from minigene-detected in-frame splicing defects were analyzed after ectopic expression in COSm6 cells. Out of the analyzed variants, seven induced out-of-frame splicing defects and were therefore classified as recessive pathogenic, whereas two led to skipping of in-frame exons. Channel functional analysis of the latter demonstrated their pathogenicity. Interestingly, the common rs757110 SNP increased exon skipping in our system suggesting that it may act as a disease modifier factor. Our strategy allowed determining the pathogenicity of all selected ABCC8 variants, and CHI-inheritance pattern for 16 out of the 20 patients. This study highlights the value of combining RNA and protein functional approaches in variant interpretation and reveals the minigene splicing assay as a new tool for CHI molecular diagnostics.
Affiliation(s)
- Cécile Saint-Martin
  - Department of Genetics, AP-HP Pitié-Salpêtrière Hospital, Sorbonne University, Paris, France
- Marine Cauchois-Le Mière
  - Inserm U1245, UFR de Médecine et Pharmacie, UNIROUEN, Normandie University, Normandy Centre for Genomic and Personalized Medicine, Rouen, France
  - Department of Genetics, University Hospital, Normandy Centre for Genomic and Personalized Medicine, Rouen, France
- Emily Rex
  - Department of Chemical Physiology and Biochemistry, Oregon Health & Science University, Portland, OR, USA
- Omar Soukarieh
  - Inserm U1245, UFR de Médecine et Pharmacie, UNIROUEN, Normandie University, Normandy Centre for Genomic and Personalized Medicine, Rouen, France
- Jean-Baptiste Arnoux
  - Department of Inherited Metabolic Disease, Necker-Enfants Malades University Hospital, AP-HP, Paris, France
- Julien Buratti
  - Department of Genetics, AP-HP Pitié-Salpêtrière Hospital, Sorbonne University, Paris, France
- Delphine Bouvet
  - Department of Genetics, AP-HP Pitié-Salpêtrière Hospital, Sorbonne University, Paris, France
- Thierry Frébourg
  - Inserm U1245, UFR de Médecine et Pharmacie, UNIROUEN, Normandie University, Normandy Centre for Genomic and Personalized Medicine, Rouen, France
  - Department of Genetics, University Hospital, Normandy Centre for Genomic and Personalized Medicine, Rouen, France
- Pascaline Gaildrat
  - Inserm U1245, UFR de Médecine et Pharmacie, UNIROUEN, Normandie University, Normandy Centre for Genomic and Personalized Medicine, Rouen, France
- Show-Ling Shyng
  - Department of Chemical Physiology and Biochemistry, Oregon Health & Science University, Portland, OR, USA
- Alexandra Martins
  - Inserm U1245, UFR de Médecine et Pharmacie, UNIROUEN, Normandie University, Normandy Centre for Genomic and Personalized Medicine, Rouen, France
60
Louis JM, Agarwal A, Aduri R, Talukdar I. Global analysis of RNA-protein interactions in TNF-α induced alternative splicing in metabolic disorders. FEBS Lett 2021; 595:476-490. [PMID: 33417721 DOI: 10.1002/1873-3468.14029] [Citation(s) in RCA: 4] [Impact Index Per Article: 1.3] [Received: 10/17/2020] [Revised: 11/26/2020] [Accepted: 12/10/2020] [Indexed: 12/27/2022]
Abstract
In this report, using the database of RNA-binding protein specificities (RBPDB) and our previously published RNA-seq data, we analyzed the interactions between RNA and RNA-binding proteins to decipher the role of alternative splicing in metabolic disorders induced by TNF-α. We identified 13 395 unique RNA-RBP interactions, including 385 unique RNA motifs and 35 RBPs, some of which (including MBNL-1 and 3, ZFP36, ZRANB2, and SNRPA) are transcriptionally regulated by TNF-α. In addition to some previously reported RBPs, such as RBMX and HuR/ELAVL1, we found a few novel RBPs, such as ZRANB2 and SNRPA, to be involved in the regulation of metabolic syndrome-associated genes that contain an enrichment of tetrameric RNA sequences (AUUU). Taken together, this study paves the way for novel RNA-protein interaction-based therapeutics for treating metabolic syndromes.
Affiliation(s)
- Jiss Maria Louis
  - Department of Biological Sciences, BITS Pilani, Zuarinagar, India
- Arjun Agarwal
  - Department of Computer Science, BITS Pilani, Zuarinagar, India
- Raviprasad Aduri
  - Department of Biological Sciences, BITS Pilani, Zuarinagar, India
- Indrani Talukdar
  - Department of Biological Sciences, BITS Pilani, Zuarinagar, India
61
Amoah K, Hsiao YHE, Bahn JH, Sun Y, Burghard C, Tan BX, Yang EW, Xiao X. Allele-specific alternative splicing and its functional genetic variants in human tissues. Genome Res 2021; 31:359-371. [PMID: 33452016 PMCID: PMC7919445 DOI: 10.1101/gr.265637.120] [Citation(s) in RCA: 10] [Impact Index Per Article: 3.3] [Received: 05/05/2020] [Accepted: 01/14/2021] [Indexed: 02/07/2023]
Abstract
Alternative splicing is an RNA processing mechanism that affects most genes in human, contributing to disease mechanisms and phenotypic diversity. The regulation of splicing involves an intricate network of cis-regulatory elements and trans-acting factors. Due to their high sequence specificity, cis-regulation of splicing can be altered by genetic variants, significantly affecting splicing outcomes. Recently, multiple methods have been applied to understanding the regulatory effects of genetic variants on splicing. However, it is still challenging to go beyond apparent association to pinpoint functional variants. To fill in this gap, we utilized large-scale data sets of the Genotype-Tissue Expression (GTEx) project to study genetically modulated alternative splicing (GMAS) via identification of allele-specific splicing events. We demonstrate that GMAS events are shared across tissues and individuals more often than expected by chance, consistent with their genetically driven nature. Moreover, although the allelic bias of GMAS exons varies across samples, the degree of variation is similar across tissues versus individuals. Thus, genetic background drives the GMAS pattern to a similar degree as tissue-specific splicing mechanisms. Leveraging the genetically driven nature of GMAS, we developed a new method to predict functional splicing-altering variants, built upon a genotype-phenotype concordance model across samples. Complemented by experimental validations, this method predicted >1000 functional variants, many of which may alter RNA-protein interactions. Lastly, 72% of GMAS-associated SNPs were in linkage disequilibrium with GWAS-reported SNPs, and such association was enriched in tissues of relevance for specific traits/diseases. Our study enables a comprehensive view of genetically driven splicing variations in human tissues.
Affiliation(s)
- Kofi Amoah
  - Bioinformatics Interdepartmental Program, University of California, Los Angeles, California 90095, USA
- Yun-Hua Esther Hsiao
  - Department of Bioengineering, University of California, Los Angeles, California 90095, USA
- Jae Hoon Bahn
  - Department of Integrative Biology and Physiology, University of California, Los Angeles, California 90095, USA
- Yiwei Sun
  - Department of Integrative Biology and Physiology, University of California, Los Angeles, California 90095, USA
- Christina Burghard
  - Bioinformatics Interdepartmental Program, University of California, Los Angeles, California 90095, USA
- Boon Xin Tan
  - Department of Integrative Biology and Physiology, University of California, Los Angeles, California 90095, USA
- Ei-Wen Yang
  - Department of Integrative Biology and Physiology, University of California, Los Angeles, California 90095, USA
- Xinshu Xiao
  - Bioinformatics Interdepartmental Program, University of California, Los Angeles, California 90095, USA
  - Department of Bioengineering, University of California, Los Angeles, California 90095, USA
  - Department of Integrative Biology and Physiology, University of California, Los Angeles, California 90095, USA
  - Molecular Biology Institute, University of California, Los Angeles, California 90095, USA
  - Institute for Quantitative and Computational Biosciences, University of California, Los Angeles, California 90095, USA
|
62
|
Splicing mutations in inherited retinal diseases. Prog Retin Eye Res 2021. [DOI: 10.1016/j.preteyeres.2020.100874] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Reference Citation Analysis] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Indexed: 03/16/2023]
|
63
|
Le Tertre M, Ka C, Raud L, Berlivet I, Gourlaouen I, Richard G, Uguen K, Chen JM, Férec C, Fichou Y, Le Gac G. Splicing analysis of SLC40A1 missense variations and contribution to hemochromatosis type 4 phenotypes. Blood Cells Mol Dis 2020; 87:102527. [PMID: 33341511 DOI: 10.1016/j.bcmd.2020.102527] [Citation(s) in RCA: 1] [Impact Index Per Article: 0.3] [Reference Citation Analysis] [Abstract] [Key Words] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Received: 09/30/2020] [Revised: 11/24/2020] [Accepted: 11/24/2020] [Indexed: 02/09/2023]
Abstract
Hemochromatosis type 4, or ferroportin disease, is considered as the second leading cause of primary iron overload after HFE-related hemochromatosis. The disease, which is predominantly associated with missense variations in the SLC40A1 gene, is characterized by wide clinical heterogeneity. We tested the possibility that some of the reported missense mutations, despite their positions within exons, cause splicing defects. Fifty-eight genetic variants were selected from the literature based on two criteria: a precise description of the nucleotide change and individual evidence of iron overload. The selected variants were investigated by different in silico prediction tools and prioritized for midigene splicing assays. Of the 15 variations tested in vitro, only two were associated with splicing changes. We confirm that the c.1402G>A transition (p.Gly468Ser) disrupts the exon 7 donor site, leading to the use of an exonic cryptic splicing site and the generation of a truncated reading frame. We observed, for the first time, that the p.Gly468Ser substitution has no effect on the ferroportin iron export function. We demonstrate alternative splicing of exon 5 in different cell lines and show that the c.430A>G (p.Asn144Asp) variant promotes exon 5 inclusion. This could be part of a gain-of-function mechanism. We conclude that splicing mutations rarely contribute to hemochromatosis type 4 phenotypes. An in-depth investigation of exon 5 auxiliary splicing sequences may help to elucidate the mechanism by which splicing regulatory proteins regulate the production of the full length SLC40A1 transcript and to clarify its physiological importance.
Affiliation(s)
- Marlène Le Tertre
  - Univ Brest, Inserm, EFS, UMR1078, GGB, F-29200, France; CHRU de Brest, Service de Génétique Médicale et Biologie de la Reproduction, Laboratoire de Génétique Moléculaire et Histocompatibilité, F-29200, France
- Chandran Ka
  - Univ Brest, Inserm, EFS, UMR1078, GGB, F-29200, France; CHRU de Brest, Service de Génétique Médicale et Biologie de la Reproduction, Laboratoire de Génétique Moléculaire et Histocompatibilité, F-29200, France; Laboratory of Excellence GR-Ex, F-75015, France
- Loann Raud
  - Univ Brest, Inserm, EFS, UMR1078, GGB, F-29200, France; Association Gaétan Saleün, F-29200, France
- Isabelle Gourlaouen
  - Univ Brest, Inserm, EFS, UMR1078, GGB, F-29200, France; Laboratory of Excellence GR-Ex, F-75015, France
- Kévin Uguen
  - Univ Brest, Inserm, EFS, UMR1078, GGB, F-29200, France; CHRU de Brest, Service de Génétique Médicale et Biologie de la Reproduction, Laboratoire de Génétique Moléculaire et Histocompatibilité, F-29200, France
- Jian-Min Chen
  - Univ Brest, Inserm, EFS, UMR1078, GGB, F-29200, France
- Claude Férec
  - Univ Brest, Inserm, EFS, UMR1078, GGB, F-29200, France; CHRU de Brest, Service de Génétique Médicale et Biologie de la Reproduction, Laboratoire de Génétique Moléculaire et Histocompatibilité, F-29200, France; Association Gaétan Saleün, F-29200, France
- Yann Fichou
  - Univ Brest, Inserm, EFS, UMR1078, GGB, F-29200, France; Laboratory of Excellence GR-Ex, F-75015, France
- Gérald Le Gac
  - Univ Brest, Inserm, EFS, UMR1078, GGB, F-29200, France; CHRU de Brest, Service de Génétique Médicale et Biologie de la Reproduction, Laboratoire de Génétique Moléculaire et Histocompatibilité, F-29200, France; Laboratory of Excellence GR-Ex, F-75015, France
|
64
|
Humphrey S, Kerr A, Rattray M, Dive C, Miller CJ. A model of k-mer surprisal to quantify local sequence information content surrounding splice regions. PeerJ 2020; 8:e10063. [PMID: 33194378 PMCID: PMC7648452 DOI: 10.7717/peerj.10063] [Citation(s) in RCA: 1] [Impact Index Per Article: 0.3] [Reference Citation Analysis] [Abstract] [Key Words] [Grants] [Track Full Text] [Download PDF] [Figures] [Journal Information] [Subscribe] [Scholar Register] [Received: 04/27/2020] [Accepted: 09/08/2020] [Indexed: 12/22/2022] Open
Abstract
Molecular sequences carry information. Analysis of sequence conservation between homologous loci is a proven approach with which to explore the information content of molecular sequences. This is often done using multiple sequence alignments to support comparisons between homologous loci. These methods therefore rely on sufficient underlying sequence similarity with which to construct a representative alignment. Here we describe a method using a formal metric of information, surprisal, to analyse biological sub-sequences without alignment constraints. We applied our model to the genomes of five different species to reveal similar patterns across a panel of eukaryotes. As the surprisal of a sub-sequence is inversely proportional to its occurrence within the genome, the optimal size of the sub-sequences was selected for each species under consideration. With the model optimized, we found a strong correlation between surprisal and CG dinucleotide usage. The utility of our model was tested by examining the sequences of genes known to undergo splicing. We demonstrate that our model can identify biological features of interest such as known donor and acceptor sites. Analysis across all annotated coding exon junctions in Homo sapiens reveals the information content of coding exons to be greater than the surrounding intron regions, a consequence of increased suppression of the CG dinucleotide in intronic space. Sequences within coding regions proximal to exon junctions exhibited novel patterns within DNA and coding mRNA that are not a function of the encoded amino acid sequence. Our findings are consistent with the presence of secondary information encoding features such as DNA and RNA binding sites, multiplexed through the coding sequence and independent of the information required to define the corresponding amino-acid sequence. 
We conclude that surprisal provides a complementary methodology with which to locate regions of interest in the genome, particularly in situations that lack an appropriate multiple sequence alignment.
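The surprisal metric described in this abstract can be illustrated with a minimal sketch. It assumes the standard information-theoretic definition, surprisal = -log2(p), with p estimated as the relative frequency of a k-mer in a reference sequence. The function names `kmer_surprisal` and `surprisal_profile` are hypothetical and are not taken from the authors' software; the toy sequence is illustrative only.

```python
import math
from collections import Counter
from typing import Dict, List, Optional


def kmer_surprisal(genome: str, k: int) -> Dict[str, float]:
    """Surprisal (in bits) of every k-mer observed in `genome`.

    Assumption: p(k-mer) is its count divided by the total number of
    overlapping k-mer windows, and surprisal = -log2(p).
    """
    counts = Counter(genome[i:i + k] for i in range(len(genome) - k + 1))
    total = sum(counts.values())
    return {kmer: -math.log2(n / total) for kmer, n in counts.items()}


def surprisal_profile(seq: str, k: int,
                      table: Dict[str, float]) -> List[Optional[float]]:
    """Surprisal of each overlapping k-mer window along `seq`.

    Windows whose k-mer is absent from `table` are reported as None.
    """
    return [table.get(seq[i:i + k]) for i in range(len(seq) - k + 1)]


# Toy reference: "AA" is common, "CG" is rare, so "CG" carries more surprisal.
table = kmer_surprisal("AAAACG", k=2)
profile = surprisal_profile("AACG", k=2, table=table)
```

Rare k-mers receive higher surprisal than common ones, which is the property the abstract relies on when it links surprisal to CG dinucleotide usage; no sequence alignment is needed to compute the profile.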
Affiliation(s)
- Sam Humphrey
  - CRUK Manchester Institute Cancer Biomarker Centre, The University of Manchester, Manchester, United Kingdom
  - CRUK Manchester Institute, CRUK Lung Cancer Centre of Excellence, Manchester, United Kingdom
- Alastair Kerr
  - CRUK Manchester Institute Cancer Biomarker Centre, The University of Manchester, Manchester, United Kingdom
  - CRUK Manchester Institute, CRUK Lung Cancer Centre of Excellence, Manchester, United Kingdom
- Magnus Rattray
  - Division of Informatics, Imaging and Data Sciences, University of Manchester, Manchester, United Kingdom
- Caroline Dive
  - CRUK Manchester Institute Cancer Biomarker Centre, The University of Manchester, Manchester, United Kingdom
  - CRUK Manchester Institute, CRUK Lung Cancer Centre of Excellence, Manchester, United Kingdom
- Crispin J. Miller
  - Computational Biology Group, CRUK Beatson Institute, Glasgow, United Kingdom
  - Institute of Cancer Sciences, University of Glasgow, Glasgow, United Kingdom
|
65
|
Baeza-Centurion P, Miñana B, Valcárcel J, Lehner B. Mutations primarily alter the inclusion of alternatively spliced exons. eLife 2020; 9:59959. [PMID: 33112234 PMCID: PMC7673789 DOI: 10.7554/elife.59959] [Citation(s) in RCA: 23] [Impact Index Per Article: 5.8] [Reference Citation Analysis] [Abstract] [Key Words] [Track Full Text] [Download PDF] [Figures] [Journal Information] [Subscribe] [Scholar Register] [Received: 06/12/2020] [Accepted: 10/27/2020] [Indexed: 12/17/2022] Open
Abstract
Genetic analyses and systematic mutagenesis have revealed that synonymous, non-synonymous and intronic mutations frequently alter the inclusion levels of alternatively spliced exons, consistent with the concept that altered splicing might be a common mechanism by which mutations cause disease. However, most exons expressed in any cell are highly-included in mature mRNAs. Here, by performing deep mutagenesis of highly-included exons and by analysing the association between genome sequence variation and exon inclusion across the transcriptome, we report that mutations only very rarely alter the inclusion of highly-included exons. This is true for both exonic and intronic mutations as well as for perturbations in trans. Therefore, mutations that affect splicing are not evenly distributed across primary transcripts but are focussed in and around alternatively spliced exons with intermediate inclusion levels. These results provide a resource for prioritising synonymous and other variants as disease-causing mutations.
Affiliation(s)
- Pablo Baeza-Centurion
  - Centre for Genomic Regulation (CRG), The Barcelona Institute of Science and Technology (BIST), Barcelona, Spain
- Belén Miñana
  - Centre for Genomic Regulation (CRG), The Barcelona Institute of Science and Technology (BIST), Barcelona, Spain
- Juan Valcárcel
  - Centre for Genomic Regulation (CRG), The Barcelona Institute of Science and Technology (BIST), Barcelona, Spain
  - Institució Catalana de Recerca i Estudis Avançats (ICREA), Barcelona, Spain
- Ben Lehner
  - Centre for Genomic Regulation (CRG), The Barcelona Institute of Science and Technology (BIST), Barcelona, Spain
  - Institució Catalana de Recerca i Estudis Avançats (ICREA), Barcelona, Spain
  - Universitat Pompeu Fabra (UPF), Barcelona, Spain
|
66
|
Alvarez MEV, Chivers M, Borovska I, Monger S, Giannoulatou E, Kralovicova J, Vorechovsky I. Transposon clusters as substrates for aberrant splice-site activation. RNA Biol 2020; 18:354-367. [PMID: 32965162 PMCID: PMC7951965 DOI: 10.1080/15476286.2020.1805909] [Citation(s) in RCA: 8] [Impact Index Per Article: 2.0] [Reference Citation Analysis] [Abstract] [Key Words] [Download PDF] [Figures] [Journal Information] [Subscribe] [Scholar Register] [Indexed: 12/15/2022] Open
Abstract
Transposed elements (TEs) have dramatically shaped evolution of the exon-intron structure and significantly contributed to morbidity, but how recent TE invasions into older TEs cooperate in generating new coding sequences is poorly understood. Employing an updated repository of new exon-intron boundaries induced by pathogenic mutations, termed DBASS, here we identify novel TE clusters that facilitated exon selection. To explore the extent to which such TE exons maintain RNA secondary structure of their progenitors, we carried out structural studies with a composite exon that was derived from a long terminal repeat (LTR78) and AluJ and was activated by a C > T mutation optimizing the 5ʹ splice site. Using a combination of SHAPE, DMS and enzymatic probing, we show that the disease-causing mutation disrupted a conserved AluJ stem that evolved from helix 3.3 (or 5b) of 7SL RNA, liberating a primordial GC 5ʹ splice site from the paired conformation for interactions with the spliceosome. The mutation also reduced flexibility of conserved residues in adjacent exon-derived loops of the central Alu hairpin, revealing a cross-talk between traditional and auxiliary splicing motifs that evolved from opposite termini of 7SL RNA and were approximated by Watson-Crick base-pairing already in organisms without spliceosomal introns. We also identify existing Alu exons activated by the same RNA rearrangement. Collectively, these results provide valuable TE exon models for studying formation and kinetics of pre-mRNA building blocks required for splice-site selection and will be useful for fine-tuning auxiliary splicing motifs and exon and intron size constraints that govern aberrant splice-site activation.
Affiliation(s)
- Martin Chivers
  - School of Medicine, University of Southampton, Southampton, UK
- Ivana Borovska
  - Slovak Academy of Sciences, Institute of Molecular Physiology and Genetics, Bratislava, Slovak Republic
- Steven Monger
  - Computational Genomics Laboratory, Victor Chang Cardiac Research Institute, Darlinghurst, Australia
- Eleni Giannoulatou
  - Computational Genomics Laboratory, Victor Chang Cardiac Research Institute, Darlinghurst, Australia
  - St. Vincent's Clinical School, University of New South Wales, Sydney, Australia
- Jana Kralovicova
  - School of Medicine, University of Southampton, Southampton, UK
  - Slovak Academy of Sciences, Institute of Molecular Physiology and Genetics, Bratislava, Slovak Republic
|
67
|
Holcomb D, Alexaki A, Hernandez N, Laurie K, Kames J, Hamasaki-Katagiri N, Komar AA, DiCuccio M, Kimchi-Sarfaty C. Potential impact on coagulopathy of gene variants of coagulation related proteins that interact with SARS-CoV-2. bioRxiv 2020. [PMID: 32935103 DOI: 10.1101/2020.09.08.272328] [Citation(s) in RCA: 1] [Impact Index Per Article: 0.3] [Reference Citation Analysis] [Abstract] [Track Full Text] [Subscribe] [Scholar Register] [Indexed: 01/14/2023]
Abstract
Thrombosis has been one of the complications of the Coronavirus disease of 2019 (COVID-19), often associated with poor prognosis. There is a well-recognized link between coagulation and inflammation, however, the extent of thrombotic events associated with COVID-19 warrants further investigation. Poly(A) Binding Protein Cytoplasmic 4 (PABPC4), Serine/Cysteine Proteinase Inhibitor Clade G Member 1 (SERPING1) and Vitamin K epOxide Reductase Complex subunit 1 (VKORC1), which are all proteins linked to coagulation, have been shown to interact with SARS proteins. We computationally examined the interaction of these with SARS-CoV-2 proteins and, in the case of VKORC1, we describe its binding to ORF7a in detail. We examined the occurrence of variants of each of these proteins across populations and interrogated their potential contribution to COVID-19 severity. Potential mechanisms by which some of these variants may contribute to disease are proposed. Some of these variants are prevalent in minority groups that are disproportionally affected by severe COVID-19. Therefore, we are proposing that further investigation around these variants may lead to better understanding of disease pathogenesis in minority groups and more informed therapeutic approaches.
Author summary
Increased blood clotting, especially in the lungs, is a common complication of COVID-19. Infectious diseases cause inflammation which in turn can contribute to increased blood clotting. However, the extent of clot formation that is seen in the lungs of COVID-19 patients suggests that there may be a more direct link. We identified three human proteins that are involved indirectly in the blood clotting cascade and have been shown to interact with proteins of SARS virus, which is closely related to the novel coronavirus. We examined computationally the interaction of these human proteins with the viral proteins. We looked for genetic variants of these proteins and examined how these variants are distributed across populations. We investigated whether variants of these genes could impact severity of COVID-19. Further investigation around these variants may provide clues for the pathogenesis of COVID-19 particularly in minority groups.
|
68
|
Brandt M, Gokden A, Ziosi M, Lappalainen T. A polyclonal allelic expression assay for detecting regulatory effects of transcript variants. Genome Med 2020; 12:79. [PMID: 32912286 PMCID: PMC7488413 DOI: 10.1186/s13073-020-00777-8] [Citation(s) in RCA: 3] [Impact Index Per Article: 0.8] [Reference Citation Analysis] [Abstract] [Key Words] [Track Full Text] [Download PDF] [Figures] [Journal Information] [Subscribe] [Scholar Register] [Received: 09/20/2019] [Accepted: 08/19/2020] [Indexed: 12/12/2022] Open
Abstract
We present an assay to experimentally test the regulatory effects of genetic variants within transcripts using CRISPR/Cas9 followed by targeted sequencing. We applied the assay to 32 premature stop-gained variants across the genome and in two Mendelian disease genes, 33 putative causal variants of eQTLs, and 62 control variants in HEK293T cells, replicating a subset of variants in HeLa cells. We detected significant effects in the expected direction (in 60% of variants), demonstrating the ability of the assay to capture regulatory effects of eQTL variants and nonsense-mediated decay triggered by premature stop-gained variants. The results suggest a utility for validating transcript-level effects of genetic variants.
Affiliation(s)
- Margot Brandt
  - New York Genome Center, New York, NY, USA
  - Department of Systems Biology, Columbia University, New York, NY, USA
- Tuuli Lappalainen
  - New York Genome Center, New York, NY, USA
  - Department of Systems Biology, Columbia University, New York, NY, USA
|
69
|
Saha K, England W, Fernandez MM, Biswas T, Spitale RC, Ghosh G. Structural disruption of exonic stem-loops immediately upstream of the intron regulates mammalian splicing. Nucleic Acids Res 2020; 48:6294-6309. [PMID: 32402057 PMCID: PMC7293017 DOI: 10.1093/nar/gkaa358] [Citation(s) in RCA: 18] [Impact Index Per Article: 4.5] [Reference Citation Analysis] [Abstract] [Track Full Text] [Download PDF] [Figures] [Journal Information] [Subscribe] [Scholar Register] [Received: 11/21/2019] [Revised: 04/20/2020] [Accepted: 04/27/2020] [Indexed: 12/31/2022] Open
Abstract
Recognition of highly degenerate mammalian splice sites by the core spliceosomal machinery is regulated by several protein factors that predominantly bind exonic splicing motifs. These are postulated to be single-stranded in order to be functional, yet knowledge of secondary structural features that regulate the exposure of exonic splicing motifs across the transcriptome is not currently available. Using transcriptome-wide RNA structural information we show that retained introns in mouse are commonly flanked by a short (≲70 nucleotide), highly base-paired segment upstream and a predominantly single-stranded exonic segment downstream. Splicing assays with select pre-mRNA substrates demonstrate that loops immediately upstream of the introns contain pre-mRNA-specific splicing enhancers, the substitution or hybridization of which impedes splicing. Additionally, the exonic segments flanking the retained introns appeared to be more enriched in a previously identified set of hexameric exonic splicing enhancer (ESE) sequences compared to their spliced counterparts, suggesting that base-pairing in the exonic segments upstream of retained introns could be a means for occlusion of ESEs. The upstream exonic loops of the test substrate promoted recruitment of splicing factors and consequent pre-mRNA structural remodeling, leading up to assembly of the early spliceosome. These results suggest that disruption of exonic stem-loop structures immediately upstream (but not downstream) of the introns regulate alternative splicing events, likely through modulating accessibility of splicing factors.
Affiliation(s)
- Kaushik Saha
  - Department of Chemistry and Biochemistry, University of California San Diego, 9500 Gilman Drive, La Jolla, CA 92093-0375, USA
- Whitney England
  - Department of Pharmaceutical Sciences, University of California Irvine, 147 Bison Modular, Building 515, Irvine, CA 92697, USA
- Mike Minh Fernandez
  - Department of Chemistry and Biochemistry, University of California San Diego, 9500 Gilman Drive, La Jolla, CA 92093-0375, USA
- Tapan Biswas
  - Department of Chemistry and Biochemistry, University of California San Diego, 9500 Gilman Drive, La Jolla, CA 92093-0375, USA
- Robert C Spitale
  - Department of Pharmaceutical Sciences, University of California Irvine, 147 Bison Modular, Building 515, Irvine, CA 92697, USA
- Gourisankar Ghosh
  - Department of Chemistry and Biochemistry, University of California San Diego, 9500 Gilman Drive, La Jolla, CA 92093-0375, USA
|
70
|
Kováčová T, Souček P, Hujová P, Freiberger T, Grodecká L. Splicing Enhancers at Intron-Exon Borders Participate in Acceptor Splice Sites Recognition. Int J Mol Sci 2020; 21:6553. [PMID: 32911621 PMCID: PMC7554774 DOI: 10.3390/ijms21186553] [Citation(s) in RCA: 4] [Impact Index Per Article: 1.0] [Reference Citation Analysis] [Abstract] [Key Words] [MESH Headings] [Track Full Text] [Download PDF] [Figures] [Journal Information] [Subscribe] [Scholar Register] [Received: 08/14/2020] [Revised: 09/05/2020] [Accepted: 09/06/2020] [Indexed: 02/07/2023] Open
Abstract
Acceptor splice site recognition (3′ splice site: 3′ss) is a fundamental step in precursor messenger RNA (pre-mRNA) splicing. Generally, the U2 small nuclear ribonucleoprotein (snRNP) auxiliary factor (U2AF) heterodimer recognizes the 3′ss, of which U2AF35 has a dual function: (i) It binds to the intron–exon border of some 3′ss and (ii) mediates enhancer-binding splicing activators’ interactions with the spliceosome. Alternative mechanisms for 3′ss recognition have been suggested, yet they are still not thoroughly understood. Here, we analyzed 3′ss recognition where the intron–exon border is bound by a ubiquitous splicing regulator SRSF1. Using the minigene analysis of two model exons and their mutants, BRCA2 exon 12 and VARS2 exon 17, we showed that the exon inclusion correlated much better with the predicted SRSF1 affinity than 3′ss quality, which were assessed using the Catalog of Inferred Sequence Binding Preferences of RNA binding proteins (CISBP-RNA) database and maximum entropy algorithm (MaxEnt) predictor and the U2AF35 consensus matrix, respectively. RNA affinity purification proved SRSF1 binding to the model 3′ss. On the other hand, knockdown experiments revealed that U2AF35 also plays a role in these exons’ inclusion. Most probably, both factors stochastically bind the 3′ss, supporting exon recognition, more apparently in VARS2 exon 17. Identifying splicing activators as 3′ss recognition factors is crucial for both a basic understanding of splicing regulation and human genetic diagnostics when assessing variants’ effects on splicing.
Affiliation(s)
- Tatiana Kováčová
  - Molecular Genetics Laboratory, Centre for Cardiovascular Surgery and Transplantation, 656 91 Brno, Czech Republic
  - Faculty of Medicine, Masaryk University, 625 00 Brno, Czech Republic
- Přemysl Souček
  - Molecular Genetics Laboratory, Centre for Cardiovascular Surgery and Transplantation, 656 91 Brno, Czech Republic
  - Faculty of Medicine, Masaryk University, 625 00 Brno, Czech Republic
- Pavla Hujová
  - Molecular Genetics Laboratory, Centre for Cardiovascular Surgery and Transplantation, 656 91 Brno, Czech Republic
  - Faculty of Medicine, Masaryk University, 625 00 Brno, Czech Republic
- Tomáš Freiberger
  - Molecular Genetics Laboratory, Centre for Cardiovascular Surgery and Transplantation, 656 91 Brno, Czech Republic
  - Faculty of Medicine, Masaryk University, 625 00 Brno, Czech Republic
- Lucie Grodecká
  - Molecular Genetics Laboratory, Centre for Cardiovascular Surgery and Transplantation, 656 91 Brno, Czech Republic
|
71
|
Implications of CLSPN Variants in Cellular Function and Susceptibility to Cancer. Cancers (Basel) 2020; 12:2396. [PMID: 32847043 PMCID: PMC7565888 DOI: 10.3390/cancers12092396] [Citation(s) in RCA: 3] [Impact Index Per Article: 0.8] [Reference Citation Analysis] [Abstract] [Key Words] [Track Full Text] [Download PDF] [Figures] [Journal Information] [Subscribe] [Scholar Register] [Received: 06/12/2020] [Revised: 08/05/2020] [Accepted: 08/20/2020] [Indexed: 11/28/2022] Open
Abstract
Claspin is a multifunctional protein that participates in physiological processes essential for cell homeostasis that are often defective in cancer, namely due to genetic changes. It is conceivable that Claspin gene (CLSPN) alterations may contribute to cancer development. Therefore, CLSPN germline alterations were characterized in sporadic and familial breast cancer and glioma samples, as well as in six cancer cell lines. Their association to cancer susceptibility and functional impact were investigated. Eight variants were identified (c.-68C>T, c.17G>A, c.1574A>G, c.2230T>C, c.2028+16G>A, c.3595-3597del, and c.3839C>T). CLSPN c.1574A>G (p.Asn525Ser) was significantly associated with breast cancer and was shown to cause partial exon skipping and decreased Claspin expression and Chk1 activation in a minigene splicing assay and in signalling experiments, respectively. CLSPN c.2028+16G>A was significantly associated with familial breast cancer and glioma, whereas c.2230T>C (p.Ser744Pro), was exclusively detected in breast cancer and glioma patients, but not in healthy controls. The remaining variants lacked a significant association with cancer. Nevertheless, the c.-68C>T promoter variant increased transcriptional activity in a luciferase assay. In conclusion, some of the CLSPN variants identified in the present study appear to modulate Claspin’s function by altering CLSPN transcription and RNA processing, as well as Chk1 activation.
|
72
|
Tubeuf H, Charbonnier C, Soukarieh O, Blavier A, Lefebvre A, Dauchel H, Frebourg T, Gaildrat P, Martins A. Large-scale comparative evaluation of user-friendly tools for predicting variant-induced alterations of splicing regulatory elements. Hum Mutat 2020; 41:1811-1829. [PMID: 32741062 DOI: 10.1002/humu.24091] [Citation(s) in RCA: 26] [Impact Index Per Article: 6.5] [Reference Citation Analysis] [Abstract] [Key Words] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Received: 02/05/2020] [Revised: 07/11/2020] [Accepted: 07/26/2020] [Indexed: 12/20/2022]
Abstract
Discriminating which nucleotide variants cause disease or contribute to phenotypic traits remains a major challenge in human genetics. In theory, any intragenic variant can potentially affect RNA splicing by altering splicing regulatory elements (SREs). However, these alterations are often ignored mainly because pioneer SRE predictors have proved inefficient. Here, we report the first large-scale comparative evaluation of four user-friendly SRE-dedicated algorithms (QUEPASA, HEXplorer, SPANR, and HAL) tested both as standalone tools and in multiple combined ways based on two independent benchmark datasets adding up to >1,300 exonic variants studied at the messenger RNA level and mapping to 89 different disease-causing genes. These methods display good predictive power, based on decision thresholds derived from the receiver operating characteristics curve analyses, with QUEPASA and HAL having the best accuracies either as standalone or in combination. Still, overall there was a tight race between the four predictors, suggesting that all methods may be of use. Additionally, QUEPASA and HEXplorer may be beneficial as well for predicting variant-induced creation of pseudoexons deep within introns. Our study highlights the potential of SRE predictors as filtering tools for identifying disease-causing candidates among the plethora of variants detected by high-throughput DNA sequencing and provides guidance for their use in genomic medicine settings.
Affiliation(s)
- Hélène Tubeuf
  - Inserm U1245, UNIROUEN, Normandie University, Normandy Centre for Genomic and Personalized Medicine, Rouen, France
  - Interactive Biosoftware, Rouen, France
- Camille Charbonnier
  - Inserm U1245, UNIROUEN, Normandie University, Normandy Centre for Genomic and Personalized Medicine, Rouen, France
- Omar Soukarieh
  - Inserm U1245, UNIROUEN, Normandie University, Normandy Centre for Genomic and Personalized Medicine, Rouen, France
- Arnaud Lefebvre
  - Computer Science, Information Processing and Systems Laboratory, UNIROUEN, Normandie University, Mont-Saint-Aignan, France
- Hélène Dauchel
  - Computer Science, Information Processing and Systems Laboratory, UNIROUEN, Normandie University, Mont-Saint-Aignan, France
- Thierry Frebourg
  - Inserm U1245, UNIROUEN, Normandie University, Normandy Centre for Genomic and Personalized Medicine, Rouen, France
  - Department of Genetics, University Hospital, Normandy Centre for Genomic and Personalized Medicine, Rouen, France
- Pascaline Gaildrat
  - Inserm U1245, UNIROUEN, Normandie University, Normandy Centre for Genomic and Personalized Medicine, Rouen, France
- Alexandra Martins
  - Inserm U1245, UNIROUEN, Normandie University, Normandy Centre for Genomic and Personalized Medicine, Rouen, France
|
73
|
Sylvester B, Brindopke F, Suzuki A, Giron M, Auslander A, Maas RL, Tsai B, Gao H, Magee W, Cox TC, Sanchez-Lara PA. A Synonymous Exonic Splice Silencer Variant in IRF6 as a Novel and Cryptic Cause of Non-Syndromic Cleft Lip and Palate. Genes (Basel) 2020; 11:903. [PMID: 32784565 PMCID: PMC7465030 DOI: 10.3390/genes11080903] [Citation(s) in RCA: 3] [Impact Index Per Article: 0.8] [Reference Citation Analysis] [Abstract] [Key Words] [MESH Headings] [Grants] [Download PDF] [Figures] [Journal Information] [Subscribe] [Scholar Register] [Received: 06/19/2020] [Revised: 07/28/2020] [Accepted: 08/04/2020] [Indexed: 12/31/2022] Open
Abstract
Missense, nonsense, splice site and regulatory region variants in interferon regulatory factor 6 (IRF6) have been shown to contribute to both syndromic and non-syndromic forms of cleft lip and/or palate (CL/P). We report the diagnostic evaluation of a complex multigeneration family of Honduran ancestry with a pedigree structure consistent with autosomal-dominant inheritance with both incomplete penetrance and variable expressivity. The proband's grandmother bore children with two partners and CL/P segregates on both sides of each lineage. Through whole-exome sequencing of five members of the family, we identified a single shared synonymous variant, located in the middle of exon 7 of IRF6 (p.Ser307Ser; g.209963979 G>A; c.921C>T). The variant was shown to segregate in the seven affected individuals and through three unaffected obligate carriers, spanning both sides of this pedigree. This variant is very rare, only being found in three (all of Latino ancestry) of 251,352 alleles in the gnomAD database. While the variant did not create a splice acceptor/donor site, in silico analysis predicted it to impact an exonic splice silencer element and the binding of major splice regulatory factors. In vitro splice assays supported this by revealing multiple abnormal splicing events, estimated to impact >60% of allelic transcripts. Sequencing of the alternate splice products demonstrated the unmasking of a cryptic splice site six nucleotides 5' of the variant, as well as variable utilization of cryptic splice sites in intron 6. The ectopic expression of different splice regulatory proteins altered the proportion of abnormal splicing events seen in the splice assay, although the alteration was dependent on the splice factor. Importantly, each alternatively spliced mRNA is predicted to result in a frame shift and prematurely truncated IRF6 protein. 
This is the first study to identify a synonymous variant as a likely cause of NS-CL/P and highlights the care that should be taken by laboratories when considering and interpreting variants.
Affiliation(s)
- Beau Sylvester
  - Division of Plastic and Maxillofacial Surgery, Children’s Hospital Los Angeles, Los Angeles, CA 90027, USA; (B.S.); (A.A.); (W.M.III)
- Akiko Suzuki
  - Department of Oral & Craniofacial Sciences, University of Missouri-Kansas City School of Dentistry, Kansas City, MO 64108, USA; (A.S.); (T.C.C.)
- Melissa Giron
  - Operación Sonrisa Honduras, Tegucigalpa 11101, Honduras
- Allyn Auslander
  - Division of Plastic and Maxillofacial Surgery, Children’s Hospital Los Angeles, Los Angeles, CA 90027, USA; (B.S.); (A.A.); (W.M.III)
  - Department of Preventive Medicine, Keck School of Medicine of the University of Southern California, Los Angeles, CA 90033, USA
- Richard L. Maas
  - Division of Genetics, Department of Medicine, Brigham and Women’s Hospital, Harvard Medical School, Boston, MA 02115, USA
- Becky Tsai
  - Fulgent Genetics, Temple City, CA 91780, USA; (B.T.); (H.G.)
- Hanlin Gao
  - Fulgent Genetics, Temple City, CA 91780, USA; (B.T.); (H.G.)
- William Magee
  - Division of Plastic and Maxillofacial Surgery, Children’s Hospital Los Angeles, Los Angeles, CA 90027, USA; (B.S.); (A.A.); (W.M.III)
- Timothy C. Cox
  - Department of Oral & Craniofacial Sciences, University of Missouri-Kansas City School of Dentistry, Kansas City, MO 64108, USA; (A.S.); (T.C.C.)
  - Department of Pediatrics, University of Missouri-Kansas City School of Medicine, Kansas City, MO 64108, USA
- Pedro A. Sanchez-Lara
  - Department of Pediatrics, Cedars-Sinai Medical Center, David Geffen School of Medicine at UCLA, Los Angeles, CA 90048, USA
  - Correspondence: Tel.: +1-(310)-423-4461
74
Canson D, Glubb D, Spurdle AB. Variant effect on splicing regulatory elements, branchpoint usage, and pseudoexonization: Strategies to enhance bioinformatic prediction using hereditary cancer genes as exemplars. Hum Mutat 2020; 41:1705-1721. [PMID: 32623769 DOI: 10.1002/humu.24074] [Citation(s) in RCA: 22] [Impact Index Per Article: 5.5] [Received: 03/08/2020] [Revised: 06/26/2020] [Accepted: 07/02/2020] [Indexed: 12/15/2022]
Abstract
It is possible to estimate the prior probability of pathogenicity for germline disease gene variants based on bioinformatic prediction of variant effect/s. However, routinely used approaches have likely led to the underestimation and underreporting of variants located outside donor and acceptor splice site motifs that affect messenger RNA (mRNA) processing. This review presents information about hereditary cancer gene germline variants, outside native splice sites, with experimentally validated splicing effects. We list 95 exonic variants that impact splicing regulatory elements (SREs) in BRCA1, BRCA2, MLH1, MSH2, MSH6, and PMS2. We utilized a pre-existing large-scale BRCA1 functional data set to map functional SREs, and assess the relative performance of different tools to predict effects of 283 variants on such elements. We also describe rare examples of intronic variants that impact branchpoint (BP) sites and create pseudoexons. We discuss the challenges in predicting variant effect on BP site usage and pseudoexonization, and suggest strategies to improve the bioinformatic prioritization of such variants for experimental validation. Importantly, our review and analysis highlights the value of considering impact of variants outside donor and acceptor motifs on mRNA splicing and disease causation.
Affiliation(s)
- Daffodil Canson
  - Genetics and Computational Biology Department, QIMR Berghofer Medical Research Institute, Brisbane, Queensland, Australia
  - Faculty of Medicine, The University of Queensland, Brisbane, Queensland, Australia
- Dylan Glubb
  - Genetics and Computational Biology Department, QIMR Berghofer Medical Research Institute, Brisbane, Queensland, Australia
- Amanda B Spurdle
  - Genetics and Computational Biology Department, QIMR Berghofer Medical Research Institute, Brisbane, Queensland, Australia
  - Faculty of Medicine, The University of Queensland, Brisbane, Queensland, Australia
75
Tubeuf H, Caputo SM, Sullivan T, Rondeaux J, Krieger S, Caux-Moncoutier V, Hauchard J, Castelain G, Fiévet A, Meulemans L, Révillion F, Léoné M, Boutry-Kryza N, Delnatte C, Guillaud-Bataille M, Cleveland L, Reid S, Southon E, Soukarieh O, Drouet A, Di Giacomo D, Vezain M, Bonnet-Dorion F, Bourdon V, Larbre H, Muller D, Pujol P, Vaz F, Audebert-Bellanger S, Colas C, Venat-Bouvet L, Solano AR, Stoppa-Lyonnet D, Houdayer C, Frebourg T, Gaildrat P, Sharan SK, Martins A. Calibration of Pathogenicity Due to Variant-Induced Leaky Splicing Defects by Using BRCA2 Exon 3 as a Model System. Cancer Res 2020; 80:3593-3605. [PMID: 32641407 DOI: 10.1158/0008-5472.can-20-0895] [Citation(s) in RCA: 18] [Impact Index Per Article: 4.5] [Received: 03/31/2020] [Revised: 05/14/2020] [Accepted: 07/02/2020] [Indexed: 12/25/2022]
Abstract
BRCA2 is a clinically actionable gene implicated in breast and ovarian cancer predisposition that has become a high priority target for improving the classification of variants of unknown significance (VUS). Among all BRCA2 VUS, those causing partial/leaky splicing defects are the most challenging to classify because the minimal level of full-length (FL) transcripts required for normal function remains to be established. Here, we explored BRCA2 exon 3 (BRCA2e3) as a model for calibrating variant-induced spliceogenicity and estimating thresholds for BRCA2 haploinsufficiency. In silico predictions, minigene splicing assays, patients' RNA analyses, a mouse embryonic stem cell (mESC) complementation assay and retrieval of patient-related information were combined to determine the minimal requirement of FL BRCA2 transcripts. Of 100 BRCA2e3 variants tested in the minigene assay, 64 were found to be spliceogenic, causing mild to severe RNA defects. Splicing defects were also confirmed in patients' RNA when available. Analysis of a neutral leaky variant (c.231T>G) showed that a reduction of approximately 60% of FL BRCA2 transcripts from a mutant allele does not cause any increase in cancer risk. Moreover, data obtained from mESCs suggest that variants causing a decline in FL BRCA2 with approximately 30% of wild-type are not pathogenic, given that mESCs are fully viable and resistant to DNA-damaging agents in those conditions. In contrast, mESCs producing lower relative amounts of FL BRCA2 exhibited either null or hypomorphic phenotypes. Overall, our findings are likely to have broader implications on the interpretation of BRCA2 variants affecting the splicing pattern of other essential exons. SIGNIFICANCE: These findings demonstrate that BRCA2 tumor suppressor function tolerates substantial reduction in full-length transcripts, helping to determine the pathogenicity of BRCA2 leaky splicing variants, some of which may not increase cancer risk.
Affiliation(s)
- Hélène Tubeuf
  - Inserm U1245, UNIROUEN, Normandie University, Normandy Centre for Genomic and Personalized Medicine, Rouen, France
  - Interactive Biosoftware, Rouen, France
- Sandrine M Caputo
  - Department of Genetics, Institut Curie, Paris, France
  - PSL Research University, Paris, France
- Teresa Sullivan
  - Mouse Cancer Genetics Program, Center for Cancer Research, National Cancer Institute, Frederick, Maryland
- Julie Rondeaux
  - Inserm U1245, UNIROUEN, Normandie University, Normandy Centre for Genomic and Personalized Medicine, Rouen, France
- Sophie Krieger
  - Inserm U1245, UNIROUEN, Normandie University, Normandy Centre for Genomic and Personalized Medicine, Rouen, France
  - Laboratory of Cancer Biology and Genetics, Centre François Baclesse, Caen, France - Normandie University, UNICAEN, Caen, France
- Julie Hauchard
  - Inserm U1245, UNIROUEN, Normandie University, Normandy Centre for Genomic and Personalized Medicine, Rouen, France
- Gaia Castelain
  - Inserm U1245, UNIROUEN, Normandie University, Normandy Centre for Genomic and Personalized Medicine, Rouen, France
- Alice Fiévet
  - Department of Genetics, Institut Curie, Paris, France
  - INSERM U830, University Paris Descartes, Paris, France
  - Service Génétique des Tumeurs, Gustave Roussy, Villejuif, France
- Laëtitia Meulemans
  - Inserm U1245, UNIROUEN, Normandie University, Normandy Centre for Genomic and Personalized Medicine, Rouen, France
- Linda Cleveland
  - Mouse Cancer Genetics Program, Center for Cancer Research, National Cancer Institute, Frederick, Maryland
- Susan Reid
  - Mouse Cancer Genetics Program, Center for Cancer Research, National Cancer Institute, Frederick, Maryland
- Eileen Southon
  - Mouse Cancer Genetics Program, Center for Cancer Research, National Cancer Institute, Frederick, Maryland
- Omar Soukarieh
  - Inserm U1245, UNIROUEN, Normandie University, Normandy Centre for Genomic and Personalized Medicine, Rouen, France
- Aurélie Drouet
  - Inserm U1245, UNIROUEN, Normandie University, Normandy Centre for Genomic and Personalized Medicine, Rouen, France
- Daniela Di Giacomo
  - Inserm U1245, UNIROUEN, Normandie University, Normandy Centre for Genomic and Personalized Medicine, Rouen, France
- Myriam Vezain
  - Inserm U1245, UNIROUEN, Normandie University, Normandy Centre for Genomic and Personalized Medicine, Rouen, France
- Violaine Bourdon
  - Department of Genetics, Institut Paoli-Calmettes, Marseille, France
- Hélène Larbre
  - Laboratoire d'Oncogénétique Moléculaire, Institut Godinot, Reims, France
- Danièle Muller
  - Unité d'Oncogénétique, Centre Paul Strauss, Strasbourg, France
- Pascal Pujol
  - Unité d'Oncogénétique, CHU Arnaud de Villeneuve, Montpellier, France
- Fátima Vaz
  - Breast Cancer Risk Evaluation Clinic, Portuguese Institute of Oncology of Lisbon, Lisbon, Portugal
- Chrystelle Colas
  - Department of Genetics, Institut Curie, Paris, France
  - PSL Research University, Paris, France
- Angela R Solano
  - Genotipificacion y Cancer Hereditario, Departmento de Analisis Clinicos, Centro de Educacion Medica e Investigaciones Clinicas (CEMIC), Ciudad Autonoma de Buenos Aires, Argentina
- Dominique Stoppa-Lyonnet
  - Department of Genetics, Institut Curie, Paris, France
  - INSERM U830, University Paris Descartes, Paris, France
- Claude Houdayer
  - Inserm U1245, UNIROUEN, Normandie University, Normandy Centre for Genomic and Personalized Medicine, Rouen, France
  - Department of Genetics, University Hospital, Normandy Centre for Genomic and Personalized Medicine, Rouen, France
- Thierry Frebourg
  - Inserm U1245, UNIROUEN, Normandie University, Normandy Centre for Genomic and Personalized Medicine, Rouen, France
  - Department of Genetics, University Hospital, Normandy Centre for Genomic and Personalized Medicine, Rouen, France
- Pascaline Gaildrat
  - Inserm U1245, UNIROUEN, Normandie University, Normandy Centre for Genomic and Personalized Medicine, Rouen, France
- Shyam K Sharan
  - Mouse Cancer Genetics Program, Center for Cancer Research, National Cancer Institute, Frederick, Maryland
- Alexandra Martins
  - Inserm U1245, UNIROUEN, Normandie University, Normandy Centre for Genomic and Personalized Medicine, Rouen, France
76
Monger S, Troup M, Ip E, Dunwoodie SL, Giannoulatou E. Spliceogen: an integrative, scalable tool for the discovery of splice-altering variants. Bioinformatics 2020; 35:4405-4407. [PMID: 30993321 DOI: 10.1093/bioinformatics/btz263] [Citation(s) in RCA: 4] [Impact Index Per Article: 1.0] [Received: 12/24/2018] [Revised: 03/31/2019] [Accepted: 04/10/2019] [Indexed: 01/01/2023] Open
Abstract
MOTIVATION: In silico prediction tools are essential for identifying variants which create or disrupt cis-splicing motifs. However, there are limited options for genome-scale discovery of splice-altering variants. RESULTS: We have developed Spliceogen, a highly scalable pipeline integrating predictions from some of the individually best performing models for splice motif prediction: MaxEntScan, GeneSplicer, ESRseq and Branchpointer. AVAILABILITY AND IMPLEMENTATION: Spliceogen is available as a command line tool which accepts VCF/BED inputs and handles both single nucleotide variants (SNVs) and indels (https://github.com/VCCRI/Spliceogen). SNV databases with prediction scores are also available, covering all possible SNVs at all genomic positions within all Gencode-annotated multi-exon transcripts. SUPPLEMENTARY INFORMATION: Supplementary data are available at Bioinformatics online.
Affiliation(s)
- Steven Monger
  - Victor Chang Cardiac Research Institute, Sydney, Australia
- Michael Troup
  - Victor Chang Cardiac Research Institute, Sydney, Australia
- Eddie Ip
  - Victor Chang Cardiac Research Institute, Sydney, Australia
  - St Vincent's Clinical School, UNSW Sydney, Australia
- Sally L Dunwoodie
  - Victor Chang Cardiac Research Institute, Sydney, Australia
  - St Vincent's Clinical School, UNSW Sydney, Australia
  - School of Biotechnology and Biomolecular Sciences, UNSW Sydney, Australia
- Eleni Giannoulatou
  - Victor Chang Cardiac Research Institute, Sydney, Australia
  - St Vincent's Clinical School, UNSW Sydney, Australia
77
Splicing mutations in inherited retinal diseases. Prog Retin Eye Res 2020; 80:100874. [PMID: 32553897 DOI: 10.1016/j.preteyeres.2020.100874] [Citation(s) in RCA: 16] [Impact Index Per Article: 4.0] [Received: 08/02/2019] [Revised: 05/30/2020] [Accepted: 05/31/2020] [Indexed: 12/15/2022]
Abstract
Mutations which induce aberrant transcript splicing represent a distinct class of disease-causing genetic variants in retinal disease genes. Such mutations may either weaken or erase regular splice sites or create novel splice sites which alter exon recognition. While mutations affecting the canonical GU-AG dinucleotides at the splice donor and splice acceptor site are highly predictive to cause a splicing defect, other variants in the vicinity of the canonical splice sites or those affecting additional cis-acting regulatory sequences within exons or introns are much more difficult to assess or even to recognize and require additional experimental validation. Splicing mutations are unique in that the actual outcome for the transcript (e.g. exon skipping, pseudoexon inclusion, intron retention) and the encoded protein can be quite different depending on the individual mutation. In this article, we present an overview on the current knowledge about and impact of splicing mutations in inherited retinal diseases. We introduce the most common sub-classes of splicing mutations including examples from our own work and others and discuss current strategies for the identification and validation of splicing mutations, as well as therapeutic approaches, open questions, and future perspectives in this field of research.
78
Rong S, Buerer L, Rhine CL, Wang J, Cygan KJ, Fairbrother WG. Mutational bias and the protein code shape the evolution of splicing enhancers. Nat Commun 2020; 11:2845. [PMID: 32504065 PMCID: PMC7275064 DOI: 10.1038/s41467-020-16673-z] [Citation(s) in RCA: 11] [Impact Index Per Article: 2.8] [Received: 01/30/2019] [Accepted: 04/28/2020] [Indexed: 02/06/2023] Open
Abstract
Exonic splicing enhancers (ESEs) are enriched in exons relative to introns and bind splicing activators. This study considers a fundamental question of co-evolution: How did ESE motifs become enriched in exons prior to the evolution of ESE recognition? We hypothesize that the high exon to intron motif ratios necessary for ESE function were created by mutational bias coupled with purifying selection on the protein code. These two forces retain certain coding motifs in exons while passively depleting them from introns. Through the use of simulations, genomic analyses, and high throughput splicing assays, we confirm the key predictions of this hypothesis, including an overlap between protein and splicing information in ESEs. We discuss the implications of mutational bias as an evolutionary driver in other cis-regulatory systems. Splicing is regulated by cis-acting elements in pre-mRNAs such as exonic or intronic splicing enhancers and silencers. Here the authors show that exonic splicing enhancers are enriched in exons compared to introns due to mutational bias coupled with purifying selection on the protein code.
Affiliation(s)
- Stephen Rong
  - Center for Computational Molecular Biology, Brown University, Providence, RI, 02912, USA
  - Ecology and Evolutionary Biology, Brown University, Providence, RI, 02912, USA
- Luke Buerer
  - Center for Computational Molecular Biology, Brown University, Providence, RI, 02912, USA
- Christy L Rhine
  - Molecular Biology, Cell Biology and Biochemistry, Brown University, Providence, RI, 02912, USA
- Jing Wang
  - Molecular Biology, Cell Biology and Biochemistry, Brown University, Providence, RI, 02912, USA
- Kamil J Cygan
  - Center for Computational Molecular Biology, Brown University, Providence, RI, 02912, USA
  - Molecular Biology, Cell Biology and Biochemistry, Brown University, Providence, RI, 02912, USA
- William G Fairbrother
  - Center for Computational Molecular Biology, Brown University, Providence, RI, 02912, USA
  - Molecular Biology, Cell Biology and Biochemistry, Brown University, Providence, RI, 02912, USA
  - Hassenfeld Child Health Innovation Institute of Brown University, Providence, RI, 02912, USA
79
Zhang D, Xia J. Somatic synonymous mutations in regulatory elements contribute to the genetic aetiology of melanoma. BMC Med Genomics 2020; 13:43. [PMID: 32241263 PMCID: PMC7119296 DOI: 10.1186/s12920-020-0685-2] [Citation(s) in RCA: 17] [Impact Index Per Article: 4.3] [Indexed: 01/01/2023] Open
Abstract
Background: Non-synonymous mutations altering tumor suppressor genes and oncogenes are widely studied. However, synonymous mutations, which do not alter the protein sequence, are rarely investigated in melanoma genome studies. Methods: We explored the role of somatic synonymous mutations in melanoma samples from TCGA (The Cancer Genome Atlas). The pathogenic synonymous mutation and neutral synonymous mutation data were used to assess the significance of pathogenic synonymous mutations in melanoma likely to affect genetic regulatory elements using Fisher’s exact test. Poisson distribution probabilities of each gene were used to mine the genes with multiple potential functional synonymous mutations affecting regulatory elements. Results: Concentrating on five types of genetic regulatory functions, we found that the mutational patterns of pathogenic synonymous mutations are mostly involved in exonic splicing regulators in near-splicing sites or inside DNase I hypersensitivity sites or non-optimal codon. Moreover, the sites of miRNA binding alteration exhibit a significantly lower rate of evolution than other sites. Finally, 12 genes were hit by recurrent potentially functional synonymous mutations, which showed statistical significance in the pathogenic mutations. Among them, nine genes (DNAH5, ADCY8, GRIN2A, KSR2, TECTA, RIMS2, XKR6, MYH1, SCN10A) have been reported to be mutated in melanoma, and other three genes (SLC9A2, CASR, SLC8A3) have a great potential to impact melanoma. Conclusion: These findings confirm the functional consequences of somatic synonymous mutations in melanoma, emphasizing the significance of research in future studies.
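The per-gene Poisson probabilities mentioned in this abstract can be illustrated with a minimal sketch. It assumes, which the abstract does not state, that the expected mutation count for a gene scales with its length relative to the total sequence surveyed; the function names `expected_hits` and `poisson_tail` are hypothetical and not taken from the paper.

```python
import math


def expected_hits(gene_length_bp: int, total_mutations: int, total_length_bp: int) -> float:
    """Expected mutation count for one gene under a uniform-rate assumption
    (an illustrative null model, not necessarily the one used in the study)."""
    return total_mutations * gene_length_bp / total_length_bp


def poisson_tail(k: int, lam: float) -> float:
    """P(X >= k) for X ~ Poisson(lam): the chance of seeing at least k hits."""
    if k <= 0:
        return 1.0
    # Sum the probability mass for 0..k-1 hits and take the complement.
    cdf = sum(math.exp(-lam) * lam ** i / math.factorial(i) for i in range(k))
    return max(0.0, 1.0 - cdf)


if __name__ == "__main__":
    # Made-up numbers for illustration only.
    lam = expected_hits(gene_length_bp=1_000, total_mutations=50, total_length_bp=100_000)
    print(lam, poisson_tail(3, lam))
```

A small tail probability for a gene would flag it as carrying more mutations than the null model predicts; in practice such p-values would also need multiple-testing correction across genes.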
Affiliation(s)
- Di Zhang
  - College of information science and engineering, Shaoguan University, Shaoguan, Guangdong, China
  - Institutes of Physical Science and Information Technology, Anhui University, Hefei, Anhui, China
- Junfeng Xia
  - Institutes of Physical Science and Information Technology, Anhui University, Hefei, Anhui, China
80
Abrahams L, Hurst LD. A Depletion of Stop Codons in lincRNA is Owing to Transfer of Selective Constraint from Coding Sequences. Mol Biol Evol 2020; 37:1148-1164. [PMID: 31841162 PMCID: PMC7086181 DOI: 10.1093/molbev/msz299] [Citation(s) in RCA: 1] [Impact Index Per Article: 0.3] [Indexed: 12/12/2022] Open
Abstract
Although the constraints on a gene’s sequence are often assumed to reflect the functioning of that gene, here we propose transfer selection, a constraint operating on one class of genes transferred to another, mediated by shared binding factors. We show that such transfer can explain an otherwise paradoxical depletion of stop codons in long intergenic noncoding RNAs (lincRNAs). Serine/arginine-rich proteins direct the splicing machinery by binding exonic splice enhancers (ESEs) in immature mRNA. As coding exons cannot contain stop codons in one reading frame, stop codons should be rare within ESEs. We confirm that the stop codon density (SCD) in ESE motifs is low, even accounting for nucleotide biases. Given that serine/arginine-rich proteins binding ESEs also facilitate lincRNA splicing, a low SCD could transfer to lincRNAs. As predicted, multiexon lincRNA exons are depleted in stop codons, a result not explained by open reading frame (ORF) contamination. Consistent with transfer selection, stop codon depletion in lincRNAs is most acute in exonic regions with the highest ESE density, disappears when ESEs are masked, is consistent with stop codon usage skews in ESEs, and is diminished in both single-exon lincRNAs and introns. Owing to low SCD, the maximum lengths of pseudo-ORFs frequently exceed null expectations. This has implications for ORF annotation and the evolution of de novo protein-coding genes from lincRNAs. We conclude that not all constraints operating on genes need be explained by the functioning of the gene but may instead be transferred owing to shared binding factors.
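As a rough illustration of the stop codon density (SCD) measure discussed in this abstract, the sketch below counts stop codons among the complete codons of a sequence in a chosen reading frame. The exact SCD definition used by the authors is not given here, so this per-frame fraction and the helper names `stop_codon_density` and `mean_scd` are assumptions for illustration.

```python
STOP_CODONS = {"TAA", "TAG", "TGA"}


def stop_codon_density(seq: str, frame: int = 0) -> float:
    """Fraction of complete codons in the given frame (0, 1 or 2) that are stops."""
    seq = seq.upper().replace("U", "T")  # accept RNA as well as DNA input
    codons = [seq[i:i + 3] for i in range(frame, len(seq) - 2, 3)]
    if not codons:
        return 0.0
    return sum(codon in STOP_CODONS for codon in codons) / len(codons)


def mean_scd(seq: str) -> float:
    """Average stop codon density over the three forward reading frames."""
    return sum(stop_codon_density(seq, frame) for frame in range(3)) / 3


if __name__ == "__main__":
    print(stop_codon_density("ATGTAAGGG"))  # frame 0 codons: ATG, TAA, GGG
    print(mean_scd("ATGTAAGGG"))
```

Comparing such densities between ESE-rich and ESE-poor regions, against a nucleotide-matched null, is the kind of contrast the abstract describes; the null model itself is outside this sketch.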
Affiliation(s)
- Liam Abrahams
  - Milner Centre for Evolution, Department of Biology and Biochemistry, University of Bath, Bath, United Kingdom
- Laurence D Hurst
  - Milner Centre for Evolution, Department of Biology and Biochemistry, University of Bath, Bath, United Kingdom
81
Meulemans L, Mesman RLS, Caputo SM, Krieger S, Guillaud-Bataille M, Caux-Moncoutier V, Léone M, Boutry-Kryza N, Sokolowska J, Révillion F, Delnatte C, Tubeuf H, Soukarieh O, Bonnet-Dorion F, Guibert V, Bronner M, Bourdon V, Lizard S, Vilquin P, Privat M, Drouet A, Grout C, Calléja FMGR, Golmard L, Vrieling H, Stoppa-Lyonnet D, Houdayer C, Frebourg T, Vreeswijk MPG, Martins A, Gaildrat P. Skipping Nonsense to Maintain Function: The Paradigm of BRCA2 Exon 12. Cancer Res 2020; 80:1374-1386. [PMID: 32046981 DOI: 10.1158/0008-5472.can-19-2491] [Citation(s) in RCA: 12] [Impact Index Per Article: 3.0] [Received: 08/14/2019] [Revised: 12/18/2019] [Accepted: 02/06/2020] [Indexed: 11/16/2022]
Abstract
Germline nonsense and canonical splice site variants identified in disease-causing genes are generally considered as loss-of-function (LoF) alleles and classified as pathogenic. However, a fraction of such variants could maintain function through their impact on RNA splicing. To test this hypothesis, we used the alternatively spliced BRCA2 exon 12 (E12) as a model system because its in-frame skipping leads to a potentially functional protein. All E12 variants corresponding to putative LoF variants or predicted to alter splicing (n = 40) were selected from human variation databases and characterized for their impact on splicing in minigene assays and, when available, in patient lymphoblastoid cell lines. Moreover, a selection of variants was analyzed in a mouse embryonic stem cell-based functional assay. Using these complementary approaches, we demonstrate that a subset of variants, including nonsense variants, induced in-frame E12 skipping through the modification of splice sites or regulatory elements and, consequently, led to an internally deleted but partially functional protein. These data provide evidence, for the first time in a cancer-predisposition gene, that certain presumed null variants can retain function due to their impact on splicing. Further studies are required to estimate cancer risk associated with these hypomorphic variants. More generally, our findings highlight the need to exercise caution in the interpretation of putative LoF variants susceptible to induce in-frame splicing modifications. SIGNIFICANCE: This study presents evidence that certain presumed loss-of-function variants in a cancer predisposition gene can retain function due to their direct impact on RNA splicing.
Affiliation(s)
- Laëtitia Meulemans
  - Normandie Univ, UNIROUEN, Inserm U1245, Normandy Centre for Genomic and Personalized Medicine, Rouen, France
- Romy L S Mesman
  - Department of Human Genetics, Leiden University Medical Center, Leiden, the Netherlands
- Sandrine M Caputo
  - Department of Genetics, Institut Curie, Paris, France
  - PSL Research University, Paris, France
- Sophie Krieger
  - Normandie Univ, UNIROUEN, Inserm U1245, Normandy Centre for Genomic and Personalized Medicine, Rouen, France
  - Laboratory of Cancer Biology and Genetics, Centre François Baclesse, Caen, France
  - Normandie University, UNICAEN, Caen, France
- Hélène Tubeuf
  - Normandie Univ, UNIROUEN, Inserm U1245, Normandy Centre for Genomic and Personalized Medicine, Rouen, France
  - Interactive Biosoftware, Rouen, France
- Omar Soukarieh
  - Normandie Univ, UNIROUEN, Inserm U1245, Normandy Centre for Genomic and Personalized Medicine, Rouen, France
- Virginie Guibert
  - Department of Genetics, Nantes University Hospital, Nantes, France
- Myriam Bronner
  - Department of Genetics, Nancy University Hospital, Nancy, France
- Violaine Bourdon
  - Department of Genetics, Institut Paoli-Calmettes, Marseille, France
- Sarab Lizard
  - Department of Genetics, Nancy University Hospital, Nancy, France
- Paul Vilquin
  - Department of Pathology and Oncobiology, Montpellier University Hospital, Montpellier, France
- Maud Privat
  - University of Clermont Auvergne, Inserm U1240, Clermont Ferrand, France
  - Department of Oncogenetics, Centre Jean Perrin, Clermont Ferrand, France
- Aurélie Drouet
  - Normandie Univ, UNIROUEN, Inserm U1245, Normandy Centre for Genomic and Personalized Medicine, Rouen, France
- Charlotte Grout
  - Normandie Univ, UNIROUEN, Inserm U1245, Normandy Centre for Genomic and Personalized Medicine, Rouen, France
- Lisa Golmard
  - Department of Genetics, Institut Curie, Paris, France
  - PSL Research University, Paris, France
- Harry Vrieling
  - Department of Human Genetics, Leiden University Medical Center, Leiden, the Netherlands
- Dominique Stoppa-Lyonnet
  - Department of Genetics, Institut Curie, Paris, France
  - Inserm U830, University Paris Descartes, Paris, France
- Claude Houdayer
  - Normandie Univ, UNIROUEN, Inserm U1245, Normandy Centre for Genomic and Personalized Medicine, Rouen, France
  - Department of Genetics, Institut Curie, Paris, France
  - Department of Genetics, Rouen University Hospital, Rouen, France
- Thierry Frebourg
  - Normandie Univ, UNIROUEN, Inserm U1245, Normandy Centre for Genomic and Personalized Medicine, Rouen, France
  - Department of Genetics, Rouen University Hospital, Rouen, France
- Maaike P G Vreeswijk
  - Department of Human Genetics, Leiden University Medical Center, Leiden, the Netherlands
- Alexandra Martins
  - Normandie Univ, UNIROUEN, Inserm U1245, Normandy Centre for Genomic and Personalized Medicine, Rouen, France
- Pascaline Gaildrat
  - Normandie Univ, UNIROUEN, Inserm U1245, Normandy Centre for Genomic and Personalized Medicine, Rouen, France
82
Howrigan DP, Rose SA, Samocha KE, Fromer M, Cerrato F, Chen WJ, Churchhouse C, Chambert K, Chandler SD, Daly MJ, Dumont A, Genovese G, Hwu HG, Laird N, Kosmicki JA, Moran JL, Roe C, Singh T, Wang SH, Faraone SV, Glatt SJ, McCarroll SA, Tsuang M, Neale BM. Exome sequencing in schizophrenia-affected parent-offspring trios reveals risk conferred by protein-coding de novo mutations. Nat Neurosci 2020; 23:185-193. [PMID: 31932770 PMCID: PMC7007385 DOI: 10.1038/s41593-019-0564-3] [Citation(s) in RCA: 105] [Impact Index Per Article: 26.3] [Received: 12/21/2018] [Accepted: 11/22/2019] [Indexed: 12/31/2022]
Abstract
Protein-coding de novo mutations (DNMs) are significant risk factors in many neurodevelopmental disorders, whereas schizophrenia (SCZ) risk associated with DNMs has thus far been shown to be modest. We analyzed DNMs from 1,695 SCZ-affected trios and 1,077 published SCZ-affected trios to better understand the contribution to SCZ risk. Among 2,772 SCZ probands, exome-wide DNM burden remained modest. Gene set analyses revealed that SCZ DNMs were significantly concentrated in genes that were highly expressed in the brain, that were under strong evolutionary constraint and/or overlapped with genes identified in other neurodevelopmental disorders. No single gene surpassed exome-wide significance; however, 16 genes were recurrently hit by protein-truncating DNMs, corresponding to a 3.15-fold higher rate than the mutation model expectation (permuted 95% confidence interval: 1-10 genes; permuted P = 3 × 10⁻⁵). Overall, DNMs explain a small fraction of SCZ risk, and larger samples are needed to identify individual risk genes, as coding variation across many genes confers risk for SCZ in the population.
Collapse
Affiliation(s)
- Daniel P Howrigan
- Analytic and Translational Genetics Unit, Massachusetts General Hospital, Boston, MA, USA.
- Broad Institute of MIT and Harvard, Cambridge, MA, USA.
| | - Samuel A Rose
- Broad Institute of MIT and Harvard, Cambridge, MA, USA
- Icahn School of Medicine at Mount Sinai, New York, NY, USA
| | - Kaitlin E Samocha
- Analytic and Translational Genetics Unit, Massachusetts General Hospital, Boston, MA, USA
- Broad Institute of MIT and Harvard, Cambridge, MA, USA
| | - Menachem Fromer
- Broad Institute of MIT and Harvard, Cambridge, MA, USA
- Icahn School of Medicine at Mount Sinai, New York, NY, USA
| | | | - Wei J Chen
- National Taiwan University, Taipei, Taiwan
| | - Claire Churchhouse
- Analytic and Translational Genetics Unit, Massachusetts General Hospital, Boston, MA, USA
- Broad Institute of MIT and Harvard, Cambridge, MA, USA
| | | | | | - Mark J Daly
- Analytic and Translational Genetics Unit, Massachusetts General Hospital, Boston, MA, USA
- Broad Institute of MIT and Harvard, Cambridge, MA, USA
| | - Ashley Dumont
- Broad Institute of MIT and Harvard, Cambridge, MA, USA
| | | | | | - Nan Laird
- Harvard School of Public Health, Boston, MA, USA
| | - Jack A Kosmicki
- Analytic and Translational Genetics Unit, Massachusetts General Hospital, Boston, MA, USA
- Broad Institute of MIT and Harvard, Cambridge, MA, USA
| | | | - Cheryl Roe
- SUNY Upstate Medical University, Syracuse, NY, USA
| | - Tarjinder Singh
- Analytic and Translational Genetics Unit, Massachusetts General Hospital, Boston, MA, USA
- Broad Institute of MIT and Harvard, Cambridge, MA, USA
| | | | | | | | - Steven A McCarroll
- Broad Institute of MIT and Harvard, Cambridge, MA, USA
- Harvard University, Cambridge, MA, USA
| | - Ming Tsuang
- University of California, San Diego, La Jolla, CA, USA
| | - Benjamin M Neale
- Analytic and Translational Genetics Unit, Massachusetts General Hospital, Boston, MA, USA.
- Broad Institute of MIT and Harvard, Cambridge, MA, USA.
| |
|
83
|
Rowlands CF, Baralle D, Ellingford JM. Machine Learning Approaches for the Prioritization of Genomic Variants Impacting Pre-mRNA Splicing. Cells 2019; 8:E1513. [PMID: 31779139 PMCID: PMC6953098 DOI: 10.3390/cells8121513] [Citation(s) in RCA: 29] [Impact Index Per Article: 5.8] [Received: 10/29/2019] [Revised: 11/20/2019] [Accepted: 11/21/2019] [Indexed: 12/13/2022] Open
Abstract
Defects in pre-mRNA splicing are frequently a cause of Mendelian disease. Despite the advent of next-generation sequencing, allowing a deeper insight into a patient's variant landscape, the ability to characterize variants causing splicing defects has not progressed with the same speed. To address this, recent years have seen a sharp spike in the number of splice prediction tools leveraging machine learning approaches, leaving clinical geneticists with a plethora of choices for in silico analysis. In this review, some basic principles of machine learning are introduced in the context of genomics and splicing analysis. A critical comparative approach is then used to describe seven recent machine learning-based splice prediction tools, revealing highly diverse approaches and common caveats. We find that, although great progress has been made in producing specific and sensitive tools, there is still much scope for personalized approaches to prediction of variant impact on splicing. Such approaches may increase diagnostic yields and underpin improvements to patient care.
Affiliation(s)
- Charlie F Rowlands
- North West Genomic Laboratory Hub, Manchester Centre for Genomic Medicine, Manchester University Hospitals NHS Foundation Trust, St Mary’s Hospital, Manchester M13 9WJ, UK;
- Division of Evolution and Genomic Sciences, School of Biological Sciences, Faculty of Biology, Medicine and Health, University of Manchester, Manchester M13 9PR, UK
| | - Diana Baralle
- Human Development and Health, Faculty of Medicine, University of Southampton, MP808, Tremona Road, Southampton SO16 6YD, UK
| | - Jamie M Ellingford
- North West Genomic Laboratory Hub, Manchester Centre for Genomic Medicine, Manchester University Hospitals NHS Foundation Trust, St Mary’s Hospital, Manchester M13 9WJ, UK;
- Division of Evolution and Genomic Sciences, School of Biological Sciences, Faculty of Biology, Medicine and Health, University of Manchester, Manchester M13 9PR, UK
| |
|
84
|
Obeng EA, Stewart C, Abdel-Wahab O. Altered RNA Processing in Cancer Pathogenesis and Therapy. Cancer Discov 2019; 9:1493-1510. [PMID: 31611195 PMCID: PMC6825565 DOI: 10.1158/2159-8290.cd-19-0399] [Citation(s) in RCA: 83] [Impact Index Per Article: 16.6] [Received: 04/04/2019] [Revised: 06/21/2019] [Accepted: 08/08/2019] [Indexed: 12/17/2022]
Abstract
Major advances in our understanding of cancer pathogenesis and therapy have come from efforts to catalog genomic alterations in cancer. A growing number of large-scale genomic studies have uncovered mutations that drive cancer by perturbing cotranscriptional and post-transcriptional regulation of gene expression. These include alterations that affect each phase of RNA processing, including splicing, transport, editing, and decay of messenger RNA. The discovery of these events illuminates a number of novel therapeutic vulnerabilities generated by aberrant RNA processing in cancer, several of which have progressed to clinical development. SIGNIFICANCE: There is increased recognition that genetic alterations affecting RNA splicing and polyadenylation are common in cancer and may generate novel therapeutic opportunities. Such mutations may occur within an individual gene or in RNA processing factors themselves, thereby influencing splicing of many downstream target genes. This review discusses the biological impact of these mutations on tumorigenesis and the therapeutic approaches targeting cells bearing these mutations.
Affiliation(s)
- Esther A Obeng
- Department of Oncology, St. Jude Children's Research Hospital, Memphis, Tennessee.
| | - Connor Stewart
- Human Oncology and Pathogenesis Program and Leukemia Service, Department of Medicine, Memorial Sloan Kettering Cancer Center, New York, New York
| | - Omar Abdel-Wahab
- Human Oncology and Pathogenesis Program and Leukemia Service, Department of Medicine, Memorial Sloan Kettering Cancer Center, New York, New York.
| |
|
85
|
Mikl M, Hamburg A, Pilpel Y, Segal E. Dissecting splicing decisions and cell-to-cell variability with designed sequence libraries. Nat Commun 2019; 10:4572. [PMID: 31594945 PMCID: PMC6783452 DOI: 10.1038/s41467-019-12642-3] [Citation(s) in RCA: 18] [Impact Index Per Article: 3.6] [Received: 11/27/2018] [Accepted: 09/22/2019] [Indexed: 11/18/2022] Open
Abstract
Most human genes are alternatively spliced, allowing for a large expansion of the proteome. The multitude of regulatory inputs to splicing limits the potential to infer general principles from investigating native sequences. Here, we create a rationally designed library of >32,000 splicing events to dissect the complexity of splicing regulation through systematic sequence alterations. Measuring RNA and protein splice isoforms allows us to investigate both cause and effect of splicing decisions, quantify diverse regulatory inputs and accurately predict (R² = 0.73–0.85) isoform ratios from sequence and secondary structure. By profiling individual cells, we measure the cell-to-cell variability of splicing decisions and show that it can be encoded in the DNA and influenced by regulatory inputs, opening the door for a novel, single-cell perspective on splicing regulation. Alternative splicing is regulated by multiple mechanisms. Here the authors employed designed splice site libraries and massively parallel reporter assays to dissect the regulatory complexity and cell-to-cell variability of splicing decisions and to build accurate predictive models.
Affiliation(s)
- Martin Mikl
- Department of Computer Science and Applied Mathematics, Weizmann Institute of Science, Rehovot, 7610001, Israel; Department of Molecular Cell Biology, Weizmann Institute of Science, Rehovot, 7610001, Israel; Department of Molecular Genetics, Weizmann Institute of Science, Rehovot, 7610001, Israel
| | - Amit Hamburg
- Department of Computer Science and Applied Mathematics, Weizmann Institute of Science, Rehovot, 7610001, Israel; Department of Molecular Cell Biology, Weizmann Institute of Science, Rehovot, 7610001, Israel
| | - Yitzhak Pilpel
- Department of Molecular Genetics, Weizmann Institute of Science, Rehovot, 7610001, Israel
| | - Eran Segal
- Department of Computer Science and Applied Mathematics, Weizmann Institute of Science, Rehovot, 7610001, Israel; Department of Molecular Cell Biology, Weizmann Institute of Science, Rehovot, 7610001, Israel
| |
|
86
|
CRISPR-Cas9-based mutagenesis frequently provokes on-target mRNA misregulation. Nat Commun 2019; 10:4056. [PMID: 31492834 PMCID: PMC6731291 DOI: 10.1038/s41467-019-12028-5] [Citation(s) in RCA: 116] [Impact Index Per Article: 23.2] [Received: 05/06/2019] [Accepted: 08/14/2019] [Indexed: 12/16/2022] Open
Abstract
The introduction of insertion-deletions (INDELs) by the non-homologous end-joining (NHEJ) pathway underlies the mechanistic basis of CRISPR-Cas9-directed genome editing. Selective gene ablation using CRISPR-Cas9 is achieved by installation of a premature termination codon (PTC) from a frameshift-inducing INDEL that elicits nonsense-mediated decay (NMD) of the mutant mRNA. Here, by examining the mRNA and protein products of CRISPR targeted genes in a cell line panel with presumed gene knockouts, we detect the production of foreign mRNAs or proteins in ~50% of the cell lines. We demonstrate that these aberrant protein products stem from the introduction of INDELs that promote internal ribosomal entry, convert pseudo-mRNAs (alternatively spliced mRNAs with a PTC) into protein encoding molecules, or induce exon skipping by disruption of exon splicing enhancers (ESEs). Our results reveal challenges to manipulating gene expression outcomes using INDEL-based mutagenesis and strategies useful in mitigating their impact on intended genome-editing outcomes.
|
87
|
Naito T. Predicting the impact of single nucleotide variants on splicing via sequence-based deep neural networks and genomic features. Hum Mutat 2019; 40:1261-1269. [PMID: 31090248 PMCID: PMC7265986 DOI: 10.1002/humu.23794] [Citation(s) in RCA: 10] [Impact Index Per Article: 2.0] [Received: 01/13/2019] [Revised: 04/26/2019] [Accepted: 05/12/2019] [Indexed: 11/10/2022]
Abstract
Single nucleotide mutations in exonic regions can significantly affect gene function through a disruption of splicing, and various computational methods have been developed to predict the splicing-related effects of a single nucleotide mutation. We implemented a new method using ensemble learning that combines two types of predictive models: (a) base sequence-based deep neural networks (DNNs) and (b) machine learning models based on genomic attributes. This method was applied to the Massively Parallel Splicing Assay challenge of the Fifth Critical Assessment of Genome Interpretation, in which challenge participants predicted various experimentally-defined exonic splicing mutations, and achieved a promising result. We successfully revealed that combining different predictive models based upon the stacked generalization method led to significant improvement in prediction performance. In addition, whereas most of the genomic features adopted in constructing machine learning models were previously reported, feature values generated with DSSP, a DNN-based splice site prediction tool, were novel and helpful for the prediction. Learning the sequence patterns associated with normal splicing and the change in splicing site probabilities caused by a mutation was presumed to be helpful in predicting splicing disruption.
Affiliation(s)
- Tatsuhiko Naito
- Department of Neurology, Graduate School of Medicine, The University of Tokyo, Tokyo, Japan, Address: 7-3-1 Hongo, Bunkyo-Ku, Tokyo, 113-8655, Japan
| |
|
88
|
Rhine CL, Neil C, Glidden DT, Cygan KJ, Fredericks AM, Wang J, Walton NA, Fairbrother WG. Future directions for high-throughput splicing assays in precision medicine. Hum Mutat 2019; 40:1225-1234. [PMID: 31297895 DOI: 10.1002/humu.23866] [Citation(s) in RCA: 12] [Impact Index Per Article: 2.4] [Received: 01/16/2019] [Revised: 07/02/2019] [Accepted: 07/06/2019] [Indexed: 11/12/2022]
Abstract
Classification of variants of unknown significance is a challenging technical problem in clinical genetics. As up to one-third of disease-causing mutations are thought to affect pre-mRNA splicing, it is important to accurately classify splicing mutations in patient sequencing data. Several consortia and healthcare systems have conducted large-scale patient sequencing studies, which discover novel variants faster than they can be classified. Here, we compare the advantages and limitations of several high-throughput splicing assays aimed at mitigating this bottleneck, and describe a data set of ~5,000 variants that we analyzed using our Massively Parallel Splicing Assay (MaPSy). The Critical Assessment of Genome Interpretation group (CAGI) organized a challenge, in which participants submitted machine learning models to predict the splicing effects of variants in this data set. We discuss the winning submission of the challenge (MMSplice), which outperformed existing software. Finally, we highlight methods to overcome the limitations of MaPSy and similar assays, such as tissue-specific splicing, the effect of surrounding sequence context, classifying intronic variants, synthesizing large exons, and amplifying complex libraries of minigene species. Further development of these assays will greatly benefit the field of clinical genetics, which lacks high-throughput methods for variant interpretation.
Affiliation(s)
- Christy L Rhine
- Molecular Biology, Cell Biology and Biochemistry, Brown University, Providence, Rhode Island
| | - Christopher Neil
- Molecular Biology, Cell Biology and Biochemistry, Brown University, Providence, Rhode Island
| | - David T Glidden
- Center for Computational Molecular Biology, Brown University, Providence, Rhode Island
| | - Kamil J Cygan
- Molecular Biology, Cell Biology and Biochemistry, Brown University, Providence, Rhode Island; Center for Computational Molecular Biology, Brown University, Providence, Rhode Island
| | - Alger M Fredericks
- Molecular Biology, Cell Biology and Biochemistry, Brown University, Providence, Rhode Island
| | - Jing Wang
- Molecular Biology, Cell Biology and Biochemistry, Brown University, Providence, Rhode Island
| | - Nephi A Walton
- Genomic Medicine Institute, Geisinger, Danville, Pennsylvania
| | - William G Fairbrother
- Molecular Biology, Cell Biology and Biochemistry, Brown University, Providence, Rhode Island; Center for Computational Molecular Biology, Brown University, Providence, Rhode Island; Hassenfeld Child Health Innovation Institute of Brown University, Providence, Rhode Island
| |
|
89
|
Katneni UK, Liss A, Holcomb D, Katagiri NH, Hunt R, Bar H, Ismail A, Komar AA, Kimchi‐Sarfaty C. Splicing dysregulation contributes to the pathogenicity of several F9 exonic point variants. Mol Genet Genomic Med 2019; 7:e840. [PMID: 31257730 PMCID: PMC6687662 DOI: 10.1002/mgg3.840] [Citation(s) in RCA: 11] [Impact Index Per Article: 2.2] [Received: 05/10/2019] [Accepted: 06/10/2019] [Indexed: 12/27/2022] Open
Abstract
BACKGROUND: Pre-mRNA splicing is a complex process requiring the identification of donor site, acceptor site, and branch point site with an adjacent polypyrimidine tract sequence. Splicing is regulated by splicing regulatory elements (SREs) with both enhancer and suppressor functions. Variants located in exonic regions can impact splicing through dysregulation of native splice sites, SREs, and cryptic splice site activation. While splicing dysregulation is considered the primary disease-inducing mechanism of synonymous variants, its contribution toward the disease phenotype of non-synonymous variants is underappreciated. METHODS: In this study, we analyzed 415 disease-causing and 120 neutral F9 exonic point variants, both synonymous and non-synonymous, for their effect on splicing using a series of in silico splice site prediction tools, SRE prediction tools, and in vitro minigene assays. RESULTS: The use of splice site and SRE prediction tools in tandem provided better prediction, but the predictions were not always in agreement with the minigene assays. The net effect of splicing dysregulation caused by variants was context dependent. Minigene assays revealed that perturbed splicing can be found. CONCLUSION: Synonymous variants primarily cause disease phenotype via splicing dysregulation, while additional mechanisms such as translation rate also play an important role. Splicing dysregulation is likely to contribute to the disease phenotype of several non-synonymous variants.
Affiliation(s)
- Upendra K. Katneni
- Hemostasis Branch, Division of Plasma Protein Therapeutics, Office of Tissues and Advanced Therapies, Center for Biologics Evaluation & Research, US FDA, Silver Spring, Maryland
| | - Aaron Liss
- Hemostasis Branch, Division of Plasma Protein Therapeutics, Office of Tissues and Advanced Therapies, Center for Biologics Evaluation & Research, US FDA, Silver Spring, Maryland
| | - David Holcomb
- Hemostasis Branch, Division of Plasma Protein Therapeutics, Office of Tissues and Advanced Therapies, Center for Biologics Evaluation & Research, US FDA, Silver Spring, Maryland
| | - Nobuko H. Katagiri
- Hemostasis Branch, Division of Plasma Protein Therapeutics, Office of Tissues and Advanced Therapies, Center for Biologics Evaluation & Research, US FDA, Silver Spring, Maryland
| | - Ryan Hunt
- Hemostasis Branch, Division of Plasma Protein Therapeutics, Office of Tissues and Advanced Therapies, Center for Biologics Evaluation & Research, US FDA, Silver Spring, Maryland
| | - Haim Bar
- Department of Statistics, University of Connecticut, Storrs, Connecticut
| | - Amra Ismail
- Department of Biological, Geological and Environmental Sciences, Center for Gene Regulation in Health and Disease, Cleveland State University, Cleveland, Ohio
| | - Anton A. Komar
- Department of Biological, Geological and Environmental Sciences, Center for Gene Regulation in Health and Disease, Cleveland State University, Cleveland, Ohio
| | - Chava Kimchi‐Sarfaty
- Hemostasis Branch, Division of Plasma Protein Therapeutics, Office of Tissues and Advanced Therapies, Center for Biologics Evaluation & Research, US FDA, Silver Spring, Maryland
| |
|
90
|
Královicová J, Ševcíková I, Stejskalová E, Obuca M, Hiller M, Stanek D, Vorechovský I. PUF60-activated exons uncover altered 3' splice-site selection by germline missense mutations in a single RRM. Nucleic Acids Res 2018; 46:6166-6187. [PMID: 29788428 PMCID: PMC6093180 DOI: 10.1093/nar/gky389] [Citation(s) in RCA: 35] [Impact Index Per Article: 7.0] [Received: 01/03/2018] [Accepted: 05/01/2018] [Indexed: 12/27/2022] Open
Abstract
PUF60 is a splicing factor that binds uridine (U)-rich tracts and facilitates association of the U2 small nuclear ribonucleoprotein with primary transcripts. PUF60 deficiency (PD) causes a developmental delay coupled with intellectual disability and spinal, cardiac, ocular and renal defects, but PD pathogenesis is not understood. Using RNA-Seq, we identify human PUF60-regulated exons and show that PUF60 preferentially acts as their activator. PUF60-activated internal exons are enriched for Us upstream of their 3′ splice sites (3′ss), are preceded by longer AG dinucleotide exclusion zones and more distant branch sites, with a higher probability of unpaired interactions across a typical branch site location as compared to control exons. In contrast, PUF60-repressed exons show U-depletion with lower estimates of RNA single-strandedness. We also describe PUF60-regulated, alternatively spliced isoforms encoding other U-bound splicing factors, including PUF60 partners, suggesting that they are co-regulated in the cell, and identify PUF60-regulated exons derived from transposed elements. PD-associated amino-acid substitutions, even within a single RNA recognition motif (RRM), altered selection of competing 3′ss and branch points of a PUF60-dependent exon and the 3′ss choice was also influenced by alternative splicing of PUF60. Finally, we propose that differential distribution of RNA processing steps detected in cells lacking PUF60 and the PUF60-paralog RBM39 is due to the RBM39 RS domain interactions. Together, these results provide new insights into regulation of exon usage by the 3′ss organization and reveal that germline mutation heterogeneity in RRMs can enhance phenotypic variability at the level of splice-site and branch-site selection.
Affiliation(s)
- Jana Královicová
- University of Southampton Faculty of Medicine, Southampton SO16 6YD, UK; Slovak Academy of Sciences, Centre for Biosciences, 840 05 Bratislava, Slovak Republic
| | - Ivana Ševcíková
- Slovak Academy of Sciences, Centre for Biosciences, 840 05 Bratislava, Slovak Republic
| | - Eva Stejskalová
- Czech Academy of Sciences, Institute of Molecular Genetics, 142 20 Prague, Czech Republic
| | - Mina Obuca
- Czech Academy of Sciences, Institute of Molecular Genetics, 142 20 Prague, Czech Republic
| | - Michael Hiller
- Max Planck Institute of Molecular Cell Biology and Genetics and Max Planck Institute for the Physics of Complex Systems, Dresden, Germany
| | - David Stanek
- Czech Academy of Sciences, Institute of Molecular Genetics, 142 20 Prague, Czech Republic
| | - Igor Vorechovský
- University of Southampton Faculty of Medicine, Southampton SO16 6YD, UK
| |
|
91
|
Jobbins AM, Reichenbach LF, Lucas CM, Hudson AJ, Burley GA, Eperon IC. The mechanisms of a mammalian splicing enhancer. Nucleic Acids Res 2018; 46:2145-2158. [PMID: 29394380 PMCID: PMC5861446 DOI: 10.1093/nar/gky056] [Citation(s) in RCA: 28] [Impact Index Per Article: 5.6] [Received: 10/04/2017] [Accepted: 01/19/2018] [Indexed: 12/21/2022] Open
Abstract
Exonic splicing enhancer (ESE) sequences are bound by serine & arginine-rich (SR) proteins, which in turn enhance the recruitment of splicing factors. It was inferred from measurements of splicing around twenty years ago that Drosophila doublesex ESEs are bound stably by SR proteins, and that the bound proteins interact directly but with low probability with their targets. However, it has not been possible with conventional methods to demonstrate whether mammalian ESEs behave likewise. Using single molecule multi-colour colocalization methods to study SRSF1-dependent ESEs, we have found that the proportion of RNA molecules bound by SRSF1 increases with the number of ESE repeats, but only a single molecule of SRSF1 is bound. We conclude that initial interactions between SRSF1 and an ESE are weak and transient, and that these limit the activity of a mammalian ESE. We tested whether the activation step involves the propagation of proteins along the RNA or direct interactions with 3' splice site components by inserting hexaethylene glycol or abasic RNA between the ESE and the target 3' splice site. These insertions did not block activation, and we conclude that the activation step involves direct interactions. These results support a model in which regulatory proteins bind transiently and in dynamic competition, with the result that each ESE in an exon contributes independently to the probability that an activator protein is bound and in close proximity to a splice site.
Affiliation(s)
- Andrew M Jobbins
- Leicester Institute of Structural & Chemical Biology and Department of Molecular & Cell Biology, University of Leicester, UK
| | | | - Christian M Lucas
- Leicester Institute of Structural & Chemical Biology and Department of Molecular & Cell Biology, University of Leicester, UK
| | - Andrew J Hudson
- Leicester Institute of Structural & Chemical Biology and Department of Chemistry, University of Leicester, UK
| | - Glenn A Burley
- Department of Pure and Applied Chemistry, University of Strathclyde, UK
| | - Ian C Eperon
- Leicester Institute of Structural & Chemical Biology and Department of Molecular & Cell Biology, University of Leicester, UK
| |
|
92
|
Souček P, Réblová K, Kramárek M, Radová L, Grymová T, Hujová P, Kováčová T, Lexa M, Grodecká L, Freiberger T. High-throughput analysis revealed mutations' diverging effects on SMN1 exon 7 splicing. RNA Biol 2019; 16:1364-1376. [PMID: 31213135 DOI: 10.1080/15476286.2019.1630796] [Citation(s) in RCA: 4] [Impact Index Per Article: 0.8] [Indexed: 12/23/2022] Open
Abstract
Splicing-affecting mutations can disrupt gene function by altering the transcript assembly. To ascertain splicing dysregulation principles, we modified a minigene assay for the parallel high-throughput evaluation of different mutations by next-generation sequencing. In our model system, all exonic and six intronic positions of the SMN1 gene's exon 7 were mutated to all possible nucleotide variants, which amounted to 180 unique single-nucleotide mutants and 470 double mutants. The mutations resulted in a wide range of splicing aberrations. Exonic splicing-affecting mutations resulted either in substantial exon skipping, supposedly driven by predicted exonic splicing silencers, or in the strengthening and use of cryptic or de novo donor splice sites (5'ss). On the other hand, a single disruption of an exonic splicing enhancer was not sufficient to cause major exon skipping, suggesting these elements can be substituted during exon recognition. While disrupting the acceptor splice site led only to exon skipping, some 5'ss mutations potentiated the use of three different cryptic 5'ss. Generally, single mutations supporting cryptic 5'ss use displayed better pre-mRNA/U1 snRNA duplex stability and increased splicing regulatory element strength across the original 5'ss. Analyzing double mutants supported the predominating splicing regulatory elements' effect, but U1 snRNA binding could contribute to the global balance of splicing isoforms. Based on these findings, we suggest that creating a new splicing enhancer across the mutated 5'ss can be one of the main factors driving cryptic 5'ss use.
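The count of 180 single-nucleotide mutants in the abstract above is what saturating 60 positions with the three alternative bases would give (60 × 3 = 180); with six intronic positions, that would imply 54 exonic positions, which is an inference from the abstract's numbers rather than a figure it states. A minimal sketch of such an enumeration follows; the function name `single_nucleotide_mutants` and the 60-nt placeholder sequence are illustrative assumptions, not the authors' code or the SMN1 sequence.

```python
BASES = "ACGT"

def single_nucleotide_mutants(seq):
    """Return every sequence differing from seq at exactly one position."""
    mutants = []
    for i, ref in enumerate(seq):
        for alt in BASES:
            if alt != ref:
                # Substitute one base, leave the rest of the window intact
                mutants.append(seq[:i] + alt + seq[i + 1:])
    return mutants

# Hypothetical 60-nt window (placeholder, NOT the real SMN1 exon 7 sequence)
window = "ACGT" * 15
mutants = single_nucleotide_mutants(window)
print(len(mutants))  # 60 positions x 3 alternative bases = 180
```

Each mutant differs from the reference at one position only, so the list contains no duplicates.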
Affiliation(s)
- Přemysl Souček
- Medical Genomics RG, Central European Institute of Technology, Masaryk University, Brno, Czech Republic; Molecular Genetics Laboratory, Centre for Cardiovascular Surgery and Transplantation, Brno, Czech Republic
| | - Kamila Réblová
- Medical Genomics RG, Central European Institute of Technology, Masaryk University, Brno, Czech Republic
| | - Michal Kramárek
- Molecular Genetics Laboratory, Centre for Cardiovascular Surgery and Transplantation, Brno, Czech Republic
| | - Lenka Radová
- Medical Genomics RG, Central European Institute of Technology, Masaryk University, Brno, Czech Republic
| | - Tereza Grymová
- Molecular Genetics Laboratory, Centre for Cardiovascular Surgery and Transplantation, Brno, Czech Republic
| | - Pavla Hujová
- Molecular Genetics Laboratory, Centre for Cardiovascular Surgery and Transplantation, Brno, Czech Republic
| | - Tatiana Kováčová
- Medical Genomics RG, Central European Institute of Technology, Masaryk University, Brno, Czech Republic
| | - Matej Lexa
- Faculty of Informatics, Masaryk University, Brno, Czech Republic
| | - Lucie Grodecká
- Molecular Genetics Laboratory, Centre for Cardiovascular Surgery and Transplantation, Brno, Czech Republic
| | - Tomáš Freiberger
- Medical Genomics RG, Central European Institute of Technology, Masaryk University, Brno, Czech Republic; Molecular Genetics Laboratory, Centre for Cardiovascular Surgery and Transplantation, Brno, Czech Republic; Faculty of Medicine, Masaryk University, Brno, Czech Republic
| |
|
93
|
Wang R, Wang Y, Hu Z. Using secondary structure to predict the effects of genetic variants on alternative splicing. Hum Mutat 2019; 40:1270-1279. [PMID: 31074545 DOI: 10.1002/humu.23790] [Citation(s) in RCA: 4] [Impact Index Per Article: 0.8] [Received: 01/08/2019] [Revised: 04/15/2019] [Accepted: 05/06/2019] [Indexed: 01/29/2023]
Abstract
Accurate interpretation of genomic variants that alter RNA splicing is critical to precision medicine. We present a computational framework, Prediction of variant Effect on Percent Spliced In (PEPSI), that predicts the splicing impact of coding and noncoding variants for the Fifth Critical Assessment of Genome Interpretation (CAGI5) "Vex-seq" challenge. PEPSI is a random forest regression model trained on multiple layers of features associated with sequence conservation and regulatory sequence elements. Compared to other splicing defect prediction tools from the literature, our framework integrates secondary structure information in predicting variants that disrupt splicing regulatory elements (SREs). We applied our model to classify splice-disrupting variants among 2,094 single-nucleotide polymorphisms from the Exome Aggregation Consortium using model-predicted changes in percent spliced in (ΔPSI) associated with tested variants. Benchmarking our model against widely used state-of-the-art tools, we demonstrate that PEPSI achieves comparable performance in terms of sensitivity and precision. Moreover, we also show that using secondary structure context can help resolve several cases where changes in the counts of SREs do not correspond with the directionality of ΔPSI measured for tested variants.
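The ΔPSI quantity mentioned in the abstract above can be illustrated with a short calculation. The sketch below uses a common read-count convention, PSI = inclusion reads / (inclusion + exclusion reads), expressed in percent; this is an assumption for illustration and not necessarily how PEPSI or the Vex-seq assay computes it. The function names and the read counts are hypothetical.

```python
def psi(inclusion_reads, exclusion_reads):
    """Percent spliced in: share of reads supporting exon inclusion (0-100)."""
    total = inclusion_reads + exclusion_reads
    if total == 0:
        raise ValueError("PSI is undefined without supporting reads")
    return 100.0 * inclusion_reads / total

def delta_psi(ref_counts, var_counts):
    """Change in PSI of the variant relative to the reference allele."""
    return psi(*var_counts) - psi(*ref_counts)

# Made-up (inclusion, exclusion) read counts, for illustration only
ref = (80, 20)  # reference construct: PSI = 80
var = (30, 70)  # variant construct:   PSI = 30
print(delta_psi(ref, var))  # negative value: the variant reduces exon inclusion
```

Under this convention a negative ΔPSI indicates reduced exon inclusion and a positive ΔPSI indicates increased inclusion, which is the directionality the abstract refers to.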
Affiliation(s)
- Robert Wang
- Department of Bioengineering, University of California, Berkeley, California; Department of Plant and Microbial Biology, University of California, Berkeley, California
| | - Yaqiong Wang
- Department of Plant and Microbial Biology, University of California, Berkeley, California
| | - Zhiqiang Hu
- Department of Plant and Microbial Biology, University of California, Berkeley, California
| |
|
94
|
Ptok J, Müller L, Theiss S, Schaal H. Context matters: Regulation of splice donor usage. Biochim Biophys Acta Gene Regul Mech 2019; 1862:194391. [PMID: 31202784 DOI: 10.1016/j.bbagrm.2019.06.002] [Citation(s) in RCA: 12] [Impact Index Per Article: 2.4] [Received: 04/26/2019] [Revised: 06/07/2019] [Accepted: 06/09/2019] [Indexed: 11/16/2022]
Abstract
Elaborate research on splicing, starting in the late seventies, evolved from the discovery that 5' splice sites are recognized by their complementarity to U1 snRNA towards the realization that RNA duplex formation cannot be the sole basis for 5'ss selection. Rather, their recognition is highly influenced by a number of context factors including transcript architecture as well as splicing regulatory elements (SREs) in the splice site neighborhood. In particular, proximal binding of splicing regulatory proteins highly influences splicing outcome. The importance of SRE integrity especially becomes evident in the light of human pathogenic mutations where single nucleotide changes in SREs can severely affect the resulting transcripts. Bioinformatics tools nowadays greatly assist in the computational evaluation of 5'ss, their neighborhood and the impact of pathogenic mutations. Although predictions are already quite robust, computational evaluation of the splicing regulatory landscape still faces challenges to increase future reliability. This article is part of a Special Issue entitled: RNA structure and splicing regulation edited by Francisco Baralle, Ravindra Singh and Stefan Stamm.
Affiliation(s)
- Johannes Ptok
- Institute of Virology, Medical Faculty, Heinrich Heine University Düsseldorf, D-40225 Düsseldorf, Germany
- Lisa Müller
- Institute of Virology, Medical Faculty, Heinrich Heine University Düsseldorf, D-40225 Düsseldorf, Germany
- Stephan Theiss
- Institute of Clinical Neuroscience and Medical Psychology, Medical Faculty, Heinrich Heine University Düsseldorf, D-40225 Düsseldorf, Germany
- Heiner Schaal
- Institute of Virology, Medical Faculty, Heinrich Heine University Düsseldorf, D-40225 Düsseldorf, Germany
|
95
|
Kinney JB, McCandlish DM. Massively Parallel Assays and Quantitative Sequence-Function Relationships. Annu Rev Genomics Hum Genet 2019; 20:99-127. [PMID: 31091417 DOI: 10.1146/annurev-genom-083118-014845] [Citation(s) in RCA: 76] [Impact Index Per Article: 15.2] [Indexed: 11/09/2022]
Abstract
Over the last decade, a rich variety of massively parallel assays have revolutionized our understanding of how biological sequences encode quantitative molecular phenotypes. These assays include deep mutational scanning, high-throughput SELEX, and massively parallel reporter assays. Here, we review these experimental methods and how the data they produce can be used to quantitatively model sequence-function relationships. In doing so, we touch on a diverse range of topics, including the identification of clinically relevant genomic variants, the modeling of transcription factor binding to DNA, the functional and evolutionary landscapes of proteins, and cis-regulatory mechanisms in both transcription and mRNA splicing. We further describe a unified conceptual framework and a core set of mathematical modeling strategies that studies in these diverse areas can make use of. Finally, we highlight key aspects of experimental design and mathematical modeling that are important for the results of such studies to be interpretable and reproducible.
Affiliation(s)
- Justin B Kinney
- Simons Center for Quantitative Biology, Cold Spring Harbor Laboratory, Cold Spring Harbor, New York 11724, USA
- David M McCandlish
- Simons Center for Quantitative Biology, Cold Spring Harbor Laboratory, Cold Spring Harbor, New York 11724, USA
|
96
|
Domingo J, Baeza-Centurion P, Lehner B. The Causes and Consequences of Genetic Interactions (Epistasis). Annu Rev Genomics Hum Genet 2019; 20:433-460. [PMID: 31082279 DOI: 10.1146/annurev-genom-083118-014857] [Citation(s) in RCA: 124] [Impact Index Per Article: 24.8] [Indexed: 11/09/2022]
Abstract
The same mutation can have different effects in different individuals. One important reason for this is that the outcome of a mutation can depend on the genetic context in which it occurs. This dependency is known as epistasis. In recent years, there has been a concerted effort to quantify the extent of pairwise and higher-order genetic interactions between mutations through deep mutagenesis of proteins and RNAs. This research has revealed two major components of epistasis: nonspecific genetic interactions caused by nonlinearities in genotype-to-phenotype maps, and specific interactions between particular mutations. Here, we provide an overview of our current understanding of the mechanisms causing epistasis at the molecular level, the consequences of genetic interactions for evolution and genetic prediction, and the applications of epistasis for understanding biology and determining macromolecular structures.
Affiliation(s)
- Júlia Domingo
- Systems Biology Program, Centre for Genomic Regulation, Barcelona Institute of Science and Technology, 08003 Barcelona, Spain
- Pablo Baeza-Centurion
- Systems Biology Program, Centre for Genomic Regulation, Barcelona Institute of Science and Technology, 08003 Barcelona, Spain
- Ben Lehner
- Systems Biology Program, Centre for Genomic Regulation, Barcelona Institute of Science and Technology, 08003 Barcelona, Spain; Universitat Pompeu Fabra, 08003 Barcelona, Spain; Institució Catalana de Recerca i Estudis Avançats (ICREA), 08010 Barcelona, Spain
|
97
|
Singh NN, Luo D, Singh RN. Pre-mRNA Splicing Modulation by Antisense Oligonucleotides. Methods Mol Biol 2019; 1828:415-437. [PMID: 30171557 DOI: 10.1007/978-1-4939-8651-4_26] [Citation(s) in RCA: 24] [Impact Index Per Article: 4.8] [Indexed: 12/17/2022]
Abstract
Pre-mRNA splicing, a dynamic process of intron removal and exon joining, is governed by a combinatorial control exerted by overlapping cis-elements that are unique to each exon and its flanking intronic sequences. Splicing cis-elements are usually 4-to-8-nucleotide-long linear motifs that provide binding sites for specific proteins. Pre-mRNA splicing is also influenced by secondary and higher order RNA structures that affect accessibility of splicing cis-elements. Antisense oligonucleotides (ASOs) that block splicing cis-elements and/or affect RNA structure have been shown to modulate splicing in vivo. Therefore, ASO-based strategies have emerged as a powerful tool for therapeutic manipulation of splicing in pathological conditions. Here we describe an ASO-based approach to increase the production of the full-length SMN2 mRNA in spinal muscular atrophy patient cells.
Affiliation(s)
- Natalia N Singh
- Department of Biomedical Sciences, College of Veterinary Medicine, Iowa State University, Ames, IA, USA.
- Diou Luo
- Department of Biomedical Sciences, College of Veterinary Medicine, Iowa State University, Ames, IA, USA
- Ravindra N Singh
- Department of Biomedical Sciences, College of Veterinary Medicine, Iowa State University, Ames, IA, USA
|
98
|
Cheng J, Nguyen TYD, Cygan KJ, Çelik MH, Fairbrother WG, Avsec Ž, Gagneur J. MMSplice: modular modeling improves the predictions of genetic variant effects on splicing. Genome Biol 2019; 20:48. [PMID: 30823901 PMCID: PMC6396468 DOI: 10.1186/s13059-019-1653-z] [Citation(s) in RCA: 119] [Impact Index Per Article: 23.8] [Received: 10/02/2018] [Accepted: 02/12/2019] [Indexed: 12/15/2022] Open
Abstract
Predicting the effects of genetic variants on splicing is highly relevant for human genetics. We describe the framework MMSplice (modular modeling of splicing) with which we built the winning model of the CAGI5 exon skipping prediction challenge. The MMSplice modules are neural networks scoring exon, intron, and splice sites, trained on distinct large-scale genomics datasets. These modules are combined to predict effects of variants on exon skipping, splice site choice, splicing efficiency, and pathogenicity, with matched or higher performance than state-of-the-art. Our models, available in the repository Kipoi, apply to variants including indels directly from VCF files.
Affiliation(s)
- Jun Cheng
- Department of Informatics, Technical University of Munich, Boltzmannstraße, Garching, 85748 Germany
- Graduate School of Quantitative Biosciences (QBM), Ludwig-Maximilians-Universität München, München, Germany
- Thi Yen Duong Nguyen
- Department of Informatics, Technical University of Munich, Boltzmannstraße, Garching, 85748 Germany
- Kamil J. Cygan
- Center for Computational Molecular Biology, Brown University, Providence, Rhode Island USA
- Department of Molecular Biology, Cell Biology and Biochemistry, Brown University, Providence, Rhode Island USA
- Muhammed Hasan Çelik
- Department of Informatics, Technical University of Munich, Boltzmannstraße, Garching, 85748 Germany
- William G. Fairbrother
- Center for Computational Molecular Biology, Brown University, Providence, Rhode Island USA
- Department of Molecular Biology, Cell Biology and Biochemistry, Brown University, Providence, Rhode Island USA
- Žiga Avsec
- Department of Informatics, Technical University of Munich, Boltzmannstraße, Garching, 85748 Germany
- Graduate School of Quantitative Biosciences (QBM), Ludwig-Maximilians-Universität München, München, Germany
- Julien Gagneur
- Department of Informatics, Technical University of Munich, Boltzmannstraße, Garching, 85748 Germany
|
99
|
Baeza-Centurion P, Miñana B, Schmiedel JM, Valcárcel J, Lehner B. Combinatorial Genetics Reveals a Scaling Law for the Effects of Mutations on Splicing. Cell 2019; 176:549-563.e23. [PMID: 30661752 DOI: 10.1016/j.cell.2018.12.010] [Citation(s) in RCA: 61] [Impact Index Per Article: 12.2] [Received: 02/26/2018] [Revised: 08/29/2018] [Accepted: 12/07/2018] [Indexed: 02/08/2023]
Abstract
Despite a wealth of molecular knowledge, quantitative laws for accurate prediction of biological phenomena remain rare. Alternative pre-mRNA splicing is an important regulated step in gene expression frequently perturbed in human disease. To understand the combined effects of mutations during evolution, we quantified the effects of all possible combinations of exonic mutations accumulated during the emergence of an alternatively spliced human exon. This revealed that mutation effects scale non-monotonically with the inclusion level of an exon, with each mutation having maximum effect at a predictable intermediate inclusion level. This scaling is observed genome-wide for cis and trans perturbations of splicing, including for natural and disease-associated variants. Mathematical modeling suggests that competition between alternative splice sites is sufficient to cause this non-linearity in the genotype-phenotype map. Combining the global scaling law with specific pairwise interactions between neighboring mutations allows accurate prediction of the effects of complex genotype changes involving >10 mutations.
Affiliation(s)
- Pablo Baeza-Centurion
- Systems Biology Program, Centre for Genomic Regulation (CRG), The Barcelona Institute of Science and Technology, Dr Aiguader 88, 08003 Barcelona, Spain
- Belén Miñana
- Gene Regulation, Stem Cells and Cancer Program, Centre for Genomic Regulation (CRG), The Barcelona Institute of Science and Technology, Dr Aiguader 88, 08003 Barcelona, Spain
- Jörn M Schmiedel
- Systems Biology Program, Centre for Genomic Regulation (CRG), The Barcelona Institute of Science and Technology, Dr Aiguader 88, 08003 Barcelona, Spain
- Juan Valcárcel
- Gene Regulation, Stem Cells and Cancer Program, Centre for Genomic Regulation (CRG), The Barcelona Institute of Science and Technology, Dr Aiguader 88, 08003 Barcelona, Spain; Universitat Pompeu Fabra (UPF), 08003 Barcelona, Spain; Institució Catalana de Recerca i Estudis Avançats (ICREA), Pg. Lluís Companys 23, 08010 Barcelona, Spain
- Ben Lehner
- Systems Biology Program, Centre for Genomic Regulation (CRG), The Barcelona Institute of Science and Technology, Dr Aiguader 88, 08003 Barcelona, Spain; Gene Regulation, Stem Cells and Cancer Program, Centre for Genomic Regulation (CRG), The Barcelona Institute of Science and Technology, Dr Aiguader 88, 08003 Barcelona, Spain; Institució Catalana de Recerca i Estudis Avançats (ICREA), Pg. Lluís Companys 23, 08010 Barcelona, Spain
|
100
|
Chong R, Insigne KD, Yao D, Burghard CP, Wang J, Hsiao YHE, Jones EM, Goodman DB, Xiao X, Kosuri S. A Multiplexed Assay for Exon Recognition Reveals that an Unappreciated Fraction of Rare Genetic Variants Cause Large-Effect Splicing Disruptions. Mol Cell 2019; 73:183-194.e8. [PMID: 30503770 PMCID: PMC6599603 DOI: 10.1016/j.molcel.2018.10.037] [Citation(s) in RCA: 60] [Impact Index Per Article: 12.0] [Received: 04/12/2018] [Revised: 07/19/2018] [Accepted: 10/23/2018] [Indexed: 11/23/2022]
Abstract
Mutations that lead to splicing defects can have severe consequences on gene function and cause disease. Here, we explore how human genetic variation affects exon recognition by developing a multiplexed functional assay of splicing using Sort-seq (MFASS). We assayed 27,733 variants in the Exome Aggregation Consortium (ExAC) within or adjacent to 2,198 human exons in the MFASS minigene reporter and found that 3.8% (1,050) of variants, most of which are extremely rare, led to large-effect splice-disrupting variants (SDVs). Importantly, we find that 83% of SDVs are located outside of canonical splice sites, are distributed evenly across distinct exonic and intronic regions, and are difficult to predict a priori. Our results indicate extant, rare genetic variants can have large functional effects on splicing at appreciable rates, even outside the context of disease, and MFASS enables their empirical assessment at scale.
Affiliation(s)
- Rockie Chong
- Department of Chemistry and Biochemistry, University of California, Los Angeles, Los Angeles, CA 90095, USA
- Kimberly D Insigne
- Bioinformatics Interdepartmental Graduate Program, University of California, Los Angeles, Los Angeles, CA 90095, USA
- David Yao
- Department of Genetics, Stanford University, Stanford, CA 94035, USA
- Christina P Burghard
- Bioinformatics Interdepartmental Graduate Program, University of California, Los Angeles, Los Angeles, CA 90095, USA
- Jeffrey Wang
- Department of Chemistry and Biochemistry, University of California, Los Angeles, Los Angeles, CA 90095, USA
- Yun-Hua E Hsiao
- Department of Bioengineering, University of California, Los Angeles, Los Angeles, CA 90095, USA
- Eric M Jones
- Department of Chemistry and Biochemistry, University of California, Los Angeles, Los Angeles, CA 90095, USA
- Daniel B Goodman
- Department of Microbiology and Immunology, University of California, San Francisco, San Francisco, CA 94143, USA
- Xinshu Xiao
- Bioinformatics Interdepartmental Graduate Program, University of California, Los Angeles, Los Angeles, CA 90095, USA; Department of Integrative Biology and Physiology, University of California, Los Angeles, Los Angeles, CA 90095, USA; Molecular Biology Institute, University of California, Los Angeles, Los Angeles, CA 90095, USA
- Sriram Kosuri
- Department of Chemistry and Biochemistry, University of California, Los Angeles, Los Angeles, CA 90095, USA; Molecular Biology Institute, University of California, Los Angeles, Los Angeles, CA 90095, USA; UCLA-DOE Institute for Genomics and Proteomics, Quantitative and Computational Biology Institute, Eli and Edythe Broad Center of Regenerative Medicine and Stem Cell Research, Jonsson Comprehensive Cancer Center, University of California, Los Angeles, Los Angeles, CA 90095, USA
|