1. Ballesio F, Pepe G, Ausiello G, Novelletto A, Helmer-Citterich M, Gherardini PF. Human lncRNAs harbor conserved modules embedded in different sequence contexts. Noncoding RNA Res 2024; 9:1257-1270. [PMID: 39040814 PMCID: PMC11261117 DOI: 10.1016/j.ncrna.2024.06.013] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 01/02/2024] [Revised: 06/11/2024] [Accepted: 06/19/2024] [Indexed: 07/24/2024]
Abstract
We analyzed the structure of human long non-coding RNA (lncRNA) genes to investigate whether the non-coding transcriptome is organized in modular domains, as is the case for protein-coding genes. To this end, we compared all known human lncRNA exons and identified 340 pairs of exons with high sequence and/or secondary structure similarity but embedded in a dissimilar sequence context. We grouped these pairs into 106 clusters based on their reciprocal similarities. These shared modules are highly conserved between humans and the four great ape species, display evidence of purifying selection and likely arose as a result of recent segmental duplications. Our analysis contributes to the understanding of the mechanisms driving the evolution of the non-coding genome and suggests additional strategies towards deciphering the functional complexity of this class of molecules.
Affiliation(s)
- Francesco Ballesio
  - PhD Program in Cellular and Molecular Biology, Department of Biology, University of Rome “Tor Vergata”, Rome, Italy
- Gerardo Pepe
  - Department of Biology, University of Rome “Tor Vergata”, Rome, Italy
- Gabriele Ausiello
  - Department of Biology, University of Rome “Tor Vergata”, Rome, Italy
- Andrea Novelletto
  - Department of Biology, University of Rome “Tor Vergata”, Rome, Italy
2. Recinos Y, Bao S, Wang X, Phillips BL, Yeh YT, Weyn-Vanhentenryck SM, Swanson MS, Zhang C. Lineage-specific splicing regulation of MAPT gene in the primate brain. Cell Genomics 2024; 4:100563. [PMID: 38772368 PMCID: PMC11228892 DOI: 10.1016/j.xgen.2024.100563] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 05/02/2022] [Revised: 01/22/2024] [Accepted: 04/23/2024] [Indexed: 05/23/2024]
Abstract
Divergence of precursor messenger RNA (pre-mRNA) alternative splicing (AS) is widespread in mammals, including primates, but the underlying mechanisms and functional impact are poorly understood. Here, we modeled cassette exon inclusion in primate brains as a quantitative trait and identified 1,170 (∼3%) exons with lineage-specific splicing shifts under stabilizing selection. Among them, microtubule-associated protein tau (MAPT) exons 2 and 10 underwent anticorrelated, two-step evolutionary shifts in the catarrhine and hominoid lineages, leading to their present inclusion levels in humans. The developmental-stage-specific divergence of exon 10 splicing, whose dysregulation can cause frontotemporal lobar degeneration (FTLD), is mediated by divergent distal intronic MBNL-binding sites. Competitive binding of these sites by CRISPR-dCas13d/gRNAs effectively reduces exon 10 inclusion, potentially providing a therapeutically compatible approach to modulate tau isoform expression. Our data suggest adaptation of MAPT function and, more generally, a role for AS in the evolutionary expansion of the primate brain.
Affiliation(s)
- Yocelyn Recinos
  - Department of Systems Biology, Columbia University, New York, NY 10032, USA; Department of Biochemistry and Molecular Biophysics, Columbia University, New York, NY 10032, USA
- Suying Bao
  - Department of Systems Biology, Columbia University, New York, NY 10032, USA; Department of Biochemistry and Molecular Biophysics, Columbia University, New York, NY 10032, USA
- Xiaojian Wang
  - Department of Systems Biology, Columbia University, New York, NY 10032, USA; Department of Biochemistry and Molecular Biophysics, Columbia University, New York, NY 10032, USA
- Brittany L Phillips
  - Department of Systems Biology, Columbia University, New York, NY 10032, USA; Department of Biochemistry and Molecular Biophysics, Columbia University, New York, NY 10032, USA
- Yow-Tyng Yeh
  - Department of Systems Biology, Columbia University, New York, NY 10032, USA; Department of Biochemistry and Molecular Biophysics, Columbia University, New York, NY 10032, USA
- Sebastien M Weyn-Vanhentenryck
  - Department of Systems Biology, Columbia University, New York, NY 10032, USA; Department of Biochemistry and Molecular Biophysics, Columbia University, New York, NY 10032, USA
- Maurice S Swanson
  - Department of Molecular Genetics and Microbiology, University of Florida, College of Medicine, Gainesville, FL 32610, USA; Center for NeuroGenetics and the Genetics Institute, University of Florida, College of Medicine, Gainesville, FL 32610, USA
- Chaolin Zhang
  - Department of Systems Biology, Columbia University, New York, NY 10032, USA; Department of Biochemistry and Molecular Biophysics, Columbia University, New York, NY 10032, USA.
3. Nutter CA, Kidd BM, Carter HA, Hamel JI, Mackie PM, Kumbkarni N, Davenport ML, Tuyn DM, Gopinath A, Creigh PD, Sznajder ŁJ, Wang ET, Ranum LPW, Khoshbouei H, Day JW, Sampson JB, Prokop S, Swanson MS. Choroid plexus mis-splicing and altered cerebrospinal fluid composition in myotonic dystrophy type 1. Brain 2023; 146:4217-4232. [PMID: 37143315 PMCID: PMC10545633 DOI: 10.1093/brain/awad148] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 01/23/2023] [Revised: 04/08/2023] [Accepted: 04/18/2023] [Indexed: 05/06/2023]
Abstract
Myotonic dystrophy type 1 is a dominantly inherited multisystemic disease caused by CTG tandem repeat expansions in the DMPK 3' untranslated region. These expanded repeats are transcribed and produce toxic CUG RNAs that sequester and inhibit activities of the MBNL family of developmental RNA processing factors. Although myotonic dystrophy is classified as a muscular dystrophy, the brain is also severely affected by an unusual cohort of symptoms, including hypersomnia and executive dysfunction, as well as early onset of tau/MAPT pathology and cerebral atrophy. To address the molecular and cellular events that lead to these pathological outcomes, we recently generated a mouse Dmpk CTG expansion knock-in model and identified choroid plexus epithelial cells as particularly affected by the expression of toxic CUG expansion RNAs. To determine if toxic CUG RNAs perturb choroid plexus functions, alternative splicing analysis was performed on lateral and hindbrain choroid plexi from Dmpk CTG knock-in mice. Choroid plexus transcriptome-wide changes were evaluated in Mbnl2 knockout mice, a developmental-onset model of myotonic dystrophy brain dysfunction. To determine if transcriptome changes also occurred in the human disease, we obtained post-mortem choroid plexus for RNA-seq from neurologically unaffected (two females, three males; ages 50-70 years) and myotonic dystrophy type 1 (one female, three males; ages 50-70 years) donors. To test whether choroid plexus transcriptome alterations resulted in altered CSF composition, we obtained CSF via lumbar puncture from patients with myotonic dystrophy type 1 (five females, five males; ages 35-55 years) and non-myotonic dystrophy patients (three females, four males; ages 26-51 years), and western blot and osmolarity analyses were used to test CSF alterations predicted by choroid plexus transcriptome analysis.
We determined that CUG RNA-induced toxicity was more robust in the lateral choroid plexus of Dmpk CTG knock-in mice due to comparatively higher Dmpk and lower Mbnl RNA levels. Impaired transitions to adult splicing patterns during choroid plexus development were identified in Mbnl2 knockout mice, including mis-splicing previously found in Dmpk CTG knock-in mice. Whole transcriptome analysis of myotonic dystrophy type 1 choroid plexus revealed disease-associated RNA expression and mis-splicing events. Based on these RNA changes, predicted alterations in ion homeostasis, secretory output and CSF composition were confirmed by analysis of myotonic dystrophy type 1 CSF. Our results implicate choroid plexus spliceopathy and concomitant alterations in CSF homeostasis as an unappreciated contributor to myotonic dystrophy type 1 CNS pathogenesis.
Affiliation(s)
- Curtis A Nutter
  - Department of Molecular Genetics and Microbiology, Center for NeuroGenetics and the Genetics Institute, University of Florida, College of Medicine, Gainesville, FL 32610, USA
- Benjamin M Kidd
  - Department of Molecular Genetics and Microbiology, Center for NeuroGenetics and the Genetics Institute, University of Florida, College of Medicine, Gainesville, FL 32610, USA
- Helmut A Carter
  - Department of Molecular Genetics and Microbiology, Center for NeuroGenetics and the Genetics Institute, University of Florida, College of Medicine, Gainesville, FL 32610, USA
- Johanna I Hamel
  - Department of Neurology, University of Rochester, Rochester, NY 14642, USA
- Philip M Mackie
  - Department of Neuroscience, McKnight Brain Institute, College of Medicine, University of Florida, Gainesville, FL 32610, USA
- Nayha Kumbkarni
  - Department of Molecular Genetics and Microbiology, Center for NeuroGenetics and the Genetics Institute, University of Florida, College of Medicine, Gainesville, FL 32610, USA
- Mackenzie L Davenport
  - Department of Molecular Genetics and Microbiology, Center for NeuroGenetics and the Genetics Institute, University of Florida, College of Medicine, Gainesville, FL 32610, USA
- Dana M Tuyn
  - Department of Molecular Genetics and Microbiology, Center for NeuroGenetics and the Genetics Institute, University of Florida, College of Medicine, Gainesville, FL 32610, USA
- Adithya Gopinath
  - Department of Neuroscience, McKnight Brain Institute, College of Medicine, University of Florida, Gainesville, FL 32610, USA
- Peter D Creigh
  - Department of Neurology, University of Rochester, Rochester, NY 14642, USA
- Łukasz J Sznajder
  - Department of Molecular Genetics and Microbiology, Center for NeuroGenetics and the Genetics Institute, University of Florida, College of Medicine, Gainesville, FL 32610, USA
- Eric T Wang
  - Department of Molecular Genetics and Microbiology, Center for NeuroGenetics and the Genetics Institute, University of Florida, College of Medicine, Gainesville, FL 32610, USA
- Laura P W Ranum
  - Department of Molecular Genetics and Microbiology, Center for NeuroGenetics and the Genetics Institute, McKnight Brain Institute and the Fixel Institute for Neurological Diseases, College of Medicine, University of Florida, Gainesville, FL 32610, USA
- Habibeh Khoshbouei
  - Department of Neuroscience, McKnight Brain Institute, College of Medicine, University of Florida, Gainesville, FL 32610, USA
- John W Day
  - Department of Neurology and Neurological Sciences, Stanford University School of Medicine, Stanford, CA 94304, USA
- Jacinda B Sampson
  - Department of Neurology and Neurological Sciences, Stanford University School of Medicine, Stanford, CA 94304, USA
- Stefan Prokop
  - Department of Pathology, Immunology, and Laboratory Medicine, Center for Translational Research in Neurodegenerative Disease, McKnight Brain Institute and the Fixel Institute for Neurological Diseases, College of Medicine, University of Florida, Gainesville, FL 32610, USA
- Maurice S Swanson
  - Department of Molecular Genetics and Microbiology, Center for NeuroGenetics and the Genetics Institute, University of Florida, College of Medicine, Gainesville, FL 32610, USA
4. Salamon I, Park Y, Miškić T, Kopić J, Matteson P, Page NF, Roque A, McAuliffe GW, Favate J, Garcia-Forn M, Shah P, Judaš M, Millonig JH, Kostović I, De Rubeis S, Hart RP, Krsnik Ž, Rasin MR. Celf4 controls mRNA translation underlying synaptic development in the prenatal mammalian neocortex. Nat Commun 2023; 14:6025. [PMID: 37758766 PMCID: PMC10533865 DOI: 10.1038/s41467-023-41730-8] [Citation(s) in RCA: 4] [Impact Index Per Article: 4.0] [Received: 11/15/2022] [Accepted: 09/18/2023] [Indexed: 09/29/2023]
Abstract
Abnormalities in neocortical and synaptic development are linked to neurodevelopmental disorders. However, the molecular and cellular mechanisms governing initial synapse formation in the prenatal neocortex remain poorly understood. Using polysome profiling coupled with snRNAseq on human cortical samples at various fetal phases, we identify human mRNAs, including those encoding synaptic proteins, with finely controlled translation in distinct cell populations of developing frontal neocortices. Examination of murine and human neocortex reveals that the RNA binding protein and translational regulator, CELF4, is expressed in compartments enriched in initial synaptogenesis: the marginal zone and the subplate. We also find that Celf4/CELF4-target mRNAs are encoded by risk genes for adverse neurodevelopmental outcomes translating into synaptic proteins. Surprisingly, deleting Celf4 in the forebrain disrupts the balance of subplate synapses in a sex-specific fashion. This highlights the significance of RNA binding proteins and mRNA translation in evolutionarily advanced synaptic development, potentially contributing to sex differences.
Affiliation(s)
- Iva Salamon
  - Department of Neuroscience and Cell Biology, Rutgers University, Robert Wood Johnson Medical School, Piscataway, NJ, 08854, USA
  - Rutgers University, School of Graduate Studies, New Brunswick, NJ, 08854, USA
- Yongkyu Park
  - Department of Neuroscience and Cell Biology, Rutgers University, Robert Wood Johnson Medical School, Piscataway, NJ, 08854, USA
- Terezija Miškić
  - Croatian Institute for Brain Research, Center of Research Excellence for Basic, Clinical and Translational Neuroscience, University of Zagreb, School of Medicine, Zagreb, 10000, Croatia
- Janja Kopić
  - Croatian Institute for Brain Research, Center of Research Excellence for Basic, Clinical and Translational Neuroscience, University of Zagreb, School of Medicine, Zagreb, 10000, Croatia
- Paul Matteson
  - Center for Advanced Biotechnology and Medicine, Department of Neuroscience and Cell Biology, Rutgers Robert Wood Johnson Medical School, Piscataway, NJ, USA
- Nicholas F Page
  - Department of Neuroscience and Cell Biology, Rutgers University, Robert Wood Johnson Medical School, Piscataway, NJ, 08854, USA
- Alfonso Roque
  - Department of Neuroscience and Cell Biology, Rutgers University, Robert Wood Johnson Medical School, Piscataway, NJ, 08854, USA
- Geoffrey W McAuliffe
  - Department of Neuroscience and Cell Biology, Rutgers University, Robert Wood Johnson Medical School, Piscataway, NJ, 08854, USA
- John Favate
  - Department of Genetics, Rutgers University, Piscataway, NJ, 08854, USA
- Marta Garcia-Forn
  - Seaver Autism Center for Research and Treatment, Icahn School of Medicine at Mount Sinai, New York, NY, 10029, USA
  - Department of Psychiatry, Icahn School of Medicine at Mount Sinai, New York, NY, 10029, USA
  - Friedman Brain Institute, Icahn School of Medicine at Mount Sinai, New York, NY, 10029, USA
  - Mindich Child Health and Development Institute, Icahn School of Medicine at Mount Sinai, New York, NY, 10029, USA
  - The Alper Center for Neural Development and Regeneration, Icahn School of Medicine at Mount Sinai, New York, NY, 10029, USA
- Premal Shah
  - Department of Genetics, Rutgers University, Piscataway, NJ, 08854, USA
- Miloš Judaš
  - Croatian Institute for Brain Research, Center of Research Excellence for Basic, Clinical and Translational Neuroscience, University of Zagreb, School of Medicine, Zagreb, 10000, Croatia
- James H Millonig
  - Center for Advanced Biotechnology and Medicine, Department of Neuroscience and Cell Biology, Rutgers Robert Wood Johnson Medical School, Piscataway, NJ, USA
- Ivica Kostović
  - Croatian Institute for Brain Research, Center of Research Excellence for Basic, Clinical and Translational Neuroscience, University of Zagreb, School of Medicine, Zagreb, 10000, Croatia
- Silvia De Rubeis
  - Seaver Autism Center for Research and Treatment, Icahn School of Medicine at Mount Sinai, New York, NY, 10029, USA
  - Department of Psychiatry, Icahn School of Medicine at Mount Sinai, New York, NY, 10029, USA
  - Friedman Brain Institute, Icahn School of Medicine at Mount Sinai, New York, NY, 10029, USA
  - Mindich Child Health and Development Institute, Icahn School of Medicine at Mount Sinai, New York, NY, 10029, USA
  - The Alper Center for Neural Development and Regeneration, Icahn School of Medicine at Mount Sinai, New York, NY, 10029, USA
- Ronald P Hart
  - Department of Cell Biology and Neuroscience, Rutgers, The State University of New Jersey, Piscataway, NJ, 08854, USA
- Željka Krsnik
  - Croatian Institute for Brain Research, Center of Research Excellence for Basic, Clinical and Translational Neuroscience, University of Zagreb, School of Medicine, Zagreb, 10000, Croatia.
- Mladen-Roko Rasin
  - Department of Neuroscience and Cell Biology, Rutgers University, Robert Wood Johnson Medical School, Piscataway, NJ, 08854, USA.
5. Ma H, Wen H, Xue Z, Li G, Zhang Z. RNANetMotif: Identifying sequence-structure RNA network motifs in RNA-protein binding sites. PLoS Comput Biol 2022; 18:e1010293. [PMID: 35819951 PMCID: PMC9275694 DOI: 10.1371/journal.pcbi.1010293] [Citation(s) in RCA: 5] [Impact Index Per Article: 2.5] [Received: 09/16/2021] [Accepted: 06/09/2022] [Indexed: 11/19/2022]
Abstract
RNA molecules can adopt stable secondary and tertiary structures, which are essential in mediating physical interactions with other partners such as RNA binding proteins (RBPs) and in carrying out their cellular functions. In vitro and in vivo experiments such as RNAcompete and eCLIP have revealed in vitro binding preferences of RBPs to RNA oligomers and in vivo binding sites in cells. Analysis of these binding data showed that the structural properties of the RNAs in these binding sites are important determinants of the binding events; however, it has been a challenge to incorporate the structure information into an interpretable model. Here we describe a new approach, RNANetMotif, which takes the predicted secondary structures of thousands of RNA sequences bound by an RBP as input and uses a graph theory approach to recognize enriched subgraphs. These enriched subgraphs are in essence shared sequence-structure elements that are important in RBP-RNA binding. To validate our approach, we performed RNA structure modeling via coarse-grained molecular dynamics folding simulations for four selected RBPs, and RNA-protein docking for LIN28B. The simulation results, e.g., solvent accessibility and energetics, further support the biological relevance of the discovered network subgraphs.
RNA binding proteins (RBPs) regulate every aspect of RNA biology, including splicing, translation, transportation, and degradation. High-throughput technologies such as eCLIP have identified thousands of binding sites for a given RBP throughout the genome. It has been shown by earlier studies that, in addition to nucleotide sequences, the structure and conformation of RNAs also play an important role in RBP-RNA interactions. Analogous to protein-protein interactions or protein-DNA interactions, it is likely that there exist intrinsic sequence-structure motifs common to these RNAs that underlie their binding specificity to specific RBPs. It is known that RNAs form energetically favorable secondary structures, which can be represented as graphs, with nucleotides being nodes and backbone covalent bonds and base-pairing hydrogen bonds representing edges. We hypothesize that these graphs can be mined by graph theory approaches to identify sequence-structure motifs as enriched subgraphs. In this article, we describe the details of this approach, termed RNANetMotif, and the associated new concepts, namely EKS (Extended K-mer Subgraph) and the GraphK graph algorithm. To test the utility of our approach, we conducted 3D structure modeling of selected RNA sequences through molecular dynamics (MD) folding simulation and evaluated the significance of the discovered RNA motifs by comparing their spatial exposure with other regions on the RNA. We believe that this approach has the novelty of treating the RNA sequence as a graph and RBP binding sites as enriched subgraphs, which has broader applications beyond RBP-RNA interactions.
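The graph representation described in this abstract (nucleotides as nodes, backbone bonds and base pairs as edges) can be illustrated with a minimal Python sketch. It assumes the secondary structure is given in dot-bracket notation without pseudoknots; the function name `dot_bracket_to_graph` and the toy hairpin are hypothetical, and this is not the RNANetMotif implementation, nor does it cover the EKS or GraphK concepts.

```python
def dot_bracket_to_graph(sequence, structure):
    """Build a simple graph from an RNA sequence and its dot-bracket structure.

    Returns (nodes, edges), where nodes maps position -> nucleotide and
    edges is a list of (i, j, kind) with kind "backbone" or "basepair".
    Illustrative only: pseudoknot brackets such as [] or {} are not handled.
    """
    if len(sequence) != len(structure):
        raise ValueError("sequence and structure must have the same length")

    nodes = {i: base for i, base in enumerate(sequence)}

    # Backbone edges connect each nucleotide to its 3' neighbour.
    edges = [(i, i + 1, "backbone") for i in range(len(sequence) - 1)]

    # Base-pair edges: match each ")" with the most recent unmatched "(".
    stack = []
    for i, symbol in enumerate(structure):
        if symbol == "(":
            stack.append(i)
        elif symbol == ")":
            if not stack:
                raise ValueError("unbalanced structure: unexpected ')'")
            edges.append((stack.pop(), i, "basepair"))
    if stack:
        raise ValueError("unbalanced structure: unmatched '('")

    return nodes, edges


# Toy hairpin (made-up sequence): three base pairs closing a three-nucleotide loop.
nodes, edges = dot_bracket_to_graph("GGGAAACCC", "(((...)))")
base_pairs = sorted((i, j) for i, j, kind in edges if kind == "basepair")
```

For the toy hairpin, `base_pairs` holds the three stem pairs and `edges` also contains the eight backbone links. A subgraph-mining step would then operate on graphs of this kind.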
Affiliation(s)
- Hongli Ma
  - Research Center for Mathematics and Interdisciplinary Sciences, Shandong University, Qingdao, China
  - Terrence Donnelly Centre for Cellular and Biomolecular Research, University of Toronto, Toronto, Ontario, Canada
  - Department of Molecular Genetics, University of Toronto, Toronto, Ontario, Canada
  - School of Mathematics, Shandong University, Jinan, China
- Han Wen
  - Terrence Donnelly Centre for Cellular and Biomolecular Research, University of Toronto, Toronto, Ontario, Canada
- Zhiyuan Xue
  - West China Biomedical Big Data Center, West China Hospital, Sichuan University, Chengdu, China
- Guojun Li
  - Research Center for Mathematics and Interdisciplinary Sciences, Shandong University, Qingdao, China
  - School of Mathematics, Shandong University, Jinan, China
  - School of Mathematical Science, Liaocheng University, Liaocheng, China
- Zhaolei Zhang
  - Terrence Donnelly Centre for Cellular and Biomolecular Research, University of Toronto, Toronto, Ontario, Canada
  - Department of Molecular Genetics, University of Toronto, Toronto, Ontario, Canada
  - Department of Computer Science, University of Toronto, Toronto, Ontario, Canada
6. Cui F, Zhang Z, Cao C, Zou Q, Chen D, Su X. Protein-DNA/RNA interactions: Machine intelligence tools and approaches in the era of artificial intelligence and big data. Proteomics 2022; 22:e2100197. [PMID: 35112474 DOI: 10.1002/pmic.202100197] [Citation(s) in RCA: 19] [Impact Index Per Article: 9.5] [Received: 11/30/2021] [Revised: 01/02/2022] [Accepted: 01/17/2022] [Indexed: 11/09/2022]
Abstract
With the development of artificial intelligence technologies and the availability of large amounts of biological data, computational methods for proteomics have undergone a developmental process from traditional machine learning to deep learning. This review focuses on computational approaches and tools for the prediction of protein-DNA/RNA interactions using machine intelligence techniques. We provide an overview of the development progress of computational methods and summarize the advantages and shortcomings of these methods. We further compiled applications in tasks related to the protein-DNA/RNA interactions, and pointed out possible future application trends. Moreover, biological sequence-digitizing representation strategies used in different types of computational methods are also summarized and discussed.
Affiliation(s)
- Feifei Cui
  - Institute of Fundamental and Frontier Sciences, University of Electronic Science and Technology of China, Chengdu, 610054, China
  - Yangtze Delta Region Institute (Quzhou), University of Electronic Science and Technology of China, Quzhou, 324000, China
- Zilong Zhang
  - Institute of Fundamental and Frontier Sciences, University of Electronic Science and Technology of China, Chengdu, 610054, China
  - Yangtze Delta Region Institute (Quzhou), University of Electronic Science and Technology of China, Quzhou, 324000, China
- Chen Cao
  - Yangtze Delta Region Institute (Quzhou), University of Electronic Science and Technology of China, Quzhou, 324000, China
- Quan Zou
  - Institute of Fundamental and Frontier Sciences, University of Electronic Science and Technology of China, Chengdu, 610054, China
  - Yangtze Delta Region Institute (Quzhou), University of Electronic Science and Technology of China, Quzhou, 324000, China
- Dong Chen
  - College of Electrical and Information Engineering, Quzhou University, Quzhou, 324000, China
- Xi Su
  - Foshan Maternal and Child Health Hospital, Foshan, Guangdong, China
7. Korn SM, Ulshöfer CJ, Schneider T, Schlundt A. Structures and target RNA preferences of the RNA-binding protein family of IGF2BPs: An overview. Structure 2021; 29:787-803. [PMID: 34022128 DOI: 10.1016/j.str.2021.05.001] [Citation(s) in RCA: 18] [Impact Index Per Article: 6.0] [Received: 12/23/2020] [Revised: 03/12/2021] [Accepted: 04/30/2021] [Indexed: 02/08/2023]
Abstract
Insulin-like growth factor 2 mRNA-binding proteins (IMPs, IGF2BPs) act in mRNA transport and translational control but are also oncofetal tumor marker proteins. The IMP protein family represents a number of bona fide multi-domain RNA-binding proteins with up to six RNA-binding domains, resulting in a high complexity of possible modes of interactions with target mRNAs. Their exact mechanism in stability control of oncogenic mRNAs is only partially understood. Recent work from our and other laboratories has significantly advanced the understanding of IMP protein specificities, both toward RNA engagement and between each other, with NMR and crystal structures serving as the basis for systematic biochemical and functional investigations. We here summarize the known structural and biochemical information about IMP RNA-binding domains and their RNA preferences. The article also touches on the respective roles of RNA secondary and protein tertiary structures for specific RNA-protein complexes, including the limited knowledge about IMPs' protein-protein interactions, which are often RNA mediated.
Affiliation(s)
- Sophie Marianne Korn
  - Institute for Molecular Biosciences and Center for Biomolecular Magnetic Resonance (BMRZ), Goethe-University Frankfurt, Max-von-Laue-Str. 9, 60438 Frankfurt, Germany
- Corinna Jessica Ulshöfer
  - Institute of Biochemistry, Justus-Liebig-University of Giessen, Heinrich-Buff-Ring 17, 35392 Giessen, Germany
- Tim Schneider
  - Institute of Biochemistry, Justus-Liebig-University of Giessen, Heinrich-Buff-Ring 17, 35392 Giessen, Germany
- Andreas Schlundt
  - Institute for Molecular Biosciences and Center for Biomolecular Magnetic Resonance (BMRZ), Goethe-University Frankfurt, Max-von-Laue-Str. 9, 60438 Frankfurt, Germany.
8. Differences in splicing defects between the grey and white matter in myotonic dystrophy type 1 patients. PLoS One 2020; 15:e0224912. [PMID: 32407311 PMCID: PMC7224547 DOI: 10.1371/journal.pone.0224912] [Citation(s) in RCA: 9] [Impact Index Per Article: 2.3] [Received: 10/23/2019] [Accepted: 04/24/2020] [Indexed: 12/11/2022]
Abstract
Myotonic dystrophy type 1 (DM1) is a multi-system disorder caused by expanded CTG repeats in the myotonic dystrophy protein kinase (DMPK) gene. This leads to the sequestration of splicing factors such as muscleblind-like 1/2 (MBNL1/2) and aberrant splicing in the central nervous system. We investigated the splicing patterns of MBNL1/2 and genes controlled by MBNL2 in several regions of the brain and between the grey matter (GM) and white matter (WM) in DM1 patients using RT-PCR. Compared with amyotrophic lateral sclerosis (ALS, as disease controls), the percent spliced-in (PSI) values for most of the examined exons were significantly altered in most of the brain regions of DM1 patients, except for the cerebellum. The splicing of many genes was differently regulated between the GM and WM in both DM1 and ALS. In 7 out of the 15 examined splicing events, the level of PSI change between DM1 and ALS was significantly higher in the GM than in the WM. The differences in alternative splicing between the GM and WM may be related to the effect of DM1 on the WM of the brain.
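As a reminder of how the PSI metric mentioned in this abstract is usually defined, the following Python sketch computes PSI as the inclusion signal divided by the total (inclusion plus exclusion) signal, and the PSI change between two conditions. The function names and the input values are hypothetical placeholders, not data from the study, and the abstract does not state the exact quantification procedure the authors applied to their RT-PCR products.

```python
def percent_spliced_in(inclusion_signal, exclusion_signal):
    """Percent spliced-in (PSI) for one exon, on a 0-100 scale.

    The signals can be read counts or band intensities; here they are
    plain numbers supplied by the caller.
    """
    total = inclusion_signal + exclusion_signal
    if total == 0:
        raise ValueError("no signal for either isoform")
    return 100.0 * inclusion_signal / total


def delta_psi(psi_case, psi_control):
    """Change in PSI between a case group and a control group."""
    return psi_case - psi_control


# Made-up example values, for illustration only.
psi_dm1 = percent_spliced_in(30, 70)   # exon included in 30% of transcripts
psi_als = percent_spliced_in(75, 25)   # exon included in 75% of transcripts
change = delta_psi(psi_dm1, psi_als)   # negative value: less inclusion in the case group
```

Comparing the absolute size of such a change between grey and white matter samples is the kind of contrast the abstract reports for 7 of the 15 examined events.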
9. Carazo F, Romero JP, Rubio A. Upstream analysis of alternative splicing: a review of computational approaches to predict context-dependent splicing factors. Brief Bioinform 2020; 20:1358-1375. [PMID: 29390045 DOI: 10.1093/bib/bby005] [Citation(s) in RCA: 21] [Impact Index Per Article: 5.3] [Received: 10/24/2017] [Revised: 12/14/2017] [Indexed: 12/13/2022]
Abstract
Alternative splicing (AS) has been shown to play a pivotal role in the development of diseases, including cancer. Specifically, all the hallmarks of cancer (angiogenesis, cell immortality, avoiding immune system response, etc.) are found to have a counterpart in aberrant splicing of key genes. Identifying the context-specific regulators of splicing provides valuable information to find new biomarkers, as well as to define alternative therapeutic strategies. The computational models to identify these regulators are not trivial and require three conceptual steps: the detection of AS events, the identification of splicing factors that potentially regulate these events and the contextualization of these pieces of information for a specific experiment. In this work, we review the different algorithmic methodologies developed for each of these tasks. The main weaknesses and strengths of the different steps of the pipeline are discussed. Finally, a case study is detailed to help the reader be aware of the potential and limitations of this computational approach.
10. Sznajder ŁJ, Swanson MS. Short Tandem Repeat Expansions and RNA-Mediated Pathogenesis in Myotonic Dystrophy. Int J Mol Sci 2019; 20:3365. [PMID: 31323950 PMCID: PMC6651174 DOI: 10.3390/ijms20133365] [Citation(s) in RCA: 52] [Impact Index Per Article: 10.4] [Received: 06/07/2019] [Revised: 06/27/2019] [Accepted: 07/08/2019] [Indexed: 12/23/2022]
Abstract
Short tandem repeat (STR), or microsatellite, expansions underlie more than 50 hereditary neurological, neuromuscular and other diseases, including myotonic dystrophy types 1 (DM1) and 2 (DM2). Current disease models for DM1 and DM2 propose a common pathomechanism, whereby the transcription of mutant DMPK (DM1) and CNBP (DM2) genes results in the synthesis of CUG and CCUG repeat expansion (CUGexp, CCUGexp) RNAs, respectively. These CUGexp and CCUGexp RNAs are toxic since they promote the assembly of ribonucleoprotein (RNP) complexes or RNA foci, leading to sequestration of Muscleblind-like (MBNL) proteins in the nucleus and global dysregulation of the processing, localization and stability of MBNL target RNAs. STR expansion RNAs also form phase-separated gel-like droplets both in vitro and in transiently transfected cells, implicating RNA-RNA multivalent interactions as drivers of RNA foci formation. Importantly, the nucleation and growth of these nuclear foci and transcript misprocessing are reversible processes and thus amenable to therapeutic intervention. In this review, we provide an overview of potential DM1 and DM2 pathomechanisms, followed by a discussion of MBNL functions in RNA processing and how multivalent interactions between expanded STR RNAs and RNA-binding proteins (RBPs) promote RNA foci assembly.
Collapse
Affiliation(s)
- Łukasz J Sznajder
- Department of Molecular Genetics and Microbiology, Center for NeuroGenetics and the Genetics Institute, University of Florida, College of Medicine, Gainesville, FL 32610, USA.
| | - Maurice S Swanson
- Department of Molecular Genetics and Microbiology, Center for NeuroGenetics and the Genetics Institute, University of Florida, College of Medicine, Gainesville, FL 32610, USA
|
11
|
Feng H, Bao S, Rahman MA, Weyn-Vanhentenryck SM, Khan A, Wong J, Shah A, Flynn ED, Krainer AR, Zhang C. Modeling RNA-Binding Protein Specificity In Vivo by Precisely Registering Protein-RNA Crosslink Sites. Mol Cell 2019; 74:1189-1204.e6. [PMID: 31226278 PMCID: PMC6676488 DOI: 10.1016/j.molcel.2019.02.002] [Citation(s) in RCA: 52] [Impact Index Per Article: 10.4] [Received: 09/25/2018] [Revised: 01/14/2019] [Accepted: 01/31/2019] [Indexed: 12/30/2022]
Abstract
RNA-binding proteins (RBPs) regulate post-transcriptional gene expression by recognizing short and degenerate sequence motifs in their target transcripts, but precisely defining their binding specificity remains challenging. Crosslinking and immunoprecipitation (CLIP) allows for mapping of the exact protein-RNA crosslink sites, which frequently reside at specific positions in RBP motifs at single-nucleotide resolution. Here, we have developed a computational method, named mCross, to jointly model RBP binding specificity while precisely registering the crosslinking position in motif sites. We applied mCross to 112 RBPs using ENCODE eCLIP data and validated the reliability of the discovered motifs by genome-wide analysis of allelic binding sites. Our analyses revealed that the prototypical SR protein SRSF1 recognizes clusters of GGA half-sites in addition to its canonical GGAGGA motif. Therefore, SRSF1 regulates splicing of a much larger repertoire of transcripts than previously appreciated, including HNRNPD and HNRNPDL, which are involved in multivalent protein assemblies and phase separation.
Affiliation(s)
- Huijuan Feng
- Department of Systems Biology, Columbia University, New York, NY 10032, USA; Department of Biochemistry and Molecular Biophysics, Columbia University, New York, NY 10032, USA; Center for Motor Neuron Biology and Disease, Columbia University, New York, NY 10032, USA
| | - Suying Bao
- Department of Systems Biology, Columbia University, New York, NY 10032, USA; Department of Biochemistry and Molecular Biophysics, Columbia University, New York, NY 10032, USA; Center for Motor Neuron Biology and Disease, Columbia University, New York, NY 10032, USA
| | | | - Sebastien M Weyn-Vanhentenryck
- Department of Systems Biology, Columbia University, New York, NY 10032, USA; Department of Biochemistry and Molecular Biophysics, Columbia University, New York, NY 10032, USA; Center for Motor Neuron Biology and Disease, Columbia University, New York, NY 10032, USA
| | - Aziz Khan
- Centre for Molecular Medicine Norway (NCMM), Nordic EMBL Partnership, University of Oslo, 0318 Oslo, Norway
| | - Justin Wong
- Department of Systems Biology, Columbia University, New York, NY 10032, USA; Department of Biochemistry and Molecular Biophysics, Columbia University, New York, NY 10032, USA; Center for Motor Neuron Biology and Disease, Columbia University, New York, NY 10032, USA
| | - Ankeeta Shah
- Department of Systems Biology, Columbia University, New York, NY 10032, USA; Department of Biochemistry and Molecular Biophysics, Columbia University, New York, NY 10032, USA; Center for Motor Neuron Biology and Disease, Columbia University, New York, NY 10032, USA
| | - Elise D Flynn
- Department of Systems Biology, Columbia University, New York, NY 10032, USA; Department of Biochemistry and Molecular Biophysics, Columbia University, New York, NY 10032, USA; Center for Motor Neuron Biology and Disease, Columbia University, New York, NY 10032, USA
| | - Adrian R Krainer
- Cold Spring Harbor Laboratory, Cold Spring Harbor, NY 11724, USA
| | - Chaolin Zhang
- Department of Systems Biology, Columbia University, New York, NY 10032, USA; Department of Biochemistry and Molecular Biophysics, Columbia University, New York, NY 10032, USA; Center for Motor Neuron Biology and Disease, Columbia University, New York, NY 10032, USA.
|
12
|
Combinatorial recognition of clustered RNA elements by the multidomain RNA-binding protein IMP3. Nat Commun 2019; 10:2266. [PMID: 31118463 PMCID: PMC6531468 DOI: 10.1038/s41467-019-09769-8] [Citation(s) in RCA: 51] [Impact Index Per Article: 10.2] [Received: 08/13/2018] [Accepted: 03/26/2019] [Indexed: 02/07/2023] Open
Abstract
How multidomain RNA-binding proteins recognize their specific target sequences, based on a combinatorial code, represents a fundamental unsolved question and has not been studied systematically so far. Here we focus on a prototypical multidomain RNA-binding protein, IMP3 (also called IGF2BP3), which contains six RNA-binding domains (RBDs): four KH and two RRM domains. We establish an integrative systematic strategy, combining single-domain-resolved SELEX-seq, motif-spacing analyses, in vivo iCLIP, functional validation assays, and structural biology. This approach identifies the RNA-binding specificity and RNP topology of IMP3, involving all six RBDs and a cluster of up to five distinct and appropriately spaced CA-rich and GGC-core RNA elements, covering a >100 nucleotide-long target RNA region. Our generally applicable approach explains both specificity and flexibility of IMP3-RNA recognition, allows the prediction of IMP3 targets, and provides a paradigm for the function of multivalent interactions with multidomain RNA-binding proteins in gene regulation.
|
13
|
Fontrodona N, Aubé F, Claude JB, Polvèche H, Lemaire S, Tranchevent LC, Modolo L, Mortreux F, Bourgeois CF, Auboeuf D. Interplay between coding and exonic splicing regulatory sequences. Genome Res 2019; 29:711-722. [PMID: 30962178 PMCID: PMC6499313 DOI: 10.1101/gr.241315.118] [Citation(s) in RCA: 13] [Impact Index Per Article: 2.6] [Received: 07/02/2018] [Accepted: 03/28/2019] [Indexed: 01/24/2023]
Abstract
The inclusion of exons during the splicing process depends on the binding of splicing factors to short low-complexity regulatory sequences. The relationship between exonic splicing regulatory sequences and coding sequences is still poorly understood. We demonstrate that exons that are coregulated by any given splicing factor share a similar nucleotide composition bias and preferentially code for amino acids with similar physicochemical properties because of the nonrandomness of the genetic code. Indeed, amino acids sharing similar physicochemical properties correspond to codons that have the same nucleotide composition bias. In particular, we uncover that the TRA2A and TRA2B splicing factors that bind to adenine-rich motifs promote the inclusion of adenine-rich exons coding preferentially for hydrophilic amino acids that correspond to adenine-rich codons. SRSF2 that binds guanine/cytosine-rich motifs promotes the inclusion of GC-rich exons coding preferentially for small amino acids, whereas SRSF3 that binds cytosine-rich motifs promotes the inclusion of exons coding preferentially for uncharged amino acids, like serine and threonine that can be phosphorylated. Finally, coregulated exons encoding amino acids with similar physicochemical properties correspond to specific protein features. In conclusion, the regulation of an exon by a splicing factor that relies on the affinity of this factor for specific nucleotide(s) is tightly interconnected with the exon-encoded physicochemical properties. We therefore uncover an unanticipated bidirectional interplay between the splicing regulatory process and its biological functional outcome.
Affiliation(s)
- Nicolas Fontrodona
- Université Lyon, ENS de Lyon, Université Claude Bernard, CNRS UMR 5239, INSERM U1210, Laboratory of Biology and Modelling of the Cell, F-69007, Lyon, France
| | - Fabien Aubé
- Université Lyon, ENS de Lyon, Université Claude Bernard, CNRS UMR 5239, INSERM U1210, Laboratory of Biology and Modelling of the Cell, F-69007, Lyon, France
| | - Jean-Baptiste Claude
- Université Lyon, ENS de Lyon, Université Claude Bernard, CNRS UMR 5239, INSERM U1210, Laboratory of Biology and Modelling of the Cell, F-69007, Lyon, France
| | - Hélène Polvèche
- Université Lyon, ENS de Lyon, Université Claude Bernard, CNRS UMR 5239, INSERM U1210, Laboratory of Biology and Modelling of the Cell, F-69007, Lyon, France
| | - Sébastien Lemaire
- Université Lyon, ENS de Lyon, Université Claude Bernard, CNRS UMR 5239, INSERM U1210, Laboratory of Biology and Modelling of the Cell, F-69007, Lyon, France
| | - Léon-Charles Tranchevent
- Proteome and Genome Research Unit, Department of Oncology, Luxembourg Institute of Health (LIH), L-1445 Strassen, Luxembourg
| | - Laurent Modolo
- LBMC Biocomputing Center, CNRS UMR 5239, INSERM U1210, F-69007, Lyon, France
| | - Franck Mortreux
- Université Lyon, ENS de Lyon, Université Claude Bernard, CNRS UMR 5239, INSERM U1210, Laboratory of Biology and Modelling of the Cell, F-69007, Lyon, France
| | - Cyril F Bourgeois
- Université Lyon, ENS de Lyon, Université Claude Bernard, CNRS UMR 5239, INSERM U1210, Laboratory of Biology and Modelling of the Cell, F-69007, Lyon, France
| | - Didier Auboeuf
- Université Lyon, ENS de Lyon, Université Claude Bernard, CNRS UMR 5239, INSERM U1210, Laboratory of Biology and Modelling of the Cell, F-69007, Lyon, France
|
14
|
Li X, Wong KC. Elucidating Genome-Wide Protein-RNA Interactions Using Differential Evolution. IEEE/ACM Transactions on Computational Biology and Bioinformatics 2019; 16:272-282. [PMID: 29990254 DOI: 10.1109/tcbb.2017.2776224] [Citation(s) in RCA: 1] [Impact Index Per Article: 0.2] [Indexed: 06/08/2023]
Abstract
RNA-binding proteins (RBPs) play an important role in the post-transcriptional control of RNAs, such as splicing, polyadenylation, mRNA stabilization, mRNA localization, and translation. Following recent breakthroughs, non-negative matrix factorization (NMF) has been developed to combine multiple data sources to discover non-overlapping and class-specific RNA binding patterns. However, several challenges still exist in determining the number of latent dimensions in the factorization steps. In most circumstances, it is assumed that the number of latent dimensions (or components) is given, and choosing it by trial and error can be tedious in practice. To address this problem, a differential evolution algorithm is proposed as the model selection method to choose a suitable rank, which can adaptively decompose the input protein-RNA data matrix into different nonnegative components. Experimental results demonstrate that the proposed algorithms can improve the factorization quality over recent state-of-the-art methods. The effectiveness of the proposed algorithms is supported by comprehensive performance benchmarking on 31 genome-wide cross-linking immunoprecipitation coupled with high-throughput sequencing (CLIP-seq) datasets. In addition, time complexity analysis and parameter analysis are conducted to demonstrate the robustness of the proposed methods.
|
15
|
Ustianenko D, Chiu HS, Treiber T, Weyn-Vanhentenryck SM, Treiber N, Meister G, Sumazin P, Zhang C. LIN28 Selectively Modulates a Subclass of Let-7 MicroRNAs. Mol Cell 2018; 71:271-283.e5. [PMID: 30029005 PMCID: PMC6238216 DOI: 10.1016/j.molcel.2018.06.029] [Citation(s) in RCA: 73] [Impact Index Per Article: 12.2] [Received: 10/03/2017] [Revised: 04/27/2018] [Accepted: 06/19/2018] [Indexed: 02/06/2023]
Abstract
LIN28 is a bipartite RNA-binding protein that post-transcriptionally inhibits the biogenesis of let-7 microRNAs to regulate development and influence disease states. However, the mechanisms of let-7 suppression remain poorly understood because LIN28 recognition depends on coordinated targeting by both the zinc knuckle domain (ZKD), which binds a GGAG-like element in the precursor, and the cold shock domain (CSD), whose binding sites have not been systematically characterized. By leveraging single-nucleotide-resolution mapping of LIN28 binding sites in vivo, we determined that the CSD recognizes a (U)GAU motif. This motif partitions the let-7 microRNAs into two subclasses: precursors with both CSD and ZKD binding sites (CSD+) and precursors with ZKD but no CSD binding sites (CSD-). LIN28 in vivo recognition of CSD+ precursors, and their subsequent 3' uridylation and degradation, is more efficient, leading to their stronger suppression in LIN28-activated cells and cancers. Thus, CSD binding sites amplify the regulatory effects of LIN28.
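The CSD+/CSD- partition described in this abstract can be illustrated with a minimal sketch. It assumes plain pattern matching anywhere in a precursor sequence, with an exact GGAG standing in for the "GGAG-like" ZKD element and U?GAU for the (U)GAU CSD motif. The function name, the "no ZKD site" label and the toy sequences are hypothetical; the study itself relied on in vivo binding-site maps, not on this kind of string search.

```python
import re

# Simplified stand-ins for the motifs named in the abstract (assumptions).
ZKD_MOTIF = re.compile(r"GGAG")    # exact match for the "GGAG-like" element
CSD_MOTIF = re.compile(r"U?GAU")   # (U)GAU: GAU with an optional leading U


def classify_precursor(seq: str) -> str:
    """Label a precursor sequence as CSD+, CSD- or lacking a ZKD site."""
    rna = seq.upper().replace("T", "U")  # accept DNA or RNA alphabets
    has_zkd = ZKD_MOTIF.search(rna) is not None
    has_csd = CSD_MOTIF.search(rna) is not None
    if has_zkd and has_csd:
        return "CSD+"
    if has_zkd:
        return "CSD-"
    return "no ZKD site"


if __name__ == "__main__":
    # Toy sequences, invented for illustration only.
    for name, seq in [("toy-1", "ACGGAGCCUGAUAC"), ("toy-2", "ACGGAGCCCCAC")]:
        print(name, classify_precursor(seq))
```

A realistic analysis would also restrict the search to the relevant region of the precursor and allow for degenerate ZKD elements, which this sketch does not attempt.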
Affiliation(s)
- Dmytro Ustianenko
- Department of Systems Biology, Columbia University, New York, NY 10032, USA; Department of Biochemistry and Molecular Biophysics, Columbia University, New York, NY 10032, USA; Center for Motor Neuron Biology and Disease, Columbia University, New York, NY 10032, USA
| | - Hua-Sheng Chiu
- Texas Children's Cancer Center, Baylor College of Medicine, Houston, TX 77030, USA
| | - Thomas Treiber
- Biochemistry Center Regensburg (BZR), Laboratory for RNA Biology, University of Regensburg, 93053 Regensburg, Germany
| | - Sebastien M Weyn-Vanhentenryck
- Department of Systems Biology, Columbia University, New York, NY 10032, USA; Department of Biochemistry and Molecular Biophysics, Columbia University, New York, NY 10032, USA; Center for Motor Neuron Biology and Disease, Columbia University, New York, NY 10032, USA
| | - Nora Treiber
- Biochemistry Center Regensburg (BZR), Laboratory for RNA Biology, University of Regensburg, 93053 Regensburg, Germany
| | - Gunter Meister
- Biochemistry Center Regensburg (BZR), Laboratory for RNA Biology, University of Regensburg, 93053 Regensburg, Germany
| | - Pavel Sumazin
- Texas Children's Cancer Center, Baylor College of Medicine, Houston, TX 77030, USA.
| | - Chaolin Zhang
- Department of Systems Biology, Columbia University, New York, NY 10032, USA.
|
16
|
Weyn-Vanhentenryck SM, Feng H, Ustianenko D, Duffié R, Yan Q, Jacko M, Martinez JC, Goodwin M, Zhang X, Hengst U, Lomvardas S, Swanson MS, Zhang C. Precise temporal regulation of alternative splicing during neural development. Nat Commun 2018; 9:2189. [PMID: 29875359 PMCID: PMC5989265 DOI: 10.1038/s41467-018-04559-0] [Citation(s) in RCA: 137] [Impact Index Per Article: 22.8] [Received: 03/07/2018] [Accepted: 05/09/2018] [Indexed: 12/13/2022] Open
Abstract
Alternative splicing (AS) is one crucial step of gene expression that must be tightly regulated during neurodevelopment. However, the precise timing of developmental splicing switches and the underlying regulatory mechanisms are poorly understood. Here we systematically analyze the temporal regulation of AS in a large number of transcriptome profiles of developing mouse cortices, in vivo purified neuronal subtypes, and neurons differentiated in vitro. Our analysis reveals early-switch and late-switch exons in genes with distinct functions, and these switches accurately define neuronal maturation stages. Integrative modeling suggests that these switches are under direct and combinatorial regulation by distinct sets of neuronal RNA-binding proteins including Nova, Rbfox, Mbnl, and Ptbp. Surprisingly, various neuronal subtypes in the sensory systems lack Nova and/or Rbfox expression. These neurons retain the "immature" splicing program in early-switch exons, affecting numerous synaptic genes. These results provide new insights into the organization and regulation of the neurodevelopmental transcriptome.
Affiliation(s)
- Sebastien M Weyn-Vanhentenryck
- Department of Systems Biology, Department of Biochemistry and Molecular Biophysics, Center for Motor Neuron Biology and Disease, Columbia University, New York, NY, 10032, USA
| | - Huijuan Feng
- Department of Systems Biology, Department of Biochemistry and Molecular Biophysics, Center for Motor Neuron Biology and Disease, Columbia University, New York, NY, 10032, USA
- Department of Automation, MOE Key Laboratory of Bioinformatics and Bioinformatics Division, TNLIST, Tsinghua University, Beijing, 100084, China
| | - Dmytro Ustianenko
- Department of Systems Biology, Department of Biochemistry and Molecular Biophysics, Center for Motor Neuron Biology and Disease, Columbia University, New York, NY, 10032, USA
| | - Rachel Duffié
- Department of Biochemistry and Molecular Biophysics, Mortimer B. Zuckerman Mind Brain and Behavior Institute, Columbia University, New York, NY, 10027, USA
| | - Qinghong Yan
- Department of Systems Biology, Department of Biochemistry and Molecular Biophysics, Center for Motor Neuron Biology and Disease, Columbia University, New York, NY, 10032, USA
- Department of Comparative Biology and Safety Sciences, Amgen Inc., Cambridge, MA, 02141, USA
| | - Martin Jacko
- Department of Systems Biology, Department of Biochemistry and Molecular Biophysics, Center for Motor Neuron Biology and Disease, Columbia University, New York, NY, 10032, USA
| | - Jose C Martinez
- Department of Pathology and Cell Biology, The Taub Institute for Research on Alzheimer's Disease and the Aging Brain, Columbia University, New York, NY, 10032, USA
| | - Marianne Goodwin
- Department of Molecular Genetics and Microbiology, Center for NeuroGenetics and the Genetics Institute, University of Florida, College of Medicine, Gainesville, FL, 32610, USA
| | - Xuegong Zhang
- Department of Automation, MOE Key Laboratory of Bioinformatics and Bioinformatics Division, TNLIST, Tsinghua University, Beijing, 100084, China
| | - Ulrich Hengst
- Department of Pathology and Cell Biology, The Taub Institute for Research on Alzheimer's Disease and the Aging Brain, Columbia University, New York, NY, 10032, USA
| | - Stavros Lomvardas
- Department of Biochemistry and Molecular Biophysics, Mortimer B. Zuckerman Mind Brain and Behavior Institute, Columbia University, New York, NY, 10027, USA
| | - Maurice S Swanson
- Department of Molecular Genetics and Microbiology, Center for NeuroGenetics and the Genetics Institute, University of Florida, College of Medicine, Gainesville, FL, 32610, USA
| | - Chaolin Zhang
- Department of Systems Biology, Department of Biochemistry and Molecular Biophysics, Center for Motor Neuron Biology and Disease, Columbia University, New York, NY, 10032, USA.
|
17
|
Improving conditional random field model for prediction of protein-RNA residue-base contacts. Quantitative Biology 2018. [DOI: 10.1007/s40484-018-0136-7] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 10/16/2022]
|
18
|
Marco A. SeedVicious: Analysis of microRNA target and near-target sites. PLoS One 2018; 13:e0195532. [PMID: 29664927 PMCID: PMC5903666 DOI: 10.1371/journal.pone.0195532] [Citation(s) in RCA: 16] [Impact Index Per Article: 2.7] [Received: 10/04/2017] [Accepted: 03/23/2018] [Indexed: 01/22/2023] Open
Abstract
Here I describe seedVicious, a versatile microRNA target site prediction software that can be easily fitted into annotation pipelines and run over custom datasets. SeedVicious finds microRNA canonical sites plus other, less efficient, target sites. Among other novel features, seedVicious can compute evolutionary gains/losses of target sites using maximum parsimony, and also detect near-target sites, which have one nucleotide different from a canonical site. Near-target sites are important to study population variation in microRNA regulation. Some analyses suggest that near-target sites may also be functional sites, although there is no conclusive evidence for that, and they may actually be target alleles segregating in a population. SeedVicious does not aim to outperform but to complement existing microRNA prediction tools. For instance, the precision of TargetScan is almost doubled (from 11% to ~20%) when we filter predictions by the distance between target sites using this program. Interestingly, two adjacent canonical target sites are more likely to be present in bona fide target transcripts than pairs of target sites at slightly longer distances. The software is written in Perl and runs on 64-bit Unix computers (Linux and MacOS X). Users with no computing experience can also run the program in a dedicated web-server by uploading custom data, or browse pre-computed predictions. SeedVicious and its associated web-server and database (SeedBank) are distributed under the GPL/GNU license.
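The distinction between canonical and near-target sites drawn in this abstract can be sketched in a few lines. The sketch assumes that a canonical site is a perfect match to the reverse complement of miRNA nucleotides 2-8, and that a near-target site differs from it at exactly one position. The function names and the `seed_len` parameter are hypothetical; this is not seedVicious's own code or interface, and the program's actual site definitions may be broader.

```python
COMPLEMENT = {"A": "U", "U": "A", "G": "C", "C": "G"}


def reverse_complement(rna: str) -> str:
    """Reverse complement of an RNA string (A/C/G/U only)."""
    return "".join(COMPLEMENT[base] for base in reversed(rna))


def hamming(a: str, b: str) -> int:
    """Number of mismatching positions between two equal-length strings."""
    return sum(x != y for x, y in zip(a, b))


def scan_sites(utr: str, mirna: str, seed_len: int = 7):
    """Return (position, kind) pairs for canonical and near-target matches.

    Assumption: the seed is miRNA nucleotides 2..(1 + seed_len), and a
    near-target site has exactly one mismatch against the seed match.
    """
    target = reverse_complement(mirna[1:1 + seed_len])
    hits = []
    for i in range(len(utr) - seed_len + 1):
        mismatches = hamming(utr[i:i + seed_len], target)
        if mismatches == 0:
            hits.append((i, "canonical"))
        elif mismatches == 1:
            hits.append((i, "near-target"))
    return hits


if __name__ == "__main__":
    let7a = "UGAGGUAGUAGGUUGUAUAGUU"  # mature let-7a sequence
    print(scan_sites("AAACUACCUCAAA", let7a))  # toy 3' UTR fragment
```

Scanning with a one-mismatch tolerance is the simplest way to enumerate near-target sites; it deliberately ignores site context and pairing outside the seed.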
Affiliation(s)
- Antonio Marco
- School of Biological Sciences, University of Essex, Colchester, United Kingdom
|
19
|
Dotu I, Adamson SI, Coleman B, Fournier C, Ricart-Altimiras E, Eyras E, Chuang JH. SARNAclust: Semi-automatic detection of RNA protein binding motifs from immunoprecipitation data. PLoS Comput Biol 2018; 14:e1006078. [PMID: 29596423 PMCID: PMC5892938 DOI: 10.1371/journal.pcbi.1006078] [Citation(s) in RCA: 5] [Impact Index Per Article: 0.8] [Received: 11/28/2017] [Revised: 04/10/2018] [Accepted: 03/05/2018] [Indexed: 12/02/2022] Open
Abstract
RNA-protein binding is critical to gene regulation, controlling fundamental processes including splicing, translation, localization and stability, and aberrant RNA-protein interactions are known to play a role in a wide variety of diseases. However, molecular understanding of RNA-protein interactions remains limited; in particular, identification of RNA motifs that bind proteins has long been challenging, especially when such motifs depend on both sequence and structure. Moreover, although RNA binding proteins (RBPs) often contain more than one binding domain, algorithms capable of identifying more than one binding motif simultaneously have not been developed. In this paper we present a novel pipeline to determine binding peaks in crosslinking immunoprecipitation (CLIP) data, to discover multiple possible RNA sequence/structure motifs among them, and to experimentally validate such motifs. At the core is a new semi-automatic algorithm SARNAclust, the first unsupervised method to identify and deconvolve multiple sequence/structure motifs simultaneously. SARNAclust computes similarity between sequence/structure objects using a graph kernel, providing the ability to isolate the impact of specific features through the bulge graph formalism. Application of SARNAclust to synthetic data shows its capability of clustering 5 motifs at once with a V-measure value of over 0.95, while GraphClust achieves only a V-measure of 0.083 and RNAcontext cannot detect any of the motifs. When applied to existing eCLIP sets, SARNAclust finds known motifs for SLBP and HNRNPC and novel motifs for several other RBPs such as AGGF1, AKAP8L and ILF3. We demonstrate an experimental validation protocol, a targeted Bind-n-Seq-like high-throughput sequencing approach that relies on RNA inverse folding for oligo pool design, that can validate the components within the SLBP motif. Finally, we use this protocol to experimentally interrogate the SARNAclust motif predictions for protein ILF3. Our results support a newly identified partially double-stranded UUUUUGAGA motif similar to that known for the splicing factor HNRNPC.
RNA-protein binding is critical to gene regulation, and aberrant RNA-protein interactions play a role in a wide variety of diseases. However, molecular understanding of these interactions remains limited because of the difficulty of ascertaining the motifs that bind each protein. To address this challenge, we have developed a novel algorithm, SARNAclust, to computationally identify combined structure/sequence motifs from immunoprecipitation data. SARNAclust can deconvolve multiple motifs simultaneously and determine the importance of specific features through a graph kernel and bulge graph formalism. We have verified SARNAclust to be effective on synthetic motif data and also tested it on ENCODE eCLIP datasets, identifying known motifs and novel predictions. We have experimentally validated SARNAclust for two proteins, SLBP and ILF3, using RNA Bind-n-Seq measurements. Applying SARNAclust to ENCODE data provides new evidence for previously unknown regulatory interactions, notably splicing co-regulation by ILF3 and the splicing factor hnRNPC.
Affiliation(s)
- Ivan Dotu
- The Jackson Laboratory for Genomic Medicine, Farmington, CT, United States of America
- Research Programme on Biomedical Informatics (GRIB), Hospital del Mar Medical Research Institute (IMIM)–Pompeu Fabra University (UPF), Barcelona, Spain
| | - Scott I. Adamson
- The Jackson Laboratory for Genomic Medicine, Farmington, CT, United States of America
- UCONN Health, Department of Genetics and Genome Sciences, Farmington, CT, United States of America
| | - Benjamin Coleman
- The Jackson Laboratory for Genomic Medicine, Farmington, CT, United States of America
| | - Cyril Fournier
- The Jackson Laboratory for Genomic Medicine, Farmington, CT, United States of America
| | - Emma Ricart-Altimiras
- The Jackson Laboratory for Genomic Medicine, Farmington, CT, United States of America
- Research Programme on Biomedical Informatics (GRIB), Hospital del Mar Medical Research Institute (IMIM)–Pompeu Fabra University (UPF), Barcelona, Spain
| | - Eduardo Eyras
- Research Programme on Biomedical Informatics (GRIB), Hospital del Mar Medical Research Institute (IMIM)–Pompeu Fabra University (UPF), Barcelona, Spain
- Catalan Institution for Research and Advanced Studies (ICREA), Barcelona, Spain
| | - Jeffrey H. Chuang
- The Jackson Laboratory for Genomic Medicine, Farmington, CT, United States of America
- UCONN Health, Department of Genetics and Genome Sciences, Farmington, CT, United States of America
|
20
|
Thomas JD, Oliveira R, Sznajder ŁJ, Swanson MS. Myotonic Dystrophy and Developmental Regulation of RNA Processing. Compr Physiol 2018; 8:509-553. [PMID: 29687899 PMCID: PMC11323716 DOI: 10.1002/cphy.c170002] [Citation(s) in RCA: 22] [Impact Index Per Article: 3.7] [Indexed: 12/17/2022]
Abstract
Myotonic dystrophy (DM) is a multisystemic disorder caused by microsatellite expansion mutations in two unrelated genes leading to similar, yet distinct, diseases. DM disease presentation is highly variable and distinguished by differences in age-of-onset and symptom severity. In the most severe form, DM presents with congenital onset and profound developmental defects. At the molecular level, DM pathogenesis is characterized by a toxic RNA gain-of-function mechanism that involves the transcription of noncoding microsatellite expansions. These mutant RNAs disrupt key cellular pathways, including RNA processing, localization, and translation. In DM, these toxic RNA effects are predominantly mediated through the modulation of the muscleblind-like and CUGBP and ETR-3-like factor families of RNA binding proteins (RBPs). Dysfunction of these RBPs results in widespread RNA processing defects culminating in the expression of developmentally inappropriate protein isoforms in adult tissues. The tissue that is the focus of this review, skeletal muscle, is particularly sensitive to mutant RNA-responsive perturbations, as patients display a variety of developmental, structural, and functional defects in muscle. Here, we provide a comprehensive overview of DM1 and DM2 clinical presentation and pathology as well as the underlying cellular and molecular defects associated with DM disease onset and progression. Additionally, fundamental aspects of skeletal muscle development altered in DM are highlighted together with ongoing and potential therapeutic avenues to treat this muscular dystrophy. © 2018 American Physiological Society. Compr Physiol 8:509-553, 2018.
Affiliation(s)
- James D. Thomas
- Department of Molecular Genetics and Microbiology, Center for NeuroGenetics and the Genetics Institute, University of Florida, College of Medicine, Gainesville, Florida, USA
| | - Ruan Oliveira
- Department of Molecular Genetics and Microbiology, Center for NeuroGenetics and the Genetics Institute, University of Florida, College of Medicine, Gainesville, Florida, USA
| | - Łukasz J. Sznajder
- Department of Molecular Genetics and Microbiology, Center for NeuroGenetics and the Genetics Institute, University of Florida, College of Medicine, Gainesville, Florida, USA
| | - Maurice S. Swanson
- Department of Molecular Genetics and Microbiology, Center for NeuroGenetics and the Genetics Institute, University of Florida, College of Medicine, Gainesville, Florida, USA
|
21
|
Tan JH, Fraser AG. The combinatorial control of alternative splicing in C. elegans. PLoS Genet 2017; 13:e1007033. [PMID: 29121637 PMCID: PMC5697891 DOI: 10.1371/journal.pgen.1007033] [Citation(s) in RCA: 9] [Impact Index Per Article: 1.3] [Received: 02/06/2017] [Revised: 11/21/2017] [Accepted: 09/19/2017] [Indexed: 12/31/2022] Open
Abstract
Normal development requires the right splice variants to be made in the right tissues at the right time. The core splicing machinery is engaged in all splicing events, but which precise splice variant is made requires the choice between alternative splice sites; for this to occur, a set of splicing factors (SFs) must recognize and bind to short RNA motifs in the pre-mRNA. In C. elegans, there is known to be extensive variation in splicing patterns across development, but little is known about the targets of each SF or how multiple SFs combine to regulate splicing. Here we combine RNA-seq with in vitro binding assays to study how 4 different C. elegans SFs, ASD-1, FOX-1, MEC-8, and EXC-7, regulate splicing. The 4 SFs chosen all have well-characterised biology and well-studied loss-of-function genetic alleles, and all contain RRM domains. Intriguingly, while the SFs we examined have varied roles in C. elegans development, they show an unexpectedly high overlap in their targets. We also find that binding sites for these SFs occur on the same pre-mRNAs more frequently than expected, suggesting extensive combinatorial control of splicing. We confirm that regulation of splicing by multiple SFs is often combinatorial and show that this is functionally significant. We also find that SFs appear to combine to affect splicing in two modes: they either bind in close proximity within the same intron or they appear to bind to separate regions of the intron in a conserved order. Finally, we find that the genes whose splicing is regulated by multiple SFs are highly enriched for genes involved in the cytoskeleton and in ion channels that are key for neurotransmission. Together, this shows that specific classes of genes have complex combinatorial regulation of splicing and that this combinatorial regulation is critical for normal development to occur.
Alternative splicing (AS) is a highly regulated process that is crucial for normal development. It requires the core splicing machinery, but the specific choice of splice site during AS is controlled by splicing factors (SFs) such as ELAV or RBFOX proteins that bind to specific sequences in pre-mRNAs to regulate usage of different splice sites. AS varies across the C. elegans life cycle and here we study how diverse SFs combine to regulate AS during C. elegans development. We selected 4 RRM-containing SFs that are all well studied and that have well-characterised loss-of-function genetic alleles. We find that these SFs regulate many of the same targets, and that combinatorial interactions between these SFs affect both individual splicing events and organism-level phenotypes including specific effects on the neuromuscular system. We further show that SFs combine to regulate splicing of an individual pre-mRNA in two distinct modes: either by binding in close proximity or by binding in a defined order on the pre-mRNA. Finally, we find that the genes whose splicing is most likely to be regulated by multiple SFs are genes that are required for the proper function of the neuromuscular system. These genes are also most likely to have changing AS patterns across development, suggesting that their splicing regulation is highly complex and developmentally regulated. Taken together, our data show that the precise splice variant expressed at any point in development is often the outcome of regulation by multiple SFs.
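The statement that two splicing factors share targets "more frequently than expected" can be made concrete with an overlap test. The sketch below assumes a hypergeometric null model; the abstract does not say which test the authors used, and the function names and gene counts are hypothetical.

```python
from math import comb


def expected_overlap(n_a: int, n_b: int, total: int) -> float:
    """Mean number of shared targets if the two target sets were independent."""
    return n_a * n_b / total


def overlap_p_value(overlap: int, n_a: int, n_b: int, total: int) -> float:
    """Upper-tail hypergeometric probability P(X >= overlap).

    total: number of genes considered (the background)
    n_a:   number of targets of factor A
    n_b:   number of targets of factor B
    """
    upper = min(n_a, n_b)
    denom = comb(total, n_b)
    tail = sum(
        comb(n_a, k) * comb(total - n_a, n_b - k)
        for k in range(max(overlap, 0), upper + 1)
    )
    return tail / denom


# Hypothetical counts, for illustration only (not taken from the paper).
background, targets_a, targets_b, shared = 1000, 100, 80, 30
print(expected_overlap(targets_a, targets_b, background))
print(overlap_p_value(shared, targets_a, targets_b, background))
```

A small tail probability indicates that the observed number of shared targets is unlikely under independence, which is one way to quantify the overlap the abstract describes.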
Affiliation(s)
- June H. Tan
- The Donnelly Centre, University of Toronto, Toronto, ON, Canada
- Department of Molecular Genetics, University of Toronto, 1 King’s College Circle, Toronto, ON, Canada
| | - Andrew G. Fraser
- The Donnelly Centre, University of Toronto, Toronto, ON, Canada
- Department of Molecular Genetics, University of Toronto, 1 King’s College Circle, Toronto, ON, Canada
| |
|
22
|
Li YE, Xiao M, Shi B, Yang YCT, Wang D, Wang F, Marcia M, Lu ZJ. Identification of high-confidence RNA regulatory elements by combinatorial classification of RNA-protein binding sites. Genome Biol 2017; 18:169. [PMID: 28886744 PMCID: PMC5591525 DOI: 10.1186/s13059-017-1298-8] [Citation(s) in RCA: 32] [Impact Index Per Article: 4.6] [Received: 06/16/2017] [Accepted: 08/14/2017] [Indexed: 12/20/2022] Open
Abstract
Crosslinking immunoprecipitation sequencing (CLIP-seq) technologies have enabled researchers to characterize transcriptome-wide binding sites of RNA-binding protein (RBP) with high resolution. We apply a soft-clustering method, RBPgroup, to various CLIP-seq datasets to group together RBPs that specifically bind the same RNA sites. Such combinatorial clustering of RBPs helps interpret CLIP-seq data and suggests functional RNA regulatory elements. Furthermore, we validate two RBP–RBP interactions in cell lines. Our approach links proteins and RNA motifs known to possess similar biochemical and cellular properties and can, when used in conjunction with additional experimental data, identify high-confidence RBP groups and their associated RNA regulatory elements.
Affiliation(s)
- Yang Eric Li
- MOE Key Laboratory of Bioinformatics, Center for Synthetic and Systems Biology, School of Life Sciences, Tsinghua University, Beijing, 100084, China
| | - Mu Xiao
- Life Sciences Institute, Innovation Center for Cell Signaling Network, Zhejiang University, Hangzhou, Zhejiang, 310058, China
| | - Binbin Shi
- MOE Key Laboratory of Bioinformatics, Center for Synthetic and Systems Biology, School of Life Sciences, Tsinghua University, Beijing, 100084, China
| | - Yu-Cheng T Yang
- MOE Key Laboratory of Bioinformatics, Center for Synthetic and Systems Biology, School of Life Sciences, Tsinghua University, Beijing, 100084, China
| | - Dong Wang
- MOE Key Laboratory of Bioinformatics, Center for Synthetic and Systems Biology, School of Life Sciences, Tsinghua University, Beijing, 100084, China
| | - Fei Wang
- Life Sciences Institute, Innovation Center for Cell Signaling Network, Zhejiang University, Hangzhou, Zhejiang, 310058, China
| | - Marco Marcia
- European Molecular Biology Laboratory, Grenoble Outstation, 71 Avenue des Martyrs, Grenoble, 38042, France
| | - Zhi John Lu
- MOE Key Laboratory of Bioinformatics, Center for Synthetic and Systems Biology, School of Life Sciences, Tsinghua University, Beijing, 100084, China.
| |
|
23
|
Savisaar R, Hurst LD. Estimating the prevalence of functional exonic splice regulatory information. Hum Genet 2017; 136:1059-1078. [PMID: 28405812 PMCID: PMC5602102 DOI: 10.1007/s00439-017-1798-3] [Citation(s) in RCA: 20] [Impact Index Per Article: 2.9] [Received: 01/26/2017] [Accepted: 04/04/2017] [Indexed: 12/14/2022]
Abstract
In addition to coding information, human exons contain sequences necessary for correct splicing. These elements are known to be under purifying selection and their disruption can cause disease. However, the density of functional exonic splicing information remains profoundly uncertain. Several groups have experimentally investigated how mutations at different exonic positions affect splicing. They have found splice information to be distributed widely in exons, with one estimate putting the proportion of splicing-relevant nucleotides at >90%. These results suggest that splicing could place a major pressure on exon evolution. However, analyses of sequence conservation have concluded that the need to preserve splice regulatory signals only slightly constrains exon evolution, with a resulting decrease in the average human rate of synonymous evolution of only 1–4%. Why do these two lines of research come to such different conclusions? Among other reasons, we suggest that the methods are measuring different things: one assays the density of sites that affect splicing, the other the density of sites whose effects on splicing are visible to selection. In addition, the experimental methods typically consider short exons, thereby enriching for nucleotides close to the splice junction, such sites being enriched for splice-control elements. By contrast, in part owing to correction for nucleotide composition biases and to the assumption that constraint only operates on exon ends, the conservation-based methods can be overly conservative.
Affiliation(s)
- Rosina Savisaar
- The Milner Centre for Evolution, Department of Biology and Biochemistry, University of Bath, Bath, BA2 7AY, UK.
| | - Laurence D Hurst
- The Milner Centre for Evolution, Department of Biology and Biochemistry, University of Bath, Bath, BA2 7AY, UK
| |
|
24
|
Liu Y, Sun S, Bredy T, Wood M, Spitale RC, Baldi P. MotifMap-RNA: a genome-wide map of RBP binding sites. Bioinformatics 2017; 33:2029-2031. [DOI: 10.1093/bioinformatics/btx087] [Citation(s) in RCA: 14] [Impact Index Per Article: 2.0] [Received: 11/24/2016] [Accepted: 02/07/2017] [Indexed: 01/18/2023] Open
Affiliation(s)
- Yu Liu
- Department of Computer Science and Institute for Genomics and Bioinformatics, University of California, Irvine, CA, USA
| | - Sha Sun
- Department of Developmental and Cell Biology, University of California, Irvine, CA, USA
| | - Timothy Bredy
- Department of Neurobiology and Behavior, University of California, Irvine, CA, USA
| | - Marcelo Wood
- Department of Neurobiology and Behavior, University of California, Irvine, CA, USA
| | - Robert C Spitale
- Department of Pharmaceutical Sciences and the Chao Family Comprehensive Cancer Center, University of California, Irvine, CA, USA
| | - Pierre Baldi
- Department of Computer Science and Institute for Genomics and Bioinformatics, University of California, Irvine, CA, USA
| |
|
25
|
De S, Gorospe M. Bioinformatic tools for analysis of CLIP ribonucleoprotein data. WILEY INTERDISCIPLINARY REVIEWS-RNA 2016; 8. [PMID: 28008714 DOI: 10.1002/wrna.1404] [Citation(s) in RCA: 9] [Impact Index Per Article: 1.1] [Received: 07/22/2016] [Revised: 09/26/2016] [Accepted: 10/07/2016] [Indexed: 12/15/2022]
Abstract
Investigating the interactions of RNA-binding proteins (RBPs) with RNAs is a complex task for molecular and computational biologists. The molecular biology techniques and the computational approaches to understand RBP-RNA (or ribonucleoprotein, RNP) interactions have advanced considerably over the past few years and numerous and diverse software tools have been developed to analyze these data. Accordingly, laboratories interested in RNP biology face the challenge of choosing adequately among the available software tools those that best address the biological problem they are studying. Here, we focus on state-of-the-art molecular biology techniques that employ crosslinking and immunoprecipitation (CLIP) of an RBP to study and map RNP interactions. We review the different software tools and databases available to analyze the most widely used CLIP methods, HITS-CLIP, PAR-CLIP, and iCLIP. WIREs RNA 2017, 8:e1404. doi: 10.1002/wrna.1404
Affiliation(s)
- Supriyo De
- Laboratory of Genetics and Genomics, National Institute on Aging, National Institutes of Health, Baltimore, MD, USA
| | - Myriam Gorospe
- Laboratory of Genetics and Genomics, National Institute on Aging, National Institutes of Health, Baltimore, MD, USA
| |
|
26
|
Dose-Dependent Regulation of Alternative Splicing by MBNL Proteins Reveals Biomarkers for Myotonic Dystrophy. PLoS Genet 2016; 12:e1006316. [PMID: 27681373 PMCID: PMC5082313 DOI: 10.1371/journal.pgen.1006316] [Citation(s) in RCA: 60] [Impact Index Per Article: 7.5] [Received: 10/29/2015] [Accepted: 08/23/2016] [Indexed: 01/23/2023] Open
Abstract
Alternative splicing is a regulated process that results in expression of specific mRNA and protein isoforms. Alternative splicing factors determine the relative abundance of each isoform. Here we focus on MBNL1, a splicing factor misregulated in the disease myotonic dystrophy. By altering the concentration of MBNL1 in cells across a broad dynamic range, we show that different splicing events require different amounts of MBNL1 for half-maximal response, and respond more or less steeply to MBNL1. Motifs around MBNL1 exon 5 were studied to assess how cis-elements mediate the MBNL1 dose-dependent splicing response. A framework was developed to estimate MBNL concentration using splicing responses alone, validated in the cell-based model, and applied to myotonic dystrophy patient muscle. Using this framework, we evaluated the ability of individual and combinations of splicing events to predict functional MBNL concentration in human biopsies, as well as their performance as biomarkers to assay mild, moderate, and severe cases of DM.
Our studies provide insight into the mechanisms of myotonic dystrophy, the most common adult form of muscular dystrophy. In this disease, a family of RNA binding proteins is sequestered by toxic RNA, which leads to mis-regulation and disease symptoms. We have created a cellular model with one of these family members to study how these RNA binding proteins function in the absence of the toxic RNA. In parallel, we analyzed transcriptomic data from over 50 individuals (44 affected by myotonic dystrophy) with a range of disease severity. The results from the transcriptomic data provide a rational approach to select biomarkers for clinical research and therapeutic trials.
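The ideas of a "half-maximal response" and a "more or less steep" response can be illustrated with a sigmoidal dose-response curve. The sketch below assumes a Hill-type function; the abstract does not state the functional form the authors fitted, and every name and parameter value here is hypothetical. It also shows how such a curve could be inverted to estimate factor concentration from a single splicing measurement.

```python
def hill_psi(conc: float, psi_zero: float, psi_sat: float,
             ec50: float, slope: float) -> float:
    """Exon inclusion (PSI) as a Hill function of splicing-factor concentration.

    psi_zero: PSI with no factor present
    psi_sat:  PSI at saturating factor concentration
    ec50:     concentration giving the half-maximal response
    slope:    Hill coefficient; larger values give a steeper response
    """
    if conc <= 0:
        return psi_zero
    frac = conc ** slope / (ec50 ** slope + conc ** slope)
    return psi_zero + (psi_sat - psi_zero) * frac


def infer_concentration(psi: float, psi_zero: float, psi_sat: float,
                        ec50: float, slope: float) -> float:
    """Invert the Hill curve: estimate concentration from an observed PSI."""
    frac = (psi - psi_zero) / (psi_sat - psi_zero)
    if not 0.0 < frac < 1.0:
        raise ValueError("PSI lies outside the invertible range of the curve")
    return ec50 * (frac / (1.0 - frac)) ** (1.0 / slope)


# Hypothetical event: inclusion rises from 0.1 to 0.9, EC50 = 2.0, slope = 3.0
observed = hill_psi(5.0, 0.1, 0.9, 2.0, 3.0)
print(infer_concentration(observed, 0.1, 0.9, 2.0, 3.0))
```

Events with different `ec50` and `slope` values are informative over different concentration ranges, which is one reason combining several events could help when estimating concentration.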
|
27
|
Pietrosanto M, Mattei E, Helmer-Citterich M, Ferrè F. A novel method for the identification of conserved structural patterns in RNA: From small scale to high-throughput applications. Nucleic Acids Res 2016; 44:8600-8609. [PMID: 27580722 PMCID: PMC5062999 DOI: 10.1093/nar/gkw750] [Citation(s) in RCA: 19] [Impact Index Per Article: 2.4] [Received: 05/23/2016] [Accepted: 08/17/2016] [Indexed: 12/21/2022] Open
Abstract
Functional RNA regions are often related to recurrent secondary structure patterns (or motifs), which can exert their role in several different ways, particularly in dictating the interaction with RNA-binding proteins, and acting in the regulation of a large number of cellular processes. Among the available motif-finding tools, the majority focuses on sequence patterns, sometimes including secondary structure as additional constraints to improve their performance. Nonetheless, secondary structures motifs may be concurrent to their sequence counterparts or even encode a stronger functional signal. Current methods for searching structural motifs generally require long pipelines and/or high computational efforts or previously aligned sequences. Here, we present BEAM (BEAr Motif finder), a novel method for structural motif discovery from a set of unaligned RNAs, taking advantage of a recently developed encoding for RNA secondary structure named BEAR (Brand nEw Alphabet for RNAs) and of evolutionary substitution rates of secondary structure elements. Tested in a varied set of scenarios, from small- to large-scale, BEAM is successful in retrieving structural motifs even in highly noisy data sets, such as those that can arise in CLIP-Seq or other high-throughput experiments.
Affiliation(s)
- Marco Pietrosanto
- Centre for Molecular Bioinformatics, Department of Biology, University of Rome Tor Vergata, Via della Ricerca Scientifica snc, 00133 Rome, Italy
| | - Eugenio Mattei
- Centre for Molecular Bioinformatics, Department of Biology, University of Rome Tor Vergata, Via della Ricerca Scientifica snc, 00133 Rome, Italy
| | - Manuela Helmer-Citterich
- Centre for Molecular Bioinformatics, Department of Biology, University of Rome Tor Vergata, Via della Ricerca Scientifica snc, 00133 Rome, Italy
| | - Fabrizio Ferrè
- Department of Pharmacy and Biotechnology (FaBiT), University of Bologna Alma Mater, Via Belmeloro 8/2, 40126 Bologna, Italy
| |
|
28
|
Geuens T, Bouhy D, Timmerman V. The hnRNP family: insights into their role in health and disease. Hum Genet 2016; 135:851-67. [PMID: 27215579 PMCID: PMC4947485 DOI: 10.1007/s00439-016-1683-5] [Citation(s) in RCA: 680] [Impact Index Per Article: 85.0] [Received: 03/23/2016] [Accepted: 05/09/2016] [Indexed: 12/14/2022]
Abstract
Heterogeneous nuclear ribonucleoproteins (hnRNPs) represent a large family of RNA-binding proteins (RBPs) that contribute to multiple aspects of nucleic acid metabolism including alternative splicing, mRNA stabilization, and transcriptional and translational regulation. Many hnRNPs share general features, but differ in domain composition and functional properties. This review will discuss the current knowledge about the different hnRNP family members, focusing on their structural and functional divergence. Additionally, we will highlight their involvement in neurodegenerative diseases and cancer, and the potential to develop RNA-based therapies.
Affiliation(s)
- Thomas Geuens
- Peripheral Neuropathy Group, VIB Molecular Genetics Department, University of Antwerp-CDE, Parking P4, Building V, Room 1.30, Universiteitsplein 1, 2610, Antwerp, Belgium
- Neurogenetics Laboratory, Institute Born Bunge, University of Antwerp, Antwerp, Belgium
| | - Delphine Bouhy
- Peripheral Neuropathy Group, VIB Molecular Genetics Department, University of Antwerp-CDE, Parking P4, Building V, Room 1.30, Universiteitsplein 1, 2610, Antwerp, Belgium
- Neurogenetics Laboratory, Institute Born Bunge, University of Antwerp, Antwerp, Belgium
| | - Vincent Timmerman
- Peripheral Neuropathy Group, VIB Molecular Genetics Department, University of Antwerp-CDE, Parking P4, Building V, Room 1.30, Universiteitsplein 1, 2610, Antwerp, Belgium.
- Neurogenetics Laboratory, Institute Born Bunge, University of Antwerp, Antwerp, Belgium.
| |
|
29
|
Dror I, Rohs R, Mandel-Gutfreund Y. How motif environment influences transcription factor search dynamics: Finding a needle in a haystack. Bioessays 2016; 38:605-12. [PMID: 27192961 PMCID: PMC5023137 DOI: 10.1002/bies.201600005] [Citation(s) in RCA: 38] [Impact Index Per Article: 4.8] [Indexed: 11/08/2022]
Abstract
Transcription factors (TFs) have to find their binding sites, which are distributed throughout the genome. Facilitated diffusion is currently the most widely accepted model for this search process. Based on this model the TF alternates between one-dimensional sliding along the DNA, and three-dimensional bulk diffusion. In this view, the non-specific associations between the proteins and the DNA play a major role in the search dynamics. However, little is known about how the DNA properties around the motif contribute to the search. Accumulating evidence showing that TF binding sites are embedded within a unique environment, specific to each TF, leads to the hypothesis that the search process is facilitated by favorable DNA features that help to improve the search efficiency. Here, we review the field and present the hypothesis that TF-DNA recognition is dictated not only by the motif, but is also influenced by the environment in which the motif resides.
Affiliation(s)
- Iris Dror
- Department of Biology, Technion - Israel Institute of Technology, Technion City, Haifa, Israel
- Departments of Biological Sciences, Chemistry, Physics, and Computer Science, Molecular and Computational Biology Program, University of Southern California, Los Angeles, CA, USA
| | - Remo Rohs
- Departments of Biological Sciences, Chemistry, Physics, and Computer Science, Molecular and Computational Biology Program, University of Southern California, Los Angeles, CA, USA
| | - Yael Mandel-Gutfreund
- Department of Biology, Technion - Israel Institute of Technology, Technion City, Haifa, Israel
| |
|
30
|
Blech-Hermoni Y, Dasgupta T, Coram RJ, Ladd AN. Identification of Targets of CUG-BP, Elav-Like Family Member 1 (CELF1) Regulation in Embryonic Heart Muscle. PLoS One 2016; 11:e0149061. [PMID: 26866591 PMCID: PMC4750973 DOI: 10.1371/journal.pone.0149061] [Citation(s) in RCA: 24] [Impact Index Per Article: 3.0] [Received: 12/07/2015] [Accepted: 12/30/2015] [Indexed: 01/17/2023] Open
Abstract
CUG-BP, Elav-like family member 1 (CELF1) is a highly conserved RNA binding protein that regulates pre-mRNA alternative splicing, polyadenylation, mRNA stability, and translation. In the heart, CELF1 is expressed in the myocardium, where its levels are tightly regulated during development. CELF1 levels peak in the heart during embryogenesis, and aberrant up-regulation of CELF1 in the adult heart has been implicated in cardiac pathogenesis in myotonic dystrophy type 1, as well as in diabetic cardiomyopathy. Either inhibition of CELF activity or over-expression of CELF1 in heart muscle causes cardiomyopathy in transgenic mice. Nonetheless, many of the cardiac targets of CELF1 regulation remain unknown. In this study, to identify cardiac targets of CELF1 we performed cross-linking immunoprecipitation (CLIP) for CELF1 from embryonic day 8 chicken hearts. We identified a previously unannotated exon in MYH7B as a novel target of CELF1-mediated regulation. We demonstrated that knockdown of CELF1 in primary chicken embryonic cardiomyocytes leads to increased inclusion of this exon and decreased MYH7B levels. We also investigated global changes in the transcriptome of primary embryonic cardiomyocytes following CELF1 knockdown in a published RNA-seq dataset. Pathway and network analyses identified strong associations between CELF1 and regulation of cell cycle and translation. Important regulatory proteins, including both RNA binding proteins and a cardiac transcription factor, were affected by loss of CELF1. Together, these data suggest that CELF1 is a key regulator of cardiomyocyte gene expression.
Affiliation(s)
- Yotam Blech-Hermoni
- Department of Cellular and Molecular Medicine, Lerner Research Institute, Cleveland Clinic, Cleveland, Ohio, United States of America
- Program in Cell Biology, Department of Molecular Biology and Microbiology, Case Western Reserve University School of Medicine, Cleveland, Ohio, United States of America
| | - Twishasri Dasgupta
- Department of Cellular and Molecular Medicine, Lerner Research Institute, Cleveland Clinic, Cleveland, Ohio, United States of America
| | - Ryan J. Coram
- Department of Cellular and Molecular Medicine, Lerner Research Institute, Cleveland Clinic, Cleveland, Ohio, United States of America
| | - Andrea N. Ladd
- Department of Cellular and Molecular Medicine, Lerner Research Institute, Cleveland Clinic, Cleveland, Ohio, United States of America
- Program in Cell Biology, Department of Molecular Biology and Microbiology, Case Western Reserve University School of Medicine, Cleveland, Ohio, United States of America
| |
|
31
|
Stražar M, Žitnik M, Zupan B, Ule J, Curk T. Orthogonal matrix factorization enables integrative analysis of multiple RNA binding proteins. Bioinformatics 2016; 32:1527-35. [PMID: 26787667 PMCID: PMC4894278 DOI: 10.1093/bioinformatics/btw003] [Citation(s) in RCA: 77] [Impact Index Per Article: 9.6] [Received: 07/02/2015] [Accepted: 01/01/2016] [Indexed: 12/15/2022] Open
Abstract
Motivation: RNA binding proteins (RBPs) play important roles in post-transcriptional control of gene expression, including splicing, transport, polyadenylation and RNA stability. To model protein–RNA interactions by considering all available sources of information, it is necessary to integrate the rapidly growing RBP experimental data with the latest genome annotation, gene function, RNA sequence and structure. Such integration is possible by matrix factorization, where current approaches have an undesired tendency to identify only a small number of the strongest patterns with overlapping features. Because protein–RNA interactions are orchestrated by multiple factors, methods that identify discriminative patterns of varying strengths are needed. Results: We have developed an integrative orthogonality-regularized nonnegative matrix factorization (iONMF) to integrate multiple data sources and discover non-overlapping, class-specific RNA binding patterns of varying strengths. The orthogonality constraint halves the effective size of the factor model and outperforms other NMF models in predicting RBP interaction sites on RNA. We have integrated the largest data compendium to date, which includes 31 CLIP experiments on 19 RBPs involved in splicing (such as hnRNPs, U2AF2, ELAVL1, TDP-43 and FUS) and processing of 3’UTR (Ago, IGF2BP). We show that the integration of multiple data sources improves the predictive accuracy of retrieval of RNA binding sites. In our study the key predictive factors of protein–RNA interactions were the position of RNA structure and sequence motifs, RBP co-binding and gene region type. We report on a number of protein-specific patterns, many of which are consistent with experimentally determined properties of RBPs. Availability and implementation: The iONMF implementation and example datasets are available at https://github.com/mstrazar/ionmf. 
Contact: tomaz.curk@fri.uni-lj.si Supplementary information: Supplementary data are available at Bioinformatics online.
Affiliation(s)
- Martin Stražar
- University of Ljubljana, Faculty of Computer and Information Science, Ljubljana, SI 1000, Slovenia
| | - Marinka Žitnik
- University of Ljubljana, Faculty of Computer and Information Science, Ljubljana, SI 1000, Slovenia
| | - Blaž Zupan
- University of Ljubljana, Faculty of Computer and Information Science, Ljubljana, SI 1000, Slovenia
- Department of Molecular and Human Genetics, Baylor College of Medicine, Houston, TX 77030, USA
| | - Jernej Ule
- Department of Molecular Neuroscience, UCL Institute of Neurology, Queen Square, London WC1N 3BG, UK
| | - Tomaž Curk
- University of Ljubljana, Faculty of Computer and Information Science, Ljubljana, SI 1000, Slovenia
| |
|
32
|
mCarts: Genome-Wide Prediction of Clustered Sequence Motifs as Binding Sites for RNA-Binding Proteins. Methods Mol Biol 2016; 1421:215-26. [PMID: 26965268 DOI: 10.1007/978-1-4939-3591-8_17] [Citation(s) in RCA: 15] [Impact Index Per Article: 1.9] [Indexed: 12/19/2022]
Abstract
RNA-binding proteins (RBPs) are critical components of post-transcriptional gene expression regulation. However, their binding sites have until recently been difficult to determine due to the apparent low specificity of RBPs for their target transcripts and the lack of high-throughput assays for analyzing binding sites genome wide. Here we present a bioinformatics method for predicting RBP binding motif sites on a genome-wide scale that leverages motif conservation, RNA secondary structure, and the tendency of RBP binding sites to cluster together. A probabilistic model is learned from bona fide binding sites determined by CLIP and applied genome wide to generate high specificity binding site predictions.
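The tendency of motif sites to cluster can be illustrated with a much simpler procedure than the probabilistic model the abstract describes. The sketch below is not the mCarts method: it only groups motif occurrences by distance, and it ignores conservation and secondary structure. The motif pattern, thresholds, and function names are hypothetical.

```python
import re


def find_motif_sites(seq: str, motif: str) -> list[int]:
    """Start positions of all motif matches, including overlapping ones."""
    # A zero-width lookahead lets the scan report overlapping occurrences.
    return [m.start() for m in re.finditer(f"(?={motif})", seq)]


def cluster_sites(positions: list[int], max_gap: int,
                  min_sites: int) -> list[tuple[int, int, int]]:
    """Group sites whose neighbours lie within max_gap of each other.

    Returns (first_site, last_site, site_count) for every group that
    contains at least min_sites sites.
    """
    clusters: list[tuple[int, int, int]] = []
    current: list[int] = []
    for pos in sorted(positions):
        if current and pos - current[-1] > max_gap:
            if len(current) >= min_sites:
                clusters.append((current[0], current[-1], len(current)))
            current = []
        current.append(pos)
    if len(current) >= min_sites:
        clusters.append((current[0], current[-1], len(current)))
    return clusters


# Hypothetical motif and sequence, for illustration only.
sequence = "TGCTTGCC" + "A" * 20 + "TGCT"
sites = find_motif_sites(sequence, "[CT]GC[CT]")
print(cluster_sites(sites, max_gap=10, min_sites=2))
```

A distance threshold treats every site as equally informative; a learned probabilistic model, as described in the abstract, can instead weight sites by the additional features it is trained on.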
|
33
|
Hu X, Wu Y, Lu ZJ, Yip KY. Analysis of sequencing data for probing RNA secondary structures and protein–RNA binding in studying posttranscriptional regulations. Brief Bioinform 2015; 17:1032-1043. [DOI: 10.1093/bib/bbv106] [Citation(s) in RCA: 4] [Impact Index Per Article: 0.4] [Received: 07/10/2015] [Revised: 10/11/2015] [Indexed: 11/12/2022] Open
|
34
|
Goodwin M, Mohan A, Batra R, Lee KY, Charizanis K, Fernández Gómez FJ, Eddarkaoui S, Sergeant N, Buée L, Kimura T, Clark HB, Dalton J, Takamura K, Weyn-Vanhentenryck SM, Zhang C, Reid T, Ranum LPW, Day JW, Swanson MS. MBNL Sequestration by Toxic RNAs and RNA Misprocessing in the Myotonic Dystrophy Brain. Cell Rep 2015; 12:1159-68. [PMID: 26257173 DOI: 10.1016/j.celrep.2015.07.029] [Citation(s) in RCA: 100] [Impact Index Per Article: 11.1] [Received: 03/24/2015] [Revised: 06/24/2015] [Accepted: 07/14/2015] [Indexed: 11/19/2022] Open
Abstract
For some neurological disorders, disease is primarily RNA mediated due to expression of non-coding microsatellite expansion RNAs (RNA(exp)). Toxicity is thought to result from enhanced binding of proteins to these expansions and depletion from their normal cellular targets. However, experimental evidence for this sequestration model is lacking. Here, we use HITS-CLIP and pre-mRNA processing analysis of human control versus myotonic dystrophy (DM) brains to provide compelling evidence for this RNA toxicity model. MBNL2 binds directly to DM repeat expansions in the brain, resulting in depletion from its normal RNA targets with downstream effects on alternative splicing and polyadenylation. Similar RNA processing defects were detected in Mbnl compound-knockout mice, highlighted by dysregulation of Mapt splicing and fetal tau isoform expression in adults. These results demonstrate that MBNL proteins are directly sequestered by RNA(exp) in the DM brain and introduce a powerful experimental tool to evaluate RNA-mediated toxicity in other expansion diseases.
Affiliation(s)
- Marianne Goodwin
- Department of Molecular Genetics and Microbiology, Center for NeuroGenetics and the Genetics Institute, University of Florida, College of Medicine, Gainesville, FL 32610, USA
| | - Apoorva Mohan
- Department of Molecular Genetics and Microbiology, Center for NeuroGenetics and the Genetics Institute, University of Florida, College of Medicine, Gainesville, FL 32610, USA
| | - Ranjan Batra
- Department of Molecular Genetics and Microbiology, Center for NeuroGenetics and the Genetics Institute, University of Florida, College of Medicine, Gainesville, FL 32610, USA
| | - Kuang-Yung Lee
- Department of Molecular Genetics and Microbiology, Center for NeuroGenetics and the Genetics Institute, University of Florida, College of Medicine, Gainesville, FL 32610, USA; Department of Neurology, Chang Gung Memorial Hospital, Keelung 20401, Taiwan
| | - Konstantinos Charizanis
- Department of Molecular Genetics and Microbiology, Center for NeuroGenetics and the Genetics Institute, University of Florida, College of Medicine, Gainesville, FL 32610, USA; InSiliGen LLC, Gainesville, FL 32606, USA
| | - Francisco José Fernández Gómez
- Inserm UMR S1172, Alzheimer and Tauopathies, Université Lille Nord de France, Centre Jean-Pierre Aubert, 1 Place Verdun, 59045 Lille, France
| | - Sabiha Eddarkaoui
- Inserm UMR S1172, Alzheimer and Tauopathies, Université Lille Nord de France, Centre Jean-Pierre Aubert, 1 Place Verdun, 59045 Lille, France
| | - Nicolas Sergeant
- Inserm UMR S1172, Alzheimer and Tauopathies, Université Lille Nord de France, Centre Jean-Pierre Aubert, 1 Place Verdun, 59045 Lille, France
| | - Luc Buée
- Inserm UMR S1172, Alzheimer and Tauopathies, Université Lille Nord de France, Centre Jean-Pierre Aubert, 1 Place Verdun, 59045 Lille, France
| | - Takashi Kimura
- Division of Neurology, Department of Internal Medicine, Hyogo College of Medicine, Hyogo 663-8501, Japan
| | - H Brent Clark
- Departments of Laboratory Medicine and Pathology, Neurology, Neurosurgery, and Genetics, Cell Biology, and Development, University of Minnesota Medical School, Minneapolis, MN 55455, USA
| | - Joline Dalton
- Departments of Laboratory Medicine and Pathology, Neurology, Neurosurgery, and Genetics, Cell Biology, and Development, University of Minnesota Medical School, Minneapolis, MN 55455, USA
| | - Kenji Takamura
- Departments of Laboratory Medicine and Pathology, Neurology, Neurosurgery, and Genetics, Cell Biology, and Development, University of Minnesota Medical School, Minneapolis, MN 55455, USA
| | - Sebastien M Weyn-Vanhentenryck
- Department of Systems Biology, Department of Biochemistry and Molecular Biophysics, Center for Motor Neuron Biology and Disease, Columbia University, New York, NY 10032, USA
| | - Chaolin Zhang
- Department of Systems Biology, Department of Biochemistry and Molecular Biophysics, Center for Motor Neuron Biology and Disease, Columbia University, New York, NY 10032, USA
| | - Tammy Reid
- Department of Molecular Genetics and Microbiology, Center for NeuroGenetics and the Genetics Institute, University of Florida, College of Medicine, Gainesville, FL 32610, USA
| | - Laura P W Ranum
- Department of Molecular Genetics and Microbiology, Center for NeuroGenetics and the Genetics Institute, University of Florida, College of Medicine, Gainesville, FL 32610, USA
| | - John W Day
- Department of Neurology and Neurological Sciences, School of Medicine, Stanford University, Palo Alto, CA 94305, USA
| | - Maurice S Swanson
- Department of Molecular Genetics and Microbiology, Center for NeuroGenetics and the Genetics Institute, University of Florida, College of Medicine, Gainesville, FL 32610, USA.
|
35
|
Raj B, Blencowe B. Alternative Splicing in the Mammalian Nervous System: Recent Insights into Mechanisms and Functional Roles. Neuron 2015; 87:14-27. [DOI: 10.1016/j.neuron.2015.05.004] [Citation(s) in RCA: 329] [Impact Index Per Article: 36.6] [Indexed: 02/07/2023]
|
36
|
Wang T, Xiao G, Chu Y, Zhang MQ, Corey DR, Xie Y. Design and bioinformatics analysis of genome-wide CLIP experiments. Nucleic Acids Res 2015; 43:5263-74. [PMID: 25958398 PMCID: PMC4477666 DOI: 10.1093/nar/gkv439] [Citation(s) in RCA: 62] [Impact Index Per Article: 6.9] [Received: 12/16/2014] [Accepted: 04/23/2015] [Indexed: 01/05/2023] Open
Abstract
The past decades have witnessed a surge of discoveries revealing RNA regulation as a central player in cellular processes. RNAs are regulated by RNA-binding proteins (RBPs) at all post-transcriptional stages, including splicing, transportation, stabilization and translation. Defects in the functions of these RBPs underlie a broad spectrum of human pathologies. Systematic identification of RBP functional targets is among the key biomedical research questions and provides a new direction for drug discovery. The advent of cross-linking immunoprecipitation coupled with high-throughput sequencing (genome-wide CLIP) technology has recently enabled the investigation of genome-wide RBP–RNA binding at single base-pair resolution. This technology has evolved through the development of three distinct versions: HITS-CLIP, PAR-CLIP and iCLIP. Meanwhile, numerous bioinformatics pipelines for handling the genome-wide CLIP data have also been developed. In this review, we discuss the genome-wide CLIP technology and focus on bioinformatics analysis. Specifically, we compare the strengths and weaknesses, as well as the scopes, of various bioinformatics tools. To assist readers in choosing optimal procedures for their analysis, we also review experimental design and procedures that affect bioinformatics analyses.
Affiliation(s)
- Tao Wang
- Quantitative Biomedical Research Center, University of Texas Southwestern Medical Center, 5323 Harry Hines Boulevard, Dallas, TX 75390, USA
| | - Guanghua Xiao
- Quantitative Biomedical Research Center, University of Texas Southwestern Medical Center, 5323 Harry Hines Boulevard, Dallas, TX 75390, USA
| | - Yongjun Chu
- Departments of Pharmacology and Biochemistry, University of Texas Southwestern Medical Center, 5323 Harry Hines Boulevard, Dallas, TX 75390, USA
| | - Michael Q Zhang
- Department of Biological Sciences, Center for Systems Biology, The University of Texas at Dallas, Richardson, TX 75080, USA Bioinformatics Division, Center for Synthetic and System Biology, TNLIST, Tsinghua University, Beijing 100084, China
| | - David R Corey
- Departments of Pharmacology and Biochemistry, University of Texas Southwestern Medical Center, 5323 Harry Hines Boulevard, Dallas, TX 75390, USA
| | - Yang Xie
- Quantitative Biomedical Research Center, University of Texas Southwestern Medical Center, 5323 Harry Hines Boulevard, Dallas, TX 75390, USA Harold C. Simmons Comprehensive Cancer Center, University of Texas Southwestern Medical Center, 5323 Harry Hines Boulevard, Dallas, TX 75390, USA
|
37
|
Bahrami-Samani E, Vo DT, de Araujo PR, Vogel C, Smith AD, Penalva LOF, Uren PJ. Computational challenges, tools, and resources for analyzing co- and post-transcriptional events in high throughput. Wiley Interdisciplinary Reviews. RNA 2015; 6:291-310. [PMID: 25515586 PMCID: PMC4397117 DOI: 10.1002/wrna.1274] [Citation(s) in RCA: 14] [Impact Index Per Article: 1.6] [Received: 06/16/2014] [Revised: 10/24/2014] [Accepted: 10/29/2014] [Indexed: 11/10/2022]
Abstract
Co- and post-transcriptional regulation of gene expression is complex and multifaceted, spanning the complete RNA lifecycle from genesis to decay. High-throughput profiling of the constituent events and processes is achieved through a range of technologies that continue to expand and evolve. Fully leveraging the resulting data is nontrivial, and requires the use of computational methods and tools carefully crafted for specific data sources and often intended to probe particular biological processes. Drawing upon databases of information pre-compiled by other researchers can further elevate analyses. Within this review, we describe the major co- and post-transcriptional events in the RNA lifecycle that are amenable to high-throughput profiling. We place specific emphasis on the analysis of the resulting data, in particular the computational tools and resources available, as well as looking toward future challenges that remain to be addressed.
Affiliation(s)
- Emad Bahrami-Samani
- Molecular and Computational Biology, Department of Biological Sciences, University of Southern California, Los Angeles, CA
| | - Dat T. Vo
- Children’s Cancer Research Institute and Department of Cellular and Structural Biology, University of Texas Health Science Center, San Antonio, TX
| | - Patricia Rosa de Araujo
- Children’s Cancer Research Institute and Department of Cellular and Structural Biology, University of Texas Health Science Center, San Antonio, TX
| | - Christine Vogel
- Center for Genomics and Systems Biology, Department of Biology, New York University, New York, NY
| | - Andrew D. Smith
- Molecular and Computational Biology, Department of Biological Sciences, University of Southern California, Los Angeles, CA
| | - Luiz O. F. Penalva
- Children’s Cancer Research Institute and Department of Cellular and Structural Biology, University of Texas Health Science Center, San Antonio, TX
| | - Philip J. Uren
- Molecular and Computational Biology, Department of Biological Sciences, University of Southern California, Los Angeles, CA
|
38
|
de Klerk E, 't Hoen PAC. Alternative mRNA transcription, processing, and translation: insights from RNA sequencing. Trends Genet 2015; 31:128-39. [PMID: 25648499 DOI: 10.1016/j.tig.2015.01.001] [Citation(s) in RCA: 226] [Impact Index Per Article: 25.1] [Received: 11/01/2014] [Revised: 12/22/2014] [Accepted: 01/05/2015] [Indexed: 12/13/2022]
Abstract
The human transcriptome comprises >80,000 protein-coding transcripts and the estimated number of proteins synthesized from these transcripts is in the range of 250,000 to 1 million. These transcripts and proteins are encoded by less than 20,000 genes, suggesting extensive regulation at the transcriptional, post-transcriptional, and translational level. Here we review how RNA sequencing (RNA-seq) technologies have increased our understanding of the mechanisms that give rise to alternative transcripts and their alternative translation. We highlight four different regulatory processes: alternative transcription initiation, alternative splicing, alternative polyadenylation, and alternative translation initiation. We discuss their transcriptome-wide distribution, their impact on protein expression, their biological relevance, and the possible molecular mechanisms leading to their alternative regulation. We conclude with a discussion of the coordination and the interdependence of these four regulatory layers.
Affiliation(s)
- Eleonora de Klerk
- Department of Human Genetics, Leiden University Medical Center, Leiden, The Netherlands
| | - Peter A C 't Hoen
- Department of Human Genetics, Leiden University Medical Center, Leiden, The Netherlands.
|
39
|
Nicastro G, Taylor IA, Ramos A. KH-RNA interactions: back in the groove. Curr Opin Struct Biol 2015; 30:63-70. [PMID: 25625331 DOI: 10.1016/j.sbi.2015.01.002] [Citation(s) in RCA: 98] [Impact Index Per Article: 10.9] [Received: 10/28/2014] [Revised: 12/20/2014] [Accepted: 01/08/2015] [Indexed: 12/30/2022]
Abstract
The hnRNP K-homology (KH) domain is a single stranded nucleic acid binding domain that mediates RNA target recognition by a large group of gene regulators. The structure of the KH fold is well characterised and some initial rules for KH-RNA recognition have been drafted. However, recent findings have shown that these rules need to be revisited and have now provided a better understanding of how the domain can recognise a sequence landscape larger than previously thought as well as revealing the diversity of structural expansions to the KH domain. Finally, novel structural and functional data show how multiple KH domains act in a combinatorial fashion to both allow recognition of longer RNA motifs and remodelling of the RNA structure. These advances set the scene for a detailed molecular understanding of KH selection of the cellular targets.
Affiliation(s)
- Giuseppe Nicastro
- Division of Molecular Structure, MRC National Institute for Medical Research, London, UK
| | - Ian A Taylor
- Division of Molecular Structure, MRC National Institute for Medical Research, London, UK
| | - Andres Ramos
- Research Department of Structural and Molecular Biology, University College London, London, UK; Division of Molecular Structure, MRC National Institute for Medical Research, London, UK.
|
40
|
Coelho MB, Attig J, Bellora N, König J, Hallegger M, Kayikci M, Eyras E, Ule J, Smith CWJ. Nuclear matrix protein Matrin3 regulates alternative splicing and forms overlapping regulatory networks with PTB. EMBO J 2015; 34:653-68. [PMID: 25599992 PMCID: PMC4365034 DOI: 10.15252/embj.201489852] [Citation(s) in RCA: 109] [Impact Index Per Article: 12.1] [Indexed: 12/28/2022] Open
Abstract
Matrin3 is an RNA- and DNA-binding nuclear matrix protein found to be associated with neural and muscular degenerative diseases. A number of possible functions of Matrin3 have been suggested, but no widespread role in RNA metabolism has yet been clearly demonstrated. We identified Matrin3 by its interaction with the second RRM domain of the splicing regulator PTB. Using a combination of RNAi knockdown, transcriptome profiling and iCLIP, we find that Matrin3 is a regulator of hundreds of alternative splicing events, principally acting as a splicing repressor with only a small proportion of targeted events being co-regulated by PTB. In contrast to other splicing regulators, Matrin3 binds to an extended region within repressed exons and flanking introns with no sharply defined peaks. The identification of this clear molecular function of Matrin3 should help to clarify the molecular pathology of ALS and other diseases caused by mutations of Matrin3.
Affiliation(s)
- Miguel B Coelho
- Department of Biochemistry, University of Cambridge, Cambridge, UK
| | - Jan Attig
- Department of Molecular Neuroscience, UCL Institute of Neurology, London, UK MRC-Laboratory of Molecular Biology, Cambridge, UK
| | - Nicolás Bellora
- Computational Genomics, Universitat Pompeu Fabra, Barcelona, Spain Catalan Institute for Research and Advanced Studies (ICREA), Barcelona, Spain INIBIOMA CONICET-UNComahue, Bariloche, Argentina
| | - Julian König
- MRC-Laboratory of Molecular Biology, Cambridge, UK
| | - Martina Hallegger
- Department of Biochemistry, University of Cambridge, Cambridge, UK Department of Molecular Neuroscience, UCL Institute of Neurology, London, UK
| | | | - Eduardo Eyras
- Computational Genomics, Universitat Pompeu Fabra, Barcelona, Spain Catalan Institute for Research and Advanced Studies (ICREA), Barcelona, Spain
| | - Jernej Ule
- Department of Molecular Neuroscience, UCL Institute of Neurology, London, UK
|
41
|
Abstract
RNA-binding proteins (RBPs) are important regulators of eukaryotic gene expression. Genomes typically encode dozens to hundreds of proteins containing RNA-binding domains, which collectively recognize diverse RNA sequences and structures. Recent advances in high-throughput methods for assaying the targets of RBPs in vitro and in vivo allow large-scale derivation of RNA-binding motifs as well as determination of RNA–protein interactions in living cells. In parallel, many computational methods have been developed to analyze and interpret these data. The interplay between RNA secondary structure and RBP binding has also been a growing theme. Integrating RNA–protein interaction data with observations of post-transcriptional regulation will enhance our understanding of the roles of these important proteins.
|
42
|
Bahrami-Samani E, Penalva LOF, Smith AD, Uren PJ. Leveraging cross-link modification events in CLIP-seq for motif discovery. Nucleic Acids Res 2014; 43:95-103. [PMID: 25505146 PMCID: PMC4288180 DOI: 10.1093/nar/gku1288] [Citation(s) in RCA: 35] [Impact Index Per Article: 3.5] [Indexed: 12/21/2022] Open
Abstract
High-throughput protein-RNA interaction data generated by CLIP-seq has provided an unprecedented depth of access to the activities of RNA-binding proteins (RBPs), the key players in co- and post-transcriptional regulation of gene expression. Motif discovery forms part of the necessary follow-up data analysis for CLIP-seq, both to refine the exact locations of RBP binding sites, and to characterize them. The specific properties of RBP binding sites, and the CLIP-seq methods, provide additional information not usually present in the classic motif discovery problem: the binding site structure, and cross-linking induced events in reads. We show that CLIP-seq data contains clear secondary structure signals, as well as technology- and RBP-specific cross-link signals. We introduce Zagros, a motif discovery algorithm specifically designed to leverage this information and explore its impact on the quality of recovered motifs. Our results indicate that using both secondary structure and cross-link modifications can greatly improve motif discovery on CLIP-seq data. Further, the motifs we recover provide insight into the balance between sequence- and structure-specificity struck by RBP binding.
Affiliation(s)
- Emad Bahrami-Samani
- Molecular and Computational Biology, University of Southern California, Los Angeles, CA 90089, USA
| | - Luiz O F Penalva
- Children's Cancer Research Institute and Department of Cellular and Structural Biology, University of Texas Health Science Center, San Antonio, TX 78229, USA
| | - Andrew D Smith
- Molecular and Computational Biology, University of Southern California, Los Angeles, CA 90089, USA
| | - Philip J Uren
- Molecular and Computational Biology, University of Southern California, Los Angeles, CA 90089, USA
|
43
|
Reyes-Herrera PH, Ficarra E. Computational Methods for CLIP-seq Data Processing. Bioinform Biol Insights 2014; 8:199-207. [PMID: 25336930 PMCID: PMC4196881 DOI: 10.4137/bbi.s16803] [Citation(s) in RCA: 17] [Impact Index Per Article: 1.7] [Received: 05/11/2014] [Revised: 07/29/2014] [Accepted: 08/01/2014] [Indexed: 12/25/2022] Open
Abstract
RNA-binding proteins (RBPs) are at the core of post-transcriptional regulation and thus of gene expression control at the RNA level. One of the principal challenges in the field of gene expression regulation is to understand the RBPs' mechanism of action. As a result of the recent evolution of experimental techniques, it is now possible to obtain the RNA regions recognized by RBPs on a transcriptome-wide scale. In fact, CLIP-seq protocols use the joint action of CLIP (crosslinking immunoprecipitation) and high-throughput sequencing to recover the transcriptome-wide set of interaction regions for a particular protein. Nevertheless, computational methods are necessary to process CLIP-seq experimental data and are key to advancing the understanding of gene regulatory mechanisms. Considering the importance of computational methods in this area, we present a review of the current status of computational approaches used and proposed for CLIP-seq data.
Affiliation(s)
- Paula H Reyes-Herrera
- Facultad de Ingeniería Electrónica y Biomédica, Universidad Antonio Nariño, Bogotá, Colombia
| | - Elisa Ficarra
- Department of Control and Computer Engineering, Politecnico di Torino, TO, Italy
|
44
|
Mickleburgh I, Kafasla P, Cherny D, Llorian M, Curry S, Jackson RJ, Smith CWJ. The organization of RNA contacts by PTB for regulation of FAS splicing. Nucleic Acids Res 2014; 42:8605-20. [PMID: 24957602 PMCID: PMC4117754 DOI: 10.1093/nar/gku519] [Citation(s) in RCA: 15] [Impact Index Per Article: 1.5] [Indexed: 11/14/2022] Open
Abstract
Post-transcriptional steps of gene expression are regulated by RNA binding proteins. Major progress has been made in characterizing RNA-protein interactions, from high resolution structures to transcriptome-wide profiling. Due to the inherent technical challenges, less attention has been paid to the way in which proteins with multiple RNA binding domains engage with target RNAs. We have investigated how the four RNA recognition motif (RRM) domains of Polypyrimidine tract binding (PTB) protein, a major splicing regulator, interact with FAS pre-mRNA under conditions in which PTB represses FAS exon 6 splicing. A combination of tethered hydroxyl radical probing, targeted inactivation of individual RRMs and single molecule analyses revealed an unequal division of labour between the four RRMs of PTB. RNA binding by RRM4 is the most important for function despite the low intrinsic binding specificity and the complete lack of effect of disrupting individual RRM4 contact points on the RNA. The ordered RRM3-4 di-domain packing provides an extended binding surface for RNA interacting at RRM4, via basic residues in the preceding linker. Our results illustrate how multiple alternative low-specificity binding configurations of RRM4 are consistent with repressor function as long as the overall ribonucleoprotein architecture provided by appropriate di-domain packing is maintained.
Affiliation(s)
- Ian Mickleburgh
- Department of Biochemistry, University of Cambridge, Downing Site, Tennis Court Road, Cambridge, CB2 1QW, UK
| | - Panagiota Kafasla
- Department of Biochemistry, University of Cambridge, Downing Site, Tennis Court Road, Cambridge, CB2 1QW, UK
| | - Dmitry Cherny
- Department of Biochemistry, Henry Wellcome Building, University of Leicester, Lancaster Road, Leicester LE1 9HN, UK
| | - Miriam Llorian
- Department of Biochemistry, University of Cambridge, Downing Site, Tennis Court Road, Cambridge, CB2 1QW, UK
| | - Stephen Curry
- Division of Cell and Molecular Biology, Imperial College, Exhibition Road, London SW7 2AZ, UK
| | - Richard J Jackson
- Department of Biochemistry, University of Cambridge, Downing Site, Tennis Court Road, Cambridge, CB2 1QW, UK
| | - Christopher W J Smith
- Department of Biochemistry, University of Cambridge, Downing Site, Tennis Court Road, Cambridge, CB2 1QW, UK
|
45
|
Kloetgen A, Münch PC, Borkhardt A, Hoell JI, McHardy AC. Biochemical and bioinformatic methods for elucidating the role of RNA-protein interactions in posttranscriptional regulation. Brief Funct Genomics 2014; 14:102-14. [PMID: 24951655 PMCID: PMC4471435 DOI: 10.1093/bfgp/elu020] [Citation(s) in RCA: 9] [Impact Index Per Article: 0.9] [Indexed: 01/03/2023] Open
Abstract
Our understanding of transcriptional gene regulation has dramatically increased over the past decades, and many regulators of gene expression, such as transcription factors, have been analyzed extensively. Additionally, in recent years, deeper insights into the physiological roles of RNA have been obtained. More precisely, splicing, polyadenylation, various modifications, localization and the translation of messenger RNAs (mRNAs) are regulated by their interaction with RNA-binding proteins (RBPs). New technologies now enable the analysis of this regulation at different levels. A technique known as ultraviolet (UV) cross-linking and immunoprecipitation (CLIP) allows us to determine physical protein–RNA interactions on a genome-wide scale. UV cross-linking introduces covalent bonds between interacting RBPs and RNAs. In combination with immunoprecipitation and deep sequencing techniques, tens of millions of short reads (representing bound RNAs by an RBP of interest) are generated and are used to characterize the regulatory network mediated by an RBP. Other methods, such as mass spectrometry, can also be used for characterization of cross-linked RBPs and RNAs instead of CLIP methods. In this review, we discuss experimental and computational methods for the generation and analysis of CLIP data. The computational methods include short-read alignment, annotation and RNA-binding motif discovery. We describe the challenges of analyzing CLIP data and indicate areas where improvements are needed.
Affiliation(s)
| | | | | | | | - Alice C McHardy
- Corresponding author. Alice C. McHardy, Heinrich-Heine University, Department of Algorithmic Bioinformatics, Universitaetsstrasse 1, 40225 Duesseldorf, Germany. Tel.: +49-211-8110427; Fax: +49-211-8113464; E-mail:
|
46
|
Biswas A, Brown CM. Scan for Motifs: a webserver for the analysis of post-transcriptional regulatory elements in the 3' untranslated regions (3' UTRs) of mRNAs. BMC Bioinformatics 2014; 15:174. [PMID: 24909639 PMCID: PMC4067372 DOI: 10.1186/1471-2105-15-174] [Citation(s) in RCA: 14] [Impact Index Per Article: 1.4] [Received: 02/10/2014] [Accepted: 05/16/2014] [Indexed: 11/21/2022] Open
Abstract
Background: Gene expression in vertebrate cells may be controlled post-transcriptionally through regulatory elements in mRNAs. These are usually located in the untranslated regions (UTRs) of mRNA sequences, particularly the 3′ UTRs. Results: Scan for Motifs (SFM) simplifies the process of identifying a wide range of regulatory elements on alignments of vertebrate 3′ UTRs. SFM includes identification of both RNA Binding Protein (RBP) sites and targets of miRNAs. In addition to searching pre-computed alignments, the tool provides users the flexibility to search their own sequences or alignments. The regulatory elements may be filtered by expected value cutoffs and are cross-referenced back to their respective sources and literature. The output is an interactive graphical representation, highlighting potential regulatory elements and overlaps between them. The output also provides simple statistics and links to related resources for complementary analyses. The overall process is intuitive and fast. As SFM is a free web-application, the user does not need to install any software or databases. Conclusions: Visualisation of the binding sites of different classes of effectors that bind to 3′ UTRs will facilitate the study of regulatory elements in 3′ UTRs.
Affiliation(s)
| | - Chris M Brown
- Department of Biochemistry, Genetics Otago, University of Otago, Dunedin, New Zealand.
|
47
|
Paz I, Kosti I, Ares M, Cline M, Mandel-Gutfreund Y. RBPmap: a web server for mapping binding sites of RNA-binding proteins. Nucleic Acids Res 2014; 42:W361-7. [PMID: 24829458 PMCID: PMC4086114 DOI: 10.1093/nar/gku406] [Citation(s) in RCA: 356] [Impact Index Per Article: 35.6] [Indexed: 01/17/2023] Open
Abstract
Regulation of gene expression is executed in many cases by RNA-binding proteins (RBPs) that bind to mRNAs as well as to non-coding RNAs. RBPs recognize their RNA target via specific binding sites on the RNA. Predicting the binding sites of RBPs is known to be a major challenge. We present a new webserver, RBPmap, freely accessible through the website http://rbpmap.technion.ac.il/ for accurate prediction and mapping of RBP binding sites. RBPmap has been developed specifically for mapping RBPs in human, mouse and Drosophila melanogaster genomes, though it supports other organisms too. RBPmap enables the users to select motifs from a large database of experimentally defined motifs. In addition, users can provide any motif of interest, given as either a consensus or a PSSM. The algorithm for mapping the motifs is based on a Weighted-Rank approach, which considers the clustering propensity of the binding sites and the overall tendency of regulatory regions to be conserved. In addition, RBPmap incorporates a position-specific background model, designed uniquely for different genomic regions, such as splice sites, 5’ and 3’ UTRs, non-coding RNA and intergenic regions. RBPmap was tested on high-throughput RNA-binding experiments and was proved to be highly accurate.
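As an illustration of what supplying a motif "as a PSSM" means in practice, the following is a minimal sketch of sliding-window log-odds scoring. It is not RBPmap's Weighted-Rank algorithm, which additionally accounts for site clustering, conservation, and a region-specific background. The matrix `TOY_PROBS`, the uniform background of 0.25, and the function names are all hypothetical and chosen only for the example.

```python
import math

BASES = "ACGU"

# Hypothetical 4-position probability matrix with consensus UGCA (illustration only).
TOY_PROBS = [
    {"A": 0.1, "C": 0.1, "G": 0.1, "U": 0.7},
    {"A": 0.1, "C": 0.1, "G": 0.7, "U": 0.1},
    {"A": 0.1, "C": 0.7, "G": 0.1, "U": 0.1},
    {"A": 0.7, "C": 0.1, "G": 0.1, "U": 0.1},
]


def log_odds_pssm(prob_rows, background=0.25):
    """Convert per-position base probabilities into log2-odds scores
    against an assumed uniform background."""
    return [{b: math.log2(row[b] / background) for b in BASES} for row in prob_rows]


def scan(seq, pssm):
    """Score every window of len(pssm) bases; return (start, score) pairs.
    DNA input is accepted (T is read as U); other symbols raise KeyError."""
    seq = seq.upper().replace("T", "U")
    width = len(pssm)
    return [
        (i, sum(pssm[j][seq[i + j]] for j in range(width)))
        for i in range(len(seq) - width + 1)
    ]


pssm = log_odds_pssm(TOY_PROBS)
hits = scan("AAUGCAUU", pssm)
best = max(hits, key=lambda hit: hit[1])  # highest-scoring window
```

Here `best` is the window that matches the toy consensus most closely; a real analysis would compare scores against a background distribution rather than simply taking the maximum.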
Affiliation(s)
- Inbal Paz
- Department of Biology, Technion - Israel Institute of Technology, Technion City, Haifa 32000, Israel
| | - Idit Kosti
- Department of Biology, Technion - Israel Institute of Technology, Technion City, Haifa 32000, Israel
| | - Manuel Ares
- Department of Molecular, Cellular and Developmental Biology, UCSC, Santa Cruz, CA, USA
| | - Melissa Cline
- Center for Biomolecular Science & Engineering, UCSC, Santa Cruz, CA, USA
| | - Yael Mandel-Gutfreund
- Department of Biology, Technion - Israel Institute of Technology, Technion City, Haifa 32000, Israel
|
48
|
Systematic identification of regulatory elements in conserved 3' UTRs of human transcripts. Cell Rep 2014; 7:281-92. [PMID: 24656821 DOI: 10.1016/j.celrep.2014.03.001] [Citation(s) in RCA: 48] [Impact Index Per Article: 4.8] [Received: 12/24/2013] [Revised: 02/03/2014] [Accepted: 03/03/2014] [Indexed: 11/21/2022] Open
Abstract
Posttranscriptional regulatory programs governing diverse aspects of RNA biology remain largely uncharacterized. Understanding the functional roles of RNA cis-regulatory elements is essential for decoding complex programs that underlie the dynamic regulation of transcript stability, splicing, localization, and translation. Here, we describe a combined experimental/computational technology to reveal a catalog of functional regulatory elements embedded in 3' UTRs of human transcripts. We used a bidirectional reporter system coupled with flow cytometry and high-throughput sequencing to measure the effect of short, noncoding, vertebrate-conserved RNA sequences on transcript stability and translation. Information-theoretic motif analysis of the resulting sequence-to-gene-expression mapping revealed linear and structural RNA cis-regulatory elements that positively and negatively modulate the posttranscriptional fates of human transcripts. This combined experimental/computational strategy can be used to systematically characterize the vast landscape of posttranscriptional regulatory elements controlling physiological and pathological cellular state transitions.
|
49
|
HITS-CLIP and integrative modeling define the Rbfox splicing-regulatory network linked to brain development and autism. Cell Rep 2014; 6:1139-1152. [PMID: 24613350 DOI: 10.1016/j.celrep.2014.02.005] [Citation(s) in RCA: 243] [Impact Index Per Article: 24.3] [Received: 09/22/2013] [Revised: 11/30/2013] [Accepted: 02/04/2014] [Indexed: 12/31/2022] Open
Abstract
The RNA binding proteins Rbfox1/2/3 regulate alternative splicing in the nervous system, and disruption of Rbfox1 has been implicated in autism. However, comprehensive identification of functional Rbfox targets has been challenging. Here, we perform HITS-CLIP for all three Rbfox family members in order to globally map, at a single-nucleotide resolution, their in vivo RNA interaction sites in the mouse brain. We find that the two guanines in the Rbfox binding motif UGCAUG are critical for protein-RNA interactions and crosslinking. Using integrative modeling, these interaction sites, combined with additional datasets, define 1,059 direct Rbfox target alternative splicing events. Over half of the quantifiable targets show dynamic changes during brain development. Of particular interest are 111 events from 48 candidate autism-susceptibility genes, including syndromic autism genes Shank3, Cacna1c, and Tsc2. Alteration of Rbfox targets in some autistic brains is correlated with downregulation of all three Rbfox proteins, supporting the potential clinical relevance of the splicing-regulatory network.
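The UGCAUG motif named in this abstract can be located in a sequence with a simple exact-match scan. The sketch below is only an illustration of motif lookup and is not the HITS-CLIP or integrative-modeling analysis described by the authors; the function name `find_motif_sites` and the example sequence are hypothetical.

```python
import re

RBFOX_MOTIF = "UGCAUG"  # motif named in the abstract above


def find_motif_sites(rna, motif=RBFOX_MOTIF):
    """Return 0-based start positions of every motif occurrence,
    including overlapping ones. DNA input is accepted (T is read as U)."""
    seq = rna.upper().replace("T", "U")
    # A zero-width lookahead lets the regex report overlapping matches.
    pattern = re.compile("(?=" + re.escape(motif) + ")")
    return [m.start() for m in pattern.finditer(seq)]


# Made-up sequence mixing RNA and lower-case DNA notation.
sites = find_motif_sites("AAUGCAUGCCtgcatgUU")
```

Exact matching finds candidate sites only; whether a site is bound in vivo is what the crosslinking data in the study address.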
|
50
|
Cereda M, Pozzoli U, Rot G, Juvan P, Schweitzer A, Clark T, Ule J. RNAmotifs: prediction of multivalent RNA motifs that control alternative splicing. Genome Biol 2014; 15:R20. [PMID: 24485098 PMCID: PMC4054596 DOI: 10.1186/gb-2014-15-1-r20] [Citation(s) in RCA: 42] [Impact Index Per Article: 4.2] [Received: 09/01/2013] [Accepted: 01/31/2014] [Indexed: 12/16/2022] Open
Abstract
RNA-binding proteins (RBPs) regulate splicing according to position-dependent principles, which can be exploited for analysis of regulatory motifs. Here we present RNAmotifs, a method that evaluates the sequence around differentially regulated alternative exons to identify clusters of short and degenerate sequences, referred to as multivalent RNA motifs. We show that diverse RBPs share basic positional principles, but differ in their propensity to enhance or repress exon inclusion. We assess exons differentially spliced between brain and heart, identifying known and new regulatory motifs, and predict the expression pattern of RBPs that bind these motifs. RNAmotifs is available at https://bitbucket.org/rogrro/rna_motifs.
|