51. Esposito D, Weile J, Shendure J, Starita LM, Papenfuss AT, Roth FP, Fowler DM, Rubin AF. MaveDB: an open-source platform to distribute and interpret data from multiplexed assays of variant effect. Genome Biol 2019; 20:223. [PMID: 31679514 PMCID: PMC6827219 DOI: 10.1186/s13059-019-1845-6] [Citation(s) in RCA: 138] [Impact Index Per Article: 23.0] [Received: 03/31/2019] [Accepted: 10/01/2019] [Indexed: 11/10/2022] Open
Abstract
Multiplex assays of variant effect (MAVEs), such as deep mutational scans and massively parallel reporter assays, test thousands of sequence variants in a single experiment. Despite the importance of MAVE data for basic and clinical research, there is no standard resource for their discovery and distribution. Here, we present MaveDB (https://www.mavedb.org), a public repository for large-scale measurements of sequence variant impact, designed for interoperability with applications to interpret these datasets. We also describe the first such application, MaveVis, which retrieves, visualizes, and contextualizes variant effect maps. Together, the database and applications will empower the community to mine these powerful datasets.
Affiliation(s)
- Daniel Esposito
  - Bioinformatics Division, The Walter and Eliza Hall Institute of Medical Research, Parkville, VIC, Australia
- Jochen Weile
  - The Donnelly Centre, University of Toronto, Toronto, ON, Canada
  - Lunenfeld-Tanenbaum Research Institute, Sinai Health System, Toronto, ON, Canada
  - Department of Molecular Genetics, University of Toronto, Toronto, ON, Canada
  - Department of Computer Science, University of Toronto, Toronto, ON, Canada
- Jay Shendure
  - Department of Genome Sciences, University of Washington, Seattle, WA, USA
  - Brotman Baty Institute for Precision Medicine, Seattle, WA, USA
  - Howard Hughes Medical Institute, University of Washington, Seattle, WA, USA
- Lea M Starita
  - Department of Genome Sciences, University of Washington, Seattle, WA, USA
  - Brotman Baty Institute for Precision Medicine, Seattle, WA, USA
- Anthony T Papenfuss
  - Bioinformatics Division, The Walter and Eliza Hall Institute of Medical Research, Parkville, VIC, Australia
  - Department of Medical Biology, University of Melbourne, Melbourne, VIC, Australia
  - Bioinformatics and Cancer Genomics Laboratory, Peter MacCallum Cancer Centre, Melbourne, VIC, Australia
  - Sir Peter MacCallum Department of Oncology, University of Melbourne, Melbourne, VIC, Australia
  - Department of Mathematics and Statistics, University of Melbourne, Melbourne, VIC, Australia
- Frederick P Roth
  - The Donnelly Centre, University of Toronto, Toronto, ON, Canada
  - Lunenfeld-Tanenbaum Research Institute, Sinai Health System, Toronto, ON, Canada
  - Department of Molecular Genetics, University of Toronto, Toronto, ON, Canada
  - Department of Computer Science, University of Toronto, Toronto, ON, Canada
  - Canadian Institute for Advanced Research, Toronto, ON, Canada
- Douglas M Fowler
  - Department of Genome Sciences, University of Washington, Seattle, WA, USA
  - Canadian Institute for Advanced Research, Toronto, ON, Canada
  - Department of Bioengineering, University of Washington, Seattle, WA, USA
- Alan F Rubin
  - Bioinformatics Division, The Walter and Eliza Hall Institute of Medical Research, Parkville, VIC, Australia
  - Department of Medical Biology, University of Melbourne, Melbourne, VIC, Australia
  - Bioinformatics and Cancer Genomics Laboratory, Peter MacCallum Cancer Centre, Melbourne, VIC, Australia
52. Vainberg Slutskin I, Weinberger A, Segal E. Sequence determinants of polyadenylation-mediated regulation. Genome Res 2019; 29:1635-1647. [PMID: 31530582 PMCID: PMC6771402 DOI: 10.1101/gr.247312.118] [Citation(s) in RCA: 11] [Impact Index Per Article: 1.8] [Received: 12/09/2018] [Accepted: 08/13/2019] [Indexed: 12/31/2022]
Abstract
The cleavage and polyadenylation reaction is a crucial step in transcription termination and pre-mRNA maturation in human cells. Despite extensive research, the encoding of polyadenylation-mediated regulation of gene expression within the DNA sequence is not well understood. Here, we utilized a massively parallel reporter assay to inspect the effect of over 12,000 rationally designed polyadenylation sequences (PASs) on reporter gene expression and cleavage efficiency. We find that the PAS sequence can modulate gene expression by over five orders of magnitude. By using a uniquely designed scanning mutagenesis data set, we gain mechanistic insight into various modes of action by which the cleavage efficiency affects the sensitivity or robustness of the PAS to mutation. Furthermore, we employ motif discovery to identify both known and novel sequence motifs associated with PAS-mediated regulation. By leveraging the large scale of our data, we train a deep learning model for the highly accurate prediction of RNA levels from DNA sequence alone (R = 0.83). Moreover, we devise unique approaches for predicting exact cleavage sites for our reporter constructs and for endogenous transcripts. Taken together, our results expand our understanding of PAS-mediated regulation, and provide an unprecedented resource for analyzing and predicting PAS for regulatory genomics applications.
Affiliation(s)
- Ilya Vainberg Slutskin
  - Department of Computer Science and Applied Mathematics, Weizmann Institute of Science, Rehovot 7610001, Israel
  - Department of Molecular Cell Biology, Weizmann Institute of Science, Rehovot 7610001, Israel
- Adina Weinberger
  - Department of Computer Science and Applied Mathematics, Weizmann Institute of Science, Rehovot 7610001, Israel
  - Department of Molecular Cell Biology, Weizmann Institute of Science, Rehovot 7610001, Israel
- Eran Segal
  - Department of Computer Science and Applied Mathematics, Weizmann Institute of Science, Rehovot 7610001, Israel
  - Department of Molecular Cell Biology, Weizmann Institute of Science, Rehovot 7610001, Israel
53. Vejnar CE, Abdel Messih M, Takacs CM, Yartseva V, Oikonomou P, Christiano R, Stoeckius M, Lau S, Lee MT, Beaudoin JD, Musaev D, Darwich-Codore H, Walther TC, Tavazoie S, Cifuentes D, Giraldez AJ. Genome wide analysis of 3' UTR sequence elements and proteins regulating mRNA stability during maternal-to-zygotic transition in zebrafish. Genome Res 2019; 29:1100-1114. [PMID: 31227602 PMCID: PMC6633259 DOI: 10.1101/gr.245159.118] [Citation(s) in RCA: 44] [Impact Index Per Article: 7.3] [Received: 10/16/2018] [Accepted: 06/07/2019] [Indexed: 12/16/2022]
Abstract
Posttranscriptional regulation plays a crucial role in shaping gene expression. During the maternal-to-zygotic transition (MZT), thousands of maternal transcripts are regulated. However, how different cis-elements and trans-factors are integrated to determine mRNA stability remains poorly understood. Here, we show that most transcripts are under combinatorial regulation by multiple decay pathways during zebrafish MZT. By using a massively parallel reporter assay, we identified cis-regulatory sequences in the 3' UTR, including U-rich motifs that are associated with increased mRNA stability. In contrast, miR-430 target sequences, UAUUUAUU AU-rich elements (ARE), CCUC, and CUGC elements emerged as destabilizing motifs, with miR-430 and AREs causing mRNA deadenylation upon genome activation. We identified trans-factors by profiling RNA-protein interactions and found that poly(U)-binding proteins are preferentially associated with 3' UTR sequences and stabilizing motifs. We show that this activity is antagonized by C-rich motifs and correlated with protein binding. Finally, we integrated these regulatory motifs into a machine learning model that predicts reporter mRNA stability in vivo.
Affiliation(s)
- Charles E Vejnar
  - Department of Genetics, Yale University School of Medicine, New Haven, Connecticut 06510, USA
- Mario Abdel Messih
  - Department of Genetics, Yale University School of Medicine, New Haven, Connecticut 06510, USA
- Carter M Takacs
  - Department of Genetics, Yale University School of Medicine, New Haven, Connecticut 06510, USA
  - University of New Haven, West Haven, Connecticut 06516, USA
- Valeria Yartseva
  - Department of Genetics, Yale University School of Medicine, New Haven, Connecticut 06510, USA
  - Department of Neuroscience, Genentech, Incorporated, South San Francisco, California 94080, USA
- Panos Oikonomou
  - Department of Systems Biology, Columbia University, New York, New York 10032, USA
- Romain Christiano
  - Department of Genetics and Complex Diseases, Harvard T.H. Chan School of Public Health, Boston, Massachusetts 02115, USA
- Marlon Stoeckius
  - Department of Genetics, Yale University School of Medicine, New Haven, Connecticut 06510, USA
  - New York Genome Center, New York, New York 10013, USA
- Stephanie Lau
  - Department of Genetics, Yale University School of Medicine, New Haven, Connecticut 06510, USA
- Miler T Lee
  - Department of Genetics, Yale University School of Medicine, New Haven, Connecticut 06510, USA
  - Department of Biological Sciences, University of Pittsburgh, Pittsburgh, Pennsylvania 15260, USA
- Jean-Denis Beaudoin
  - Department of Genetics, Yale University School of Medicine, New Haven, Connecticut 06510, USA
- Damir Musaev
  - Department of Genetics, Yale University School of Medicine, New Haven, Connecticut 06510, USA
- Hiba Darwich-Codore
  - Department of Genetics, Yale University School of Medicine, New Haven, Connecticut 06510, USA
- Tobias C Walther
  - Department of Genetics and Complex Diseases, Harvard T.H. Chan School of Public Health, Boston, Massachusetts 02115, USA
  - Department of Cell Biology, Harvard Medical School, Boston, Massachusetts 02115, USA
  - Broad Institute of Harvard and MIT, Cambridge, Massachusetts 02124, USA
  - Howard Hughes Medical Institute, Boston, Massachusetts 02115, USA
- Saeed Tavazoie
  - Department of Biochemistry and Molecular Biophysics, and Department of Systems Biology, Columbia University, New York, New York 10032, USA
- Daniel Cifuentes
  - Department of Genetics, Yale University School of Medicine, New Haven, Connecticut 06510, USA
  - Department of Biochemistry, Boston University School of Medicine, Boston, Massachusetts 02118, USA
- Antonio J Giraldez
  - Department of Genetics, Yale University School of Medicine, New Haven, Connecticut 06510, USA
  - Yale Stem Cell Center, Yale University School of Medicine, New Haven, Connecticut 06510, USA
  - Yale Cancer Center, Yale University School of Medicine, New Haven, Connecticut 06510, USA
54. Bogard N, Linder J, Rosenberg AB, Seelig G. A Deep Neural Network for Predicting and Engineering Alternative Polyadenylation. Cell 2019; 178:91-106.e23. [PMID: 31178116 PMCID: PMC6599575 DOI: 10.1016/j.cell.2019.04.046] [Citation(s) in RCA: 114] [Impact Index Per Article: 19.0] [Received: 04/12/2018] [Revised: 03/18/2019] [Accepted: 04/29/2019] [Indexed: 12/22/2022]
Abstract
Alternative polyadenylation (APA) is a major driver of transcriptome diversity in human cells. Here, we use deep learning to predict APA from DNA sequence alone. We trained our model (APARENT, APA REgression NeT) on isoform expression data from over 3 million APA reporters. APARENT's predictions are highly accurate when tasked with inferring APA in synthetic and human 3'UTRs. Visualizing features learned across all network layers reveals that APARENT recognizes sequence motifs known to recruit APA regulators, discovers previously unknown sequence determinants of 3' end processing, and integrates these features into a comprehensive, interpretable, cis-regulatory code. We apply APARENT to forward engineer functional polyadenylation signals with precisely defined cleavage position and isoform usage and validate predictions experimentally. Finally, we use APARENT to quantify the impact of genetic variants on APA. Our approach detects pathogenic variants in a wide range of disease contexts, expanding our understanding of the genetic origins of disease.
Affiliation(s)
- Nicholas Bogard
  - Department of Electrical & Computer Engineering, University of Washington, Seattle, WA 98195, USA
- Johannes Linder
  - Paul G. Allen School of Computer Science & Engineering, University of Washington, Seattle, WA 98195, USA
- Alexander B Rosenberg
  - Department of Electrical & Computer Engineering, University of Washington, Seattle, WA 98195, USA
- Georg Seelig
  - Department of Electrical & Computer Engineering, University of Washington, Seattle, WA 98195, USA
  - Paul G. Allen School of Computer Science & Engineering, University of Washington, Seattle, WA 98195, USA
55.
Abstract
Every animal grows from a single fertilized egg into an intricate network of cell types and organ systems. This process is captured in a lineage tree: a diagram of every cell's ancestry back to the founding zygote. Biologists have long sought to trace this cell lineage tree in individual organisms and have developed a variety of technologies to map the progeny of specific cells. However, there are billions to trillions of cells in complex organisms, and conventional approaches can only map a limited number of clonal populations per experiment. A new generation of tools that use molecular recording methods integrated with single cell profiling technologies may provide a solution. Here, we summarize recent breakthroughs in these technologies, outline experimental and computational challenges, and discuss biological questions that can be addressed using single cell dynamic lineage tracing.
Affiliation(s)
- Aaron McKenna
  - Department of Molecular and Systems Biology, Geisel School of Medicine, Dartmouth College, Hanover, NH 03755, USA
- James A Gagnon
  - Center for Cell and Genome Science, University of Utah, Salt Lake City, UT 84112, USA
  - School of Biological Sciences, University of Utah, Salt Lake City, UT 84112, USA
56. Rotival M. Characterising the genetic basis of immune response variation to identify causal mechanisms underlying disease susceptibility. HLA 2019; 94:275-284. [PMID: 31115186 DOI: 10.1111/tan.13598] [Citation(s) in RCA: 5] [Impact Index Per Article: 0.8] [Received: 04/17/2019] [Accepted: 05/15/2019] [Indexed: 12/12/2022]
Abstract
Over the last 10 years, genome-wide association studies (GWAS) have identified hundreds of susceptibility loci for autoimmune diseases. However, despite increasing power for the detection of both common and rare coding variants affecting disease susceptibility, a large fraction of disease heritability has remained unexplained. In addition, a majority of the identified loci are located in noncoding regions, and translation of disease-associated loci into new biological insights on the etiology of immune disorders has been lagging. This highlights the need for a better understanding of noncoding variation and new strategies to identify causal genes at disease loci. In this review, I will first detail the molecular basis of gene expression and review the various mechanisms that contribute to altering gene activity at the transcriptional and post-transcriptional levels. I will then review the findings from 10 years of functional genomics studies regarding the genetics of gene expression, in particular in the context of infection. Finally, I will discuss the extent to which genetic variants that modulate gene expression at the transcriptional and post-transcriptional levels contribute to disease susceptibility and present strategies to leverage this information for the identification of causal mechanisms at disease loci in the era of whole genome sequencing.
Affiliation(s)
- Maxime Rotival
  - Unit of Human Evolutionary Genetics, CNRS UMR2000, Institut Pasteur, Paris, France
57. Simulating multiple faceted variability in single cell RNA sequencing. Nat Commun 2019; 10:2611. [PMID: 31197158 PMCID: PMC6565723 DOI: 10.1038/s41467-019-10500-w] [Citation(s) in RCA: 56] [Impact Index Per Article: 9.3] [Received: 07/26/2018] [Accepted: 05/16/2019] [Indexed: 01/06/2023] Open
Abstract
The abundance of new computational methods for processing and interpreting transcriptomes at a single cell level raises the need for in silico platforms for evaluation and validation. Here, we present SymSim, a simulator that explicitly models the processes that give rise to data observed in single cell RNA-Seq experiments. The components of the SymSim pipeline pertain to the three primary sources of variation in single cell RNA-Seq data: noise intrinsic to the process of transcription, extrinsic variation indicative of different cell states (both discrete and continuous), and technical variation due to low sensitivity and measurement noise and bias. We demonstrate how SymSim can be used for benchmarking methods for clustering, differential expression and trajectory inference, and for examining the effects of various parameters on their performance. We also show how SymSim can be used to evaluate the number of cells required to detect a rare population under various scenarios.
Simulated single cell RNA sequencing data is useful for method development and comparison. Here, the authors developed SymSim, a simulator that explicitly models the main factors of variation in single cell data.
58. Vastenhouw NL, Cao WX, Lipshitz HD. The maternal-to-zygotic transition revisited. Development 2019; 146:146/11/dev161471. [PMID: 31189646 DOI: 10.1242/dev.161471] [Citation(s) in RCA: 234] [Impact Index Per Article: 39.0] [Indexed: 12/15/2022]
Abstract
The development of animal embryos is initially directed by maternal gene products. Then, during the maternal-to-zygotic transition (MZT), developmental control is handed to the zygotic genome. Extensive research in both vertebrate and invertebrate model organisms has revealed that the MZT can be subdivided into two phases, during which very different modes of gene regulation are implemented: initially, regulation is exclusively post-transcriptional and post-translational, following which gradual activation of the zygotic genome leads to predominance of transcriptional regulation. These changes in the gene expression program of embryos are precisely controlled and highly interconnected. Here, we review current understanding of the mechanisms that underlie handover of developmental control during the MZT.
Affiliation(s)
- Nadine L Vastenhouw
  - Max Planck Institute of Molecular Cell Biology and Genetics, Pfotenhauerstraße 108, 01307 Dresden, Germany
- Wen Xi Cao
  - Department of Molecular Genetics, University of Toronto, 661 University Avenue, Toronto, Ontario M5G 1M1, Canada
- Howard D Lipshitz
  - Department of Molecular Genetics, University of Toronto, 661 University Avenue, Toronto, Ontario M5G 1M1, Canada
59. Litterman AJ, Kageyama R, Le Tonqueze O, Zhao W, Gagnon JD, Goodarzi H, Erle DJ, Ansel KM. A massively parallel 3' UTR reporter assay reveals relationships between nucleotide content, sequence conservation, and mRNA destabilization. Genome Res 2019; 29:896-906. [PMID: 31152051 PMCID: PMC6581050 DOI: 10.1101/gr.242552.118] [Citation(s) in RCA: 31] [Impact Index Per Article: 5.2] [Received: 08/01/2018] [Accepted: 05/02/2019] [Indexed: 01/02/2023]
Abstract
Compared to coding sequences, untranslated regions of the transcriptome are not well conserved, and functional annotation of these sequences is challenging. Global relationships between nucleotide composition of 3′ UTR sequences and their sequence conservation have been appreciated since mammalian genomes were first sequenced, but the functional relevance of these patterns remains unknown. We systematically measured the effect on gene expression of the sequences of more than 25,000 RNA-binding protein (RBP) binding sites in primary mouse T cells using a massively parallel reporter assay. GC-rich sequences destabilized reporter mRNAs and came from more rapidly evolving regions of the genome. These sequences were more likely to be folded in vivo and contained a number of structural motifs that reduced accumulation of a heterologous reporter protein. Comparison of full-length 3′ UTR sequences across vertebrate phylogeny revealed that strictly conserved 3′ UTRs were GC-poor and enriched in genes associated with organismal development. In contrast, rapidly evolving 3′ UTRs tended to be GC-rich and derived from genes involved in metabolism and immune responses. Cell-essential genes had lower GC content in their 3′ UTRs, suggesting a connection between unstructured mRNA noncoding sequences and optimal protein production. By reducing gene expression, GC-rich RBP-occupied sequences act as a rapidly evolving substrate for gene regulatory interactions.
Affiliation(s)
- Adam J Litterman
  - Department of Microbiology and Immunology and Sandler Asthma Basic Research Center, University of California San Francisco, San Francisco, California 94143, USA
- Robin Kageyama
  - Department of Microbiology and Immunology and Sandler Asthma Basic Research Center, University of California San Francisco, San Francisco, California 94143, USA
- Olivier Le Tonqueze
  - Department of Medicine and Lung Biology Center, University of California San Francisco, San Francisco, California 94143, USA
- Wenxue Zhao
  - Department of Medicine and Lung Biology Center, University of California San Francisco, San Francisco, California 94143, USA
  - School of Medicine, Sun Yat-Sen University, Guangzhou, People's Republic of China, 510245
- John D Gagnon
  - Department of Microbiology and Immunology and Sandler Asthma Basic Research Center, University of California San Francisco, San Francisco, California 94143, USA
- Hani Goodarzi
  - Department of Biochemistry and Biophysics, Department of Urology, and Helen Diller Family Comprehensive Cancer Center, University of California San Francisco, San Francisco, California 94143, USA
- David J Erle
  - Department of Medicine and Lung Biology Center, University of California San Francisco, San Francisco, California 94143, USA
- K Mark Ansel
  - Department of Microbiology and Immunology and Sandler Asthma Basic Research Center, University of California San Francisco, San Francisco, California 94143, USA
60. Qiu C, Kaplan CD. Functional assays for transcription mechanisms in high-throughput. Methods 2019; 159-160:115-123. [PMID: 30797033 PMCID: PMC6589137 DOI: 10.1016/j.ymeth.2019.02.017] [Citation(s) in RCA: 8] [Impact Index Per Article: 1.3] [Received: 01/29/2019] [Accepted: 02/18/2019] [Indexed: 01/12/2023] Open
Abstract
Dramatic increases in the scale of programmed synthesis of nucleic acid libraries coupled with deep sequencing have powered advances in understanding nucleic acid and protein biology. Biological systems centering on nucleic acids or encoded proteins greatly benefit from such high-throughput studies, given that large DNA variant pools can be synthesized and DNA, or RNA products of transcription, can be easily analyzed by deep sequencing. Here we review the scope of various high-throughput functional assays for studies of nucleic acids and proteins in general, followed by discussion of how these types of study have yielded insights into the RNA Polymerase II (Pol II) active site as an example. We discuss methodological considerations in the design and execution of these experiments that should be valuable to studies in any system.
Affiliation(s)
- Chenxi Qiu
  - Department of Medicine, Division of Translational Therapeutics, Beth Israel Deaconess Medical Center, Harvard Medical School, Boston, MA 02215, USA
  - Cancer Research Institute, Beth Israel Deaconess Medical Center, Harvard Medical School, Boston, MA 02215, USA
  - Broad Institute of MIT and Harvard, Cambridge, MA 02142, USA
- Craig D Kaplan
  - Department of Biological Sciences, University of Pittsburgh, Pittsburgh, PA 15260, USA
61. Schuster SL, Hsieh AC. The Untranslated Regions of mRNAs in Cancer. Trends Cancer 2019; 5:245-262. [PMID: 30961831 PMCID: PMC6465068 DOI: 10.1016/j.trecan.2019.02.011] [Citation(s) in RCA: 63] [Impact Index Per Article: 10.5] [Received: 12/17/2018] [Revised: 02/23/2019] [Accepted: 02/25/2019] [Indexed: 12/19/2022]
Abstract
The 5' and 3' untranslated regions (UTRs) regulate crucial aspects of post-transcriptional gene regulation that are necessary for the maintenance of cellular homeostasis. When these processes go awry through mutation or misexpression of certain regulatory elements, the subsequent deregulation of oncogenic gene expression can drive or enhance cancer pathogenesis. Although the number of known cancer-related mutations in UTR regulatory elements has recently increased markedly as a result of advances in whole-genome sequencing, little is known about how the majority of these genetic aberrations contribute functionally to disease. In this review we explore the regulatory functions of UTRs, how they are co-opted in cancer, new technologies to interrogate cancerous UTRs, and potential therapeutic opportunities stemming from these regions.
Affiliation(s)
- Samantha L Schuster
  - Molecular and Cellular Biology, University of Washington, Seattle, WA 98195, USA
  - Division of Human Biology, Fred Hutchinson Cancer Research Center, Seattle, WA, 98109, USA
- Andrew C Hsieh
  - Molecular and Cellular Biology, University of Washington, Seattle, WA 98195, USA
  - Division of Human Biology, Fred Hutchinson Cancer Research Center, Seattle, WA, 98109, USA
  - School of Medicine and Genome Sciences, University of Washington, Seattle, WA 98195, USA
62. Duchaine TF, Fabian MR. Mechanistic Insights into MicroRNA-Mediated Gene Silencing. Cold Spring Harb Perspect Biol 2019; 11:cshperspect.a032771. [PMID: 29959194 DOI: 10.1101/cshperspect.a032771] [Citation(s) in RCA: 104] [Impact Index Per Article: 17.3] [Indexed: 12/19/2022]
Abstract
MicroRNAs (miRNAs) posttranscriptionally regulate gene expression by repressing protein synthesis and exert a broad influence over development, physiology, adaptation, and disease. Over the past two decades, great strides have been made toward elucidating how miRNAs go about shutting down messenger RNA (mRNA) translation and promoting mRNA decay.
Affiliation(s)
- Thomas F Duchaine
  - Department of Biochemistry & Goodman Cancer Research Centre, McGill University, Montreal, Quebec H3G 1Y6, Canada
- Marc R Fabian
  - Department of Oncology, McGill University, Montreal, Quebec H3G 1Y6, Canada
  - Lady Davis Institute, Jewish General Hospital, Montreal, Quebec H3T 1E2, Canada
63. Mayya VK, Duchaine TF. Ciphers and Executioners: How 3'-Untranslated Regions Determine the Fate of Messenger RNAs. Front Genet 2019; 10:6. [PMID: 30740123 PMCID: PMC6357968 DOI: 10.3389/fgene.2019.00006] [Citation(s) in RCA: 60] [Impact Index Per Article: 10.0] [Received: 09/07/2018] [Accepted: 01/07/2019] [Indexed: 12/29/2022] Open
Abstract
The sequences and structures of 3'-untranslated regions (3'UTRs) of messenger RNAs govern their stability, localization, and expression. 3'UTR regulatory elements are recognized by a wide variety of trans-acting factors that include microRNAs (miRNAs), their associated machinery, and RNA-binding proteins (RBPs). In turn, these factors instigate common mechanistic strategies to execute the regulatory programs encoded by 3'UTRs. Here, we review classes of factors that recognize 3'UTR regulatory elements and the effector machineries they guide toward mRNAs to dictate their expression and fate. We outline illustrative examples of competitive, cooperative, and coordinated interplay such as mRNA localization and localized translation. We further review the recent advances in the study of mRNP granules and phase transition, and their possible significance for the functions of 3'UTRs. Finally, we highlight some of the most recent strategies aimed at deciphering the complexity of the regulatory codes of 3'UTRs, and identify some of the important remaining challenges.
Affiliation(s)
- Thomas F. Duchaine
  - Goodman Cancer Research Centre and Department of Biochemistry, McGill University, Montreal, QC, Canada
64. Webster MW, Stowell JA, Passmore LA. RNA-binding proteins distinguish between similar sequence motifs to promote targeted deadenylation by Ccr4-Not. eLife 2019; 8:40670. [PMID: 30601114 PMCID: PMC6340701 DOI: 10.7554/elife.40670] [Citation(s) in RCA: 49] [Impact Index Per Article: 8.2] [Received: 08/01/2018] [Accepted: 12/28/2018] [Indexed: 12/17/2022] Open
Abstract
The Ccr4-Not complex removes mRNA poly(A) tails to regulate eukaryotic mRNA stability and translation. RNA-binding proteins contribute to specificity by interacting with both Ccr4-Not and target mRNAs, but this is not fully understood. Here, we reconstitute accelerated and selective deadenylation of RNAs containing AU-rich elements (AREs) and Pumilio-response elements (PREs). We find that the fission yeast homologues of Tristetraprolin/TTP and Pumilio/Puf (Zfs1 and Puf3) interact with Ccr4-Not via multiple regions within low-complexity sequences, suggestive of a multipartite interface that extends beyond previously defined interactions. Using a two-color assay to simultaneously monitor poly(A) tail removal from different RNAs, we demonstrate that Puf3 can distinguish between RNAs of very similar sequence. Analysis of binding kinetics reveals that this is primarily due to differences in dissociation rate constants. Consequently, motif quality is a major determinant of mRNA stability for Puf3 targets in vivo and can be used for the prediction of mRNA targets.
When a cell needs to make a particular protein, it first copies the instructions from the matching gene into a molecule known as a messenger RNA (or an mRNA for short). The more mRNA copies it makes, the more protein it can produce. A simple way to control protein production is to raise or lower the number of these mRNA messages, and living cells have lots of ways to make this happen. One method involves codes built into the mRNAs themselves. The mRNAs can carry short sequences of genetic letters that can trigger their own destruction. Known as “destabilising motifs”, these sequences attract the attention of a group of proteins called Ccr4-Not. Together these proteins shorten the end of the mRNAs, preparing the molecules for degradation. But how does Ccr4-Not choose which mRNAs to target? Different mRNAs carry different destabilising motifs. This means that when groups of mRNAs all carry the same motif, the cell can destroy them all together. This allows the cell to switch networks of related genes off together without affecting the mRNAs it still needs. What is puzzling is that the destabilising motifs that control different groups of mRNAs can be very similar, and scientists do not yet know how Ccr4-Not can tell the difference, or what triggers it to start breaking down groups of mRNAs. To find out, Webster et al. recreated the system in the laboratory using purified molecules. The test-tube system confirmed previous suggestions that a protein called Puf3 forms a bridge between Ccr4-Not and mRNAs. It acts as a tether, recognising a destabilising motif and linking it to Ccr4-Not. Labelling different mRNAs with two colours of fluorescent dye showed how Puf3 helps the cell to choose which to destroy. Puf3 allows Ccr4-Not to select specific mRNAs from a mixture of molecules. Puf3 could distinguish between mRNAs that differed in a single letter of genetic code. When it matched with the wrong mRNA, it disconnected much faster than when it matched with the right one, preventing Ccr4-Not from linking up. The ability to destroy specific mRNA messages is critical for cell survival. It happens when cells divide, during immune responses such as inflammation, and in early development. Understanding the targets of tethers like Puf3 could help scientists to predict which genes will switch off and when. This could reveal genes that work together, helping to unravel their roles inside cells.
Affiliation(s)
- Lori A Passmore
- MRC Laboratory of Molecular Biology, Cambridge, United Kingdom
| |
|
65
|
Mitchell D, Renda AJ, Douds CA, Babitzke P, Assmann SM, Bevilacqua PC. In vivo RNA structural probing of uracil and guanine base-pairing by 1-ethyl-3-(3-dimethylaminopropyl)carbodiimide (EDC). RNA (NEW YORK, N.Y.) 2019; 25:147-157. [PMID: 30341176 PMCID: PMC6298566 DOI: 10.1261/rna.067868.118] [Citation(s) in RCA: 22] [Impact Index Per Article: 3.7] [Received: 06/26/2018] [Accepted: 10/18/2018] [Indexed: 05/09/2023]
Abstract
Many biological functions performed by RNAs arise from their in vivo structures. The structure of the same RNA can differ in vitro and in vivo owing in part to the influence of molecules ranging from protons to secondary metabolites to proteins. Chemical reagents that modify the Watson-Crick (WC) face of unprotected RNA bases report on the absence of base-pairing and so are of value to determining structures adopted by RNAs. Reagents have thus been sought that can report on the native RNA structures that prevail in living cells. Dimethyl sulfate (DMS) and glyoxal penetrate cell membranes and inform on RNA secondary structure in vivo through modification of adenine (A), cytosine (C), and guanine (G) bases. Uracil (U) bases, however, have thus far eluded characterization in vivo. Herein, we show that the water-soluble carbodiimide 1-ethyl-3-(3-dimethylaminopropyl)carbodiimide (EDC) is capable of modifying the WC face of U and G in vivo, favoring the former nucleobase by a factor of ∼1.5, and doing so in the eukaryote rice, as well as in the Gram-negative bacterium Escherichia coli. While both EDC and glyoxal target Gs, EDC reacts with Gs in their typical neutral state, while glyoxal requires Gs to populate the rare anionic state. EDC may thus be more generally useful; however, comparison of the reactivity of EDC and glyoxal may allow the identification of Gs with perturbed pKas in vivo and genome-wide. Overall, use of EDC with DMS allows in vivo probing of the base-pairing status of all four RNA bases.
MESH Headings
- Base Pairing
- Base Sequence
- Escherichia coli/chemistry
- Escherichia coli/genetics
- Ethyldimethylaminopropyl Carbodiimide
- Glyoxal
- Guanine/chemistry
- Indicators and Reagents
- Molecular Probe Techniques
- Molecular Probes
- Molecular Structure
- Nucleic Acid Conformation
- Oryza/chemistry
- Oryza/genetics
- RNA/chemistry
- RNA/genetics
- RNA, Bacterial/chemistry
- RNA, Bacterial/genetics
- RNA, Plant/chemistry
- RNA, Plant/genetics
- RNA, Ribosomal, 16S/chemistry
- RNA, Ribosomal, 16S/genetics
- RNA, Ribosomal, 5.8S/chemistry
- RNA, Ribosomal, 5.8S/genetics
- Uracil/chemistry
Affiliation(s)
- David Mitchell
- Department of Chemistry, The Pennsylvania State University, University Park, Pennsylvania 16802, USA
- Center for RNA Molecular Biology, The Pennsylvania State University, University Park, Pennsylvania 16802, USA
| | - Andrew J Renda
- Center for RNA Molecular Biology, The Pennsylvania State University, University Park, Pennsylvania 16802, USA
- Department of Biochemistry and Molecular Biology, The Pennsylvania State University, University Park, Pennsylvania 16802, USA
| | - Catherine A Douds
- Department of Chemistry, The Pennsylvania State University, University Park, Pennsylvania 16802, USA
- Center for RNA Molecular Biology, The Pennsylvania State University, University Park, Pennsylvania 16802, USA
- Department of Biochemistry and Molecular Biology, The Pennsylvania State University, University Park, Pennsylvania 16802, USA
| | - Paul Babitzke
- Center for RNA Molecular Biology, The Pennsylvania State University, University Park, Pennsylvania 16802, USA
- Department of Biochemistry and Molecular Biology, The Pennsylvania State University, University Park, Pennsylvania 16802, USA
| | - Sarah M Assmann
- Department of Biology, The Pennsylvania State University, University Park, Pennsylvania 16802, USA
| | - Philip C Bevilacqua
- Department of Chemistry, The Pennsylvania State University, University Park, Pennsylvania 16802, USA
- Center for RNA Molecular Biology, The Pennsylvania State University, University Park, Pennsylvania 16802, USA
- Department of Biochemistry and Molecular Biology, The Pennsylvania State University, University Park, Pennsylvania 16802, USA
| |
|
66
|
Genome-wide RNA structurome reprogramming by acute heat shock globally regulates mRNA abundance. Proc Natl Acad Sci U S A 2018; 115:12170-12175. [PMID: 30413617 PMCID: PMC6275526 DOI: 10.1073/pnas.1807988115] [Citation(s) in RCA: 70] [Impact Index Per Article: 10.0] [Indexed: 11/18/2022] Open
Abstract
The heat shock response is crucial for organism survival in natural environments. RNA structure is known to influence numerous processes related to gene expression, but there have been few studies on the global RNA structurome as it prevails in vivo. Moreover, how heat shock rapidly affects RNA structure genome-wide in living systems remains unknown. We report here in vivo heat-regulated RNA structuromes. We applied Structure-seq chemical [dimethyl sulfate (DMS)] structure probing to rice (Oryza sativa L.) seedlings with and without 10 min of 42 °C heat shock and obtained structural data on >14,000 mRNAs. We show that RNA secondary structure broadly regulates gene expression in response to heat shock in this essential crop species. Our results indicate significant heat-induced elevation of DMS reactivity in the global transcriptome, revealing RNA unfolding over this biological temperature range. Our parallel Ribo-seq analysis provides no evidence for a correlation between RNA unfolding and heat-induced changes in translation, in contrast to the paradigm established in prokaryotes, wherein melting of RNA thermometers promotes translation. Instead, we find that heat-induced DMS reactivity increases correlate with significant decreases in transcript abundance, as quantified from an RNA-seq time course, indicating that mRNA unfolding promotes transcript degradation. The mechanistic basis for this outcome appears to be mRNA unfolding at both 5' and 3'-UTRs that facilitates access to the RNA degradation machinery. Our results thus reveal unexpected paradigms governing RNA structural changes and the eukaryotic RNA life cycle.
|
67
|
Pan H, Shi Y, Chen S, Yang Y, Yue Y, Zhan L, Dai L, Dong H, Hong W, Shi F, Jin Y. Competing RNA pairings in complex alternative splicing of a 3' variable region. RNA (NEW YORK, N.Y.) 2018; 24:1466-1480. [PMID: 30065023 PMCID: PMC6191721 DOI: 10.1261/rna.066225.118] [Citation(s) in RCA: 7] [Impact Index Per Article: 1.0] [Received: 02/26/2018] [Accepted: 07/24/2018] [Indexed: 05/15/2023]
Abstract
Alternative pre-mRNA splicing remarkably expands protein diversity in eukaryotes. Drosophila PGRP-LC can generate three major 3' splice isoforms that exhibit distinct innate immune recognition and defenses against various microbial infections. However, the regulatory mechanisms underlying the uniquely biased splicing pattern at the 3' variable region remain unclear. Here we show that competing RNA pairings control the unique splicing of the 3' variable region of Drosophila PGRP-LC pre-mRNA. We reveal three roles by which these RNA pairings jointly regulate the 3' variant selection through activating the proximal 3' splice site and concurrently masking the intron-proximal 5' splice site, in combination with physical competition of RNA pairing. We also reveal that competing RNA pairings regulate alternative splicing of the highly complex 3' variable regions of Drosophila CG42235 and Pip. Our findings will facilitate a better understanding of the regulatory mechanisms of highly complex alternative splicing as well as highly variable 3' processing.
Affiliation(s)
- Huawei Pan
- Institute of Biochemistry, College of Life Sciences, Zhejiang University, Hangzhou, Zhejiang, ZJ310058, P.R. China
| | - Yang Shi
- Institute of Biochemistry, College of Life Sciences, Zhejiang University, Hangzhou, Zhejiang, ZJ310058, P.R. China
| | - Shuo Chen
- Institute of Biochemistry, College of Life Sciences, Zhejiang University, Hangzhou, Zhejiang, ZJ310058, P.R. China
| | - Yun Yang
- Institute of Biochemistry, College of Life Sciences, Zhejiang University, Hangzhou, Zhejiang, ZJ310058, P.R. China
| | - Yuan Yue
- Institute of Biochemistry, College of Life Sciences, Zhejiang University, Hangzhou, Zhejiang, ZJ310058, P.R. China
| | - Leilei Zhan
- Institute of Biochemistry, College of Life Sciences, Zhejiang University, Hangzhou, Zhejiang, ZJ310058, P.R. China
| | - Lanzhi Dai
- Institute of Biochemistry, College of Life Sciences, Zhejiang University, Hangzhou, Zhejiang, ZJ310058, P.R. China
| | - Haiyang Dong
- Institute of Biochemistry, College of Life Sciences, Zhejiang University, Hangzhou, Zhejiang, ZJ310058, P.R. China
| | - Weiling Hong
- Institute of Biochemistry, College of Life Sciences, Zhejiang University, Hangzhou, Zhejiang, ZJ310058, P.R. China
| | - Feng Shi
- Institute of Biochemistry, College of Life Sciences, Zhejiang University, Hangzhou, Zhejiang, ZJ310058, P.R. China
| | - Yongfeng Jin
- Institute of Biochemistry, College of Life Sciences, Zhejiang University, Hangzhou, Zhejiang, ZJ310058, P.R. China
| |
|
68
|
Goldstrohm AC, Hall TMT, McKenney KM. Post-transcriptional Regulatory Functions of Mammalian Pumilio Proteins. Trends Genet 2018; 34:972-990. [PMID: 30316580 DOI: 10.1016/j.tig.2018.09.006] [Citation(s) in RCA: 113] [Impact Index Per Article: 16.1] [Received: 07/30/2018] [Revised: 09/10/2018] [Accepted: 09/19/2018] [Indexed: 01/18/2023]
Abstract
Mammalian Pumilio proteins, PUM1 and PUM2, are members of the PUF family of sequence-specific RNA-binding proteins. In this review, we explore their mechanisms, regulatory networks, biological functions, and relevance to diseases. Pumilio proteins bind an extensive network of mRNAs and repress protein expression by inhibiting translation and promoting mRNA decay. Opposingly, in certain contexts, they can activate protein expression. Pumilio proteins also regulate noncoding (nc)RNAs. The ncRNA, ncRNA activated by DNA damage (NORAD), can in turn modulate Pumilio activity. Genetic analysis provides new insights into Pumilio protein function. They are essential for growth and development. They control diverse processes, including stem cell fate, and neurological functions, such as behavior and memory formation. Novel findings show that their dysfunction contributes to neurodegeneration, epilepsy, movement disorders, intellectual disability, infertility, and cancer.
Affiliation(s)
- Aaron C Goldstrohm
- Department of Biochemistry, Molecular Biology and Biophysics, University of Minnesota, Minneapolis, MN, USA
| | - Traci M Tanaka Hall
- Epigenetics and Stem Cell Biology Laboratory, National Institute of Environmental Health Sciences, National Institutes of Health, Research Triangle Park, NC, USA
| | - Katherine M McKenney
- Department of Biochemistry, Molecular Biology and Biophysics, University of Minnesota, Minneapolis, MN, USA
| |
|