101
Neininger K, Marschall T, Helms V. SNP and indel frequencies at transcription start sites and at canonical and alternative translation initiation sites in the human genome. PLoS One 2019; 14:e0214816. [PMID: 30978217; PMCID: PMC6461226; DOI: 10.1371/journal.pone.0214816] [Citation(s) in RCA: 16] [Impact Index Per Article: 2.7] [Received: 11/26/2018] [Accepted: 03/20/2019] [Indexed: 11/30/2022]
Abstract
Single-nucleotide polymorphisms (SNPs) are the most common form of genetic variation in humans and drive phenotypic variation. Due to evolutionary conservation, SNPs and indels (insertions and deletions) are depleted in functionally important sequence elements. Recently, population-scale sequencing efforts such as the 1000 Genomes Project and the Genome of the Netherlands Project have catalogued large numbers of sequence variants. Here, we present a systematic analysis of the polymorphisms reported by these two projects in different coding and non-coding genomic elements of the human genome (intergenic regions, CpG islands, promoters, 5’ UTRs, coding exons, 3’ UTRs, introns, and intragenic regions). Furthermore, we were especially interested in the distribution of SNPs and indels in direct vicinity to the transcription start site (TSS) and translation start site (CSS). Thereby, we discovered an enrichment of dinucleotides CpG and CpA and an accumulation of SNPs at base position −1 relative to the TSS that involved primarily CpG and CpA dinucleotides. Genes having a CpG dinucleotide at TSS position −1 were enriched in the functional GO terms “Phosphoprotein”, “Alternative splicing”, and “Protein binding”. Focusing on the CSS, we compared SNP patterns in the flanking regions of canonical and alternative AUG and near-cognate start sites where we considered alternative starts previously identified by experimental ribosome profiling. We observed similar conservation patterns of canonical and alternative translation start sites, which underlines the importance of alternative translation mechanisms for cellular function.
Affiliation(s)
- Kerstin Neininger
  - Center for Bioinformatics, Saarland University, 66123 Saarbrücken, Germany
  - Graduate School of Computer Science, Saarland University, 66123 Saarbrücken, Germany
- Tobias Marschall
  - Center for Bioinformatics, Saarland University, 66123 Saarbrücken, Germany
  - Max Planck Institute for Informatics, 66123 Saarbrücken, Germany
- Volkhard Helms
  - Center for Bioinformatics, Saarland University, 66123 Saarbrücken, Germany
102
Schuster SL, Hsieh AC. The Untranslated Regions of mRNAs in Cancer. Trends Cancer 2019; 5:245-262. [PMID: 30961831; PMCID: PMC6465068; DOI: 10.1016/j.trecan.2019.02.011] [Citation(s) in RCA: 80] [Impact Index Per Article: 13.3] [Received: 12/17/2018] [Revised: 02/23/2019] [Accepted: 02/25/2019] [Indexed: 12/19/2022]
Abstract
The 5' and 3' untranslated regions (UTRs) regulate crucial aspects of post-transcriptional gene regulation that are necessary for the maintenance of cellular homeostasis. When these processes go awry through mutation or misexpression of certain regulatory elements, the subsequent deregulation of oncogenic gene expression can drive or enhance cancer pathogenesis. Although the number of known cancer-related mutations in UTR regulatory elements has recently increased markedly as a result of advances in whole-genome sequencing, little is known about how the majority of these genetic aberrations contribute functionally to disease. In this review we explore the regulatory functions of UTRs, how they are co-opted in cancer, new technologies to interrogate cancerous UTRs, and potential therapeutic opportunities stemming from these regions.
Affiliation(s)
- Samantha L Schuster
  - Molecular and Cellular Biology, University of Washington, Seattle, WA 98195, USA; Division of Human Biology, Fred Hutchinson Cancer Research Center, Seattle, WA 98109, USA
- Andrew C Hsieh
  - Molecular and Cellular Biology, University of Washington, Seattle, WA 98195, USA; Division of Human Biology, Fred Hutchinson Cancer Research Center, Seattle, WA 98109, USA; School of Medicine and Genome Sciences, University of Washington, Seattle, WA 98195, USA
103
Ding W, Cheng J, Guo D, Mao L, Li J, Lu L, Zhang Y, Yang J, Jiang H. Engineering the 5' UTR-Mediated Regulation of Protein Abundance in Yeast Using Nucleotide Sequence Activity Relationships. ACS Synth Biol 2018; 7:2709-2714. [PMID: 30525473; DOI: 10.1021/acssynbio.8b00127] [Citation(s) in RCA: 13] [Impact Index Per Article: 1.9] [Indexed: 12/20/2022]
Abstract
The 5' untranslated region (5'UTR) plays a key role in post-transcriptional regulation, but interaction between nucleotides and directed evolution of 5'UTRs as synthetic regulatory elements remain unclear. By constructing a library of synthesized random 5'UTRs of 24 nucleotides in Saccharomyces cerevisiae, we observed strong epistatic interactions among bases from different positions in the 5'UTR. Taking into account these base interactions, we constructed a mathematical model to predict protein abundance with a precision of R² = 0.60. On the basis of this model, we developed an approach to engineer 5'UTRs according to nucleotide sequence activity relationships (NuSAR), in which 5'UTRs were engineered stepwise through repeated cycles of backbone design, directed screening, and model reconstruction. After three rounds of NuSAR, the predictive accuracy of our model was improved to R² = 0.71, and a strong 5'UTR was obtained with 5-fold higher protein abundance than the starting 5'UTR. Our findings provide new insights into the mechanism of 5'UTR regulation and contribute to a new translational elements engineering approach in synthetic biology.
Affiliation(s)
- Wentao Ding
  - Key Laboratory of Systems Microbial Biotechnology, Tianjin Institute of Industrial Biotechnology, Chinese Academy of Sciences, Tianjin 300308, China
  - Beijing Advanced Innovation Center for Soft Matter Science and Engineering, Beijing University of Chemical Technology, Beijing 100029, China
- Jian Cheng
  - Key Laboratory of Systems Microbial Biotechnology, Tianjin Institute of Industrial Biotechnology, Chinese Academy of Sciences, Tianjin 300308, China
- Dan Guo
  - Key Laboratory of Systems Microbial Biotechnology, Tianjin Institute of Industrial Biotechnology, Chinese Academy of Sciences, Tianjin 300308, China
- Ling Mao
  - College of Biology and Pharmaceutical Engineering, Wuhan Polytechnic University, Wuhan 430023, China
- Jingwei Li
  - Laboratory of Mathematics for Nonlinear Science, Shanghai Key Laboratory for Contemporary Applied Mathematics, Centre for Computational Systems Biology, School of Mathematical Sciences, Fudan University, Shanghai 200433, China
- Lina Lu
  - Key Laboratory of Systems Microbial Biotechnology, Tianjin Institute of Industrial Biotechnology, Chinese Academy of Sciences, Tianjin 300308, China
- Yunxin Zhang
  - Laboratory of Mathematics for Nonlinear Science, Shanghai Key Laboratory for Contemporary Applied Mathematics, Centre for Computational Systems Biology, School of Mathematical Sciences, Fudan University, Shanghai 200433, China
- Jiangke Yang
  - College of Biology and Pharmaceutical Engineering, Wuhan Polytechnic University, Wuhan 430023, China
- Huifeng Jiang
  - Key Laboratory of Systems Microbial Biotechnology, Tianjin Institute of Industrial Biotechnology, Chinese Academy of Sciences, Tianjin 300308, China
104
Abad-Navarro F, de la Morena-Barrio ME, Fernández-Breis JT, Corral J. Lost in translation: bioinformatic analysis of variations affecting the translation initiation codon in the human genome. Bioinformatics 2018; 34:3788-3794. [PMID: 29868922; DOI: 10.1093/bioinformatics/bty453] [Citation(s) in RCA: 1] [Impact Index Per Article: 0.1] [Received: 02/25/2018] [Accepted: 05/30/2018] [Indexed: 11/12/2022]
Abstract
Motivation: Translation is a key biological process controlled in eukaryotes by the initiation AUG codon. Variations affecting this codon may have pathological consequences by disturbing the correct initiation of translation. Unfortunately, there is no systematic study describing these variations in the human genome. Moreover, we aimed to develop new tools for in silico prediction of the pathogenicity of gene variations affecting AUG codons, because to date, these gene defects have been wrongly classified as missense. Results: Whole-exome analysis revealed a mean of 12 gene variations per person affecting initiation codons, mostly with high (>0.01) minor allele frequency (MAF). Moreover, analysis of Ensembl data (December 2017) revealed 11 261 genetic variations affecting the initiation AUG codon of 7205 genes. Most of these variations (99.5%) have low or unknown MAF, probably reflecting deleterious consequences. Only 62 variations had high MAF. Genetic variations with high MAF had closer alternative AUG downstream codons than did those with low MAF. Besides, the high-MAF group better maintained both the signal peptide and reading frame. These differentiating elements could help to determine the pathogenicity of this kind of variation. Availability and implementation: Data and scripts in Perl and R are freely available at https://github.com/fanavarro/hemodonacion. Supplementary information: Supplementary data are available at Bioinformatics online.
Affiliation(s)
- Francisco Abad-Navarro
  - Departamento de Informática y Sistemas, Universidad de Murcia, IMIB-Arrixaca, Murcia, Spain
- María Eugenia de la Morena-Barrio
  - Servicio de Hematología y Oncología Médica, Hospital Universitario Morales Meseguer, Centro Regional de Hemodonación, Universidad de Murcia, IMIB-Arrixaca, Murcia, Spain; Centro de Investigación Biomédica en Red de Enfermedades Raras (CIBERER), Instituto de Salud Carlos III (ISCIII), Spain
- Javier Corral
  - Servicio de Hematología y Oncología Médica, Hospital Universitario Morales Meseguer, Centro Regional de Hemodonación, Universidad de Murcia, IMIB-Arrixaca, Murcia, Spain; Centro de Investigación Biomédica en Red de Enfermedades Raras (CIBERER), Instituto de Salud Carlos III (ISCIII), Spain
105
Manna P, Hung ST, Mukherjee S, Friis P, Simpson DM, Lo MN, Palmer AE, Jimenez R. Directed evolution of excited state lifetime and brightness in FusionRed using a microfluidic sorter. Integr Biol (Camb) 2018; 10:516-526. [PMID: 30094420; PMCID: PMC6141309; DOI: 10.1039/c8ib00103k] [Citation(s) in RCA: 18] [Impact Index Per Article: 2.6] [Indexed: 12/31/2022]
Abstract
Green fluorescent proteins (GFP) and their blue, cyan and red counterparts offer unprecedented advantages as biological markers owing to their genetic encodability and straightforward expression in different organisms. Although significant advancements have been made towards engineering the key photo-physical properties of red fluorescent proteins (RFPs), they continue to perform sub-optimally relative to GFP variants. Advanced engineering strategies are needed for further evolution of RFPs in the pursuit of improving their photo-physics. In this report, a microfluidic sorter that discriminates members of a cell-based library based on their excited state lifetime and fluorescence intensity is used for the directed evolution of the photo-physical properties of FusionRed. In-flow measurements of the fluorescence lifetime are performed in a frequency-domain approach with sub-millisecond sampling times. Promising clones are sorted by optical force trapping with an infrared laser. Using this microfluidic sorter, mutants are generated with longer lifetimes than their precursor, FusionRed. This improvement in the excited state lifetime of the mutants leads to an increase in their fluorescence quantum yield up to 1.8-fold. In the course of evolution, we also identified one key mutation (L177M), which generated a mutant (FusionRed-M) that displayed ∼2-fold higher brightness than its precursor upon expression in mammalian (HeLa) cells. Photo-physical and mutational analyses of clones isolated at the different stages of mutagenesis reveal the photo-physical evolution towards higher in vivo brightness.
Affiliation(s)
- Premashis Manna
  - JILA, NIST and University of Colorado, Boulder, Colorado 80309, USA
106
Lim CS, Wardell SJT, Kleffmann T, Brown CM. The exon-intron gene structure upstream of the initiation codon predicts translation efficiency. Nucleic Acids Res 2018; 46:4575-4591. [PMID: 29684192; PMCID: PMC5961209; DOI: 10.1093/nar/gky282] [Citation(s) in RCA: 22] [Impact Index Per Article: 3.1] [Received: 12/21/2017] [Revised: 03/28/2018] [Accepted: 04/06/2018] [Indexed: 12/16/2022]
Abstract
Introns in mRNA leaders are common in complex eukaryotes, but often overlooked. These introns are spliced out before translation, leaving exon-exon junctions in the mRNA leaders (leader EEJs). Our multi-omic approach shows that the number of leader EEJs inversely correlates with the main protein translation, as does the number of upstream open reading frames (uORFs). Across the five species studied, the lowest levels of translation were observed for mRNAs with both leader EEJs and uORFs (29%). This class of mRNAs also has ribosome footprints on uORFs, with strong triplet periodicity indicating uORF translation. Furthermore, the positions of both leader EEJ and uORF are conserved between human and mouse. Thus, the uORF, in combination with leader EEJ, predicts lower expression for nearly one-third of eukaryotic proteins.
Affiliation(s)
- Chun Shen Lim
  - Department of Biochemistry, School of Biomedical Sciences, University of Otago, Dunedin, New Zealand
- Samuel J T Wardell
  - Department of Biochemistry, School of Biomedical Sciences, University of Otago, Dunedin, New Zealand
- Torsten Kleffmann
  - Department of Biochemistry, School of Biomedical Sciences, University of Otago, Dunedin, New Zealand
- Chris M Brown
  - Department of Biochemistry, School of Biomedical Sciences, University of Otago, Dunedin, New Zealand
107
Abstract
Ribosomopathies are a group of human disorders most commonly caused by ribosomal protein haploinsufficiency or defects in ribosome biogenesis. These conditions manifest themselves as physiological defects in specific cell and tissue types. We review current molecular models to explain ribosomopathies and attempt to reconcile the tissue specificity of these disorders with the ubiquitous requirement for ribosomes in all cells. Ribosomopathies as a group are diverse in their origins and clinical manifestations; we use the well-described Diamond-Blackfan anemia (DBA) as a specific example to highlight some common features. We discuss ribosome homeostasis as an overarching principle that governs the sensitivity of specific cells and tissue types to ribosomal protein mutations. Mathematical models and experimental insights rationalize how even subtle shifts in the availability of ribosomes, such as those created by ribosome haploinsufficiency, can drive messenger RNA-specific effects on protein expression. We discuss recently identified roles played by ribosome rescue and recycling factors in regulating ribosome homeostasis.
Affiliation(s)
- Eric W Mills
  - Howard Hughes Medical Institute, Department of Molecular Biology and Genetics, Johns Hopkins University School of Medicine, Baltimore, MD 21205, USA
- Rachel Green
  - Howard Hughes Medical Institute, Department of Molecular Biology and Genetics, Johns Hopkins University School of Medicine, Baltimore, MD 21205, USA
108
Leppek K, Das R, Barna M. Functional 5' UTR mRNA structures in eukaryotic translation regulation and how to find them. Nat Rev Mol Cell Biol 2018; 19:158-174. [PMID: 29165424; PMCID: PMC5820134; DOI: 10.1038/nrm.2017.103] [Citation(s) in RCA: 584] [Impact Index Per Article: 83.4] [Indexed: 01/03/2023]
Abstract
RNA molecules can fold into intricate shapes that can provide an additional layer of control of gene expression beyond that of their sequence. In this Review, we discuss the current mechanistic understanding of structures in 5' untranslated regions (UTRs) of eukaryotic mRNAs and the emerging methodologies used to explore them. These structures may regulate cap-dependent translation initiation through helicase-mediated remodelling of RNA structures and higher-order RNA interactions, as well as cap-independent translation initiation through internal ribosome entry sites (IRESs), mRNA modifications and other specialized translation pathways. We discuss known 5' UTR RNA structures and how new structure probing technologies coupled with prospective validation, particularly compensatory mutagenesis, are likely to identify classes of structured RNA elements that shape post-transcriptional control of gene expression and the development of multicellular organisms.
Affiliation(s)
- Kathrin Leppek
  - Department of Developmental Biology, Stanford University, Stanford, California 94305, USA
  - Department of Genetics, Stanford University, Stanford, California 94305, USA
- Rhiju Das
  - Departments of Biochemistry and Physics, Stanford University, Stanford, California 94305, USA
- Maria Barna
  - Department of Developmental Biology, Stanford University, Stanford, California 94305, USA
  - Department of Genetics, Stanford University, Stanford, California 94305, USA
109
Unraveling the determinants of microRNA mediated regulation using a massively parallel reporter assay. Nat Commun 2018; 9:529. [PMID: 29410437; PMCID: PMC5802814; DOI: 10.1038/s41467-018-02980-z] [Citation(s) in RCA: 36] [Impact Index Per Article: 5.1] [Received: 04/28/2017] [Accepted: 01/11/2018] [Indexed: 12/16/2022]
Abstract
Despite extensive research, the sequence features affecting microRNA-mediated regulation are not well understood, limiting our ability to predict gene expression levels in both native and synthetic sequences. Here we employed a massively parallel reporter assay to investigate the effect of over 14,000 rationally designed 3′ UTR sequences on reporter construct repression. We found that multiple factors, including microRNA identity, hybridization energy, target accessibility, and target multiplicity, can be manipulated to achieve a predictable, up to 57-fold, change in protein repression. Moreover, we predict protein repression and RNA levels with high accuracy (R = 0.84 and R = 0.80, respectively) using only 3′ UTR sequence, as well as the effect of mutation in native 3′ UTRs on protein repression (R = 0.63). Taken together, our results elucidate the effect of different sequence features on miRNA-mediated regulation and demonstrate the predictability of their effect on gene expression with applications in regulatory genomics and synthetic biology. MiRNAs are known regulators of gene expression. Here the authors perform a large-scale massively parallel reporter assay to investigate the effect of a large number of designed 3′ UTR sequences on reporter expression and assess how features of miRNA regulatory elements affect miRNA-mediated repression.
110
Wisse LE, Penning R, Zaal EA, van Berkel CGM, Ter Braak TJ, Polder E, Kenney JW, Proud CG, Berkers CR, Altelaar MAF, Speijer D, van der Knaap MS, Abbink TEM. Proteomic and Metabolomic Analyses of Vanishing White Matter Mouse Astrocytes Reveal Deregulation of ER Functions. Front Cell Neurosci 2017; 11:411. [PMID: 29375313; PMCID: PMC5770689; DOI: 10.3389/fncel.2017.00411] [Citation(s) in RCA: 11] [Impact Index Per Article: 1.4] [Received: 09/11/2017] [Accepted: 12/07/2017] [Indexed: 12/20/2022]
Abstract
Vanishing white matter (VWM) is a leukodystrophy with predominantly early-childhood onset. Affected children display various neurological signs, including ataxia and spasticity, and die early. VWM patients have bi-allelic mutations in any of the five genes encoding the subunits of the eukaryotic translation factor 2B (eIF2B). eIF2B regulates protein synthesis rates under basal and cellular stress conditions. The underlying molecular mechanism of how mutations in eIF2B result in VWM is unknown. Previous studies suggest that brain white matter astrocytes are primarily affected in VWM. We hypothesized that the translation rate of certain astrocytic mRNAs is affected by the mutations, resulting in astrocytic dysfunction. Here we subjected primary astrocyte cultures of wild type (wt) and VWM (2b5ho) mice to pulsed labeling proteomics based on stable isotope labeling with amino acids in cell culture (SILAC) with an L-azidohomoalanine (AHA) pulse to select newly synthesized proteins. AHA was incorporated into newly synthesized proteins in wt and 2b5ho astrocytes with similar efficiency, without affecting cell viability. We quantified proteins synthesized in astrocytes of wt and 2b5ho mice. This proteomic profiling identified a total of 80 proteins that were regulated by the eIF2B mutation. We confirmed increased expression of PROS1 in 2b5ho astrocytes and brain. A DAVID enrichment analysis showed that approximately 50% of the eIF2B-regulated proteins used the secretory pathway. A small-scale metabolic screen further highlighted a significant change in the metabolite 6-phospho-gluconate, indicative of an altered flux through the pentose phosphate pathway (PPP). Some of the proteins migrating through the secretory pathway undergo oxidative folding reactions in the endoplasmic reticulum (ER), which produces reactive oxygen species (ROS). The PPP produces NADPH to remove ROS. The proteomic and metabolomics data together suggest a deregulation of ER function in 2b5ho mouse astrocytes.
Affiliation(s)
- Lisanne E Wisse
  - Pediatrics, VU University Medical Center, Amsterdam, Netherlands
- Renske Penning
  - Biomolecular Mass Spectrometry and Proteomics Group, Utrecht Institute for Pharmaceutical Sciences, Bijvoet Center for Biomolecular Research, Utrecht University, Utrecht, Netherlands
- Esther A Zaal
  - Biomolecular Mass Spectrometry and Proteomics Group, Utrecht Institute for Pharmaceutical Sciences, Bijvoet Center for Biomolecular Research, Utrecht University, Utrecht, Netherlands
- Timo J Ter Braak
  - Pediatrics, VU University Medical Center, Amsterdam, Netherlands
- Emiel Polder
  - Pediatrics, VU University Medical Center, Amsterdam, Netherlands
- Justin W Kenney
  - Centre for Biological Sciences, University of Southampton, Southampton, United Kingdom
- Christopher G Proud
  - Centre for Biological Sciences, University of Southampton, Southampton, United Kingdom
- Celia R Berkers
  - Biomolecular Mass Spectrometry and Proteomics Group, Utrecht Institute for Pharmaceutical Sciences, Bijvoet Center for Biomolecular Research, Utrecht University, Utrecht, Netherlands
- Maarten A F Altelaar
  - Biomolecular Mass Spectrometry and Proteomics Group, Utrecht Institute for Pharmaceutical Sciences, Bijvoet Center for Biomolecular Research, Utrecht University, Utrecht, Netherlands
- Dave Speijer
  - Medical Biochemistry, Academic Medical Center, Amsterdam, Netherlands
- Truus E M Abbink
  - Pediatrics, VU University Medical Center, Amsterdam, Netherlands
111
An RNA structure-mediated, posttranscriptional model of human α-1-antitrypsin expression. Proc Natl Acad Sci U S A 2017; 114:E10244-E10253. [PMID: 29109288; PMCID: PMC5703279; DOI: 10.1073/pnas.1706539114] [Citation(s) in RCA: 43] [Impact Index Per Article: 5.4] [Indexed: 11/28/2022]
Abstract
Protein and mRNA expression are in most cases poorly correlated, which suggests that the posttranscriptional regulatory program of a cell is an important component of gene expression. This regulatory network is still poorly understood, including how RNA structure quantitatively contributes to translational control. We present here a series of structural and functional experiments that together allow us to derive a quantitative, structure-dependent model of translation that accurately predicts translation efficiency in reporter assays and primary human tissue for a complex and medically important protein, α-1-antitrypsin. Our model demonstrates the importance of accurate, experimentally derived RNA structural models partnered with Kozak sequence information to explain protein expression and suggests a strategy by which α-1-antitrypsin expression may be increased in diseased individuals. Chronic obstructive pulmonary disease (COPD) affects over 65 million individuals worldwide, where α-1-antitrypsin deficiency is a major genetic cause of the disease. The α-1-antitrypsin gene, SERPINA1, expresses an exceptional number of mRNA isoforms generated entirely by alternative splicing in the 5′-untranslated region (5′-UTR). Although all SERPINA1 mRNAs encode exactly the same protein, expression levels of the individual mRNAs vary substantially in different human tissues. We hypothesize that these transcripts behave unequally due to a posttranscriptional regulatory program governed by their distinct 5′-UTRs and that this regulation ultimately determines α-1-antitrypsin expression. Using whole-transcript selective 2′-hydroxyl acylation by primer extension (SHAPE) chemical probing, we show that splicing yields distinct local 5′-UTR secondary structures in SERPINA1 transcripts. Splicing in the 5′-UTR also changes the inclusion of long upstream ORFs (uORFs). We demonstrate that disrupting the uORFs results in markedly increased translation efficiencies in luciferase reporter assays. These uORF-dependent changes suggest that α-1-antitrypsin protein expression levels are controlled at the posttranscriptional level. A leaky-scanning model of translation based on Kozak translation initiation sequences alone does not adequately explain our quantitative expression data. However, when we incorporate the experimentally derived RNA structure data, the model accurately predicts translation efficiencies in reporter assays and improves α-1-antitrypsin expression prediction in primary human tissues. Our results reveal that RNA structure governs a complex posttranscriptional regulatory program of α-1-antitrypsin expression. Crucially, these findings describe a mechanism by which genetic alterations in noncoding gene regions may result in α-1-antitrypsin deficiency.
112
Cuperus JT, Groves B, Kuchina A, Rosenberg AB, Jojic N, Fields S, Seelig G. Deep learning of the regulatory grammar of yeast 5' untranslated regions from 500,000 random sequences. Genome Res 2017; 27:2015-2024. [PMID: 29097404; PMCID: PMC5741052; DOI: 10.1101/gr.224964.117] [Citation(s) in RCA: 126] [Impact Index Per Article: 15.8] [Received: 05/12/2017] [Accepted: 10/18/2017] [Indexed: 11/25/2022]
Abstract
Our ability to predict protein expression from DNA sequence alone remains poor, reflecting our limited understanding of cis-regulatory grammar and hampering the design of engineered genes for synthetic biology applications. Here, we generate a model that predicts the protein expression of the 5′ untranslated region (UTR) of mRNAs in the yeast Saccharomyces cerevisiae. We constructed a library of half a million 50-nucleotide-long random 5′ UTRs and assayed their activity in a massively parallel growth selection experiment. The resulting data allow us to quantify the impact on protein expression of Kozak sequence composition, upstream open reading frames (uORFs), and secondary structure. We trained a convolutional neural network (CNN) on the random library and showed that it performs well at predicting the protein expression of both a held-out set of the random 5′ UTRs as well as native S. cerevisiae 5′ UTRs. The model additionally was used to computationally evolve highly active 5′ UTRs. We confirmed experimentally that the great majority of the evolved sequences led to higher protein expression rates than the starting sequences, demonstrating the predictive power of this model.
Affiliation(s)
- Josh T Cuperus
  - Department of Genome Sciences, University of Washington, Seattle, Washington 98195, USA; Howard Hughes Medical Institute, University of Washington, Seattle, Washington 98195, USA
- Benjamin Groves
  - Department of Electrical Engineering, University of Washington, Seattle, Washington 98195, USA
- Anna Kuchina
  - Department of Electrical Engineering, University of Washington, Seattle, Washington 98195, USA
- Alexander B Rosenberg
  - Department of Electrical Engineering, University of Washington, Seattle, Washington 98195, USA
- Stanley Fields
  - Department of Genome Sciences, University of Washington, Seattle, Washington 98195, USA; Howard Hughes Medical Institute, University of Washington, Seattle, Washington 98195, USA; Department of Medicine, University of Washington, Seattle, Washington 98195, USA
- Georg Seelig
  - Department of Electrical Engineering, University of Washington, Seattle, Washington 98195, USA; Department of Computer Science & Engineering, University of Washington, Seattle, Washington 98195, USA
113
Matreyek KA, Stephany JJ, Fowler DM. A platform for functional assessment of large variant libraries in mammalian cells. Nucleic Acids Res 2017; 45:e102. [PMID: 28335006; PMCID: PMC5499817; DOI: 10.1093/nar/gkx183] [Citation(s) in RCA: 87] [Impact Index Per Article: 10.9] [Received: 12/21/2016] [Accepted: 03/08/2017] [Indexed: 01/01/2023]
Abstract
Sequencing-based, massively parallel genetic assays have revolutionized our ability to quantify the relationship between many genotypes and a phenotype of interest. Unfortunately, variant library expression platforms in mammalian cells are far from ideal, hindering the study of human gene variants in their physiologically relevant cellular contexts. Here, we describe a platform for phenotyping variant libraries in transfectable mammalian cell lines in two steps. First, a landing pad cell line with a genomically integrated, Tet-inducible cassette containing a Bxb1 recombination site is created. Second, a single variant from a library of transfected, promoter-less plasmids is recombined into the landing pad in each cell. Thus, every cell in the recombined pool expresses a single variant, allowing for parallel, sequencing-based assessment of variant effect. We describe a method for incorporating a single landing pad into a defined site of a cell line of interest, and show that our approach can be used to generate more than 20 000 recombinant cells in a single experiment. Finally, we use our platform in combination with a sequencing-based assay to explore the N-end rule by simultaneously measuring the effects of all possible N-terminal amino acids on protein expression.
Affiliation(s)
- Kenneth A Matreyek
- Department of Genome Sciences, University of Washington, Seattle, WA 98195, USA
| | - Jason J Stephany
- Department of Bioengineering, University of Washington, Seattle, WA 98195, USA
| | - Douglas M Fowler
- Department of Genome Sciences, University of Washington, Seattle, WA 98195, USA.,Department of Bioengineering, University of Washington, Seattle, WA 98195, USA
| |
|
114
|
Fijalkowska D, Verbruggen S, Ndah E, Jonckheere V, Menschaert G, Van Damme P. eIF1 modulates the recognition of suboptimal translation initiation sites and steers gene expression via uORFs. Nucleic Acids Res 2017; 45:7997-8013. [PMID: 28541577 PMCID: PMC5570006 DOI: 10.1093/nar/gkx469] [Citation(s) in RCA: 54] [Impact Index Per Article: 6.8] [Received: 01/12/2017] [Accepted: 05/11/2017] [Indexed: 12/25/2022]
Abstract
Alternative translation initiation mechanisms such as leaky scanning and reinitiation potentiate the polycistronic nature of human transcripts. By allowing for reprogrammed translation, these mechanisms can mediate biological responses to stimuli. We combined proteomics with ribosome profiling and mRNA sequencing to identify the biological targets of translation control triggered by the eukaryotic translation initiation factor 1 (eIF1), a protein implicated in the stringency of start codon selection. We quantified expression changes of over 4000 proteins and 10 000 actively translated transcripts, leading to the identification of 245 transcripts undergoing translational control mediated by upstream open reading frames (uORFs) upon eIF1 deprivation. Here, the stringency of start codon selection and preference for an optimal nucleotide context were largely diminished leading to translational upregulation of uORFs with suboptimal start. Interestingly, genes affected by eIF1 deprivation were implicated in energy production and sensing of metabolic stress.
Affiliation(s)
- Daria Fijalkowska
- VIB-UGent Center for Medical Biotechnology, B-9000 Ghent, Belgium.,Department of Biochemistry, Ghent University, B-9000 Ghent, Belgium
| | - Steven Verbruggen
- Lab of Bioinformatics and Computational Genomics, Department of Mathematical Modelling, Statistics and Bioinformatics, Faculty of Bioscience Engineering, Ghent University, B-9000 Ghent, Belgium
| | - Elvis Ndah
- VIB-UGent Center for Medical Biotechnology, B-9000 Ghent, Belgium.,Department of Biochemistry, Ghent University, B-9000 Ghent, Belgium.,Lab of Bioinformatics and Computational Genomics, Department of Mathematical Modelling, Statistics and Bioinformatics, Faculty of Bioscience Engineering, Ghent University, B-9000 Ghent, Belgium
| | - Veronique Jonckheere
- VIB-UGent Center for Medical Biotechnology, B-9000 Ghent, Belgium.,Department of Biochemistry, Ghent University, B-9000 Ghent, Belgium
| | - Gerben Menschaert
- Lab of Bioinformatics and Computational Genomics, Department of Mathematical Modelling, Statistics and Bioinformatics, Faculty of Bioscience Engineering, Ghent University, B-9000 Ghent, Belgium
| | - Petra Van Damme
- VIB-UGent Center for Medical Biotechnology, B-9000 Ghent, Belgium.,Department of Biochemistry, Ghent University, B-9000 Ghent, Belgium
| |
|
115
|
Evfratov SA, Osterman IA, Komarova ES, Pogorelskaya AM, Rubtsova MP, Zatsepin TS, Semashko TA, Kostryukova ES, Mironov AA, Burnaev E, Krymova E, Gelfand MS, Govorun VM, Bogdanov AA, Sergiev PV, Dontsova OA. Application of sorting and next generation sequencing to study 5΄-UTR influence on translation efficiency in Escherichia coli. Nucleic Acids Res 2017; 45:3487-3502. [PMID: 27899632 PMCID: PMC5389652 DOI: 10.1093/nar/gkw1141] [Citation(s) in RCA: 40] [Impact Index Per Article: 5.0] [Received: 05/31/2016] [Accepted: 10/31/2016] [Indexed: 12/24/2022]
Abstract
Yield of protein per translated mRNA may vary by four orders of magnitude. Many studies have analyzed the influence of mRNA features on the translation yield. However, a detailed understanding of how mRNA sequence determines its propensity to be translated is still missing. Here, we constructed a set of reporter plasmid libraries encoding CER fluorescent protein preceded by randomized 5΄ untranslated regions (5΄-UTR) and Red fluorescent protein (RFP) used as an internal control. Each library was transformed into Escherichia coli cells, separated by efficiency of CER mRNA translation by a cell sorter and subjected to next generation sequencing. We tested efficiency of translation of the CER gene preceded by each of 48 natural 5΄-UTR sequences and introduced random and designed mutations into natural and artificially selected 5΄-UTRs. Several distinct properties could be ascribed to a group of 5΄-UTRs most efficient in translation. In addition to known ones, several previously unrecognized features that contribute to the translation enhancement were found, such as low proportion of cytidine residues, multiple SD sequences and AG repeats. The latter could be identified as a translation enhancer, albeit less efficient than the SD sequence, in several natural 5΄-UTRs.
Affiliation(s)
- Sergey A Evfratov
- Department of Chemistry, Faculty of Bioinformatics and Bioengeneering, Lomonosov Moscow State University, Moscow, 119992, Russia
| | - Ilya A Osterman
- Department of Chemistry, Faculty of Bioinformatics and Bioengeneering, Lomonosov Moscow State University, Moscow, 119992, Russia.,Skolkovo Institute of Science and Technology, Skolkovo, Moscow, 143025, Russia.,A.N. Belozersky Institute of Physico-Chemical Biology, Lomonosov Moscow State University, Moscow, 119992, Russia
| | - Ekaterina S Komarova
- Department of Chemistry, Faculty of Bioinformatics and Bioengeneering, Lomonosov Moscow State University, Moscow, 119992, Russia
| | - Alexandra M Pogorelskaya
- Department of Chemistry, Faculty of Bioinformatics and Bioengeneering, Lomonosov Moscow State University, Moscow, 119992, Russia
| | - Maria P Rubtsova
- Department of Chemistry, Faculty of Bioinformatics and Bioengeneering, Lomonosov Moscow State University, Moscow, 119992, Russia.,Skolkovo Institute of Science and Technology, Skolkovo, Moscow, 143025, Russia.,A.N. Belozersky Institute of Physico-Chemical Biology, Lomonosov Moscow State University, Moscow, 119992, Russia
| | - Timofei S Zatsepin
- Department of Chemistry, Faculty of Bioinformatics and Bioengeneering, Lomonosov Moscow State University, Moscow, 119992, Russia.,Skolkovo Institute of Science and Technology, Skolkovo, Moscow, 143025, Russia.,A.N. Belozersky Institute of Physico-Chemical Biology, Lomonosov Moscow State University, Moscow, 119992, Russia
| | - Tatiana A Semashko
- Research Institute for Physical-Chemical Medicine, FMBA, Moscow, 119435, Russia
| | - Elena S Kostryukova
- Research Institute for Physical-Chemical Medicine, FMBA, Moscow, 119435, Russia.,Moscow Institute of Physics and Technology, Dolgoprudny, Moscow, 141700, Russia
| | - Andrey A Mironov
- Department of Chemistry, Faculty of Bioinformatics and Bioengeneering, Lomonosov Moscow State University, Moscow, 119992, Russia.,A.N. Belozersky Institute of Physico-Chemical Biology, Lomonosov Moscow State University, Moscow, 119992, Russia
| | - Evgeny Burnaev
- Skolkovo Institute of Science and Technology, Skolkovo, Moscow, 143025, Russia.,A.A. Kharkevich Institute for Information Transmission Problems, Russian Academy of Sciences, Moscow, 127051, Russia
| | - Ekaterina Krymova
- A.A. Kharkevich Institute for Information Transmission Problems, Russian Academy of Sciences, Moscow, 127051, Russia
| | - Mikhail S Gelfand
- Department of Chemistry, Faculty of Bioinformatics and Bioengeneering, Lomonosov Moscow State University, Moscow, 119992, Russia.,Skolkovo Institute of Science and Technology, Skolkovo, Moscow, 143025, Russia.,A.N. Belozersky Institute of Physico-Chemical Biology, Lomonosov Moscow State University, Moscow, 119992, Russia.,A.A. Kharkevich Institute for Information Transmission Problems, Russian Academy of Sciences, Moscow, 127051, Russia.,National Research University Higher School of Economics, Moscow, 123458, Russia
| | - Vadim M Govorun
- Research Institute for Physical-Chemical Medicine, FMBA, Moscow, 119435, Russia
| | - Alexey A Bogdanov
- Department of Chemistry, Faculty of Bioinformatics and Bioengeneering, Lomonosov Moscow State University, Moscow, 119992, Russia.,A.N. Belozersky Institute of Physico-Chemical Biology, Lomonosov Moscow State University, Moscow, 119992, Russia
| | - Petr V Sergiev
- Department of Chemistry, Faculty of Bioinformatics and Bioengeneering, Lomonosov Moscow State University, Moscow, 119992, Russia.,Skolkovo Institute of Science and Technology, Skolkovo, Moscow, 143025, Russia.,A.N. Belozersky Institute of Physico-Chemical Biology, Lomonosov Moscow State University, Moscow, 119992, Russia
| | - Olga A Dontsova
- Department of Chemistry, Faculty of Bioinformatics and Bioengeneering, Lomonosov Moscow State University, Moscow, 119992, Russia.,Skolkovo Institute of Science and Technology, Skolkovo, Moscow, 143025, Russia.,A.N. Belozersky Institute of Physico-Chemical Biology, Lomonosov Moscow State University, Moscow, 119992, Russia
| |
|
116
|
Kondratov O, Marsic D, Crosson SM, Mendez-Gomez HR, Moskalenko O, Mietzsch M, Heilbronn R, Allison JR, Green KB, Agbandje-McKenna M, Zolotukhin S. Direct Head-to-Head Evaluation of Recombinant Adeno-associated Viral Vectors Manufactured in Human versus Insect Cells. Mol Ther 2017; 25:2661-2675. [PMID: 28890324 DOI: 10.1016/j.ymthe.2017.08.003] [Citation(s) in RCA: 62] [Impact Index Per Article: 7.8] [Received: 03/13/2017] [Revised: 07/19/2017] [Accepted: 08/07/2017] [Indexed: 10/19/2022]
Abstract
The major drawback of the Baculovirus/Sf9 system for recombinant adeno-associated viral (rAAV) manufacturing is that most of the Bac-derived rAAV vector serotypes, with few exceptions, demonstrate altered capsid compositions and lower biological potencies. Here, we describe a new insect cell-based production platform utilizing an attenuated Kozak sequence and leaky ribosome scanning to achieve a serotype-specific modulation of AAV capsid protein stoichiometry. By way of example, rAAV5 and rAAV9 were produced and comprehensively characterized side by side with HEK293-derived vectors. A mass spectrometry analysis documented a 3-fold increase in both viral protein (VP)1 and VP2 capsid protein content compared with human cell-derived vectors. Furthermore, we conducted an extensive analysis of encapsidated single-stranded viral DNA using next-generation sequencing and show a 6-fold reduction in collaterally packaged contaminating DNA for rAAV5 produced in insect cells. Consequently, the re-designed rAAVs demonstrated significantly higher biological potencies, even in a comparison with HEK293-manufactured rAAVs, mediating, in the case of rAAV5, 4-fold higher transduction of brain tissues in mice. Thus, the described system yields rAAV vectors of superior infectivity and higher genetic identity, providing a scalable platform for good manufacturing practice (GMP)-grade vector production.
Affiliation(s)
- Oleksandr Kondratov
- Department of Pediatrics, University of Florida College of Medicine, Gainesville, FL 32610, USA
| | - Damien Marsic
- Department of Pediatrics, University of Florida College of Medicine, Gainesville, FL 32610, USA
| | - Sean M Crosson
- Department of Pediatrics, University of Florida College of Medicine, Gainesville, FL 32610, USA
| | - Hector R Mendez-Gomez
- Department of Molecular Genetics and Microbiology, University of Florida College of Medicine, Gainesville, FL 32610, USA
| | - Oleksandr Moskalenko
- UFIT Research Computing, University of Florida College of Medicine, Gainesville, FL 32610, USA
| | - Mario Mietzsch
- Department of Biochemistry and Molecular Biology, Center for Structural Biology, University of Florida College of Medicine, Gainesville, FL 32610, USA; Institute of Virology, Campus Benjamin Franklin, Charité Medical School, Berlin, Germany
| | - Regine Heilbronn
- Institute of Virology, Campus Benjamin Franklin, Charité Medical School, Berlin, Germany
| | | | - Kari B Green
- Department of Chemistry, University of Florida, Gainesville, FL, USA
| | - Mavis Agbandje-McKenna
- Department of Biochemistry and Molecular Biology, Center for Structural Biology, University of Florida College of Medicine, Gainesville, FL 32610, USA
| | - Sergei Zolotukhin
- Department of Pediatrics, University of Florida College of Medicine, Gainesville, FL 32610, USA.
| |
|
117
|
Lock FE, Babaian A, Zhang Y, Gagnier L, Kuah S, Weberling A, Karimi MM, Mager DL. A novel isoform of IL-33 revealed by screening for transposable element promoted genes in human colorectal cancer. PLoS One 2017; 12:e0180659. [PMID: 28715472 PMCID: PMC5513427 DOI: 10.1371/journal.pone.0180659] [Citation(s) in RCA: 16] [Impact Index Per Article: 2.0] [Received: 03/31/2017] [Accepted: 06/19/2017] [Indexed: 02/06/2023]
Abstract
Remnants of ancient transposable elements (TEs) are abundant in mammalian genomes. These sequences contain multiple regulatory motifs and hence are capable of influencing expression of host genes. TEs are known to be released from epigenetic repression and can become transcriptionally active in cancer. Such activation could also lead to lineage-inappropriate activation of oncogenes, as previously described in lymphomas. However, there are few reports of this mechanism occurring in non-blood cancers. Here, we re-analyzed whole transcriptome data from a large cohort of patients with colon cancer, compared to matched normal colon control samples, to detect genes or transcripts ectopically expressed through activation of TE promoters. Among many such transcripts, we identified six where the affected gene has a described role in cancer and where the TE-driven gene mRNA is expressed in primary colon cancer, but not in normal matched tissue, and confirmed expression in colon cancer-derived cell lines. We further characterized a TE-gene chimeric transcript involving the Interleukin 33 (IL-33) gene (termed LTR-IL-33) that is ectopically expressed in a subset of colon cancer samples through the use of an endogenous retroviral long terminal repeat (LTR) promoter of the MSTD family. The LTR-IL-33 chimeric transcript encodes a novel shorter isoform of the protein, which is missing the initial N-terminus (including many conserved residues) of native IL-33. In vitro studies showed that LTR-IL-33 expression is required for optimal CRC cell line growth as 3D colonospheres. Taken together, these data demonstrate the significance of TEs as regulators of aberrant gene expression in colon cancer.
Affiliation(s)
- Frances E. Lock
- Terry Fox Laboratory, British Columbia Cancer Agency, Vancouver, BC, Canada
- Department of Medical Genetics, University of British Columbia, Vancouver, BC, Canada
| | - Artem Babaian
- Terry Fox Laboratory, British Columbia Cancer Agency, Vancouver, BC, Canada
- Department of Medical Genetics, University of British Columbia, Vancouver, BC, Canada
| | - Ying Zhang
- Terry Fox Laboratory, British Columbia Cancer Agency, Vancouver, BC, Canada
- Department of Medical Genetics, University of British Columbia, Vancouver, BC, Canada
| | - Liane Gagnier
- Terry Fox Laboratory, British Columbia Cancer Agency, Vancouver, BC, Canada
- Department of Medical Genetics, University of British Columbia, Vancouver, BC, Canada
| | - Sabrina Kuah
- Terry Fox Laboratory, British Columbia Cancer Agency, Vancouver, BC, Canada
| | - Antonia Weberling
- Terry Fox Laboratory, British Columbia Cancer Agency, Vancouver, BC, Canada
| | - Mohammad M. Karimi
- Department of Medical Genetics, University of British Columbia, Vancouver, BC, Canada
- Département de Génomique Fonctionnelle et Cancer, Institut de Génétique et Biologie Moléculaire et Cellulaire (IGBMC)/Université de Strasbourg/CNRS/INSERM, France
| | - Dixie L. Mager
- Terry Fox Laboratory, British Columbia Cancer Agency, Vancouver, BC, Canada
- Department of Medical Genetics, University of British Columbia, Vancouver, BC, Canada
| |
|
118
|
Abstract
MOTIVATION: Translation initiation is a key step in the regulation of gene expression. In addition to the annotated translation initiation sites (TISs), the translation process may also start at multiple alternative TISs (including both AUG and non-AUG codons), which makes it challenging to predict TISs and study the underlying regulatory mechanisms. Meanwhile, the advent of several high-throughput sequencing techniques for profiling initiating ribosomes at single-nucleotide resolution, e.g. GTI-seq and QTI-seq, provides abundant data for systematically studying the general principles of translation initiation and for developing computational methods for TIS identification.
METHODS: We have developed a deep learning-based framework, named TITER, for accurately predicting TISs on a genome-wide scale based on QTI-seq data. TITER extracts the sequence features of translation initiation from the surrounding sequence contexts of TISs using a hybrid neural network and further integrates the prior preference of TIS codon composition into a unified prediction framework.
RESULTS: Extensive tests demonstrated that TITER can greatly outperform the state-of-the-art prediction methods in identifying TISs. In addition, TITER was able to identify important sequence signatures for individual types of TIS codons, including a Kozak-sequence-like motif for the AUG start codon. Furthermore, the TITER prediction score can be related to the strength of translation initiation in various biological scenarios, including the repressive effect of the upstream open reading frames on gene expression and the mutational effects influencing translation initiation efficiency.
AVAILABILITY AND IMPLEMENTATION: TITER is available as open-source software and can be downloaded from https://github.com/zhangsaithu/titer.
CONTACT: lzhang20@mail.tsinghua.edu.cn or zengjy321@tsinghua.edu.cn.
SUPPLEMENTARY INFORMATION: Supplementary data are available at Bioinformatics online.
Affiliation(s)
- Sai Zhang
- Institute for Interdisciplinary Information Sciences, Tsinghua University, Beijing, China
| | - Hailin Hu
- School of Medicine, Tsinghua University, Beijing, China
| | - Tao Jiang
- Department of Computer Science and Engineering, University of California, Riverside, CA, USA
- MOE Key Lab of Bioinformatics and Bioinformatics Division, TNLIST/Department of Computer Science and Technology, Tsinghua University, Beijing, China
- Institute of Integrative Genome Biology, University of California, Riverside, CA, USA
| | - Lei Zhang
- School of Medicine, Tsinghua University, Beijing, China
| | - Jianyang Zeng
- Institute for Interdisciplinary Information Sciences, Tsinghua University, Beijing, China
| |
|
119
|
Abstract
The non-canonical initiation factors DENR and MCTS1 have been linked to cancer and autism. We recently showed in Drosophila that DENR and MCTS1 regulate translation re-initiation on transcripts containing upstream Open Reading Frames (uORFs) with strong Kozak sequences (stuORFs). Due to the medical relevance of DENR and MCTS1, it is worthwhile identifying the transcripts in human cells that depend on DENR and MCTS1 for their translation. We show here that in humans, as in Drosophila, transcripts with short stuORFs require DENR and MCTS1 for their optimal expression. In contrast to Drosophila, however, the dependence on stuORF length in human cells is very strong, so that only transcripts with very short stuORFs coding for 1 amino acid are dependent on DENR and MCTS1. This identifies circa 100 genes as putative DENR and MCTS1 translational targets. These genes are enriched for neuronal genes and G protein-coupled receptors. The identification of DENR and MCTS1 target transcripts will serve as a basis for future studies aimed at understanding the mechanistic involvement of DENR and MCTS1 in cancer and autism.
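The shortest uORFs mentioned in this abstract, those coding for a single amino acid, correspond to an AUG followed directly by a stop codon. A minimal Python sketch of how such start-stop elements could be located in a 5' leader is given below. The function name and the input convention (a 0-based index for the annotated CDS start) are assumptions made for illustration, and the sketch does not score Kozak strength, so it is not the authors' target-identification procedure.

```python
# Illustrative sketch only: find one-codon uORFs (AUG immediately followed
# by a stop codon) that lie entirely within the 5' leader of an mRNA.
# Kozak context is NOT evaluated here; names are hypothetical.

STOP_CODONS = {"UAA", "UAG", "UGA"}


def one_codon_uorfs(mrna: str, cds_start: int) -> list[int]:
    """Return 0-based positions of AUG-stop uORFs upstream of cds_start."""
    seq = mrna.upper().replace("T", "U")  # accept DNA or RNA alphabet
    hits = []
    # The six-nucleotide uORF must end at or before the CDS start.
    for pos in range(max(0, cds_start - 5)):
        if seq[pos:pos + 3] == "AUG" and seq[pos + 3:pos + 6] in STOP_CODONS:
            hits.append(pos)
    return hits


if __name__ == "__main__":
    # Toy transcript: AUG-UAA at position 2, annotated CDS start at position 10.
    print(one_codon_uorfs("GGATGTAAGGATGGCC", 10))
```

A fuller analysis would add a context score for each hit, since the abstract restricts targets to uORFs with strong Kozak sequences.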
|
120
|
Andreev DE, O'Connor PBF, Loughran G, Dmitriev SE, Baranov PV, Shatsky IN. Insights into the mechanisms of eukaryotic translation gained with ribosome profiling. Nucleic Acids Res 2016; 45:513-526. [PMID: 27923997 PMCID: PMC5314775 DOI: 10.1093/nar/gkw1190] [Citation(s) in RCA: 106] [Impact Index Per Article: 11.8] [Received: 07/21/2016] [Revised: 10/31/2016] [Accepted: 11/18/2016] [Indexed: 12/29/2022]
Abstract
The development of Ribosome Profiling (RiboSeq) has revolutionized functional genomics. RiboSeq is based on capturing and sequencing of the mRNA fragments enclosed within the translating ribosome and it thereby provides a ‘snapshot’ of ribosome positions at the transcriptome-wide level. Although the method is predominantly used for analysis of differential gene expression and discovery of novel translated ORFs, the RiboSeq data can also be a rich source of information about molecular mechanisms of polypeptide synthesis and translational control. This review will focus on how recent findings made with RiboSeq have revealed important details of the molecular mechanisms of translation in eukaryotes. These include mRNA translation sensitivity to drugs affecting translation initiation and elongation, the roles of upstream ORFs in response to stress, the dynamics of elongation and termination, as well as details of intrinsic ribosome behavior on the mRNA after translation termination. As the RiboSeq method is still at a relatively early stage, we will also discuss the implications of RiboSeq artifacts for data interpretation.
Affiliation(s)
- Dmitry E Andreev
- Belozersky Institute of Physico-Chemical Biology, Lomonosov Moscow State University, Moscow 119234, Russia
| | | | - Gary Loughran
- School of Biochemistry and Cell Biology, University College Cork, Cork, Ireland
| | - Sergey E Dmitriev
- Belozersky Institute of Physico-Chemical Biology, Lomonosov Moscow State University, Moscow 119234, Russia
| | - Pavel V Baranov
- School of Biochemistry and Cell Biology, University College Cork, Cork, Ireland
| | - Ivan N Shatsky
- Belozersky Institute of Physico-Chemical Biology, Lomonosov Moscow State University, Moscow 119234, Russia
| |
|
121
|
Re A, Waldron L, Quattrone A. Control of Gene Expression by RNA Binding Protein Action on Alternative Translation Initiation Sites. PLoS Comput Biol 2016; 12:e1005198. [PMID: 27923063 PMCID: PMC5140048 DOI: 10.1371/journal.pcbi.1005198] [Citation(s) in RCA: 6] [Impact Index Per Article: 0.7] [Received: 10/04/2015] [Accepted: 10/13/2016] [Indexed: 11/18/2022]
Abstract
Transcript levels do not faithfully predict protein levels, due to post-transcriptional regulation of gene expression mediated by RNA binding proteins (RBPs) and non-coding RNAs. We developed a multivariate linear regression model integrating RBP levels and predicted RBP-mRNA regulatory interactions from matched transcript and protein datasets. RBPs significantly improved the accuracy in predicting protein abundance of a portion of the total modeled mRNAs in three panels of tissues and cells and for different methods employed in the detection of mRNA and protein. The presence of upstream translation initiation sites (uTISs) at the mRNA 5’ untranslated regions was strongly associated with improvement in predictive accuracy. On the basis of these observations, we propose that the recently discovered widespread uTISs in the human genome can be a previously unappreciated substrate of translational control mediated by RBPs.
Gene expression is a dynamic program by which the information stored in the genome is rendered functional by production and degradation of two types of macromolecules, RNAs and proteins. mRNAs are templates for proteins; therefore we expect correspondence between quantities of mRNAs and proteins. Genome-wide studies instead indicate a marked discrepancy between them, when considering their steady-state levels or their variations across different conditions. We employed linear regression approaches with paired mRNA/protein datasets in order to develop a model predicting the protein level of a gene from both the mRNA level and the protein levels of RBPs inferred to bind the mRNA untranslated regions. The results of our analyses restricted the utility of RBPs to improve accuracy of predicted protein abundance to a small fraction of the total modelled genes, and identified a novel association of the improvement induced by RBPs with the presence of upstream translation sites. This finding suggests a new avenue of experimental studies aimed at exploring the hypothesis that RBPs could influence protein abundance by changing the preference for certain translation initiation sites.
Affiliation(s)
- Angela Re
- Laboratory of Translational Genomics, Centre for Integrative Biology, University of Trento, Polo Scientifico e Tecnologico Fabio Ferrari, Trento, Italy
| | - Levi Waldron
- City University of New York Graduate School of Public Health and Health Policy, New York, New York, United States of America
| | - Alessandro Quattrone
- Laboratory of Translational Genomics, Centre for Integrative Biology, University of Trento, Polo Scientifico e Tecnologico Fabio Ferrari, Trento, Italy
| |
|
122
|
Reuter K, Biehl A, Koch L, Helms V. PreTIS: A Tool to Predict Non-canonical 5' UTR Translational Initiation Sites in Human and Mouse. PLoS Comput Biol 2016; 12:e1005170. [PMID: 27768687 PMCID: PMC5074520 DOI: 10.1371/journal.pcbi.1005170] [Citation(s) in RCA: 31] [Impact Index Per Article: 3.4] [Received: 02/29/2016] [Accepted: 09/27/2016] [Indexed: 02/03/2023]
Abstract
Translation of mRNA sequences into proteins typically starts at an AUG triplet. In rare cases, translation may also start at alternative non-AUG codons located in the annotated 5' UTR, which leads to an increased regulatory complexity. Since ribosome profiling detects translational start sites at the nucleotide level, the properties of these start sites can then be used for the statistical evaluation of functional open reading frames. We developed a linear regression approach to predict in-frame and out-of-frame translational start sites within the 5' UTR from mRNA sequence information together with their translation initiation confidence. Predicted start codons comprise AUG as well as near-cognate codons. The underlying datasets are based on published translational start sites for human HEK293 and mouse embryonic stem cells that were derived by the original authors from ribosome profiling data. The average prediction accuracy of true vs. false start sites for HEK293 cells was 80%. When applied to mouse mRNA sequences, the same model predicted translation initiation sites observed in mouse ES cells with an accuracy of 76%. Moreover, we illustrate the effect of in silico mutations in the flanking sequence context of a start site on the predicted initiation confidence. Our new webservice PreTIS visualizes alternative start sites and their respective ORFs and predicts their ability to initiate translation. Only the mRNA sequence is required as input. PreTIS is accessible at http://service.bioinformatik.uni-saarland.de/pretis.
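To make the notions of near-cognate codons and in-frame versus out-of-frame start sites concrete, the following Python sketch enumerates candidate start codons in a 5' UTR. It covers only the candidate-listing step; the regression model and its features are not reproduced. The function name, the 0-based CDS start argument, and the definition of near-cognate as "differs from AUG at exactly one position" are assumptions made for illustration and are not taken from PreTIS itself.

```python
# Illustrative sketch only: list AUG and near-cognate codons in a 5' UTR and
# flag whether each is in frame with the annotated CDS start.
# Assumption: near-cognate = one-nucleotide difference from AUG.

NEAR_COGNATE = {"CUG", "GUG", "UUG", "ACG", "AAG", "AGG", "AUA", "AUC", "AUU"}
START_CODONS = {"AUG"} | NEAR_COGNATE


def candidate_starts(mrna: str, cds_start: int) -> list[tuple[int, str, bool]]:
    """Return (position, codon, in_frame) for each candidate in the 5' UTR.

    Positions are 0-based; cds_start is the index of the annotated start codon.
    """
    seq = mrna.upper().replace("T", "U")  # accept DNA or RNA alphabet
    hits = []
    # Only codons lying completely upstream of the annotated CDS are reported.
    for pos in range(max(0, cds_start - 2)):
        codon = seq[pos:pos + 3]
        if codon in START_CODONS:
            in_frame = (cds_start - pos) % 3 == 0
            hits.append((pos, codon, in_frame))
    return hits


if __name__ == "__main__":
    # Toy transcript: CUG at position 0, annotated AUG at position 6.
    print(candidate_starts("CTGAAAATGGCC", 6))  # → [(0, 'CUG', True)]
```

Each candidate returned this way would still need a confidence estimate from its flanking sequence context, which is the part the abstract attributes to the regression model.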
Affiliation(s)
- Kerstin Reuter
- Center for Bioinformatics, Saarland University, Saarbrücken, Germany
- Saarbrücken Graduate School of Computer Science, Saarland University, Saarbrücken, Germany
| | - Alexander Biehl
- Center for Bioinformatics, Saarland University, Saarbrücken, Germany
| | - Laurena Koch
- Center for Bioinformatics, Saarland University, Saarbrücken, Germany
| | - Volkhard Helms
- Center for Bioinformatics, Saarland University, Saarbrücken, Germany
| |
|
123
|
Tzani I, Ivanov IP, Andreev DE, Dmitriev RI, Dean KA, Baranov PV, Atkins JF, Loughran G. Systematic analysis of the PTEN 5' leader identifies a major AUU initiated proteoform. Open Biol 2016; 6:rsob.150203. [PMID: 27249819 PMCID: PMC4892431 DOI: 10.1098/rsob.150203] [Citation(s) in RCA: 32] [Impact Index Per Article: 3.6] [Received: 10/20/2015] [Accepted: 04/26/2016] [Indexed: 12/22/2022]
Abstract
Abundant evidence for translation within the 5' leaders of many human genes is rapidly emerging, especially because of the advent of ribosome profiling. In most cases, it is believed that the act of translation rather than the encoded peptide is important. However, the wealth of available sequencing data in recent years allows phylogenetic detection of sequences within 5' leaders that have emerged under coding constraint and therefore allows for the prediction of functional 5' leader translation. Using this approach, we previously predicted a CUG-initiated, 173 amino acid N-terminal extension to the human tumour suppressor PTEN. Here, a systematic experimental analysis of translation events in the PTEN 5' leader identifies at least two additional non-AUG-initiated PTEN proteoforms that are expressed in most human cell lines tested. The most abundant extended PTEN proteoform initiates at a conserved AUU codon and extends the canonical AUG-initiated PTEN by 146 amino acids. All N-terminally extended PTEN proteoforms tested retain the ability to downregulate the PI3K pathway. We also provide evidence for the translation of two conserved AUG-initiated upstream open reading frames within the PTEN 5' leader that control the ratio of PTEN proteoforms.
Affiliation(s)
- Ioanna Tzani
- School of Biochemistry and Cell Biology, University College Cork, Cork, Ireland
| | - Ivaylo P Ivanov
- Eunice Kennedy Shriver National Institute of Child Health and Human Development, National Institutes of Health, Bethesda, MD 20892, USA
| | - Dmitri E Andreev
- School of Biochemistry and Cell Biology, University College Cork, Cork, Ireland; Belozersky Institute of Physico-Chemical Biology, Lomonosov Moscow State University, Moscow 119992, Russia
| | - Ruslan I Dmitriev
- School of Biochemistry and Cell Biology, University College Cork, Cork, Ireland
| | - Kellie A Dean
- School of Biochemistry and Cell Biology, University College Cork, Cork, Ireland
| | - Pavel V Baranov
- School of Biochemistry and Cell Biology, University College Cork, Cork, Ireland
| | - John F Atkins
- School of Biochemistry and Cell Biology, University College Cork, Cork, Ireland; Department of Human Genetics, University of Utah, Salt Lake City, UT 84112-5330, USA
| | - Gary Loughran
- School of Biochemistry and Cell Biology, University College Cork, Cork, Ireland
| |
|
124
|
Chew GL, Pauli A, Schier AF. Conservation of uORF repressiveness and sequence features in mouse, human and zebrafish. Nat Commun 2016; 7:11663. [PMID: 27216465 PMCID: PMC4890304 DOI: 10.1038/ncomms11663] [Citation(s) in RCA: 129] [Impact Index Per Article: 14.3] [Received: 02/06/2016] [Accepted: 04/18/2016] [Indexed: 02/07/2023]
Abstract
Upstream open reading frames (uORFs) are ubiquitous repressive genetic elements in vertebrate mRNAs. While much is known about the regulation of individual genes by their uORFs, the range of uORF-mediated translational repression in vertebrate genomes is largely unexplored. Moreover, it is unclear whether the repressive effects of uORFs are conserved across species. To address these questions, we analyse transcript sequences and ribosome profiling data from human, mouse and zebrafish. We find that uORFs are depleted near coding sequences (CDSes) and have initiation contexts that diminish their translation. Linear modelling reveals that sequence features at both uORFs and CDSes modulate the translation of CDSes. Moreover, the ratio of translation over 5′ leaders and CDSes is conserved between human and mouse, and correlates with the number of uORFs. These observations suggest that the prevalence of vertebrate uORFs may be explained by their conserved role in repressing CDS translation.
Upstream open reading frames (uORFs) can repress gene expression. Here, Guo-Liang Chew and colleagues use bioinformatics approaches to show that conservation of uORF-mediated translational repression is mediated by sequence features in human, mouse and zebrafish genomes.
Affiliation(s)
- Guo-Liang Chew
- Department of Molecular and Cellular Biology, Harvard University, Cambridge, Massachusetts 02138, USA
- Andrea Pauli
- Department of Molecular and Cellular Biology, Harvard University, Cambridge, Massachusetts 02138, USA
- Alexander F Schier
- Department of Molecular and Cellular Biology, Harvard University, Cambridge, Massachusetts 02138, USA; The Broad Institute of Massachusetts Institute of Technology and Harvard, Cambridge, Massachusetts 02142, USA; FAS Center for Systems Biology, Harvard University, Cambridge, Massachusetts 02138, USA; Center for Brain Science, Harvard University, Cambridge, Massachusetts 02138, USA; Harvard Stem Cell Institute, Harvard University, Cambridge, Massachusetts 02138, USA
|
125
|
Pisareva VP, Pisarev AV. DHX29 reduces leaky scanning through an upstream AUG codon regardless of its nucleotide context. Nucleic Acids Res 2016; 44:4252-65. [PMID: 27067542 PMCID: PMC4872109 DOI: 10.1093/nar/gkw240] [Citation(s) in RCA: 16] [Impact Index Per Article: 1.8] [Received: 11/24/2015] [Accepted: 03/29/2016] [Indexed: 11/23/2022] Open
Abstract
During eukaryotic translation initiation, the 43S preinitiation complex (43S PIC), consisting of the 40S ribosomal subunit, eukaryotic initiation factors (eIFs) and initiator tRNA, scans mRNA to find an appropriate start codon. Key roles in the accuracy of initiation codon selection belong to eIF1 and eIF1A, whereas the mammalian-specific DHX29 helicase substantially contributes to ribosomal scanning of structured mRNAs. Here, we show that DHX29 stimulates the recognition of the AUG codon but not the near-cognate CUG codon regardless of its nucleotide context during ribosomal scanning. The stimulatory effect depends on the contact between DHX29 and eIF1A. The unique DHX29 N-terminal domain binds to the ribosomal site near the mRNA entrance, where it contacts the eIF1A OB domain. UV crosslinking assays revealed that DHX29 may rearrange eIF1A and eIF2α in key nucleotide context positions of ribosomal complexes. Interestingly, DHX29 impedes 48S initiation complex formation in the absence of eIF1A, perhaps by forming a physical barrier that prevents the 43S PIC from loading onto mRNA. Mutational analysis allowed us to split the mRNA unwinding and codon selection activities of DHX29. Thus, DHX29 is another example of an initiation factor contributing to start codon selection.
Affiliation(s)
- Vera P Pisareva
- Department of Cell Biology, SUNY Downstate Medical Center, 450 Clarkson Ave, Brooklyn, NY 11203, USA
- Andrey V Pisarev
- Department of Cell Biology, SUNY Downstate Medical Center, 450 Clarkson Ave, Brooklyn, NY 11203, USA
|
126
|
McKeague M, Wong RS, Smolke CD. Opportunities in the design and application of RNA for gene expression control. Nucleic Acids Res 2016; 44:2987-99. [PMID: 26969733 PMCID: PMC4838379 DOI: 10.1093/nar/gkw151] [Citation(s) in RCA: 60] [Impact Index Per Article: 6.7] [Received: 08/14/2015] [Accepted: 02/29/2016] [Indexed: 12/15/2022] Open
Abstract
The past decade of synthetic biology research has witnessed numerous advances in the development of tools and frameworks for the design and characterization of biological systems. Researchers have focused on the use of RNA for gene expression control due to its versatility in sensing molecular ligands and the relative ease by which RNA can be modeled and designed compared to proteins. We review the recent progress in the field with respect to RNA-based genetic devices that are controlled through small molecule and protein interactions. We discuss new approaches for generating and characterizing these devices and their underlying components. We also highlight immediate challenges, future directions and recent applications of synthetic RNA devices in engineered biological systems.
Affiliation(s)
- Maureen McKeague
- Department of Bioengineering, Stanford University, Stanford, CA 94305, USA
- Remus S Wong
- Department of Bioengineering, Stanford University, Stanford, CA 94305, USA
- Christina D Smolke
- Department of Bioengineering, Stanford University, Stanford, CA 94305, USA
|
127
|
Peterman N, Levine E. Sort-seq under the hood: implications of design choices on large-scale characterization of sequence-function relations. BMC Genomics 2016; 17:206. [PMID: 26956374 PMCID: PMC4784318 DOI: 10.1186/s12864-016-2533-5] [Citation(s) in RCA: 56] [Impact Index Per Article: 6.2] [Received: 11/06/2015] [Accepted: 02/25/2016] [Indexed: 12/22/2022] Open
Abstract
Background: Sort-seq is an effective approach for simultaneous activity measurements in a large-scale library, combining flow cytometry, deep sequencing, and statistical inference. Such assays enable the characterization of functional landscapes at unprecedented scale for a wide-reaching array of biological molecules and functionalities in vivo. Applications of sort-seq range from footprinting to establishing quantitative models of biological systems and rational design of synthetic genetic elements. Nearly as diverse are implementations of this technique, reflecting key design choices with extensive impact on the scope and accuracy of the results. Yet how to make these choices remains unclear. Here we investigate the effects of alternative sort-seq designs and inference methods on the information output using mathematical formulation and simulations. Results: We identify key intrinsic properties of any system of interest with practical implications for sort-seq assays, depending on the experimental goals. The fluorescence range and cell-to-cell variability specify the number of sorted populations needed for quantitative measurements that are precise and unbiased. These factors also indicate cases where an enrichment-based approach that uses a single sorted population can offer satisfactory results. These predictions of our model are corroborated using re-analysis of published data. We explore implications of these results for quantitative modeling and library design. Conclusions: Sort-seq assays can be streamlined by reducing the number of sorted populations, saving considerable resources. Simple preliminary experiments can guide optimal experiment design, minimizing cost while maintaining the maximal information output and avoiding latent biases. These insights can facilitate future applications of this highly adaptable technique.
Affiliation(s)
- Neil Peterman
- Department of Physics and FAS Center for Systems Biology, Harvard University, 17 Oxford St., Cambridge, MA, USA
- Erel Levine
- Department of Physics and FAS Center for Systems Biology, Harvard University, 17 Oxford St., Cambridge, MA, USA
|
128
|
Abstract
A central challenge in the field of metabolic engineering is the efficient identification of a metabolic pathway genotype that maximizes specific productivity over a robust range of process conditions. Here we review current methods for optimizing specific productivity of metabolic pathways in living cells. New tools for library generation, computational analysis of pathway sequence-flux space, and high-throughput screening and selection techniques are discussed.
Affiliation(s)
- Justin R Klesmith
- Department of Biochemistry and Molecular Biology, Michigan State University, East Lansing, MI 48824, USA
- Timothy A Whitehead
- Department of Chemical Engineering and Materials Science, Michigan State University, East Lansing, MI 48824, USA; Department of Biosystems and Agricultural Engineering, Michigan State University, East Lansing, MI 48824, USA
|
129
|
Johnstone TG, Bazzini AA, Giraldez AJ. Upstream ORFs are prevalent translational repressors in vertebrates. EMBO J 2016; 35:706-23. [PMID: 26896445 DOI: 10.15252/embj.201592759] [Citation(s) in RCA: 259] [Impact Index Per Article: 28.8] [Received: 08/05/2015] [Accepted: 01/08/2016] [Indexed: 12/20/2022] Open
Abstract
Regulation of gene expression is fundamental in establishing cellular diversity and a target of natural selection. Untranslated mRNA regions (UTRs) are key mediators of post-transcriptional regulation. Previous studies have predicted thousands of ORFs in 5' UTRs, the vast majority of which have unknown function. Here, we present a systematic analysis of the translation and function of upstream open reading frames (uORFs) across vertebrates. Using high-resolution ribosome footprinting, we find that (i) uORFs are prevalent within vertebrate transcriptomes, (ii) the majority show signatures of active translation, and (iii) uORFs act as potent regulators of translation and RNA levels, with a similar magnitude to miRNAs. Reporter experiments reveal clear repression of downstream translation by uORFs/oORFs. uORF number, intercistronic distance, overlap with the CDS, and initiation context most strongly influence translation. Evolution has targeted these features to favor uORFs amenable to regulation over constitutively repressive uORFs/oORFs. Finally, we observe that the regulatory potential of uORFs on individual genes is conserved across species. These results provide insight into the regulatory code within mRNA leader sequences and their capacity to modulate translation across vertebrates.
Affiliation(s)
- Timothy G Johnstone
- Department of Genetics, Yale University School of Medicine, New Haven, CT, USA
- Ariel A Bazzini
- Department of Genetics, Yale University School of Medicine, New Haven, CT, USA
- Antonio J Giraldez
- Department of Genetics, Yale University School of Medicine, New Haven, CT, USA; Yale Stem Cell Center, Yale University School of Medicine, New Haven, CT, USA; Yale Cancer Center, Yale University School of Medicine, New Haven, CT, USA
|
130
|
Dominissini D, Nachtergaele S, Moshitch-Moshkovitz S, Peer E, Kol N, Ben-Haim MS, Dai Q, Di Segni A, Salmon-Divon M, Clark WC, Zheng G, Pan T, Solomon O, Eyal E, Hershkovitz V, Han D, Doré LC, Amariglio N, Rechavi G, He C. The dynamic N¹-methyladenosine methylome in eukaryotic messenger RNA. Nature 2016; 530:441-6. [PMID: 26863196 DOI: 10.1038/nature16998] [Citation(s) in RCA: 732] [Impact Index Per Article: 81.3] [Received: 07/02/2015] [Accepted: 01/15/2016] [Indexed: 12/26/2022]
Abstract
Gene expression can be regulated post-transcriptionally through dynamic and reversible RNA modifications. A recent noteworthy example is N⁶-methyladenosine (m⁶A), which affects messenger RNA (mRNA) localization, stability, translation and splicing. Here we report on a new mRNA modification, N¹-methyladenosine (m¹A), that occurs on thousands of different gene transcripts in eukaryotic cells, from yeast to mammals, at an estimated average transcript stoichiometry of 20% in humans. Employing newly developed sequencing approaches, we show that m¹A is enriched around the start codon upstream of the first splice site: it preferentially decorates more structured regions around canonical and alternative translation initiation sites, is dynamic in response to physiological conditions, and correlates positively with protein production. These unique features are highly conserved in mouse and human cells, strongly indicating a functional role for m¹A in promoting translation of methylated mRNA.
Affiliation(s)
- Dan Dominissini
- Department of Chemistry and Institute for Biophysical Dynamics, The University of Chicago, 929 East 57th Street, Chicago, Illinois 60637, USA; Howard Hughes Medical Institute, The University of Chicago, 929 East 57th Street, Chicago, Illinois 60637, USA
- Sigrid Nachtergaele
- Department of Chemistry and Institute for Biophysical Dynamics, The University of Chicago, 929 East 57th Street, Chicago, Illinois 60637, USA; Howard Hughes Medical Institute, The University of Chicago, 929 East 57th Street, Chicago, Illinois 60637, USA
- Eyal Peer
- Cancer Research Center, Chaim Sheba Medical Center, Tel Hashomer 52621, Israel; Sackler School of Medicine, Tel Aviv University, Tel Aviv 69978, Israel
- Nitzan Kol
- Cancer Research Center, Chaim Sheba Medical Center, Tel Hashomer 52621, Israel
- Moshe Shay Ben-Haim
- Cancer Research Center, Chaim Sheba Medical Center, Tel Hashomer 52621, Israel; Sackler School of Medicine, Tel Aviv University, Tel Aviv 69978, Israel
- Qing Dai
- Department of Chemistry and Institute for Biophysical Dynamics, The University of Chicago, 929 East 57th Street, Chicago, Illinois 60637, USA; Howard Hughes Medical Institute, The University of Chicago, 929 East 57th Street, Chicago, Illinois 60637, USA
- Ayelet Di Segni
- Cancer Research Center, Chaim Sheba Medical Center, Tel Hashomer 52621, Israel
- Mali Salmon-Divon
- Cancer Research Center, Chaim Sheba Medical Center, Tel Hashomer 52621, Israel
- Wesley C Clark
- Department of Biochemistry and Molecular Biology, The University of Chicago, 929 East 57th Street, Chicago, Illinois 60637, USA
- Guanqun Zheng
- Department of Biochemistry and Molecular Biology, The University of Chicago, 929 East 57th Street, Chicago, Illinois 60637, USA
- Tao Pan
- Department of Biochemistry and Molecular Biology, The University of Chicago, 929 East 57th Street, Chicago, Illinois 60637, USA
- Oz Solomon
- Cancer Research Center, Chaim Sheba Medical Center, Tel Hashomer 52621, Israel; The Mina and Everard Goodman Faculty of Life Sciences, Bar-Ilan University, Ramat Gan 52900, Israel
- Eran Eyal
- Cancer Research Center, Chaim Sheba Medical Center, Tel Hashomer 52621, Israel
- Vera Hershkovitz
- Cancer Research Center, Chaim Sheba Medical Center, Tel Hashomer 52621, Israel
- Dali Han
- Department of Chemistry and Institute for Biophysical Dynamics, The University of Chicago, 929 East 57th Street, Chicago, Illinois 60637, USA; Howard Hughes Medical Institute, The University of Chicago, 929 East 57th Street, Chicago, Illinois 60637, USA
- Louis C Doré
- Department of Chemistry and Institute for Biophysical Dynamics, The University of Chicago, 929 East 57th Street, Chicago, Illinois 60637, USA; Howard Hughes Medical Institute, The University of Chicago, 929 East 57th Street, Chicago, Illinois 60637, USA
- Ninette Amariglio
- Cancer Research Center, Chaim Sheba Medical Center, Tel Hashomer 52621, Israel; The Mina and Everard Goodman Faculty of Life Sciences, Bar-Ilan University, Ramat Gan 52900, Israel
- Gideon Rechavi
- Cancer Research Center, Chaim Sheba Medical Center, Tel Hashomer 52621, Israel; Sackler School of Medicine, Tel Aviv University, Tel Aviv 69978, Israel
- Chuan He
- Department of Chemistry and Institute for Biophysical Dynamics, The University of Chicago, 929 East 57th Street, Chicago, Illinois 60637, USA; Howard Hughes Medical Institute, The University of Chicago, 929 East 57th Street, Chicago, Illinois 60637, USA; Department of Biochemistry and Molecular Biology, The University of Chicago, 929 East 57th Street, Chicago, Illinois 60637, USA
|
131
|
Sheshukova EV, Shindyapina AV, Komarova TV, Dorokhov YL. “Matreshka” genes with alternative reading frames. Russ J Genet 2016. [DOI: 10.1134/s1022795416020149] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 11/23/2022]
|
132
|
Klesmith JR, Bacik JP, Michalczyk R, Whitehead TA. Comprehensive Sequence-Flux Mapping of a Levoglucosan Utilization Pathway in E. coli. ACS Synth Biol 2015; 4:1235-43. [PMID: 26369947 DOI: 10.1021/acssynbio.5b00131] [Citation(s) in RCA: 35] [Impact Index Per Article: 3.5] [Indexed: 11/29/2022]
Abstract
Synthetic metabolic pathways often suffer from low specific productivity, and new methods that quickly assess pathway functionality for many thousands of variants are urgently needed. Here we present an approach that enables the rapid and parallel determination of sequence effects on flux for complete gene-encoding sequences. We show that this method can be used to determine the effects of over 8000 single point mutants of a pyrolysis oil catabolic pathway implanted in Escherichia coli. Experimental sequence-function data sets predicted whether fitness-enhancing mutations to the enzyme levoglucosan kinase resulted from enhanced catalytic efficiency or enzyme stability. A structure of one design incorporating 38 mutations elucidated the structural basis of high fitness mutations. One design incorporating 15 beneficial mutations supported a 15-fold improvement in growth rate and greater than 24-fold improvement in enzyme activity relative to the starting pathway. This technique can be extended to improve a wide variety of designed pathways.
Affiliation(s)
- Justin R. Klesmith
- Department of Biochemistry and Molecular Biology, Michigan State University, East Lansing, Michigan 48824, United States
- John-Paul Bacik
- Bioscience Division, Los Alamos National Laboratory, Los Alamos, New Mexico 87545, United States
- Ryszard Michalczyk
- Bioscience Division, Los Alamos National Laboratory, Los Alamos, New Mexico 87545, United States
- Timothy A. Whitehead
- Department of Chemical Engineering and Materials Science, Michigan State University, East Lansing, Michigan 48824, United States; Department of Biosystems and Agricultural Engineering, Michigan State University, East Lansing, Michigan 48824, United States
|
133
|
Rosenberg AB, Patwardhan RP, Shendure J, Seelig G. Learning the sequence determinants of alternative splicing from millions of random sequences. Cell 2015; 163:698-711. [PMID: 26496609 DOI: 10.1016/j.cell.2015.09.054] [Citation(s) in RCA: 182] [Impact Index Per Article: 18.2] [Received: 07/14/2015] [Revised: 08/28/2015] [Accepted: 09/21/2015] [Indexed: 01/24/2023]
Abstract
Most human transcripts are alternatively spliced, and many disease-causing mutations affect RNA splicing. Toward better modeling the sequence determinants of alternative splicing, we measured the splicing patterns of over two million (M) synthetic mini-genes, which include degenerate subsequences totaling over 100 M bases of variation. The massive size of these training data allowed us to improve upon current models of splicing, as well as to gain new mechanistic insights. Our results show that the vast majority of hexamer sequence motifs measurably influence splice site selection when positioned within alternative exons, with multiple motifs acting additively rather than cooperatively. Intriguingly, motifs that enhance (suppress) exon inclusion in alternative 5' splicing also enhance (suppress) exon inclusion in alternative 3' or cassette exon splicing, suggesting a universal mechanism for alternative exon recognition. Finally, our empirically trained models are highly predictive of the effects of naturally occurring variants on alternative splicing in vivo.
Affiliation(s)
- Alexander B Rosenberg
- Department of Electrical Engineering, University of Washington, Seattle, WA 98195, USA
- Rupali P Patwardhan
- Department of Genome Sciences, University of Washington, Seattle, WA 98195, USA
- Jay Shendure
- Department of Genome Sciences, University of Washington, Seattle, WA 98195, USA
- Georg Seelig
- Department of Electrical Engineering, University of Washington, Seattle, WA 98195, USA; Department of Computer Science and Engineering, University of Washington, Seattle, WA 98195, USA
|
134
|
Ribosome profiling reveals the what, when, where and how of protein synthesis. Nat Rev Mol Cell Biol 2015; 16:651-64. [PMID: 26465719 DOI: 10.1038/nrm4069] [Citation(s) in RCA: 363] [Impact Index Per Article: 36.3] [Indexed: 12/18/2022]
Abstract
Ribosome profiling, which involves the deep sequencing of ribosome-protected mRNA fragments, is a powerful tool for globally monitoring translation in vivo. The method has facilitated discovery of the regulation of gene expression underlying diverse and complex biological processes, of important aspects of the mechanism of protein synthesis, and even of new proteins, by providing a systematic approach for experimental annotation of coding regions. Here, we introduce the methodology of ribosome profiling and discuss examples in which this approach has been a key factor in guiding biological discovery, including its prominent role in identifying thousands of novel translated short open reading frames and alternative translation products.
|
135
|
Hansen AS, O'Shea EK. cis Determinants of Promoter Threshold and Activation Timescale. Cell Rep 2015; 12:1226-33. [PMID: 26279577 DOI: 10.1016/j.celrep.2015.07.035] [Citation(s) in RCA: 32] [Impact Index Per Article: 3.2] [Received: 05/24/2015] [Revised: 07/08/2015] [Accepted: 07/15/2015] [Indexed: 11/16/2022] Open
Abstract
Although the relationship between DNA cis-regulatory sequences and gene expression has been extensively studied at steady state, how cis-regulatory sequences affect the dynamics of gene induction is not known. The dynamics of gene induction can be described by the promoter activation timescale (AcTime) and amplitude threshold (AmpThr). Combining high-throughput microfluidics with quantitative time-lapse microscopy, we control the activation dynamics of the budding yeast transcription factor, Msn2, and reveal how cis-regulatory motifs in 20 promoter variants of the Msn2-target-gene SIP18 affect AcTime and AmpThr. By modulating Msn2 binding sites, we can decouple AmpThr from AcTime and switch the SIP18 promoter class from high AmpThr and slow AcTime to low AmpThr and either fast or slow AcTime. We present a model that quantitatively explains gene-induction dynamics on the basis of the Msn2-binding-site number, TATA box location, and promoter nucleosome organization. Overall, we elucidate the cis-regulatory logic underlying promoter decoding of TF dynamics.
Affiliation(s)
- Anders S Hansen
- Department of Chemistry and Chemical Biology, Harvard University, 12 Oxford Street, Cambridge, MA 02138, USA; Howard Hughes Medical Institute, Harvard University, Northwest Laboratory, 52 Oxford Street, Cambridge, MA 02138, USA; Faculty of Arts and Sciences Center for Systems Biology, Harvard University, Northwest Laboratory, 52 Oxford Street, Cambridge, MA 02138, USA
- Erin K O'Shea
- Department of Chemistry and Chemical Biology, Harvard University, 12 Oxford Street, Cambridge, MA 02138, USA; Howard Hughes Medical Institute, Harvard University, Northwest Laboratory, 52 Oxford Street, Cambridge, MA 02138, USA; Faculty of Arts and Sciences Center for Systems Biology, Harvard University, Northwest Laboratory, 52 Oxford Street, Cambridge, MA 02138, USA; Department of Molecular and Cellular Biology, Harvard University, Northwest Laboratory, 52 Oxford Street, Cambridge, MA 02138, USA
|
136
|
Townshend B, Kennedy AB, Xiang JS, Smolke CD. High-throughput cellular RNA device engineering. Nat Methods 2015; 12:989-94. [PMID: 26258292 PMCID: PMC4589471 DOI: 10.1038/nmeth.3486] [Citation(s) in RCA: 82] [Impact Index Per Article: 8.2] [Received: 12/23/2014] [Accepted: 06/08/2015] [Indexed: 12/15/2022]
Abstract
Methods for rapidly assessing sequence-structure-function landscapes and developing conditional gene-regulatory devices are critical to our ability to manipulate and interface with biology. We describe a framework for engineering RNA devices from preexisting aptamers that exhibit ligand-responsive ribozyme tertiary interactions. Our methodology utilizes cell sorting, high-throughput sequencing, and statistical data analyses to enable parallel measurements of the activities of hundreds of thousands of sequences from RNA device libraries in the absence and presence of ligands. Our tertiary interaction RNA devices exhibit improved performance in terms of gene silencing, activation ratio, and ligand sensitivity as compared to optimized RNA devices that rely on secondary structure changes. We apply our method to building biosensors for diverse ligands and determine consensus sequences that enable ligand-responsive tertiary interactions. These methods advance our ability to develop broadly applicable genetic tools and to elucidate understanding of the underlying sequence-structure-function relationships that empower rational design of complex biomolecules.
Affiliation(s)
- Brent Townshend
- Department of Bioengineering, Stanford University, Stanford, California, USA
- Andrew B Kennedy
- Department of Bioengineering, Stanford University, Stanford, California, USA
- Joy S Xiang
- Department of Bioengineering, Stanford University, Stanford, California, USA
- Christina D Smolke
- Department of Bioengineering, Stanford University, Stanford, California, USA
|
137
|
Kopniczky MB, Moore SJ, Freemont PS. Multilevel Regulation and Translational Switches in Synthetic Biology. IEEE Trans Biomed Circuits Syst 2015; 9:485-496. [PMID: 26336145 DOI: 10.1109/tbcas.2015.2451707] [Citation(s) in RCA: 7] [Impact Index Per Article: 0.7] [Indexed: 06/05/2023]
Abstract
In contrast to the versatility of regulatory mechanisms in natural systems, synthetic genetic circuits have so far been predominantly composed of transcriptionally regulated modules. This is about to change as the repertoire of foundational tools for post-transcriptional regulation is quickly expanding. We provide an overview of the different types of translational regulators (protein-, small molecule- and ribonucleic acid (RNA)-responsive) and describe the emerging circuit designs utilizing these tools. There are several advantages to achieving multilevel regulation via translational switches, and it is likely that such designs will have the greatest and earliest impact in mammalian synthetic biology for regenerative medicine and gene therapy applications.
|
138
|
A Comprehensive Analysis of Codon Usage Patterns in Blunt Snout Bream (Megalobrama amblycephala) Based on RNA-Seq Data. Int J Mol Sci 2015; 16:11996-2013. [PMID: 26016504 PMCID: PMC4490425 DOI: 10.3390/ijms160611996] [Citation(s) in RCA: 15] [Impact Index Per Article: 1.5] [Received: 03/26/2015] [Accepted: 05/19/2015] [Indexed: 11/21/2022] Open
Abstract
Blunt snout bream (Megalobrama amblycephala) is an important fish species in China, valued for its delicacy and high economic value. Codon usage analysis could be helpful to understand its codon biology, mRNA translation and vertebrate evolution. Based on RNA-Seq data for M. amblycephala, high-frequency codons (CUG, AGA, GUG, CAG and GAG), as well as low-frequency ones (NUA and NCG codons), were identified. A total of 724 high-frequency codon pairs were observed. Meanwhile, 14 preferred and 199 avoided neighboring codon pairs were also identified, but almost no bias was observed when one or more intervening codons separated the same pairs. Codon usage bias in the regions close to start and stop codons indicated apparent heterogeneity, which even occurs in the flanking nucleotide sequence. Codon usage bias (RSCU and SCUO) was related to GC3 (GC content of the 3rd nucleotide in a codon) bias. Six GO (Gene Ontology) categories and the number of methylation targets were influenced by GC3. Comparison of codon usage patterns among 23 vertebrates showed species specificities in GC content, codon usage and codon context. This work provided new insights into fish biology and new information for breeding projects.
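The GC3 measure named in this abstract (GC content at the third codon position) can be illustrated with a minimal Python sketch. The function name `gc3`, the acceptance of both DNA and RNA letters, and the choice to ignore a trailing incomplete codon are assumptions made here for illustration; they are not taken from the cited paper.

```python
def gc3(cds: str) -> float:
    """Return the fraction of codons whose third base is G or C.

    Assumes `cds` is an in-frame coding sequence starting at codon
    position 1. RNA input (U) is accepted and treated like DNA (T).
    """
    seq = cds.upper().replace("U", "T")
    # Ignore trailing bases that do not form a complete codon.
    usable = len(seq) - len(seq) % 3
    if usable == 0:
        raise ValueError("sequence must contain at least one full codon")
    # Every third base, starting at index 2, is a codon's third position.
    third_positions = seq[2:usable:3]
    return sum(base in "GC" for base in third_positions) / len(third_positions)
```

For example, `gc3("ATGAAATTT")` looks at the third bases G, A and T, so one of three codons ends in G or C.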
|