51
Ouspenskaia T, Law T, Clauser KR, Klaeger S, Sarkizova S, Aguet F, Li B, Christian E, Knisbacher BA, Le PM, Hartigan CR, Keshishian H, Apffel A, Oliveira G, Zhang W, Chen S, Chow YT, Ji Z, Jungreis I, Shukla SA, Justesen S, Bachireddy P, Kellis M, Getz G, Hacohen N, Keskin DB, Carr SA, Wu CJ, Regev A. Unannotated proteins expand the MHC-I-restricted immunopeptidome in cancer. Nat Biotechnol 2022; 40:209-217. [PMID: 34663921 PMCID: PMC10198624 DOI: 10.1038/s41587-021-01021-3] [Citation(s) in RCA: 117] [Impact Index Per Article: 58.5] [Received: 02/04/2020] [Accepted: 07/16/2021] [Indexed: 12/16/2022]
Abstract
Tumor-associated epitopes presented on MHC-I that can activate the immune system against cancer cells are typically identified from annotated protein-coding regions of the genome, but whether peptides originating from novel or unannotated open reading frames (nuORFs) can contribute to antitumor immune responses remains unclear. Here we show that peptides originating from nuORFs detected by ribosome profiling of malignant and healthy samples can be displayed on MHC-I of cancer cells, acting as additional sources of cancer antigens. We constructed a high-confidence database of translated nuORFs across tissues (nuORFdb) and used it to detect 3,555 translated nuORFs from MHC-I immunopeptidome mass spectrometry analysis, including peptides that result from somatic mutations in nuORFs of cancer samples as well as tumor-specific nuORFs translated in melanoma, chronic lymphocytic leukemia and glioblastoma. NuORFs are an unexplored pool of MHC-I-presented, tumor-specific peptides with potential as immunotherapy targets.
Affiliation(s)
- Tamara Ouspenskaia
- Broad Institute of MIT and Harvard, Cambridge, MA, USA
- Flagship Labs 69, Cambridge, MA, USA
- Travis Law
- Broad Institute of MIT and Harvard, Cambridge, MA, USA
- Susan Klaeger
- Broad Institute of MIT and Harvard, Cambridge, MA, USA
- Siranush Sarkizova
- Broad Institute of MIT and Harvard, Cambridge, MA, USA
- Department of Biomedical Informatics, Harvard Medical School, Boston, MA, USA
- Bo Li
- Klarman Cell Observatory, Broad Institute of Harvard and MIT, Cambridge, MA, USA
- Center for Immunology and Inflammatory Diseases, Division of Rheumatology, Allergy, and Immunology, Massachusetts General Hospital and Harvard Medical School, Boston, MA, USA
- Phuong M Le
- Department of Medical Oncology, Dana-Farber Cancer Institute, Boston, MA, USA
- Annie Apffel
- Broad Institute of MIT and Harvard, Cambridge, MA, USA
- Giacomo Oliveira
- Department of Medical Oncology, Dana-Farber Cancer Institute, Boston, MA, USA
- Wandi Zhang
- Department of Medical Oncology, Dana-Farber Cancer Institute, Boston, MA, USA
- Zhe Ji
- Department of Pharmacology, Feinberg School of Medicine, Northwestern University, Chicago, IL, USA
- Department of Biomedical Engineering, McCormick School of Engineering, Northwestern University, Evanston, IL, USA
- Irwin Jungreis
- Broad Institute of MIT and Harvard, Cambridge, MA, USA
- MIT Computer Science and Artificial Intelligence Laboratory, Cambridge, MA, USA
- Sachet A Shukla
- Broad Institute of MIT and Harvard, Cambridge, MA, USA
- Department of Medical Oncology, Dana-Farber Cancer Institute, Boston, MA, USA
- Pavan Bachireddy
- Department of Medical Oncology, Dana-Farber Cancer Institute, Boston, MA, USA
- Manolis Kellis
- Broad Institute of MIT and Harvard, Cambridge, MA, USA
- MIT Computer Science and Artificial Intelligence Laboratory, Cambridge, MA, USA
- Gad Getz
- Broad Institute of MIT and Harvard, Cambridge, MA, USA
- Nir Hacohen
- Broad Institute of MIT and Harvard, Cambridge, MA, USA
- Massachusetts General Hospital Cancer Center, Boston, MA, USA
- Derin B Keskin
- Broad Institute of MIT and Harvard, Cambridge, MA, USA
- Department of Medical Oncology, Dana-Farber Cancer Institute, Boston, MA, USA
- Harvard Medical School, Boston, MA, USA
- Department of Medicine, Brigham and Women's Hospital, Boston, MA, USA
- The Translational Immunogenomics Lab, Dana-Farber Cancer Institute, Boston, MA, USA
- Steven A Carr
- Broad Institute of MIT and Harvard, Cambridge, MA, USA
- Catherine J Wu
- Broad Institute of MIT and Harvard, Cambridge, MA, USA
- Department of Medical Oncology, Dana-Farber Cancer Institute, Boston, MA, USA
- Harvard Medical School, Boston, MA, USA
- Department of Medicine, Brigham and Women's Hospital, Boston, MA, USA
- Aviv Regev
- Klarman Cell Observatory, Broad Institute of Harvard and MIT, Cambridge, MA, USA
- Howard Hughes Medical Institute, Department of Biology, Massachusetts Institute of Technology, Cambridge, MA, USA
- Genentech, South San Francisco, CA, USA
52
Della Bella E, Koch J, Baerenfaller K. Translation and emerging functions of non-coding RNAs in inflammation and immunity. Allergy 2022; 77:2025-2037. [PMID: 35094406 PMCID: PMC9302665 DOI: 10.1111/all.15234] [Citation(s) in RCA: 24] [Impact Index Per Article: 12.0] [Received: 11/17/2021] [Revised: 01/20/2022] [Accepted: 01/24/2022] [Indexed: 12/17/2022]
Abstract
Regulatory non‐coding RNAs (ncRNAs) including small non‐coding RNAs (sRNAs), long non‐coding RNAs (lncRNAs), and circular RNAs (circRNAs) have gained considerable attention in the last few years. This is mainly due to their condition‐ and tissue‐specific expression and their various modes of action, which suggests them as promising biomarkers and therapeutic targets. One important mechanism of ncRNAs to regulate gene expression is through translation of short open reading frames (sORFs). These sORFs can be located in lncRNAs, in non‐translated regions of mRNAs where upstream ORFs (uORFs) represent the majority, or in circRNAs. Regulation of their translation can function as a quick way to adapt protein production to changing cellular or environmental cues, and can either depend solely on the initiation and elongation of translation, or on the roles of the produced functional peptides. Due to the experimental challenges to pinpoint translation events and to detect the produced peptides, translational regulation through regulatory RNAs is not well studied yet. In the case of circRNAs, they have only recently started to be recognized as regulatory molecules instead of mere artifacts of RNA biosynthesis. Of the many roles described for regulatory ncRNAs, we will focus here on their regulation during inflammation and in immunity.
Affiliation(s)
- Jana Koch
- Swiss Institute of Allergy and Asthma Research (SIAF) University of Zurich Swiss Institute of Bioinformatics (SIB) Davos Switzerland
- Katja Baerenfaller
- Swiss Institute of Allergy and Asthma Research (SIAF) University of Zurich Swiss Institute of Bioinformatics (SIB) Davos Switzerland
53
Kute PM, Soukarieh O, Tjeldnes H, Trégouët DA, Valen E. Small Open Reading Frames, How to Find Them and Determine Their Function. Front Genet 2022; 12:796060. [PMID: 35154250 PMCID: PMC8831751 DOI: 10.3389/fgene.2021.796060] [Citation(s) in RCA: 10] [Impact Index Per Article: 5.0] [Received: 10/15/2021] [Accepted: 12/30/2021] [Indexed: 12/12/2022] Open
Abstract
Advances in genomics and molecular biology have revealed an abundance of small open reading frames (sORFs) across all types of transcripts. While these sORFs are often assumed to be non-functional, many have been implicated in physiological functions and a significant number of sORFs have been described in human diseases. Thus, sORFs may represent a hidden repository of functional elements that could serve as therapeutic targets. Unlike protein-coding genes, it is not necessarily the encoded peptide of an sORF that enacts its function, sometimes simply the act of translating an sORF might have a regulatory role. Indeed, the most studied sORFs are located in the 5′UTRs of coding transcripts and can have a regulatory impact on the translation of the downstream protein-coding sequence. However, sORFs have also been abundantly identified in non-coding RNAs including lncRNAs, circular RNAs and ribosomal RNAs suggesting that sORFs may be diverse in function. Of the many different experimental methods used to discover sORFs, the most commonly used are ribosome profiling and mass spectrometry. These can confirm interactions between transcripts and ribosomes and the production of a peptide, respectively. Extensions to ribosome profiling, which also capture scanning ribosomes, have further made it possible to see how sORFs impact the translation initiation of mRNAs. While high-throughput techniques have made the identification of sORFs less difficult, defining their function, if any, is typically more challenging. Together, the abundance and potential function of many of these sORFs argues for the necessity of including sORFs in gene annotations and systematically characterizing these to understand their potential functional roles. In this review, we will focus on the high-throughput methods used in the detection and characterization of sORFs and discuss techniques for validation and functional characterization.
Affiliation(s)
- Preeti Madhav Kute
- Computational Biology Unit, Department of Informatics, University of Bergen, Bergen, Norway
- Sars International Centre for Marine Molecular Biology, University of Bergen, Bergen, Norway
- Omar Soukarieh
- Department of Molecular Epidemiology Of Vascular and Brain Disorders, INSERM, BPH, U1219, University of Bordeaux, Bordeaux, France
- Håkon Tjeldnes
- Computational Biology Unit, Department of Informatics, University of Bergen, Bergen, Norway
- David-Alexandre Trégouët
- Department of Molecular Epidemiology Of Vascular and Brain Disorders, INSERM, BPH, U1219, University of Bordeaux, Bordeaux, France
- Eivind Valen
- Computational Biology Unit, Department of Informatics, University of Bergen, Bergen, Norway
- Sars International Centre for Marine Molecular Biology, University of Bergen, Bergen, Norway
- *Correspondence: Eivind Valen
54
Vazquez-Laslop N, Sharma CM, Mankin A, Buskirk AR. Identifying Small Open Reading Frames in Prokaryotes with Ribosome Profiling. J Bacteriol 2022; 204:e0029421. [PMID: 34339296 PMCID: PMC8765392 DOI: 10.1128/jb.00294-21] [Citation(s) in RCA: 18] [Impact Index Per Article: 9.0] [Indexed: 11/20/2022] Open
Abstract
Small proteins encoded by open reading frames (ORFs) shorter than 50 codons (small ORFs [sORFs]) are often overlooked by annotation engines and are difficult to characterize using traditional biochemical techniques. Ribosome profiling has tremendous potential to empirically improve the annotations of prokaryotic genomes. Recent improvements in ribosome profiling methods for bacterial model organisms have revealed many new sORFs in well-characterized genomes. Antibiotics that trap ribosomes just after initiation have played a key role in these developments by allowing the unambiguous identification of the start codons (and, hence, the reading frame) for novel ORFs. Here, we describe these new methods and highlight critical controls and considerations for adapting ribosome profiling to different prokaryotic species.
Affiliation(s)
- Nora Vazquez-Laslop
- Center for Biomolecular Sciences, University of Illinois at Chicago, Chicago, Illinois, USA
- Cynthia M. Sharma
- Molecular Infection Biology II, Institute of Molecular Infection Biology, University of Würzburg, Würzburg, Germany
- Alexander Mankin
- Center for Biomolecular Sciences, University of Illinois at Chicago, Chicago, Illinois, USA
- Allen R. Buskirk
- Department of Molecular Biology and Genetics, Johns Hopkins University School of Medicine, Baltimore, Maryland, USA
55
Chen L, Yang Y, Zhang Y, Li K, Cai H, Wang H, Zhao Q. The Small Open Reading Frame-Encoded Peptides: Advances in Methodologies and Functional Studies. Chembiochem 2021; 23:e202100534. [PMID: 34862721 DOI: 10.1002/cbic.202100534] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 10/06/2021] [Revised: 11/15/2021] [Indexed: 11/07/2022]
Abstract
Small open reading frames (sORFs) are an important class of genes with less than 100 codons. They were historically annotated as noncoding or even junk sequences. In recent years, accumulating evidence suggests that sORFs could encode a considerable number of polypeptides, many of which play important roles in both physiology and disease pathology. However, it has been technically challenging to directly detect sORF-encoded peptides (SEPs). Here, we discuss the latest advances in methodologies for identifying SEPs with mass spectrometry, as well as the progress on functional studies of SEPs.
Affiliation(s)
- Lei Chen
- State Key Laboratory of Chemical Biology and Drug Discovery, Department of Applied Biology and Chemical Technology, Hong Kong Polytechnic University, Hung Hom, Hong Kong SAR, 999077, P. R. China
- Laboratory for Synthetic Chemistry and Chemical Biology Limited, Hong Kong Science and Technology Park, New Territories, Hong Kong SAR, 999077, P. R. China
- Ying Yang
- State Key Laboratory of Chemical Biology and Drug Discovery, Department of Applied Biology and Chemical Technology, Hong Kong Polytechnic University, Hung Hom, Hong Kong SAR, 999077, P. R. China
- Yuanliang Zhang
- State Key Laboratory of Chemical Biology and Drug Discovery, Department of Applied Biology and Chemical Technology, Hong Kong Polytechnic University, Hung Hom, Hong Kong SAR, 999077, P. R. China
- Kecheng Li
- State Key Laboratory of Chemical Biology and Drug Discovery, Department of Applied Biology and Chemical Technology, Hong Kong Polytechnic University, Hung Hom, Hong Kong SAR, 999077, P. R. China
- Hongmin Cai
- School of Computer Science and Engineering, South China University of Technology, Guangzhou, 510623, P. R. China
- Hongwei Wang
- State Key Laboratory of Ophthalmology, Zhongshan Ophthalmic Center, Sun Yat-sen University, Guangdong Provincial Key Laboratory of Ophthalmology and Visual Science, Guangzhou, 510623, P. R. China
- Qian Zhao
- State Key Laboratory of Chemical Biology and Drug Discovery, Department of Applied Biology and Chemical Technology, Hong Kong Polytechnic University, Hung Hom, Hong Kong SAR, 999077, P. R. China
56
Abstract
To form synaptic connections and store information, neurons continuously remodel their proteomes. The impressive length of dendrites and axons imposes logistical challenges to maintain synaptic proteins at locations remote from the transcription source (the nucleus). The discovery of thousands of messenger RNAs (mRNAs) near synapses suggested that neurons overcome distance and gain autonomy by producing proteins locally. It is not generally known, however, if, how, and when localized mRNAs are translated into protein. To investigate the translational landscape in neuronal subregions, we performed simultaneous RNA sequencing (RNA-seq) and ribosome sequencing (Ribo-seq) from microdissected rodent brain slices to identify and quantify the transcriptome and translatome in cell bodies (somata) as well as dendrites and axons (neuropil). Thousands of transcripts were differentially translated between somatic and synaptic regions, with many scaffold and signaling molecules displaying increased translation levels in the neuropil. Most translational changes between compartments could be accounted for by differences in RNA abundance. Pervasive translational regulation was observed in both somata and neuropil influenced by specific mRNA features (e.g., untranslated region [UTR] length, RNA-binding protein [RBP] motifs, and upstream open reading frames [uORFs]). For over 800 mRNAs, the dominant source of translation was the neuropil. We constructed a searchable and interactive database for exploring mRNA transcripts and their translation levels in the somata and neuropil [MPI Brain Research, The mRNA translation landscape in the synaptic neuropil. https://public.brain.mpg.de/dashapps/localseq/ Accessed 5 October 2021]. Overall, our findings emphasize the substantial contribution of local translation to maintaining synaptic protein levels and indicate that on-site translational control is an important mechanism to control synaptic strength.
57
Unraveling the hidden role of a uORF-encoded peptide as a kinase inhibitor of PKCs. Proc Natl Acad Sci U S A 2021; 118:2018899118. [PMID: 34593629 PMCID: PMC8501901 DOI: 10.1073/pnas.2018899118] [Citation(s) in RCA: 26] [Impact Index Per Article: 8.7] [Accepted: 08/19/2021] [Indexed: 02/01/2023] Open
Abstract
Approximately 40% of human messenger RNAs (mRNAs) contain upstream open reading frames (uORFs) in their 5' untranslated regions. Some of these uORF sequences, thought to attenuate scanning ribosomes or lead to mRNA degradation, were recently shown to be translated, although the function of the encoded peptides remains unknown. Here, we show a uORF-encoded peptide that exhibits kinase inhibitory functions. This uORF, upstream of the protein kinase C-eta (PKC-η) main ORF, encodes a peptide (uPEP2) containing the typical PKC pseudosubstrate motif present in all PKCs that autoinhibits their kinase activity. We show that uPEP2 directly binds to and selectively inhibits the catalytic activity of novel PKCs but not of classical or atypical PKCs. The endogenous deletion of uORF2 or its overexpression in MCF-7 cells revealed that the endogenously translated uPEP2 reduces the protein levels of PKC-η and other novel PKCs and restricts cell proliferation. Functionally, treatment of breast cancer cells with uPEP2 diminished cell survival and their migration and synergized with chemotherapy by interfering with the response to DNA damage. Furthermore, in a xenograft of MDA-MB-231 breast cancer tumor in mice models, uPEP2 suppressed tumor progression, invasion, and metastasis. Tumor histology showed reduced proliferation, enhanced cell death, and lower protein expression levels of novel PKCs along with diminished phosphorylation of PKC substrates. Hence, our study demonstrates that uORFs may encode biologically active peptides beyond their role as translation regulators of their downstream ORFs. Together, we point to a unique function of a uORF-encoded peptide as a kinase inhibitor, pertinent to cancer therapy.
58
Li Y, Zhou H, Chen X, Zheng Y, Kang Q, Hao D, Zhang L, Song T, Luo H, Hao Y, Chen R, Zhang P, He S. SmProt: A Reliable Repository with Comprehensive Annotation of Small Proteins Identified from Ribosome Profiling. Genomics Proteomics & Bioinformatics 2021; 19:602-610. [PMID: 34536568 PMCID: PMC9039559 DOI: 10.1016/j.gpb.2021.09.002] [Citation(s) in RCA: 25] [Impact Index Per Article: 8.3] [Received: 12/09/2020] [Revised: 09/07/2021] [Accepted: 09/08/2021] [Indexed: 12/30/2022]
Abstract
Small proteins specifically refer to proteins consisting of fewer than 100 amino acids translated from small open reading frames (sORFs), which were usually missed in previous genome annotations. The significance of small proteins has been revealed in recent years, along with the discovery of their diverse functions. However, systematic annotation of small proteins is still insufficient. SmProt was specially developed to provide valuable information on small proteins for the scientific community. Here we present the update of SmProt, which emphasizes reliability of translated sORFs, genetic variants in translated sORFs, disease-specific sORF translation events or sequences, and remarkably increased data volume. More components such as non-ATG translation initiation, function, and new sources are also included. SmProt incorporated 638,958 unique small proteins curated from 3,165,229 primary records, which were computationally predicted from 419 ribosome profiling (Ribo-seq) datasets or collected from literature and other sources from 370 cell lines or tissues in 8 species (Homo sapiens, Mus musculus, Rattus norvegicus, Drosophila melanogaster, Danio rerio, Saccharomyces cerevisiae, Caenorhabditis elegans, and Escherichia coli). In addition, small protein families identified from human microbiomes were also collected. All datasets in SmProt are free to access, and available for browsing, searching, and bulk download at http://bigdata.ibp.ac.cn/SmProt/.
Affiliation(s)
- Yanyan Li
- College of Life Sciences, University of Chinese Academy of Sciences, Beijing 100049, China; Key Laboratory of RNA Biology, Center for Big Data Research in Health, Institute of Biophysics, Chinese Academy of Sciences, Beijing 100101, China
- Honghong Zhou
- Key Laboratory of RNA Biology, Center for Big Data Research in Health, Institute of Biophysics, Chinese Academy of Sciences, Beijing 100101, China
- Xiaomin Chen
- Key Laboratory of RNA Biology, Center for Big Data Research in Health, Institute of Biophysics, Chinese Academy of Sciences, Beijing 100101, China; University of Chinese Academy of Sciences, Beijing 100049, China
- Yu Zheng
- College of Life Sciences, University of Chinese Academy of Sciences, Beijing 100049, China; Key Laboratory of RNA Biology, Center for Big Data Research in Health, Institute of Biophysics, Chinese Academy of Sciences, Beijing 100101, China
- Quan Kang
- Key Laboratory of RNA Biology, Center for Big Data Research in Health, Institute of Biophysics, Chinese Academy of Sciences, Beijing 100101, China
- Di Hao
- Key Laboratory of RNA Biology, Center for Big Data Research in Health, Institute of Biophysics, Chinese Academy of Sciences, Beijing 100101, China
- Lili Zhang
- Key Laboratory of RNA Biology, Center for Big Data Research in Health, Institute of Biophysics, Chinese Academy of Sciences, Beijing 100101, China; University of Chinese Academy of Sciences, Beijing 100049, China
- Tingrui Song
- Key Laboratory of RNA Biology, Center for Big Data Research in Health, Institute of Biophysics, Chinese Academy of Sciences, Beijing 100101, China
- Huaxia Luo
- Key Laboratory of RNA Biology, Center for Big Data Research in Health, Institute of Biophysics, Chinese Academy of Sciences, Beijing 100101, China
- Yajing Hao
- Department of Cellular and Molecular Medicine, University of California, San Diego, La Jolla, CA 92093, USA
- Runsheng Chen
- Key Laboratory of RNA Biology, Center for Big Data Research in Health, Institute of Biophysics, Chinese Academy of Sciences, Beijing 100101, China; University of Chinese Academy of Sciences, Beijing 100049, China; Guangdong Geneway Decoding Bio-Tech Co. Ltd, Foshan 528316, China
- Peng Zhang
- Key Laboratory of RNA Biology, Center for Big Data Research in Health, Institute of Biophysics, Chinese Academy of Sciences, Beijing 100101, China
- Shunmin He
- College of Life Sciences, University of Chinese Academy of Sciences, Beijing 100049, China; Key Laboratory of RNA Biology, Center for Big Data Research in Health, Institute of Biophysics, Chinese Academy of Sciences, Beijing 100101, China
59
Zaheed O, Kiniry SJ, Baranov PV, Dean K. Exploring Evidence of Non-coding RNA Translation With Trips-Viz and GWIPS-Viz Browsers. Front Cell Dev Biol 2021; 9:703374. [PMID: 34490252 PMCID: PMC8416628 DOI: 10.3389/fcell.2021.703374] [Citation(s) in RCA: 3] [Impact Index Per Article: 1.0] [Received: 04/30/2021] [Accepted: 07/12/2021] [Indexed: 11/21/2022] Open
Abstract
Detection of translation in so-called non-coding RNA provides an opportunity for identification of novel bioactive peptides and microproteins. The main methods used for these purposes are ribosome profiling and mass spectrometry. A number of publicly available datasets already exist for a substantial number of different cell types grown under various conditions, and public data mining is an attractive strategy for identification of translation in non-coding RNAs. Since the analysis of publicly available data requires intensive data processing, several data resources have been created recently for exploring processed publicly available data, such as OpenProt, GWIPS-viz, and Trips-Viz. In this work we provide a detailed demonstration of how to use the latter two tools for exploring experimental evidence for translation of RNAs hitherto classified as non-coding. For this purpose, we use a set of transcripts with substantially different patterns of ribosome footprint distributions. We discuss how certain features of these patterns can be used as evidence for or against genuine translation. During our analysis we concluded that the MTLN mRNA, previously misannotated as lncRNA LINC00116, likely encodes only a short proteoform expressed from shorter RNA transcript variants.
Affiliation(s)
- Oza Zaheed
- School of Biochemistry and Cell Biology, University College Cork, Cork, Ireland
- Stephen J Kiniry
- School of Biochemistry and Cell Biology, University College Cork, Cork, Ireland
- Pavel V Baranov
- School of Biochemistry and Cell Biology, University College Cork, Cork, Ireland
- Shemyakin-Ovchinnikov Institute of Bioorganic Chemistry, RAS, Moscow, Russia
- Kellie Dean
- School of Biochemistry and Cell Biology, University College Cork, Cork, Ireland
60
Kiniry SJ, Judge CE, Michel AM, Baranov PV. Trips-Viz: an environment for the analysis of public and user-generated ribosome profiling data. Nucleic Acids Res 2021; 49:W662-W670. [PMID: 33950201 PMCID: PMC8262740 DOI: 10.1093/nar/gkab323] [Citation(s) in RCA: 15] [Impact Index Per Article: 5.0] [Received: 02/19/2021] [Revised: 04/11/2021] [Accepted: 04/20/2021] [Indexed: 02/07/2023] Open
Abstract
Trips-Viz (https://trips.ucc.ie/) is an interactive platform for the analysis and visualization of ribosome profiling (Ribo-Seq) and shotgun RNA sequencing (RNA-seq) data. This includes publicly available and user generated data, hence Trips-Viz can be classified as a database and as a server. As a database it provides access to many processed Ribo-Seq and RNA-seq data aligned to reference transcriptomes which has been expanded considerably since its inception. Here, we focus on the server functionality of Trips-viz which also has been greatly improved. Trips-viz now enables visualisation of proteomics data from a large number of processed mass spectrometry datasets. It can be used to support translation inferred from Ribo-Seq data. Users are now able to upload a custom reference transcriptome as well as data types other than Ribo-Seq/RNA-Seq. Incorporating custom data has been streamlined with RiboGalaxy (https://ribogalaxy.ucc.ie/) integration. The other new functionality is the rapid detection of translated open reading frames (ORFs) through a simple easy to use interface. The analysis of differential expression has been also improved via integration of DESeq2 and Anota2seq in addition to a number of other improvements of existing Trips-viz features.
Affiliation(s)
- Stephen J Kiniry
- School of Biochemistry and Cell Biology, University College Cork, Cork, Ireland
- Ciara E Judge
- School of Biochemistry and Cell Biology, University College Cork, Cork, Ireland
- Audrey M Michel
- School of Biochemistry and Cell Biology, University College Cork, Cork, Ireland
- Ribomaps Ltd, Western Gateway Bld, Western Rd, Cork, Ireland
- Pavel V Baranov
- School of Biochemistry and Cell Biology, University College Cork, Cork, Ireland
- Shemyakin-Ovchinnikov Institute of Bioorganic Chemistry, RAS, Moscow, Russia
61
Choteau SA, Wagner A, Pierre P, Spinelli L, Brun C. MetamORF: a repository of unique short open reading frames identified by both experimental and computational approaches for gene and metagene analyses. Database: The Journal of Biological Databases and Curation 2021; 2021:6307706. [PMID: 34156446 PMCID: PMC8218702 DOI: 10.1093/database/baab032] [Citation(s) in RCA: 9] [Impact Index Per Article: 3.0] [Received: 12/23/2020] [Revised: 04/08/2021] [Accepted: 05/17/2021] [Indexed: 11/12/2022]
Abstract
The development of high-throughput technologies revealed the existence of non-canonical short open reading frames (sORFs) on most eukaryotic ribonucleic acids. They are ubiquitous genetic elements conserved across species and suspected to be involved in numerous cellular processes. MetamORF (https://metamorf.hb.univ-amu.fr/) aims to provide a repository of unique sORFs identified in the human and mouse genomes with both experimental and computational approaches. By gathering publicly available sORF data, normalizing them and summarizing redundant information, we were able to identify a total of 1 162 675 unique sORFs. Despite the usual characterization of ORFs as short, upstream or downstream, there is currently no clear consensus regarding the definition of these categories. Thus, the data have been reprocessed using a normalized nomenclature. MetamORF enables new analyses at locus, gene, transcript and ORF levels, which should offer the possibility to address new questions regarding sORF functions in the future. The repository is available through a user-friendly web interface, allowing easy browsing, visualization, filtering over multiple criteria and export possibilities. sORFs can be searched starting from a gene, a transcript and an ORF ID, looking in a genome area or browsing the whole repository for a species. The database content has also been made available through track hubs at UCSC Genome Browser. Finally, we demonstrated an enrichment of genes harboring upstream ORFs among genes expressed in response to reticular stress. Database URL https://metamorf.hb.univ-amu.fr/.
Collapse
Affiliation(s)
- Sebastien A Choteau
- Aix-Marseille University, INSERM, TAGC, Turing Centre for Living Systems, 163 Avenue de Luminy, Marseille 13009, France.,Aix-Marseille University, INSERM, CNRS, CIML, Turing Centre for Living Systems, 163 Avenue de Luminy, Marseille 13009, France
| | - Audrey Wagner
- Aix-Marseille University, INSERM, TAGC, Turing Centre for Living Systems, 163 Avenue de Luminy, Marseille 13009, France
| | - Philippe Pierre
- Aix-Marseille University, INSERM, CNRS, CIML, Turing Centre for Living Systems, 163 Avenue de Luminy, Marseille 13009, France.,Department of Medical Sciences, Institute for Research in Biomedicine (iBiMED) and Ilidio Pinho Foundation, University of Aveiro, Aveiro 3810-193, Portugal.,Shanghai Institute of Immunology, School of Medicine, Shanghai Jiao Tong University, Shanghai, China
| | - Lionel Spinelli
- Aix-Marseille University, INSERM, TAGC, Turing Centre for Living Systems, 163 Avenue de Luminy, Marseille 13009, France.,Aix-Marseille University, INSERM, CNRS, CIML, Turing Centre for Living Systems, 163 Avenue de Luminy, Marseille 13009, France
| | - Christine Brun
- Aix-Marseille University, INSERM, TAGC, Turing Centre for Living Systems, 163 Avenue de Luminy, Marseille 13009, France.,CNRS, 31 Chemin Joseph Aiguier, Marseille 13009, France
| |
|
62
|
Taylor HB, Klaeger S, Clauser KR, Sarkizova S, Weingarten-Gabbay S, Graham DB, Carr SA, Abelin JG. MS-Based HLA-II Peptidomics Combined With Multiomics Will Aid the Development of Future Immunotherapies. Mol Cell Proteomics 2021; 20:100116. [PMID: 34146720 PMCID: PMC8327157 DOI: 10.1016/j.mcpro.2021.100116] [Citation(s) in RCA: 11] [Impact Index Per Article: 3.7] [Received: 01/19/2021] [Revised: 06/02/2021] [Accepted: 06/03/2021] [Indexed: 12/25/2022] Open
Abstract
Immunotherapies have emerged to treat diseases by selectively modulating a patient's immune response. Although the roles of T and B cells in adaptive immunity have been well studied, it remains difficult to select targets for immunotherapeutic strategies. Because human leukocyte antigen class II (HLA-II) peptides activate CD4+ T cells and regulate B cell activation, proliferation, and differentiation, these peptide antigens represent a class of potential immunotherapy targets and biomarkers. To better understand the molecular basis of how HLA-II antigen presentation is involved in disease progression and treatment, systematic HLA-II peptidomics combined with multiomic analyses of diverse cell types in healthy and diseased states is required. For this reason, MS-based innovations that facilitate investigations into the interplay between disease pathologies and the presentation of HLA-II peptides to CD4+ T cells will aid in the development of patient-focused immunotherapies.
Affiliation(s)
- Hannah B Taylor
- Broad Institute of MIT and Harvard, Cambridge, Massachusetts, USA
| | - Susan Klaeger
- Broad Institute of MIT and Harvard, Cambridge, Massachusetts, USA
| | - Karl R Clauser
- Broad Institute of MIT and Harvard, Cambridge, Massachusetts, USA
| | | | - Shira Weingarten-Gabbay
- Broad Institute of MIT and Harvard, Cambridge, Massachusetts, USA; Department of Organismic and Evolutionary Biology, Harvard University, Cambridge, Massachusetts, USA
| | - Daniel B Graham
- Broad Institute of MIT and Harvard, Cambridge, Massachusetts, USA; Center for Computational and Integrative Biology, Massachusetts General Hospital and Harvard Medical School, Boston, Massachusetts, USA; Department of Molecular Biology, Massachusetts General Hospital and Harvard Medical School, Boston, Massachusetts, USA
| | - Steven A Carr
- Broad Institute of MIT and Harvard, Cambridge, Massachusetts, USA
| | | |
|
63
|
Bartholomäus A, Kolte B, Mustafayeva A, Goebel I, Fuchs S, Benndorf D, Engelmann S, Ignatova Z. smORFer: a modular algorithm to detect small ORFs in prokaryotes. Nucleic Acids Res 2021; 49:e89. [PMID: 34125903 PMCID: PMC8421149 DOI: 10.1093/nar/gkab477] [Citation(s) in RCA: 10] [Impact Index Per Article: 3.3] [Received: 12/07/2020] [Revised: 04/29/2021] [Accepted: 05/18/2021] [Indexed: 11/15/2022] Open
Abstract
Emerging evidence places small proteins (≤50 amino acids) more centrally in physiological processes. Yet, their functional identification and the systematic genome annotation of their cognate small open-reading frames (smORFs) remain challenging both experimentally and computationally. Ribosome profiling or Ribo-Seq (that is, deep sequencing of ribosome-protected fragments) enables detection of actively translated open-reading frames (ORFs) and empirical annotation of coding sequences (CDSs) using the in-register translation pattern that is characteristic of genuinely translating ribosomes. Multiple identifiers of ORFs that use the 3-nt periodicity in Ribo-Seq data sets have been successful in eukaryotic smORF annotation. However, they have difficulty evaluating prokaryotic genomes because of their unique architecture (e.g. polycistronic messages, overlapping ORFs, leaderless translation, non-canonical initiation etc.). Here, we present a new algorithm, smORFer, which performs with high accuracy in prokaryotic organisms in detecting putative smORFs. The unique feature of smORFer is that it uses an integrated approach and considers structural features of the genetic sequence along with in-frame translation and uses Fourier transform to convert these parameters into a measurable score to faithfully select smORFs. The algorithm is executed in a modular way, and depending on the data available for a particular organism, different modules can be selected for smORF search.
Affiliation(s)
- Alexander Bartholomäus
- GFZ German Research Centre for Geosciences, Section Geomicrobiology, 14473 Potsdam, Germany.,Inst. Biochemistry and Molecular Biology, Department of Chemistry, University of Hamburg, 20146 Hamburg, Germany
| | - Baban Kolte
- Inst. Biochemistry and Molecular Biology, Department of Chemistry, University of Hamburg, 20146 Hamburg, Germany
| | - Ayten Mustafayeva
- Helmholtz Center for Infection Research, Microbial Proteomics, 38124 Braunschweig, Germany.,Inst. Microbiology, TU Braunschweig, Braunschweig, Germany
| | - Ingrid Goebel
- Inst. Biochemistry and Molecular Biology, Department of Chemistry, University of Hamburg, 20146 Hamburg, Germany
| | | | - Dirk Benndorf
- Otto von Guericke University, Bioprocess Engineering, 39106 Magdeburg, Germany.,Max Planck Institute for Dynamics of Complex Technical Systems, Bioprocess Engineering, 39106 Magdeburg, Germany
| | - Susanne Engelmann
- Helmholtz Center for Infection Research, Microbial Proteomics, 38124 Braunschweig, Germany.,Inst. Microbiology, TU Braunschweig, Braunschweig, Germany
| | - Zoya Ignatova
- Inst. Biochemistry and Molecular Biology, Department of Chemistry, University of Hamburg, 20146 Hamburg, Germany
| |
|
64
|
Finkel Y, Gluck A, Nachshon A, Winkler R, Fisher T, Rozman B, Mizrahi O, Lubelsky Y, Zuckerman B, Slobodin B, Yahalom-Ronen Y, Tamir H, Ulitsky I, Israely T, Paran N, Schwartz M, Stern-Ginossar N. SARS-CoV-2 uses a multipronged strategy to impede host protein synthesis. Nature 2021; 594:240-245. [PMID: 33979833 DOI: 10.1038/s41586-021-03610-3] [Citation(s) in RCA: 156] [Impact Index Per Article: 52.0] [Received: 11/24/2020] [Accepted: 05/04/2021] [Indexed: 02/07/2023]
Abstract
The coronavirus SARS-CoV-2 is the cause of the ongoing pandemic of COVID-19 [1]. Coronaviruses have developed a variety of mechanisms to repress host mRNA translation to allow the translation of viral mRNA, and concomitantly block the cellular innate immune response [2,3]. Although several different proteins of SARS-CoV-2 have previously been implicated in shutting off host expression [4-7], a comprehensive picture of the effects of SARS-CoV-2 infection on cellular gene expression is lacking. Here we combine RNA sequencing, ribosome profiling and metabolic labelling of newly synthesized RNA to comprehensively define the mechanisms that are used by SARS-CoV-2 to shut off cellular protein synthesis. We show that infection leads to a global reduction in translation, but that viral transcripts are not preferentially translated. Instead, we find that infection leads to the accelerated degradation of cytosolic cellular mRNAs, which facilitates viral takeover of the mRNA pool in infected cells. We reveal that the translation of transcripts that are induced in response to infection (including innate immune genes) is impaired. We demonstrate this impairment is probably mediated by inhibition of nuclear mRNA export, which prevents newly transcribed cellular mRNA from accessing ribosomes. Overall, our results uncover a multipronged strategy that is used by SARS-CoV-2 to take over the translation machinery and to suppress host defences.
Affiliation(s)
- Yaara Finkel
- Department of Molecular Genetics, Weizmann Institute of Science, Rehovot, Israel
| | - Avi Gluck
- Department of Molecular Genetics, Weizmann Institute of Science, Rehovot, Israel
| | - Aharon Nachshon
- Department of Molecular Genetics, Weizmann Institute of Science, Rehovot, Israel
| | - Roni Winkler
- Department of Molecular Genetics, Weizmann Institute of Science, Rehovot, Israel
| | - Tal Fisher
- Department of Molecular Genetics, Weizmann Institute of Science, Rehovot, Israel
| | - Batsheva Rozman
- Department of Molecular Genetics, Weizmann Institute of Science, Rehovot, Israel
| | - Orel Mizrahi
- Department of Molecular Genetics, Weizmann Institute of Science, Rehovot, Israel
| | - Yoav Lubelsky
- Department of Biological Regulation, Weizmann Institute of Science, Rehovot, Israel
| | - Binyamin Zuckerman
- Department of Biological Regulation, Weizmann Institute of Science, Rehovot, Israel
| | - Boris Slobodin
- Department of Biomolecular Sciences, Weizmann Institute of Science, Rehovot, Israel
| | - Yfat Yahalom-Ronen
- Department of Infectious Diseases, Israel Institute for Biological, Chemical and Environmental Sciences, Ness Ziona, Israel
| | - Hadas Tamir
- Department of Infectious Diseases, Israel Institute for Biological, Chemical and Environmental Sciences, Ness Ziona, Israel
| | - Igor Ulitsky
- Department of Biological Regulation, Weizmann Institute of Science, Rehovot, Israel
| | - Tomer Israely
- Department of Infectious Diseases, Israel Institute for Biological, Chemical and Environmental Sciences, Ness Ziona, Israel
| | - Nir Paran
- Department of Infectious Diseases, Israel Institute for Biological, Chemical and Environmental Sciences, Ness Ziona, Israel
| | - Michal Schwartz
- Department of Molecular Genetics, Weizmann Institute of Science, Rehovot, Israel.
| | - Noam Stern-Ginossar
- Department of Molecular Genetics, Weizmann Institute of Science, Rehovot, Israel.
| |
|
65
|
Hu F, Lu J, Matheson LS, Díaz-Muñoz MD, Saveliev A, Turner M. ORFLine: a bioinformatic pipeline to prioritise small open reading frames identifies candidate secreted small proteins from lymphocytes. Bioinformatics 2021; 37:3152-3159. [PMID: 33970232 PMCID: PMC8504629 DOI: 10.1093/bioinformatics/btab339] [Citation(s) in RCA: 5] [Impact Index Per Article: 1.7] [Received: 09/22/2020] [Revised: 03/25/2021] [Accepted: 04/30/2021] [Indexed: 11/30/2022] Open
Abstract
MOTIVATION The annotation of small open reading frames (smORFs) of less than 100 codons (<300 nucleotides) is challenging due to the large number of such sequences in the genome. RESULTS In this study, we developed a computational pipeline, which we have named ORFLine, that stringently identifies smORFs and classifies them according to their position within transcripts. We identified a total of 5744 unique smORFs in datasets from mouse B and T lymphocytes and systematically characterized them using ORFLine. We further searched smORFs for the presence of a signal peptide, which predicted known secreted chemokines as well as novel micropeptides. Four novel micropeptides show evidence of secretion and are therefore candidate mediators of immunoregulatory functions. AVAILABILITY Freely available on the web at https://github.com/boboppie/ORFLine. SUPPLEMENTARY INFORMATION Supplementary data are available at Bioinformatics online.
Affiliation(s)
- Fengyuan Hu
- Laboratory of Lymphocyte Signalling and Development, The Babraham Institute, Babraham Research Campus, Cambridge, CB22 3AT, United Kingdom
| | - Jia Lu
- Laboratory of Lymphocyte Signalling and Development, The Babraham Institute, Babraham Research Campus, Cambridge, CB22 3AT, United Kingdom
| | - Louise S Matheson
- Laboratory of Lymphocyte Signalling and Development, The Babraham Institute, Babraham Research Campus, Cambridge, CB22 3AT, United Kingdom
| | - Manuel D Díaz-Muñoz
- Laboratory of Lymphocyte Signalling and Development, The Babraham Institute, Babraham Research Campus, Cambridge, CB22 3AT, United Kingdom
| | - Alexander Saveliev
- Laboratory of Lymphocyte Signalling and Development, The Babraham Institute, Babraham Research Campus, Cambridge, CB22 3AT, United Kingdom
| | - Martin Turner
- Laboratory of Lymphocyte Signalling and Development, The Babraham Institute, Babraham Research Campus, Cambridge, CB22 3AT, United Kingdom
| |
|
66
|
Vitorino R, Guedes S, Amado F, Santos M, Akimitsu N. The role of micropeptides in biology. Cell Mol Life Sci 2021; 78:3285-3298. [PMID: 33507325 PMCID: PMC11073438 DOI: 10.1007/s00018-020-03740-3] [Citation(s) in RCA: 19] [Impact Index Per Article: 6.3] [Received: 08/10/2020] [Revised: 12/01/2020] [Accepted: 12/11/2020] [Indexed: 12/11/2022]
Abstract
Micropeptides are small polypeptides coded by small open-reading frames. Progress in computational biology and the analyses of large-scale transcriptomes and proteomes have revealed that mammalian genomes produce a large number of transcripts encoding micropeptides. Many of these have been previously annotated as long noncoding RNAs. The role of micropeptides in cellular homeostasis maintenance has been demonstrated. This review discusses different types of micropeptides as well as methods to identify them, such as computational approaches, ribosome profiling, and mass spectrometry.
Affiliation(s)
- Rui Vitorino
- Departamento de Cirurgia E Fisiologia, Faculdade de Medicina da Universidade Do Porto, UnIC, Porto, Portugal.
- Department of Medical Sciences, iBiMED, University of Aveiro, Aveiro, Portugal.
| | - Sofia Guedes
- Departamento de Química, LAQV-REQUIMTE, Universidade de Aveiro, Aveiro, Portugal
- Department of Chemistry, University of Aveiro, Aveiro, Portugal
| | - Francisco Amado
- Departamento de Química, LAQV-REQUIMTE, Universidade de Aveiro, Aveiro, Portugal
- Department of Chemistry, University of Aveiro, Aveiro, Portugal
| | - Manuel Santos
- Department of Medical Sciences, iBiMED, University of Aveiro, Aveiro, Portugal
| | | |
|
67
|
uORF-seqr: A Machine Learning-Based Approach to the Identification of Upstream Open Reading Frames in Yeast. Methods Mol Biol 2021. [PMID: 33765283 DOI: 10.1007/978-1-0716-1150-0_15] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 07/29/2023]
Abstract
The identification of upstream open reading frames (uORFs) using ribosome profiling data is complicated by several factors such as the noise inherent to the procedure, the substantial increase in potential translation initiation sites (and false positives) when one includes non-canonical start codons, and the paucity of molecularly validated uORFs. Here we present uORF-seqr, a novel machine learning algorithm that uses ribosome profiling data, in conjunction with RNA-seq data, as well as transcript aware genome annotation files to identify statistically significant AUG and near-cognate codon uORFs.
|
68
|
Abstract
Translation initiation site (TIS) profiling allows for the genome-wide identification of TISs in vivo by exclusively capturing mRNA fragments within ribosomes that have just completed translation initiation. It leverages translation inhibitors, such as harringtonine and lactimidomycin (LTM), that preferentially capture ribosomes at start codon positions, protecting TIS-derived mRNA fragments from nuclease digestion. Here, we describe a step-by-step protocol for TIS profiling in LTM-treated budding yeast that we developed to identify TISs and open reading frames in vegetative and meiotic cells. For complete details on the use and execution of this protocol, please refer to Eisenberg et al. (2020).
Affiliation(s)
- Ina Hollerer
- Department of Molecular and Cell Biology, University of California, Berkeley, Berkeley, CA 94720, USA
- California Institute for Quantitative Biosciences (QB3), University of California, San Francisco, CA 94158, USA
| | - Emily N. Powers
- Department of Molecular and Cell Biology, University of California, Berkeley, Berkeley, CA 94720, USA
- California Institute for Quantitative Biosciences (QB3), University of California, San Francisco, CA 94158, USA
| | - Gloria A. Brar
- Department of Molecular and Cell Biology, University of California, Berkeley, Berkeley, CA 94720, USA
- California Institute for Quantitative Biosciences (QB3), University of California, San Francisco, CA 94158, USA
- Center for Computational Biology, University of California, Berkeley, Berkeley, CA 94720, USA
| |
|
69
|
Ruiz Cuevas MV, Hardy MP, Hollý J, Bonneil É, Durette C, Courcelles M, Lanoix J, Côté C, Staudt LM, Lemieux S, Thibault P, Perreault C, Yewdell JW. Most non-canonical proteins uniquely populate the proteome or immunopeptidome. Cell Rep 2021; 34:108815. [PMID: 33691108 PMCID: PMC8040094 DOI: 10.1016/j.celrep.2021.108815] [Citation(s) in RCA: 102] [Impact Index Per Article: 34.0] [Received: 08/03/2020] [Revised: 01/29/2021] [Accepted: 02/10/2021] [Indexed: 12/16/2022] Open
Abstract
Combining RNA sequencing, ribosome profiling, and mass spectrometry, we elucidate the contribution of non-canonical translation to the proteome and major histocompatibility complex (MHC) class I immunopeptidome. Remarkably, of 14,498 proteins identified in three human B cell lymphomas, 2,503 are non-canonical proteins. Of these, 28% are novel isoforms and 72% are cryptic proteins encoded by ostensibly non-coding regions (60%) or frameshifted canonical genes (12%). Cryptic proteins are translated as efficiently as canonical proteins, have more predicted disordered residues and lower stability, and critically generate MHC-I peptides 5-fold more efficiently per translation event. Translating 5' "untranslated" regions hinders downstream translation of genes involved in transcription, translation, and antiviral responses. Novel protein isoforms show strong enrichment for signaling pathways deregulated in cancer. Only a small fraction of cryptic proteins detected in the proteome contribute to the MHC-I immunopeptidome, demonstrating the high preferential access of cryptic defective ribosomal products to the class I pathway.
Affiliation(s)
- Maria Virginia Ruiz Cuevas
- Institute for Research in Immunology and Cancer (IRIC), Université de Montréal, Montreal, QC H3C 3J7, Canada; Department of Biochemistry and Molecular Medicine, Université de Montréal, Montreal, QC H3C 3J7, Canada
| | - Marie-Pierre Hardy
- Institute for Research in Immunology and Cancer (IRIC), Université de Montréal, Montreal, QC H3C 3J7, Canada
| | - Jaroslav Hollý
- Cellular Biology Section, Laboratory of Viral Diseases, National Institute of Allergy and Infectious Diseases, National Institutes of Health, Bethesda, MD 20892, USA
| | - Éric Bonneil
- Institute for Research in Immunology and Cancer (IRIC), Université de Montréal, Montreal, QC H3C 3J7, Canada
| | - Chantal Durette
- Institute for Research in Immunology and Cancer (IRIC), Université de Montréal, Montreal, QC H3C 3J7, Canada
| | - Mathieu Courcelles
- Institute for Research in Immunology and Cancer (IRIC), Université de Montréal, Montreal, QC H3C 3J7, Canada
| | - Joël Lanoix
- Institute for Research in Immunology and Cancer (IRIC), Université de Montréal, Montreal, QC H3C 3J7, Canada
| | - Caroline Côté
- Institute for Research in Immunology and Cancer (IRIC), Université de Montréal, Montreal, QC H3C 3J7, Canada
| | - Louis M Staudt
- Lymphoid Malignancies Branch, National Cancer Institute, National Institutes of Health, Bethesda, MD 20892, USA
| | - Sébastien Lemieux
- Institute for Research in Immunology and Cancer (IRIC), Université de Montréal, Montreal, QC H3C 3J7, Canada; Department of Biochemistry and Molecular Medicine, Université de Montréal, Montreal, QC H3C 3J7, Canada
| | - Pierre Thibault
- Institute for Research in Immunology and Cancer (IRIC), Université de Montréal, Montreal, QC H3C 3J7, Canada; Department of Chemistry, Université de Montréal, Montreal, QC H3C 3J7, Canada
| | - Claude Perreault
- Institute for Research in Immunology and Cancer (IRIC), Université de Montréal, Montreal, QC H3C 3J7, Canada; Department of Medicine, Université de Montréal, Montreal, QC H3C 3J7, Canada.
| | - Jonathan W Yewdell
- Cellular Biology Section, Laboratory of Viral Diseases, National Institute of Allergy and Infectious Diseases, National Institutes of Health, Bethesda, MD 20892, USA.
| |
|
70
|
Dodbele S, Mutlu N, Wilusz JE. Best practices to ensure robust investigation of circular RNAs: pitfalls and tips. EMBO Rep 2021; 22:e52072. [PMID: 33629517 PMCID: PMC7926241 DOI: 10.15252/embr.202052072] [Citation(s) in RCA: 38] [Impact Index Per Article: 12.7] [Received: 11/10/2020] [Revised: 12/13/2020] [Accepted: 01/29/2021] [Indexed: 12/14/2022] Open
Abstract
Pre-mRNAs from thousands of eukaryotic genes can be non-canonically spliced to generate circular RNAs (circRNAs) that have covalently linked ends. Most mature circular RNAs are expressed at low levels, but some have known physiological functions and/or accumulate to higher levels than their associated linear mRNAs. These observations have sparked great interest into this class of previously underappreciated RNAs and prompted the development of new experimental approaches to study them, especially methods to measure or modulate circular RNA expression levels. Nonetheless, each of these approaches has caveats and potential pitfalls that must be controlled for when designing experiments and interpreting results. Here, we provide practical advice, tips, and suggested guidelines for performing robust identification, validation, and functional characterization of circular RNAs. Beyond promoting rigor and reproducibility, these suggestions should help bring clarity to the field, especially how circular RNAs function and whether these transcripts may sponge microRNAs/proteins or serve as templates for translation.
Affiliation(s)
- Samantha Dodbele
- Department of Biochemistry and Biophysics, University of Pennsylvania Perelman School of Medicine, Philadelphia, PA, USA
| | - Nebibe Mutlu
- Department of Biochemistry and Biophysics, University of Pennsylvania Perelman School of Medicine, Philadelphia, PA, USA
| | - Jeremy E Wilusz
- Department of Biochemistry and Biophysics, University of Pennsylvania Perelman School of Medicine, Philadelphia, PA, USA
| |
|
71
|
Schlesinger D, Elsässer SJ. Revisiting sORFs: overcoming challenges to identify and characterize functional microproteins. FEBS J 2021; 289:53-74. [PMID: 33595896 DOI: 10.1111/febs.15769] [Citation(s) in RCA: 54] [Impact Index Per Article: 18.0] [Received: 08/21/2020] [Revised: 01/17/2021] [Accepted: 02/15/2021] [Indexed: 02/07/2023]
Abstract
Short ORFs (sORFs), that is, occurrences of a start and stop codon within 100 codons or less, can be found in organisms of all domains of life, outnumbering annotated protein-coding ORFs by orders of magnitude. Even though functional proteins smaller than 100 amino acids are known, the coding potential of sORFs has often been overlooked, as it is not trivial to predict and test for functionality within the large number of sORFs. Recent advances in ribosome profiling and mass spectrometry approaches, together with refined bioinformatic predictions, have enabled a huge leap forward in this field and identified thousands of likely coding sORFs. A relatively low number of small proteins or microproteins produced from these sORFs have been characterized so far on the molecular, structural, and/or mechanistic level. These however display versatile and, in some cases, essential cellular functions, allowing for the exciting possibility that many more, previously unknown small proteins might be encoded in the genome, waiting to be discovered. This review will give an overview of the steadily growing microprotein field, focusing on eukaryotic small proteins. We will discuss emerging themes in the molecular action of microproteins, as well as advances and challenges in microprotein identification and characterization.
Affiliation(s)
- Dörte Schlesinger
- Science for Life Laboratory, Division of Genome Biology, Department of Medical Biochemistry and Biophysics, Karolinska Institutet, Stockholm, Sweden.,Ming Wai Lau Centre for Reparative Medicine, Stockholm node, Karolinska Institutet, Stockholm, Sweden
| | - Simon J Elsässer
- Science for Life Laboratory, Division of Genome Biology, Department of Medical Biochemistry and Biophysics, Karolinska Institutet, Stockholm, Sweden.,Ming Wai Lau Centre for Reparative Medicine, Stockholm node, Karolinska Institutet, Stockholm, Sweden
| |
|
72
|
Nasir MA, Nawaz S, Huang J. A Mini-review of Computational Approaches to Predict Functions and Findings of Novel Micro Peptides. Curr Bioinform 2021. [DOI: 10.2174/1574893615999200811130522] [Citation(s) in RCA: 2] [Impact Index Per Article: 0.7] [Indexed: 01/04/2023]
Abstract
New bioinformatics techniques and large-scale transcriptome studies have revealed that a larger part of the genome is translated than previously thought, giving rise to a wide variety of RNAs with protein-coding and noncoding potential. According to emerging evidence, many RNA molecules have been classified as noncoding for various reasons; for example, many sORFs that encode functional micro peptides have been neglected because of their small size. Recent studies reveal major biological functions of these sORFs and their encoded micro peptides in a wide range of species. Progress in identifying these sORFs and micro peptides is due to advances in bioinformatics and high-throughput sequencing methods. The field has attracted more attention as a large number of additional sORFs and micro peptides have been detected. COVID-19 currently commands the attention of science because of its sudden outbreak, and the sORFs of the COVID-19 virus should be identified to provide new ways to understand it. This review discusses ongoing progress in methods for the detection and identification of sORFs and micro peptides.
Affiliation(s)
- Mohsin Ali Nasir
- Center for Informational Biology, University of Electronic Science and Technology of China, No. 2006, Xiyuan Ave, West Hi-Tech Zone, Chengdu 611731, China
| | - Samia Nawaz
- Center for Informational Biology, University of Electronic Science and Technology of China, No. 2006, Xiyuan Ave, West Hi-Tech Zone, Chengdu 611731, China
| | - Jian Huang
- Center for Informational Biology, University of Electronic Science and Technology of China, No. 2006, Xiyuan Ave, West Hi-Tech Zone, Chengdu 611731, China
| |
|
73
|
Abstract
Translation is a central biological process in living cells. The ribosome profiling approach enables translation to be assessed on a global, cell-wide level. Extracting versatile information from ribosome profiling data usually requires specialized expertise in handling sequencing data that is not available to the broad community of experimentalists. Here, we provide an easy-to-use and modifiable workflow that uses a small set of commands and enables full data analysis in a standardized way, including precise positioning of the ribosome-protected fragments, for determining codon-specific translation features. The workflow is complemented with simple step-by-step explanations and is accessible to scientists with no computational background.
Affiliation(s)
| | - Zoya Ignatova
- Institute for Biochemistry and Molecular Biology, Department of Chemistry, University of Hamburg, Hamburg, Germany.
| |
|
74
|
Li K, Hope CM, Wang XA, Wang JP. RiboDiPA: a novel tool for differential pattern analysis in Ribo-seq data. Nucleic Acids Res 2020; 48:12016-12029. [PMID: 33211868 PMCID: PMC7708064 DOI: 10.1093/nar/gkaa1049] [Citation(s) in RCA: 2] [Impact Index Per Article: 0.5] [Received: 05/10/2020] [Revised: 10/14/2020] [Accepted: 10/20/2020] [Indexed: 12/18/2022] Open
Abstract
Ribosome profiling, also known as Ribo-seq, has become a popular approach to investigate regulatory mechanisms of translation in a wide variety of biological contexts. Ribo-seq not only provides a measurement of translation efficiency based on the relative abundance of ribosomes bound to transcripts, but also has the capacity to reveal dynamic and local regulation at different stages of translation based on positional information of footprints across individual transcripts. While many computational tools exist for the analysis of Ribo-seq data, no method is currently available for rigorous testing of the pattern differences in ribosome footprints. In this work, we develop a novel approach together with an R package, RiboDiPA, for Differential Pattern Analysis of Ribo-seq data. RiboDiPA allows for quick identification of genes with statistically significant differences in ribosome occupancy patterns for model organisms ranging from yeast to mammals. We show that differential pattern analysis reveals information that is distinct from and complementary to existing methods that focus on translational efficiency analysis. Using both simulated Ribo-seq footprint data and three benchmark data sets, we illustrate that RiboDiPA can uncover meaningful pattern differences across multiple biological conditions on a global scale, and pinpoint characteristic ribosome occupancy patterns at single codon resolution.
Affiliation(s)
- Keren Li
- Department of Statistics, Northwestern University, 633 Clark Street, Evanston, IL 60208, USA.,NSF-Simons Center for Quantitative Biology, Northwestern University, 633 Clark Street, Evanston, IL 60208, USA
| | - C Matthew Hope
- NSF-Simons Center for Quantitative Biology, Northwestern University, 633 Clark Street, Evanston, IL 60208, USA.,Department of Molecular Biosciences, Northwestern University, 633 Clark Street, Evanston, IL 60208, USA
| | - Xiaozhong A Wang
- NSF-Simons Center for Quantitative Biology, Northwestern University, 633 Clark Street, Evanston, IL 60208, USA.,Department of Molecular Biosciences, Northwestern University, 633 Clark Street, Evanston, IL 60208, USA
| | - Ji-Ping Wang
- Department of Statistics, Northwestern University, 633 Clark Street, Evanston, IL 60208, USA.,NSF-Simons Center for Quantitative Biology, Northwestern University, 633 Clark Street, Evanston, IL 60208, USA
| |
|
75
|
Higdon AL, Brar GA. Rules are made to be broken: a "simple" model organism reveals the complexity of gene regulation. Curr Genet 2020; 67:49-56. [PMID: 33130938 DOI: 10.1007/s00294-020-01121-8] [Citation(s) in RCA: 1] [Impact Index Per Article: 0.3] [Received: 09/18/2020] [Revised: 10/14/2020] [Accepted: 10/19/2020] [Indexed: 11/27/2022]
Abstract
Global methods for assaying translation have greatly improved our understanding of the protein-coding capacity of the genome. In particular, it is now possible to perform genome-wide and condition-specific identification of translation initiation sites through modified ribosome profiling methods that selectively capture initiating ribosomes. Here we discuss our recent study applying such an approach to meiotic and mitotic timepoints in the simple eukaryote, budding yeast, as an example of the surprising diversity of protein products, many of which are non-canonical, that can be revealed by such methods. We also highlight several key challenges in studying non-canonical protein isoforms that have precluded their prior systematic discovery. A growing body of work supports expanded use of empirical protein-coding region identification, which can help relieve some of the limitations and biases inherent to traditional genome annotation approaches. Our study also argues for the adoption of less static views of gene identity and a broader framework for considering the translational capacity of the genome.
Affiliation(s)
- Andrea L Higdon
- Department of Molecular and Cell Biology, University of California, Berkeley, CA, 94720, USA
- Center for Computational Biology, University of California, Berkeley, CA, 94720, USA
| | - Gloria A Brar
- Department of Molecular and Cell Biology, University of California, Berkeley, CA, 94720, USA.
- Center for Computational Biology, University of California, Berkeley, CA, 94720, USA.
| |
|
76
|
Zhou B, Yang H, Yang C, Bao YL, Yang SM, Liu J, Xiao YF. Translation of noncoding RNAs and cancer. Cancer Lett 2020; 497:89-99. [PMID: 33038492 DOI: 10.1016/j.canlet.2020.10.002] [Citation(s) in RCA: 83] [Impact Index Per Article: 20.8] [Received: 04/24/2020] [Revised: 09/30/2020] [Accepted: 10/01/2020] [Indexed: 02/07/2023]
Abstract
The human genome contains thousands of noncoding RNAs (ncRNAs), which are thought to lack open reading frames (ORFs) and thus not to be translated. Some ncRNAs reportedly have important functions, including epigenetic regulation, chromatin remodeling, protein modification, and RNA degradation, but the functions of most ncRNAs remain elusive. Through the application and development of ribosome profiling and sequencing technologies, an increasing number of studies have discovered the translation of ncRNAs. Although ncRNAs were initially defined as noncoding RNAs, a number of ncRNAs actually contain ORFs that are translated into peptides. Here, we summarize the available methods, tools, and databases for identifying and validating ncRNA-encoded peptides/proteins, and the recent findings regarding ncRNA-encoded small peptides/proteins in cancer are compiled and synthesized. Importantly, ncRNA-encoded peptides/proteins have application prospects in cancer research, but some potential challenges remain unresolved. The aim of this review is to provide a theoretical basis that might promote the discovery of more peptides/proteins encoded by ncRNAs and aid the further development of novel diagnostic and prognostic cancer markers and therapeutic targets.
Affiliation(s)
- Bo Zhou
- Department of Gastroenterology, Xinqiao Hospital, Chongqing, 400037, China
| | - Huan Yang
- Department of Gastroenterology, Xinqiao Hospital, Chongqing, 400037, China
| | - Chuan Yang
- Department of Gastroenterology, Xinqiao Hospital, Chongqing, 400037, China
| | - Yu-Lu Bao
- Department of Gastroenterology, Xinqiao Hospital, Chongqing, 400037, China
| | - Shi-Ming Yang
- Department of Gastroenterology, Xinqiao Hospital, Chongqing, 400037, China
| | - Jiao Liu
- Department of Endoscope, General Hospital of Northern Theater Command, Shenyang, 110016, Liaoning, China.
| | - Yu-Feng Xiao
- Department of Gastroenterology, Xinqiao Hospital, Chongqing, 400037, China.
| |
|
77
|
Liu Q, Shvarts T, Sliz P, Gregory RI. RiboToolkit: an integrated platform for analysis and annotation of ribosome profiling data to decode mRNA translation at codon resolution. Nucleic Acids Res 2020; 48:W218-W229. [PMID: 32427338 PMCID: PMC7319539 DOI: 10.1093/nar/gkaa395] [Citation(s) in RCA: 44] [Impact Index Per Article: 11.0] [Received: 03/06/2020] [Revised: 04/23/2020] [Accepted: 05/15/2020] [Indexed: 12/31/2022] Open
Abstract
Ribosome profiling (Ribo-seq) is a powerful technology for globally monitoring RNA translation, ranging from codon occupancy profiling and identification of actively translated open reading frames (ORFs) to the quantification of translational efficiency under various physiological or experimental conditions. However, analyzing and decoding translation information from Ribo-seq data is not trivial. Although there are many existing tools to analyze Ribo-seq data, most of these tools are designed for specific or limited functionalities, and an easy-to-use integrated tool to analyze Ribo-seq data is lacking. Fortunately, the small size (26–34 nt) of ribosome protected fragments (RPFs) in Ribo-seq and the relatively small amount of sequencing data greatly facilitate the development of such a web platform, which is easy to manipulate for users with or without bioinformatic expertise. Thus, we developed RiboToolkit (http://rnabioinfor.tch.harvard.edu/RiboToolkit), a convenient, freely available, web-based service to centralize Ribo-seq data analyses, including data cleaning and quality evaluation, expression analysis based on RPFs, codon occupancy, translation efficiency analysis, differential translation analysis, functional annotation, translation metagene analysis, and identification of actively translated ORFs. In addition, easy-to-use web interfaces were developed to facilitate data analysis and intuitively visualize results. Thus, RiboToolkit will greatly facilitate the study of mRNA translation based on ribosome profiling.
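As a rough illustration of the translation efficiency analysis mentioned in this abstract: translational efficiency is commonly computed as the ratio of normalized footprint (RPF) density to normalized mRNA density over the same coding region. The sketch below assumes RPKM normalization; the function names `rpkm` and `translation_efficiency` and the example counts are hypothetical, and this is not RiboToolkit's implementation.

```python
import math

def rpkm(read_count, length_nt, total_mapped_reads):
    """Reads per kilobase of feature per million mapped reads."""
    return read_count * 1e9 / (length_nt * total_mapped_reads)

def translation_efficiency(rpf_count, rna_count, cds_length_nt, rpf_total, rna_total):
    """Ratio of footprint density to mRNA density over the same CDS."""
    rpf_density = rpkm(rpf_count, cds_length_nt, rpf_total)
    rna_density = rpkm(rna_count, cds_length_nt, rna_total)
    if rna_density == 0:
        raise ValueError("mRNA density is zero; TE is undefined")
    return rpf_density / rna_density

# Made-up counts for one gene: 200 footprint reads and 100 RNA-seq reads
# on a 1,000-nt CDS, with one million mapped reads in each library.
te = translation_efficiency(200, 100, 1000, 1_000_000, 1_000_000)
log2_te = math.log2(te)
```

In practice TE is usually compared on a log2 scale between conditions, and low-count genes are filtered or handled by a count-based statistical model rather than a plain ratio.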
Affiliation(s)
- Qi Liu
- Stem Cell Program, Division of Hematology/Oncology, Boston Children's Hospital, Boston, MA 02115, USA.,Department of Biological Chemistry and Molecular Pharmacology, Harvard Medical School, Boston, MA 02115, USA
| | - Tanya Shvarts
- Computational Health Informatics Program, Boston Children's Hospital, Boston, MA 02115, USA
| | - Piotr Sliz
- Department of Biological Chemistry and Molecular Pharmacology, Harvard Medical School, Boston, MA 02115, USA.,Computational Health Informatics Program, Boston Children's Hospital, Boston, MA 02115, USA
| | - Richard I Gregory
- Stem Cell Program, Division of Hematology/Oncology, Boston Children's Hospital, Boston, MA 02115, USA.,Department of Biological Chemistry and Molecular Pharmacology, Harvard Medical School, Boston, MA 02115, USA.,Department of Pediatrics, Harvard Medical School, Boston, MA 02115, USA.,Harvard Initiative for RNA Medicine, Boston, MA 02115, USA.,Harvard Stem Cell Institute, Cambridge, MA 02138, USA
| |
|
78
|
Hoang HD, Graber TE, Jia JJ, Vaidya N, Gilchrist VH, Xiang X, Li W, Cowan KN, Gkogkas CG, Jaramillo M, Jafarnejad SM, Alain T. Induction of an Alternative mRNA 5' Leader Enhances Translation of the Ciliopathy Gene Inpp5e and Resistance to Oncolytic Virus Infection. Cell Rep 2020; 29:4010-4023.e5. [PMID: 31851930 DOI: 10.1016/j.celrep.2019.11.072] [Citation(s) in RCA: 10] [Impact Index Per Article: 2.5] [Received: 07/02/2019] [Revised: 10/16/2019] [Accepted: 11/15/2019] [Indexed: 01/10/2023] Open
Abstract
Residual cell-intrinsic innate immunity in cancer cells hampers infection with oncolytic viruses. Translational control of mRNA is an important feature of innate immunity, yet the identity of translationally regulated mRNAs functioning in host defense remains ill-defined. We report the translatomes of resistant murine "4T1" breast cancer cells infected with three of the most clinically advanced oncolytic viruses: herpes simplex virus 1, reovirus, and vaccinia virus. Common among all three infections are translationally de-repressed mRNAs, including Inpp5e, encoding an inositol 5-phosphatase that modifies lipid second messenger signaling. We find that viral infection induces the expression of an Inpp5e mRNA variant that lacks repressive upstream open reading frames (uORFs) within its 5' leader and is efficiently translated. Furthermore, we show that INPP5E contributes to antiviral immunity by altering virus attachment. These findings uncover a role for translational control through alternative 5' leader expression and assign an antiviral function to the ciliopathy gene Inpp5e.
Affiliation(s)
- Huy-Dung Hoang
- Children's Hospital of Eastern Ontario Research Institute, Ottawa, ON K1H 8L1, Canada; Department of Biochemistry, Microbiology, and Immunology, University of Ottawa, Ottawa, ON K1H 8M5, Canada
| | - Tyson E Graber
- Children's Hospital of Eastern Ontario Research Institute, Ottawa, ON K1H 8L1, Canada; Department of Biochemistry and Goodman Cancer Center, McGill University, Montreal, QC H3A 1A3, Canada
| | - Jian-Jun Jia
- Children's Hospital of Eastern Ontario Research Institute, Ottawa, ON K1H 8L1, Canada
| | - Nasana Vaidya
- Children's Hospital of Eastern Ontario Research Institute, Ottawa, ON K1H 8L1, Canada
| | - Victoria H Gilchrist
- Children's Hospital of Eastern Ontario Research Institute, Ottawa, ON K1H 8L1, Canada; Department of Biochemistry, Microbiology, and Immunology, University of Ottawa, Ottawa, ON K1H 8M5, Canada
| | - Xiao Xiang
- Children's Hospital of Eastern Ontario Research Institute, Ottawa, ON K1H 8L1, Canada; Department of Cellular and Molecular Medicine, University of Ottawa, Ottawa, ON K1H 8M5, Canada
| | - Wencheng Li
- Department of Biochemistry and Molecular Biology, Rutgers New Jersey Medical School, Newark, NJ 07101, USA
| | - Kyle N Cowan
- Children's Hospital of Eastern Ontario Research Institute, Ottawa, ON K1H 8L1, Canada; Department of Cellular and Molecular Medicine, University of Ottawa, Ottawa, ON K1H 8M5, Canada; Department of Surgery, Children's Hospital of Eastern Ontario, University of Ottawa, Ottawa, ON K1H 8L1, Canada
| | - Christos G Gkogkas
- Centre for Discovery Brain Sciences, University of Edinburgh, Edinburgh EH8 9XD, UK
| | - Maritza Jaramillo
- INRS Institut Armand-Frappier Research Centre, Laval, QC H7V 1B7, Canada
| | - Seyed Mehdi Jafarnejad
- Centre for Cancer Research and Cell Biology, School of Medicine, Dentistry and Biomedical Science, Queen's University Belfast, Belfast BT9 7AE, UK
| | - Tommy Alain
- Children's Hospital of Eastern Ontario Research Institute, Ottawa, ON K1H 8L1, Canada; Department of Biochemistry, Microbiology, and Immunology, University of Ottawa, Ottawa, ON K1H 8M5, Canada.
| |
|
79
|
Choudhary S, Li W, Smith AD. Accurate detection of short and long active ORFs using Ribo-seq data. Bioinformatics 2020; 36:2053-2059. [PMID: 31750902 DOI: 10.1093/bioinformatics/btz878] [Citation(s) in RCA: 25] [Impact Index Per Article: 6.3] [Received: 06/18/2019] [Revised: 11/04/2019] [Accepted: 11/20/2019] [Indexed: 12/27/2022] Open
Abstract
Motivation: Ribo-seq, a technique for deep-sequencing ribosome-protected mRNA fragments, has enabled transcriptome-wide monitoring of translation in vivo. It has opened avenues for re-evaluating the coding potential of open reading frames (ORFs), including many short ORFs that were previously presumed to be non-translating. However, the detection of translating ORFs, specifically short ORFs, from Ribo-seq data remains challenging due to its high heterogeneity and noise. Results: We present ribotricer, a method for detecting actively translating ORFs by directly leveraging the three-nucleotide periodicity of Ribo-seq data. Ribotricer demonstrates higher accuracy and robustness compared with other methods at detecting actively translating ORFs, including short ORFs, on multiple published datasets across species inclusive of Arabidopsis, Caenorhabditis elegans, Drosophila, human, mouse, rat, yeast and zebrafish. Availability and implementation: Ribotricer is available at https://github.com/smithlabcode/ribotricer. All analysis scripts and results are available at https://github.com/smithlabcode/ribotricer-results. Supplementary information: Supplementary data are available at Bioinformatics online.
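To make the idea of three-nucleotide periodicity concrete: footprints from an actively translated ORF tend to pile up in one reading frame, so the share of reads in each frame is a simple first indicator. The sketch below is a generic frame-fraction calculation under the assumption that footprint positions have already been converted to P-site offsets relative to the ORF start; the function name `frame_fractions` and the example positions are hypothetical, and this is not ribotricer's scoring method.

```python
from collections import Counter

def frame_fractions(p_site_offsets):
    """Fraction of footprint P-sites falling in reading frames 0, 1 and 2.

    p_site_offsets: positions relative to the ORF start
    (0 = first nucleotide of the start codon).
    """
    counts = Counter(pos % 3 for pos in p_site_offsets)
    total = sum(counts.values())
    if total == 0:
        return [0.0, 0.0, 0.0]
    return [counts.get(frame, 0) / total for frame in range(3)]

# Made-up P-site positions: most fall on the first nucleotide of a codon.
positions = [0, 3, 6, 9, 12, 1, 15, 18, 2, 21]
fractions = frame_fractions(positions)  # frame 0 dominates here
```

A strong skew toward one frame is consistent with active translation, whereas roughly equal thirds suggests noise; short ORFs have few reads, which is why dedicated statistical methods are needed.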
Affiliation(s)
- Saket Choudhary
- Computational Biology and Bioinformatics, University of Southern California, Los Angeles, CA 90089, USA
| | - Wenzheng Li
- Computational Biology and Bioinformatics, University of Southern California, Los Angeles, CA 90089, USA
| | - Andrew D Smith
- Computational Biology and Bioinformatics, University of Southern California, Los Angeles, CA 90089, USA
| |
|
80
|
Finkel Y, Mizrahi O, Nachshon A, Weingarten-Gabbay S, Morgenstern D, Yahalom-Ronen Y, Tamir H, Achdout H, Stein D, Israeli O, Beth-Din A, Melamed S, Weiss S, Israely T, Paran N, Schwartz M, Stern-Ginossar N. The coding capacity of SARS-CoV-2. Nature 2020; 589:125-130. [PMID: 32906143 DOI: 10.1038/s41586-020-2739-1] [Citation(s) in RCA: 363] [Impact Index Per Article: 90.8] [Received: 05/15/2020] [Accepted: 09/01/2020] [Indexed: 12/18/2022]
Abstract
Severe acute respiratory syndrome coronavirus 2 (SARS-CoV-2) is the cause of the ongoing coronavirus disease 2019 (COVID-19) pandemic [1]. To understand the pathogenicity and antigenic potential of SARS-CoV-2 and to develop therapeutic tools, it is essential to profile the full repertoire of its expressed proteins. The current map of SARS-CoV-2 coding capacity is based on computational predictions and relies on homology with other coronaviruses. As the protein complement varies among coronaviruses, especially in regard to the variety of accessory proteins, it is crucial to characterize the specific range of SARS-CoV-2 proteins in an unbiased and open-ended manner. Here, using a suite of ribosome-profiling techniques [2-4], we present a high-resolution map of coding regions in the SARS-CoV-2 genome, which enables us to accurately quantify the expression of canonical viral open reading frames (ORFs) and to identify 23 unannotated viral ORFs. These ORFs include upstream ORFs that are likely to have a regulatory role, several in-frame internal ORFs within existing ORFs, resulting in N-terminally truncated products, as well as internal out-of-frame ORFs, which generate novel polypeptides. We further show that viral mRNAs are not translated more efficiently than host mRNAs; instead, virus translation dominates host translation because of the high levels of viral transcripts. Our work provides a resource that will form the basis of future functional studies.
Affiliation(s)
- Yaara Finkel
- Department of Molecular Genetics, Weizmann Institute of Science, Rehovot, Israel
| | - Orel Mizrahi
- Department of Molecular Genetics, Weizmann Institute of Science, Rehovot, Israel
| | - Aharon Nachshon
- Department of Molecular Genetics, Weizmann Institute of Science, Rehovot, Israel
| | - Shira Weingarten-Gabbay
- Broad Institute of MIT and Harvard, Cambridge, MA, USA.,Department of Organismal and Evolutionary Biology, Harvard University, Cambridge, MA, USA
| | - David Morgenstern
- de Botton Institute for Protein Profiling, The Nancy and Stephen Grand Israel National Center for Personalised Medicine, Weizmann Institute of Science, Rehovot, Israel
| | - Yfat Yahalom-Ronen
- Department of Infectious Diseases, Israel Institute for Biological Research, Ness Ziona, Israel
| | - Hadas Tamir
- Department of Infectious Diseases, Israel Institute for Biological Research, Ness Ziona, Israel
| | - Hagit Achdout
- Department of Infectious Diseases, Israel Institute for Biological Research, Ness Ziona, Israel
| | - Dana Stein
- Department of Biochemistry and Molecular Genetics, Israel Institute for Biological Research, Ness Ziona, Israel
| | - Ofir Israeli
- Department of Biochemistry and Molecular Genetics, Israel Institute for Biological Research, Ness Ziona, Israel
| | - Adi Beth-Din
- Department of Biochemistry and Molecular Genetics, Israel Institute for Biological Research, Ness Ziona, Israel
| | - Sharon Melamed
- Department of Infectious Diseases, Israel Institute for Biological Research, Ness Ziona, Israel
| | - Shay Weiss
- Department of Infectious Diseases, Israel Institute for Biological Research, Ness Ziona, Israel
| | - Tomer Israely
- Department of Infectious Diseases, Israel Institute for Biological Research, Ness Ziona, Israel
| | - Nir Paran
- Department of Infectious Diseases, Israel Institute for Biological Research, Ness Ziona, Israel
| | - Michal Schwartz
- Department of Molecular Genetics, Weizmann Institute of Science, Rehovot, Israel
| | - Noam Stern-Ginossar
- Department of Molecular Genetics, Weizmann Institute of Science, Rehovot, Israel.
| |
|
81
|
Akirtava C, McManus CJ. Control of translation by eukaryotic mRNA transcript leaders-Insights from high-throughput assays and computational modeling. Wiley Interdiscip Rev RNA 2020; 12:e1623. [PMID: 32869519 DOI: 10.1002/wrna.1623] [Citation(s) in RCA: 7] [Impact Index Per Article: 1.8] [Received: 04/22/2020] [Revised: 07/23/2020] [Accepted: 07/30/2020] [Indexed: 12/21/2022]
Abstract
Eukaryotic gene expression is tightly regulated during translation of mRNA to protein. Mis-regulation of translation can lead to aberrant proteins which accumulate in cancers and cause neurodegenerative diseases. Foundational studies on model genes established fundamental roles for mRNA 5' transcript leader (TL) sequences in controlling ribosome recruitment, scanning, and initiation. TL cis-regulatory elements and their corresponding trans-acting factors control cap-dependent initiation under unstressed conditions. Under stress, cap-dependent initiation is suppressed, and specific mRNA structures and sequences promote translation of stress-responsive transcripts to remodel the proteome. In this review, we summarize current knowledge of TL functions in translation initiation. We focus on insights from high-throughput analyses of ribosome occupancy, mRNA structure, RNA Binding Protein occupancy, and massively parallel reporter assays. These data-driven approaches, coupled with computational analyses and modeling, have paved the way for a comprehensive understanding of TL functions. Finally, we will discuss areas of future research on the roles of mRNA sequences and structures in translation. This article is categorized under: Translation > Translation Mechanisms; RNA Evolution and Genomics > Computational Analyses of RNA; RNA Structure and Dynamics > Influence of RNA Structure in Biological Systems.
Affiliation(s)
- Christina Akirtava
- Department of Biological Sciences, Carnegie Mellon University, Pittsburgh, Pennsylvania, USA
| | - Charles Joel McManus
- Department of Biological Sciences, Carnegie Mellon University, Pittsburgh, Pennsylvania, USA.,Computational Biology Department, Carnegie Mellon University, Pittsburgh, Pennsylvania, USA
| |
|
82
|
Eisenberg AR, Higdon AL, Hollerer I, Fields AP, Jungreis I, Diamond PD, Kellis M, Jovanovic M, Brar GA. Translation Initiation Site Profiling Reveals Widespread Synthesis of Non-AUG-Initiated Protein Isoforms in Yeast. Cell Syst 2020; 11:145-160.e5. [PMID: 32710835 PMCID: PMC7508262 DOI: 10.1016/j.cels.2020.06.011] [Citation(s) in RCA: 29] [Impact Index Per Article: 7.3] [Received: 03/10/2020] [Revised: 05/18/2020] [Accepted: 06/24/2020] [Indexed: 12/27/2022]
Abstract
Genomic analyses in budding yeast have helped define the foundational principles of eukaryotic gene expression. However, in the absence of empirical methods for defining coding regions, these analyses have historically excluded specific classes of possible coding regions, such as those initiating at non-AUG start codons. Here, we applied an experimental approach to globally annotate translation initiation sites in yeast and identified 149 genes with alternative N-terminally extended protein isoforms initiating from near-cognate codons upstream of annotated AUG start codons. These isoforms are produced in concert with canonical isoforms and translated with high specificity, resulting from initiation at only a small subset of possible start codons. The non-AUG initiation driving their production is enriched during meiosis and induced by low eIF5A, which is seen in this context. These findings reveal widespread production of non-canonical protein isoforms and unexpected complexity to the rules by which even a simple eukaryotic genome is decoded.
Affiliation(s)
- Amy R Eisenberg
- Department of Molecular and Cell Biology, University of California, Berkeley, Berkeley, CA 94720, USA
| | - Andrea L Higdon
- Department of Molecular and Cell Biology, University of California, Berkeley, Berkeley, CA 94720, USA; Center for Computational Biology, University of California, Berkeley, Berkeley, CA 94720, USA
| | - Ina Hollerer
- Department of Molecular and Cell Biology, University of California, Berkeley, Berkeley, CA 94720, USA
| | - Alexander P Fields
- Department of Cellular and Molecular Pharmacology, University of California, San Francisco, San Francisco, CA 94158, USA
| | - Irwin Jungreis
- MIT Computer Science and Artificial Intelligence Laboratory, Cambridge, MA 02139, USA; Broad Institute of MIT and Harvard, Cambridge, MA 02142, USA
| | - Paige D Diamond
- Department of Molecular and Cell Biology, University of California, Berkeley, Berkeley, CA 94720, USA
| | - Manolis Kellis
- MIT Computer Science and Artificial Intelligence Laboratory, Cambridge, MA 02139, USA; Broad Institute of MIT and Harvard, Cambridge, MA 02142, USA
| | - Marko Jovanovic
- Department of Biological Sciences, Columbia University, New York, NY 10027, USA
| | - Gloria A Brar
- Department of Molecular and Cell Biology, University of California, Berkeley, Berkeley, CA 94720, USA; Center for Computational Biology, University of California, Berkeley, Berkeley, CA 94720, USA.
| |
|
83
|
Trulley P, Snieckute G, Bekker-Jensen D, Menon MB, Freund R, Kotlyarov A, Olsen JV, Diaz-Muñoz MD, Turner M, Bekker-Jensen S, Gaestel M, Tiedje C. Alternative Translation Initiation Generates a Functionally Distinct Isoform of the Stress-Activated Protein Kinase MK2. Cell Rep 2020; 27:2859-2870.e6. [PMID: 31167133 DOI: 10.1016/j.celrep.2019.05.024] [Citation(s) in RCA: 14] [Impact Index Per Article: 3.5] [Received: 10/29/2018] [Revised: 04/10/2019] [Accepted: 05/06/2019] [Indexed: 12/16/2022] Open
Abstract
Alternative translation is an important mechanism of post-transcriptional gene regulation leading to the expression of different protein isoforms originating from the same mRNA. Here, we describe an abundant long isoform of the stress/p38MAPK-activated protein kinase MK2. This isoform is constitutively translated from an alternative CUG translation initiation start site located in the 5' UTR of its mRNA. The RNA helicase eIF4A1 is needed to ensure translation of the long and the known short isoforms of MK2, of which the molecular properties were determined. Only the short isoform phosphorylated Hsp27 in vivo, supported migration and stress-induced immediate early gene (IEG) expression. Interaction profiling revealed short-isoform-specific binding partners that were associated with migration. In contrast, the long isoform contains at least one additional phosphorylatable serine in its unique N terminus. In sum, our data reveal a longer isoform of MK2 with distinct physiological properties.
Affiliation(s)
- Philipp Trulley
- Institute of Cell Biochemistry, Hannover Medical School (MHH), Carl-Neuberg-Str. 1, 30625 Hannover, Germany
| | - Goda Snieckute
- Center for Healthy Aging, Department of Cellular and Molecular Medicine, University of Copenhagen, Blegdamsvej 3B, 2200 Copenhagen N, Denmark
| | - Dorte Bekker-Jensen
- Mass Spectrometry for Quantitative Proteomics, Proteomics Program, The Novo Nordisk Foundation Center for Protein Research, Faculty of Health and Medical Sciences, University of Copenhagen, Blegdamsvej 3B, 2200 Copenhagen N, Denmark
| | - Manoj B Menon
- Institute of Cell Biochemistry, Hannover Medical School (MHH), Carl-Neuberg-Str. 1, 30625 Hannover, Germany
| | - Robert Freund
- Institute of Cell Biochemistry, Hannover Medical School (MHH), Carl-Neuberg-Str. 1, 30625 Hannover, Germany
| | - Alexey Kotlyarov
- Institute of Cell Biochemistry, Hannover Medical School (MHH), Carl-Neuberg-Str. 1, 30625 Hannover, Germany
| | - Jesper V Olsen
- Mass Spectrometry for Quantitative Proteomics, Proteomics Program, The Novo Nordisk Foundation Center for Protein Research, Faculty of Health and Medical Sciences, University of Copenhagen, Blegdamsvej 3B, 2200 Copenhagen N, Denmark
| | - Manuel D Diaz-Muñoz
- Centre de Physiopathologie Toulouse-Purpan, INSERM UMR1043/CNRS U5282, Toulouse 31300, France; Lymphocyte Signalling and Development, The Babraham Institute, CB22 3AT Cambridge, UK
| | - Martin Turner
- Lymphocyte Signalling and Development, The Babraham Institute, CB22 3AT Cambridge, UK
| | - Simon Bekker-Jensen
- Center for Healthy Aging, Department of Cellular and Molecular Medicine, University of Copenhagen, Blegdamsvej 3B, 2200 Copenhagen N, Denmark.
| | - Matthias Gaestel
- Institute of Cell Biochemistry, Hannover Medical School (MHH), Carl-Neuberg-Str. 1, 30625 Hannover, Germany.
| | - Christopher Tiedje
- Institute of Cell Biochemistry, Hannover Medical School (MHH), Carl-Neuberg-Str. 1, 30625 Hannover, Germany; Center for Healthy Aging, Department of Cellular and Molecular Medicine, University of Copenhagen, Blegdamsvej 3B, 2200 Copenhagen N, Denmark.
| |
|
84
|
Li F, Xing X, Xiao Z, Xu G, Yang X. RiboMiner: a toolset for mining multi-dimensional features of the translatome with ribosome profiling data. BMC Bioinformatics 2020; 21:340. [PMID: 32738892 PMCID: PMC7430821 DOI: 10.1186/s12859-020-03670-8] [Citation(s) in RCA: 8] [Impact Index Per Article: 2.0] [Received: 04/22/2020] [Accepted: 07/20/2020] [Indexed: 02/08/2023] Open
Abstract
Background: Ribosome profiling has been widely used for studies of translation under a large variety of cellular and physiological contexts. Many of these studies have greatly benefitted from a series of data-mining tools designed for dissection of the translatome from different aspects. However, as studies of translation advance quickly, the current toolbox still falls short, and more specialized tools are urgently needed for deeper and more efficient mining of important and new features of translation landscapes. Results: Here, we present RiboMiner, a bioinformatics toolset for mining multi-dimensional features of the translatome with ribosome profiling data. RiboMiner performs extensive quality assessment of the data and integrates a spectrum of tools for various metagene analyses of the ribosome footprints and for detailed analyses of multiple features related to translation regulation. Visualizations of all the results are available. Many of these analyses have not been provided by previous methods. RiboMiner is highly flexible, as the pipeline can be easily adapted and customized for different scopes and targets of study. Conclusions: Applications of RiboMiner to two published datasets not only reproduced the main results reported before, but also generated novel insights into translation regulation processes. Therefore, being complementary to the current tools, RiboMiner could be a valuable resource for dissecting translation landscapes and translation regulation by mining ribosome profiling data more comprehensively and with higher resolution. RiboMiner is freely available at https://github.com/xryanglab/RiboMiner and https://pypi.org/project/RiboMiner.
Affiliation(s)
- Fajin Li
- MOE Key Laboratory of Bioinformatics, School of Life Sciences, Tsinghua University, Medical Science Building D231, Beijing, 100084, China.,Center for Synthetic & Systems Biology, Tsinghua University, Beijing, 100084, China.,Joint Graduate Program of Peking-Tsinghua-National Institute of Biological Science, Tsinghua University, Beijing, 100084, China
| | - Xudong Xing
- MOE Key Laboratory of Bioinformatics, School of Life Sciences, Tsinghua University, Medical Science Building D231, Beijing, 100084, China.,Center for Synthetic & Systems Biology, Tsinghua University, Beijing, 100084, China.,Joint Graduate Program of Peking-Tsinghua-National Institute of Biological Science, Tsinghua University, Beijing, 100084, China
| | - Zhengtao Xiao
- MOE Key Laboratory of Bioinformatics, School of Life Sciences, Tsinghua University, Medical Science Building D231, Beijing, 100084, China.,Center for Synthetic & Systems Biology, Tsinghua University, Beijing, 100084, China
| | - Gang Xu
- MOE Key Laboratory of Bioinformatics, School of Life Sciences, Tsinghua University, Medical Science Building D231, Beijing, 100084, China
| | - Xuerui Yang
- MOE Key Laboratory of Bioinformatics, School of Life Sciences, Tsinghua University, Medical Science Building D231, Beijing, 100084, China.,Center for Synthetic & Systems Biology, Tsinghua University, Beijing, 100084, China.
| |
|
85
|
Choi SW, Kim HW, Nam JW. The small peptide world in long noncoding RNAs. Brief Bioinform 2020; 20:1853-1864. [PMID: 30010717 PMCID: PMC6917221 DOI: 10.1093/bib/bby055] [Citation(s) in RCA: 173] [Impact Index Per Article: 43.3] [Received: 04/09/2018] [Revised: 05/08/2018] [Indexed: 02/07/2023] Open
Abstract
Long noncoding RNAs (lncRNAs) are a group of transcripts that are longer than 200 nucleotides (nt) without coding potential. Over the past decade, tens of thousands of novel lncRNAs have been annotated in animal and plant genomes because of advanced high-throughput RNA sequencing technologies and with the aid of coding transcript classifiers. Further, a considerable number of reports have revealed the existence of stable, functional small peptides (also known as micropeptides), translated from lncRNAs. In this review, we discuss the methods of lncRNA classification, the investigations regarding their coding potential and the functional significance of the peptides they encode.
Affiliation(s)
- Seo-Won Choi
- Department of Life Science, College of Natural Sciences, Hanyang University, Seoul 04763, Republic of Korea
| | - Hyun-Woo Kim
- Department of Life Science, College of Natural Sciences, Hanyang University, Seoul 04763, Republic of Korea
| | - Jin-Wu Nam
- Department of Life Science, College of Natural Sciences, Hanyang University, Seoul 04763, Republic of Korea
| |
|
86
|
Brunet MA, Leblanc S, Roucou X. Reconsidering proteomic diversity with functional investigation of small ORFs and alternative ORFs. Exp Cell Res 2020; 393:112057. [PMID: 32387289 DOI: 10.1016/j.yexcr.2020.112057] [Citation(s) in RCA: 14] [Impact Index Per Article: 3.5] [Received: 10/19/2019] [Revised: 04/21/2020] [Accepted: 05/02/2020] [Indexed: 12/13/2022]
Abstract
The discovery of functional yet non-annotated open reading frames (ORFs) throughout the genome of several species presents an unprecedented challenge in current genome annotation. These novel ORFs are shorter than annotated ones and many can be found on the same RNA, in opposition to current assumptions in annotation methodologies. Whilst the literature lacks consensus, these novel ORFs are commonly referred to as small ORFs (sORFs) or alternative ORFs (alt-ORFs). Unannotated ORFs represent an overlooked layer of complexity in the coding potential of genomes and are transforming our current vision of the nature of coding genes. In this review, we outline what constitutes a sORF or an alt-ORF and emphasize differences between both nomenclatures. We then describe complementary large-scale methods to accurately discover novel ORFs as well as yield functional insights on the novel proteins they encode. While serendipitous discoveries highlighted the functional importance of some novel ORFs, omics methods facilitate and improve their characterization to better understand physiological and pathological pathways. Functional annotation of sORFs, alt-ORFs and their corresponding microproteins will likely help fundamental and clinical research.
Affiliation(s)
- Marie A Brunet
- Department of Biochemistry and Functional Genomics, Université de Sherbrooke, Sherbrooke, Québec, Canada; PROTEO, Quebec Network for Research on Protein Function, Structure, and Engineering, Canada
| | - Sebastien Leblanc
- Department of Biochemistry and Functional Genomics, Université de Sherbrooke, Sherbrooke, Québec, Canada; PROTEO, Quebec Network for Research on Protein Function, Structure, and Engineering, Canada
| | - Xavier Roucou
- Department of Biochemistry and Functional Genomics, Université de Sherbrooke, Sherbrooke, Québec, Canada; PROTEO, Quebec Network for Research on Protein Function, Structure, and Engineering, Canada
|
87
|
Kiniry SJ, Michel AM, Baranov PV. Computational methods for ribosome profiling data analysis. Wiley Interdiscip Rev RNA 2020; 11:e1577. [PMID: 31760685 DOI: 10.1002/wrna.1577] [Citation(s) in RCA: 20] [Impact Index Per Article: 5.0] [Received: 07/30/2019] [Revised: 10/12/2019] [Accepted: 10/16/2019] [Indexed: 12/15/2022]
Abstract
Since the introduction of the ribosome profiling technique in 2009 its popularity has greatly increased. It is widely used for the comprehensive assessment of gene expression and for studying the mechanisms of regulation at the translational level. As the number of ribosome profiling datasets being produced continues to grow, so too does the need for reliable software that can provide answers to the biological questions it can address. This review describes the computational methods and tools that have been developed to analyze ribosome profiling data at the different stages of the process. It starts with initial routine processing of raw data and follows with more specific tasks such as the identification of translated open reading frames, differential gene expression analysis, or evaluation of local or global codon decoding rates. The review pinpoints challenges associated with each step and explains the ways in which they are currently addressed. In addition, it provides a comprehensive, albeit incomplete, list of publicly available software applicable to each step, which may be a beneficial starting point to those unexposed to ribosome profiling analysis. The outline of current challenges in ribosome profiling data analysis may inspire computational biologists to search for novel, potentially superior, solutions that will improve and expand the bioinformatician's toolbox for ribosome profiling data analysis. This article is categorized under: Translation > Ribosome Structure/Function; RNA Evolution and Genomics > Computational Analyses of RNA; Translation > Translation Mechanisms; Translation > Translation Regulation.
Affiliation(s)
- Stephen J Kiniry
- School of Biochemistry and Cell Biology, University College Cork, Cork, Ireland
| | - Audrey M Michel
- School of Biochemistry and Cell Biology, University College Cork, Cork, Ireland
| | - Pavel V Baranov
- School of Biochemistry and Cell Biology, University College Cork, Cork, Ireland
- Shemyakin-Ovchinnikov Institute of Bioorganic Chemistry, RAS, Moscow, Russia
|
88
|
Recent advances in ribosome profiling for deciphering translational regulation. Methods 2020; 176:46-54. [DOI: 10.1016/j.ymeth.2019.05.011] [Citation(s) in RCA: 11] [Impact Index Per Article: 2.8] [Received: 12/04/2018] [Revised: 05/02/2019] [Accepted: 05/15/2019] [Indexed: 12/16/2022]
|
89
|
Orr MW, Mao Y, Storz G, Qian SB. Alternative ORFs and small ORFs: shedding light on the dark proteome. Nucleic Acids Res 2020; 48:1029-1042. [PMID: 31504789 DOI: 10.1093/nar/gkz734] [Citation(s) in RCA: 146] [Impact Index Per Article: 36.5] [Received: 07/04/2019] [Revised: 08/03/2019] [Accepted: 08/15/2019] [Indexed: 02/06/2023]
Abstract
Traditional annotation of protein-encoding genes relied on assumptions, such as one open reading frame (ORF) encodes one protein and minimal lengths for translated proteins. With the serendipitous discoveries of translated ORFs encoded upstream and downstream of annotated ORFs, from alternative start sites nested within annotated ORFs and from RNAs previously considered noncoding, it is becoming clear that these initial assumptions are incorrect. The findings have led to the realization that genetic information is more densely coded and that the proteome is more complex than previously anticipated. As such, interest in the identification and characterization of the previously ignored 'dark proteome' is increasing, though we note that research in eukaryotes and bacteria has largely progressed in isolation. To bridge this gap and illustrate exciting findings emerging from studies of the dark proteome, we highlight recent advances in both eukaryotic and bacterial cells. We discuss progress in the detection of alternative ORFs as well as in the understanding of functions and the regulation of their expression and posit questions for future work.
Affiliation(s)
- Mona Wu Orr
- Division of Molecular and Cellular Biology, Eunice Kennedy Shriver National Institute of Child Health and Human Development, Bethesda, MD 20892, USA
| | - Yuanhui Mao
- Division of Nutritional Sciences, Cornell University, Ithaca, NY 14853, USA
| | - Gisela Storz
- Division of Molecular and Cellular Biology, Eunice Kennedy Shriver National Institute of Child Health and Human Development, Bethesda, MD 20892, USA
| | - Shu-Bing Qian
- Division of Nutritional Sciences, Cornell University, Ithaca, NY 14853, USA
|
90
|
Chen J, Brunner AD, Cogan JZ, Nuñez JK, Fields AP, Adamson B, Itzhak DN, Li JY, Mann M, Leonetti MD, Weissman JS. Pervasive functional translation of noncanonical human open reading frames. Science 2020; 367:1140-1146. [PMID: 32139545 DOI: 10.1126/science.aay0262] [Citation(s) in RCA: 347] [Impact Index Per Article: 86.8] [Received: 05/26/2019] [Revised: 11/22/2019] [Accepted: 01/13/2020] [Indexed: 12/12/2022]
Abstract
Ribosome profiling has revealed pervasive but largely uncharacterized translation outside of canonical coding sequences (CDSs). In this work, we exploit a systematic CRISPR-based screening strategy to identify hundreds of noncanonical CDSs that are essential for cellular growth and whose disruption elicits specific, robust transcriptomic and phenotypic changes in human cells. Functional characterization of the encoded microproteins reveals distinct cellular localizations, specific protein binding partners, and hundreds of microproteins that are presented by the human leukocyte antigen system. We find multiple microproteins encoded in upstream open reading frames, which form stable complexes with the main, canonical protein encoded on the same messenger RNA, thereby revealing the use of functional bicistronic operons in mammals. Together, our results point to a family of functional human microproteins that play critical and diverse cellular roles.
Affiliation(s)
- Jin Chen
- Department of Cellular and Molecular Pharmacology, University of California, San Francisco, CA 94158, USA; Howard Hughes Medical Institute, University of California, San Francisco, CA 94158, USA
| | - Andreas-David Brunner
- Department of Proteomics and Signal Transduction, Max Planck Institute of Biochemistry, Martinsried 82152, Germany
| | - J Zachery Cogan
- Department of Cellular and Molecular Pharmacology, University of California, San Francisco, CA 94158, USA; Howard Hughes Medical Institute, University of California, San Francisco, CA 94158, USA
| | - James K Nuñez
- Department of Cellular and Molecular Pharmacology, University of California, San Francisco, CA 94158, USA; Howard Hughes Medical Institute, University of California, San Francisco, CA 94158, USA
| | - Alexander P Fields
- Department of Cellular and Molecular Pharmacology, University of California, San Francisco, CA 94158, USA; Howard Hughes Medical Institute, University of California, San Francisco, CA 94158, USA
| | - Britt Adamson
- Department of Cellular and Molecular Pharmacology, University of California, San Francisco, CA 94158, USA; Howard Hughes Medical Institute, University of California, San Francisco, CA 94158, USA
| | - Daniel N Itzhak
- Cell Atlas Initiative, Chan Zuckerberg Biohub, San Francisco, CA 94158, USA
| | - Jason Y Li
- Cell Atlas Initiative, Chan Zuckerberg Biohub, San Francisco, CA 94158, USA
| | - Matthias Mann
- Department of Proteomics and Signal Transduction, Max Planck Institute of Biochemistry, Martinsried 82152, Germany; Clinical Proteomics Group, Proteomics Program, Novo Nordisk Foundation Center for Protein Research, University of Copenhagen, Copenhagen 2200, Denmark
| | - Manuel D Leonetti
- Cell Atlas Initiative, Chan Zuckerberg Biohub, San Francisco, CA 94158, USA
| | - Jonathan S Weissman
- Department of Cellular and Molecular Pharmacology, University of California, San Francisco, CA 94158, USA; Howard Hughes Medical Institute, University of California, San Francisco, CA 94158, USA
|
91
|
Alexaki A, Kames J, Hettiarachchi GK, Athey JC, Katneni UK, Hunt RC, Hamasaki-Katagiri N, Holcomb DD, DiCuccio M, Bar H, Komar AA, Kimchi-Sarfaty C. Ribosome profiling of HEK293T cells overexpressing codon optimized coagulation factor IX. F1000Res 2020; 9:174. [PMID: 33014344 PMCID: PMC7509596 DOI: 10.12688/f1000research.22400.2] [Citation(s) in RCA: 1] [Impact Index Per Article: 0.3] [Accepted: 09/09/2020] [Indexed: 12/30/2022]
Abstract
Ribosome profiling provides the opportunity to evaluate translation kinetics at codon level resolution. Here, we describe ribosome profiling data, generated from two HEK293T cell lines. The ribosome profiling data are composed of Ribo-seq (mRNA sequencing data from ribosome protected fragments) and RNA-seq data (total RNA sequencing). The two HEK293T cell lines each express a version of the F9 gene, both of which are translated into identical proteins in terms of their amino acid sequences. However, these F9 genes vary drastically in their codon usage and predicted mRNA structure. We also provide the pipeline that we used to analyze the data. Further analyzing this dataset holds great potential as it can be used i) to unveil insights into the composition and regulation of the transcriptome, ii) for comparison with other ribosome profiling datasets, iii) to measure the rate of protein synthesis across the proteome and identify differences in elongation rates, iv) to discover previously unidentified translation of peptides, v) to explore the effects of codon usage or codon context in translational kinetics and vi) to investigate cotranslational folding. Importantly, a unique feature of this dataset, compared to other available ribosome profiling data, is the presence of the F9 gene in two very distinct coding sequences.
Affiliation(s)
- Aikaterini Alexaki
- Center for Biologics Evaluation and Research, Food and Drug Administration, USA, Silver Spring, MD, 20993, USA
| | - Jacob Kames
- Center for Biologics Evaluation and Research, Food and Drug Administration, USA, Silver Spring, MD, 20993, USA
| | - Gaya K Hettiarachchi
- Center for Biologics Evaluation and Research, Food and Drug Administration, USA, Silver Spring, MD, 20993, USA
| | - John C Athey
- Center for Biologics Evaluation and Research, Food and Drug Administration, USA, Silver Spring, MD, 20993, USA
| | - Upendra K Katneni
- Center for Biologics Evaluation and Research, Food and Drug Administration, USA, Silver Spring, MD, 20993, USA
| | - Ryan C Hunt
- Center for Biologics Evaluation and Research, Food and Drug Administration, USA, Silver Spring, MD, 20993, USA
| | - Nobuko Hamasaki-Katagiri
- Center for Biologics Evaluation and Research, Food and Drug Administration, USA, Silver Spring, MD, 20993, USA
| | - David D Holcomb
- Center for Biologics Evaluation and Research, Food and Drug Administration, USA, Silver Spring, MD, 20993, USA
| | - Michael DiCuccio
- National Center of Biotechnology Information, National Institutes of Health, USA, Bethesda, MD, 20892, USA
| | - Haim Bar
- Department of Statistics, University of Connecticut, Storrs, CT, 06269, USA
| | - Anton A Komar
- Center for Gene Regulation in Health and Disease, Cleveland State University, Cleveland, OH, 44115, USA
| | - Chava Kimchi-Sarfaty
- Center for Biologics Evaluation and Research, Food and Drug Administration, USA, Silver Spring, MD, 20993, USA
|
92
|
Gusic M, Prokisch H. ncRNAs: New Players in Mitochondrial Health and Disease? Front Genet 2020; 11:95. [PMID: 32180794 PMCID: PMC7059738 DOI: 10.3389/fgene.2020.00095] [Citation(s) in RCA: 52] [Impact Index Per Article: 13.0] [Received: 11/05/2019] [Accepted: 01/28/2020] [Indexed: 12/19/2022]
Abstract
The regulation of mitochondrial proteome is unique in that its components have origins in both mitochondria and nucleus. With the development of OMICS technologies, emerging evidence indicates an interaction between mitochondria and nucleus based not only on the proteins but also on the non-coding RNAs (ncRNAs). It is now accepted that large parts of the non‐coding genome are transcribed into various ncRNA species. Although their characterization has been a hot topic in recent years, the function of the majority remains unknown. Recently, ncRNA species microRNA (miRNA) and long-non coding RNAs (lncRNA) have been gaining attention as direct or indirect modulators of the mitochondrial proteome homeostasis. These ncRNA can impact mitochondria indirectly by affecting transcripts encoding for mitochondrial proteins in the cytoplasm. Furthermore, reports of mitochondria-localized miRNAs, termed mitomiRs, and lncRNAs directly regulating mitochondrial gene expression suggest the import of RNA to mitochondria, but also transcription from the mitochondrial genome. Interestingly, ncRNAs have been also shown to hide small open reading frames (sORFs) encoding for small functional peptides termed micropeptides, with several examples reported with a role in mitochondria. In this review, we provide a literature overview on ncRNAs and micropeptides found to be associated with mitochondrial biology in the context of both health and disease. Although reported, small study overlap and rare replications by other groups make the presence, transport, and role of ncRNA in mitochondria an attractive, but still challenging subject. Finally, we touch the topic of their potential as prognosis markers and therapeutic targets.
Affiliation(s)
- Mirjana Gusic
- Institute of Human Genetics, Helmholtz Zentrum München, Neuherberg, Germany; DZHK (German Centre for Cardiovascular Research), partner site Munich Heart Alliance, Munich, Germany; Institute of Human Genetics, Technical University of Munich, Munich, Germany
| | - Holger Prokisch
- Institute of Human Genetics, Helmholtz Zentrum München, Neuherberg, Germany; Institute of Human Genetics, Technical University of Munich, Munich, Germany
|
93
|
Finkel Y, Schmiedel D, Tai-Schmiedel J, Nachshon A, Winkler R, Dobesova M, Schwartz M, Mandelboim O, Stern-Ginossar N. Comprehensive annotations of human herpesvirus 6A and 6B genomes reveal novel and conserved genomic features. eLife 2020; 9:e50960. [PMID: 31944176 PMCID: PMC6964970 DOI: 10.7554/elife.50960] [Citation(s) in RCA: 25] [Impact Index Per Article: 6.3] [Received: 08/08/2019] [Accepted: 11/27/2019] [Indexed: 12/14/2022]
Abstract
Human herpesvirus-6 (HHV-6) A and B are ubiquitous betaherpesviruses, infecting the majority of the human population. They encompass large genomes and our understanding of their protein coding potential is far from complete. Here, we employ ribosome-profiling and systematic transcript-analysis to experimentally define HHV-6 translation products. We identify hundreds of new open reading frames (ORFs), including upstream ORFs (uORFs) and internal ORFs (iORFs), generating a complete unbiased atlas of HHV-6 proteome. By integrating systematic data from the prototypic betaherpesvirus, human cytomegalovirus, we uncover numerous uORFs and iORFs conserved across betaherpesviruses and we show uORFs are enriched in late viral genes. We identified three highly abundant HHV-6 encoded long non-coding RNAs, one of which generates a non-polyadenylated stable intron appearing to be a conserved feature of betaherpesviruses. Overall, our work reveals the complexity of HHV-6 genomes and highlights novel features conserved between betaherpesviruses, providing a rich resource for future functional studies.
Affiliation(s)
- Yaara Finkel
- Department of Molecular Genetics, Weizmann Institute of Science, Rehovot, Israel
| | - Dominik Schmiedel
- The Lautenberg Center for General and Tumor Immunology, Institute for Medical Research Israel-Canada, The Hebrew University Hadassah Medical School, Jerusalem, Israel
| | - Aharon Nachshon
- Department of Molecular Genetics, Weizmann Institute of Science, Rehovot, Israel
| | - Roni Winkler
- Department of Molecular Genetics, Weizmann Institute of Science, Rehovot, Israel
| | - Martina Dobesova
- Department of Molecular Genetics, Weizmann Institute of Science, Rehovot, Israel
| | - Michal Schwartz
- Department of Molecular Genetics, Weizmann Institute of Science, Rehovot, Israel
| | - Ofer Mandelboim
- The Lautenberg Center for General and Tumor Immunology, Institute for Medical Research Israel-Canada, The Hebrew University Hadassah Medical School, Jerusalem, Israel
|
94
|
Vaklavas C, Blume SW, Grizzle WE. Hallmarks and Determinants of Oncogenic Translation Revealed by Ribosome Profiling in Models of Breast Cancer. Transl Oncol 2020; 13:452-470. [PMID: 31911279 PMCID: PMC6948383 DOI: 10.1016/j.tranon.2019.12.002] [Citation(s) in RCA: 6] [Impact Index Per Article: 1.5] [Received: 08/28/2019] [Revised: 11/28/2019] [Accepted: 12/01/2019] [Indexed: 12/21/2022]
Abstract
Gene expression is extensively and dynamically modulated at the level of translation. How cancer cells prioritize the translation of certain mRNAs over others from a pool of competing mRNAs remains an open question. Here, we analyze translation in cell line models of breast cancer and normal mammary tissue by ribosome profiling. We identify key recurrent themes of oncogenic translation: higher ribosome occupancy, greater variance of translational efficiencies, and preferential translation of transcriptional regulators and signaling proteins in malignant cells as compared with their nonmalignant counterpart. We survey for candidate RNA interacting proteins that could associate with the 5′untranslated regions of the transcripts preferentially translated in breast tumour cells. We identify SRSF1, a prototypic splicing factor, to have a pervasive direct and indirect impact on translation. In a representative estrogen receptor–positive and estrogen receptor–negative cell line, we find that protein synthesis relies heavily on SRSF1. SRSF1 is predominantly intranuclear. Under certain conditions, SRSF1 translocates from the nucleus to the cytoplasm where it associates with MYC and CDK1 mRNAs and upregulates their internal ribosome entry site–mediated translation. Our results point to a synergy between splicing and translation and unveil how certain RNA-binding proteins modulate the translational landscape in breast cancer.
Affiliation(s)
- Christos Vaklavas
- Department of Medicine, Division of Hematology / Oncology, University of Alabama at Birmingham, Birmingham, AL 35294, USA
| | - Scott W Blume
- Department of Medicine, Division of Hematology / Oncology, University of Alabama at Birmingham, Birmingham, AL 35294, USA
| | - William E Grizzle
- Department of Pathology, O'Neal Comprehensive Cancer Centre, University of Alabama at Birmingham, Birmingham, AL 35294, USA
|
95
|
Clauwaert J, Menschaert G, Waegeman W. DeepRibo: a neural network for precise gene annotation of prokaryotes by combining ribosome profiling signal and binding site patterns. Nucleic Acids Res 2019; 47:e36. [PMID: 30753697 PMCID: PMC6451124 DOI: 10.1093/nar/gkz061] [Citation(s) in RCA: 43] [Impact Index Per Article: 8.6] [Received: 09/26/2018] [Revised: 01/02/2019] [Accepted: 01/30/2019] [Indexed: 12/13/2022]
Abstract
Annotation of gene expression in prokaryotes often finds itself corrected due to small variations of the annotated gene regions observed between different (sub)-species. It has become apparent that traditional sequence alignment algorithms, used for the curation of genomes, are not able to map the full complexity of the genomic landscape. We present DeepRibo, a novel neural network utilizing features extracted from ribosome profiling information and binding site sequence patterns that shows to be a precise tool for the delineation and annotation of expressed genes in prokaryotes. The neural network combines recurrent memory cells and convolutional layers, adapting the information gained from both the high-throughput ribosome profiling data and ribosome binding translation initiation sequence region into one model. DeepRibo is designed as a single model trained on a variety of ribosome profiling experiments, used for the identification of open reading frames in prokaryotes without a priori knowledge of the translational landscape. Through extensive validation of the model trained on various sets of data, multiple species sequence similarity, mass spectrometry and Edman degradation verified proteins, the effectiveness of DeepRibo is highlighted.
Affiliation(s)
- Jim Clauwaert
- KERMIT, Department of Data Analysis and Mathematical Modelling, Ghent University, Coupure Links 653, 9000 Gent, Belgium
| | - Gerben Menschaert
- Biobix, Department of Data Analysis and Mathematical Modelling, Ghent University, Coupure Links 653, 9000 Gent, Belgium
| | - Willem Waegeman
- KERMIT, Department of Data Analysis and Mathematical Modelling, Ghent University, Coupure Links 653, 9000 Gent, Belgium
|
96
|
Reisacher C, Arbibe L. Not lost in host translation: The new roles of long noncoding RNAs in infectious diseases. Cell Microbiol 2019; 21:e13119. [PMID: 31634981 DOI: 10.1111/cmi.13119] [Citation(s) in RCA: 2] [Impact Index Per Article: 0.4] [Received: 05/13/2019] [Revised: 09/10/2019] [Accepted: 09/17/2019] [Indexed: 12/20/2022]
Abstract
Long non-coding RNAs (lncRNAs) play a central role in the regulation of gene expression. Although they were initially described as mRNA-like transcripts not encoding proteins, global approaches such as ribosome profiling have shown that they frequently associate with ribosomes, opening the possibility that lncRNAs are a source of cryptic translation events with functional roles. Recent studies have shed more light on small ORFs borne by lncRNAs and encoding short peptides potentially involved in infectious immunity. This review outlines the main strategies used to determine the coding potential of lncRNAs and discusses our emerging understanding of the implication of the encoded peptides in infectious diseases.
Affiliation(s)
- Caroline Reisacher
- Department of Immunology, Infectiology and Hematology, Institut Necker-Enfants Malades (INEM), INSERM U1151, CNRS UMR 8253, Université Paris Descartes, Paris, France
| | - Laurence Arbibe
- Department of Immunology, Infectiology and Hematology, Institut Necker-Enfants Malades (INEM), INSERM U1151, CNRS UMR 8253, Université Paris Descartes, Paris, France
|
97
|
Mudge JM, Jungreis I, Hunt T, Gonzalez JM, Wright JC, Kay M, Davidson C, Fitzgerald S, Seal R, Tweedie S, He L, Waterhouse RM, Li Y, Bruford E, Choudhary JS, Frankish A, Kellis M. Discovery of high-confidence human protein-coding genes and exons by whole-genome PhyloCSF helps elucidate 118 GWAS loci. Genome Res 2019; 29:2073-2087. [PMID: 31537640 PMCID: PMC6886504 DOI: 10.1101/gr.246462.118] [Citation(s) in RCA: 44] [Impact Index Per Article: 8.8] [Received: 11/15/2018] [Accepted: 09/09/2019] [Indexed: 12/15/2022]
Abstract
The most widely appreciated role of DNA is to encode protein, yet the exact portion of the human genome that is translated remains to be ascertained. We previously developed PhyloCSF, a widely used tool to identify evolutionary signatures of protein-coding regions using multispecies genome alignments. Here, we present the first whole-genome PhyloCSF prediction tracks for human, mouse, chicken, fly, worm, and mosquito. We develop a workflow that uses machine learning to predict novel conserved protein-coding regions and efficiently guide their manual curation. We analyze more than 1000 high-scoring human PhyloCSF regions and confidently add 144 conserved protein-coding genes to the GENCODE gene set, as well as additional coding regions within 236 previously annotated protein-coding genes, and 169 pseudogenes, most of them disabled after primates diverged. The majority of these represent new discoveries, including 70 previously undetected protein-coding genes. The novel coding genes are additionally supported by single-nucleotide variant evidence indicative of continued purifying selection in the human lineage, coding-exon splicing evidence from new GENCODE transcripts using next-generation transcriptomic data sets, and mass spectrometry evidence of translation for several new genes. Our discoveries required simultaneous comparative annotation of other vertebrate genomes, which we show is essential to remove spurious ORFs and to distinguish coding from pseudogene regions. Our new coding regions help elucidate disease-associated regions by revealing that 118 GWAS variants previously thought to be noncoding are in fact protein altering. Altogether, our PhyloCSF data sets and algorithms will help researchers seeking to interpret these genomes, while our new annotations present exciting loci for further experimental characterization.
Affiliation(s)
- Jonathan M Mudge
- European Molecular Biology Laboratory, European Bioinformatics Institute, Wellcome Genome Campus, Hinxton, Cambridge CB10 1SD, United Kingdom
| | - Irwin Jungreis
- MIT Computer Science and Artificial Intelligence Laboratory, Cambridge, Massachusetts 02139, USA; Broad Institute of MIT and Harvard, Cambridge, Massachusetts 02142, USA
| | - Toby Hunt
- European Molecular Biology Laboratory, European Bioinformatics Institute, Wellcome Genome Campus, Hinxton, Cambridge CB10 1SD, United Kingdom
| | - Jose Manuel Gonzalez
- European Molecular Biology Laboratory, European Bioinformatics Institute, Wellcome Genome Campus, Hinxton, Cambridge CB10 1SD, United Kingdom
| | - James C Wright
- Functional Proteomics, Division of Cancer Biology, Institute of Cancer Research, London SW7 3RP, United Kingdom
| | - Mike Kay
- European Molecular Biology Laboratory, European Bioinformatics Institute, Wellcome Genome Campus, Hinxton, Cambridge CB10 1SD, United Kingdom
| | - Claire Davidson
- European Molecular Biology Laboratory, European Bioinformatics Institute, Wellcome Genome Campus, Hinxton, Cambridge CB10 1SD, United Kingdom
| | - Stephen Fitzgerald
- Wellcome Trust Sanger Institute, Hinxton, Cambridge CB10 1SA, United Kingdom
| | - Ruth Seal
- European Molecular Biology Laboratory, European Bioinformatics Institute, Wellcome Genome Campus, Hinxton, Cambridge CB10 1SD, United Kingdom; Department of Haematology, University of Cambridge, Cambridge CB2 0PT, United Kingdom
| | - Susan Tweedie
- European Molecular Biology Laboratory, European Bioinformatics Institute, Wellcome Genome Campus, Hinxton, Cambridge CB10 1SD, United Kingdom
| | - Liang He
- MIT Computer Science and Artificial Intelligence Laboratory, Cambridge, Massachusetts 02139, USA; Broad Institute of MIT and Harvard, Cambridge, Massachusetts 02142, USA
| | - Robert M Waterhouse
- Department of Ecology and Evolution, University of Lausanne, Lausanne 1015, Switzerland; Swiss Institute of Bioinformatics, Lausanne 1015, Switzerland
| | - Yue Li
- MIT Computer Science and Artificial Intelligence Laboratory, Cambridge, Massachusetts 02139, USA; Broad Institute of MIT and Harvard, Cambridge, Massachusetts 02142, USA
| | - Elspeth Bruford
- European Molecular Biology Laboratory, European Bioinformatics Institute, Wellcome Genome Campus, Hinxton, Cambridge CB10 1SD, United Kingdom; Department of Haematology, University of Cambridge, Cambridge CB2 0PT, United Kingdom
| | - Jyoti S Choudhary
- Functional Proteomics, Division of Cancer Biology, Institute of Cancer Research, London SW7 3RP, United Kingdom
| | - Adam Frankish
- European Molecular Biology Laboratory, European Bioinformatics Institute, Wellcome Genome Campus, Hinxton, Cambridge CB10 1SD, United Kingdom
| | - Manolis Kellis
- MIT Computer Science and Artificial Intelligence Laboratory, Cambridge, Massachusetts 02139, USA; Broad Institute of MIT and Harvard, Cambridge, Massachusetts 02142, USA
|
98
|
Wu HYL, Song G, Walley JW, Hsu PY. The Tomato Translational Landscape Revealed by Transcriptome Assembly and Ribosome Profiling. Plant Physiol 2019; 181:367-380. [PMID: 31248964 PMCID: PMC6716236 DOI: 10.1104/pp.19.00541] [Citation(s) in RCA: 37] [Impact Index Per Article: 7.4] [Received: 05/03/2019] [Accepted: 06/10/2019] [Indexed: 05/14/2023]
Abstract
Recent applications of translational control in Arabidopsis (Arabidopsis thaliana) highlight the potential power of manipulating mRNA translation for crop improvement. However, to what extent translational regulation is conserved between Arabidopsis and other species is largely unknown, and the translatome of most crops remains poorly studied. Here, we combined de novo transcriptome assembly and ribosome profiling to study global mRNA translation in tomato (Solanum lycopersicum) roots. Exploiting features corresponding to active translation, we discovered widespread unannotated translation events, including 1,329 upstream open reading frames (uORFs) within the 5' untranslated regions of annotated coding genes and 354 small ORFs (sORFs) among unannotated transcripts. uORFs may repress translation of their downstream main ORFs, whereas sORFs may encode signaling peptides. Besides evolutionarily conserved sORFs, we uncovered 96 Solanaceae-specific sORFs, revealing the importance of studying translatomes directly in crops. Proteomic analysis confirmed that some of the unannotated ORFs generate stable proteins in planta. In addition to defining the translatome, our results reveal the global regulation by uORFs and microRNAs. Despite diverging over 100 million years ago, many translational features are well conserved between Arabidopsis and tomato. Thus, our approach provides a high-throughput method to discover unannotated ORFs, elucidates evolutionarily conserved and unique translational features, and identifies regulatory mechanisms hidden in a crop genome.
Affiliation(s)
- Hsin-Yen Larry Wu
- Department of Biochemistry and Molecular Biology, Michigan State University, East Lansing, Michigan 48824
| | - Gaoyuan Song
- Department of Plant Pathology and Microbiology, Iowa State University, Ames, Iowa 50011
| | - Justin W Walley
- Department of Plant Pathology and Microbiology, Iowa State University, Ames, Iowa 50011
| | - Polly Yingshan Hsu
- Department of Biochemistry and Molecular Biology, Michigan State University, East Lansing, Michigan 48824
|
99
|
Xu Z, Hu L, Shi B, Geng S, Xu L, Wang D, Lu ZJ. Ribosome elongating footprints denoised by wavelet transform comprehensively characterize dynamic cellular translation events. Nucleic Acids Res 2019; 46:e109. [PMID: 29945224 PMCID: PMC6182183 DOI: 10.1093/nar/gky533] [Citation(s) in RCA: 33] [Impact Index Per Article: 6.6] [Received: 01/24/2018] [Accepted: 05/31/2018] [Indexed: 02/06/2023]
Abstract
Translation is dynamically regulated during cell development and stress response. In order to detect actively translated open reading frames (ORFs) and dynamic cellular translation events, we have developed a computational method, RiboWave, to process ribosome profiling data. RiboWave utilizes wavelet transform to denoise the original signal by extracting 3-nt periodicity of ribosomes and precisely locate their footprint denoted as Periodic Footprint P-site (PF P-site). Such high-resolution footprint is found to capture the full track of actively elongating ribosomes, from which translational landscape can be explicitly characterized. We compare RiboWave with several published methods, like RiboTaper, ORFscore and RibORF, and found that RiboWave outperforms them in both accuracy and usage when defining actively translated ORFs. Moreover, we show that PF P-site derived by RiboWave shows superior performance in characterizing the dynamics and complexity of cellular translatome by accurately estimating the abundance of protein levels, assessing differential translation and identifying dynamic translation frameshift.
Affiliation(s)
- Zhiyu Xu
- MOE Key Laboratory of Bioinformatics, Center for Synthetic and Systems Biology, School of Life Sciences, Tsinghua University, Beijing 100084, China
| | - Long Hu
- MOE Key Laboratory of Bioinformatics, Center for Synthetic and Systems Biology, School of Life Sciences, Tsinghua University, Beijing 100084, China
| | - Binbin Shi
- MOE Key Laboratory of Bioinformatics, Center for Synthetic and Systems Biology, School of Life Sciences, Tsinghua University, Beijing 100084, China
| | - SiSi Geng
- MOE Key Laboratory of Bioinformatics, Center for Synthetic and Systems Biology, School of Life Sciences, Tsinghua University, Beijing 100084, China
| | - Longchen Xu
- MOE Key Laboratory of Bioinformatics, Center for Synthetic and Systems Biology, School of Life Sciences, Tsinghua University, Beijing 100084, China
| | - Dong Wang
- MOE Key Laboratory of Bioinformatics, Center for Synthetic and Systems Biology, School of Life Sciences, Tsinghua University, Beijing 100084, China
| | - Zhi J Lu
- MOE Key Laboratory of Bioinformatics, Center for Synthetic and Systems Biology, School of Life Sciences, Tsinghua University, Beijing 100084, China
|
100
|
Michel AM, Kiniry SJ, O'Connor PBF, Mullan JP, Baranov PV. GWIPS-viz: 2018 update. Nucleic Acids Res 2019; 46:D823-D830. [PMID: 28977460 PMCID: PMC5753223 DOI: 10.1093/nar/gkx790] [Citation(s) in RCA: 39] [Impact Index Per Article: 7.8] [Received: 08/08/2017] [Accepted: 08/29/2017] [Indexed: 12/15/2022]
Abstract
The GWIPS-viz browser (http://gwips.ucc.ie/) is an on-line genome browser which is tailored for exploring ribosome profiling (Ribo-seq) data. Since its publication in 2014, GWIPS-viz provides Ribo-seq data for an additional 14 genomes bringing the current total to 23. The integration of new Ribo-seq data has been automated thereby increasing the number of available tracks to 1792, a 10-fold increase in the last three years. The increase is particularly substantial for data derived from human sources. Following user requests, we added the functionality to download these tracks in bigWig format. We also incorporated new types of data (e.g. TCP-seq) as well as auxiliary tracks from other sources that help with the interpretation of Ribo-seq data. Improvements in the visualization of the data have been carried out particularly for bacterial genomes where the Ribo-seq data are now shown in a strand specific manner. For higher eukaryotic datasets, we provide characteristics of individual datasets using the RUST program which includes the triplet periodicity, sequencing biases and relative inferred A-site dwell times. This information can be used for assessing the quality of Ribo-seq datasets. To improve the power of the signal, we aggregate Ribo-seq data from several studies into Global aggregate tracks for each genome.
Affiliation(s)
- Audrey M Michel
- School of Biochemistry and Cell Biology, University College Cork, Cork, Ireland
| | - Stephen J Kiniry
- School of Biochemistry and Cell Biology, University College Cork, Cork, Ireland
| | - James P Mullan
- School of Biochemistry and Cell Biology, University College Cork, Cork, Ireland
| | - Pavel V Baranov
- School of Biochemistry and Cell Biology, University College Cork, Cork, Ireland
|