1
Tidu A, Alghoul F, Despons L, Eriani G, Martin F. Critical cis-parameters influence STructure assisted RNA translation (START) initiation on non-AUG codons in eukaryotes. NAR Genom Bioinform 2024; 6:lqae065. [PMID: 38863530 PMCID: PMC11165317 DOI: 10.1093/nargab/lqae065] [Received: 03/06/2024] [Revised: 04/18/2024] [Accepted: 05/23/2024] [Indexed: 06/13/2024] Open
Abstract
In eukaryotes, translation initiation is a highly regulated process, which combines cis-regulatory sequences located on the messenger RNA along with trans-acting factors like eukaryotic initiation factors (eIF). One critical step of translation initiation is the start codon recognition by the scanning 43S particle, which leads to ribosome assembly and protein synthesis. In this study, we investigated the involvement of secondary structures downstream of the initiation codon in the so-called START (STructure-Assisted RNA translation) mechanism on AUG and non-AUG translation initiation. The results demonstrate that downstream secondary structures can efficiently promote non-AUG translation initiation if they are sufficiently stable to stall a scanning 43S particle and if they are located at an optimal distance from non-AUG codons to stabilize the codon-anticodon base pairing in the P site. The required stability of the downstream structure for efficient translation initiation varies in distinct cell types. We extended this study to genome-wide analysis of functionally characterized alternative translation initiation sites in Homo sapiens. This analysis revealed that about 25% of these sites have an optimally located downstream secondary structure of adequate stability which could elicit START, regardless of the start codon. We validated the impact of these structures on translation initiation for several selected uORFs.
Affiliation(s)
- Antonin Tidu
  - Université de Strasbourg, Institut de Biologie Moléculaire et Cellulaire, Architecture et Réactivité de l’ARN, CNRS UPR9002, 2 allée Konrad Roentgen, F-67084 Strasbourg, France
- Fatima Alghoul
  - Université de Strasbourg, Institut de Biologie Moléculaire et Cellulaire, Architecture et Réactivité de l’ARN, CNRS UPR9002, 2 allée Konrad Roentgen, F-67084 Strasbourg, France
- Laurence Despons
  - Université de Strasbourg, Institut de Biologie Moléculaire et Cellulaire, Architecture et Réactivité de l’ARN, CNRS UPR9002, 2 allée Konrad Roentgen, F-67084 Strasbourg, France
- Gilbert Eriani
  - Université de Strasbourg, Institut de Biologie Moléculaire et Cellulaire, Architecture et Réactivité de l’ARN, CNRS UPR9002, 2 allée Konrad Roentgen, F-67084 Strasbourg, France
- Franck Martin
  - Université de Strasbourg, Institut de Biologie Moléculaire et Cellulaire, Architecture et Réactivité de l’ARN, CNRS UPR9002, 2 allée Konrad Roentgen, F-67084 Strasbourg, France
2
Dasgupta A, Prensner JR. Upstream open reading frames: new players in the landscape of cancer gene regulation. NAR Cancer 2024; 6:zcae023. [PMID: 38774471 PMCID: PMC11106035 DOI: 10.1093/narcan/zcae023] [Received: 12/07/2023] [Revised: 04/29/2024] [Accepted: 05/07/2024] [Indexed: 05/24/2024] Open
Abstract
The translation of RNA by ribosomes represents a central biological process and one of the most dysregulated processes in cancer. While translation is traditionally thought to occur exclusively in the protein-coding regions of messenger RNAs (mRNAs), recent transcriptome-wide approaches have shown abundant ribosome activity across diverse stretches of RNA transcripts. The most common type of this kind of ribosome activity occurs in gene leader sequences, also known as 5' untranslated regions (UTRs) of the mRNA, that precede the main coding sequence. Translation of these upstream open reading frames (uORFs) is now known to occur in upwards of 25% of all protein-coding genes. With diverse functions from RNA regulation to microprotein generation, uORFs are rapidly igniting a new arena of cancer biology, where they are linked to cancer genetics, cancer signaling, and tumor-immune interactions. This review focuses on the contributions of uORFs and their associated 5'UTR sequences to cancer biology.
Affiliation(s)
- Anwesha Dasgupta
  - Chad Carr Pediatric Brain Tumor Center, University of Michigan Medical School, Ann Arbor, MI 48109, USA
  - Department of Pediatrics, Division of Pediatric Hematology/Oncology, University of Michigan Medical School, Ann Arbor, MI 48109, USA
  - Department of Biological Chemistry, University of Michigan Medical School, Ann Arbor, MI 48109, USA
- John R Prensner
  - Chad Carr Pediatric Brain Tumor Center, University of Michigan Medical School, Ann Arbor, MI 48109, USA
  - Department of Pediatrics, Division of Pediatric Hematology/Oncology, University of Michigan Medical School, Ann Arbor, MI 48109, USA
  - Department of Biological Chemistry, University of Michigan Medical School, Ann Arbor, MI 48109, USA
3
Ly J, Xiang K, Su KC, Sissoko GB, Bartel DP, Cheeseman IM. Nuclear release of eIF1 globally increases stringency of start-codon selection to preserve mitotic arrest physiology. bioRxiv 2024:2024.04.06.588385. [PMID: 38617206 PMCID: PMC11014515 DOI: 10.1101/2024.04.06.588385] [Indexed: 04/16/2024]
Abstract
Regulated start-codon selection has the potential to reshape the proteome through the differential production of uORFs, canonical proteins, and alternative translational isoforms. However, conditions under which start-codon selection is altered remain poorly defined. Here, using transcriptome-wide translation initiation site profiling, we reveal a global increase in the stringency of start-codon selection during mammalian mitosis. Low-efficiency initiation sites are preferentially repressed in mitosis, resulting in pervasive changes in the translation of thousands of start sites and their corresponding protein products. This increased stringency of start-codon selection during mitosis results from increased interactions between the key regulator of start-codon selection, eIF1, and the 40S ribosome. We find that increased eIF1-40S ribosome interactions during mitosis are mediated by the release of a nuclear pool of eIF1 upon nuclear envelope breakdown. Selectively depleting the nuclear pool of eIF1 eliminates the changes to translational stringency during mitosis, resulting in altered mitotic proteome composition. In addition, preventing mitotic translational rewiring results in substantially increased cell death and decreased mitotic slippage following treatment with anti-mitotic chemotherapeutics. Thus, cells globally control translation initiation stringency with critical roles during the mammalian cell cycle to preserve mitotic cell physiology.
4
Tierney JAS, Świrski M, Tjeldnes H, Mudge JM, Kufel J, Whiffin N, Valen E, Baranov PV. Ribosome decision graphs for the representation of eukaryotic RNA translation complexity. Genome Res 2024; 34:530-538. [PMID: 38719470 PMCID: PMC11146595 DOI: 10.1101/gr.278810.123] [Received: 12/04/2023] [Accepted: 04/01/2024] [Indexed: 05/21/2024]
Abstract
The application of ribosome profiling has revealed an unexpected abundance of translation in addition to that responsible for the synthesis of previously annotated protein-coding regions. Multiple short sequences have been found to be translated within single RNA molecules, within both annotated protein-coding and noncoding regions. The biological significance of this translation is a matter of intensive investigation. However, current schematic or annotation-based representations of mRNA translation generally do not account for the apparent multitude of translated regions within the same molecules. They also do not take into account the stochasticity of the process that allows alternative translations of the same RNA molecules by different ribosomes. There is a need for formal representations of mRNA complexity that would enable the analysis of quantitative information on translation and more accurate models for predicting the phenotypic effects of genetic variants affecting translation. To address this, we developed a conceptually novel abstraction that we term ribosome decision graphs (RDGs). RDGs represent translation as multiple ribosome paths through untranslated and translated mRNA segments. We termed the latter "translons." Nondeterministic events, such as initiation, reinitiation, selenocysteine insertion, or ribosomal frameshifting, are then represented as branching points. This representation allows for an adequate representation of eukaryotic translation complexity and focuses on locations critical for translation regulation. We show how RDGs can be used for depicting translated regions and for analyzing genetic variation and quantitative genome-wide data on translation for characterization of regulatory modulators of translation.
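As a rough illustration of the idea in this abstract (not the authors' implementation), translation of a single mRNA can be modelled as a small acyclic graph whose branch points are ribosome decisions and whose edges are untranslated or translated segments. The class names, node labels, and every probability below are hypothetical values chosen for a toy mRNA with one uORF upstream of the main CDS.

```python
from dataclasses import dataclass, field


@dataclass
class Branch:
    """One outcome at a decision point on the mRNA."""
    segment: str        # label of the segment the ribosome traverses
    translated: bool    # True when the segment is a translon
    probability: float  # illustrative chance of taking this branch
    target: str         # next decision point, or "END"


@dataclass
class DecisionGraph:
    branches: dict = field(default_factory=dict)  # node -> list of Branch

    def add(self, node, *options):
        self.branches.setdefault(node, []).extend(options)

    def paths(self, node, prob=1.0, trail=()):
        """Yield (probability, branches taken) for every path to END."""
        if node == "END":
            yield prob, trail
            return
        for b in self.branches[node]:
            yield from self.paths(b.target, prob * b.probability, trail + (b,))

    def translon_usage(self, start):
        """Sum path probabilities for each translated segment."""
        usage = {}
        for prob, trail in self.paths(start):
            for b in trail:
                if b.translated:
                    usage[b.segment] = usage.get(b.segment, 0.0) + prob
        return usage


# Toy mRNA: one uORF upstream of the main CDS (all numbers are made up).
rdg = DecisionGraph()
rdg.add("cap", Branch("5' leader", False, 1.0, "uORF_start"))
rdg.add("uORF_start",
        Branch("uORF", True, 0.3, "uORF_stop"),             # initiation
        Branch("leaky scanning", False, 0.7, "CDS_start"))  # start codon skipped
rdg.add("uORF_stop",
        Branch("resumed scanning", False, 0.5, "CDS_start"),  # reinitiation route
        Branch("dissociation", False, 0.5, "END"))
rdg.add("CDS_start", Branch("CDS", True, 1.0, "END"))

all_paths = list(rdg.paths("cap"))
usage = rdg.translon_usage("cap")
```

In this toy graph there are three ribosome paths, and summing their probabilities per translon gives the fraction of ribosomes expected to translate the uORF and the CDS. The sketch assumes the graph has no cycles and that each translon occurs at most once per path.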
Affiliation(s)
- Jack A S Tierney
  - School of Biochemistry and Cell Biology, University College Cork, Cork T12 K8AF, Ireland
  - SFI Centre for Research Training in Genomics Data Science, University College Cork, Cork T12 K8AF, Ireland
- Michał Świrski
  - Institute of Genetics and Biotechnology, Faculty of Biology, University of Warsaw, 02-106 Warsaw, Poland
- Håkon Tjeldnes
  - School of Biochemistry and Cell Biology, University College Cork, Cork T12 K8AF, Ireland
  - Computational Biology Unit, Department of Informatics, University of Bergen, NO-5020 Bergen, Norway
- Jonathan M Mudge
  - European Molecular Biology Laboratory, European Bioinformatics Institute, Wellcome Genome Campus, Hinxton CB10 1SD, Cambridge, United Kingdom
- Joanna Kufel
  - Institute of Genetics and Biotechnology, Faculty of Biology, University of Warsaw, 02-106 Warsaw, Poland
- Nicola Whiffin
  - The Big Data Institute and Wellcome Centre for Human Genetics, University of Oxford, Oxford OX3 7BN, United Kingdom
- Eivind Valen
  - Computational Biology Unit, Department of Informatics, University of Bergen, NO-5020 Bergen, Norway
  - Department of Biosciences, University of Oslo, 0316 Oslo, Norway
- Pavel V Baranov
  - School of Biochemistry and Cell Biology, University College Cork, Cork T12 K8AF, Ireland
5
Wieder N, D'Souza EN, Martin-Geary AC, Lassen FH, Talbot-Martin J, Fernandes M, Chothani SP, Rackham OJL, Schafer S, Aspden JL, MacArthur DG, Davies RW, Whiffin N. Differences in 5'untranslated regions highlight the importance of translational regulation of dosage sensitive genes. Genome Biol 2024; 25:111. [PMID: 38685090 PMCID: PMC11057154 DOI: 10.1186/s13059-024-03248-0] [Received: 05/17/2023] [Accepted: 04/15/2024] [Indexed: 05/02/2024] Open
Abstract
BACKGROUND Untranslated regions (UTRs) are important mediators of post-transcriptional regulation. The length of UTRs and the composition of regulatory elements within them are known to vary substantially across genes, but little is known about the reasons for this variation in humans. Here, we set out to determine whether this variation, specifically in 5'UTRs, correlates with gene dosage sensitivity. RESULTS We investigate 5'UTR length, the number of alternative transcription start sites, the potential for alternative splicing, the number and type of upstream open reading frames (uORFs) and the propensity of 5'UTRs to form secondary structures. We explore how these elements vary by gene tolerance to loss-of-function (LoF; using the LOEUF metric), and in genes where changes in dosage are known to cause disease. We show that LOEUF correlates with 5'UTR length and complexity. Genes that are most intolerant to LoF have longer 5'UTRs, greater TSS diversity, and more upstream regulatory elements than their LoF tolerant counterparts. We show that these differences are evident in disease gene-sets, but not in recessive developmental disorder genes where LoF of a single allele is tolerated. CONCLUSIONS Our results confirm the importance of post-transcriptional regulation through 5'UTRs in tight regulation of mRNA and protein levels, particularly for genes where changes in dosage are deleterious and lead to disease. Finally, to support gene-based investigation we release a web-based browser tool, VuTR, that supports exploration of the composition of individual 5'UTRs and the impact of genetic variation within them.
Affiliation(s)
- Nechama Wieder
  - Big Data Institute, University of Oxford, Oxford, UK
  - Wellcome Centre for Human Genetics, University of Oxford, Oxford, UK
- Elston N D'Souza
  - Big Data Institute, University of Oxford, Oxford, UK
  - Wellcome Centre for Human Genetics, University of Oxford, Oxford, UK
- Alexandra C Martin-Geary
  - Big Data Institute, University of Oxford, Oxford, UK
  - Wellcome Centre for Human Genetics, University of Oxford, Oxford, UK
- Frederik H Lassen
  - Big Data Institute, University of Oxford, Oxford, UK
  - Wellcome Centre for Human Genetics, University of Oxford, Oxford, UK
- Maria Fernandes
  - Big Data Institute, University of Oxford, Oxford, UK
  - Wellcome Centre for Human Genetics, University of Oxford, Oxford, UK
- Sonia P Chothani
  - Program in Cardiovascular and Metabolic Disorders, Duke-National University of Singapore, Singapore, 169857, Singapore
- Owen J L Rackham
  - Program in Cardiovascular and Metabolic Disorders, Duke-National University of Singapore, Singapore, 169857, Singapore
  - School of Biological Sciences, University of Southampton, Southampton, UK
- Sebastian Schafer
  - Program in Cardiovascular and Metabolic Disorders, Duke-National University of Singapore, Singapore, 169857, Singapore
- Julie L Aspden
  - School of Molecular and Cellular Biology, Faculty of Biological Sciences, University of Leeds, Leeds, LS2 9JT, United Kingdom
  - LeedsOmics, University of Leeds, Leeds, LS2 9JT, United Kingdom
  - Astbury Centre of Structural Molecular Biology, University of Leeds, Leeds, LS2 9JT, United Kingdom
- Daniel G MacArthur
  - Program in Medical and Population Genetics, Broad Institute of MIT and Harvard, Cambridge, MA, USA
  - Centre for Population Genomics, Garvan Institute of Medical Research, and UNSW Sydney, Sydney, NSW, Australia
  - Centre for Population Genomics, Murdoch Children's Research Institute, Melbourne, VIC, Australia
- Robert W Davies
  - Wellcome Centre for Human Genetics, University of Oxford, Oxford, UK
  - Department of Statistics, University of Oxford, Oxford, UK
- Nicola Whiffin
  - Big Data Institute, University of Oxford, Oxford, UK
  - Wellcome Centre for Human Genetics, University of Oxford, Oxford, UK
  - Program in Medical and Population Genetics, Broad Institute of MIT and Harvard, Cambridge, MA, USA
6
Zhang Y, Bailey TS, Hittmeyer P, Dubois LJ, Theys J, Lambin P. Multiplex genetic manipulations in Clostridium butyricum and Clostridium sporogenes to secrete recombinant antigen proteins for oral-spore vaccination. Microb Cell Fact 2024; 23:119. [PMID: 38659027 PMCID: PMC11040787 DOI: 10.1186/s12934-024-02389-y] [Received: 02/13/2024] [Accepted: 04/11/2024] [Indexed: 04/26/2024] Open
Abstract
BACKGROUND Clostridium spp. have demonstrated therapeutic potential in cancer treatment through intravenous or intratumoral administration. This approach has expanded to include non-pathogenic clostridia for the treatment of various diseases, underscoring the innovative concept of oral-spore vaccination using clostridia. Recent advancements in the field of synthetic biology have significantly enhanced the development of Clostridium-based bio-therapeutics. These advancements are particularly notable in the areas of efficient protein overexpression and secretion, which are crucial for the feasibility of oral vaccination strategies. Here, we present two examples of genetically engineered Clostridium candidates: one as an oral cancer vaccine and the other as an antiviral oral vaccine against SARS-CoV-2. RESULTS Using five validated promoters and a signal peptide derived from Clostridium sporogenes, a series of full-length NY-ESO-1/CTAG1, a promising cancer vaccine candidate, expression vectors were constructed and transformed into C. sporogenes and Clostridium butyricum. Western blotting analysis confirmed efficient expression and secretion of NY-ESO-1 in clostridia, with specific promoters leading to enhanced detection signals. Additionally, the fusion of a reported bacterial adjuvant to NY-ESO-1 for improved immune recognition led to cloning difficulties in E. coli. The use of an AUU start codon successfully mitigated potential toxicity issues in E. coli, enabling the secretion of recombinant proteins in C. sporogenes and C. butyricum. We further demonstrate the successful replacement of PyrE loci with high-expression cassettes carrying NY-ESO-1 and adjuvant-fused NY-ESO-1, achieving plasmid-free clostridia capable of secreting the antigens. Lastly, the study successfully extends its multiplex genetic manipulations to engineer clostridia for the secretion of SARS-CoV-2-related Spike_S1 antigens. CONCLUSIONS This study successfully demonstrated that C. butyricum and C. sporogenes can produce the two recombinant antigen proteins (NY-ESO-1 and SARS-CoV-2-related Spike_S1 antigens) through genetic manipulations, utilizing the AUU start codon. This approach overcomes challenges in cloning difficult proteins in E. coli. These findings underscore the feasibility of harnessing commensal clostridia for antigen protein secretion, emphasizing the applicability of non-canonical translation initiation across diverse species with broad implications for medical or industrial biotechnology.
Affiliation(s)
- Yanchao Zhang
  - The M-Lab, Department of Precision Medicine, GROW - Research Institute for Oncology and Reproduction, Maastricht University, Maastricht, 6229 ER, the Netherlands
- Tom S Bailey
  - The M-Lab, Department of Precision Medicine, GROW - Research Institute for Oncology and Reproduction, Maastricht University, Maastricht, 6229 ER, the Netherlands
  - Department of Cell Biology-Inspired Tissue Engineering, MERLN Institute for Technology-Inspired Regenerative Medicine, Maastricht University, Maastricht, 6229 ER, the Netherlands
- Philip Hittmeyer
  - The M-Lab, Department of Precision Medicine, GROW - Research Institute for Oncology and Reproduction, Maastricht University, Maastricht, 6229 ER, the Netherlands
  - LivingMed Biotech BV, Clos Chanmurly 13, Liège, 4000, Belgium
- Ludwig J Dubois
  - The M-Lab, Department of Precision Medicine, GROW - Research Institute for Oncology and Reproduction, Maastricht University, Maastricht, 6229 ER, the Netherlands
- Jan Theys
  - The M-Lab, Department of Precision Medicine, GROW - Research Institute for Oncology and Reproduction, Maastricht University, Maastricht, 6229 ER, the Netherlands
- Philippe Lambin
  - The M-Lab, Department of Precision Medicine, GROW - Research Institute for Oncology and Reproduction, Maastricht University, Maastricht, 6229 ER, the Netherlands
7
Yang H, Li Q, Stroup EK, Wang S, Ji Z. Widespread stable noncanonical peptides identified by integrated analyses of ribosome profiling and ORF features. Nat Commun 2024; 15:1932. [PMID: 38431639 PMCID: PMC10908861 DOI: 10.1038/s41467-024-46240-9] [Received: 07/18/2023] [Accepted: 02/18/2024] [Indexed: 03/05/2024] Open
Abstract
Studies have revealed dozens of functional peptides in putative 'noncoding' regions and raised the question of how many proteins are encoded by noncanonical open reading frames (ORFs). Here, we comprehensively annotate genome-wide translated ORFs across five eukaryotes (human, mouse, zebrafish, worm, and yeast) by analyzing ribosome profiling data. We develop a logistic regression model named PepScore based on ORF features (expected length, encoded domain, and conservation) to calculate the probability that the encoded peptide is stable in humans. Systematic ectopic expression validates PepScore and shows that stable complex-associating microproteins can be encoded in 5'/3' untranslated regions and overlapping coding regions of mRNAs besides annotated noncoding RNAs. Stable noncanonical proteins follow conventional rules and localize to different subcellular compartments. Inhibition of proteasomal/lysosomal degradation pathways can stabilize some peptides especially those with moderate PepScores, but cannot rescue the expression of short ones with low PepScores suggesting they are directly degraded by cellular proteases. The majority of human noncanonical peptides with high PepScores show longer lengths but low conservation across species/mammals, and hundreds contain trait-associated genetic variants. Our study presents a statistical framework to identify stable noncanonical peptides in the genome and provides a valuable resource for functional characterization of noncanonical translation during development and disease.
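For orientation, a logistic regression over the three ORF features named in this abstract has the generic form below. The symbols and the way each feature is encoded are assumptions made for illustration; the abstract does not give the fitted coefficients or the exact feature definitions.

```latex
P(\text{stable}) = \frac{1}{1 + e^{-z}}, \qquad
z = \beta_0 + \beta_1\, x_{\text{length}} + \beta_2\, x_{\text{domain}} + \beta_3\, x_{\text{conservation}}
```

Under this form the score is a probability between 0 and 1, and each coefficient shifts the log-odds of peptide stability for its feature.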
Affiliation(s)
- Haiwang Yang
  - Department of Pharmacology, Feinberg School of Medicine, Northwestern University, Chicago, IL, 60611, USA
- Qianru Li
  - Department of Pharmacology, Feinberg School of Medicine, Northwestern University, Chicago, IL, 60611, USA
- Emily K Stroup
  - Department of Pharmacology, Feinberg School of Medicine, Northwestern University, Chicago, IL, 60611, USA
- Sheng Wang
  - Department of Biomedical Engineering, McCormick School of Engineering, Northwestern University, Evanston, IL, 60628, USA
- Zhe Ji
  - Department of Pharmacology, Feinberg School of Medicine, Northwestern University, Chicago, IL, 60611, USA
  - Department of Biomedical Engineering, McCormick School of Engineering, Northwestern University, Evanston, IL, 60628, USA
8
Tierney JAS, Świrski M, Tjeldnes H, Mudge JM, Kufel J, Whiffin N, Valen E, Baranov PV. Ribosome Decision Graphs for the Representation of Eukaryotic RNA Translation Complexity. bioRxiv 2023:2023.11.10.566564. [PMID: 37986835 PMCID: PMC10659439 DOI: 10.1101/2023.11.10.566564] [Indexed: 11/22/2023]
Abstract
The application of ribosome profiling has revealed an unexpected abundance of translation in addition to that responsible for the synthesis of previously annotated protein-coding regions. Multiple short sequences have been found to be translated within single RNA molecules, both within annotated protein-coding and non-coding regions. The biological significance of this translation is a matter of intensive investigation. However, current schematic or annotation-based representations of mRNA translation generally do not account for the apparent multitude of translated regions within the same molecules. They also do not take into account the stochasticity of the process that allows alternative translations of the same RNA molecules by different ribosomes. There is a need for formal representations of mRNA complexity that would enable the analysis of quantitative information on translation and more accurate models for predicting the phenotypic effects of genetic variants affecting translation. To address this, we developed a conceptually novel abstraction that we term Ribosome Decision Graphs (RDGs). RDGs represent translation as multiple ribosome paths through untranslated and translated mRNA segments. We termed the latter 'translons'. Non-deterministic events, such as initiation, re-initiation, selenocysteine insertion or ribosomal frameshifting, are then represented as branching points. This representation allows for an adequate representation of eukaryotic translation complexity and focuses on locations critical for translation regulation. We show how RDGs can be used for depicting translated regions, analysis of genetic variation and quantitative genome-wide data on translation for characterisation of regulatory modulators of translation.
Affiliation(s)
- Jack A S Tierney
  - School of Biochemistry and Cell Biology, University College Cork, Cork, Ireland
  - SFI Centre for Research Training in Genomics Data Science, University College Cork, Cork, Ireland
- Michał Świrski
  - Institute of Genetics and Biotechnology, Faculty of Biology, University of Warsaw, Warsaw, Poland
- Håkon Tjeldnes
  - School of Biochemistry and Cell Biology, University College Cork, Cork, Ireland
  - Computational Biology Unit, Department of Informatics, University of Bergen, Bergen, Norway
- Jonathan M Mudge
  - European Molecular Biology Laboratory, European Bioinformatics Institute, Wellcome Genome Campus, Hinxton, Cambridge, UK
- Joanna Kufel
  - Institute of Genetics and Biotechnology, Faculty of Biology, University of Warsaw, Warsaw, Poland
- Nicola Whiffin
  - The Big Data Institute and Wellcome Centre for Human Genetics, University of Oxford, Oxford, OX3 7BN, UK
- Eivind Valen
  - Computational Biology Unit, Department of Informatics, University of Bergen, Bergen, Norway
  - Department of Biosciences, University of Oslo, Oslo, Norway
- Pavel V Baranov
  - School of Biochemistry and Cell Biology, University College Cork, Cork, Ireland
9
Fang JC, Liu MJ. Translation initiation at AUG and non-AUG triplets in plants. Plant Sci 2023; 335:111822. [PMID: 37574140 DOI: 10.1016/j.plantsci.2023.111822] [Received: 12/15/2022] [Revised: 07/22/2023] [Accepted: 08/07/2023] [Indexed: 08/15/2023]
Abstract
In plants and other eukaryotes, precise selection of translation initiation site (TIS) on mRNAs shapes the proteome in response to cellular events or environmental cues. The canonical translation of mRNAs initiates at a 5' proximal AUG codon in a favorable context. However, the coding and non-coding regions of plant genomes contain numerous unannotated alternative AUG and non-AUG TISs. Determining how and why these unexpected and prevalent TISs are activated in plants has emerged as an exciting research area. In this review, we focus on the selection of plant TISs and highlight studies that revealed previously unannotated TISs used in vivo via comparative genomics and genome-wide profiling of ribosome positioning and protein N-terminal ends. The biological signatures of non-AUG TIS-initiated open reading frames (ORFs) in plants are also discussed. We describe what is understood about cis-regulatory RNA elements and trans-acting eukaryotic initiation factors (eIFs) in the site selection for translation initiation by featuring the findings in plants along with supporting findings in non-plant species. The prevalent, unannotated TISs provide a hidden reservoir of ORFs that likely help reshape plant proteomes in response to developmental or environmental cues. These findings underscore the importance of understanding the mechanistic basis of TIS selection to functionally annotate plant genomes, especially for crops with large genomes.
Affiliation(s)
- Jhen-Cheng Fang
  - Biotechnology Center in Southern Taiwan, Academia Sinica, Tainan 711, Taiwan
- Ming-Jung Liu
  - Biotechnology Center in Southern Taiwan, Academia Sinica, Tainan 711, Taiwan; Agricultural Biotechnology Research Center, Academia Sinica, Taipei 115, Taiwan
10
Prensner JR, Abelin JG, Kok LW, Clauser KR, Mudge JM, Ruiz-Orera J, Bassani-Sternberg M, Moritz RL, Deutsch EW, van Heesch S. What Can Ribo-Seq, Immunopeptidomics, and Proteomics Tell Us About the Noncanonical Proteome? Mol Cell Proteomics 2023; 22:100631. [PMID: 37572790 PMCID: PMC10506109 DOI: 10.1016/j.mcpro.2023.100631] [Received: 05/14/2023] [Revised: 07/21/2023] [Accepted: 08/08/2023] [Indexed: 08/14/2023] Open
Abstract
Ribosome profiling (Ribo-Seq) has proven transformative for our understanding of the human genome and proteome by illuminating thousands of noncanonical sites of ribosome translation outside the currently annotated coding sequences (CDSs). A conservative estimate suggests that at least 7000 noncanonical ORFs are translated, which, at first glance, has the potential to expand the number of human protein CDSs by 30%, from ∼19,500 annotated CDSs to over 26,000 annotated CDSs. Yet, additional scrutiny of these ORFs has raised numerous questions about what fraction of them truly produce a protein product and what fraction of those can be understood as proteins according to conventional understanding of the term. Adding further complication is the fact that published estimates of noncanonical ORFs vary widely by around 30-fold, from several thousand to several hundred thousand. The summation of this research has left the genomics and proteomics communities both excited by the prospect of new coding regions in the human genome but searching for guidance on how to proceed. Here, we discuss the current state of noncanonical ORF research, databases, and interpretation, focusing on how to assess whether a given ORF can be said to be "protein coding."
Affiliation(s)
- John R Prensner
  - Division of Pediatric Hematology/Oncology, Department of Pediatrics, University of Michigan Medical School, Ann Arbor, Michigan, USA; Department of Biological Chemistry, University of Michigan Medical School, Ann Arbor, Michigan, USA
- Leron W Kok
  - Princess Máxima Center for Pediatric Oncology, Utrecht, The Netherlands
- Karl R Clauser
  - Broad Institute of MIT and Harvard, Cambridge, Massachusetts, USA
- Jonathan M Mudge
  - European Molecular Biology Laboratory, European Bioinformatics Institute, Wellcome Genome Campus, Cambridge, UK
- Jorge Ruiz-Orera
  - Cardiovascular and Metabolic Sciences, Max Delbrück Center for Molecular Medicine in the Helmholtz Association (MDC), Berlin, Germany
- Michal Bassani-Sternberg
  - Ludwig Institute for Cancer Research, Agora Center Bugnon 25A, University of Lausanne, Lausanne, Switzerland; Department of Oncology, Centre Hospitalier Universitaire Vaudois (CHUV), Lausanne, Switzerland; Agora Cancer Research Centre, Lausanne, Switzerland
- Robert L Moritz
  - Institute for Systems Biology (ISB), Seattle, Washington, USA
- Eric W Deutsch
  - Institute for Systems Biology (ISB), Seattle, Washington, USA
11
Wek RC, Anthony TG, Staschke KA. Surviving and Adapting to Stress: Translational Control and the Integrated Stress Response. Antioxid Redox Signal 2023; 39:351-373. [PMID: 36943285 PMCID: PMC10443206 DOI: 10.1089/ars.2022.0123] [Received: 08/16/2022] [Revised: 02/16/2023] [Accepted: 02/20/2023] [Indexed: 03/23/2023]
Abstract
Significance: Organisms adapt to changing environments by engaging cellular stress response pathways that serve to restore proteostasis and enhance survival. A primary adaptive mechanism is the integrated stress response (ISR), which features phosphorylation of the α subunit of eukaryotic translation initiation factor 2 (eIF2). Four eIF2α kinases respond to different stresses, enabling cells to rapidly control translation to optimize management of resources and reprogram gene expression for stress adaptation. Phosphorylation of eIF2 blocks its guanine nucleotide exchange factor, eIF2B, thus lowering the levels of eIF2 bound to GTP that is required to deliver initiator transfer RNA (tRNA) to ribosomes. While bulk messenger RNA (mRNA) translation can be sharply lowered by heightened phosphorylation of eIF2α, there are other gene transcripts whose translation is unchanged or preferentially translated. Among the preferentially translated genes is ATF4, which directs transcription of adaptive genes in the ISR. Recent Advances and Critical Issues: This review focuses on how eIF2α kinases function as first responders of stress, the mechanisms by which eIF2α phosphorylation and other stress signals regulate the exchange activity of eIF2B, and the processes by which the ISR triggers differential mRNA translation. To illustrate the synergy between stress pathways, we describe the mechanisms and functional significance of communication between the ISR and another key regulator of translation, mammalian/mechanistic target of rapamycin complex 1 (mTORC1), during acute and chronic amino acid insufficiency. Finally, we discuss the pathological conditions that stem from aberrant regulation of the ISR, as well as therapeutic strategies targeting the ISR to alleviate disease. 
Future Directions: Important topics for future ISR research are strategies for modulating this stress pathway in disease conditions and drug development, molecular processes for differential translation and the coordinate regulation of GCN2 and other stress pathways during physiological and pathological conditions. Antioxid. Redox Signal. 39, 351-373.
Affiliation(s)
- Ronald C. Wek
  - Department of Biochemistry and Molecular Biology, Indiana University School of Medicine, Indianapolis, Indiana, USA
  - Indiana University Melvin and Bren Simon Comprehensive Cancer Center, Indianapolis, Indiana, USA
- Tracy G. Anthony
  - Department of Nutritional Sciences, Rutgers University, New Brunswick, New Jersey, USA
- Kirk A. Staschke
  - Department of Biochemistry and Molecular Biology, Indiana University School of Medicine, Indianapolis, Indiana, USA
  - Indiana University Melvin and Bren Simon Comprehensive Cancer Center, Indianapolis, Indiana, USA
|
12
|
Jaiswal M, Kumar S. smAMPsTK: a toolkit to unravel the smORFome encoding AMPs of plant species. J Biomol Struct Dyn 2023:1-13. [PMID: 37464885 DOI: 10.1080/07391102.2023.2235605] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 04/07/2023] [Accepted: 07/06/2023] [Indexed: 07/20/2023]
Abstract
The pervasive repertoire of plant molecules with the potential to serve as a substitute for conventional antibiotics has led to obtaining better insights into plant-derived antimicrobial peptides (AMPs). The massive distribution of Small Open Reading Frames (smORFs) throughout eukaryotic genomes with proven extensive biological functions reflects their practicality as antimicrobials. Here, we have developed a pipeline named smAMPsTK to unveil the underlying hidden smORFs encoding AMPs for plant species. By applying this pipeline, we have elicited AMPs of various functional activity of lengths ranging from 5 to 100 aa by employing publicly available transcriptome data of five different angiosperms. Later, we studied the coding potential of AMPs-smORFs, the inclusion of diverse translation initiation start codons, and amino acid frequency. Codon usage study signifies no such codon usage biases for smORFs encoding AMPs. Majorly three start codons are prominent in generating AMPs. The evolutionary and conservational study proclaimed the widespread distribution of AMPs encoding genes throughout the plant kingdom. Domain analysis revealed that nearly all AMPs have chitin-binding ability, establishing their role as antifungal agents. The current study includes a developed methodology to characterize smORFs encoding AMPs, and their implications as antimicrobial, antibacterial, antifungal, or antiviral provided by SVM score and prediction status calculated by machine learning-based prediction models. The pipeline, complete package, and the results derived for five angiosperms are freely available at https://github.com/skbinfo/smAMPsTK. Communicated by Ramaswamy H. Sarma.
Affiliation(s)
- Mohini Jaiswal
  - Bioinformatics Laboratory, National Institute of Plant Genome Research (NIPGR), Aruna Asaf Ali Marg, New Delhi, India
- Shailesh Kumar
  - Bioinformatics Laboratory, National Institute of Plant Genome Research (NIPGR), Aruna Asaf Ali Marg, New Delhi, India
|
13
|
Kienzle L, Bettinazzi S, Choquette T, Brunet M, Khorami HH, Jacques JF, Moreau M, Roucou X, Landry CR, Angers A, Breton S. A small protein coded within the mitochondrial canonical gene nd4 regulates mitochondrial bioenergetics. BMC Biol 2023; 21:111. [PMID: 37198654 DOI: 10.1186/s12915-023-01609-y] [Citation(s) in RCA: 7] [Impact Index Per Article: 7.0] [Received: 09/17/2022] [Accepted: 05/03/2023] [Indexed: 05/19/2023] Open
Abstract
BACKGROUND: Mitochondria have a central role in cellular functions, aging, and in certain diseases. They possess their own genome, a vestige of their bacterial ancestor. Over the course of evolution, most of the genes of the ancestor have been lost or transferred to the nucleus. In humans, the mtDNA is a very small circular molecule with a functional repertoire limited to only 37 genes. Its extremely compact nature with genes arranged one after the other and separated by short non-coding regions suggests that there is little room for evolutionary novelties. This is radically different from bacterial genomes, which are also circular but much larger, and in which we can find genes inside other genes. These sequences, different from the reference coding sequences, are called alternative open reading frames or altORFs, and they are involved in key biological functions. However, whether altORFs exist in mitochondrial protein-coding genes or elsewhere in the human mitogenome has not been fully addressed. RESULTS: We found a downstream alternative ATG initiation codon in the +3 reading frame of the human mitochondrial nd4 gene. This newly characterized altORF encodes a 99-amino-acid-long polypeptide, MTALTND4, which is conserved in primates. Our custom antibody, but not the pre-immune serum, was able to immunoprecipitate MTALTND4 from HeLa cell lysates, confirming the existence of an endogenous MTALTND4 peptide. The protein is localized in mitochondria and cytoplasm and is also found in the plasma, and it impacts cell and mitochondrial physiology. CONCLUSIONS: Many human mitochondrial translated ORFs might have so far gone unnoticed. By ignoring mtaltORFs, we have underestimated the coding potential of the mitogenome. Alternative mitochondrial peptides such as MTALTND4 may offer a new framework for the investigation of mitochondrial functions and diseases.
Affiliation(s)
- Laura Kienzle
  - Département de sciences biologiques, Université de Montréal, Montréal, Canada
- Stefano Bettinazzi
  - Département de sciences biologiques, Université de Montréal, Montréal, Canada
- Thierry Choquette
  - Département de sciences biologiques, Université de Montréal, Montréal, Canada
- Marie Brunet
  - Service de génétique médicale, Département de pédiatrie, Université de Sherbrooke, Sherbrooke, Canada
  - Centre de recherche du Centre hospitalier universitaire de Sherbrooke (CRCHUS), Sherbrooke, Canada
- Jean-François Jacques
  - Département de biochimie et génomique fonctionnelle, Université de Sherbrooke, Sherbrooke, Canada
- Mathilde Moreau
  - Département de biochimie et génomique fonctionnelle, Université de Sherbrooke, Sherbrooke, Canada
- Xavier Roucou
  - Centre de recherche du Centre hospitalier universitaire de Sherbrooke (CRCHUS), Sherbrooke, Canada
  - Département de biochimie et génomique fonctionnelle, Université de Sherbrooke, Sherbrooke, Canada
- Christian R Landry
  - Département de biochimie, de microbiologie et de bio-informatique, Faculté des sciences et de génie, Université Laval, Québec, Canada
  - Institut de biologie intégrative et des systèmes, Université Laval, Québec, Canada
  - PROTEO, Le regroupement québécois de recherche sur la fonction, l'ingénierie et les applications des protéines, Université Laval, Québec, Canada
  - Centre de recherche sur les données massives, Université Laval, Québec, Canada
  - Département de biologie, Faculté des sciences et de génie, Université Laval, Québec, Canada
- Annie Angers
  - Département de sciences biologiques, Université de Montréal, Montréal, Canada
- Sophie Breton
  - Département de sciences biologiques, Université de Montréal, Montréal, Canada
|
14
|
Prensner JR, Abelin JG, Kok LW, Clauser KR, Mudge JM, Ruiz-Orera J, Bassani-Sternberg M, Deutsch EW, van Heesch S. What can Ribo-seq and proteomics tell us about the non-canonical proteome? bioRxiv 2023:2023.05.16.541049. [PMID: 37292611 PMCID: PMC10245706 DOI: 10.1101/2023.05.16.541049] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 06/10/2023]
Abstract
Ribosome profiling (Ribo-seq) has proven transformative for our understanding of the human genome and proteome by illuminating thousands of non-canonical sites of ribosome translation outside of the currently annotated coding sequences (CDSs). A conservative estimate suggests that at least 7,000 non-canonical open reading frames (ORFs) are translated, which, at first glance, has the potential to expand the number of human protein-coding sequences by 30%, from ∼19,500 annotated CDSs to over 26,000. Yet, additional scrutiny of these ORFs has raised numerous questions about what fraction of them truly produce a protein product and what fraction of those can be understood as proteins according to conventional understanding of the term. Adding further complication is the fact that published estimates of non-canonical ORFs vary widely by around 30-fold, from several thousand to several hundred thousand. The summation of this research has left the genomics and proteomics communities both excited by the prospect of new coding regions in the human genome, but searching for guidance on how to proceed. Here, we discuss the current state of non-canonical ORF research, databases, and interpretation, focusing on how to assess whether a given ORF can be said to be "protein-coding".
In brief: The human genome encodes thousands of non-canonical open reading frames (ORFs) in addition to protein-coding genes. As a nascent field, many questions remain regarding non-canonical ORFs. How many exist? Do they encode proteins? What level of evidence is needed for their verification? Central to these debates has been the advent of ribosome profiling (Ribo-seq) as a method to discern genome-wide ribosome occupancy, and immunopeptidomics as a method to detect peptides that are processed and presented by MHC molecules and not observed in traditional proteomics experiments. This article provides a synthesis of the current state of non-canonical ORF research and proposes standards for their future investigation and reporting.
Highlights:
- Combined use of Ribo-seq and proteomics-based methods enables optimal confidence in detecting non-canonical ORFs and their protein products.
- Ribo-seq can provide more sensitive detection of non-canonical ORFs, but data quality and analytical pipelines will impact results.
- Non-canonical ORF catalogs are diverse and span both high-stringency and low-stringency ORF nominations.
- A framework for standardized non-canonical ORF evidence will advance the research field.
Affiliation(s)
- John R. Prensner
  - Department of Pediatrics, Division of Pediatric Hematology/Oncology, University of Michigan Medical School, Ann Arbor, MI 48109, USA
- Leron W. Kok
  - Princess Máxima Center for Pediatric Oncology, Heidelberglaan 25, 3584 CS, Utrecht, the Netherlands
- Karl R. Clauser
  - Broad Institute of MIT and Harvard, Cambridge, MA, 02142, USA
- Jonathan M. Mudge
  - European Molecular Biology Laboratory, European Bioinformatics Institute, Wellcome Genome Campus, Hinxton, Cambridge CB10 1SD, UK
- Jorge Ruiz-Orera
  - Cardiovascular and Metabolic Sciences, Max Delbrück Center for Molecular Medicine in the Helmholtz Association (MDC), 13125 Berlin, Germany
- Michal Bassani-Sternberg
  - Ludwig Institute for Cancer Research, University of Lausanne, Agora Center Bugnon 25A, 1005 Lausanne, Switzerland
  - Department of Oncology, Centre hospitalier universitaire vaudois (CHUV), Rue du Bugnon 46, 1005 Lausanne, Switzerland
  - Agora Cancer Research Centre, 1011 Lausanne, Switzerland
- Eric W. Deutsch
  - Institute for Systems Biology (ISB), Seattle, Washington 98109, USA
- Sebastiaan van Heesch
  - Princess Máxima Center for Pediatric Oncology, Heidelberglaan 25, 3584 CS, Utrecht, the Netherlands
|
15
|
Zhang D, Zhu L, Wang F, Li P, Wang Y, Gao Y. Molecular mechanisms of eukaryotic translation fidelity and their associations with diseases. Int J Biol Macromol 2023; 242:124680. [PMID: 37141965 DOI: 10.1016/j.ijbiomac.2023.124680] [Citation(s) in RCA: 4] [Impact Index Per Article: 4.0] [Received: 03/24/2023] [Accepted: 04/27/2023] [Indexed: 05/06/2023]
Abstract
Converting genetic information into functional proteins is a complex, multi-step process, with each step being tightly regulated to ensure the accuracy of translation, which is critical to cellular health. In recent years, advances in modern biotechnology, especially the development of cryo-electron microscopy and single-molecule techniques, have enabled a clearer understanding of the mechanisms of protein translation fidelity. Although there are many studies on the regulation of protein translation in prokaryotes, and the basic elements of translation are highly conserved in prokaryotes and eukaryotes, there are still great differences in the specific regulatory mechanisms. This review describes how eukaryotic ribosomes and translation factors regulate protein translation and ensure translation accuracy. However, a certain frequency of translation errors does occur in translation, so we describe diseases that arise when the rate of translation errors reaches or exceeds a threshold of cellular tolerance.
Affiliation(s)
- Dejiu Zhang
  - Institute for Translational Medicine, The Affiliated Hospital of Qingdao University, College of Medicine, Qingdao University, Qingdao, China
- Lei Zhu
  - College of Basic Medical, Qingdao Binhai University, Qingdao, China
- Fei Wang
  - Institute for Translational Medicine, The Affiliated Hospital of Qingdao University, College of Medicine, Qingdao University, Qingdao, China
- Peifeng Li
  - Institute for Translational Medicine, The Affiliated Hospital of Qingdao University, College of Medicine, Qingdao University, Qingdao, China
- Yin Wang
  - Institute for Translational Medicine, The Affiliated Hospital of Qingdao University, College of Medicine, Qingdao University, Qingdao, China
- Yanyan Gao
  - Institute for Translational Medicine, The Affiliated Hospital of Qingdao University, College of Medicine, Qingdao University, Qingdao, China
|
16
|
Van't Spijker HM, Almeida S. How villains are made: The translation of dipeptide repeat proteins in C9ORF72-ALS/FTD. Gene 2023; 858:147167. [PMID: 36621656 PMCID: PMC9928902 DOI: 10.1016/j.gene.2023.147167] [Citation(s) in RCA: 5] [Impact Index Per Article: 5.0] [Received: 11/15/2022] [Accepted: 01/03/2023] [Indexed: 01/07/2023]
Abstract
A hexanucleotide repeat expansion in the C9ORF72 gene is the most common genetic alteration associated with amyotrophic lateral sclerosis (ALS) and frontotemporal dementia (FTD). These neurodegenerative diseases share genetic, clinical and pathological features. The mutation in C9ORF72 appears to drive pathogenesis through a combination of loss of C9ORF72 normal function and gain of toxic effects due to the repeat expansion, which result in aggregation prone expanded RNAs and dipeptide repeat (DPR) proteins. Studies in cellular and animal models indicate that the DPR proteins are the more toxic species. Thus, a large body of research has focused on identifying the cellular pathways most directly impacted by these toxic proteins, with the goal of characterizing disease pathogenesis and nominating potential targets for therapeutic development. The preventative block of the production of the toxic proteins before they can cause harm is a second strategy of intense focus. Despite the considerable amount of effort dedicated to this prophylactic approach, it is still unclear how the DPR proteins are synthesized from RNAs harboring repeat expansions. In this review, we summarize our current knowledge of the specific protein translation mechanisms shown to account for the synthesis of DPR proteins. We will then discuss how enhanced understanding of the composition of these toxic effectors could help in refining disease mechanisms, and paving the way to identify and design effective prophylactic therapies for C9ORF72 ALS-FTD.
Affiliation(s)
- Heleen M Van't Spijker
  - Program in Molecular Medicine, University of Massachusetts Chan Medical School, Worcester, MA 01605, USA
- Sandra Almeida
  - Department of Neurology, University of Massachusetts Chan Medical School, Worcester, MA 01605, USA
|
17
|
Evolution and implications of de novo genes in humans. Nat Ecol Evol 2023:10.1038/s41559-023-02014-y. [PMID: 36928843 DOI: 10.1038/s41559-023-02014-y] [Citation(s) in RCA: 15] [Impact Index Per Article: 15.0] [Received: 07/24/2022] [Accepted: 02/06/2023] [Indexed: 03/18/2023]
Abstract
Genes and translated open reading frames (ORFs) that emerged de novo from previously non-coding sequences provide species with opportunities for adaptation. When aberrantly activated, some human-specific de novo genes and ORFs have disease-promoting properties, for instance driving tumour growth. Thousands of putative de novo coding sequences have been described in humans, but we still do not know what fraction of those ORFs has readily acquired a function. Here, we discuss the challenges and controversies surrounding the detection, mechanisms of origin, annotation, validation and characterization of de novo genes and ORFs. Through manual curation of literature and databases, we provide a thorough table with most de novo genes reported for humans to date. We re-evaluate each locus by tracing the enabling mutations and list proposed disease associations, protein characteristics and supporting evidence for translation and protein detection. This work will support future explorations of de novo genes and ORFs in humans.
|
18
|
Ryczek N, Łyś A, Makałowska I. The Functional Meaning of 5'UTR in Protein-Coding Genes. Int J Mol Sci 2023; 24:ijms24032976. [PMID: 36769304 PMCID: PMC9917990 DOI: 10.3390/ijms24032976] [Citation(s) in RCA: 8] [Impact Index Per Article: 8.0] [Received: 12/28/2022] [Revised: 01/20/2023] [Accepted: 01/26/2023] [Indexed: 02/05/2023] Open
Abstract
As is well known, messenger RNA has many regulatory regions along its sequence length. One of them is the 5' untranslated region (5'UTR), which itself contains many regulatory elements such as upstream ORFs (uORFs), internal ribosome entry sites (IRESs), microRNA binding sites, and structural components involved in the regulation of mRNA stability, pre-mRNA splicing, and translation initiation. Activation of an alternative, more upstream transcription start site leads to an extension of the 5'UTR. One of the consequences of 5'UTR extension may be head-to-head gene overlap. This review describes elements in the 5'UTR of protein-coding transcripts and the functional significance of the 5' overlap of protein-coding genes, with implications for transcription, translation, and disease.
|
19
|
Fedorova AD, Kiniry SJ, Andreev DE, Mudge JM, Baranov PV. Thousands of human non-AUG extended proteoforms lack evidence of evolutionary selection among mammals. Nat Commun 2022; 13:7910. [PMID: 36564405 PMCID: PMC9789052 DOI: 10.1038/s41467-022-35595-6] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 06/01/2022] [Accepted: 12/12/2022] [Indexed: 12/24/2022] Open
Abstract
The synthesis of most proteins begins at AUG codons, yet a small number of non-AUG initiated proteoforms are also known. Here we analyse a large number of publicly available Ribo-seq datasets to identify novel, previously uncharacterised non-AUG proteoforms using Trips-Viz implementation of a novel algorithm for detecting translated ORFs. In parallel we analyse genomic alignment of 120 mammals to identify evidence of protein coding evolution in sequences encoding potential extensions. Unexpectedly we find that the number of non-AUG proteoforms identified with ribosome profiling data greatly exceeds those with strong phylogenetic support suggesting their recent evolution. Our study argues that the protein coding potential of human genome greatly exceeds that detectable through comparative genomics and exposes the existence of multiple proteins encoded by the same genomic loci.
Affiliation(s)
- Alla D Fedorova
  - School of Biochemistry and Cell Biology, University College Cork, Cork, Ireland
  - SFI Centre for Research Training in Genomics Data Science, University College Cork, Cork, Ireland
- Stephen J Kiniry
  - School of Biochemistry and Cell Biology, University College Cork, Cork, Ireland
- Dmitry E Andreev
  - Shemyakin-Ovchinnikov Institute of Bioorganic Chemistry, RAS, Moscow, Russia
  - Belozersky Institute of Physico-Chemical Biology, Lomonosov Moscow State University, Moscow, Russia
- Jonathan M Mudge
  - European Molecular Biology Laboratory, European Bioinformatics Institute, Wellcome Genome Campus, Hinxton, Cambridge, UK
- Pavel V Baranov
  - School of Biochemistry and Cell Biology, University College Cork, Cork, Ireland
|
20
|
Duncan C, Mata J. Translation-complex profiling of fission yeast cells reveals dynamic rearrangements of scanning ribosomal subunits upon nutritional stress. Nucleic Acids Res 2022; 50:13011-13025. [PMID: 36478272 PMCID: PMC9825154 DOI: 10.1093/nar/gkac1140] [Citation(s) in RCA: 3] [Impact Index Per Article: 1.5] [Received: 10/12/2022] [Revised: 11/09/2022] [Accepted: 11/16/2022] [Indexed: 12/13/2022] Open
Abstract
Control of mRNA translation is key for stress responses. Translation initiation is usually rate-limiting and, in eukaryotes, involves mRNA scanning by the small ribosomal subunit. Despite its importance, many aspects of translation in vivo have not been explored fully, especially at the transcriptome-wide level. A recent method termed translation-complex profiling (TCP-seq) allows transcriptome-wide views of scanning ribosomal subunits. We applied TCP-seq to nutritional stress in the fission yeast Schizosaccharomyces pombe. At initiation sites, we observed multiple complexes resembling those of mammals, and consistent with queuing of scanning subunits. In 5' UTRs, small subunit accumulations were common and may reflect impediments to scanning. A key mediator of stress responses in S. pombe is the Fil1 transcription factor, which is regulated translationally by a poorly-understood mechanism involving upstream Open Reading Frames (uORFs). TCP-seq data of fil1 shows that stress allows scanning subunits to by-pass specific uORFs and reach the fil1 coding sequence. The integration of these observations with reporter assays revealed that fil1 translational control is mediated by a combination of scanning reinitiation-repressive and permissive uORFs, and establishes fil1 as a model for uORF-mediated translational control. Altogether, our transcriptome-wide study reveals general and gene-specific features of translation in a model eukaryote.
Affiliation(s)
- Juan Mata
  - To whom correspondence should be addressed. Tel: +44 01223360467;
|
21
|
Jürgens L, Wethmar K. The Emerging Role of uORF-Encoded uPeptides and HLA uLigands in Cellular and Tumor Biology. Cancers (Basel) 2022; 14:cancers14246031. [PMID: 36551517 PMCID: PMC9776223 DOI: 10.3390/cancers14246031] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 10/31/2022] [Revised: 11/29/2022] [Accepted: 11/30/2022] [Indexed: 12/13/2022] Open
Abstract
Recent technological advances have facilitated the detection of numerous non-canonical human peptides derived from regulatory regions of mRNAs, long non-coding RNAs, and other cryptic transcripts. In this review, we first give an overview of the classification of these novel peptides and summarize recent improvements in their annotation and detection by ribosome profiling, mass spectrometry, and individual experimental analysis. A large fraction of the novel peptides originates from translation at upstream open reading frames (uORFs) that are located within the transcript leader sequence of regular mRNA. In humans, uORF-encoded peptides (uPeptides) have been detected in both healthy and malignantly transformed cells and emerge as important regulators in cellular and immunological pathways. In the second part of the review, we focus on various functional implications of uPeptides. As uPeptides frequently act at the transition of translational regulation and individual peptide function, we describe the mechanistic modes of translational regulation through ribosome stalling, the involvement in cellular programs through protein interaction and complex formation, and their role within the human leukocyte antigen (HLA)-associated immunopeptidome as HLA uLigands. We delineate how malignant transformation may lead to the formation of novel uORFs, uPeptides, or HLA uLigands and explain their potential implication in tumor biology. Ultimately, we speculate on a potential use of uPeptides as peptide drugs and discuss how uPeptides and HLA uLigands may facilitate translational inhibition of oncogenic protein messages and immunotherapeutic approaches in cancer therapy.
|
22
|
Translation and natural selection of micropeptides from long non-canonical RNAs. Nat Commun 2022; 13:6515. [PMID: 36316320 PMCID: PMC9622821 DOI: 10.1038/s41467-022-34094-y] [Citation(s) in RCA: 16] [Impact Index Per Article: 8.0] [Received: 04/26/2022] [Accepted: 10/13/2022] [Indexed: 12/25/2022] Open
Abstract
Long noncoding RNAs (lncRNAs) are transcripts longer than 200 nucleotides but lacking canonical coding sequences. Apparently unable to produce peptides, lncRNA function seems to rely only on RNA expression, sequence and structure. Here, we exhaustively detect in-vivo translation of small open reading frames (small ORFs) within lncRNAs using Ribosomal profiling during Drosophila melanogaster embryogenesis. We show that around 30% of lncRNAs contain small ORFs engaged by ribosomes, leading to regulated translation of 100 to 300 micropeptides. We identify lncRNA features that favour translation, such as cistronicity, Kozak sequences, and conservation. For the latter, we develop a bioinformatics pipeline to detect small ORF homologues, and reveal evidence of natural selection favouring the conservation of micropeptide sequence and function across evolution. Our results expand the repertoire of lncRNA biochemical functions, and suggest that lncRNAs give rise to novel coding genes throughout evolution. Since most lncRNAs contain small ORFs with as yet unknown translation potential, we propose to rename them "long non-canonical RNAs".
|
23
|
Ryan CS, Schröder M. The human DEAD-box helicase DDX3X as a regulator of mRNA translation. Front Cell Dev Biol 2022; 10:1033684. [PMID: 36393867 PMCID: PMC9642913 DOI: 10.3389/fcell.2022.1033684] [Citation(s) in RCA: 9] [Impact Index Per Article: 4.5] [Received: 08/31/2022] [Accepted: 10/07/2022] [Indexed: 08/27/2023] Open
Abstract
The human DEAD-box protein DDX3X is an RNA remodelling enzyme that has been implicated in various aspects of RNA metabolism. In addition, like many DEAD-box proteins, it has non-conventional functions that are independent of its enzymatic activity, e.g., DDX3X acts as an adaptor molecule in innate immune signalling pathways. DDX3X has been linked to several human diseases. For example, somatic mutations in DDX3X were identified in various human cancers, and de novo germline mutations cause a neurodevelopmental condition now termed 'DDX3X syndrome'. DDX3X is also an important host factor in many different viral infections, where it can have pro- or anti-viral effects depending on the specific virus. The regulation of translation initiation for specific mRNA transcripts is likely a central cellular function of DDX3X, yet many questions regarding its exact targets and mechanisms of action remain unanswered. In this review, we explore the current knowledge about DDX3X's physiological RNA targets and summarise its interactions with the translation machinery. A role for DDX3X in translational reprogramming during cellular stress is emerging, where it may be involved in the regulation of stress granule formation and in mediating non-canonical translation initiation. Finally, we also discuss the role of DDX3X-mediated translation regulation during viral infections. Dysregulation of DDX3X's function in mRNA translation likely contributes to its involvement in disease pathophysiology. Thus, a better understanding of its exact mechanisms for regulating translation of specific mRNA targets is important, so that we can potentially develop therapeutic strategies for overcoming the negative effects of its dysregulation.
|
24
|
Ichihara K, Nakayama KI, Matsumoto A. Identification of unannotated coding sequences and their physiological functions. J Biochem 2022; 173:237-242. [PMID: 35959549 DOI: 10.1093/jb/mvac064] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 06/21/2022] [Accepted: 08/05/2022] [Indexed: 11/12/2022] Open
Abstract
Most protein-coding sequences (CDSs) are predicted sequences based on criteria such as a size sufficient to encode a product of at least 100 amino acids and with translation starting at an AUG initiation codon. However, recent studies based on ribosome profiling and mass spectrometry have shown that several RNAs annotated as long noncoding RNAs (lncRNAs) are actually translated to generate polypeptides of fewer than 100 amino acids, and that many proteins are translated from near-cognate initiation codons such as CUG and GUG. Furthermore, studies of genetically engineered mouse models have revealed that such polypeptides and proteins contribute to diverse physiological processes. In this review, we describe the latest methods for the identification of unannotated CDSs and provide examples of their physiological functions.
Affiliation(s)
- Kazuya Ichihara
  - Division of Cell Biology, Medical Institute of Bioregulation, Kyushu University, Fukuoka, Fukuoka 819-0395, Japan
- Keiichi I Nakayama
  - Division of Cell Biology, Medical Institute of Bioregulation, Kyushu University, Fukuoka, Fukuoka 819-0395, Japan
- Akinobu Matsumoto
  - Division of Cell Biology, Medical Institute of Bioregulation, Kyushu University, Fukuoka, Fukuoka 819-0395, Japan
|
25
|
Zhdanov AV, Golubeva AV, Yordanova MM, Andreev DE, Ventura-Silva AP, Schellekens H, Baranov PV, Cryan JF, Papkovsky DB. Ghrelin rapidly elevates protein synthesis in vitro by employing the rpS6K-eEF2K-eEF2 signalling axis. Cell Mol Life Sci 2022; 79:426. [PMID: 35841486 PMCID: PMC9288388 DOI: 10.1007/s00018-022-04446-4] [Received: 08/04/2019] [Revised: 06/16/2022] [Accepted: 06/22/2022] [Indexed: 11/27/2022]
Abstract
Activated ghrelin receptor GHS-R1α triggers cell signalling pathways that modulate energy homeostasis and biosynthetic processes. However, the effects of ghrelin on mRNA translation are unknown. Using various reporter assays, here we demonstrate a rapid elevation of protein synthesis in cells within 15–30 min upon stimulation of GHS-R1α by ghrelin. We further show that ghrelin-induced activation of translation is mediated, at least in part, through the de-phosphorylation (de-suppression) of elongation factor 2 (eEF2). The levels of eEF2 phosphorylation at Thr56 decrease due to the reduced activity of eEF2 kinase, which is inhibited via Ser366 phosphorylation by rpS6 kinases. Being stress-susceptible, the ghrelin-mediated decrease in eEF2 phosphorylation can be abolished by glucose deprivation and mitochondrial uncoupling. We believe that the observed burst of translation benefits rapid restocking of neuropeptides, which are released upon GHS-R1α activation, and represents the most time- and energy-efficient way of promptly recharging the orexigenic neuronal circuitry.
Affiliation(s)
- Alexander V Zhdanov
  - School of Biochemistry & Cell Biology, University College Cork, Cavanagh Pharmacy Building, College Road, Cork, Ireland
- Anna V Golubeva
  - Department of Anatomy & Neuroscience, University College Cork, Cork, Ireland
- Martina M Yordanova
  - School of Biochemistry & Cell Biology, University College Cork, Cavanagh Pharmacy Building, College Road, Cork, Ireland
- Dmitry E Andreev
  - Belozersky Research Institute of Physico-Chemical Biology, Lomonosov Moscow State University, Moscow, Russia; Shemyakin-Ovchinnikov Institute of Bioorganic Chemistry RAS, Moscow, Russia
- Ana Paula Ventura-Silva
  - APC Microbiome Institute, University College Cork, Cork, Ireland; School of Biomolecular and Biomedical Science, University College Dublin, Dublin 4, Ireland
- Harriet Schellekens
  - Department of Anatomy & Neuroscience, University College Cork, Cork, Ireland; APC Microbiome Institute, University College Cork, Cork, Ireland
- Pavel V Baranov
  - School of Biochemistry & Cell Biology, University College Cork, Cavanagh Pharmacy Building, College Road, Cork, Ireland
- John F Cryan
  - Department of Anatomy & Neuroscience, University College Cork, Cork, Ireland; APC Microbiome Institute, University College Cork, Cork, Ireland
- Dmitri B Papkovsky
  - School of Biochemistry & Cell Biology, University College Cork, Cavanagh Pharmacy Building, College Road, Cork, Ireland
|