1
|
Parmar BS, Peeters MKR, Boonen K, Clark EC, Baggerman G, Menschaert G, Temmerman L. Identification of Non-Canonical Translation Products in C. elegans Using Tandem Mass Spectrometry. Front Genet 2021; 12:728900. [PMID: 34759956 PMCID: PMC8575065 DOI: 10.3389/fgene.2021.728900] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 06/22/2021] [Accepted: 09/16/2021] [Indexed: 11/22/2022] Open
Abstract
Transcriptome and ribosome sequencing have revealed the existence of many non-canonical transcripts, mainly containing splice variants, ncRNA, sORFs and altORFs. However, identification and characterization of products that may be translated out of these remains a challenge. Addressing this, we here report on 552 non-canonical proteins and splice variants in the model organism C. elegans using tandem mass spectrometry. Aided by sequencing-based prediction, we generated a custom proteome database tailored to search for non-canonical translation products of C. elegans. Using this database, we mined available mass spectrometric resources of C. elegans, from which 51 novel, non-canonical proteins could be identified. Furthermore, we utilized diverse proteomic and peptidomic strategies to detect 40 novel non-canonical proteins in C. elegans by LC-TIMS-MS/MS, of which 6 were common with our meta-analysis of existing resources. Together, this permits us to provide a resource with detailed annotation of 467 splice variants and 85 novel proteins mapped onto UTRs, non-coding regions and alternative open reading frames of the C. elegans genome.
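The counts quoted in this abstract can be cross-checked with a few lines of arithmetic. The sketch below is only a reader's consistency check: the variable names are illustrative, and only the numbers themselves come from the abstract.

```python
# Counts as stated in the abstract above.
from_public_data = 51   # novel proteins found by re-mining existing MS resources
from_new_lcms = 40      # novel proteins detected by LC-TIMS-MS/MS
shared = 6              # proteins common to both routes
splice_variants = 467   # annotated splice variants

# Union of the two discovery routes, counting shared proteins once.
novel_proteins = from_public_data + from_new_lcms - shared
total_reported = splice_variants + novel_proteins

print(novel_proteins, total_reported)
```

Both derived values agree with the figures the abstract reports (85 novel proteins, 552 entries in total).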
Affiliation(s)
- Bhavesh S. Parmar
  - Animal Physiology and Neurobiology, University of Leuven (KU Leuven), Leuven, Belgium
- Marlies K. R. Peeters
  - Laboratory of Bioinformatics and Computational Genomics (BioBix), Department of Mathematical Modelling, Ghent University, Ghent, Belgium
- Kurt Boonen
  - Centre for Proteomics (CFP), University of Antwerp, Antwerp, Belgium
- Ellie C. Clark
  - Animal Physiology and Neurobiology, University of Leuven (KU Leuven), Leuven, Belgium
- Geert Baggerman
  - Centre for Proteomics (CFP), University of Antwerp, Antwerp, Belgium
- Gerben Menschaert
  - Laboratory of Bioinformatics and Computational Genomics (BioBix), Department of Mathematical Modelling, Ghent University, Ghent, Belgium
- Liesbet Temmerman
  - Animal Physiology and Neurobiology, University of Leuven (KU Leuven), Leuven, Belgium
|
2
|
Guerra-Almeida D, Tschoeke DA, da-Fonseca RN. Understanding small ORF diversity through a comprehensive transcription feature classification. DNA Res 2021; 28:6317669. [PMID: 34240112 PMCID: PMC8435553 DOI: 10.1093/dnares/dsab007] [Citation(s) in RCA: 5] [Impact Index Per Article: 1.7] [Received: 12/22/2020] [Indexed: 11/13/2022] Open
Abstract
Small open reading frames (small ORFs/sORFs/smORFs) are potentially coding sequences smaller than 100 codons that have historically been considered junk DNA by gene prediction software and in annotation screening; however, the advent of next-generation sequencing has contributed to the deeper investigation of junk DNA regions and their transcription products, resulting in the emergence of smORFs as a new focus of interest in systems biology. Several smORF peptides were recently reported in noncanonical mRNAs as new players in numerous biological contexts; however, their relevance is still overlooked in coding potential analysis. Hence, this review proposes a smORF classification based on transcriptional features, discussing the most promising approaches to investigate smORFs based on their different characteristics. First, smORFs were divided into nonexpressed (intergenic) and expressed (genic) smORFs. Second, genic smORFs were classified as smORFs located in noncoding RNAs (ncRNAs) or canonical mRNAs. Finally, smORFs in ncRNAs were further subdivided into sequences located in small or long RNAs, whereas smORFs located in canonical mRNAs were subdivided into several specific classes depending on their localization along the gene. We hope that this review provides new insights into large-scale annotations and reinforces the role of smORFs as essential components of a hidden coding DNA world.
Affiliation(s)
- Diego Guerra-Almeida
  - Institute of Biodiversity and Sustainability, Federal University of Rio de Janeiro, Rio de Janeiro, Brazil
- Diogo Antonio Tschoeke
  - Alberto Luiz Coimbra Institute of Graduate Studies and Engineering Research (COPPE), Biomedical Engineering Program, Federal University of Rio de Janeiro, Rio de Janeiro, Brazil
- Rodrigo Nunes-da-Fonseca
  - Institute of Biodiversity and Sustainability, Federal University of Rio de Janeiro, Rio de Janeiro, Brazil
  - National Institute of Science and Technology in Molecular Entomology, Rio de Janeiro, Brazil
|
3
|
Tharakan R, Sawa A. Minireview: Novel Micropeptide Discovery by Proteomics and Deep Sequencing Methods. Front Genet 2021; 12:651485. [PMID: 34025718 PMCID: PMC8136307 DOI: 10.3389/fgene.2021.651485] [Citation(s) in RCA: 10] [Impact Index Per Article: 3.3] [Received: 01/09/2021] [Accepted: 03/22/2021] [Indexed: 12/12/2022] Open
Abstract
A novel class of small proteins, called micropeptides, has recently been discovered in the genome. These proteins, which have been found to play important roles in many physiological and cellular systems, are shorter than 100 amino acids and were overlooked during previous genome annotations. Discovery and characterization of more micropeptides has been ongoing, often using -omics methods such as proteomics, RNA sequencing, and ribosome profiling. In this review, we survey the recent advances in the micropeptides field and describe the methodological and conceptual challenges facing future micropeptide endeavors.
Affiliation(s)
- Ravi Tharakan
  - National Institute on Aging, National Institutes of Health, Baltimore, MD, United States
- Akira Sawa
  - Departments of Psychiatry, Neuroscience, Biomedical Engineering, and Genetic Medicine, Johns Hopkins University School of Medicine, Baltimore, MD, United States
  - Department of Mental Health, Johns Hopkins Bloomberg School of Public Health, Baltimore, MD, United States
|
4
|
Schlesinger D, Elsässer SJ. Revisiting sORFs: overcoming challenges to identify and characterize functional microproteins. FEBS J 2021; 289:53-74. [PMID: 33595896 DOI: 10.1111/febs.15769] [Citation(s) in RCA: 51] [Impact Index Per Article: 17.0] [Received: 08/21/2020] [Revised: 01/17/2021] [Accepted: 02/15/2021] [Indexed: 02/07/2023]
Abstract
Short ORFs (sORFs), that is, occurrences of a start and stop codon within 100 codons or less, can be found in organisms of all domains of life, outnumbering annotated protein-coding ORFs by orders of magnitude. Even though functional proteins smaller than 100 amino acids are known, the coding potential of sORFs has often been overlooked, as it is not trivial to predict and test for functionality within the large number of sORFs. Recent advances in ribosome profiling and mass spectrometry approaches, together with refined bioinformatic predictions, have enabled a huge leap forward in this field and identified thousands of likely coding sORFs. A relatively low number of small proteins or microproteins produced from these sORFs have been characterized so far on the molecular, structural, and/or mechanistic level. These however display versatile and, in some cases, essential cellular functions, allowing for the exciting possibility that many more, previously unknown small proteins might be encoded in the genome, waiting to be discovered. This review will give an overview of the steadily growing microprotein field, focusing on eukaryotic small proteins. We will discuss emerging themes in the molecular action of microproteins, as well as advances and challenges in microprotein identification and characterization.
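The working definition given in this abstract (a start and a stop codon within 100 codons or less) can be illustrated with a small scanner. This is a minimal sketch under stated assumptions, not a method from the review: the function name `find_sorfs` is hypothetical, only the forward strand is scanned, only ATG counts as a start, the three standard stop codons are used, and length is counted in sense codons (stop excluded).

```python
STOP_CODONS = {"TAA", "TAG", "TGA"}  # standard genetic code (assumption)

def find_sorfs(seq, max_codons=100):
    """Return (start, end, n_codons) for forward-strand sORFs.

    start/end are 0-based and end-exclusive; end includes the stop codon.
    For each stop codon only the most upstream in-frame ATG is reported.
    """
    seq = seq.upper()
    hits = []
    for frame in range(3):
        orf_start = None
        for i in range(frame, len(seq) - 2, 3):
            codon = seq[i:i + 3]
            if orf_start is None and codon == "ATG":
                orf_start = i                      # open a candidate ORF
            elif orf_start is not None and codon in STOP_CODONS:
                n_codons = (i - orf_start) // 3    # sense codons, stop excluded
                if n_codons <= max_codons:
                    hits.append((orf_start, i + 3, n_codons))
                orf_start = None                   # close the ORF at this stop
    return sorted(hits)

# Toy sequence: ATG AAA TAG embedded at offset 2.
print(find_sorfs("CCATGAAATAGGG"))
```

A scan like this returns every candidate that matches the definition, which is why, as the abstract notes, predicted sORFs vastly outnumber annotated ORFs and need further evidence (ribosome profiling, mass spectrometry, conservation) before being called coding.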
Affiliation(s)
- Dörte Schlesinger
  - Science for Life Laboratory, Division of Genome Biology, Department of Medical Biochemistry and Biophysics, Karolinska Institutet, Stockholm, Sweden
  - Ming Wai Lau Centre for Reparative Medicine, Stockholm node, Karolinska Institutet, Stockholm, Sweden
- Simon J Elsässer
  - Science for Life Laboratory, Division of Genome Biology, Department of Medical Biochemistry and Biophysics, Karolinska Institutet, Stockholm, Sweden
  - Ming Wai Lau Centre for Reparative Medicine, Stockholm node, Karolinska Institutet, Stockholm, Sweden
|
5
|
Dvorak P, Hlavac V, Soucek P. 5' Untranslated Region Elements Show High Abundance and Great Variability in Homologous ABCA Subfamily Genes. Int J Mol Sci 2020; 21:ijms21228878. [PMID: 33238634 PMCID: PMC7700387 DOI: 10.3390/ijms21228878] [Citation(s) in RCA: 3] [Impact Index Per Article: 0.8] [Received: 10/07/2020] [Revised: 11/16/2020] [Accepted: 11/20/2020] [Indexed: 11/16/2022] Open
Abstract
The 12 members of the ABCA subfamily in humans are known for their ability to transport cholesterol and its derivatives, vitamins, and xenobiotics across biomembranes. Several ABCA genes are causatively linked to inborn diseases, and the role in cancer progression and metastasis is studied intensively. The regulation of translation initiation is implicated as the major mechanism in the processes of post-transcriptional modifications determining final protein levels. In the current bioinformatics study, we mapped the features of the 5' untranslated regions (5'UTR) known to have the potential to regulate translation, such as the length of 5'UTRs, upstream ATG codons, upstream open-reading frames, introns, RNA G-quadruplex-forming sequences, stem loops, and Kozak consensus motifs, in the DNA sequences of all members of the subfamily. Subsequently, the conservation of the features, correlations among them, ribosome profiling data as well as protein levels in normal human tissues were examined. The 5'UTRs of ABCA genes contain above-average numbers of upstream ATGs, open-reading frames and introns, as well as conserved ones, and these elements probably play important biological roles in this subfamily, unlike RG4s. Although we found significant correlations among the features, we did not find any correlation between the numbers of 5'UTR features and protein tissue distribution and expression scores. We showed the existence of single nucleotide variants in relation to the 5'UTR features experimentally in a cohort of 105 breast cancer patients. 5'UTR features presumably prepare a complex playground, in which the other elements such as RNA binding proteins and non-coding RNAs play the major role in the fine-tuning of protein expression.
Affiliation(s)
- Pavel Dvorak
  - Department of Biology, Faculty of Medicine in Pilsen, Charles University, 32300 Pilsen, Czech Republic
  - Biomedical Center, Faculty of Medicine in Pilsen, Charles University, 32300 Pilsen, Czech Republic
  - Correspondence: Tel.: +420-377593263
- Viktor Hlavac
  - Biomedical Center, Faculty of Medicine in Pilsen, Charles University, 32300 Pilsen, Czech Republic
  - Toxicogenomics Unit, National Institute of Public Health, 100 42 Prague, Czech Republic
- Pavel Soucek
  - Biomedical Center, Faculty of Medicine in Pilsen, Charles University, 32300 Pilsen, Czech Republic
  - Toxicogenomics Unit, National Institute of Public Health, 100 42 Prague, Czech Republic
|
6
|
Takahashi H, Miyaki S, Onouchi H, Motomura T, Idesako N, Takahashi A, Murase M, Fukuyoshi S, Endo T, Satou K, Naito S, Itoh M. Exhaustive identification of conserved upstream open reading frames with potential translational regulatory functions from animal genomes. Sci Rep 2020; 10:16289. [PMID: 33004976 PMCID: PMC7530721 DOI: 10.1038/s41598-020-73307-6] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 03/27/2020] [Accepted: 09/15/2020] [Indexed: 11/17/2022] Open
Abstract
Upstream open reading frames (uORFs) are present in the 5′-untranslated regions of many eukaryotic mRNAs, and some peptides encoded by these regions play important regulatory roles in controlling main ORF (mORF) translation. We previously developed a novel pipeline, ESUCA, to comprehensively identify plant uORFs encoding functional peptides, based on genome-wide identification of uORFs with conserved peptide sequences (CPuORFs). Here, we applied ESUCA to diverse animal genomes, because animal CPuORFs have been identified only by comparing uORF sequences between a limited number of species, and how many previously identified CPuORFs encode regulatory peptides is unclear. By using ESUCA, 1517 (1373 novel and 144 known) CPuORFs were extracted from four evolutionarily divergent animal genomes. We examined the effects of 17 human CPuORFs on mORF translation using transient expression assays. Through these analyses, we identified seven novel regulatory CPuORFs that repressed mORF translation in a sequence-dependent manner, including one conserved only among Eutheria. We discovered a much higher number of animal CPuORFs than previously identified. Since most human CPuORFs identified in this study are conserved across a wide range of Eutheria or a wider taxonomic range, many CPuORFs encoding regulatory peptides are expected to be found in the identified CPuORFs.
Affiliation(s)
- Hiro Takahashi
  - Graduate School of Medical Sciences, Kanazawa University, Kanazawa, 920-1192, Japan
  - Graduate School of Horticulture, Chiba University, Matsudo, 271-8510, Japan
  - Fundamental Innovative Oncology Core Center, National Cancer Center, Tokyo, 104-0045, Japan
- Shido Miyaki
  - Graduate School of Horticulture, Chiba University, Matsudo, 271-8510, Japan
- Hitoshi Onouchi
  - Graduate School of Agriculture, Hokkaido University, Sapporo, 060-8589, Japan
- Taichiro Motomura
  - Graduate School of Medical Sciences, Kanazawa University, Kanazawa, 920-1192, Japan
- Nobuo Idesako
  - Graduate School of Horticulture, Chiba University, Matsudo, 271-8510, Japan
- Anna Takahashi
  - Faculty of Information Technologies and Control, Belarusian State University of Informatics and Radio Electronics, 220013, Minsk, Belarus
  - College of Bioscience and Biotechnology, Chubu University, Kasugai, 487-8501, Japan
- Masataka Murase
  - Graduate School of Medical Sciences, Kanazawa University, Kanazawa, 920-1192, Japan
- Shuichi Fukuyoshi
  - Institute of Medical, Pharmaceutical and Health Sciences, Kanazawa University, Kanazawa, 920-1192, Japan
- Toshinori Endo
  - Graduate School of Information Science and Technology, Hokkaido University, Sapporo, 060-0814, Japan
- Kenji Satou
  - Faculty of Biological Science and Technology, Institute of Science and Engineering, Kanazawa University, Kanazawa, 920-1192, Japan
- Satoshi Naito
  - Graduate School of Agriculture, Hokkaido University, Sapporo, 060-8589, Japan
  - Graduate School of Life Science, Hokkaido University, Sapporo, 060-0810, Japan
- Motoyuki Itoh
  - Graduate School of Pharmaceutical Science, Chiba University, Chiba, 260-8675, Japan
|
7
|
uORFs: Important Cis-Regulatory Elements in Plants. Int J Mol Sci 2020; 21:ijms21176238. [PMID: 32872304 PMCID: PMC7503886 DOI: 10.3390/ijms21176238] [Citation(s) in RCA: 26] [Impact Index Per Article: 6.5] [Received: 07/23/2020] [Revised: 08/20/2020] [Accepted: 08/22/2020] [Indexed: 11/17/2022] Open
Abstract
Gene expression is regulated at many levels, including mRNA transcription, translation, and post-translational modification. Compared with transcriptional regulation, mRNA translational control is a more critical step in gene expression and allows for more rapid changes of encoded protein concentrations in cells. Translation is highly regulated by complex interactions between cis-acting elements and trans-acting factors. Initiation is not only the first phase of translation, but also the core of translational regulation, because it limits the rate of protein synthesis. As potent cis-regulatory elements in eukaryotic mRNAs, upstream open reading frames (uORFs) generally inhibit the translation initiation of downstream major ORFs (mORFs) through ribosome stalling. During the past few years, with the development of RNA-seq and ribosome profiling, functional uORFs have been identified and characterized in many organisms. Here, we review uORF identification, uORF classification, and uORF-mediated translation initiation. More importantly, we summarize the translational regulation of uORFs in plant metabolic pathways, morphogenesis, disease resistance, and nutrient absorption, which open up an avenue for precisely modulating the plant growth and development, as well as environmental adaption. Additionally, we also discuss prospective applications of uORFs in plant breeding.
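The uORF identification and classification mentioned in this abstract can be illustrated with a short sketch. It rests on several assumptions that are not taken from the review: the function name `find_uorfs` is hypothetical, only ATG is treated as a start codon, the three standard stop codons are used, and uORFs are split into just two classes depending on whether they end before the mORF start or read past it.

```python
STOPS = {"TAA", "TAG", "TGA"}  # standard stop codons (assumption)

def find_uorfs(mrna, morf_start):
    """List uORFs whose ATG lies entirely in the 5' UTR.

    mrna       -- transcript sequence (DNA alphabet), 5' to 3'
    morf_start -- 0-based index of the main ORF's ATG
    Returns (start, end, kind) tuples; end is exclusive and includes the stop.
    """
    mrna = mrna.upper()
    uorfs = []
    for start in range(0, morf_start - 2):
        if mrna[start:start + 3] != "ATG":
            continue
        # Walk in-frame from the upstream ATG to the first stop codon.
        for i in range(start + 3, len(mrna) - 2, 3):
            if mrna[i:i + 3] in STOPS:
                end = i + 3
                kind = "upstream" if end <= morf_start else "overlapping"
                uorfs.append((start, end, kind))
                break
    return uorfs

# uORF (ATG CCC TAA) ends before the mORF that starts at index 14.
print(find_uorfs("GGATGCCCTAAGGGATGAAATGA", 14))
# Out-of-frame uORF that reads past the mORF start at index 4.
print(find_uorfs("ATGCATGAAATAGTATAA", 4))
```

Real pipelines typically add further filters, such as non-ATG starts, ribosome profiling support, or peptide conservation, which this sketch leaves out.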
|
8
|
Takahashi H, Hayashi N, Hiragori Y, Sasaki S, Motomura T, Yamashita Y, Naito S, Takahashi A, Fuse K, Satou K, Endo T, Kojima S, Onouchi H. Comprehensive genome-wide identification of angiosperm upstream ORFs with peptide sequences conserved in various taxonomic ranges using a novel pipeline, ESUCA. BMC Genomics 2020; 21:260. [PMID: 32228449 PMCID: PMC7106846 DOI: 10.1186/s12864-020-6662-5] [Citation(s) in RCA: 7] [Impact Index Per Article: 1.8] [Received: 07/23/2019] [Accepted: 03/10/2020] [Indexed: 12/27/2022] Open
Abstract
Background: Upstream open reading frames (uORFs) in the 5′-untranslated regions (5′-UTRs) of certain eukaryotic mRNAs encode evolutionarily conserved functional peptides, such as cis-acting regulatory peptides that control translation of downstream main ORFs (mORFs). For genome-wide searches for uORFs with conserved peptide sequences (CPuORFs), comparative genomic studies have been conducted, in which uORF sequences were compared between selected species. To increase chances of identifying CPuORFs, we previously developed an approach in which uORF sequences were compared using BLAST between Arabidopsis and any other plant species with available transcript sequence databases. If this approach is applied to multiple plant species belonging to phylogenetically distant clades, it is expected to further comprehensively identify CPuORFs conserved in various plant lineages, including those conserved among relatively small taxonomic groups.
Results: To efficiently compare uORF sequences among many species and efficiently identify CPuORFs conserved in various taxonomic lineages, we developed a novel pipeline, ESUCA. We applied ESUCA to the genomes of five angiosperm species, which belong to phylogenetically distant clades, and selected CPuORFs conserved among at least three different orders. Through these analyses, we identified 89 novel CPuORF families. As expected, ESUCA analysis of each of the five angiosperm genomes identified many CPuORFs that were not identified from ESUCA analyses of the other four species. However, unexpectedly, these CPuORFs include those conserved across wide taxonomic ranges, indicating that the approach used here is useful not only for comprehensive identification of narrowly conserved CPuORFs but also for that of widely conserved CPuORFs. Examination of the effects of 11 selected CPuORFs on mORF translation revealed that CPuORFs conserved only in relatively narrow taxonomic ranges can have sequence-dependent regulatory effects, suggesting that most of the identified CPuORFs are conserved because of functional constraints of their encoded peptides.
Conclusions: This study demonstrates that ESUCA is capable of efficiently identifying CPuORFs likely to be conserved because of the functional importance of their encoded peptides. Furthermore, our data show that the approach in which uORF sequences from multiple species are compared with those of many other species, using ESUCA, is highly effective in comprehensively identifying CPuORFs conserved in various taxonomic ranges.
Affiliation(s)
- Hiro Takahashi
  - Graduate School of Medical Sciences, Kanazawa University, Kanazawa, 920-1192, Japan
  - Graduate School of Horticulture, Chiba University, Matsudo, 271-8510, Japan
- Noriya Hayashi
  - Graduate School of Agriculture, Hokkaido University, Sapporo, 060-8589, Japan
- Yuta Hiragori
  - Graduate School of Agriculture, Hokkaido University, Sapporo, 060-8589, Japan
- Shun Sasaki
  - Graduate School of Agriculture, Hokkaido University, Sapporo, 060-8589, Japan
- Taichiro Motomura
  - Graduate School of Medical Sciences, Kanazawa University, Kanazawa, 920-1192, Japan
- Yui Yamashita
  - Graduate School of Agriculture, Hokkaido University, Sapporo, 060-8589, Japan
- Satoshi Naito
  - Graduate School of Agriculture, Hokkaido University, Sapporo, 060-8589, Japan
  - Graduate School of Life Science, Hokkaido University, Sapporo, 060-0810, Japan
- Anna Takahashi
  - Faculty of Information Technologies and Control, Belarusian State University of Informatics and Radio Electronics, 220013, Minsk, Belarus
- Kazuyuki Fuse
  - New Business Development Office, Churitsu Electric Corporation, Toyoake, 470-1112, Japan
- Kenji Satou
  - Faculty of Biological Science and Technology, Institute of Science and Engineering, Kanazawa University, Kanazawa, 920-1192, Japan
- Toshinori Endo
  - Graduate School of Information Science and Technology, Hokkaido University, Sapporo, 060-0814, Japan
- Shoko Kojima
  - Graduate School of Bioscience and Biotechnology, Chubu University, Kasugai, 487-8501, Japan
- Hitoshi Onouchi
  - Graduate School of Agriculture, Hokkaido University, Sapporo, 060-8589, Japan
|
9
|
Chu Y, Huang J, Ma G, Cui T, Yan X, Li H, Wang N. An Upstream Open Reading Frame Represses Translation of Chicken PPARγ Transcript Variant 1. Front Genet 2020; 11:165. [PMID: 32184808 PMCID: PMC7058706 DOI: 10.3389/fgene.2020.00165] [Citation(s) in RCA: 1] [Impact Index Per Article: 0.3] [Received: 12/05/2019] [Accepted: 02/12/2020] [Indexed: 11/20/2022] Open
Abstract
Peroxisome proliferator-activated receptor γ (PPARγ) is a master regulator of adipogenesis. The PPARγ gene produces various transcripts with different 5'-untranslated regions (5' UTRs) because of alternative promoter usage and splicing. The 5' UTR plays important roles in posttranscriptional gene regulation. However, to date, the regulatory role and underlying mechanism of 5' UTRs in the posttranscriptional regulation of PPARγ expression remain largely unclear. In this study, we investigated the effects of 5' UTRs on posttranscriptional regulation using reporter assays. Our results showed that the five PPARγ 5' UTRs exerted different effects on reporter gene activity. Bioinformatics analysis showed that chicken PPARγ transcript 1 (PPARγ1) possessed an upstream open reading frame (uORF) in its 5' UTR. Mutation analysis showed that a mutation in the uORF led to increased Renilla luciferase activity and PPARγ protein expression, but decreased Renilla luciferase and PPARγ1 mRNA expression. mRNA stability analysis using real-time RT-PCR showed that the uORF mutation did not interfere with mRNA stability, but promoter activity analysis of the cloned 5' UTR showed that the uORF mutation reduced promoter activity. Furthermore, in vitro transcription/translation assays demonstrated that the uORF mutation markedly increased the translation of PPARγ1 mRNA. Collectively, our results indicate that the uORF represses the translation of chicken PPARγ1 mRNA.
Affiliation(s)
- Yankai Chu
  - Key Laboratory of Chicken Genetics and Breeding, Ministry of Agriculture and Rural Affairs, Harbin, China
  - Key Laboratory of Animal Genetics, Breeding and Reproduction, Education Department of Heilongjiang Province, Harbin, China
  - College of Animal Science and Technology, Northeast Agricultural University, Harbin, China
- Jiaxin Huang
  - Key Laboratory of Chicken Genetics and Breeding, Ministry of Agriculture and Rural Affairs, Harbin, China
  - Key Laboratory of Animal Genetics, Breeding and Reproduction, Education Department of Heilongjiang Province, Harbin, China
  - College of Animal Science and Technology, Northeast Agricultural University, Harbin, China
- Guangwei Ma
  - Key Laboratory of Chicken Genetics and Breeding, Ministry of Agriculture and Rural Affairs, Harbin, China
  - Key Laboratory of Animal Genetics, Breeding and Reproduction, Education Department of Heilongjiang Province, Harbin, China
  - College of Animal Science and Technology, Northeast Agricultural University, Harbin, China
- Tingting Cui
  - Key Laboratory of Chicken Genetics and Breeding, Ministry of Agriculture and Rural Affairs, Harbin, China
  - Key Laboratory of Animal Genetics, Breeding and Reproduction, Education Department of Heilongjiang Province, Harbin, China
  - College of Animal Science and Technology, Northeast Agricultural University, Harbin, China
- Xiaohong Yan
  - Key Laboratory of Chicken Genetics and Breeding, Ministry of Agriculture and Rural Affairs, Harbin, China
  - Key Laboratory of Animal Genetics, Breeding and Reproduction, Education Department of Heilongjiang Province, Harbin, China
  - College of Animal Science and Technology, Northeast Agricultural University, Harbin, China
- Hui Li
  - Key Laboratory of Chicken Genetics and Breeding, Ministry of Agriculture and Rural Affairs, Harbin, China
  - Key Laboratory of Animal Genetics, Breeding and Reproduction, Education Department of Heilongjiang Province, Harbin, China
  - College of Animal Science and Technology, Northeast Agricultural University, Harbin, China
- Ning Wang
  - Key Laboratory of Chicken Genetics and Breeding, Ministry of Agriculture and Rural Affairs, Harbin, China
  - Key Laboratory of Animal Genetics, Breeding and Reproduction, Education Department of Heilongjiang Province, Harbin, China
  - College of Animal Science and Technology, Northeast Agricultural University, Harbin, China
|
10
|
Cerqueira FR, Vasconcelos ATR. OCCAM: prediction of small ORFs in bacterial genomes by means of a target-decoy database approach and machine learning techniques. Database (Oxford) 2020; 2020:5989499. [PMID: 33206960 PMCID: PMC7673341 DOI: 10.1093/database/baaa067] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 05/26/2020] [Revised: 07/11/2020] [Accepted: 07/27/2020] [Indexed: 11/14/2022]
Abstract
Small open reading frames (ORFs) have been systematically disregarded by automatic genome annotation. The difficulty of finding patterns in such short sequences is the main reason small ORFs are overlooked by computational procedures. However, advances in experimental methods show that small proteins can play vital roles in cellular activities. Hence, it is urgent to make progress in the development of computational approaches to speed up the identification of potential small ORFs. In this work, our focus is on bacterial genomes. We improve a previous approach to identify small ORFs in bacteria. Our method uses machine learning techniques and decoy subject sequences to filter out spurious ORF alignments. We show that an advanced multivariate analysis can be more effective in terms of sensitivity than applying the simplistic and widely used e-value cutoff. This is particularly important in the case of small ORFs, for which alignments present higher e-values than usual. Experiments with control datasets show that the machine learning algorithms used in our method to curate significant alignments can achieve average sensitivity and specificity of 97.06% and 99.61%, respectively. Therefore, an important step is provided here toward the construction of more accurate computational tools for the identification of small ORFs in bacteria.
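For readers less familiar with the two metrics reported in this abstract, the standard definitions can be sketched as below. The function names and the confusion-matrix counts are invented for illustration only; they are not the paper's data, and the sketch does not reproduce the OCCAM classifier itself.

```python
def sensitivity(true_pos, false_neg):
    """Fraction of genuine alignments that the filter keeps (TP / (TP + FN))."""
    return true_pos / (true_pos + false_neg)

def specificity(true_neg, false_pos):
    """Fraction of spurious alignments that the filter rejects (TN / (TN + FP))."""
    return true_neg / (true_neg + false_pos)

# Hypothetical counts for a control set of labelled alignments.
tp, fn = 90, 10   # genuine alignments kept / wrongly discarded
tn, fp = 95, 5    # spurious (e.g. decoy) alignments rejected / wrongly kept

print(f"sensitivity = {sensitivity(tp, fn):.2%}")
print(f"specificity = {specificity(tn, fp):.2%}")
```

In a target-decoy setting, alignments against decoy sequences serve as known negatives, which is what makes the true-negative and false-positive counts measurable on a control dataset.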
Affiliation(s)
- Fabio R Cerqueira
  - Department of Production Engineering, Universidade Federal Fluminense, Rua Domingos Silvério s/n, Petrópolis, 25 650-050, Rio de Janeiro, Brazil
  - Graduate Program in Computer Science, Universidade Federal de Viçosa, 36570-900, Minas Gerais, Brazil
|
11
|
Rothnagel J, Menschaert G. Short Open Reading Frames and Their Encoded Peptides. Proteomics 2019; 18:e1700035. [PMID: 29691985 DOI: 10.1002/pmic.201700035] [Citation(s) in RCA: 5] [Impact Index Per Article: 1.0] [Received: 04/23/2018] [Revised: 04/23/2018] [Indexed: 11/10/2022]
|
12
|
Wang T, Liu Y, Liu Q, Cummins S, Zhao M. Integrative proteomic analysis reveals potential high-frequency alternative open reading frame-encoded peptides in human colorectal cancer. Life Sci 2018; 215:182-189. [PMID: 30419281 DOI: 10.1016/j.lfs.2018.11.018] [Citation(s) in RCA: 5] [Impact Index Per Article: 0.8] [Received: 09/25/2018] [Revised: 10/31/2018] [Accepted: 11/08/2018] [Indexed: 11/30/2022]
Abstract
Identification of alternative open reading frame-encoded peptides (AEPs) for the diagnosis of colorectal cancer at the proteome level is largely unexplored because of a lack of comprehensive proteomics data. Here, we performed a comprehensive integrative analysis of mass spectral data published by the Clinical Proteomic Tumor Analysis Consortium and characterized 93 high-confidence AEPs encoded within 75 genes. Four cancer-related genes, ABCF2, AR, RBM10 and NRG1, had AEPs identified frequently, in >20 out of 95 colorectal cancer samples. Further network analysis of the identified AEPs found the enrichment of novel AEPs within hormone androgen receptor and a highly-modularised network with 42 genes associated with patient survival. Our results not only suggested a mechanistic view of how AEPs work in cancer progression, but also shed light on somatic amino acid mutations in AEPs, which might have been overlooked previously because of their low frequencies. In particular, potential high-frequency mutations in 77 samples associated with EDARADD may contribute to the discovery of new biomarkers and the development of innovative therapeutic approaches.
Affiliation(s)
- Tianfang Wang
- School of Science and Engineering, University of the Sunshine Coast, Maroochydore DC, Queensland, 4558, Australia.
| | - Yining Liu
- The School of Public Health, Institute for Chemical Carcinogenesis, Guangzhou Medical University, 195 Dongfengxi Road, Guangzhou 510182, China
| | - Qi Liu
- Department of Biomedical Informatics, School of Medicine, Vanderbilt University, Nashville, TN 37232, United States; Center for Quantitative Sciences, School of Medicine, Vanderbilt University, Nashville, TN 37232, United States
| | - Scott Cummins
- School of Science and Engineering, University of the Sunshine Coast, Maroochydore DC, Queensland, 4558, Australia
| | - Min Zhao
- School of Science and Engineering, University of the Sunshine Coast, Maroochydore DC, Queensland, 4558, Australia.
| |
|
13
|
Zhu Y, Vaughn JC. Experimental Verification and Evolutionary Origin of 5'-UTR Polyadenylation Sites in Arabidopsis thaliana. Frontiers in Plant Science 2018; 9:969. [PMID: 30026753 PMCID: PMC6041940 DOI: 10.3389/fpls.2018.00969] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 12/31/2017] [Accepted: 06/15/2018] [Indexed: 06/08/2023]
Abstract
Messenger RNA (mRNA) polyadenylation is an indispensable step during post-transcriptional pre-mRNA processing for most genes in eukaryotes. The usage of one poly(A) site over another is known as alternative polyadenylation (APA). APA has been implicated in gene expression regulation through its role of selecting the ends of a transcript. Recent studies of polyadenylation profiles in the Arabidopsis database unexpectedly predicted that a portion of the poly(A) sites are located in the 5'-UTR, which remains to be experimentally verified. We selected 16 genes from a dataset of 744, based on criteria designed to minimize problems in interpretation. Here, we experimentally verify 5'-UTR-APA in Arabidopsis for 10 of the 16 selected genes, and show for the first time the existence of independent polyadenylated 5'-UTR transcripts, arising due to alternative polyadenylation. We used 3'-RACE and sequencing to validate poly(A) sites and northern blot to show that the observed short upstream transcripts do not arise from the 3'-end of a previously unrecognized convergent gene. Evidence is reported showing that two of the independent upstream open reading frame (uORF) transcripts studied, one containing a complex dual uORF, very likely arose by exon shuffling following duplication of the 5'-end from the downstream major open reading frame (mORF). Finally, results are presented to show that the uORF in this gene may encode two short functional proteins, based on observation of amino acid sequence conservation encoded by the dual uORFs.
|
14
|
Abstract
Peptides encoded by short open reading frames (sORFs) are usually defined as peptides ≤100 aa long. sORFs have usually been ignored by automatic genome annotation programs due to the high probability of false discovery. However, improved computational tools along with a high-throughput RIBO-seq approach identified a myriad of translated sORFs. Their importance becomes evident as we are gaining experimental validation of their diverse cellular functions. This Review examines various computational and experimental approaches to sORF identification and summarizes our current knowledge of their functional roles in cells.
Affiliation(s)
- Anastasia Chugunova
- Lomonosov Moscow State University, Department of Chemistry and A.N. Belozersky Institute of Physico-Chemical Biology, Moscow 119992, Russia; Skolkovo Institute of Science and Technology, Skolkovo, Moscow Region 143025, Russia
| | - Tsimafei Navalayeu
- Lomonosov Moscow State University, Department of Chemistry and A.N. Belozersky Institute of Physico-Chemical Biology, Moscow 119992, Russia
| | - Olga Dontsova
- Lomonosov Moscow State University, Department of Chemistry and A.N. Belozersky Institute of Physico-Chemical Biology, Moscow 119992, Russia; Skolkovo Institute of Science and Technology, Skolkovo, Moscow Region 143025, Russia
| | - Petr Sergiev
- Lomonosov Moscow State University, Department of Chemistry and A.N. Belozersky Institute of Physico-Chemical Biology, Moscow 119992, Russia; Skolkovo Institute of Science and Technology, Skolkovo, Moscow Region 143025, Russia
| |
|
15
|
Hayashi N, Sasaki S, Takahashi H, Yamashita Y, Naito S, Onouchi H. Identification of Arabidopsis thaliana upstream open reading frames encoding peptide sequences that cause ribosomal arrest. Nucleic Acids Res 2017. [PMID: 28637336 PMCID: PMC5587730 DOI: 10.1093/nar/gkx528] [Citation(s) in RCA: 25] [Impact Index Per Article: 3.6] [Indexed: 11/22/2022] Open
Abstract
Specific sequences of certain nascent peptides cause programmed ribosomal arrest during mRNA translation to control gene expression. In eukaryotes, most known regulatory arrest peptides are encoded by upstream open reading frames (uORFs) present in the 5′-untranslated region of mRNAs. However, to date, a limited number of eukaryotic uORFs encoding arrest peptides have been reported. Here, we searched for arrest peptide-encoding uORFs among Arabidopsis thaliana uORFs with evolutionarily conserved peptide sequences. Analysis of in vitro translation products of 22 conserved uORFs identified three novel uORFs causing ribosomal arrest in a peptide sequence-dependent manner. Stop codon-scanning mutagenesis, in which the effect of changing the uORF stop codon position on the ribosomal arrest was examined, and toeprint analysis revealed that two of the three uORFs cause ribosomal arrest during translation elongation, whereas the other one causes ribosomal arrest during translation termination. Transient expression assays showed that the newly identified arrest-causing uORFs exerted a strong sequence-dependent repressive effect on the expression of the downstream reporter gene in A. thaliana protoplasts. These results suggest that the peptide sequences of the three uORFs identified in this study cause ribosomal arrest in the uORFs, thereby repressing the expression of proteins encoded by the main ORFs.
Affiliation(s)
- Noriya Hayashi
- Graduate School of Agriculture, Hokkaido University, Sapporo 060-8589, Japan
| | - Shun Sasaki
- Graduate School of Agriculture, Hokkaido University, Sapporo 060-8589, Japan
| | - Hiro Takahashi
- Graduate School of Horticulture, Chiba University, Chiba 263-8522, Japan
| | - Yui Yamashita
- Graduate School of Agriculture, Hokkaido University, Sapporo 060-8589, Japan
| | - Satoshi Naito
- Graduate School of Agriculture, Hokkaido University, Sapporo 060-8589, Japan; Graduate School of Life Science, Hokkaido University, Sapporo 060-0810, Japan
| | - Hitoshi Onouchi
- Graduate School of Agriculture, Hokkaido University, Sapporo 060-8589, Japan
| |
|
16
|
Separating the wheat from the chaff: systematic identification of functionally relevant noncoding variants in ADHD. Mol Psychiatry 2016; 21:1589-1598. [PMID: 27113999 DOI: 10.1038/mp.2016.2] [Citation(s) in RCA: 6] [Impact Index Per Article: 0.8] [Received: 07/01/2015] [Revised: 12/04/2015] [Accepted: 01/11/2016] [Indexed: 12/16/2022]
Abstract
Attention deficit hyperactivity disorder (ADHD) is a highly heritable psychiatric condition with negative lifetime outcomes. Uncovering its genetic architecture should yield important insights into the neurobiology of ADHD and assist development of novel treatment strategies. Twenty years of candidate gene investigations and more recently genome-wide association studies have identified an array of potential association signals. In this context, separating the likely true from false associations ('the wheat' from 'the chaff') will be crucial for uncovering the functional biology of ADHD. Here, we defined a set of 2070 DNA variants that showed evidence of association with ADHD (or were in linkage disequilibrium). More than 97% of these variants were noncoding, and were prioritised for further exploration using two tools, genome-wide annotation of variants (GWAVA) and Combined Annotation-Dependent Depletion (CADD), that were recently developed to rank variants based upon their likely pathogenicity. Capitalising on recent efforts such as the Encyclopaedia of DNA Elements and US National Institutes of Health Roadmap Epigenomics Projects to improve understanding of the noncoding genome, we subsequently identified 65 variants to which we assigned functional annotations, based upon their likely impact on alternative splicing, transcription factor binding and translational regulation. We propose that these 65 variants, which possess not only a high likelihood of pathogenicity but also readily testable functional hypotheses, represent a tractable shortlist for future experimental validation in ADHD. Taken together, this study brings into sharp focus the likely relevance of noncoding variants for the genetic risk associated with ADHD, and more broadly suggests a bioinformatics approach that should be relevant to other psychiatric disorders.
|
17
|
Cabrera-Quio LE, Herberg S, Pauli A. Decoding sORF translation - from small proteins to gene regulation. RNA Biol 2016; 13:1051-1059. [PMID: 27653973 DOI: 10.1080/15476286.2016.1218589] [Citation(s) in RCA: 49] [Impact Index Per Article: 6.1] [Indexed: 12/20/2022] Open
Abstract
Translation is best known as the fundamental mechanism by which the ribosome converts a sequence of nucleotides into a string of amino acids. Extensive research over many years has elucidated the key principles of translation, and the majority of translated regions were thought to be known. The recent discovery of wide-spread translation outside of annotated protein-coding open reading frames (ORFs) came therefore as a surprise, raising the intriguing possibility that these newly discovered translated regions might have unrecognized protein-coding or gene-regulatory functions. Here, we highlight recent findings that provide evidence that some of these newly discovered translated short ORFs (sORFs) encode functional, previously missed small proteins, while others have regulatory roles. Based on known examples we will also speculate about putative additional roles and the potentially much wider impact that these translated regions might have on cellular homeostasis and gene regulation.
Affiliation(s)
| | - Sarah Herberg
- The Research Institute of Molecular Pathology, Vienna Biocenter (VBC), Vienna, Austria
| | - Andrea Pauli
- The Research Institute of Molecular Pathology, Vienna Biocenter (VBC), Vienna, Austria
| |
|
18
|
New Peptides Under the s(ORF)ace of the Genome. Trends Biochem Sci 2016; 41:665-678. [DOI: 10.1016/j.tibs.2016.05.003] [Citation(s) in RCA: 69] [Impact Index Per Article: 8.6] [Received: 03/04/2016] [Revised: 04/28/2016] [Accepted: 05/03/2016] [Indexed: 01/30/2023]
|
19
|
Sheshukova EV, Shindyapina AV, Komarova TV, Dorokhov YL. “Matreshka” genes with alternative reading frames. Russ J Genet 2016. [DOI: 10.1134/s1022795416020149] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 11/23/2022]
|
20
|
Olexiouk V, Menschaert G. Identification of Small Novel Coding Sequences, a Proteogenomics Endeavor. Advances in Experimental Medicine and Biology 2016; 926:49-64. [PMID: 27686805 DOI: 10.1007/978-3-319-42316-6_4] [Citation(s) in RCA: 14] [Impact Index Per Article: 1.8] [Indexed: 12/15/2022]
Abstract
The identification of small proteins and peptides has consistently proven to be challenging. However, technological advances as well as multi-omics endeavors facilitate the identification of novel small coding sequences, leading to new insights. Specifically, the application of next generation sequencing technologies (NGS), providing accurate and sample specific transcriptome / translatome information, into the proteomics field led to more comprehensive results and new discoveries. This book chapter focuses on the inclusion of RNA-Seq and RIBO-Seq, also known as ribosome profiling, an RNA-Seq based technique sequencing the approximately 30 bp long fragments captured by translating ribosomes. We emphasize the identification of micropeptides and neo-antigens, two distinct classes of small translation products, triggering our current understanding of biology. RNA-Seq is capable of capturing sample specific genomic variations, enabling focused neo-antigen identification. RIBO-Seq can identify translation events in small open reading frames which are considered to be non-coding, leading to the discovery of micropeptides. The identification of small translation products requires the integration of multi-omics data, stressing the importance of proteogenomics in this novel research area.
Affiliation(s)
- Volodimir Olexiouk
- Lab of Bioinformatics and Computational Genomics (BioBix), Faculty of Bioscience Engineering, Department of Mathematical Modelling, Statistics and Bioinformatics, Ghent University, Coupure Links 653, Building A, Ghent, 9000, Belgium.
| | - Gerben Menschaert
- Lab of Bioinformatics and Computational Genomics (BioBix), Faculty of Bioscience Engineering, Department of Mathematical Modelling, Statistics and Bioinformatics, Ghent University, Coupure Links 653, Building A, Ghent, 9000, Belgium
| |
|
21
|
Mackowiak SD, Zauber H, Bielow C, Thiel D, Kutz K, Calviello L, Mastrobuoni G, Rajewsky N, Kempa S, Selbach M, Obermayer B. Extensive identification and analysis of conserved small ORFs in animals. Genome Biol 2015; 16:179. [PMID: 26364619 PMCID: PMC4568590 DOI: 10.1186/s13059-015-0742-x] [Citation(s) in RCA: 140] [Impact Index Per Article: 15.6] [Received: 04/23/2015] [Accepted: 08/05/2015] [Indexed: 02/06/2023] Open
Abstract
Background: There is increasing evidence that transcripts or transcript regions annotated as non-coding can harbor functional short open reading frames (sORFs). Loss-of-function experiments have identified essential developmental or physiological roles for a few of the encoded peptides (micropeptides), but genome-wide experimental or computational identification of functional sORFs remains challenging. Results: Here, we expand our previously developed method and present results of an integrated computational pipeline for the identification of conserved sORFs in human, mouse, zebrafish, fruit fly, and the nematode C. elegans. Isolating specific conservation signatures indicative of purifying selection on amino acid (rather than nucleotide) sequence, we identify about 2,000 novel small ORFs located in the untranslated regions of canonical mRNAs or on transcripts annotated as non-coding. Predicted sORFs show stronger conservation signatures than those identified in previous studies and are sometimes conserved over large evolutionary distances. The encoded peptides have little homology to known proteins and are enriched in disordered regions and short linear interaction motifs. Published ribosome profiling data indicate translation of more than 100 novel sORFs, and mass spectrometry data provide evidence for more than 70 novel candidates. Conclusions: Taken together, we identify hundreds of previously unknown conserved sORFs in major model organisms. Our computational analyses and integration with experimental data show that these sORFs are expressed, often translated, and sometimes widely conserved, in some cases even between vertebrates and invertebrates. We thus provide an integrated resource of putatively functional micropeptides for functional validation in vivo.
Affiliation(s)
- Sebastian D Mackowiak
- Berlin Institute for Medical Systems Biology, Max-Delbrück-Center for Molecular Medicine, Robert-Rössle-Str. 10, 13125, Berlin, Germany.
| | - Henrik Zauber
- Berlin Institute for Medical Systems Biology, Max-Delbrück-Center for Molecular Medicine, Robert-Rössle-Str. 10, 13125, Berlin, Germany.
| | - Chris Bielow
- Berlin Institute for Medical Systems Biology, Max-Delbrück-Center for Molecular Medicine, Robert-Rössle-Str. 10, 13125, Berlin, Germany; Berlin Institute of Health, Kapelle-Ufer 2, 10117, Berlin, Germany.
| | - Denise Thiel
- Berlin Institute for Medical Systems Biology, Max-Delbrück-Center for Molecular Medicine, Robert-Rössle-Str. 10, 13125, Berlin, Germany.
| | - Kamila Kutz
- Berlin Institute for Medical Systems Biology, Max-Delbrück-Center for Molecular Medicine, Robert-Rössle-Str. 10, 13125, Berlin, Germany.
| | - Lorenzo Calviello
- Berlin Institute for Medical Systems Biology, Max-Delbrück-Center for Molecular Medicine, Robert-Rössle-Str. 10, 13125, Berlin, Germany.
| | - Guido Mastrobuoni
- Berlin Institute for Medical Systems Biology, Max-Delbrück-Center for Molecular Medicine, Robert-Rössle-Str. 10, 13125, Berlin, Germany.
| | - Nikolaus Rajewsky
- Berlin Institute for Medical Systems Biology, Max-Delbrück-Center for Molecular Medicine, Robert-Rössle-Str. 10, 13125, Berlin, Germany.
| | - Stefan Kempa
- Berlin Institute for Medical Systems Biology, Max-Delbrück-Center for Molecular Medicine, Robert-Rössle-Str. 10, 13125, Berlin, Germany.
| | - Matthias Selbach
- Berlin Institute for Medical Systems Biology, Max-Delbrück-Center for Molecular Medicine, Robert-Rössle-Str. 10, 13125, Berlin, Germany.
| | - Benedikt Obermayer
- Berlin Institute for Medical Systems Biology, Max-Delbrück-Center for Molecular Medicine, Robert-Rössle-Str. 10, 13125, Berlin, Germany.
| |
|
22
|
Kerr N, Holmes FE, Hobson SA, Vanderplank P, Leard A, Balthasar N, Wynick D. The generation of knock-in mice expressing fluorescently tagged galanin receptors 1 and 2. Mol Cell Neurosci 2015; 68:258-71. [PMID: 26292267 PMCID: PMC4604734 DOI: 10.1016/j.mcn.2015.08.006] [Citation(s) in RCA: 10] [Impact Index Per Article: 1.1] [Received: 05/20/2015] [Revised: 08/06/2015] [Accepted: 08/10/2015] [Indexed: 12/12/2022] Open
Abstract
The neuropeptide galanin has diverse roles in the central and peripheral nervous systems, by activating the G protein-coupled receptors Gal1, Gal2 and the less studied Gal3 (GalR1-3 gene products). There is a wealth of data on expression of Gal1-3 at the mRNA level, but not at the protein level due to the lack of specificity of currently available antibodies. Here we report the generation of knock-in mice expressing Gal1 or Gal2 receptor fluorescently tagged at the C-terminus with, respectively, mCherry or hrGFP (humanized Renilla green fluorescent protein). In dorsal root ganglia (DRG) neurons expressing the highest levels of Gal1-mCherry, localization to the somatic cell membrane was detected by live-cell fluorescence and immunohistochemistry, and that fluorescence decreased upon addition of galanin. In spinal cord, abundant Gal1-mCherry immunoreactive processes were detected in the superficial layers of the dorsal horn, and highly expressing intrinsic neurons of the lamina III/IV border showed both somatic cell membrane localization and outward transport of receptor from the cell body, detected as puncta within cell processes. In brain, high levels of Gal1-mCherry immunofluorescence were detected within thalamus, hypothalamus and amygdala, with a high density of nerve endings in the external zone of the median eminence, and regions with lesser immunoreactivity included the dorsal raphe nucleus. Gal2-hrGFP mRNA was detected in DRG, but live-cell fluorescence was at the limits of detection, drawing attention to both the much lower mRNA expression than to Gal1 in mice and the previously unrecognized potential for translational control by upstream open reading frames (uORFs).
MESH Headings
- Animals
- Brain/metabolism
- Cells, Cultured
- Ganglia, Spinal/cytology
- Green Fluorescent Proteins/genetics
- Green Fluorescent Proteins/metabolism
- Luminescent Proteins/genetics
- Luminescent Proteins/metabolism
- Mice
- Mice, Transgenic
- Microscopy, Confocal
- Neurons/physiology
- RNA, Messenger/metabolism
- Receptor, Galanin, Type 1/genetics
- Receptor, Galanin, Type 1/metabolism
- Receptor, Galanin, Type 2/genetics
- Receptor, Galanin, Type 2/metabolism
- Spinal Cord/metabolism
- Red Fluorescent Protein
Affiliation(s)
- Niall Kerr
- Schools of Physiology and Pharmacology and Clinical Sciences, Medical Sciences Building, University Walk, Bristol BS8 1TD, UK
| | - Fiona E Holmes
- Schools of Physiology and Pharmacology and Clinical Sciences, Medical Sciences Building, University Walk, Bristol BS8 1TD, UK
| | - Sally-Ann Hobson
- Schools of Physiology and Pharmacology and Clinical Sciences, Medical Sciences Building, University Walk, Bristol BS8 1TD, UK
| | - Penny Vanderplank
- Schools of Physiology and Pharmacology and Clinical Sciences, Medical Sciences Building, University Walk, Bristol BS8 1TD, UK
| | - Alan Leard
- Wolfson Bioimaging Facility, Medical Sciences Building, University Walk, Bristol BS8 1TD, UK
| | - Nina Balthasar
- Schools of Physiology and Pharmacology and Clinical Sciences, Medical Sciences Building, University Walk, Bristol BS8 1TD, UK
| | - David Wynick
- Schools of Physiology and Pharmacology and Clinical Sciences, Medical Sciences Building, University Walk, Bristol BS8 1TD, UK.
| |
|
23
|
Housman G, Ulitsky I. Methods for distinguishing between protein-coding and long noncoding RNAs and the elusive biological purpose of translation of long noncoding RNAs. Biochimica et Biophysica Acta - Gene Regulatory Mechanisms 2015; 1859:31-40. [PMID: 26265145 DOI: 10.1016/j.bbagrm.2015.07.017] [Citation(s) in RCA: 67] [Impact Index Per Article: 7.4] [Received: 03/31/2015] [Revised: 06/18/2015] [Accepted: 07/19/2015] [Indexed: 12/12/2022]
Abstract
Long noncoding RNAs (lncRNAs) are a diverse class of RNAs with increasingly appreciated functions in vertebrates, yet much of their biology remains poorly understood. In particular, it is unclear to what extent the current catalog of over 10,000 annotated lncRNAs is indeed devoid of genes coding for proteins. Here we review the available computational and experimental schemes for distinguishing between coding and noncoding transcripts and assess the conclusions from their recent genome-wide applications. We conclude that the model most consistent with the available data is that a large number of mammalian lncRNAs undergo translation, but only a very small minority of such translation events results in stable and functional peptides. The outcomes of the majority of the translation events and their potential biological purposes remain an intriguing topic for future investigation. This article is part of a Special Issue entitled: Clues to long noncoding RNA taxonomy, edited by Dr. Tetsuro Hirose and Dr. Shinichi Nakagawa.
Affiliation(s)
- Gali Housman
- Department of Biological Regulation, Weizmann Institute of Science, Rehovot 76100, Israel
| | - Igor Ulitsky
- Department of Biological Regulation, Weizmann Institute of Science, Rehovot 76100, Israel.
| |
|
24
|
Ebina I, Takemoto-Tsutsumi M, Watanabe S, Koyama H, Endo Y, Kimata K, Igarashi T, Murakami K, Kudo R, Ohsumi A, Noh AL, Takahashi H, Naito S, Onouchi H. Identification of novel Arabidopsis thaliana upstream open reading frames that control expression of the main coding sequences in a peptide sequence-dependent manner. Nucleic Acids Res 2015; 43:1562-76. [PMID: 25618853 PMCID: PMC4330380 DOI: 10.1093/nar/gkv018] [Citation(s) in RCA: 51] [Impact Index Per Article: 5.7] [Indexed: 12/22/2022] Open
Abstract
Upstream open reading frames (uORFs) are often found in the 5'-leader regions of eukaryotic mRNAs and can negatively modulate the translational efficiency of the downstream main ORF. Although the effects of most uORFs are thought to be independent of their encoded peptide sequences, certain uORFs control translation of the main ORF in a peptide sequence-dependent manner. For genome-wide identification of such peptide sequence-dependent regulatory uORFs, exhaustive searches for uORFs with conserved amino acid sequences have been conducted using bioinformatic analyses. However, whether the conserved uORFs identified by these bioinformatic approaches encode regulatory peptides has not been experimentally determined. Here we analyzed 16 recently identified Arabidopsis thaliana conserved uORFs for the effects of their amino acid sequences on the expression of the main ORF using a transient expression assay. We identified five novel uORFs that repress main ORF expression in a peptide sequence-dependent manner. Mutational analysis revealed that, in four of them, the C-terminal region of the uORF-encoded peptide is critical for the repression of main ORF expression. Intriguingly, we also identified one exceptional sequence-dependent regulatory uORF, in which the stop codon position is not conserved and the C-terminal region is not important for the repression of main ORF expression.
Affiliation(s)
- Isao Ebina
- Graduate School of Life Science, Hokkaido University, Sapporo 060-0810, Japan
| | | | - Shun Watanabe
- Graduate School of Life Science, Hokkaido University, Sapporo 060-0810, Japan
| | - Hiroaki Koyama
- Graduate School of Agriculture, Hokkaido University, Sapporo 060-8589, Japan
| | - Yayoi Endo
- Faculty of Agriculture, Hokkaido University, Sapporo 060-8589, Japan
| | - Kaori Kimata
- Graduate School of Agriculture, Hokkaido University, Sapporo 060-8589, Japan
| | - Takuya Igarashi
- Graduate School of Agriculture, Hokkaido University, Sapporo 060-8589, Japan
| | - Karin Murakami
- Graduate School of Agriculture, Hokkaido University, Sapporo 060-8589, Japan
| | - Rin Kudo
- Graduate School of Agriculture, Hokkaido University, Sapporo 060-8589, Japan
| | - Arisa Ohsumi
- Graduate School of Agriculture, Hokkaido University, Sapporo 060-8589, Japan
| | - Abdul Latif Noh
- Graduate School of Agriculture, Hokkaido University, Sapporo 060-8589, Japan
| | - Hiro Takahashi
- Graduate School of Horticulture, Chiba University, Matsudo 271-8510, Japan
| | - Satoshi Naito
- Graduate School of Life Science, Hokkaido University, Sapporo 060-0810, Japan; Graduate School of Agriculture, Hokkaido University, Sapporo 060-8589, Japan
| | - Hitoshi Onouchi
- Graduate School of Agriculture, Hokkaido University, Sapporo 060-8589, Japan
| |
|
25
|
Abstract
Over the past decade, high-throughput studies have identified many novel transcripts. While their existence is undisputed, their coding potential and functionality have remained controversial. Recent computational approaches guided by ribosome profiling have indicated that translation is far more pervasive than anticipated and takes place on many transcripts previously assumed to be non-coding. Some of these newly discovered translated transcripts encode short, functional proteins that had been missed in prior screens. Other transcripts are translated, but it might be the process of translation rather than the resulting peptides that serves a function. Here, we review annotation studies in zebrafish to discuss the challenges of placing RNAs onto the continuum that ranges from functional protein-encoding mRNAs to potentially non-functional peptide-producing RNAs to non-coding RNAs. As highlighted by the discovery of the novel signaling peptide Apela/ELABELA/Toddler, accurate annotations can give rise to exciting opportunities to identify the functions of previously uncharacterized transcripts.
Affiliation(s)
- Andrea Pauli
- Department of Molecular and Cellular Biology, Harvard University, MA, USA
| | - Eivind Valen
- Department of Molecular and Cellular Biology, Harvard University, MA, USA
| | - Alexander F. Schier
- Department of Molecular and Cellular Biology, Harvard University, MA, USA
- The Broad Institute of Massachusetts Institute of Technology and Harvard, Cambridge, MA, USA
- FAS Center for Systems Biology, Harvard University, Cambridge, MA, USA
- Center for Brain Science, Harvard University, Cambridge, MA, USA
| |
|
26
|
Gawron D, Gevaert K, Van Damme P. The proteome under translational control. Proteomics 2014; 14:2647-62. [PMID: 25263132 DOI: 10.1002/pmic.201400165] [Citation(s) in RCA: 35] [Impact Index Per Article: 3.5] [Received: 04/25/2014] [Revised: 08/21/2014] [Accepted: 09/23/2014] [Indexed: 02/02/2023]
Abstract
A single eukaryotic gene can give rise to a variety of protein forms (proteoforms) as a result of genetic variation and multilevel regulation of gene expression. In addition to alternative splicing, an increasing line of evidence shows that alternative translation contributes to the overall complexity of proteomes. Identifying the repertoire of proteins and micropeptides expressed by alternative selection of (near-)cognate translation initiation sites and different reading frames, however, remains challenging with contemporary proteomics. MS-enabled identification of proteoforms is expected to benefit from transcriptome and translatome data by the creation of customized and sample-specific protein sequence databases. Here, we focus on contemporary integrative omics approaches that complement proteomics with DNA- and/or RNA-oriented technologies to elucidate the mechanisms of translational control. Together, these technologies make it possible to map the translation (initiation) landscape and more comprehensively define the inventory of proteoforms raised upon alternative translation, thus assisting in the (re-)annotation of genomes.
Affiliation(s)
- Daria Gawron
- Department of Medical Protein Research, VIB, Ghent, Belgium; Department of Biochemistry, Ghent University, Ghent, Belgium
| | | | | |
|
27
|
Díaz A, García K, Navarrete A, Higuera G, Romero J. Virtual screening of gene expression regulatory sites in non-coding regions of the infectious salmon anemia virus. BMC Res Notes 2014; 7:477. [PMID: 25069483 PMCID: PMC4132239 DOI: 10.1186/1756-0500-7-477] [Citation(s) in RCA: 2] [Impact Index Per Article: 0.2] [Received: 03/31/2014] [Accepted: 07/09/2014] [Indexed: 11/10/2022] Open
Abstract
Background: Members of the Orthomyxoviridae family, which contains an important fish pathogen called the infectious salmon anemia virus (ISAV), have a genome consisting of eight segments of single-stranded RNA that encode different viral proteins. Each of these segments is flanked by non-coding regions (NCRs). In other Orthomyxoviruses, sequences have been shown within these NCRs that regulate gene expression and virulence; however, only the sequences of these regions are known in ISAV, and a biological role has not yet been attributed to these regions. This study aims to determine possible functions of the NCRs of ISAV. Results: The results suggested an association between the molecular architecture of NCR regions and their role in the viral life cycle. The available NCR sequences from ISAV isolates were compiled, alignments were performed to obtain a consensus sequence, and conserved regions were identified in this consensus sequence. To determine the molecular structure adopted by these NCRs, various bioinformatics tools, including RNAfold, RNAstructure, Sfold, and Mfold, were used. This hypothetical structure, together with a comparison with influenza, yielded reliable secondary structure models that lead to the identification of conserved nucleotide positions on an intergenus level. These models determined which nucleotide positions are involved in the recognition of the vRNA/cRNA by RNA-dependent RNA polymerase (RdRp) or mRNA by the ribosome. Conclusions: The information obtained in this work allowed the proposal of previously unknown sites that are involved in the regulation of different stages of the viral cycle, leading to the identification of new viral targets that may assist future antiviral strategies.
Affiliation(s)
| | | | | | | | - Jaime Romero
- Instituto de Nutrición y Tecnología de los Alimentos, INTA, Universidad de Chile, Avenida El Líbano #5524, Macul, Santiago, Chile.
|
28
|
Wethmar K. The regulatory potential of upstream open reading frames in eukaryotic gene expression. Wiley Interdiscip Rev RNA 2014; 5:765-78. [DOI: 10.1002/wrna.1245] [Citation(s) in RCA: 127] [Impact Index Per Article: 12.7] [Received: 03/21/2014] [Revised: 05/09/2014] [Accepted: 05/09/2014] [Indexed: 01/04/2023]
Affiliation(s)
- Klaus Wethmar
- Max-Delbrueck-Center for Molecular Medicine, Berlin, Germany
- Helios Klinikum Berlin-Buch, Berlin, Germany
|
29
|
Emerging evidence for functional peptides encoded by short open reading frames. Nat Rev Genet 2014; 15:193-204. [PMID: 24514441 DOI: 10.1038/nrg3520] [Citation(s) in RCA: 381] [Impact Index Per Article: 38.1] [Indexed: 12/19/2022]
Abstract
Short open reading frames (sORFs) are a common feature of all genomes, but their coding potential has mostly been disregarded, partly because of the difficulty in determining whether these sequences are translated. Recent innovations in computing, proteomics and high-throughput analyses of translation start sites have begun to address this challenge and have identified hundreds of putative coding sORFs. The translation of some of these has been confirmed, although the contribution of their peptide products to cellular functions remains largely unknown. This Review examines this hitherto overlooked component of the proteome and considers potential roles for sORF-encoded peptides.
|
30
|
Skarshewski A, Stanton-Cook M, Huber T, Al Mansoori S, Smith R, Beatson SA, Rothnagel JA. uPEPperoni: an online tool for upstream open reading frame location and analysis of transcript conservation. BMC Bioinformatics 2014; 15:36. [PMID: 24484385 PMCID: PMC3914846 DOI: 10.1186/1471-2105-15-36] [Citation(s) in RCA: 28] [Impact Index Per Article: 2.8] [Received: 08/07/2013] [Accepted: 01/11/2014] [Indexed: 01/17/2023]
Abstract
BACKGROUND Several small open reading frames located within the 5' untranslated regions of mRNAs have recently been shown to be translated. In humans, about 50% of mRNAs contain at least one upstream open reading frame, representing a large resource of coding potential. We propose that some upstream open reading frames encode peptides that are functional and contribute to proteome complexity in humans and other organisms. We use the term uPEPs to describe peptides encoded by upstream open reading frames. RESULTS We have developed an online tool, termed uPEPperoni, to facilitate the identification of putative bioactive peptides. uPEPperoni detects conserved upstream open reading frames in eukaryotic transcripts by comparing query nucleotide sequences against mRNA sequences within the NCBI RefSeq database. The algorithm first locates the main coding sequence and then searches for open reading frames 5' to the main start codon, which are subsequently analysed for conservation. uPEPperoni also determines the substitution frequency for both the upstream open reading frames and the main coding sequence. In addition, the uPEPperoni tool produces sequence identity heatmaps which allow rapid visual inspection of conserved regions in paired mRNAs. CONCLUSIONS uPEPperoni features user-nominated settings, including nucleotide match/mismatch, gap penalties, Ka/Ks ratios and output mode. The heatmap output shows levels of identity between any two sequences and provides easy recognition of conserved regions. Furthermore, this web tool allows comparison of evolutionary pressures acting on the upstream open reading frame against other regions of the mRNA. The heatmap web applet can also be used to visualise the degree of conservation in any pair of sequences. uPEPperoni is freely available on an interactive web server at http://upep-scmb.biosci.uq.edu.au.
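The ORF-location step described in this abstract (find the main coding sequence, then look for open reading frames 5' to the main start codon) can be illustrated with a short sketch. This is not uPEPperoni's code: the function name `find_uorfs`, the `UORF` record and the toy sequence are hypothetical, the main start position is assumed to be known already, and the conservation analysis is not attempted.

```python
from typing import List, NamedTuple

STOP_CODONS = {"TAA", "TAG", "TGA"}


class UORF(NamedTuple):
    start: int          # 0-based index of the A in the upstream ATG
    end: int            # 0-based index just past the stop codon
    codons: int         # sense codons, stop codon excluded
    overlaps_cds: bool  # True if the uORF runs past the main start codon


def find_uorfs(mrna: str, cds_start: int) -> List[UORF]:
    """List ORFs whose start codon lies 5' of the main start codon."""
    seq = mrna.upper().replace("U", "T")
    found = []
    for start in range(cds_start):
        if seq[start:start + 3] != "ATG":
            continue
        # Walk codon by codon until the first in-frame stop codon.
        for pos in range(start + 3, len(seq) - 2, 3):
            if seq[pos:pos + 3] in STOP_CODONS:
                end = pos + 3
                in_frame_with_cds = (cds_start - start) % 3 == 0
                # An in-frame ATG with no stop before the main ATG extends
                # the main ORF at its N-terminus; it is not a separate uORF.
                if not (in_frame_with_cds and end > cds_start):
                    found.append(
                        UORF(start, end, (pos - start) // 3, end > cds_start)
                    )
                break
    return found


# Toy transcript (hypothetical): 2-nt leader, a short uORF, 2-nt spacer, main ORF.
toy = "GG" + "ATGAAATAG" + "CC" + "ATGGCCTAA"
print(find_uorfs(toy, cds_start=13))
```

Reporting `overlaps_cds` separately keeps overlapping uORFs, which end inside the main coding sequence, distinguishable from those that terminate in the 5' leader.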
Affiliation(s)
| | | | | | | | | | | | - Joseph A Rothnagel
- School of Chemistry and Molecular Biosciences, University of Queensland, Brisbane, QLD 4072, Australia.
|
31
|
Wethmar K, Barbosa-Silva A, Andrade-Navarro MA, Leutz A. uORFdb--a comprehensive literature database on eukaryotic uORF biology. Nucleic Acids Res 2013; 42:D60-7. [PMID: 24163100 PMCID: PMC3964959 DOI: 10.1093/nar/gkt952] [Citation(s) in RCA: 62] [Impact Index Per Article: 5.6] [Indexed: 11/17/2022]
Abstract
Approximately half of all human transcripts contain at least one upstream translational initiation site that precedes the main coding sequence (CDS) and gives rise to an upstream open reading frame (uORF). We generated uORFdb, publicly available at http://cbdm.mdc-berlin.de/tools/uorfdb, to serve as a comprehensive literature database on eukaryotic uORF biology. Upstream ORFs affect downstream translation by interfering with the unrestrained progression of ribosomes across the transcript leader sequence. Although the first uORF-related translational activity was observed >30 years ago, and an increasing number of studies link defective uORF-mediated translational control to the development of human diseases, the features that determine uORF-mediated regulation of downstream translation are not well understood. The uORFdb was manually curated from all uORF-related literature listed at the PubMed database. It categorizes individual publications by a variety of denominators including taxon, gene and type of study. Furthermore, the database can be filtered for multiple structural and functional uORF-related properties to allow convenient and targeted access to the complex field of eukaryotic uORF biology.
Affiliation(s)
- Klaus Wethmar
- Max Delbrück Center for Molecular Medicine (MDC), Cell Differentiation and Tumorigenesis, Robert-Rössle-Strasse 10, D-13092 Berlin, Germany, Hematology, Oncology and Tumor Immunology, Helios Klinikum Berlin-Buch, Schwanebecker Chaussee 50, D-13125 Berlin, Germany, Max Delbrück Center for Molecular Medicine (MDC), Computational Biology and Data Mining, Robert-Rössle-Strasse 10, D-13092 Berlin, Germany and Humboldt-University, Department of Biology, Invalidenstrasse 43, D-10115 Berlin, Germany
|
32
|
Abstract
Long intervening noncoding RNAs (lincRNAs) are transcribed from thousands of loci in mammalian genomes and might play widespread roles in gene regulation and other cellular processes. This Review outlines the emerging understanding of lincRNAs in vertebrate animals, with emphases on how they are being identified and current conclusions and questions regarding their genomics, evolution and mechanisms of action.
Affiliation(s)
- Igor Ulitsky
- Whitehead Institute for Biomedical Research, Cambridge, MA 02142, USA
|
33
|
Dave RK, Dinger ME, Andrew M, Askarian-Amiri M, Hume DA, Kellie S. Regulated expression of PTPRJ/CD148 and an antisense long noncoding RNA in macrophages by proinflammatory stimuli. PLoS One 2013; 8:e68306. [PMID: 23840844 PMCID: PMC3695918 DOI: 10.1371/journal.pone.0068306] [Citation(s) in RCA: 33] [Impact Index Per Article: 3.0] [Received: 01/25/2013] [Accepted: 05/28/2013] [Indexed: 12/28/2022]
Abstract
PTPRJ/CD148 is a tyrosine phosphatase that has tumour suppressor-like activity. Quantitative PCR of various cells and tissues revealed that it is preferentially expressed in macrophage-enriched tissues. Within lymphoid tissues, immunohistochemistry revealed that PTPRJ/CD148 co-localised with F4/80, indicating that macrophages most strongly express the protein. Macrophages express the highest basal level of Ptprj, and this is elevated further by treatment with LPS and other Toll-like receptor ligands. In contrast, CSF-1 treatment reduced basal and stimulated Ptprj expression in human and mouse cells, and interferon also repressed Ptprj expression. We identified a 1006-nucleotide long noncoding RNA species, Ptprj-as1, that is transcribed antisense to Ptprj. Ptprj-as1 was highly expressed in macrophage-enriched tissue and was transiently induced by Toll-like receptor ligands with a similar time course to Ptprj. Finally, putative transcription factor binding sites in the promoter region of Ptprj were identified.
Affiliation(s)
- Richa K. Dave
- The University of Queensland, Institute for Molecular Bioscience, Brisbane, Australia
- The University of Queensland, Cooperative Research Centre for Chronic Inflammatory Diseases (CRC-CID), Brisbane, Australia
- The University of Queensland, Australian Infectious Diseases Research Centre, School of Chemistry and Molecular Biosciences, Australia
| | - Marcel E. Dinger
- The University of Queensland Diamantina Institute, Brisbane, Australia
- Garvan Institute of Medical Research, Darlinghurst, Australia
| | - Megan Andrew
- The University of Queensland, Australian Infectious Diseases Research Centre, School of Chemistry and Molecular Biosciences, Australia
| | - Marjan Askarian-Amiri
- The University of Queensland, Institute for Molecular Bioscience, Brisbane, Australia
| | - David A. Hume
- The University of Queensland, Institute for Molecular Bioscience, Brisbane, Australia
- The University of Queensland, Cooperative Research Centre for Chronic Inflammatory Diseases (CRC-CID), Brisbane, Australia
- The Roslin Institute, University of Edinburgh, Roslin, Scotland, United Kingdom
| | - Stuart Kellie
- The University of Queensland, Institute for Molecular Bioscience, Brisbane, Australia
- The University of Queensland, Cooperative Research Centre for Chronic Inflammatory Diseases (CRC-CID), Brisbane, Australia
- The University of Queensland, Australian Infectious Diseases Research Centre, School of Chemistry and Molecular Biosciences, Australia
|
34
|
Nguyen HL, Yang X, Omiecinski CJ. Expression of a novel mRNA transcript for human microsomal epoxide hydrolase (EPHX1) is regulated by short open reading frames within its 5'-untranslated region. RNA 2013; 19:752-66. [PMID: 23564882 PMCID: PMC3683910 DOI: 10.1261/rna.037036.112] [Citation(s) in RCA: 10] [Impact Index Per Article: 0.9] [Indexed: 05/03/2023]
Abstract
Microsomal epoxide hydrolase (mEH, EPHX1) is a critical xenobiotic-metabolizing enzyme, catalyzing both detoxification and bioactivation reactions that direct the disposition of chemical epoxides, including the carcinogenic metabolites of several polycyclic aromatic hydrocarbons. Recently, we discovered that a previously unrecognized and primate-specific EPHX1 transcript, termed E1-b, was actually the predominant driver of EPHX1 expression in all human tissues. In this study, we identify another human EPHX1 transcript, designated as E1-b'. Unusually, both the E1-b and E1-b' mRNA transcripts are generated from the use of a far upstream gene promoter, localized ∼18.5 kb 5'-upstream of the EPHX1 protein-coding region. Although expressed at comparatively lower levels than E1-b, the novel E1-b' transcript is readily detected in all tissues examined, with highest levels maintained in human ovary. The E1-b' mRNA possesses unusual functional features in its 5'-untranslated region, including a GC-rich leader sequence and two upstream AUGs that encode for short peptides of 26 and 17 amino acids in length, respectively. Results from in vitro transcription/translation assays and direct transfection in mammalian cells of either the E1-b' transcript or the encoded peptides demonstrated that the E1-b' upstream open reading frames (uORFs) are functional, with their presence markedly inhibiting the translation of EPHX1 protein, both in cis and in trans configurations. These unique uORF peptides exhibit no homology to any other known uORF sequences but likely function to mediate post-transcription regulation of EPHX1 and perhaps more broadly as translational regulators in human cells.
|
35
|
Occhi G, Regazzo D, Trivellin G, Boaretto F, Ciato D, Bobisse S, Ferasin S, Cetani F, Pardi E, Korbonits M, Pellegata NS, Sidarovich V, Quattrone A, Opocher G, Mantero F, Scaroni C. A novel mutation in the upstream open reading frame of the CDKN1B gene causes a MEN4 phenotype. PLoS Genet 2013; 9:e1003350. [PMID: 23555276 PMCID: PMC3605397 DOI: 10.1371/journal.pgen.1003350] [Citation(s) in RCA: 108] [Impact Index Per Article: 9.8] [Received: 07/19/2012] [Accepted: 01/16/2013] [Indexed: 11/19/2022]
Abstract
The CDKN1B gene encodes the cyclin-dependent kinase inhibitor p27KIP1, an atypical tumor suppressor playing a key role in cell cycle regulation, cell proliferation, and differentiation. Impaired p27KIP1 expression and/or localization are often observed in tumor cells, further confirming its central role in regulating the cell cycle. Recently, germline mutations in CDKN1B have been associated with the inherited multiple endocrine neoplasia syndrome type 4, an autosomal dominant syndrome characterized by varying combinations of tumors affecting at least two endocrine organs. In this study we identified a 4-bp deletion in a highly conserved regulatory upstream ORF (uORF) in the 5′UTR of the CDKN1B gene in a patient with a pituitary adenoma and a well-differentiated pancreatic neoplasm. This deletion causes the shift of the uORF termination codon with the consequent lengthening of the uORF–encoded peptide and the drastic shortening of the intercistronic space. Our data on the immunohistochemical analysis of the patient's pancreatic lesion, functional studies based on dual-luciferase assays, site-directed mutagenesis, and on polysome profiling show a negative influence of this deletion on the translation reinitiation at the CDKN1B starting site, with a consequent reduction in p27KIP1 expression. Our findings demonstrate that, in addition to the previously described mechanisms leading to reduced p27KIP1 activity, such as degradation via the ubiquitin/proteasome pathway or non-covalent sequestration, p27KIP1 activity can also be modulated by an uORF and mutations affecting uORF could change p27KIP1 expression. This study adds the CDKN1B gene to the short list of genes for which mutations that either create, delete, or severely modify their regulatory uORFs have been associated with human diseases.
Gene expression can be modulated at different steps on the way from DNA to protein including control of transcription, translation, and post-translational modifications. An abnormality in the regulation of mRNA and protein expression is a hallmark of many human diseases, including cancer. In some eukaryotic genes translation can be influenced by small DNA sequences termed upstream open reading frames (uORFs). These elements located upstream to the gene start codon may either negatively influence the ability of the translational machinery to reinitiate translation of the main protein or, much less frequently, stimulate protein translation by enabling the ribosomes to bypass cis-acting inhibitory elements. CDKN1B, which encodes the cell cycle inhibitor p27KIP1, includes an uORF in its 5′UTR sequence. p27KIP1 expression is often reduced in cancer, and germline mutations have been identified in CDKN1B in patients affected with a syndrome (MEN4) characterized by varying combinations of tumors in endocrine glands. Here we show that a small deletion in the uORF upstream to CDKN1B reduces translation reinitiation efficiency, leading to underexpression of p27KIP1 and coinciding with tumorigenesis. This study describes a novel mechanism by which p27KIP1 could be underexpressed in human tumors. In addition, our data provide a new insight to the unique pathogenic potential of uORFs in human diseases.
Affiliation(s)
- Gianluca Occhi
- Department of Medicine, Endocrinology Unit, University of Padova, Padova, Italy.
|
36
|
Takahashi H, Takahashi A, Naito S, Onouchi H. BAIUCAS: a novel BLAST-based algorithm for the identification of upstream open reading frames with conserved amino acid sequences and its application to the Arabidopsis thaliana genome. Bioinformatics 2012; 28:2231-41. [PMID: 22618534 DOI: 10.1093/bioinformatics/bts303] [Citation(s) in RCA: 41] [Impact Index Per Article: 3.4] [Indexed: 11/14/2022]
Abstract
MOTIVATION Upstream open reading frames (uORFs) are often found in the 5'-untranslated regions of eukaryotic messenger RNAs. Some uORFs have been shown to encode functional peptides involved in the translational regulation of the downstream main ORFs. Comparative genomic approaches have been used in genome-wide searches for uORFs encoding bioactive peptides, and by comparing uORF sequences between a few selected species or among a small group of species, uORFs with conserved amino acid sequences (UCASs) have been identified in plants, mammals and insects. Regulatory regions within uORF-encoded peptides that are involved in translational control are typically 10-20 amino acids long. Detection of homology between such short regions largely depends on the selection of species for comparison. To maximize the chances of identifying UCASs with short conserved regions, we devised a novel algorithm for homology search among a large number of species and the automatic selection of uORFs conserved in a wide range of species. RESULTS In this study, we developed the BAIUCAS (BLAST-based algorithm for identification of UCASs) method and identified 18 novel Arabidopsis uORFs whose amino acid sequences are conserved across diverse eudicot species, which include uORFs not found in previous comparative genomic studies due to low sequence conservation among species. Therefore, BAIUCAS is a powerful method for the identification of UCASs, and it is particularly useful for the detection of uORFs with a small number of conserved amino acid residues.
Affiliation(s)
- Hiro Takahashi
- Plant Biology Research Center, Chubu University, Kasugai, Aichi, Japan
|
37
|
Harte RA, Farrell CM, Loveland JE, Suner MM, Wilming L, Aken B, Barrell D, Frankish A, Wallin C, Searle S, Diekhans M, Harrow J, Pruitt KD. Tracking and coordinating an international curation effort for the CCDS Project. Database (Oxford) 2012; 2012:bas008. [PMID: 22434842 PMCID: PMC3308164 DOI: 10.1093/database/bas008] [Citation(s) in RCA: 45] [Impact Index Per Article: 3.8] [Indexed: 12/27/2022]
Abstract
The Consensus Coding Sequence (CCDS) collaboration involves curators at multiple centers with a goal of producing a conservative set of high quality, protein-coding region annotations for the human and mouse reference genome assemblies. The CCDS data set reflects a ‘gold standard’ definition of best supported protein annotations, and corresponding genes, which pass a standard series of quality assurance checks and are supported by manual curation. This data set supports use of genome annotation information by human and mouse researchers for effective experimental design, analysis and interpretation. The CCDS project consists of analysis of automated whole-genome annotation builds to identify identical CDS annotations, quality assurance testing and manual curation support. Identical CDS annotations are tracked with a CCDS identifier (ID) and any future change to the annotated CDS structure must be agreed upon by the collaborating members. CCDS curation guidelines were developed to address some aspects of curation in order to improve initial annotation consistency and to reduce time spent in discussing proposed annotation updates. Here, we present the current status of the CCDS database and details on our procedures to track and coordinate our efforts. We also present the relevant background and reasoning behind the curation standards that we have developed for CCDS database treatment of transcripts that are nonsense-mediated decay (NMD) candidates, for transcripts containing upstream open reading frames, for identifying the most likely translation start codons and for the annotation of readthrough transcripts. Examples are provided to illustrate the application of these guidelines. Database URL: http://www.ncbi.nlm.nih.gov/CCDS/CcdsBrowse.cgi
Affiliation(s)
- Rachel A Harte
- Center for Biomolecular Science and Engineering, University of California Santa Cruz (UCSC), Santa Cruz, CA 95064, USA
|
38
|
Huang L, Li W, Tang W, Zhu X, Ou-yang P, Lu G. A Chinese family with Oguchi's disease due to compound heterozygosity including a novel deletion in the arrestin gene. Mol Vis 2012; 18:528-36. [PMID: 22419846 PMCID: PMC3298420] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 11/23/2011] [Accepted: 02/27/2012] [Indexed: 11/06/2022]
Abstract
PURPOSE Oguchi's disease is a rare autosomal recessive disease known to be caused by mutations in the rhodopsin kinase (GRK1) gene or the arrestin (SAG) gene. SAG contains 16 exons and encodes a protein of 405 amino acids. This study aimed to identify the underlying genetic defects in a non-consanguineous Chinese family with Oguchi's disease. METHODS Ophthalmologic examinations including fundus photography and electroretinography (ERG) were performed on all family members. All exons of the GRK1 gene and the SAG gene were amplified with PCR and directly sequenced. Quantitative real-time PCR (qPCR) was performed to screen for heterozygous deletions/duplications in the SAG gene. Long-range PCR and direct sequencing were further performed to define the breakpoints. RESULTS The patient had characteristic clinical features of Oguchi's disease, including night blindness, normal visual fields, typical fundus appearance with the Mizuo-Nakamura phenomenon, nearly undetectable rod b waves in the scotopic 0.01 ERGs, and nearly "negative" scotopic 3.0 ERGs. No mutations were found in the GRK1 gene. A heterozygous nonsense Arg193stop (R193X) mutation was found in the SAG gene in the patient and the unaffected mother. No pathogenic SAG mutations were found in the unaffected father. qPCR showed a heterozygous deletion encompassing exon 2 of the SAG gene in the patient and the unaffected father. Long-range PCR and direct sequencing verified the deletion and revealed its breakpoints, which span a 3,224-bp fragment of the SAG gene. The deletion was not detected in 96 unrelated healthy controls. This deletion was predicted to eliminate exon 2 and the AUG initiation codon in the mature SAG mRNA, resulting in either no production of the SAG protein or low-level production of a non-functional truncated protein lacking 134 amino acids at the NH2 terminus.
CONCLUSIONS Compound heterozygosity of a nonsense R193X mutation and a heterozygous deletion of 3,224 bp encompassing exon 2 in the SAG gene is the cause of Oguchi's disease in this Chinese family. qPCR analysis should be performed if there is a negative result of the mutation screening of the SAG gene in patients with Oguchi's disease.
Affiliation(s)
- Lingli Huang
- Institute of Reproductive and Stem Cell Engineering, Central South University, Changsha, P.R. China
| | - Wen Li
- Institute of Reproductive and Stem Cell Engineering, Central South University, Changsha, P.R. China; Reproductive and Genetic Hospital of Citic-Xiangya, Changsha, P.R. China
| | - Weilin Tang
- Reproductive and Genetic Hospital of Citic-Xiangya, Changsha, P.R. China
| | - Xiaohua Zhu
- Department of Ophthalmology, the Second Affiliated Xiangya Hospital, Central South University, Changsha, P.R. China
| | - Pingbo Ou-yang
- Department of Ophthalmology, the Second Affiliated Xiangya Hospital, Central South University, Changsha, P.R. China
| | - Guangxiu Lu
- Institute of Reproductive and Stem Cell Engineering, Central South University, Changsha, P.R. China; Reproductive and Genetic Hospital of Citic-Xiangya, Changsha, P.R. China
|
39
|
Jorgensen RA, Dorantes-Acosta AE. Conserved Peptide Upstream Open Reading Frames are Associated with Regulatory Genes in Angiosperms. Front Plant Sci 2012; 3:191. [PMID: 22936940 PMCID: PMC3426882 DOI: 10.3389/fpls.2012.00191] [Citation(s) in RCA: 50] [Impact Index Per Article: 4.2] [Received: 06/09/2012] [Accepted: 08/04/2012] [Indexed: 05/20/2023]
Abstract
Upstream open reading frames (uORFs) are common in eukaryotic transcripts, but those that encode conserved peptides occur in less than 1% of transcripts. The peptides encoded by three plant conserved peptide uORF (CPuORF) families are known to control translation of the downstream ORF in response to a small signal molecule (sucrose, polyamines, and phosphocholine). In flowering plants, transcription factors are statistically over-represented among genes that possess CPuORFs, and in general it appeared that many CPuORF genes also had other regulatory functions, though the significance of this suggestion was uncertain (Hayden and Jorgensen, 2007). Five years later the literature provides much more information on the functions of many CPuORF genes. Here we reassess the functions of 27 known CPuORF gene families and find that 22 of these families play a variety of different regulatory roles, from transcriptional control to protein turnover, and from small signal molecules to signal transduction kinases. Clearly then, there is indeed a strong association of CPuORFs with regulatory genes. In addition, 16 of these families play key roles in a variety of different biological processes. Most strikingly, the core sucrose response network includes three different CPuORFs, creating the potential for sophisticated balancing of the network in response to three different molecular inputs. We propose that the function of most CPuORFs is to modulate translation of a downstream major ORF (mORF) in response to a signal molecule recognized by the conserved peptide and that because the mORFs of CPuORF genes generally encode regulatory proteins, many of them centrally important in the biology of plants, CPuORFs play key roles in balancing such regulatory networks.
Affiliation(s)
- Richard A. Jorgensen
- Laboratorio Nacional de Genómica para la Biodiversidad, Centro de Investigación y Estudios Avanzados del Instituto Politécnico Nacional, Irapuato, Guanajuato, México
- *Correspondence: Richard A. Jorgensen, Laboratorio Nacional de Genómica para la Biodiversidad, Centro de Investigación y Estudios Avanzados del Instituto Politécnico Nacional, Km 9.6 Libramiento Norte Carretera León, 36821 Irapuato, Guanajuato, México. e-mail:
| | - Ana E. Dorantes-Acosta
- Instituto de Biotecnología y Ecología Aplicada, Universidad Veracruzana, Xalapa, Veracruz, México
|
40
|
Kimbi GC, Kew MC, Kramvis A. The effect of the G1888A mutation of subgenotype A1 of hepatitis B virus on the translation of the core protein. Virus Res 2011; 163:334-40. [PMID: 22100339 DOI: 10.1016/j.virusres.2011.10.024] [Citation(s) in RCA: 9] [Impact Index Per Article: 0.7] [Received: 09/05/2011] [Revised: 10/25/2011] [Accepted: 10/25/2011] [Indexed: 10/15/2022]
Abstract
A distinctive characteristic of subgenotype A1 of hepatitis B virus is G1888A in the precore region. This transition introduces an out-of-frame AUG, creating an overlapping upstream open reading frame (uORF) that terminates five nucleotides downstream from the core AUG. This uORF can potentially be translated into a seven amino acid peptide. In addition to stabilizing the encapsidation signal by forming a base pair with T1871, this mutation may affect translation of the core protein. The aim of this study was to use reporter constructs to determine whether G1888A had any modulating effect on core protein translation. The complete core gene with part of the precore of subgenotype A1 was cloned into the amino terminus of a green fluorescent protein (GFP) plasmid. Core/GFP fusion protein expression was measured using flow cytometry following transfection of Huh 7 cells. The introduction of the uORF resulted in an 18.75% reduction of core gene expression. When the suboptimal Kozak sequence of the 1888 AUG was replaced with an optimal one, the reduction increased to 64.84%. Increasing the distance between the stop of the overlapping uORF and the core AUG by a minimum of 15 nucleotides almost doubled core/GFP expression, indicating that stalling of ribosomes at the stop of the uORF may be interfering with initiation at the core AUG through steric hindrance. Our findings indicate that the G1888A mutation may interfere with initiation at the downstream 1901 core AUG, decreasing core protein translation. This decrease may account for the relatively low viral loads seen in individuals infected with subgenotype A1.
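The positions quoted in this abstract can be checked against each other with a little arithmetic. The sketch below assumes (this is not stated explicitly in the abstract) that 1888 and 1901 are the coordinates of the A in each AUG and that the seven-residue peptide count excludes the stop codon; the function name `uorf_layout` is hypothetical.

```python
UORF_START = 1888    # assumed: position of the A in the uORF AUG
CORE_START = 1901    # assumed: position of the A in the core AUG
PEPTIDE_LENGTH = 7   # amino acids encoded by the uORF (from the abstract)


def uorf_layout(uorf_start: int, main_start: int, peptide_len: int) -> dict:
    """Derive the geometry of an upstream ORF relative to a main start codon."""
    stop_start = uorf_start + 3 * peptide_len  # first base of the stop codon
    stop_end = stop_start + 2                  # last base of the stop codon
    main_aug_end = main_start + 2              # last base of the main AUG
    return {
        # Non-zero offset means the two start codons are in different frames.
        "frame_offset": (main_start - uorf_start) % 3,
        "stop_codon": (stop_start, stop_end),
        # Bases between the end of the main AUG and the uORF stop codon.
        "gap_after_main_aug": stop_start - main_aug_end - 1,
        "overlaps_main_start": stop_end > main_start,
    }


layout = uorf_layout(UORF_START, CORE_START, PEPTIDE_LENGTH)
print(layout)
```

Under these assumptions the stop codon falls at 1909-1911, five nucleotides after the core AUG ends at 1903, and the frame offset is 1, which matches the "out-of-frame", "overlapping" and "five nucleotides downstream" statements in the abstract.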
Affiliation(s)
- Gerald C Kimbi
- Hepatitis Virus Diversity Research Programme (formerly MRC/CANSA/University Molecular Hepatology Research Unit), Department of Internal Medicine, Faculty of Health Sciences, University of the Witwatersrand, 7 York Road, Parktown, Johannesburg 2193, South Africa.
|
41
|
Just W, Zeller J, Riegert C, Speit G. Genetic polymorphisms in the formaldehyde dehydrogenase gene and their biological significance. Toxicol Lett 2011; 207:121-7. [PMID: 21920416 DOI: 10.1016/j.toxlet.2011.08.025] [Citation(s) in RCA: 11] [Impact Index Per Article: 0.8] [Received: 03/14/2011] [Revised: 07/25/2011] [Accepted: 08/30/2011] [Indexed: 11/29/2022]
Abstract
The GSH-dependent formaldehyde dehydrogenase (FDH) is the most important enzyme for the metabolic inactivation of formaldehyde (FA). We studied three polymorphisms of this gene with the aim of elucidating their relevance for inter-individual differences in the protection against the (geno-)toxicity of FA. The first polymorphism (rs11568816) was investigated using real-time PCR and restriction fragment analysis in 150 subjects. However, we did not find the polymorphic sequence in any of the subjects. We studied a second polymorphism (rs17028487), representing a base exchange (c.*114A>G) in exon 9 of the FDH gene. We analyzed 70 subjects with the SNaPshot Primer Extension method and subsequent analysis on an ABI PRISM 3100, but no variant allele was identified. A third polymorphism, rs13832 in exon 9 (c.*493G>T), was studied in a group of 105 subjects by the SNaPshot Primer Extension method. Of these subjects, 43 were heterozygous for the polymorphism (G/T), 46 were homozygous for the T allele, and 16 were homozygous for the G allele. Real-time RT-PCR measurements of FDH mRNA did not indicate a significant difference in transcript levels between the heterozygous and the homozygous groups. The in vitro comet assay after FA exposure of blood samples obtained from 5 homozygous GG and 3 homozygous TT subjects did not lead to a significant difference between these two groups. Altogether, our study did not identify biologically relevant polymorphisms in transcribed regions of the FDH gene, which may lead to inter-individual differences in the metabolic inactivation of FA.
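The genotype counts reported for rs13832 (16 G/G, 43 G/T, 46 T/T among 105 subjects) imply allele frequencies that the abstract does not state. A minimal worked calculation, with the hypothetical helper `allele_frequencies`, might look like this:

```python
from fractions import Fraction


def allele_frequencies(hom_ref: int, het: int, hom_alt: int):
    """Return (reference, alternate) allele frequencies from genotype counts."""
    total_alleles = 2 * (hom_ref + het + hom_alt)  # two alleles per subject
    ref = Fraction(2 * hom_ref + het, total_alleles)
    alt = Fraction(2 * hom_alt + het, total_alleles)
    return ref, alt


# Counts taken from the abstract; treating G as the reference allele follows
# the c.*493G>T notation.
g_freq, t_freq = allele_frequencies(hom_ref=16, het=43, hom_alt=46)
print(f"G: {float(g_freq):.3f}  T: {float(t_freq):.3f}")
```

With these counts the T allele is the more common one in the sample (9/14 versus 5/14 for G).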
Affiliation(s)
- Walter Just
- Universität Ulm, Institut für Humangenetik, Ulm, Germany
|
42
|
Bazykin GA, Kochetov AV. Alternative translation start sites are conserved in eukaryotic genomes. Nucleic Acids Res 2010; 39:567-77. [PMID: 20864444 PMCID: PMC3025576 DOI: 10.1093/nar/gkq806] [Citation(s) in RCA: 81] [Impact Index Per Article: 5.8] [Indexed: 11/13/2022] Open
Abstract
Alternative start AUG codons within a single transcript can contribute to diversity of the proteome; however, their functional significance remains controversial. Here, we provide comparative genomics evidence that alternative start codons are under negative selection in vertebrates, insects and yeast. In genes where the annotated start codon (sAUG) resides within a suboptimal nucleotide context, the downstream in-frame AUG codons (dAUG) among the first ∼30 codon sites are significantly more conserved between species than in genes where the sAUG resides within an optimal context. Proteomics data show that this difference is not an annotation artifact and that dAUGs are in fact under selection as alternative start sites. The key optimal, and sometimes suboptimal, context-determining nucleotides of both the sAUG and dAUGs are conserved. Selection for secondary start sites is stronger in genes with a weak primary start site. Genes with multiple conserved start sites are enriched for transcription factors, and tend to have longer 5'UTRs and a higher degree of alternative splicing. Together, these results imply that the use of alternative start sites by means of leaky mRNA scanning is a functional mechanism under selection for increased efficiency of translation and/or for translation of different N-terminal protein variants.
Affiliation(s)
- Georgii A Bazykin
- Institute for Information Transmission Problems of the Russian Academy of Sciences (Kharkevich Institute), Moscow, Russia.
|
43
|
Selpi, Bryant CH, Kemp GJL, Sarv J, Kristiansson E, Sunnerhagen P. Predicting functional upstream open reading frames in Saccharomyces cerevisiae. BMC Bioinformatics 2009; 10:451. [PMID: 20042076 PMCID: PMC2813248 DOI: 10.1186/1471-2105-10-451] [Citation(s) in RCA: 10] [Impact Index Per Article: 0.7] [Received: 11/04/2008] [Accepted: 12/30/2009] [Indexed: 11/10/2022] Open
Abstract
BACKGROUND: Some upstream open reading frames (uORFs) regulate gene expression (i.e., they are functional) and can play key roles in keeping organisms healthy. However, how uORFs are involved in gene regulation is not yet fully understood. In order to get a complete view of how uORFs are involved in gene regulation, it is expected that a large number of experimentally verified functional uORFs are needed. Unfortunately, wet-lab experiments to verify that uORFs are functional are expensive. RESULTS: In this paper, a new computational approach to predicting functional uORFs in the yeast Saccharomyces cerevisiae is presented. Our approach is based on inductive logic programming and makes use of a novel combination of knowledge about biological conservation, Gene Ontology annotations and genes' responses to different conditions. Our method results in a set of simple and informative hypotheses with an estimated sensitivity of 76%. The hypotheses predict 301 further genes to have 398 novel functional uORFs. Three (RPC11, TPK1, and FOL1) of these 301 genes have been hypothesised, following wet-lab experiments, by a related study to have functional uORFs. A comparison with another related study suggests that eleven of the predicted functional uORFs from genes LDB17, HEM3, CIN8, BCK2, PMC1, FAS1, APP1, ACC1, CKA2, SUR1, and ATH1 are strong candidates for wet-lab experimental studies. CONCLUSIONS: Learning-based prediction of functional uORFs can be done with a high sensitivity. The predictions made in this study can serve as a list of candidates for subsequent wet-lab verification and might help to elucidate the regulatory roles of uORFs.
Affiliation(s)
- Selpi
- Department of Applied Mechanics, Chalmers University of Technology, Göteborg, Sweden.
|
44
|
Song KY, Choi HS, Hwang CK, Kim CS, Law PY, Wei LN, Loh HH. Differential use of an in-frame translation initiation codon regulates human mu opioid receptor (OPRM1). Cell Mol Life Sci 2009; 66:2933-42. [PMID: 19609488 PMCID: PMC11115551 DOI: 10.1007/s00018-009-0082-7] [Citation(s) in RCA: 17] [Impact Index Per Article: 1.1] [Received: 04/28/2009] [Revised: 06/05/2009] [Accepted: 06/19/2009] [Indexed: 11/26/2022]
Abstract
The pharmacological effects of morphine and morphine-like drugs are mediated primarily through the mu opioid receptor (MOP). Here we show that differential use of an in-frame translational start codon in the 5'-untranslated region of the OPRM1 gene generates different translational products in vivo and in vitro. The 5'-end of the OPRM1 gene is necessary for initiating the alternate form and for subsequent degradation of the protein. Initiation of OPRM1 at the upstream site decreases the initiation at the main AUG site. However, alternative initiation of the long form of OPRM1 produces a protein with a short half-life, resulting from degradation mediated by the ubiquitin-proteasome pathway. Reporter and degradation assays showed that mutations of this long form at the second and third lysines reduce ubiquitin-dependent proteasome degradation, stabilizing the protein. The data suggest that MOP expression is controlled in part by initiation of the long form of MOP at the alternate site.
Affiliation(s)
- Kyu Young Song
- Department of Pharmacology, University of Minnesota Medical School, Minneapolis, MN 55455, USA.
|
45
|
Dang Do AN, Kimball SR, Cavener DR, Jefferson LS. eIF2alpha kinases GCN2 and PERK modulate transcription and translation of distinct sets of mRNAs in mouse liver. Physiol Genomics 2009; 38:328-41. [PMID: 19509078 DOI: 10.1152/physiolgenomics.90396.2008] [Citation(s) in RCA: 59] [Impact Index Per Article: 3.9] [Indexed: 02/02/2023] Open
Abstract
In eukaryotes, selective derepression of mRNA translation through altered utilization of upstream open reading frames (uORF) or internal ribosomal entry sites (IRES) regulatory motifs following exposure to stress is regulated at the initiation stage through the increased phosphorylation of eukaryotic initiation factor 2 on its alpha-subunit (eIF2alpha). While there is only one known eIF2alpha kinase in yeast, general control nonderepressible 2 (GCN2), mammals have evolved to express at least four: GCN2, heme-regulated inhibitor kinase (HRI), double-stranded RNA-activated protein kinase (PKR), and PKR-like endoplasmic reticulum-resident kinase (PERK). So far, the main known distinction among these four kinases is their activation in response to different acute stressors. In the present study, we used the in situ perfused mouse liver model and hybridization array analyses to assess the general translational response to stress regulated by two of these kinases, GCN2 and PERK, and to differentiate between the downstream effects of activating GCN2 versus PERK. The resulting data showed that at least 2.5% of mouse liver mRNAs are subject to derepressed translation following stress. In addition, the data demonstrated that eIF2alpha kinases GCN2 and PERK differentially regulate mRNA transcription and translation, which in the latter case suggests that increased eIF2alpha phosphorylation is not sufficient for derepression of translation. These findings open an avenue for more focused future research toward groups of mRNAs that code for the early cellular stress response proteins.
Affiliation(s)
- An N Dang Do
- Department of Cellular and Molecular Physiology, Pennsylvania State University College of Medicine, Hershey, USA
|
46
|
Abstract
The systems for mRNA surveillance, capping, and cleavage/polyadenylation are proposed to play pivotal roles in the physical establishment and distribution of spliceosomal introns along a transcript.
|
47
|
Lawless C, Pearson RD, Selley JN, Smirnova JB, Grant CM, Ashe MP, Pavitt GD, Hubbard SJ. Upstream sequence elements direct post-transcriptional regulation of gene expression under stress conditions in yeast. BMC Genomics 2009; 10:7. [PMID: 19128476 PMCID: PMC2649001 DOI: 10.1186/1471-2164-10-7] [Citation(s) in RCA: 75] [Impact Index Per Article: 5.0] [Received: 07/18/2008] [Accepted: 01/07/2009] [Indexed: 01/01/2023] Open
Abstract
Background: The control of gene expression in eukaryotic cells occurs both transcriptionally and post-transcriptionally. Although many genes are now known to be regulated at the translational level, in general, the mechanisms are poorly understood. We have previously presented polysomal gradient and array-based evidence that translational control is widespread in a significant number of genes when yeast cells are exposed to a range of stresses. Here we have re-examined these gene sets, considering the role of UTR sequences in the translational responses of these genes using recent large-scale datasets which define 5' and 3' transcriptional ends for many yeast genes. In particular, we highlight the potential role of 5' UTRs and upstream open reading frames (uORFs). Results: We show a highly significant enrichment in specific GO functional classes for genes that are translationally up- and down-regulated under given stresses (e.g. carbohydrate metabolism is up-regulated under amino acid starvation). Cross-referencing these data with the stress response data we show that translationally upregulated genes have longer 5' UTRs, consistent with their role in translational regulation. In the first genome-wide study of uORFs in a set of mapped 5' UTRs, we show that uORFs are rare, being statistically under-represented in UTR sequences. However, they have distinct compositional biases consistent with their putative role in translational control and are more common in genes which are apparently translationally up-regulated. Conclusion: These results demonstrate a central regulatory role for UTR sequences, and 5' UTRs in particular, highlighting the significant role of uORFs in post-transcriptional control in yeast. Yeast uORFs are more highly conserved than has been suggested, lending further weight to their significance as functional elements involved in gene regulation. It also suggests a more complex and novel mechanism of control, whereby uORFs permit genes to escape from a more general attenuation of translation under conditions of stress. However, since uORFs are relatively rare (only ~13% of yeast genes have them) there remain many unanswered questions as to how UTR elements can direct translational control of many hundreds of genes under stress.
Affiliation(s)
- Craig Lawless
- Michael Smith Building, Faculty of Life Sciences, University of Manchester, Manchester, UK.
|
48
|
An upstream open reading frame controls translation of var2csa, a gene implicated in placental malaria. PLoS Pathog 2009; 5:e1000256. [PMID: 19119419 PMCID: PMC2603286 DOI: 10.1371/journal.ppat.1000256] [Citation(s) in RCA: 67] [Impact Index Per Article: 4.5] [Received: 07/01/2008] [Accepted: 12/05/2008] [Indexed: 01/06/2023] Open
Abstract
Malaria, caused by the parasite Plasmodium falciparum, is responsible for substantial morbidity, mortality and economic losses in tropical regions of the world. Pregnant women are exceptionally vulnerable to severe consequences of the infection, due to the specific adhesion of parasite-infected erythrocytes in the placenta. This adhesion is mediated by a unique variant of PfEMP1, a parasite encoded, hyper-variable antigen placed on the surface of infected cells. This variant, called VAR2CSA, binds to chondroitin sulfate A on syncytiotrophoblasts in the intervillous space of placentas. VAR2CSA appears to only be expressed in the presence of a placenta, suggesting that its expression is actively repressed in men, children or non-pregnant women; however, the mechanism of repression is not understood. Using cultured parasite lines and reporter gene constructs, we show that the gene encoding VAR2CSA contains a small upstream open reading frame that acts to repress translation of the resulting mRNA, revealing a novel form of gene regulation in malaria parasites. The mechanism underlying this translational repression is reversible, allowing high levels of protein translation upon selection, thus potentially enabling parasites to upregulate expression of this variant antigen in the presence of the appropriate host tissue.
Infection by the protozoan parasite Plasmodium falciparum results in the most severe form of human malaria and is responsible for significant morbidity and mortality in the developing world. This disease can be particularly severe in pregnant women due to the specific adhesion of parasite-infected red blood cells within the placenta. Expression of a single gene called var2csa has been linked to targeting of the placenta, and thus this gene represents a key element in the virulence of P. falciparum infections. It was previously shown that var2csa is predominantly expressed by parasites in pregnant women, suggesting that parasites might have the ability to down regulate this gene when no placenta is available. Here we describe an upstream open reading frame (uORF)–mediated mechanism used by parasites to repress translation of var2csa mRNA, thus providing a mechanism for controlling gene expression at the level of protein translation. This mechanism has not previously been observed in malaria parasites, and may represent a form of regulation used to control expression of other genes within the genome.
|
49
|
Tautz D. Polycistronic peptide coding genes in eukaryotes--how widespread are they? Briefings in Functional Genomics and Proteomics 2008; 8:68-74. [PMID: 19074495 DOI: 10.1093/bfgp/eln054] [Citation(s) in RCA: 21] [Impact Index Per Article: 1.3] [Indexed: 11/13/2022]
Abstract
The classical textbook assumption for the structure of a eukaryotic gene is that it codes for a single polypeptide of more than 100 amino acids in length. This is also the implicit assumption in most gene annotation pipelines. A gene family has now been discovered in insects that shows that a eukaryotic mRNA can code for peptides as short as eleven amino acids and that a single mRNA can code for several such peptides. This raises the question whether short open reading frames might also have a functional potential in other mRNAs, in particular those that occur in the 5'-UTR of many mRNAs. A number of these have been shown to act in cis to regulate the translation of the main open reading frame of the mRNA. But there may be others that could act in trans on other biological processes. The question of how many peptide-coding genes may exist is therefore worth revisiting. This poses new bioinformatic challenges that can only be resolved through multiple genome comparisons within a range of evolutionary distances.
Affiliation(s)
- Diethard Tautz
- Max-Planck-Institut für Evolutionsbiologie, August-Thienemannstrasse 2, Plön, Germany.
|
50
|
Kochetov AV. Alternative translation start sites and hidden coding potential of eukaryotic mRNAs. Bioessays 2008; 30:683-91. [PMID: 18536038 DOI: 10.1002/bies.20771] [Citation(s) in RCA: 136] [Impact Index Per Article: 8.5] [Indexed: 11/09/2022]
Abstract
It is widely suggested that a eukaryotic mRNA typically contains one translation start site and encodes a single functional protein product. However, according to current points of view on translation initiation mechanisms, eukaryotic ribosomes can recognize several alternative translation start sites and the number of experimentally verified examples of alternative translation is growing rapidly. Also, the frequent occurrence of alternative translation events and their functional significance are supported by the results of computational evaluations. The functional role of alternative translation and its contribution to eukaryotic proteome complexity are discussed.
|