1
|
Vasylieva V, Arefiev I, Bourassa F, Trifiro FA, Brunet MA. Proteomics Can Rise to the Challenge of Pseudogenes' Coding Nature. J Proteome Res 2024; 23:5233-5249. [PMID: 39486438 PMCID: PMC11629383 DOI: 10.1021/acs.jproteome.4c00116] [Received: 02/19/2024] [Revised: 09/18/2024] [Accepted: 10/18/2024] [Indexed: 11/04/2024]
Abstract
Throughout the past decade, technological advances in genomics and transcriptomics have revealed pervasive translation throughout mammalian genomes. These putative proteins are usually excluded from proteomics analyses, as they are absent from common protein repositories. A sizable portion of these noncanonical proteins is translated from pseudogenes. Pseudogenes are commonly termed defective copies of coding genes unable to produce proteins. Here, we suggest that proteomics can help in their annotation. First, we define important terms and review specific examples underlining the caveats in pseudogene annotation and their coding potential. Then, we discuss the challenges inherent to pseudogenes that have thus far made it difficult to assess them with confidence in omics data. Finally, we identify recent developments in experimental procedures, instrumentation, and computational methods in proteomics that put the field in a unique position to solve the pseudogene annotation conundrum.
Affiliation(s)
- Valeriia Vasylieva
- Pediatrics Department, Université de Sherbrooke, Sherbrooke, Québec J1K 2R1, Canada
- Centre de Recherche du Centre hospitalier de l’université de Sherbrooke (CRCHUS), Sherbrooke, Québec J1E 4K8, Canada
| | - Ihor Arefiev
- Pediatrics Department, Université de Sherbrooke, Sherbrooke, Québec J1K 2R1, Canada
- Centre de Recherche du Centre hospitalier de l’université de Sherbrooke (CRCHUS), Sherbrooke, Québec J1E 4K8, Canada
| | - Francis Bourassa
- Pediatrics Department, Université de Sherbrooke, Sherbrooke, Québec J1K 2R1, Canada
- Centre de Recherche du Centre hospitalier de l’université de Sherbrooke (CRCHUS), Sherbrooke, Québec J1E 4K8, Canada
| | - Félix-Antoine Trifiro
- Pediatrics Department, Université de Sherbrooke, Sherbrooke, Québec J1K 2R1, Canada
- Centre de Recherche du Centre hospitalier de l’université de Sherbrooke (CRCHUS), Sherbrooke, Québec J1E 4K8, Canada
| | - Marie A. Brunet
- Pediatrics Department, Université de Sherbrooke, Sherbrooke, Québec J1K 2R1, Canada
- Centre de Recherche du Centre hospitalier de l’université de Sherbrooke (CRCHUS), Sherbrooke, Québec J1E 4K8, Canada
| |
|
2
|
Xiao X, Wang Y, Li T, Wang Q, Luo X, Li J, Gao L. Microproteins encoded by short open reading frames: Vital regulators in neurological diseases. Prog Neurobiol 2024; 243:102694. [PMID: 39586488 DOI: 10.1016/j.pneurobio.2024.102694] [Received: 07/08/2024] [Revised: 10/18/2024] [Accepted: 11/20/2024] [Indexed: 11/27/2024]
Abstract
Short open reading frames (sORFs) are frequently overlooked because of their historical classification as non-coding elements or dismissed as "transcriptional noise". However, advanced genomic and proteomic technologies have allowed for screening and validating sORF-encoded peptides, revealing their fundamental regulatory roles in cellular processes and sparking a growing interest in microprotein biology. In neuroscience, microproteins serve as neurotransmitters in signal transmission and regulate metabolism and emotions, exerting pivotal effects on neurological conditions such as nerve injury, neurogenic tumors, inflammation, and neurodegenerative diseases. This review summarizes the origins, characteristics, classifications, and functions of microproteins, focusing on their molecular mechanisms in neurological disorders. Potential applications, future perspectives, and challenges are discussed.
Affiliation(s)
- Xiao Xiao
- Laboratory of Molecular Translational Medicine, Center for Translational Medicine, Key Laboratory of Birth Defects and Related Diseases of Women and Children (Sichuan University), Ministry of Education, West China Second University Hospital, Sichuan University, Chengdu, Sichuan 610041, PR China; Department of Medical Genetics, West China Second University Hospital, Sichuan University, Chengdu, Sichuan 610041, PR China.
| | - Yitian Wang
- Laboratory of Molecular Translational Medicine, Center for Translational Medicine, Key Laboratory of Birth Defects and Related Diseases of Women and Children (Sichuan University), Ministry of Education, West China Second University Hospital, Sichuan University, Chengdu, Sichuan 610041, PR China; West China School of Basic Medical Sciences & Forensic Medicine, Sichuan University, Chengdu, Sichuan 610041, PR China.
| | - Tingyu Li
- Laboratory of Molecular Translational Medicine, Center for Translational Medicine, Key Laboratory of Birth Defects and Related Diseases of Women and Children (Sichuan University), Ministry of Education, West China Second University Hospital, Sichuan University, Chengdu, Sichuan 610041, PR China.
| | - Qiang Wang
- Laboratory of Molecular Translational Medicine, Center for Translational Medicine, Key Laboratory of Birth Defects and Related Diseases of Women and Children (Sichuan University), Ministry of Education, West China Second University Hospital, Sichuan University, Chengdu, Sichuan 610041, PR China.
| | - Xiaolei Luo
- Laboratory of Molecular Translational Medicine, Center for Translational Medicine, Key Laboratory of Birth Defects and Related Diseases of Women and Children (Sichuan University), Ministry of Education, West China Second University Hospital, Sichuan University, Chengdu, Sichuan 610041, PR China.
| | - Jingdong Li
- Institute of Hepato-Biliary-Pancreatic-Intestinal Disease, Affiliated Hospital of North Sichuan Medical College, Nanchong, Sichuan 637100, PR China.
| | - Linbo Gao
- Laboratory of Molecular Translational Medicine, Center for Translational Medicine, Key Laboratory of Birth Defects and Related Diseases of Women and Children (Sichuan University), Ministry of Education, West China Second University Hospital, Sichuan University, Chengdu, Sichuan 610041, PR China.
| |
|
3
|
Guillon C, Pichereaux C, Lazar I, Chaoui K, Mouton-Barbosa E, Liauzun M, Gourbeyre E, Altiner P, Bouyssié D, Stella A, Burlet-Schiltz O, Plaza S, Martineau Y, Fabre B. Mass Spectrometry-Based Workflow for the Identification and Quantification of Alternative and Canonical Proteins in Pancreatic Cancer Cells. Cells 2024; 13:1966. [PMID: 39682715 DOI: 10.3390/cells13231966] [Received: 10/04/2024] [Revised: 11/22/2024] [Accepted: 11/27/2024] [Indexed: 12/18/2024] Open
Abstract
The identification of small proteins and proteins produced from unannotated open reading frames (called alternative proteins or AltProts) has changed our vision of the proteome and has attracted more and more attention from the scientific community. Despite several studies investigating particular AltProts in diseases and demonstrating their importance in such contexts, we are still missing data on their expression and functions in many pathologies. Among these, pancreatic ductal adenocarcinoma (PDAC) is a particularly relevant case to study alternative proteins. Indeed, late detection of this disease, notably due to the lack of reliable biomarkers of early-stage PDAC, and the fact that tumors rapidly develop resistance to most of the treatments used in the clinic warrant the exploration of new repertoires of molecules. In the present article, we aim to investigate the alternative proteome of pancreatic cancer cell lines as a first attempt to decipher the expression of AltProts in PDAC. Thanks to a combined data-dependent and data-independent acquisition mass spectrometry workflow, we were able to identify tryptic peptides matching 113 AltProts in a panel of 6 cell lines. In addition, we identified AltProts differentially expressed between pancreatic cancer cell lines and other cells (HeLa and HEK293T). Finally, mining the TCGA and GTEx databases showed that the corresponding transcripts encoding several AltProts we identified are differentially expressed between PDAC tumors and normal tissues and are correlated with patient survival.
Affiliation(s)
- Clémence Guillon
- Laboratoire de Recherche en Sciences Végétales (LRSV), CNRS/UT3/INPT, 31320 Auzeville-Tolosane, France
| | - Carole Pichereaux
- Institut de Pharmacologie et de Biologie Structurale (IPBS), CNRS, UPS, Université de Toulouse, 31077 Toulouse, France
- Fédération de Recherche (FR3450), Agrobiosciences, Interactions et Biodiversité (AIB), CNRS, 31326 Toulouse, France
- Infrastructure Nationale de Protéomique, ProFI, FR 2048, 31077 Toulouse, France
| | - Ikrame Lazar
- MCD, Centre de Biologie Intégrative (CBI), CNRS, UT3, Université de Toulouse, 31400 Toulouse, France
| | - Karima Chaoui
- Institut de Pharmacologie et de Biologie Structurale (IPBS), CNRS, UPS, Université de Toulouse, 31077 Toulouse, France
- Infrastructure Nationale de Protéomique, ProFI, FR 2048, 31077 Toulouse, France
| | - Emmanuelle Mouton-Barbosa
- Institut de Pharmacologie et de Biologie Structurale (IPBS), CNRS, UPS, Université de Toulouse, 31077 Toulouse, France
- Infrastructure Nationale de Protéomique, ProFI, FR 2048, 31077 Toulouse, France
| | - Mehdi Liauzun
- Centre de Recherche en Cancérologie de Toulouse (CRCT), INSERM U1037, Université Toulouse III-Paul Sabatier, ERL5294 CNRS, 31432 Toulouse, France
- Equipe Labellisée Ligue Contre Le Cancer, Université Toulouse III-Paul Sabatier, 31000 Toulouse, France
| | - Edith Gourbeyre
- MCD, Centre de Biologie Intégrative (CBI), CNRS, UT3, Université de Toulouse, 31400 Toulouse, France
| | - Pinar Altiner
- Institut de Pharmacologie et de Biologie Structurale (IPBS), CNRS, UPS, Université de Toulouse, 31077 Toulouse, France
- Infrastructure Nationale de Protéomique, ProFI, FR 2048, 31077 Toulouse, France
| | - David Bouyssié
- Institut de Pharmacologie et de Biologie Structurale (IPBS), CNRS, UPS, Université de Toulouse, 31077 Toulouse, France
- Infrastructure Nationale de Protéomique, ProFI, FR 2048, 31077 Toulouse, France
| | - Alexandre Stella
- Institut de Pharmacologie et de Biologie Structurale (IPBS), CNRS, UPS, Université de Toulouse, 31077 Toulouse, France
- Infrastructure Nationale de Protéomique, ProFI, FR 2048, 31077 Toulouse, France
| | - Odile Burlet-Schiltz
- Institut de Pharmacologie et de Biologie Structurale (IPBS), CNRS, UPS, Université de Toulouse, 31077 Toulouse, France
- Infrastructure Nationale de Protéomique, ProFI, FR 2048, 31077 Toulouse, France
| | - Serge Plaza
- Laboratoire de Recherche en Sciences Végétales (LRSV), CNRS/UT3/INPT, 31320 Auzeville-Tolosane, France
| | - Yvan Martineau
- Centre de Recherche en Cancérologie de Toulouse (CRCT), INSERM U1037, Université Toulouse III-Paul Sabatier, ERL5294 CNRS, 31432 Toulouse, France
- Equipe Labellisée Ligue Contre Le Cancer, Université Toulouse III-Paul Sabatier, 31000 Toulouse, France
| | - Bertrand Fabre
- Laboratoire de Recherche en Sciences Végétales (LRSV), CNRS/UT3/INPT, 31320 Auzeville-Tolosane, France
| |
|
4
|
Tierney JAS, Świrski MI, Tjeldnes H, Kiran AM, Carancini G, Kiniry SJ, Michel AM, Kufel J, Valen E, Baranov PV. RiboSeq.Org: an integrated suite of resources for ribosome profiling data analysis and visualization. Nucleic Acids Res 2024:gkae1020. [PMID: 39540432 DOI: 10.1093/nar/gkae1020] [Received: 09/14/2024] [Revised: 10/11/2024] [Accepted: 10/16/2024] [Indexed: 11/16/2024] Open
Abstract
Ribosome profiling (Ribo-Seq) has revolutionised our understanding of translation, but the increasing complexity and volume of Ribo-Seq data present challenges for its reuse. Here, we formally introduce RiboSeq.Org, an integrated suite of resources designed to facilitate Ribo-Seq data analysis and visualisation within a web browser. RiboSeq.Org comprises several interconnected tools: GWIPS-viz for genome-wide visualisation, Trips-Viz for transcriptome-centric analysis, RiboGalaxy for data processing and the newly developed RiboSeq data portal (RDP) for centralised dataset identification and access. The RDP currently hosts preprocessed datasets corresponding to 14840 sequence libraries (samples) from 969 studies across 96 species, in various file formats along with standardised metadata. RiboSeq.Org addresses key challenges in Ribo-Seq data reuse through standardised sample preprocessing, semi-automated metadata curation and programmatic information access via a REST API and command-line utilities. RiboSeq.Org enhances the accessibility and utility of public Ribo-Seq data, enabling researchers to gain new insights into translational regulation and protein synthesis across diverse organisms and conditions. By providing these integrated, user-friendly resources, RiboSeq.Org aims to lower the barrier to reproducible research in the field of translatomics and promote more efficient utilisation of the wealth of available Ribo-Seq data.
Affiliation(s)
- Jack A S Tierney
- School of Biochemistry and Cell Biology, University College Cork, Western Rd, Cork, T12 CY82, Ireland
- SFI CRT in Genomics Data Science, University of Galway, University Rd, Galway, H91 TK33, Ireland
| | - Michał I Świrski
- Institute of Genetics and Biotechnology, Faculty of Biology, University of Warsaw, ul. Pawińskiego 5A, Warsaw, 02-106, Poland
| | - Håkon Tjeldnes
- School of Biochemistry and Cell Biology, University College Cork, Western Rd, Cork, T12 CY82, Ireland
- Computational Biology Unit, Department of Informatics, University of Bergen, Thormøhlensgate Bergen, 55N-5008, Norway
| | - Anmol M Kiran
- School of Biochemistry and Cell Biology, University College Cork, Western Rd, Cork, T12 CY82, Ireland
| | - Gionmattia Carancini
- School of Biochemistry and Cell Biology, University College Cork, Western Rd, Cork, T12 CY82, Ireland
- SFI CRT in Genomics Data Science, University of Galway, University Rd, Galway, H91 TK33, Ireland
| | - Stephen J Kiniry
- EIRNA Bio, Food Science and Technology Building, 1 College Rd, Cork, T12 Y337, Ireland
| | - Audrey M Michel
- EIRNA Bio, Food Science and Technology Building, 1 College Rd, Cork, T12 Y337, Ireland
| | - Joanna Kufel
- Institute of Genetics and Biotechnology, Faculty of Biology, University of Warsaw, ul. Pawińskiego 5A, Warsaw, 02-106, Poland
| | - Eivind Valen
- Computational Biology Unit, Department of Informatics, University of Bergen, Thormøhlensgate Bergen, 55N-5008, Norway
- Department of Biosciences, University of Oslo, Kristine Bonnevies hus, Blindernveien 31, 0731 Oslo, Norway
| | - Pavel V Baranov
- School of Biochemistry and Cell Biology, University College Cork, Western Rd, Cork, T12 CY82, Ireland
| |
|
5
|
Su H, Katz SG, Slavoff SA. Alternative transcripts recode human genes to express overlapping, frameshifted microproteins. bioRxiv 2024:2024.10.22.619581. [PMID: 39484585 PMCID: PMC11526972 DOI: 10.1101/2024.10.22.619581] [Indexed: 11/03/2024]
Abstract
Overlapping genes were thought to be essentially absent from the human genome until the discovery of abundant, frameshifted internal open reading frames (iORFs) nested within annotated protein coding sequences. However, it is currently unclear how many functional human iORFs exist and how they are expressed. We demonstrate that, in hundreds of cases, alternative transcript variants that bypass the start codon of annotated coding sequences (CDSs) can recode a human gene to express the iORF-encoded microprotein. While many human genes generate such non-coding alternative transcripts, they are poorly annotated. Here we develop a new analysis pipeline enabling the assignment of translated human iORFs to alternative transcripts, and provide long-read sequencing and molecular validation of their expression in dozens of cases. Finally, we demonstrate that a conserved DEDD2 iORF switches the function of this gene from pro- to anti-apoptotic. This work thus demonstrates that alternative transcript variants can broadly reprogram human genes to express frameshifted iORFs, revealing new levels of complexity in the human transcriptome and proteome.
Affiliation(s)
- Haomiao Su
- Department of Chemistry, Yale University, New Haven, CT 06520, USA
- Institute for Biomolecular Design and Discovery, Yale University, West Haven, CT 06516, USA
| | - Samuel G Katz
- Department of Pathology, Yale School of Medicine, New Haven, CT 06525, USA
| | - Sarah A Slavoff
- Department of Chemistry, Yale University, New Haven, CT 06520, USA
- Institute for Biomolecular Design and Discovery, Yale University, West Haven, CT 06516, USA
- Department of Molecular Biophysics and Biochemistry, Yale University, New Haven, CT 06529, USA
| |
|
6
|
Chanut-Delalande H, Zanet J. Small ORFs, Big Insights: Drosophila as a Model to Unraveling Microprotein Functions. Cells 2024; 13:1645. [PMID: 39404408 PMCID: PMC11475943 DOI: 10.3390/cells13191645] [Received: 09/13/2024] [Revised: 09/27/2024] [Accepted: 10/02/2024] [Indexed: 10/19/2024] Open
Abstract
Recently developed experimental and computational approaches to identify putative coding small ORFs (smORFs) in genomes have revealed thousands of smORFs localized within coding and non-coding RNAs. They can be translated into smORF peptides or microproteins, which are defined as less than 100 amino acids in length. The identification of such a large number of potential biological regulators represents a major challenge, notably for elucidating the in vivo functions of these microproteins. Since the emergence of this field, Drosophila has proved to be a valuable model for studying the biological functions of microproteins in vivo. In this review, we outline how the smORF field emerged and the nomenclature used in this domain. We summarize the technical challenges associated with identifying putative coding smORFs in the genome and the relevant translated microproteins. Finally, recent findings on one of the best studied smORF peptides, Pri, and other microproteins studied so far in Drosophila are described. These studies highlight the diverse roles that microproteins can fulfil in the regulation of various molecular targets involved in distinct cellular processes during animal development and physiology. Given the recent emergence of the microprotein field and the associated discoveries, the microproteome represents an exquisite source of potentially bioactive molecules, whose in vivo biological functions can be explored in the Drosophila model.
Affiliation(s)
| | - Jennifer Zanet
- Unité de Biologie Moléculaire, Cellulaire et du Développement (MCD), UMR 5077, Centre de Biologie Intégrative (CBI), CNRS, UPS, Université de Toulouse, 31062 Toulouse, France
| |
|
7
|
Garcia-Del Rio DF, Derhourhi M, Bonnefond A, Leblanc S, Guilloy N, Roucou X, Eyckerman S, Gevaert K, Salzet M, Cardon T. Deciphering the ghost proteome in ovarian cancer cells by deep proteogenomic characterization. Cell Death Dis 2024; 15:712. [PMID: 39349928 PMCID: PMC11442847 DOI: 10.1038/s41419-024-07046-1] [Received: 03/19/2024] [Revised: 08/29/2024] [Accepted: 09/02/2024] [Indexed: 10/04/2024]
Abstract
Proteogenomics is becoming a powerful tool in personalized medicine by linking genomics, transcriptomics and mass spectrometry (MS)-based proteomics. Due to increasing evidence of alternative open reading frame-encoded proteins (AltProts), proteogenomics has a high potential to unravel the characteristics, variants, and expression levels of the alternative proteome, in addition to already annotated proteins (RefProts). To obtain a broader view of the proteome of ovarian cancer cells compared to ovarian epithelial cells, cell-specific total RNA-sequencing profiles and customized protein databases were generated. In total, 128 RefProts and 30 AltProts were identified exclusively in SKOV-3 and PEO-4 cells. Among them, an AltProt variant of IP_715944, translated from DHX8, was found mutated (p.Leu44Pro). We show high variation in protein expression levels of RefProts and AltProts in different subcellular compartments. The presence of 117 RefProt and two AltProt variants was described, along with their possible implications in the different physiological/pathological characteristics. To identify the possible involvement of AltProts in cellular processes, cross-linking-MS (XL-MS) was performed in each cell line to identify AltProt-RefProt interactions. This approach revealed an interaction between POLD3 and the AltProt IP_183088, which, after molecular docking, was placed between the POLD3-POLD2 binding sites, highlighting its possible involvement in DNA replication and repair.
Affiliation(s)
- Diego Fernando Garcia-Del Rio
- Univ. Lille, Inserm, CHU Lille, U1192, Protéomique Réponse Inflammatoire Spectrométrie de Masse - PRISM, F-59000, Lille, France
- VIB Center for Medical Biotechnology, VIB, Ghent, 9052, Belgium
- Department of Biomolecular Medicine, Ghent University, Ghent, 9052, Belgium
| | - Mehdi Derhourhi
- Université de Lille, Inserm/CNRS UMR 1283/8199, Pasteur Institute of Lille, EGID, Lille, France University of Lille, Lille, France
| | - Amelie Bonnefond
- Université de Lille, Inserm/CNRS UMR 1283/8199, Pasteur Institute of Lille, EGID, Lille, France University of Lille, Lille, France
- Department of Metabolism, Digestion and Reproduction, Imperial College London, London, UK
| | - Sébastien Leblanc
- Department of Biochemistry and Functional Genomics, Université de Sherbrooke, Sherbrooke, Québec, J1E4K8, Canada
| | - Noé Guilloy
- Department of Biochemistry and Functional Genomics, Université de Sherbrooke, Sherbrooke, Québec, J1E4K8, Canada
| | - Xavier Roucou
- Department of Biochemistry and Functional Genomics, Université de Sherbrooke, Sherbrooke, Québec, J1E4K8, Canada
| | - Sven Eyckerman
- VIB Center for Medical Biotechnology, VIB, Ghent, 9052, Belgium
- Department of Biomolecular Medicine, Ghent University, Ghent, 9052, Belgium
| | - Kris Gevaert
- VIB Center for Medical Biotechnology, VIB, Ghent, 9052, Belgium
- Department of Biomolecular Medicine, Ghent University, Ghent, 9052, Belgium
| | - Michel Salzet
- Univ. Lille, Inserm, CHU Lille, U1192, Protéomique Réponse Inflammatoire Spectrométrie de Masse - PRISM, F-59000, Lille, France.
| | - Tristan Cardon
- Univ. Lille, Inserm, CHU Lille, U1192, Protéomique Réponse Inflammatoire Spectrométrie de Masse - PRISM, F-59000, Lille, France.
| |
|
8
|
Britto-Borges T, Gehring NH, Boehm V, Dieterich C. NMDtxDB: data-driven identification and annotation of human NMD target transcripts. RNA 2024; 30:1277-1291. [PMID: 39095083 PMCID: PMC11404449 DOI: 10.1261/rna.080066.124] [Received: 04/19/2024] [Accepted: 07/11/2024] [Indexed: 08/04/2024]
Abstract
The nonsense-mediated RNA decay (NMD) pathway is a crucial mechanism of mRNA quality control. Current annotations of NMD substrate RNAs are rarely data-driven, but use generally established rules. We present a data set with four cell lines and combinations for SMG5, SMG6, and SMG7 knockdowns or SMG7 knockout. Based on this data set, we implemented a workflow that combines Nanopore and Illumina sequencing to assemble a transcriptome, which is enriched for NMD target transcripts. Moreover, we use coding sequence information (CDS) from Ensembl, Gencode consensus Ribo-seq ORFs, and OpenProt to enhance the CDS annotation of novel transcript isoforms. In summary, 302,889 transcripts were obtained from the transcriptome assembly process, out of which 24% are absent from Ensembl database annotations, 48,213 contain a premature stop codon, and 6433 are significantly upregulated in three or more comparisons of NMD active versus deficient cell lines. We present an in-depth view of these results through the NMDtxDB database, which is available at https://shiny.dieterichlab.org/app/NMDtxDB, and supports the study of NMD-sensitive transcripts. We open sourced our implementation of the respective web-application and analysis workflow at https://github.com/dieterich-lab/NMDtxDB and https://github.com/dieterich-lab/nmd-wf.
Affiliation(s)
- Thiago Britto-Borges
- Section of Bioinformatics and Systems Cardiology, Department of Internal Medicine III and Klaus Tschira Institute for Integrative Computational Cardiology, Heidelberg University Hospital, 69120 Heidelberg, Germany
- DZHK (German Centre for Cardiovascular Research), Partner site Heidelberg/Mannheim, 69120 Heidelberg, Germany
| | - Niels H Gehring
- Institute for Genetics, University of Cologne, 50674 Cologne, Germany
- Center for Molecular Medicine Cologne (CMMC), University of Cologne, 50674 Cologne, Germany
| | - Volker Boehm
- Institute for Genetics, University of Cologne, 50674 Cologne, Germany
- Center for Molecular Medicine Cologne (CMMC), University of Cologne, 50674 Cologne, Germany
| | - Christoph Dieterich
- Section of Bioinformatics and Systems Cardiology, Department of Internal Medicine III and Klaus Tschira Institute for Integrative Computational Cardiology, Heidelberg University Hospital, 69120 Heidelberg, Germany
- DZHK (German Centre for Cardiovascular Research), Partner site Heidelberg/Mannheim, 69120 Heidelberg, Germany
| |
|
9
|
Periasamy P, Joseph C, Campos A, Rajandran S, Batho C, Hudson JE, Sivakumaran H, Kore H, Datta K, Yeong J, Gowda H. Regulation of non-canonical proteins from diverse origins through the nonsense-mediated mRNA decay pathway. Proteomics 2024; 24:e2300361. [PMID: 38350726 DOI: 10.1002/pmic.202300361] [Received: 09/18/2023] [Revised: 01/28/2024] [Accepted: 02/01/2024] [Indexed: 02/15/2024]
Abstract
Immunotherapy harnesses neoantigens encoded within the human genome, but their therapeutic potential is hampered by low expression, which may be controlled by the nonsense-mediated mRNA decay (NMD) pathway. This study investigates the impact of UPF1-knockdown on the expression of non-canonical/mutant proteins, employing proteogenomics to explore UPF1's role within the NMD pathway. Additionally, we conducted a comprehensive pan-cancer analysis of UPF1 expression and evaluated UPF1 expression in Triple-Negative Breast Cancer (TNBC) tissue in vivo. Our findings reveal that UPF1-knockdown leads to increased translation of non-canonical/mutant proteins, particularly those originating from retained-introns, pseudogenes, long non-coding RNAs, and unannotated transcript biotypes. Moreover, our analysis demonstrates elevated UPF1 expression in various cancer types, with notably heightened protein levels in patient-derived TNBC tumors compared to adjacent tissues. This study elucidates UPF1's role in mitigating transcriptional noise by degrading transcripts encoding non-canonical/mutant proteins. Targeting this mechanism may reveal a new spectrum of neoantigens accessible to the antigen presentation pathway. Our novel findings provide a strong foundation for the development of therapeutic strategies aimed at targeting UPF1 or modulating the NMD pathway.
Affiliation(s)
- Parthiban Periasamy
- Institute of Molecular and Cell Biology, Agency for Science, Technology, and Research (A*STAR), Singapore, Singapore
- QIMR Berghofer Medical Research Institute, Brisbane, Queensland, Australia
| | - Craig Joseph
- Institute of Molecular and Cell Biology, Agency for Science, Technology, and Research (A*STAR), Singapore, Singapore
| | - Adrian Campos
- QIMR Berghofer Medical Research Institute, Brisbane, Queensland, Australia
- Regeneron Genetics Center, Tarrytown, New York, USA
| | - Sureka Rajandran
- QIMR Berghofer Medical Research Institute, Brisbane, Queensland, Australia
- Flow Cytometry Department, Covance Central Laboratory Services, Singapore, 609917, Singapore
| | - Christopher Batho
- QIMR Berghofer Medical Research Institute, Brisbane, Queensland, Australia
| | - James E Hudson
- QIMR Berghofer Medical Research Institute, Brisbane, Queensland, Australia
| | - Haran Sivakumaran
- QIMR Berghofer Medical Research Institute, Brisbane, Queensland, Australia
| | - Hitesh Kore
- QIMR Berghofer Medical Research Institute, Brisbane, Queensland, Australia
| | - Keshava Datta
- QIMR Berghofer Medical Research Institute, Brisbane, Queensland, Australia
| | - Joe Yeong
- Institute of Molecular and Cell Biology, Agency for Science, Technology, and Research (A*STAR), Singapore, Singapore
| | - Harsha Gowda
- QIMR Berghofer Medical Research Institute, Brisbane, Queensland, Australia
| |
|
10
|
Puchalski M, Tretiakow D, Skorek A, Szydłowski K, Stodulski D, Mikaszewski B, Odroniec A, Musiał N, Thiel M, Czaplewska P, Ołdziej S. Comparison of Peptidomes Extracted from Healthy Tissue and Tumor Tissue of the Parotid Glands and Saliva Samples. Int J Mol Sci 2024; 25:8799. [PMID: 39201484 PMCID: PMC11354857 DOI: 10.3390/ijms25168799] [Received: 06/21/2024] [Revised: 08/04/2024] [Accepted: 08/07/2024] [Indexed: 09/02/2024] Open
Abstract
Salivary gland tumors are highly variable in clinical presentation and histology. The World Health Organization (WHO) classifies 22 types of malignant and 11 types of benign tumors of the salivary glands. Diagnosis of salivary gland tumors is based on imaging (ultrasound, magnetic resonance imaging) and fine-needle aspiration biopsy, but the final diagnosis is based on histopathological examination of the removed tumor tissue. In this pilot study, we are testing a new approach to identifying peptide biomarkers in saliva that can be used to diagnose salivary gland tumors. The research material for the peptidomic studies was extracts from washings of neoplastic tissues and healthy tissues (control samples). At the same time, saliva samples from patients and healthy individuals were analyzed. The comparison of the peptidome composition of tissue extracts and saliva samples may allow the identification of potential peptide markers of salivary gland tumors in patients' saliva. The peptidome compositions extracted from 18 tumor and 18 healthy tissue samples, patients' saliva samples (11 samples), and healthy saliva samples (8 samples) were analyzed by LC-MS tandem mass spectrometry. A group of 109 peptides was identified that were present only in the tumor tissue extracts and in the patients' saliva samples. Some of the identified peptides were derived from proteins previously suggested as potential biomarkers of salivary gland tumors (ANXA1, BPIFA2, FGB, GAPDH, HSPB1, IGHG1, VIM) or tumors of other tissues or organs (SERPINA1, APOA2, CSTB, GSTP1, S100A8, S100A9, TPI1). Unfortunately, none of the identified peptides were present in all samples analyzed. This may be due to the high heterogeneity of this type of cancer. The surprising result was that extracts from tumor tissue did not contain peptides derived from salivary gland-specific proteins (STATH, SMR3B, HTN1, HTN3). These results could suggest that the developing tumor suppresses the production of proteins that are essential components of saliva.
Affiliation(s)
- Michał Puchalski
  - Intercollegiate Faculty of Biotechnology UG&MUG, University of Gdańsk, Abrahama 58, 80-307 Gdańsk, Poland
- Dmitry Tretiakow
  - Department of Otolaryngology, the Nicolaus Copernicus Hospital in Gdansk, Copernicus Healthcare Entity, Powstańców Warszawskich 1/2, 80-152 Gdansk, Poland
  - Department of Otolaryngology, Faculty of Medicine, Medical University of Gdansk, Smoluchowskiego 17, 80-214 Gdansk, Poland
- Andrzej Skorek
  - Department of Otolaryngology, the Nicolaus Copernicus Hospital in Gdansk, Copernicus Healthcare Entity, Powstańców Warszawskich 1/2, 80-152 Gdansk, Poland
  - Department of Otolaryngology, Faculty of Medicine, Medical University of Gdansk, Smoluchowskiego 17, 80-214 Gdansk, Poland
- Konrad Szydłowski
  - Department of Otolaryngology, the Nicolaus Copernicus Hospital in Gdansk, Copernicus Healthcare Entity, Powstańców Warszawskich 1/2, 80-152 Gdansk, Poland
- Dominik Stodulski
  - Department of Otolaryngology, Faculty of Medicine, Medical University of Gdansk, Smoluchowskiego 17, 80-214 Gdansk, Poland
- Bogusław Mikaszewski
  - Department of Otolaryngology, Faculty of Medicine, Medical University of Gdansk, Smoluchowskiego 17, 80-214 Gdansk, Poland
- Amadeusz Odroniec
  - Intercollegiate Faculty of Biotechnology UG&MUG, University of Gdańsk, Abrahama 58, 80-307 Gdańsk, Poland
- Natalia Musiał
  - Intercollegiate Faculty of Biotechnology UG&MUG, University of Gdańsk, Abrahama 58, 80-307 Gdańsk, Poland
- Marcel Thiel
  - Intercollegiate Faculty of Biotechnology UG&MUG, University of Gdańsk, Abrahama 58, 80-307 Gdańsk, Poland
- Paulina Czaplewska
  - Intercollegiate Faculty of Biotechnology UG&MUG, University of Gdańsk, Abrahama 58, 80-307 Gdańsk, Poland
- Stanisław Ołdziej
  - Intercollegiate Faculty of Biotechnology UG&MUG, University of Gdańsk, Abrahama 58, 80-307 Gdańsk, Poland
11
Peng M, Zhou Y, Wan C. Identification of phosphorylated small ORF-encoded peptides in Hep3B cells by LC/MS/MS. J Proteomics 2024; 303:105214. [PMID: 38823442 DOI: 10.1016/j.jprot.2024.105214] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 02/20/2024] [Revised: 04/30/2024] [Accepted: 05/29/2024] [Indexed: 06/03/2024]
Abstract
Small ORF-encoded peptides (SEPs) are a class of low-molecular-weight proteins and peptides comprising <100 amino acids with important functions in various life activities. Despite their short sequence length, SEPs may also carry post-translational modifications (PTMs). Phosphorylation is one of the most essential PTMs of proteins. In this work, we enriched phosphopeptides with IMAC and TiO2 materials and analyzed the phosphorylated SEPs in Hep3B cells. A total of 24 phosphorylated SEPs were identified, 11 of which were encoded by ncRNA. Sequence analysis showed that the general characteristics of phosphorylated SEPs are roughly the same as those of canonical proteins. In addition, two phosphorylated SEPs have the Stathmin family signature 2 motif, which can regulate the microtubule cytoskeleton. Some SEPs have domains or signal peptides, indicating their specific functions and subcellular locations. Kinase network analysis found a small number of kinases that may be a clue to the specific functions of some SEPs. However, only one-fifth of the predicted phosphorylation sites were identified by LC/MS/MS, indicating that many SEP PTMs remain to be uncovered and verified. This study helps expand our understanding of SEPs and provides information for further investigation of SEP function. SIGNIFICANCE: Small ORF-encoded peptides (SEPs) are important in various life activities. Despite their short sequence length (<100 amino acids), SEPs may also carry post-translational modifications (PTMs). Phosphorylation is one of the most essential PTMs of proteins. We enriched phosphopeptides and analyzed the phosphorylated SEPs in Hep3B cells. This is the first systematic exploration of the PTMs of SEPs. Kinase network analysis found a small number of kinases that may be a clue to the specific functions of SEPs. More SEP PTMs remain to be uncovered and verified. This study helps expand our understanding of SEPs and provides information for further investigation of SEP function.
Affiliation(s)
- Mingbo Peng
  - School of Life Sciences, and Hubei Key Laboratory of Genetic Regulation and Integrative Biology, Central China Normal University, Wuhan, Hubei 430079, People's Republic of China
- Yutian Zhou
  - School of Life Sciences, and Hubei Key Laboratory of Genetic Regulation and Integrative Biology, Central China Normal University, Wuhan, Hubei 430079, People's Republic of China
- Cuihong Wan
  - School of Life Sciences, and Hubei Key Laboratory of Genetic Regulation and Integrative Biology, Central China Normal University, Wuhan, Hubei 430079, People's Republic of China
12
Kim KH, Lee CB. Socialized mitochondria: mitonuclear crosstalk in stress. Exp Mol Med 2024; 56:1033-1042. [PMID: 38689084 PMCID: PMC11148012 DOI: 10.1038/s12276-024-01211-4] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 11/30/2023] [Revised: 01/27/2024] [Accepted: 02/07/2024] [Indexed: 05/02/2024] Open
Abstract
Traditionally, mitochondria are considered sites of energy production. However, recent studies have suggested that mitochondria are signaling organelles that are involved in intracellular interactions with other organelles. Remarkably, stressed mitochondria appear to induce a beneficial response that restores mitochondrial function and cellular homeostasis. These mitochondrial stress-centered signaling pathways have been rapidly elucidated in multiple organisms. In this review, we examine current perspectives on how mitochondria communicate with the rest of the cell, highlighting mitochondria-to-nucleus (mitonuclear) communication under various stresses. Our understanding of mitochondria as signaling organelles may provide new insights into disease susceptibility and lifespan extension.
Affiliation(s)
- Kyung Hwa Kim
  - Department of Health Sciences, The Graduate School of Dong-A University, 840 Hadan-dong, Saha-gu, Busan, 49315, Korea
- Cho Bi Lee
  - Department of Health Sciences, The Graduate School of Dong-A University, 840 Hadan-dong, Saha-gu, Busan, 49315, Korea
13
Peng Z, Li J, Jiang X, Wan C. sOCP: a framework predicting smORF coding potential based on TIS and in-frame features and effectively applied in the human genome. Brief Bioinform 2024; 25:bbae147. [PMID: 38600664 PMCID: PMC11006793 DOI: 10.1093/bib/bbae147] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 12/05/2023] [Revised: 02/25/2024] [Accepted: 03/19/2024] [Indexed: 04/12/2024] Open
Abstract
Small open reading frames (smORFs) have been acknowledged to play various roles in essential biological pathways and to affect human health, from diabetes to tumorigenesis. Predicting smORFs in silico is a prerequisite for processing omics data. Here, we propose the smORF-coding-potential-predicting framework, sOCP, which provides functions to construct a model for predicting novel smORFs in a given species. The sOCP model constructed for human was based on in-frame features and the nucleotide bias around the start codon; this small feature subset proved sufficient and avoids the overfitting problems of more complicated models. It achieved better prediction metrics than previous methods and correlated closely with experimental evidence in a heterogeneous dataset. The model was also applied to Rattus norvegicus and exhibited satisfactory performance. We then scanned the human genome for smORFs with ATG and non-ATG start codons and generated a database containing about a million novel smORFs with coding potential. Around 72 000 smORFs are located in lncRNA regions of the genome. The smORF-encoded peptides may be involved in biological pathways rare for canonical proteins, including the glucocorticoid catabolic process and the prokaryotic defense system. Our work provides a model and database for human smORF investigation and a convenient tool for further smORF prediction in other species.
Affiliation(s)
- Zhao Peng
  - School of Life Sciences, and Hubei Key Laboratory of Genetic Regulation and Integrative Biology, Central China Normal University, Wuhan 430079, Hubei, People’s Republic of China
- Jiaqiang Li
  - School of Computer Science, and Hubei Provincial Key Laboratory of Artificial Intelligence and Smart Learning, Central China Normal University, Wuhan 430079, Hubei, People’s Republic of China
- Xingpeng Jiang
  - School of Computer Science, and Hubei Provincial Key Laboratory of Artificial Intelligence and Smart Learning, Central China Normal University, Wuhan 430079, Hubei, People’s Republic of China
- Cuihong Wan
  - School of Life Sciences, and Hubei Key Laboratory of Genetic Regulation and Integrative Biology, Central China Normal University, Wuhan 430079, Hubei, People’s Republic of China
14
Valdivia-Francia F, Sendoel A. No country for old methods: New tools for studying microproteins. iScience 2024; 27:108972. [PMID: 38333695 PMCID: PMC10850755 DOI: 10.1016/j.isci.2024.108972] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 02/10/2024] Open
Abstract
Microproteins encoded by small open reading frames (sORFs) have emerged as a fascinating frontier in genomics. Traditionally overlooked due to their small size, recent technological advancements such as ribosome profiling, mass spectrometry-based strategies and advanced computational approaches have led to the annotation of more than 7000 sORFs in the human genome. Despite the vast progress, only a tiny portion of these microproteins has been characterized, and an important challenge in the field lies in identifying functionally relevant microproteins and understanding their role in different cellular contexts. In this review, we explore the recent advancements in sORF research, focusing on the new methodologies and computational approaches that have facilitated their identification and functional characterization. Leveraging these new tools holds great promise for dissecting the diverse cellular roles of microproteins and will ultimately pave the way for understanding their role in the pathogenesis of diseases and identifying new therapeutic targets.
Affiliation(s)
- Fabiola Valdivia-Francia
  - University of Zurich, Institute for Regenerative Medicine (IREM), Wagistrasse 12, 8952 Schlieren-Zurich, Switzerland
  - Life Science Zurich Graduate School, Molecular Life Science Program, University of Zurich/ETH Zurich, Schlieren-Zurich, Switzerland
- Ataman Sendoel
  - University of Zurich, Institute for Regenerative Medicine (IREM), Wagistrasse 12, 8952 Schlieren-Zurich, Switzerland
15
Cao X, Sun S, Xing J. A Massive Proteogenomic Screen Identifies Thousands of Novel Peptides From the Human "Dark" Proteome. Mol Cell Proteomics 2024; 23:100719. [PMID: 38242438 PMCID: PMC10867589 DOI: 10.1016/j.mcpro.2024.100719] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 05/02/2023] [Revised: 01/01/2024] [Accepted: 01/16/2024] [Indexed: 01/21/2024] Open
Abstract
Although the human gene annotation has been continuously improved over the past 2 decades, numerous studies demonstrated the existence of a "dark proteome", consisting of proteins that were critical for biological processes but not included in widely used gene catalogs. The Genotype-Tissue Expression project generated more than 15,000 RNA-seq datasets from multiple tissues, which modeled 30 million transcripts in the human genome. To provide a resource of high-confidence novel proteins from the dark proteome, we screened 50,000 mass spectrometry runs from over 900 projects to identify proteins translated from the Genotype-Tissue Expression transcript model with proteomic support. We also integrated 3.8 million common genetic variants from the gnomAD database to improve peptide identification. As a result, we identified 170,529 novel peptides with proteomic evidence, of which 6048 passed the strictest standard we defined and were supported by PepQuery. We provided a user-friendly website (https://ncorf.genes.fun/) for researchers to check the evidence of novel peptides from their studies. The findings will improve our understanding of coding genes and facilitate genomic data interpretation in biomedical research.
Affiliation(s)
- Xiaolong Cao
  - Department of Anesthesiology, Zhujiang Hospital, Southern Medical University, Guangzhou, Guangdong, China; Department of Genetics, Rutgers, The State University of New Jersey, Piscataway, New Jersey, USA; Human Genetic Institute of New Jersey, Rutgers, The State University of New Jersey, Piscataway, New Jersey, USA
- Siqi Sun
  - Department of Genetics, Rutgers, The State University of New Jersey, Piscataway, New Jersey, USA; Human Genetic Institute of New Jersey, Rutgers, The State University of New Jersey, Piscataway, New Jersey, USA
- Jinchuan Xing
  - Department of Genetics, Rutgers, The State University of New Jersey, Piscataway, New Jersey, USA; Human Genetic Institute of New Jersey, Rutgers, The State University of New Jersey, Piscataway, New Jersey, USA
16
Santos LGC, Parreira VDSC, da Silva EMG, Santos MDM, Fernandes ADF, Neves-Ferreira AGDC, Carvalho PC, Freitas FCDP, Passetti F. SpliceProt 2.0: A Sequence Repository of Human, Mouse, and Rat Proteoforms. Int J Mol Sci 2024; 25:1183. [PMID: 38256255 PMCID: PMC10816255 DOI: 10.3390/ijms25021183] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 10/31/2023] [Revised: 12/15/2023] [Accepted: 01/03/2024] [Indexed: 01/24/2024] Open
Abstract
SpliceProt 2.0 is a public proteogenomics database that aims to list the sequence of known proteins and potential new proteoforms in human, mouse, and rat proteomes. This updated repository provides an even broader range of computationally translated proteins and serves, for example, to aid with proteomic validation of splice variants absent from the reference UniProtKB/SwissProt database. We demonstrate the value of SpliceProt 2.0 to predict orthologous proteins between humans and murines based on transcript reconstruction, sequence annotation and detection at the transcriptome and proteome levels. In this release, the annotation data used in the reconstruction of transcripts based on the methodology of ternary matrices were acquired from new databases such as Ensembl, UniProt, and APPRIS. Another innovation implemented in the pipeline is the exclusion of transcripts predicted to be susceptible to degradation through the NMD pathway. Taken together, our repository and its applications represent a valuable resource for the proteogenomics community.
Affiliation(s)
- Letícia Graziela Costa Santos
  - Instituto Carlos Chagas, Fundação Oswaldo Cruz (FIOCRUZ), Rua Professor Algacyr Munhoz Mader 3775, Cidade Industrial De Curitiba, Curitiba 81310-020, PR, Brazil
- Vinícius da Silva Coutinho Parreira
  - Instituto Carlos Chagas, Fundação Oswaldo Cruz (FIOCRUZ), Rua Professor Algacyr Munhoz Mader 3775, Cidade Industrial De Curitiba, Curitiba 81310-020, PR, Brazil
- Esdras Matheus Gomes da Silva
  - Instituto Carlos Chagas, Fundação Oswaldo Cruz (FIOCRUZ), Rua Professor Algacyr Munhoz Mader 3775, Cidade Industrial De Curitiba, Curitiba 81310-020, PR, Brazil
  - Laboratory of Toxinology, Oswaldo Cruz Institute, Fundação Oswaldo Cruz (FIOCRUZ), Av. Brazil 4036, Campus Maré, Rio de Janeiro 21040-361, RJ, Brazil
- Marlon Dias Mariano Santos
  - Instituto Carlos Chagas, Fundação Oswaldo Cruz (FIOCRUZ), Rua Professor Algacyr Munhoz Mader 3775, Cidade Industrial De Curitiba, Curitiba 81310-020, PR, Brazil
- Alexander da Franca Fernandes
  - Instituto Carlos Chagas, Fundação Oswaldo Cruz (FIOCRUZ), Rua Professor Algacyr Munhoz Mader 3775, Cidade Industrial De Curitiba, Curitiba 81310-020, PR, Brazil
- Ana Gisele da Costa Neves-Ferreira
  - Laboratory of Toxinology, Oswaldo Cruz Institute, Fundação Oswaldo Cruz (FIOCRUZ), Av. Brazil 4036, Campus Maré, Rio de Janeiro 21040-361, RJ, Brazil
- Paulo Costa Carvalho
  - Instituto Carlos Chagas, Fundação Oswaldo Cruz (FIOCRUZ), Rua Professor Algacyr Munhoz Mader 3775, Cidade Industrial De Curitiba, Curitiba 81310-020, PR, Brazil
- Flávia Cristina de Paula Freitas
  - Instituto Carlos Chagas, Fundação Oswaldo Cruz (FIOCRUZ), Rua Professor Algacyr Munhoz Mader 3775, Cidade Industrial De Curitiba, Curitiba 81310-020, PR, Brazil
  - Departamento de Genética e Evolução, Universidade Federal de São Carlos (UFSCar), Rodovia Washington Luis, Km 235, São Carlos 13565-905, SP, Brazil
- Fabio Passetti
  - Instituto Carlos Chagas, Fundação Oswaldo Cruz (FIOCRUZ), Rua Professor Algacyr Munhoz Mader 3775, Cidade Industrial De Curitiba, Curitiba 81310-020, PR, Brazil
17
Leblanc S, Yala F, Provencher N, Lucier JF, Levesque M, Lapointe X, Jacques JF, Fournier I, Salzet M, Ouangraoua A, Scott MS, Boisvert FM, Brunet MA, Roucou X. OpenProt 2.0 builds a path to the functional characterization of alternative proteins. Nucleic Acids Res 2024; 52:D522-D528. [PMID: 37956315 PMCID: PMC10767855 DOI: 10.1093/nar/gkad1050] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 09/25/2023] [Revised: 10/20/2023] [Accepted: 10/23/2023] [Indexed: 11/15/2023] Open
Abstract
The OpenProt proteogenomic resource (https://www.openprot.org/) provides users with a complete and freely accessible set of non-canonical or alternative open reading frames (AltORFs) within the transcriptome of various species, as well as functional annotations of the corresponding protein sequences not found in standard databases. Enhancements in this update are largely the result of user feedback and include the prediction of structure, subcellular localization, and intrinsic disorder, using cutting-edge algorithms based on machine learning techniques. The mass spectrometry pipeline now integrates a machine learning-based peptide rescoring method to improve peptide identification. We continue to help users explore this cryptic proteome by providing OpenCustomDB, a tool that enables users to build their own customized protein databases, and OpenVar, a genomic annotator including genetic variants within AltORFs and protein sequences. A new interface improves the visualization of all functional annotations, including a spectral viewer and the prediction of multicoding genes. All data on OpenProt are freely available and downloadable. Overall, OpenProt continues to establish itself as an important resource for the exploration and study of new proteins.
Affiliation(s)
- Sébastien Leblanc
  - Department of Biochemistry and Functional Genomics, Université de Sherbrooke, 3201 Jean Mignault, Sherbrooke, QC J1E 4K8, Canada
- Feriel Yala
  - Department of Biochemistry and Functional Genomics, Université de Sherbrooke, 3201 Jean Mignault, Sherbrooke, QC J1E 4K8, Canada
- Nicolas Provencher
  - Department of Biochemistry and Functional Genomics, Université de Sherbrooke, 3201 Jean Mignault, Sherbrooke, QC J1E 4K8, Canada
- Jean-François Lucier
  - Center for Computational Science, Université de Sherbrooke, Sherbrooke, QC J1K 2R1, Canada
  - Department of Biology, Université de Sherbrooke, Sherbrooke, QC J1K 2R1, Canada
- Maxime Levesque
  - Center for Computational Science, Université de Sherbrooke, Sherbrooke, QC J1K 2R1, Canada
- Xavier Lapointe
  - Department of Pediatrics, Medical Genetics Service, Université de Sherbrooke, Sherbrooke, QC J1H 5N4, Canada
- Jean-Francois Jacques
  - Department of Biochemistry and Functional Genomics, Université de Sherbrooke, 3201 Jean Mignault, Sherbrooke, QC J1E 4K8, Canada
- Isabelle Fournier
  - INSERM U1192, Laboratoire Protéomique, Réponse Inflammatoire & Spectrométrie de Masse (PRISM), Université de Lille, F-59000 Lille, France
- Michel Salzet
  - INSERM U1192, Laboratoire Protéomique, Réponse Inflammatoire & Spectrométrie de Masse (PRISM), Université de Lille, F-59000 Lille, France
- Aïda Ouangraoua
  - Informatics Department, Université de Sherbrooke, Sherbrooke, QC J1K 2R1, Canada
- Michelle S Scott
  - Department of Biochemistry and Functional Genomics, Université de Sherbrooke, 3201 Jean Mignault, Sherbrooke, QC J1E 4K8, Canada
  - Centre de Recherche du Centre Hospitalier Universitaire de Sherbrooke (CRCHUS), Sherbrooke, QC J1H 5N4, Canada
- François-Michel Boisvert
  - Centre de Recherche du Centre Hospitalier Universitaire de Sherbrooke (CRCHUS), Sherbrooke, QC J1H 5N4, Canada
  - Department of Immunology and Cellular Biology, Université de Sherbrooke, Sherbrooke, QC J1E 4K8, Canada
- Marie A Brunet
  - Centre de Recherche du Centre Hospitalier Universitaire de Sherbrooke (CRCHUS), Sherbrooke, QC J1H 5N4, Canada
  - Department of Pediatrics, Medical Genetics Service, Université de Sherbrooke, Sherbrooke, QC J1H 5N4, Canada
- Xavier Roucou
  - Department of Biochemistry and Functional Genomics, Université de Sherbrooke, 3201 Jean Mignault, Sherbrooke, QC J1E 4K8, Canada
  - Centre de Recherche du Centre Hospitalier Universitaire de Sherbrooke (CRCHUS), Sherbrooke, QC J1H 5N4, Canada
18
Kore H, Datta KK, Nagaraj SH, Gowda H. Protein-coding potential of non-canonical open reading frames in human transcriptome. Biochem Biophys Res Commun 2023; 684:149040. [PMID: 37897910 DOI: 10.1016/j.bbrc.2023.09.068] [Citation(s) in RCA: 1] [Impact Index Per Article: 1.0] [Received: 07/13/2023] [Revised: 09/09/2023] [Accepted: 09/23/2023] [Indexed: 10/30/2023]
Abstract
In recent years, proteogenomics and ribosome profiling studies have identified a large number of proteins encoded by noncoding regions in the human genome. They are encoded by small open reading frames (sORFs) in the untranslated regions (UTRs) of mRNAs and in long non-coding RNAs (lncRNAs). These sORF-encoded proteins (SEPs) are often shorter than 150 amino acids and show poor evolutionary conservation. A subset of them has been functionally characterized and shown to play an important role in fundamental biological processes, including cardiac and muscle function, DNA repair, and embryonic development, as well as in various human diseases. How many novel protein-coding regions exist in the human genome, and what fraction of them is functionally important, remains a mystery. In this review, we discuss current progress in unraveling SEPs, the approaches used for their identification, the limitations of these approaches, and the reliability of the resulting identifications. We also discuss functionally characterized SEPs and their involvement in various biological processes and diseases. Lastly, we provide insights into their distinctive features compared to canonical proteins and the challenges associated with annotating them in protein reference databases.
Affiliation(s)
- Hitesh Kore
  - Centre for Genomics and Personalised Health, Queensland University of Technology, Brisbane, Queensland, 4059, Australia; Cancer Precision Medicine Group, QIMR Berghofer Medical Research Institute, 300 Herston Road, Herston, Queensland, 4006, Australia; Faculty of Health, Queensland University of Technology, Brisbane, Queensland, 4059, Australia
- Keshava K Datta
  - Proteomics and Metabolomics Platform, La Trobe University, Melbourne, VIC, 3083, Australia
- Shivashankar H Nagaraj
  - Centre for Genomics and Personalised Health, Queensland University of Technology, Brisbane, Queensland, 4059, Australia; Faculty of Health, Queensland University of Technology, Brisbane, Queensland, 4059, Australia
- Harsha Gowda
  - Centre for Genomics and Personalised Health, Queensland University of Technology, Brisbane, Queensland, 4059, Australia; Cancer Precision Medicine Group, QIMR Berghofer Medical Research Institute, 300 Herston Road, Herston, Queensland, 4006, Australia; Faculty of Health, Queensland University of Technology, Brisbane, Queensland, 4059, Australia; Faculty of Medicine, The University of Queensland, Queensland, 4072, Australia
19
Deng J, Xu W, Jie Y, Chong Y. Subcellular localization and relevant mechanisms of human cancer-related micropeptides. FASEB J 2023; 37:e23270. [PMID: 37994683 DOI: 10.1096/fj.202301019rr] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 05/21/2023] [Revised: 09/17/2023] [Accepted: 10/10/2023] [Indexed: 11/24/2023]
Abstract
Rapid advances in high-quality sequencing and bioinformatics have invalidated the argument that noncoding RNAs (ncRNAs) are junk transcripts that do not encode proteins. Increasing evidence suggests that small open reading frames (sORFs) in ncRNAs can encode micropeptides and polypeptides of up to 100 amino acids in length. Several micropeptides have been characterized and proven to have various functions in human physiology and pathology, particularly in cancer. The present review highlights the latest studies on ncRNA-encoded micropeptides in different cancers and categorizes them based on their subcellular localization, thereby providing a theoretical basis for micropeptide applications in the early diagnosis and prognosis of cancer and as therapeutic targets. However, considering the inherent characteristics of micropeptides and the limitations of current assay technologies, more detailed information is warranted.
Affiliation(s)
- Jing Deng
  - Department of Infectious Diseases, the Third Affiliated Hospital of Sun Yat-sen University, Guangzhou, Guangdong, China
- Wenli Xu
  - Department of Infectious Diseases, the Third Affiliated Hospital of Sun Yat-sen University, Guangzhou, Guangdong, China
- Yusheng Jie
  - Department of Infectious Diseases, the Third Affiliated Hospital of Sun Yat-sen University, Guangzhou, Guangdong, China
- Yutian Chong
  - Department of Infectious Diseases, the Third Affiliated Hospital of Sun Yat-sen University, Guangzhou, Guangdong, China
20
Mohsen JJ, Martel AA, Slavoff SA. Microproteins-Discovery, structure, and function. Proteomics 2023; 23:e2100211. [PMID: 37603371 PMCID: PMC10841188 DOI: 10.1002/pmic.202100211] [Citation(s) in RCA: 5] [Impact Index Per Article: 5.0] [Received: 07/04/2023] [Revised: 08/03/2023] [Accepted: 08/10/2023] [Indexed: 08/22/2023]
Abstract
Advances in proteogenomic technologies have revealed hundreds to thousands of translated small open reading frames (sORFs) that encode microproteins in genomes across evolutionary space. While many microproteins have now been shown to play critical roles in biology and human disease, a majority of recently identified microproteins have little or no experimental evidence regarding their functionality. Computational tools have some limitations for analysis of short, poorly conserved microprotein sequences, so additional approaches are needed to determine the role of each member of this recently discovered polypeptide class. A currently underexplored avenue in the study of microproteins is structure prediction and determination, which delivers a depth of functional information. In this review, we provide a brief overview of microprotein discovery methods, then examine examples of microprotein structures (and, conversely, intrinsic disorder) that have been experimentally determined using crystallography, cryo-electron microscopy, and NMR, which provide insight into their molecular functions and mechanisms. Additionally, we discuss examples of predicted microprotein structures that have provided insight or context regarding their function. Analysis of microprotein structure at the angstrom level, and confirmation of predicted structures, therefore, has potential to identify translated microproteins that are of biological importance and to provide molecular mechanism for their in vivo roles.
Affiliation(s)
- Jessica J. Mohsen
  - Department of Chemistry, Yale University, New Haven, CT, USA
  - Institute of Biomolecular Design and Discovery, Yale University, West Haven, CT, USA
- Alina A. Martel
  - Institute of Biomolecular Design and Discovery, Yale University, West Haven, CT, USA
- Sarah A. Slavoff
  - Department of Chemistry, Yale University, New Haven, CT, USA
  - Institute of Biomolecular Design and Discovery, Yale University, West Haven, CT, USA
  - Department of Molecular Biophysics and Biochemistry, Yale University, New Haven, CT, USA
21
Zhao B, Zhao J, Wang M, Guo Y, Mehmood A, Wang W, Xiong Y, Luo S, Wei DQ, Zhao XQ, Wang Y. Exploring microproteins from various model organisms using the mip-mining database. BMC Genomics 2023; 24:661. [PMID: 37919660 PMCID: PMC10623795 DOI: 10.1186/s12864-023-09735-1] [Citation(s) in RCA: 1] [Impact Index Per Article: 1.0] [Received: 02/11/2023] [Accepted: 10/12/2023] [Indexed: 11/04/2023] Open
Abstract
Microproteins, prevalent across all kingdoms of life, play a crucial role in cell physiology and human health. Although global gene transcription is widely explored and transcriptome data are abundantly available, our understanding of microprotein functions based on such data is still limited. To mitigate this problem, we present a database, Mip-mining (https://weilab.sjtu.edu.cn/mipmining/), underpinned by high-quality RNA-sequencing data and aimed exclusively at analyzing microprotein functions. Mip-mining hosts 336 sets of high-quality transcriptome data from 8626 samples and nine representative living organisms, including microorganisms, plants, animals, and humans. The database focuses on a range of diseases and environmental stress conditions, taking into account chemical, physical, biological, and disease-related stresses. In addition, the platform enables customized analysis of user-supplied data sets with self-determined cutoff values. The practicality of Mip-mining is demonstrated by identifying essential microproteins in different species and revealing the importance of ATP15 in the acetic acid stress tolerance of budding yeast. We believe that Mip-mining will facilitate a greater understanding and application of microproteins in biotechnology. Moreover, it will be beneficial for designing therapeutic strategies under various biological conditions.
Affiliation(s)
- Bowen Zhao
  - State Key Laboratory of Microbial Metabolism, Joint International Research Laboratory of Metabolic & Developmental Sciences, School of Life Sciences and Biotechnology, Shanghai Jiao Tong University, Shanghai, 200240, China
- Jing Zhao
  - State Key Laboratory of Microbial Metabolism, Joint International Research Laboratory of Metabolic & Developmental Sciences, School of Life Sciences and Biotechnology, Shanghai Jiao Tong University, Shanghai, 200240, China
- Muyao Wang
  - State Key Laboratory of Microbial Metabolism, Joint International Research Laboratory of Metabolic & Developmental Sciences, School of Life Sciences and Biotechnology, Shanghai Jiao Tong University, Shanghai, 200240, China
- Yangfan Guo
  - Central Laboratory of Yan'an Hospital Affiliated to Kunming Medical University, Kunming, 650051, China
- Aamir Mehmood
  - State Key Laboratory of Microbial Metabolism, Joint International Research Laboratory of Metabolic & Developmental Sciences, School of Life Sciences and Biotechnology, Shanghai Jiao Tong University, Shanghai, 200240, China
- Weibin Wang
  - State Key Laboratory of Microbial Metabolism, Joint International Research Laboratory of Metabolic & Developmental Sciences, School of Life Sciences and Biotechnology, Shanghai Jiao Tong University, Shanghai, 200240, China
- Yi Xiong
  - State Key Laboratory of Microbial Metabolism, Joint International Research Laboratory of Metabolic & Developmental Sciences, School of Life Sciences and Biotechnology, Shanghai Jiao Tong University, Shanghai, 200240, China
  - Shanghai Artificial Intelligence Laboratory, Shanghai, 200232, China
- Shenggan Luo
  - State Key Laboratory of Microbial Metabolism, Joint International Research Laboratory of Metabolic & Developmental Sciences, School of Life Sciences and Biotechnology, Shanghai Jiao Tong University, Shanghai, 200240, China
- Dong-Qing Wei
  - State Key Laboratory of Microbial Metabolism, Joint International Research Laboratory of Metabolic & Developmental Sciences, School of Life Sciences and Biotechnology, Shanghai Jiao Tong University, Shanghai, 200240, China
  - Zhongjing Research and Industrialization Institute of Chinese Medicine, Zhongguancun Scientific Park, Meixi, Nayang, Henan, 473006, China
  - Peng Cheng Laboratory, Vanke Cloud City Phase I Building 8, Xili Street, Nanshan District, Shenzhen, 518055, Guangdong, China
- Xin-Qing Zhao
  - State Key Laboratory of Microbial Metabolism, Joint International Research Laboratory of Metabolic & Developmental Sciences, School of Life Sciences and Biotechnology, Shanghai Jiao Tong University, Shanghai, 200240, China
- Yanjing Wang
  - State Key Laboratory of Microbial Metabolism, Joint International Research Laboratory of Metabolic & Developmental Sciences, School of Life Sciences and Biotechnology, Shanghai Jiao Tong University, Shanghai, 200240, China
  - Engineering Research Center of Cell & Therapeutic Antibody, School of Pharmacy, Shanghai Jiao Tong University, Shanghai, 200240, China
|
22
|
Tao S, Hou Y, Diao L, Hu Y, Xu W, Xie S, Xiao Z. Long noncoding RNA study: Genome-wide approaches. Genes Dis 2023; 10:2491-2510. [PMID: 37554208 PMCID: PMC10404890 DOI: 10.1016/j.gendis.2022.10.024] [Citation(s) in RCA: 2] [Impact Index Per Article: 2.0] [Received: 06/11/2022] [Revised: 10/09/2022] [Accepted: 10/23/2022] [Indexed: 11/30/2022] Open
Abstract
Long noncoding RNAs (lncRNAs) have been confirmed to play a crucial role in various biological processes across several species. Though many efforts have been devoted to the expansion of the lncRNAs landscape, much about lncRNAs is still unknown due to their great complexity. The development of high-throughput technologies and the constantly improved bioinformatic methods have resulted in a rapid expansion of lncRNA research and relevant databases. In this review, we introduced genome-wide research of lncRNAs in three parts: (i) novel lncRNA identification by high-throughput sequencing and computational pipelines; (ii) functional characterization of lncRNAs by expression atlas profiling, genome-scale screening, and the research of cancer-related lncRNAs; (iii) mechanism research by large-scale experimental technologies and computational analysis. Besides, primary experimental methods and bioinformatic pipelines related to these three parts are summarized. This review aimed to provide a comprehensive and systemic overview of lncRNA genome-wide research strategies and indicate a genome-wide lncRNA research system.
Affiliation(s)
- Shuang Tao
  - The Biotherapy Center, The Third Affiliated Hospital of Sun Yat-sen University, Guangzhou, Guangdong 510630, China
- Yarui Hou
  - The Biotherapy Center, The Third Affiliated Hospital of Sun Yat-sen University, Guangzhou, Guangdong 510630, China
- Liting Diao
  - The Biotherapy Center, The Third Affiliated Hospital of Sun Yat-sen University, Guangzhou, Guangdong 510630, China
- Yanxia Hu
  - The Biotherapy Center, The Third Affiliated Hospital of Sun Yat-sen University, Guangzhou, Guangdong 510630, China
- Wanyi Xu
  - The Biotherapy Center, The Third Affiliated Hospital of Sun Yat-sen University, Guangzhou, Guangdong 510630, China
- Shujuan Xie
  - The Biotherapy Center, The Third Affiliated Hospital of Sun Yat-sen University, Guangzhou, Guangdong 510630, China
  - Institute of Vaccine, The Third Affiliated Hospital of Sun Yat-sen University, Guangzhou, Guangdong 510630, China
- Zhendong Xiao
  - The Biotherapy Center, The Third Affiliated Hospital of Sun Yat-sen University, Guangzhou, Guangdong 510630, China
|
23
|
Garcia-Del Rio DF, Fournier I, Cardon T, Salzet M. Protocol to identify human subcellular alternative protein interactions using cross-linking mass spectrometry. STAR Protoc 2023; 4:102380. [PMID: 37384523 PMCID: PMC10511867 DOI: 10.1016/j.xpro.2023.102380] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 04/07/2023] [Revised: 05/04/2023] [Accepted: 05/24/2023] [Indexed: 07/01/2023] Open
Abstract
Since the start of mass-spectrometry-based proteomics, proteins from non-referenced open reading frames, or alternative proteins (AltProts), have been overlooked. Here, we present a protocol to identify human subcellular AltProts and decipher some of their interactions using cross-linking mass spectrometry. We describe steps for cell culture, in cellulo cross-linking, subcellular extraction, and sequential digestion. We then detail both liquid chromatography-tandem mass spectrometry and cross-link data analyses. The implementation of a single workflow allows the non-targeted identification of signaling pathways involving AltProts. For complete details on the use and execution of this protocol, please refer to Garcia-del Rio et al. [1].
Affiliation(s)
- Diego Fernando Garcia-Del Rio
  - Université de Lille, Univ. Lille, CHU Lille, Inserm U1192 - Protéomique Réponse Inflammatoire Spectrométrie de Masse - PRISM, F-59000 Lille, France; VIB Center for Medical Biotechnology, VIB, Ghent 9052, Belgium; Department of Biomolecular Medicine, Ghent University, Ghent 9052, Belgium
- Isabelle Fournier
  - Université de Lille, Univ. Lille, CHU Lille, Inserm U1192 - Protéomique Réponse Inflammatoire Spectrométrie de Masse - PRISM, F-59000 Lille, France
- Tristan Cardon
  - Université de Lille, Univ. Lille, CHU Lille, Inserm U1192 - Protéomique Réponse Inflammatoire Spectrométrie de Masse - PRISM, F-59000 Lille, France
- Michel Salzet
  - Université de Lille, Univ. Lille, CHU Lille, Inserm U1192 - Protéomique Réponse Inflammatoire Spectrométrie de Masse - PRISM, F-59000 Lille, France
|
24
|
Zhang S, Guo Y, Fidelito G, Robinson DR, Liang C, Lim R, Bichler Z, Guo R, Wu G, Xu H, Zhou QD, Singh BK, Yen P, Kappei D, Stroud DA, Ho L. LINC00116-encoded microprotein mitoregulin regulates fatty acid metabolism at the mitochondrial outer membrane. iScience 2023; 26:107558. [PMID: 37664623 PMCID: PMC10469944 DOI: 10.1016/j.isci.2023.107558] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 12/05/2022] [Revised: 07/04/2023] [Accepted: 08/02/2023] [Indexed: 09/05/2023] Open
Abstract
LINC00116 encodes a microprotein first identified as Mitoregulin (MTLN), which was reported to localize to the inner membrane of mitochondria and to regulate fatty acid oxidation and oxidative phosphorylation. These initial discoveries were followed by reports with differing findings about its molecular functions and submitochondrial localization. To clarify the apparent discrepancies, we developed multiple orthogonal methods for determining the localization of MTLN, including split GFP-based reporters that enable efficient and reliable topology analyses for microproteins. These methods unequivocally demonstrate that MTLN primarily localizes to the outer membrane of mitochondria, where it interacts with enzymes of fatty acid metabolism including CPT1B and CYB5B. Loss of MTLN causes the accumulation of very long-chain fatty acids (VLCFAs), especially docosahexaenoic acid (DHA). Intriguingly, loss of MTLN protects mice against Western diet/fructose-induced insulin resistance, suggesting a protective effect of VLCFAs in this context. MTLN thus serves as an attractive target to control the catabolism of VLCFAs.
Affiliation(s)
- Shan Zhang
  - Department of Biochemistry, Department of Cardiology of The First Affiliated Hospital, Zhejiang University School of Medicine, Hangzhou 310058, China
  - Cardiovascular and Metabolic Diseases, Duke-NUS Medical School, Singapore 169857, Singapore
- Yabo Guo
  - Department of Biochemistry, Department of Cardiology of The First Affiliated Hospital, Zhejiang University School of Medicine, Hangzhou 310058, China
- Gio Fidelito
  - Cardiovascular and Metabolic Diseases, Duke-NUS Medical School, Singapore 169857, Singapore
- David R.L. Robinson
  - Department of Biochemistry and Pharmacology, The Bio21 Molecular Science & Biotechnology Institute, University of Melbourne, Melbourne, VIC 3010, Australia
- Chao Liang
  - Cardiovascular and Metabolic Diseases, Duke-NUS Medical School, Singapore 169857, Singapore
- Radiance Lim
  - Cardiovascular and Metabolic Diseases, Duke-NUS Medical School, Singapore 169857, Singapore
- Zoë Bichler
  - Behavioral Neuroscience Laboratory, National Neuroscience Institute, Singapore 308433, Singapore
- Ruiyang Guo
  - Department of Biochemistry, Department of Cardiology of The First Affiliated Hospital, Zhejiang University School of Medicine, Hangzhou 310058, China
- Gaoqi Wu
  - Institute of Immunology, Department of Surgical Oncology of The First Affiliated Hospital, Zhejiang University School of Medicine, Hangzhou 310058, China
- He Xu
  - Institute of Immunology, Department of Surgical Oncology of The First Affiliated Hospital, Zhejiang University School of Medicine, Hangzhou 310058, China
- Quan D. Zhou
  - Institute of Immunology, Department of Surgical Oncology of The First Affiliated Hospital, Zhejiang University School of Medicine, Hangzhou 310058, China
- Brijesh K. Singh
  - Cardiovascular and Metabolic Diseases, Duke-NUS Medical School, Singapore 169857, Singapore
- Paul Yen
  - Cardiovascular and Metabolic Diseases, Duke-NUS Medical School, Singapore 169857, Singapore
- Dennis Kappei
  - Department of Biochemistry, Yong Loo Lin School of Medicine, National University of Singapore, Singapore 117596, Singapore
  - Cancer Science Institute of Singapore, National University of Singapore, Singapore 117599, Singapore
  - NUS Center for Cancer Research, Yong Loo Lin School of Medicine, National University of Singapore, Singapore 117596, Singapore
- David A. Stroud
  - Department of Biochemistry and Pharmacology, The Bio21 Molecular Science & Biotechnology Institute, University of Melbourne, Melbourne, VIC 3010, Australia
  - Murdoch Children’s Research Institute, Royal Children’s Hospital, Melbourne, VIC 3010, Australia
  - Victorian Clinical Genetics Services, Murdoch Children’s Research Institute, Melbourne, VIC 3010, Australia
- Lena Ho
  - Cardiovascular and Metabolic Diseases, Duke-NUS Medical School, Singapore 169857, Singapore
|
25
|
Prensner JR, Abelin JG, Kok LW, Clauser KR, Mudge JM, Ruiz-Orera J, Bassani-Sternberg M, Moritz RL, Deutsch EW, van Heesch S. What Can Ribo-Seq, Immunopeptidomics, and Proteomics Tell Us About the Noncanonical Proteome? Mol Cell Proteomics 2023; 22:100631. [PMID: 37572790 PMCID: PMC10506109 DOI: 10.1016/j.mcpro.2023.100631] [Citation(s) in RCA: 13] [Impact Index Per Article: 13.0] [Received: 05/14/2023] [Revised: 07/21/2023] [Accepted: 08/08/2023] [Indexed: 08/14/2023] Open
Abstract
Ribosome profiling (Ribo-Seq) has proven transformative for our understanding of the human genome and proteome by illuminating thousands of noncanonical sites of ribosome translation outside the currently annotated coding sequences (CDSs). A conservative estimate suggests that at least 7000 noncanonical ORFs are translated, which, at first glance, has the potential to expand the number of human protein CDSs by 30%, from ∼19,500 annotated CDSs to over 26,000 annotated CDSs. Yet, additional scrutiny of these ORFs has raised numerous questions about what fraction of them truly produce a protein product and what fraction of those can be understood as proteins according to conventional understanding of the term. Adding further complication is the fact that published estimates of noncanonical ORFs vary widely by around 30-fold, from several thousand to several hundred thousand. The summation of this research has left the genomics and proteomics communities both excited by the prospect of new coding regions in the human genome but searching for guidance on how to proceed. Here, we discuss the current state of noncanonical ORF research, databases, and interpretation, focusing on how to assess whether a given ORF can be said to be "protein coding."
Affiliation(s)
- John R Prensner
  - Division of Pediatric Hematology/Oncology, Department of Pediatrics, University of Michigan Medical School, Ann Arbor, Michigan, USA; Department of Biological Chemistry, University of Michigan Medical School, Ann Arbor, Michigan, USA
- Leron W Kok
  - Princess Máxima Center for Pediatric Oncology, Utrecht, The Netherlands
- Karl R Clauser
  - Broad Institute of MIT and Harvard, Cambridge, Massachusetts, USA
- Jonathan M Mudge
  - European Molecular Biology Laboratory, European Bioinformatics Institute, Wellcome Genome Campus, Cambridge, UK
- Jorge Ruiz-Orera
  - Cardiovascular and Metabolic Sciences, Max Delbrück Center for Molecular Medicine in the Helmholtz Association (MDC), Berlin, Germany
- Michal Bassani-Sternberg
  - Ludwig Institute for Cancer Research, Agora Center Bugnon 25A, University of Lausanne, Lausanne, Switzerland; Department of Oncology, Centre Hospitalier Universitaire Vaudois (CHUV), Lausanne, Switzerland; Agora Cancer Research Centre, Lausanne, Switzerland
- Robert L Moritz
  - Institute for Systems Biology (ISB), Seattle, Washington, USA
- Eric W Deutsch
  - Institute for Systems Biology (ISB), Seattle, Washington, USA
|
26
|
Chen Y, Cao X, Loh KH, Slavoff SA. Chemical labeling and proteomics for characterization of unannotated small and alternative open reading frame-encoded polypeptides. Biochem Soc Trans 2023; 51:1071-1082. [PMID: 37171061 PMCID: PMC10317152 DOI: 10.1042/bst20221074] [Citation(s) in RCA: 5] [Impact Index Per Article: 5.0] [Received: 01/03/2023] [Revised: 03/27/2023] [Accepted: 04/13/2023] [Indexed: 05/13/2023]
Abstract
Thousands of unannotated small and alternative open reading frames (smORFs and alt-ORFs, respectively) have recently been revealed in mammalian genomes. While hundreds of mammalian smORF- and alt-ORF-encoded proteins (SEPs and alt-proteins, respectively) affect cell proliferation, the overwhelming majority of smORFs and alt-ORFs remain uncharacterized at the molecular level. Complicating the task of identifying the biological roles of smORFs and alt-ORFs, the SEPs and alt-proteins that they encode exhibit limited sequence homology to protein domains of known function. Experimental techniques for the functionalization of these gene classes are therefore required. Approaches combining chemical labeling and quantitative proteomics have greatly advanced our ability to identify and characterize functional SEPs and alt-proteins in high throughput. In this review, we briefly describe the principles of proteomic discovery of SEPs and alt-proteins, then summarize how these technologies interface with chemical labeling for identification of SEPs and alt-proteins with specific properties, as well as in defining the interactome of SEPs and alt-proteins.
Affiliation(s)
- Yanran Chen
  - Department of Chemistry, Yale University, New Haven, CT, U.S.A
  - Institute for Biomolecular Design and Discovery, Yale University, West Haven, CT, U.S.A
- Xiongwen Cao
  - Department of Chemistry, Yale University, New Haven, CT, U.S.A
  - Institute for Biomolecular Design and Discovery, Yale University, West Haven, CT, U.S.A
  - Department of Comparative Medicine, Yale University School of Medicine, New Haven, CT, U.S.A
  - Shanghai Key Laboratory of Regulatory Biology, Institute of Biomedical Sciences and School of Life Sciences, East China Normal University, Shanghai, China
- Ken H. Loh
  - Institute for Biomolecular Design and Discovery, Yale University, West Haven, CT, U.S.A
  - Department of Comparative Medicine, Yale University School of Medicine, New Haven, CT, U.S.A
- Sarah A. Slavoff
  - Department of Chemistry, Yale University, New Haven, CT, U.S.A
  - Institute for Biomolecular Design and Discovery, Yale University, West Haven, CT, U.S.A
  - Department of Molecular Biophysics and Biochemistry, Yale University, New Haven, CT, U.S.A
|
27
|
Dong X, Zhang K, Xun C, Chu T, Liang S, Zeng Y, Liu Z. Small Open Reading Frame-Encoded Micro-Peptides: An Emerging Protein World. Int J Mol Sci 2023; 24:10562. [PMID: 37445739 DOI: 10.3390/ijms241310562] [Citation(s) in RCA: 1] [Impact Index Per Article: 1.0] [Received: 05/13/2023] [Revised: 06/20/2023] [Accepted: 06/21/2023] [Indexed: 07/15/2023] Open
Abstract
Small open reading frames (sORFs) are often overlooked features in genomes. In the past, they were labeled as noncoding or "transcriptional noise". However, accumulating evidence from recent years suggests that sORFs may be transcribed and translated to produce sORF-encoded polypeptides (SEPs) with fewer than 100 amino acids. The vigorous development of computational algorithms, ribosome profiling, and peptidomics has facilitated the prediction and identification of many new SEPs. These SEPs were revealed to be involved in a wide range of basic biological processes, such as gene expression regulation, embryonic development, cellular metabolism, inflammation, and even carcinogenesis. To effectively understand the potential biological functions of SEPs, we discuss the history and development of the newly emerging research on sORFs and SEPs. In particular, we review a range of recently developed bioinformatics tools for identifying, predicting, and validating SEPs as well as a variety of biochemical experiments for characterizing SEP functions. Lastly, this review underlines the challenges and future directions in identifying and validating sORFs and their encoded micropeptides, providing a significant reference for upcoming research on sORF-encoded peptides.
Affiliation(s)
- Xiaoping Dong
  - National & Local Joint Engineering Laboratory of Animal Peptide Drug Development, College of Life Sciences, Hunan Normal University, Changsha 410081, China
  - Peptide and Small Molecule Drug R&D Platform, Furong Laboratory, Hunan Normal University, Changsha 410081, China
- Kun Zhang
  - The State Key Laboratory of Developmental Biology of Freshwater Fish, College of Life Science, Hunan Normal University, Changsha 410081, China
- Chengfeng Xun
  - National & Local Joint Engineering Laboratory of Animal Peptide Drug Development, College of Life Sciences, Hunan Normal University, Changsha 410081, China
  - Peptide and Small Molecule Drug R&D Platform, Furong Laboratory, Hunan Normal University, Changsha 410081, China
- Tianqi Chu
  - National & Local Joint Engineering Laboratory of Animal Peptide Drug Development, College of Life Sciences, Hunan Normal University, Changsha 410081, China
  - Peptide and Small Molecule Drug R&D Platform, Furong Laboratory, Hunan Normal University, Changsha 410081, China
- Songping Liang
  - National & Local Joint Engineering Laboratory of Animal Peptide Drug Development, College of Life Sciences, Hunan Normal University, Changsha 410081, China
  - Peptide and Small Molecule Drug R&D Platform, Furong Laboratory, Hunan Normal University, Changsha 410081, China
- Yong Zeng
  - National & Local Joint Engineering Laboratory of Animal Peptide Drug Development, College of Life Sciences, Hunan Normal University, Changsha 410081, China
  - Peptide and Small Molecule Drug R&D Platform, Furong Laboratory, Hunan Normal University, Changsha 410081, China
  - The State Key Laboratory of Developmental Biology of Freshwater Fish, College of Life Science, Hunan Normal University, Changsha 410081, China
- Zhonghua Liu
  - National & Local Joint Engineering Laboratory of Animal Peptide Drug Development, College of Life Sciences, Hunan Normal University, Changsha 410081, China
  - Peptide and Small Molecule Drug R&D Platform, Furong Laboratory, Hunan Normal University, Changsha 410081, China
|
28
|
Hassel KR, Brito-Estrada O, Makarewich CA. Microproteins: Overlooked regulators of physiology and disease. iScience 2023; 26:106781. [PMID: 37213226 PMCID: PMC10199267 DOI: 10.1016/j.isci.2023.106781] [Citation(s) in RCA: 9] [Impact Index Per Article: 9.0] [Indexed: 05/23/2023] Open
Abstract
Ongoing efforts to generate a complete and accurate annotation of the genome have revealed a significant blind spot for small proteins (<100 amino acids) originating from short open reading frames (sORFs). The recent discovery of numerous sORF-encoded proteins, termed microproteins, that play diverse roles in critical cellular processes has ignited the field of microprotein biology. Large-scale efforts are currently underway to identify sORF-encoded microproteins in diverse cell-types and tissues and specialized methods and tools have been developed to aid in their discovery, validation, and functional characterization. Microproteins that have been identified thus far play important roles in fundamental processes including ion transport, oxidative phosphorylation, and stress signaling. In this review, we discuss the optimized tools available for microprotein discovery and validation, summarize the biological functions of numerous microproteins, outline the promise for developing microproteins as therapeutic targets, and look forward to the future of the field of microprotein biology.
Affiliation(s)
- Keira R. Hassel
  - The Heart Institute, Division of Molecular Cardiovascular Biology, Cincinnati Children’s Hospital Medical Center, Cincinnati, OH 45229, USA
  - University of Cincinnati College of Medicine, Cincinnati, OH 45229, USA
- Omar Brito-Estrada
  - The Heart Institute, Division of Molecular Cardiovascular Biology, Cincinnati Children’s Hospital Medical Center, Cincinnati, OH 45229, USA
  - University of Cincinnati College of Medicine, Cincinnati, OH 45229, USA
- Catherine A. Makarewich
  - The Heart Institute, Division of Molecular Cardiovascular Biology, Cincinnati Children’s Hospital Medical Center, Cincinnati, OH 45229, USA
  - Department of Pediatrics, University of Cincinnati College of Medicine, Cincinnati, OH 45229, USA
|
29
|
Leblanc S, Brunet MA, Jacques JF, Lekehal AM, Duclos A, Tremblay A, Bruggeman-Gascon A, Samandi S, Brunelle M, Cohen AA, Scott MS, Roucou X. Newfound Coding Potential of Transcripts Unveils Missing Members of Human Protein Communities. Genomics Proteomics Bioinformatics 2023; 21:515-534. [PMID: 36183975 PMCID: PMC10787177 DOI: 10.1016/j.gpb.2022.09.008] [Citation(s) in RCA: 2] [Impact Index Per Article: 2.0] [Received: 07/01/2021] [Revised: 08/10/2022] [Accepted: 09/26/2022] [Indexed: 06/16/2023]
Abstract
Recent proteogenomic approaches have led to the discovery that regions of the transcriptome previously annotated as non-coding regions [i.e., untranslated regions (UTRs), open reading frames overlapping annotated coding sequences in a different reading frame, and non-coding RNAs] frequently encode proteins, termed alternative proteins (altProts). This suggests that previously identified protein-protein interaction (PPI) networks are partially incomplete because altProts are not present in conventional protein databases. Here, we used the proteogenomic resource OpenProt and a combined spectrum- and peptide-centric analysis for the re-analysis of a high-throughput human network proteomics dataset, thereby revealing the presence of 261 altProts in the network. We found 19 genes encoding both an annotated (reference) and an alternative protein interacting with each other. Of the 117 altProts encoded by pseudogenes, 38 are direct interactors of reference proteins encoded by their respective parental genes. Finally, we experimentally validate several interactions involving altProts. These data improve the blueprints of the human PPI network and suggest functional roles for hundreds of altProts.
Affiliation(s)
- Sébastien Leblanc
  - Department of Biochemistry and Functional Genomics, Université de Sherbrooke, Sherbrooke, QC J1E 4K8, Canada; PROTEO, Quebec Network for Research on Protein Function, Structure, and Engineering, Quebec City, QC G1V 0A6, Canada
- Marie A Brunet
  - Department of Biochemistry and Functional Genomics, Université de Sherbrooke, Sherbrooke, QC J1E 4K8, Canada; PROTEO, Quebec Network for Research on Protein Function, Structure, and Engineering, Quebec City, QC G1V 0A6, Canada
- Jean-François Jacques
  - Department of Biochemistry and Functional Genomics, Université de Sherbrooke, Sherbrooke, QC J1E 4K8, Canada; PROTEO, Quebec Network for Research on Protein Function, Structure, and Engineering, Quebec City, QC G1V 0A6, Canada
- Amina M Lekehal
  - Department of Biochemistry and Functional Genomics, Université de Sherbrooke, Sherbrooke, QC J1E 4K8, Canada; PROTEO, Quebec Network for Research on Protein Function, Structure, and Engineering, Quebec City, QC G1V 0A6, Canada
- Andréa Duclos
  - Department of Biochemistry and Functional Genomics, Université de Sherbrooke, Sherbrooke, QC J1E 4K8, Canada
- Alexia Tremblay
  - Department of Biochemistry and Functional Genomics, Université de Sherbrooke, Sherbrooke, QC J1E 4K8, Canada
- Alexis Bruggeman-Gascon
  - Department of Biochemistry and Functional Genomics, Université de Sherbrooke, Sherbrooke, QC J1E 4K8, Canada
- Sondos Samandi
  - Department of Biochemistry and Functional Genomics, Université de Sherbrooke, Sherbrooke, QC J1E 4K8, Canada; PROTEO, Quebec Network for Research on Protein Function, Structure, and Engineering, Quebec City, QC G1V 0A6, Canada
- Mylène Brunelle
  - Department of Biochemistry and Functional Genomics, Université de Sherbrooke, Sherbrooke, QC J1E 4K8, Canada; PROTEO, Quebec Network for Research on Protein Function, Structure, and Engineering, Quebec City, QC G1V 0A6, Canada
- Alan A Cohen
  - Department of Family Medicine, Université de Sherbrooke, Sherbrooke, QC J1H 5N4, Canada
- Michelle S Scott
  - Department of Biochemistry and Functional Genomics, Université de Sherbrooke, Sherbrooke, QC J1E 4K8, Canada
- Xavier Roucou
  - Department of Biochemistry and Functional Genomics, Université de Sherbrooke, Sherbrooke, QC J1E 4K8, Canada; PROTEO, Quebec Network for Research on Protein Function, Structure, and Engineering, Quebec City, QC G1V 0A6, Canada
|
30
|
Kienzle L, Bettinazzi S, Choquette T, Brunet M, Khorami HH, Jacques JF, Moreau M, Roucou X, Landry CR, Angers A, Breton S. A small protein coded within the mitochondrial canonical gene nd4 regulates mitochondrial bioenergetics. BMC Biol 2023; 21:111. [PMID: 37198654 DOI: 10.1186/s12915-023-01609-y] [Citation(s) in RCA: 8] [Impact Index Per Article: 8.0] [Received: 09/17/2022] [Accepted: 05/03/2023] [Indexed: 05/19/2023] Open
Abstract
BACKGROUND: Mitochondria have a central role in cellular functions, aging, and certain diseases. They possess their own genome, a vestige of their bacterial ancestor. Over the course of evolution, most of the genes of the ancestor have been lost or transferred to the nucleus. In humans, the mtDNA is a very small circular molecule with a functional repertoire limited to only 37 genes. Its extremely compact nature, with genes arranged one after the other and separated by short non-coding regions, suggests that there is little room for evolutionary novelties. This is radically different from bacterial genomes, which are also circular but much larger, and in which we can find genes inside other genes. These sequences, different from the reference coding sequences, are called alternative open reading frames or altORFs, and they are involved in key biological functions. However, whether altORFs exist in mitochondrial protein-coding genes or elsewhere in the human mitogenome has not been fully addressed. RESULTS: We found a downstream alternative ATG initiation codon in the +3 reading frame of the human mitochondrial nd4 gene. This newly characterized altORF encodes a 99-amino-acid-long polypeptide, MTALTND4, which is conserved in primates. Our custom antibody, but not the pre-immune serum, was able to immunoprecipitate MTALTND4 from HeLa cell lysates, confirming the existence of an endogenous MTALTND4 peptide. The protein is localized in mitochondria and cytoplasm and is also found in the plasma, and it impacts cell and mitochondrial physiology. CONCLUSIONS: Many human mitochondrial translated ORFs might have so far gone unnoticed. By ignoring mtaltORFs, we have underestimated the coding potential of the mitogenome. Alternative mitochondrial peptides such as MTALTND4 may offer a new framework for the investigation of mitochondrial functions and diseases.
Affiliation(s)
- Laura Kienzle
- Département de sciences biologiques, Université de Montréal, Montréal, Canada
| | - Stefano Bettinazzi
- Département de sciences biologiques, Université de Montréal, Montréal, Canada
| | - Thierry Choquette
- Département de sciences biologiques, Université de Montréal, Montréal, Canada
| | - Marie Brunet
- Service de génétique médicale, Département de pédiatrie, Université de Sherbrooke, Sherbrooke, Canada
- Centre de recherche du Centre hospitalier universitaire de Sherbrooke (CRCHUS), Sherbrooke, Canada
| | | | - Jean-François Jacques
- Département de biochimie et génomique fonctionnelle, Université de Sherbrooke, Sherbrooke, Canada
| | - Mathilde Moreau
- Département de biochimie et génomique fonctionnelle, Université de Sherbrooke, Sherbrooke, Canada
| | - Xavier Roucou
- Centre de recherche du Centre hospitalier universitaire de Sherbrooke (CRCHUS), Sherbrooke, Canada
- Département de biochimie et génomique fonctionnelle, Université de Sherbrooke, Sherbrooke, Canada
| | - Christian R Landry
- Département de biochimie, de microbiologie et de bio-informatique, Faculté des sciences et de génie, Université Laval, Québec, Canada
- Institut de biologie intégrative et des systèmes, Université Laval, Québec, Canada
- PROTEO, Le regroupement québécois de recherche sur la fonction, l'ingénierie et les applications des protéines, Université Laval, Québec, Canada
- Centre de recherche sur les données massives, Université Laval, Québec, Canada
- Département de biologie, Faculté des sciences et de génie, Université Laval, Québec, Canada
| | - Annie Angers
- Département de sciences biologiques, Université de Montréal, Montréal, Canada
| | - Sophie Breton
- Département de sciences biologiques, Université de Montréal, Montréal, Canada
| |
|
31
|
Prensner JR, Abelin JG, Kok LW, Clauser KR, Mudge JM, Ruiz-Orera J, Bassani-Sternberg M, Deutsch EW, van Heesch S. What can Ribo-seq and proteomics tell us about the non-canonical proteome? bioRxiv 2023:2023.05.16.541049. [PMID: 37292611 PMCID: PMC10245706 DOI: 10.1101/2023.05.16.541049]
Abstract
Ribosome profiling (Ribo-seq) has proven transformative for our understanding of the human genome and proteome by illuminating thousands of non-canonical sites of ribosome translation outside of the currently annotated coding sequences (CDSs). A conservative estimate suggests that at least 7,000 non-canonical open reading frames (ORFs) are translated, which, at first glance, has the potential to expand the number of human protein-coding sequences by 30%, from ∼19,500 annotated CDSs to over 26,000. Yet, additional scrutiny of these ORFs has raised numerous questions about what fraction of them truly produce a protein product and what fraction of those can be understood as proteins according to conventional understanding of the term. Adding further complication is the fact that published estimates of non-canonical ORFs vary widely by around 30-fold, from several thousand to several hundred thousand. The summation of this research has left the genomics and proteomics communities both excited by the prospect of new coding regions in the human genome, but searching for guidance on how to proceed. Here, we discuss the current state of non-canonical ORF research, databases, and interpretation, focusing on how to assess whether a given ORF can be said to be "protein-coding".
In brief: The human genome encodes thousands of non-canonical open reading frames (ORFs) in addition to protein-coding genes. As a nascent field, many questions remain regarding non-canonical ORFs. How many exist? Do they encode proteins? What level of evidence is needed for their verification? Central to these debates has been the advent of ribosome profiling (Ribo-seq) as a method to discern genome-wide ribosome occupancy, and immunopeptidomics as a method to detect peptides that are processed and presented by MHC molecules and not observed in traditional proteomics experiments. This article provides a synthesis of the current state of non-canonical ORF research and proposes standards for their future investigation and reporting.
Highlights:
- Combined use of Ribo-seq and proteomics-based methods enables optimal confidence in detecting non-canonical ORFs and their protein products.
- Ribo-seq can provide more sensitive detection of non-canonical ORFs, but data quality and analytical pipelines will impact results.
- Non-canonical ORF catalogs are diverse and span both high-stringency and low-stringency ORF nominations.
- A framework for standardized non-canonical ORF evidence will advance the research field.
Affiliation(s)
- John R. Prensner
- Department of Pediatrics, Division of Pediatric Hematology/Oncology, University of Michigan Medical School, Ann Arbor, MI 48109, USA
| | | | - Leron W. Kok
- Princess Máxima Center for Pediatric Oncology, Heidelberglaan 25, 3584 CS, Utrecht, the Netherlands
| | - Karl R. Clauser
- Broad Institute of MIT and Harvard, Cambridge, MA, 02142, USA
| | - Jonathan M. Mudge
- European Molecular Biology Laboratory, European Bioinformatics Institute, Wellcome Genome Campus, Hinxton, Cambridge CB10 1SD, UK
| | - Jorge Ruiz-Orera
- Cardiovascular and Metabolic Sciences, Max Delbrück Center for Molecular Medicine in the Helmholtz Association (MDC), 13125 Berlin, Germany
| | - Michal Bassani-Sternberg
- Ludwig Institute for Cancer Research, University of Lausanne, Agora Center Bugnon 25A, 1005 Lausanne, Switzerland
- Department of Oncology, Centre hospitalier universitaire vaudois (CHUV), Rue du Bugnon 46, 1005 Lausanne, Switzerland
- Agora Cancer Research Centre, 1011 Lausanne, Switzerland
| | - Eric W. Deutsch
- Institute for Systems Biology (ISB), Seattle, Washington 98109, USA
| | - Sebastiaan van Heesch
- Princess Máxima Center for Pediatric Oncology, Heidelberglaan 25, 3584 CS, Utrecht, the Netherlands
| |
|
32
|
Prakash A, García-Seisdedos D, Wang S, Kundu DJ, Collins A, George N, Moreno P, Papatheodorou I, Jones AR, Vizcaíno JA. Integrated View of Baseline Protein Expression in Human Tissues. J Proteome Res 2023; 22:729-742. [PMID: 36577097 PMCID: PMC9990129 DOI: 10.1021/acs.jproteome.2c00406]
Abstract
The availability of proteomics datasets in the public domain, and in the PRIDE database, in particular, has increased dramatically in recent years. This unprecedented large-scale availability of data provides an opportunity for combined analyses of datasets to get organism-wide protein abundance data in a consistent manner. We have reanalyzed 24 public proteomics datasets from healthy human individuals to assess baseline protein abundance in 31 organs. We defined tissue as a distinct functional or structural region within an organ. Overall, the aggregated dataset contains 67 healthy tissues, corresponding to 3,119 mass spectrometry runs covering 498 samples from 489 individuals. We compared protein abundances between different organs and studied the distribution of proteins across these organs. We also compared the results with data generated in analogous studies. Additionally, we performed gene ontology and pathway-enrichment analyses to identify organ-specific enriched biological processes and pathways. As a key point, we have integrated the protein abundance results into the resource Expression Atlas, where they can be accessed and visualized either individually or together with gene expression data coming from transcriptomics datasets. We believe this is a good mechanism to make proteomics data more accessible for life scientists.
Affiliation(s)
- Ananth Prakash
- European Molecular Biology Laboratory - European Bioinformatics Institute (EMBL-EBI), Wellcome Genome Campus, Hinxton, Cambridge CB10 1SD, United Kingdom
- Open Targets, Wellcome Genome Campus, Hinxton, Cambridge CB10 1SD, United Kingdom
| | - David García-Seisdedos
- European Molecular Biology Laboratory - European Bioinformatics Institute (EMBL-EBI), Wellcome Genome Campus, Hinxton, Cambridge CB10 1SD, United Kingdom
| | - Shengbo Wang
- European Molecular Biology Laboratory - European Bioinformatics Institute (EMBL-EBI), Wellcome Genome Campus, Hinxton, Cambridge CB10 1SD, United Kingdom
| | - Deepti Jaiswal Kundu
- European Molecular Biology Laboratory - European Bioinformatics Institute (EMBL-EBI), Wellcome Genome Campus, Hinxton, Cambridge CB10 1SD, United Kingdom
| | - Andrew Collins
- Institute of Systems, Molecular and Integrative Biology, University of Liverpool, Liverpool L69 7ZB, United Kingdom
| | - Nancy George
- European Molecular Biology Laboratory - European Bioinformatics Institute (EMBL-EBI), Wellcome Genome Campus, Hinxton, Cambridge CB10 1SD, United Kingdom
| | - Pablo Moreno
- European Molecular Biology Laboratory - European Bioinformatics Institute (EMBL-EBI), Wellcome Genome Campus, Hinxton, Cambridge CB10 1SD, United Kingdom
| | - Irene Papatheodorou
- European Molecular Biology Laboratory - European Bioinformatics Institute (EMBL-EBI), Wellcome Genome Campus, Hinxton, Cambridge CB10 1SD, United Kingdom
- Open Targets, Wellcome Genome Campus, Hinxton, Cambridge CB10 1SD, United Kingdom
| | - Andrew R Jones
- Institute of Systems, Molecular and Integrative Biology, University of Liverpool, Liverpool L69 7ZB, United Kingdom
| | - Juan Antonio Vizcaíno
- European Molecular Biology Laboratory - European Bioinformatics Institute (EMBL-EBI), Wellcome Genome Campus, Hinxton, Cambridge CB10 1SD, United Kingdom
- Open Targets, Wellcome Genome Campus, Hinxton, Cambridge CB10 1SD, United Kingdom
| |
|
33
|
Manuel JM, Guilloy N, Khatir I, Roucou X, Laurent B. Re-evaluating the impact of alternative RNA splicing on proteomic diversity. Front Genet 2023; 14:1089053. [PMID: 36845399 PMCID: PMC9947481 DOI: 10.3389/fgene.2023.1089053] [Received: 11/03/2022] [Accepted: 01/23/2023]
Abstract
Alternative splicing (AS) constitutes a mechanism by which protein-coding genes and long non-coding RNA (lncRNA) genes produce more than a single mature transcript. From plants to humans, AS is a powerful process that increases transcriptome complexity. Importantly, splice variants produced by AS can potentially encode distinct protein isoforms that lose or gain specific domains and, hence, differ in their functional properties. Advances in proteomics have shown that the proteome is indeed diverse due to the presence of numerous protein isoforms. Over the past decades, with the help of advanced high-throughput technologies, numerous alternatively spliced transcripts have been identified. However, the low detection rate of protein isoforms in proteomic studies has raised questions about whether AS contributes to proteomic diversity and about how many AS events are really functional. We propose here to assess and discuss the impact of AS on proteomic complexity in the light of technological progress, updated genome annotation, and current scientific knowledge.
Affiliation(s)
- Jeru Manoj Manuel
- Research Center on Aging, Centre Intégré Universitaire de Santé et Services Sociaux de l’Estrie-Centre Hospitalier Universitaire de Sherbrooke, Sherbrooke, QC, Canada
- Department of Biochemistry and Functional Genomics, Faculty of Medicine and Health Sciences, Université de Sherbrooke, Sherbrooke, QC, Canada
| | - Noé Guilloy
- Department of Biochemistry and Functional Genomics, Faculty of Medicine and Health Sciences, Université de Sherbrooke, Sherbrooke, QC, Canada
| | - Inès Khatir
- Research Center on Aging, Centre Intégré Universitaire de Santé et Services Sociaux de l’Estrie-Centre Hospitalier Universitaire de Sherbrooke, Sherbrooke, QC, Canada
- Department of Biochemistry and Functional Genomics, Faculty of Medicine and Health Sciences, Université de Sherbrooke, Sherbrooke, QC, Canada
| | - Xavier Roucou
- Department of Biochemistry and Functional Genomics, Faculty of Medicine and Health Sciences, Université de Sherbrooke, Sherbrooke, QC, Canada
- Centre de Recherche du Centre Hospitalier Universitaire de Sherbrooke (CRCHUS), Sherbrooke, QC, Canada
- Quebec Network for Research on Protein Function Structure and Engineering, PROTEO, Québec, QC, Canada
| | - Benoit Laurent
- Research Center on Aging, Centre Intégré Universitaire de Santé et Services Sociaux de l’Estrie-Centre Hospitalier Universitaire de Sherbrooke, Sherbrooke, QC, Canada
- Department of Biochemistry and Functional Genomics, Faculty of Medicine and Health Sciences, Université de Sherbrooke, Sherbrooke, QC, Canada
- *Correspondence: Benoit Laurent
| |
|
34
|
Complementary peptides represent a credible alternative to agrochemicals by activating translation of targeted proteins. Nat Commun 2023; 14:254. [PMID: 36650156 PMCID: PMC9845214 DOI: 10.1038/s41467-023-35951-0] [Received: 09/21/2022] [Accepted: 01/10/2023]
Abstract
The main challenge facing agriculture today is to maintain food production despite multiple threats, such as a growing world population, rising temperatures, a shortage of agrochemicals due to health concerns, and the emergence of herbicide-resistant weeds. Developing novel, alternative, and safe methods is hence of paramount importance. Here, we show that complementary peptides (cPEPs) can be designed from any gene to specifically target plant coding genes. External application of synthetic peptides increases the abundance of the targeted protein, leading to related phenotypes. Moreover, we provide evidence that cPEPs can be powerful tools in agronomy to improve plant traits, such as growth and resistance to pathogens or heat stress, without the need for genetic approaches. Finally, by combining their activities, cPEPs can also be used to reduce weed growth.
|
35
|
Deutsch EW, Bandeira N, Perez-Riverol Y, Sharma V, Carver J, Mendoza L, Kundu DJ, Wang S, Bandla C, Kamatchinathan S, Hewapathirana S, Pullman B, Wertz J, Sun Z, Kawano S, Okuda S, Watanabe Y, MacLean B, MacCoss M, Zhu Y, Ishihama Y, Vizcaíno J. The ProteomeXchange consortium at 10 years: 2023 update. Nucleic Acids Res 2023; 51:D1539-D1548. [PMID: 36370099 PMCID: PMC9825490 DOI: 10.1093/nar/gkac1040] [Received: 09/15/2022] [Revised: 10/20/2022] [Accepted: 10/23/2022]
Abstract
Mass spectrometry (MS) is by far the most used experimental approach in high-throughput proteomics. The ProteomeXchange (PX) consortium of proteomics resources (http://www.proteomexchange.org) was originally set up to standardize data submission and dissemination of public MS proteomics data. It is now 10 years since the initial data workflow was implemented. In this manuscript, we describe the main developments in PX since the previous update manuscript in Nucleic Acids Research was published in 2020. The six members of the Consortium are PRIDE, PeptideAtlas (including PASSEL), MassIVE, jPOST, iProX and Panorama Public. We report the current data submission statistics, showcasing that the number of datasets submitted to PX resources has continued to increase every year. As of June 2022, more than 34 233 datasets had been submitted to PX resources, and from those, 20 062 (58.6%) just in the last three years. We also report the development of the Universal Spectrum Identifiers and the improvements in capturing the experimental metadata annotations. In parallel, we highlight that data re-use activities of public datasets continue to increase, enabling connections between PX resources and other popular bioinformatics resources, novel research and also new data resources. Finally, we summarise the current state-of-the-art in data management practices for sensitive human (clinical) proteomics data.
Affiliation(s)
| | - Nuno Bandeira
- Center for Computational Mass Spectrometry, University of California, San Diego (UCSD), La Jolla, CA 92093, USA
- Dept. Computer Science and Engineering, University of California, San Diego (UCSD), La Jolla, CA 92093, USA
- Skaggs School of Pharmacy and Pharmaceutical Sciences, University of California, San Diego (UCSD), La Jolla, CA 92093, USA
| | - Yasset Perez-Riverol
- European Molecular Biology Laboratory, European Bioinformatics Institute (EMBL-EBI), Wellcome Trust Genome Campus, Hinxton, Cambridge, CB10 1SD, UK
| | | | - Jeremy J Carver
- Center for Computational Mass Spectrometry, University of California, San Diego (UCSD), La Jolla, CA 92093, USA
- Dept. Computer Science and Engineering, University of California, San Diego (UCSD), La Jolla, CA 92093, USA
- Skaggs School of Pharmacy and Pharmaceutical Sciences, University of California, San Diego (UCSD), La Jolla, CA 92093, USA
| | - Luis Mendoza
- Institute for Systems Biology, Seattle WA 98109, USA
| | - Deepti J Kundu
- European Molecular Biology Laboratory, European Bioinformatics Institute (EMBL-EBI), Wellcome Trust Genome Campus, Hinxton, Cambridge, CB10 1SD, UK
| | - Shengbo Wang
- European Molecular Biology Laboratory, European Bioinformatics Institute (EMBL-EBI), Wellcome Trust Genome Campus, Hinxton, Cambridge, CB10 1SD, UK
| | - Chakradhar Bandla
- European Molecular Biology Laboratory, European Bioinformatics Institute (EMBL-EBI), Wellcome Trust Genome Campus, Hinxton, Cambridge, CB10 1SD, UK
| | - Selvakumar Kamatchinathan
- European Molecular Biology Laboratory, European Bioinformatics Institute (EMBL-EBI), Wellcome Trust Genome Campus, Hinxton, Cambridge, CB10 1SD, UK
| | - Suresh Hewapathirana
- European Molecular Biology Laboratory, European Bioinformatics Institute (EMBL-EBI), Wellcome Trust Genome Campus, Hinxton, Cambridge, CB10 1SD, UK
| | - Benjamin S Pullman
- Center for Computational Mass Spectrometry, University of California, San Diego (UCSD), La Jolla, CA 92093, USA
- Dept. Computer Science and Engineering, University of California, San Diego (UCSD), La Jolla, CA 92093, USA
- Skaggs School of Pharmacy and Pharmaceutical Sciences, University of California, San Diego (UCSD), La Jolla, CA 92093, USA
| | - Julie Wertz
- Center for Computational Mass Spectrometry, University of California, San Diego (UCSD), La Jolla, CA 92093, USA
- Dept. Computer Science and Engineering, University of California, San Diego (UCSD), La Jolla, CA 92093, USA
- Skaggs School of Pharmacy and Pharmaceutical Sciences, University of California, San Diego (UCSD), La Jolla, CA 92093, USA
| | - Zhi Sun
- Institute for Systems Biology, Seattle WA 98109, USA
| | - Shin Kawano
- Faculty of Contemporary Society, Toyama University of International Studies, Toyama 930-1292, Japan
- Database Center for Life Science (DBCLS), Joint Support-Center for Data Science Research, Research Organization of Information and Systems, Chiba 277-0871, Japan
- School of Frontier Engineering, Kitasato University, Sagamihara 252-0373, Japan
| | - Shujiro Okuda
- Niigata University Graduate School of Medical and Dental Sciences, Niigata 951-8510, Japan
| | - Yu Watanabe
- Niigata University Graduate School of Medical and Dental Sciences, Niigata 951-8510, Japan
| | | | | | - Yunping Zhu
- Beijing Proteome Research Center, National Center for Protein Sciences, Beijing Institute of Lifeomics, Beijing 102206, China
| | - Yasushi Ishihama
- Graduate School of Pharmaceutical Sciences, Kyoto University, Kyoto 606-8501, Japan
| | - Juan Antonio Vizcaíno
- European Molecular Biology Laboratory, European Bioinformatics Institute (EMBL-EBI), Wellcome Trust Genome Campus, Hinxton, Cambridge, CB10 1SD, UK
| |
|
36
|
Employing non-targeted interactomics approach and subcellular fractionation to increase our understanding of the ghost proteome. iScience 2023; 26:105943. [PMID: 36866041 PMCID: PMC9971881 DOI: 10.1016/j.isci.2023.105943] [Received: 09/12/2022] [Revised: 11/07/2022] [Accepted: 01/04/2023]
Abstract
Eukaryotic mRNA has long been considered monocistronic, but nowadays, alternative proteins (AltProts) challenge this tenet. The alternative or ghost proteome, and the involvement of AltProts in biological processes, have largely been neglected. Here, we used subcellular fractionation to increase the information about AltProts and facilitate the detection of protein-protein interactions by the identification of crosslinked peptides. In total, 112 unique AltProts were identified, and we were able to identify 220 crosslinks without peptide enrichment. Among these, 16 crosslinks between AltProts and Referenced Proteins (RefProts) were identified. We further focused on specific examples such as the interaction between IP_2292176 (AltFAM227B) and HLA-B, in which this protein could be a potential new immunopeptide, and the interactions between HIST1H4F and several AltProts, which may play a role in mRNA transcription. By studying the interactome and the localization of AltProts, we can reveal more of the importance of the ghost proteome.
|
37
|
Çakır U, Gabed N, Brunet M, Roucou X, Kryvoruchko I. Mosaic translation hypothesis: chimeric polypeptides produced via multiple ribosomal frameshifting as a basis for adaptability. FEBS J 2023; 290:370-378. [PMID: 34743413 DOI: 10.1111/febs.16269] [Received: 06/07/2021] [Revised: 10/03/2021] [Accepted: 11/05/2021]
Abstract
How many different proteins can be produced from a single spliced transcript? Genome annotation projects overlook the coding potential of reading frames other than that of the reference open reading frames (refORFs). Recently, alternative open reading frames (altORFs) and their translational products, alternative proteins, have been shown to carry out important functions in various organisms. AltORFs overlapping refORFs or other altORFs in a different reading frame may be involved in one fundamental mechanism so far overlooked. A few years ago, it was proposed that altORFs may act as building blocks for chimeric (mosaic) polypeptides, which are produced via multiple ribosomal frameshifting events from a single mature transcript. We adopt terminology from that earlier discussion and call this mechanism mosaic translation. This way of extracting and combining genetic information may significantly increase proteome diversity. Thus, we hypothesize that this mechanism may have contributed to the flexibility and adaptability of organisms to a variety of environmental conditions. Specialized ribosomes acting as sensors probably played a central role in this process. Importantly, mosaic translation may be the main source of protein diversity in genomes that lack alternative splicing. The idea of mosaic translation is a testable hypothesis, although its direct demonstration is challenging. Should mosaic translation occur, we would currently highly underestimate the complexity of translation mechanisms and thus the proteome.
Affiliation(s)
- Umut Çakır
- Molecular Biology and Genetics Department, Faculty of Arts and Sciences, Boğaziçi University, Istanbul, Turkey
| | - Noujoud Gabed
- Cellular and Molecular Biology Department, Oran High School of Biological Sciences (ESSBO), Oran, Algeria
| | - Marie Brunet
- Department of Pediatrics, Medical Genetics Service, Université de Sherbrooke, QC, Canada
- Centre de Recherche du Centre Hospitalier Universitaire de Sherbrooke (CRCHUS), QC, Canada
| | - Xavier Roucou
- Centre de Recherche du Centre Hospitalier Universitaire de Sherbrooke (CRCHUS), QC, Canada
- Department of Biochemistry and Functional Genomics, Université de Sherbrooke, QC, Canada
| | - Igor Kryvoruchko
- Molecular Biology and Genetics Department, Faculty of Arts and Sciences, Boğaziçi University, Istanbul, Turkey
| |
|
38
|
Jürgens L, Wethmar K. The Emerging Role of uORF-Encoded uPeptides and HLA uLigands in Cellular and Tumor Biology. Cancers (Basel) 2022; 14:6031. [PMID: 36551517 PMCID: PMC9776223 DOI: 10.3390/cancers14246031] [Received: 10/31/2022] [Revised: 11/29/2022] [Accepted: 11/30/2022]
Abstract
Recent technological advances have facilitated the detection of numerous non-canonical human peptides derived from regulatory regions of mRNAs, long non-coding RNAs, and other cryptic transcripts. In this review, we first give an overview of the classification of these novel peptides and summarize recent improvements in their annotation and detection by ribosome profiling, mass spectrometry, and individual experimental analysis. A large fraction of the novel peptides originates from translation at upstream open reading frames (uORFs) that are located within the transcript leader sequence of regular mRNA. In humans, uORF-encoded peptides (uPeptides) have been detected in both healthy and malignantly transformed cells and emerge as important regulators in cellular and immunological pathways. In the second part of the review, we focus on various functional implications of uPeptides. As uPeptides frequently act at the transition of translational regulation and individual peptide function, we describe the mechanistic modes of translational regulation through ribosome stalling, the involvement in cellular programs through protein interaction and complex formation, and their role within the human leukocyte antigen (HLA)-associated immunopeptidome as HLA uLigands. We delineate how malignant transformation may lead to the formation of novel uORFs, uPeptides, or HLA uLigands and explain their potential implication in tumor biology. Ultimately, we speculate on a potential use of uPeptides as peptide drugs and discuss how uPeptides and HLA uLigands may facilitate translational inhibition of oncogenic protein messages and immunotherapeutic approaches in cancer therapy.
Affiliation(s)
| | - Klaus Wethmar
- University Hospital Münster, Department of Medicine A, Hematology, Oncology, Hemostaseology and Pneumology, 48149 Münster, Germany
| |
|
39
|
Mohaupt P, Roucou X, Delaby C, Vialaret J, Lehmann S, Hirtz C. The alternative proteome in neurobiology. Front Cell Neurosci 2022; 16:1019680. [PMID: 36467612 PMCID: PMC9712206 DOI: 10.3389/fncel.2022.1019680] [Received: 08/15/2022] [Accepted: 11/02/2022]
Abstract
Translation involves the biosynthesis of a protein sequence following the decoding of the genetic information embedded in a messenger RNA (mRNA). Eukaryotic mRNA has typically been considered inherently monocistronic, but this paradigm is not in agreement with the translational landscape of cells, tissues, and organs. Recent ribosome sequencing (Ribo-seq) and proteomics studies show that, in addition to currently annotated reference proteins (RefProts), other proteins termed alternative proteins (AltProts) and microproteins are encoded in regions of mRNAs thought to be untranslated or in transcripts annotated as non-coding. This experimental evidence expands the repertoire of functional proteins within a cell and potentially provides important information on biological processes. This review explores the hitherto overlooked alternative proteome in neurobiology and considers the role of AltProts in pathological and healthy neuromolecular processes.
Affiliation(s)
- Pablo Mohaupt
- LBPC-PPC, Université de Montpellier, IRMB CHU de Montpellier, INM INSERM, Montpellier, France
| | - Xavier Roucou
- Department of Biochemistry and Functional Genomics, Université de Sherbrooke, Sherbrooke, QC, Canada
| | - Constance Delaby
- LBPC-PPC, Université de Montpellier, IRMB CHU de Montpellier, INM INSERM, Montpellier, France
| | - Jérôme Vialaret
- LBPC-PPC, Université de Montpellier, IRMB CHU de Montpellier, INM INSERM, Montpellier, France
| | - Sylvain Lehmann
- LBPC-PPC, Université de Montpellier, IRMB CHU de Montpellier, INM INSERM, Montpellier, France
| | - Christophe Hirtz
- LBPC-PPC, Université de Montpellier, IRMB CHU de Montpellier, INM INSERM, Montpellier, France
| |
|
40
|
Duhamel M, Drelich L, Wisztorski M, Aboulouard S, Gimeno JP, Ogrinc N, Devos P, Cardon T, Weller M, Escande F, Zairi F, Maurage CA, Le Rhun É, Fournier I, Salzet M. Spatial analysis of the glioblastoma proteome reveals specific molecular signatures and markers of survival. Nat Commun 2022; 13:6665. [PMID: 36333286 PMCID: PMC9636229 DOI: 10.1038/s41467-022-34208-6] [Received: 04/19/2022] [Accepted: 10/18/2022]
Abstract
Molecular heterogeneity is a key feature of glioblastoma that impedes patient stratification and leads to large discrepancies in mean patient survival. Here, we analyze a cohort of 96 glioblastoma patients with survival ranging from a few months to over 4 years. 46 tumors are analyzed by mass spectrometry-based spatially-resolved proteomics guided by mass spectrometry imaging. Integration of protein expression and clinical information highlights three molecular groups associated with immune, neurogenesis, and tumorigenesis signatures with high intra-tumoral heterogeneity. Furthermore, a set of proteins originating from reference and alternative ORFs is found to be statistically significant based on patient survival times. Among these proteins, a 5-protein signature is associated with survival. The expression of these 5 proteins is validated by immunofluorescence on an additional cohort of 50 patients. Overall, our work characterizes distinct molecular regions within glioblastoma tissues based on protein expression, which may help guide glioblastoma prognosis and improve current glioblastoma classification.
Affiliation(s)
- Marie Duhamel
- Univ. Lille, Inserm, CHU Lille, U1192, Laboratoire Protéomique, Réponse Inflammatoire et Spectrométrie de Masse (PRISM), F-59000, Lille, France
| | - Lauranne Drelich
- Univ. Lille, Inserm, CHU Lille, U1192, Laboratoire Protéomique, Réponse Inflammatoire et Spectrométrie de Masse (PRISM), F-59000, Lille, France
| | - Maxence Wisztorski
- Univ. Lille, Inserm, CHU Lille, U1192, Laboratoire Protéomique, Réponse Inflammatoire et Spectrométrie de Masse (PRISM), F-59000, Lille, France
| | - Soulaimane Aboulouard
- Univ. Lille, Inserm, CHU Lille, U1192, Laboratoire Protéomique, Réponse Inflammatoire et Spectrométrie de Masse (PRISM), F-59000, Lille, France
| | - Jean-Pascal Gimeno
- Univ. Lille, Inserm, CHU Lille, U1192, Laboratoire Protéomique, Réponse Inflammatoire et Spectrométrie de Masse (PRISM), F-59000, Lille, France
| | - Nina Ogrinc
- Univ. Lille, Inserm, CHU Lille, U1192, Laboratoire Protéomique, Réponse Inflammatoire et Spectrométrie de Masse (PRISM), F-59000, Lille, France
| | - Patrick Devos
- Univ. Lille, CHU Lille, ULR 2694 - METRICS: Évaluation des technologies de santé et des pratiques médicales, F-59000, Lille, France
| | - Tristan Cardon
- Univ. Lille, Inserm, CHU Lille, U1192, Laboratoire Protéomique, Réponse Inflammatoire et Spectrométrie de Masse (PRISM), F-59000, Lille, France
| | - Michael Weller
- Department of Neurology & Clinical Neuroscience Center, University Hospital and University of Zurich, Zurich, Switzerland
| | - Fabienne Escande
- CHU Lille, Service de biochimie et biologie moléculaire, CHU Lille, F-59000, Lille, France
| | - Fahed Zairi
- CHU Lille, Service de neurochirurgie, F-59000, Lille, France
| | - Claude-Alain Maurage
- CHU Lille, Service de biochimie et biologie moléculaire, CHU Lille, F-59000, Lille, France
| | - Émilie Le Rhun
- Univ. Lille, Inserm, CHU Lille, U1192, Laboratoire Protéomique, Réponse Inflammatoire et Spectrométrie de Masse (PRISM), F-59000, Lille, France
- Department of Neurology & Clinical Neuroscience Center, University Hospital and University of Zurich, Zurich, Switzerland
- CHU Lille, Service de biochimie et biologie moléculaire, CHU Lille, F-59000, Lille, France
| | - Isabelle Fournier
- Univ. Lille, Inserm, CHU Lille, U1192, Laboratoire Protéomique, Réponse Inflammatoire et Spectrométrie de Masse (PRISM), F-59000, Lille, France
- Institut Universitaire de France (IUF), 75000, Paris, France
| | - Michel Salzet
- Univ. Lille, Inserm, CHU Lille, U1192, Laboratoire Protéomique, Réponse Inflammatoire et Spectrométrie de Masse (PRISM), F-59000, Lille, France
- Institut Universitaire de France (IUF), 75000, Paris, France
| |
|
41
|
Manske F, Ogoniak L, Jürgens L, Grundmann N, Makałowski W, Wethmar K. The new uORFdb: integrating literature, sequence, and variation data in a central hub for uORF research. Nucleic Acids Res 2022; 51:D328-D336. [PMID: 36305828 PMCID: PMC9825577 DOI: 10.1093/nar/gkac899] [Citation(s) in RCA: 1] [Impact Index Per Article: 0.5] [Received: 09/14/2022] [Revised: 09/28/2022] [Accepted: 10/03/2022] [Indexed: 02/07/2023] Open
Abstract
Upstream open reading frames (uORFs) are initiated by AUG or near-cognate start codons and have been identified in the transcript leader sequences of the majority of eukaryotic transcripts. Functionally, uORFs are implicated in downstream translational regulation of the main protein coding sequence and may serve as a source of non-canonical peptides. Genetic defects in uORF sequences have been linked to the development of various diseases, including cancer. To simplify uORF-related research, the initial release of uORFdb in 2014 provided a comprehensive and manually curated collection of uORF-related literature. Here, we present an updated sequence-based version of uORFdb, accessible at https://www.bioinformatics.uni-muenster.de/tools/uorfdb. The new uORFdb enables users to directly access sequence information, graphical displays, and genetic variation data for over 2.4 million human uORFs. It also includes sequence data of >4.2 million uORFs in 12 additional species. Multiple uORFs can be displayed in transcript- and reading-frame-specific models to visualize the translational context. A variety of filters, sequence-related information, and links to external resources (UCSC Genome Browser, dbSNP, ClinVar) facilitate immediate in-depth analysis of individual uORFs. The database also contains uORF-related somatic variation data obtained from whole-genome sequencing (WGS) analyses of 677 cancer samples collected by the TCGA consortium.
Affiliation(s)
- Felix Manske
- Institute of Bioinformatics, University of Münster, Münster 48149, Germany
| | - Lynn Ogoniak
- Institute of Bioinformatics, University of Münster, Münster 48149, Germany
| | - Lara Jürgens
- Department of Medicine A, Hematology, Oncology, Hemostaseology and Pneumology, University Hospital Münster, Münster 48149, Germany
| | - Norbert Grundmann
- Institute of Bioinformatics, University of Münster, Münster 48149, Germany
| | - Wojciech Makałowski
- Correspondence may also be addressed to Wojciech Makałowski. Tel: +49 2518353006
| | - Klaus Wethmar
- To whom correspondence should be addressed. Tel: +49 2518347587; Fax: +49 2518347588
| |
|
42
|
Vasu K, Khan D, Ramachandiran I, Blankenberg D, Fox P. Analysis of nested alternate open reading frames and their encoded proteins. NAR Genom Bioinform 2022; 4:lqac076. [PMID: 36267124 PMCID: PMC9580016 DOI: 10.1093/nargab/lqac076] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 07/03/2022] [Revised: 08/14/2022] [Accepted: 09/27/2022] [Indexed: 11/22/2022] Open
Abstract
Transcriptional and post-transcriptional mechanisms diversify the proteome beyond gene number, while maintaining a sequence relationship between original and altered proteins. A new mechanism breaks this paradigm, generating novel proteins by translating alternative open reading frames (Alt-ORFs) within canonical host mRNAs. Uniquely, ‘alt-proteins’ lack sequence homology with host ORF-derived proteins. We show global amino acid frequencies, and consequent biochemical characteristics of Alt-ORFs nested within host ORFs (nAlt-ORFs), are genetically-driven, and predicted by summation of frequencies of hundreds of encompassing host codon-pairs. Analysis of 101 human nAlt-ORFs of length ≥150 codons confirms the theoretical predictions, revealing an extraordinarily high median isoelectric point (pI) of 11.68, due to anomalous charged amino acid levels. Also, nAlt-ORF proteins exhibit a >2-fold preference for reading frame 2 versus 3, predicted mitochondrial and nuclear localization, and elevated codon adaptation index indicative of natural selection. Our results provide a theoretical and conceptual framework for exploration of these largely unannotated, but potentially significant, alternative ORFs and their encoded proteins.
Affiliation(s)
- Kommireddy Vasu
- Department of Cardiovascular and Metabolic Sciences, Lerner Research Institute, Cleveland Clinic, Cleveland, OH 44195, USA
| | - Debjit Khan
- Department of Cardiovascular and Metabolic Sciences, Lerner Research Institute, Cleveland Clinic, Cleveland, OH 44195, USA
| | - Iyappan Ramachandiran
- Department of Cardiovascular and Metabolic Sciences, Lerner Research Institute, Cleveland Clinic, Cleveland, OH 44195, USA
| | - Daniel Blankenberg
- Correspondence may also be addressed to Daniel Blankenberg. Tel: +1 216 444 4336
| | - Paul L Fox
- To whom correspondence should be addressed. Tel: +1 216 444 8053; Fax: +1 216 444 9404
| |
|
43
|
Malekos E, Carpenter S. Short open reading frame genes in innate immunity: from discovery to characterization. Trends Immunol 2022; 43:741-756. [PMID: 35965152 PMCID: PMC10118063 DOI: 10.1016/j.it.2022.07.005] [Citation(s) in RCA: 11] [Impact Index Per Article: 5.5] [Received: 06/22/2022] [Revised: 07/11/2022] [Accepted: 07/13/2022] [Indexed: 12/27/2022]
Abstract
Next-generation sequencing (NGS) technologies have greatly expanded the size of the known transcriptome. Many newly discovered transcripts are classified as long noncoding RNAs (lncRNAs) which are assumed to affect phenotype through sequence and structure and not via translated protein products despite the vast majority of them harboring short open reading frames (sORFs). Recent advances have demonstrated that the noncoding designation is incorrect in many cases and that sORF-encoded peptides (SEPs) translated from these transcripts are important contributors to diverse biological processes. Interest in SEPs is at an early stage and there is evidence for the existence of thousands of SEPs that are yet unstudied. We hope to pique interest in investigating this unexplored proteome by providing a discussion of SEP characterization generally and describing specific discoveries in innate immunity.
Affiliation(s)
- Eric Malekos
- Department of Biomolecular Engineering, University of California Santa Cruz, Santa Cruz, CA, USA; Genomics Institute, University of California Santa Cruz, Santa Cruz, CA, USA
| | - Susan Carpenter
- Genomics Institute, University of California Santa Cruz, Santa Cruz, CA, USA; Department of Molecular Cell and Developmental Biology, University of California Santa Cruz, Santa Cruz, CA, USA.
| |
|
44
|
Brunet MA, Leblanc S, Roucou X. OpenVar: functional annotation of variants in non-canonical open reading frames. Cell Biosci 2022; 12:130. [PMID: 35965322 PMCID: PMC9375913 DOI: 10.1186/s13578-022-00871-x] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 03/04/2022] [Accepted: 08/03/2022] [Indexed: 11/12/2022] Open
Abstract
Background: Recent technological advances have revealed thousands of functional open reading frames (ORF) that have eluded reference genome annotations. These overlooked ORFs are found throughout the genome, in any reading frame of transcripts, mature or non-coding, and can overlap annotated ORFs in a different reading frame. The exploration of these novel ORFs in genomic datasets and of their role in genetic traits is hindered by a lack of software. Results: Here, we present OpenVar, a genomic variant annotator that mends that gap and fosters meaningful discoveries. To illustrate the potential of OpenVar, we analysed all variants within SynMicDB, a database of cancer-associated synonymous mutations. By including non-canonical ORFs in the analysis, OpenVar yields a 33.6-fold, 13.8-fold and 8.3-fold increase in high impact variants over Annovar, SnpEff and VEP respectively. We highlighted an overlapping non-canonical ORF in the HEY2 gene where variants significantly clustered. Conclusions: OpenVar integrates non-canonical ORFs in the analysis of genomic variants, unveiling new research avenues to better understand the genotype–phenotype relationships.
|
45
|
Identification and analysis of smORFs in Chlamydomonas reinhardtii. Genomics 2022; 114:110444. [PMID: 35933072 DOI: 10.1016/j.ygeno.2022.110444] [Citation(s) in RCA: 2] [Impact Index Per Article: 1.0] [Received: 04/22/2022] [Revised: 07/06/2022] [Accepted: 07/31/2022] [Indexed: 11/24/2022]
Abstract
Small open reading frames (smORFs) have been acknowledged as important contributors to organism functions in species ranging from bacteria to higher eukaryotes. However, smORFs in green algae remain poorly investigated, despite their importance in ecology and evolution. We applied bioinformatic analysis, ribosome profiling, and small peptide proteomics to provide a genome-wide, high-confidence smORF database in the model green alga Chlamydomonas reinhardtii. The whole genome was first screened to mine potential coding smORFs. Then conservation analysis, ribosome profiling, and proteomics data were processed to identify conserved smORFs and generate translation evidence. The combination of procedures resulted in 2014 smORFs that might exist in the C. reinhardtii genome. The expression of smORFs under Cd treatment suggested that two smORFs might participate in redox reactions, three in inorganic phosphate transport, and one in DNA repair under stress. Our study built a genome-wide database in C. reinhardtii, providing target smORFs for further research.
|
46
|
Na Z, Dai X, Zheng SJ, Bryant CJ, Loh KH, Su H, Luo Y, Buhagiar AF, Cao X, Baserga SJ, Chen S, Slavoff SA. Mapping subcellular localizations of unannotated microproteins and alternative proteins with MicroID. Mol Cell 2022; 82:2900-2911.e7. [PMID: 35905735 PMCID: PMC9662605 DOI: 10.1016/j.molcel.2022.06.035] [Citation(s) in RCA: 24] [Impact Index Per Article: 12.0] [Received: 10/07/2021] [Revised: 04/08/2022] [Accepted: 06/29/2022] [Indexed: 11/15/2022]
Abstract
Proteogenomic identification of translated small open reading frames has revealed thousands of previously unannotated, largely uncharacterized microproteins, or polypeptides of less than 100 amino acids, and alternative proteins (alt-proteins) that are co-encoded with canonical proteins and are often larger. The subcellular localizations of microproteins and alt-proteins are generally unknown but can have significant implications for their functions. Proximity biotinylation is an attractive approach to define the protein composition of subcellular compartments in cells and in animals. Here, we developed a high-throughput technology to map unannotated microproteins and alt-proteins to subcellular localizations by proximity biotinylation with TurboID (MicroID). More than 150 microproteins and alt-proteins are associated with subnuclear organelles. One alt-protein, alt-LAMA3, localizes to the nucleolus and functions in pre-rRNA transcription. We applied MicroID in a mouse model, validating expression of a conserved nuclear microprotein, and establishing MicroID for discovery of microproteins and alt-proteins in vivo.
Affiliation(s)
- Zhenkun Na
- Department of Chemistry, Yale University, New Haven, CT 06520, USA; Institute of Biomolecular Design and Discovery, Yale University, West Haven, CT 06516, USA
| | - Xiaoyun Dai
- Department of Genetics, Yale University School of Medicine, New Haven, CT 06520, USA; Systems Biology Institute, Yale University, West Haven, CT 06516, USA
| | - Shu-Jian Zheng
- Department of Chemistry, Yale University, New Haven, CT 06520, USA; Institute of Biomolecular Design and Discovery, Yale University, West Haven, CT 06516, USA
| | - Carson J Bryant
- Department of Molecular Biophysics and Biochemistry, Yale University, New Haven, CT 06529, USA
| | - Ken H Loh
- Laboratory of Molecular Genetics, Howard Hughes Medical Institute, The Rockefeller University, New York, NY 10065, USA
| | - Haomiao Su
- Department of Chemistry, Yale University, New Haven, CT 06520, USA; Institute of Biomolecular Design and Discovery, Yale University, West Haven, CT 06516, USA
| | - Yang Luo
- Department of Chemistry, Yale University, New Haven, CT 06520, USA; Institute of Biomolecular Design and Discovery, Yale University, West Haven, CT 06516, USA
| | - Amber F Buhagiar
- Department of Molecular Biophysics and Biochemistry, Yale University, New Haven, CT 06529, USA
| | - Xiongwen Cao
- Department of Chemistry, Yale University, New Haven, CT 06520, USA; Institute of Biomolecular Design and Discovery, Yale University, West Haven, CT 06516, USA
| | - Susan J Baserga
- Department of Molecular Biophysics and Biochemistry, Yale University, New Haven, CT 06529, USA; Department of Genetics, Yale University School of Medicine, New Haven, CT 06520, USA; Department of Therapeutic Radiology, Yale University School of Medicine, New Haven, CT 06520, USA
| | - Sidi Chen
- Department of Genetics, Yale University School of Medicine, New Haven, CT 06520, USA; Systems Biology Institute, Yale University, West Haven, CT 06516, USA
| | - Sarah A Slavoff
- Department of Chemistry, Yale University, New Haven, CT 06520, USA; Institute of Biomolecular Design and Discovery, Yale University, West Haven, CT 06516, USA; Department of Molecular Biophysics and Biochemistry, Yale University, New Haven, CT 06529, USA.
| |
|
47
|
Bogaert A, Fijalkowska D, Staes A, Van de Steene T, Demol H, Gevaert K. Limited evidence for protein products of non-coding transcripts in the HEK293T cellular cytosol. Mol Cell Proteomics 2022; 21:100264. [PMID: 35788065 PMCID: PMC9396073 DOI: 10.1016/j.mcpro.2022.100264] [Citation(s) in RCA: 10] [Impact Index Per Article: 5.0] [Received: 01/21/2022] [Revised: 06/22/2022] [Accepted: 06/30/2022] [Indexed: 10/25/2022] Open
Abstract
Ribosome profiling has revealed translation outside of canonical coding sequences (CDSs) including translation of short upstream ORFs, long non-coding RNAs, overlapping ORFs, ORFs in UTRs or ORFs in alternative reading frames. Studies combining mass spectrometry, ribosome profiling and CRISPR-based screens showed that hundreds of ORFs derived from non-coding transcripts produce (micro)proteins, while other studies failed to find evidence for such types of non-canonical translation products. Here, we attempted to discover translation products from non-coding regions by strongly reducing the complexity of the sample prior to mass spectrometric analysis. We used an extended database as the search space and applied stringent filtering of the identified peptides to find evidence for novel translation events. We show that, theoretically, our strategy facilitates the detection of translation events of transcripts from non-coding regions, but experimentally we find only 19 peptides that might originate from such translation events. Finally, Virotrap-based interactome analysis of two N-terminal proteoforms originating from non-coding regions showed the functional potential of these novel proteins.
Affiliation(s)
- Annelies Bogaert
- VIB Center for Medical Biotechnology, VIB, Ghent, 9052, Belgium; Department of Biomolecular Medicine, Ghent University, Ghent, 9052, Belgium
| | - Daria Fijalkowska
- VIB Center for Medical Biotechnology, VIB, Ghent, 9052, Belgium; Department of Biomolecular Medicine, Ghent University, Ghent, 9052, Belgium
| | - An Staes
- VIB Center for Medical Biotechnology, VIB, Ghent, 9052, Belgium; Department of Biomolecular Medicine, Ghent University, Ghent, 9052, Belgium
| | - Tessa Van de Steene
- VIB Center for Medical Biotechnology, VIB, Ghent, 9052, Belgium; Department of Biomolecular Medicine, Ghent University, Ghent, 9052, Belgium
| | - Hans Demol
- VIB Center for Medical Biotechnology, VIB, Ghent, 9052, Belgium; Department of Biomolecular Medicine, Ghent University, Ghent, 9052, Belgium
| | - Kris Gevaert
- VIB Center for Medical Biotechnology, VIB, Ghent, 9052, Belgium; Department of Biomolecular Medicine, Ghent University, Ghent, 9052, Belgium.
| |
|
48
|
Perez-Riverol Y. Proteomic repository data submission, dissemination, and reuse: key messages. Expert Rev Proteomics 2022; 19:297-310. [PMID: 36529941 PMCID: PMC7614296 DOI: 10.1080/14789450.2022.2160324] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 10/28/2022] [Accepted: 12/07/2022] [Indexed: 12/23/2022]
Abstract
INTRODUCTION: The creation of ProteomeXchange data workflows in 2012 transformed the field of proteomics by standardizing data submission and dissemination and enabling the widespread reanalysis of public MS proteomics data worldwide. ProteomeXchange has triggered a growing trend toward public dissemination of proteomics data, facilitating the assessment, reuse, comparative analyses, and extraction of new findings from public datasets. As of 2022, the consortium comprises PRIDE, PeptideAtlas, MassIVE, jPOST, iProX, and Panorama Public. AREAS COVERED: Here, we review and discuss the current ecosystem of resources, guidelines, and file formats for proteomics data dissemination and reanalysis. Special attention is drawn to new exciting quantitative and post-translational modification-oriented resources. The challenges and future directions of data deposition, including the lack of metadata and of cloud-based and high-performance software solutions for fast and reproducible reanalysis of the available data, are discussed. EXPERT OPINION: The success of ProteomeXchange and the amount of proteomics data available in the public domain have triggered the creation and/or growth of other protein knowledgebase resources. Data reuse is a leading, active, and evolving field, supporting the creation of new formats, tools, and workflows to rediscover and reshape the public proteomics data.
Affiliation(s)
- Yasset Perez-Riverol
- European Molecular Biology Laboratory, European Bioinformatics Institute (EMBL-EBI), Wellcome Trust Genome Campus, Hinxton, Cambridge, UK
| |
|
49
|
Liu Y, Zeng S, Wu M. Novel insights into noncanonical open reading frames in cancer. Biochim Biophys Acta Rev Cancer 2022; 1877:188755. [PMID: 35777601 DOI: 10.1016/j.bbcan.2022.188755] [Citation(s) in RCA: 1] [Impact Index Per Article: 0.5] [Received: 03/21/2022] [Revised: 06/11/2022] [Accepted: 06/23/2022] [Indexed: 12/12/2022]
Abstract
With technological advances, previously neglected noncanonical open reading frames (nORFs) are drawing ever-increasing attention. However, the translation potential of numerous putative nORFs remains elusive, and the functions of noncanonical peptides have not been systematically summarized. Moreover, the relationship between noncanonical peptides and their counterpart protein or RNA products remains elusive and the clinical implementation of noncanonical peptides has not been explored. In this review, we highlight how recent technological advances such as ribosome profiling, bioinformatics approaches and CRISPR/Cas9 facilitate the research of noncanonical peptides. We delineate the features of each nORF category and the evolutionary process underlying the nORFs. Most importantly, we summarize the diversified functions of noncanonical peptides in cancer based on their subcellular location, which reflect their extensive participation in key pathways and essential cellular activities in cancer cells. Meanwhile, the equilibrium between noncanonical peptides and their corresponding transcripts or counterpart products may be dysregulated under pathological states, which is essential for their roles in cancer. Lastly, we explore their underestimated potential in clinical application as diagnostic biomarkers and treatment targets against cancer.
Affiliation(s)
- Yihan Liu
- Hunan Cancer Hospital and the Affiliated Cancer Hospital of Xiangya School of Medicine, Central South University, Changsha 410013, Hunan, China; The Key Laboratory of Carcinogenesis of the Chinese Ministry of Health, The Key Laboratory of Carcinogenesis and Cancer Invasion of the Chinese Ministry of Education, Cancer Research Institute, Central South University, Changsha, Hunan 410008, China; Department of Oncology, Xiangya Hospital, Central South University, Changsha, Hunan 410008, China; Key Laboratory for Molecular Radiation Oncology of Hunan Province, Xiangya Hospital, Central South University, Changsha, Hunan 410008, China; National Clinical Research Center for Geriatric Disorders, Xiangya Hospital, Central South University, Changsha, Hunan 410008, China
| | - Shan Zeng
- Department of Oncology, Xiangya Hospital, Central South University, Changsha, Hunan 410008, China; Key Laboratory for Molecular Radiation Oncology of Hunan Province, Xiangya Hospital, Central South University, Changsha, Hunan 410008, China; National Clinical Research Center for Geriatric Disorders, Xiangya Hospital, Central South University, Changsha, Hunan 410008, China.
| | - Minghua Wu
- Hunan Cancer Hospital and the Affiliated Cancer Hospital of Xiangya School of Medicine, Central South University, Changsha 410013, Hunan, China; The Key Laboratory of Carcinogenesis of the Chinese Ministry of Health, The Key Laboratory of Carcinogenesis and Cancer Invasion of the Chinese Ministry of Education, Cancer Research Institute, Central South University, Changsha, Hunan 410008, China.
| |
|
50
|
Fabre B, Choteau SA, Duboé C, Pichereaux C, Montigny A, Korona D, Deery MJ, Camus M, Brun C, Burlet-Schiltz O, Russell S, Combier JP, Lilley KS, Plaza S. In Depth Exploration of the Alternative Proteome of Drosophila melanogaster. Front Cell Dev Biol 2022; 10:901351. [PMID: 35721519 PMCID: PMC9204603 DOI: 10.3389/fcell.2022.901351] [Citation(s) in RCA: 5] [Impact Index Per Article: 2.5] [Received: 03/21/2022] [Accepted: 04/25/2022] [Indexed: 12/13/2022] Open
Abstract
Recent studies have shown that hundreds of small proteins were occulted when protein-coding genes were annotated. These proteins, called alternative proteins, have failed to be annotated notably due to the short length of their open reading frame (less than 100 codons) or the enforced rule establishing that messenger RNAs (mRNAs) are monocistronic. Several alternative proteins were shown to be biologically active molecules and seem to be involved in a wide range of biological functions. However, genome-wide exploration of the alternative proteome is still limited to a few species. In the present article, we describe a deep peptidomics workflow which enabled the identification of 401 alternative proteins in Drosophila melanogaster. Subcellular localization, protein domains, and short linear motifs were predicted for 235 of the alternative proteins identified and point toward specific functions of these small proteins. Several alternative proteins had approximated abundances higher than their canonical counterparts, suggesting that these alternative proteins are actually the main products of their corresponding genes. Finally, we observed 14 alternative proteins with developmentally regulated expression patterns and 10 induced upon the heat-shock treatment of embryos, demonstrating stage or stress-specific production of alternative proteins.
Affiliation(s)
- Bertrand Fabre
- Laboratoire de Recherche en Sciences Végétales, UMR5546, Université de Toulouse, UPS, INP, CNRS, Auzeville-Tolosane, France; Cambridge Centre for Proteomics, Cambridge Systems Biology Centre and Department of Biochemistry, University of Cambridge, Cambridge, United Kingdom. *Correspondence: Bertrand Fabre, Serge Plaza
| | - Sebastien A. Choteau
- Aix-Marseille Université, INSERM, TAGC, Turing Centre for Living Systems, Marseille, France
| | - Carine Duboé
- Laboratoire de Recherche en Sciences Végétales, UMR5546, Université de Toulouse, UPS, INP, CNRS, Auzeville-Tolosane, France
| | - Carole Pichereaux
- Fédération de Recherche (FR3450), Agrobiosciences, Interactions et Biodiversité (AIB), CNRS, Toulouse, France; Institut de Pharmacologie et de Biologie Structurale (IPBS), Université de Toulouse, CNRS, UPS, Toulouse, France; Infrastructure Nationale de Protéomique, ProFI, FR 2048, Toulouse, France
| | - Audrey Montigny
- Laboratoire de Recherche en Sciences Végétales, UMR5546, Université de Toulouse, UPS, INP, CNRS, Auzeville-Tolosane, France
| | - Dagmara Korona
- Cambridge Systems Biology Centre and Department of Genetics, University of Cambridge, Cambridge, United Kingdom
| | - Michael J. Deery
- Cambridge Centre for Proteomics, Cambridge Systems Biology Centre and Department of Biochemistry, University of Cambridge, Cambridge, United Kingdom
| | - Mylène Camus
- Institut de Pharmacologie et de Biologie Structurale (IPBS), Université de Toulouse, CNRS, UPS, Toulouse, France; Infrastructure Nationale de Protéomique, ProFI, FR 2048, Toulouse, France
| | - Christine Brun
- Aix-Marseille Université, INSERM, TAGC, Turing Centre for Living Systems, Marseille, France; CNRS, Marseille, France
| | - Odile Burlet-Schiltz
- Institut de Pharmacologie et de Biologie Structurale (IPBS), Université de Toulouse, CNRS, UPS, Toulouse, France; Infrastructure Nationale de Protéomique, ProFI, FR 2048, Toulouse, France
| | - Steven Russell
- Cambridge Systems Biology Centre and Department of Genetics, University of Cambridge, Cambridge, United Kingdom
| | - Jean-Philippe Combier
- Laboratoire de Recherche en Sciences Végétales, UMR5546, Université de Toulouse, UPS, INP, CNRS, Auzeville-Tolosane, France
| | - Kathryn S. Lilley
- Cambridge Centre for Proteomics, Cambridge Systems Biology Centre and Department of Biochemistry, University of Cambridge, Cambridge, United Kingdom
| | - Serge Plaza
- Laboratoire de Recherche en Sciences Végétales, UMR5546, Université de Toulouse, UPS, INP, CNRS, Auzeville-Tolosane, France. *Correspondence: Bertrand Fabre, Serge Plaza
| |
|