1
Naidu P, Holford M. Microscopic marvels: Decoding the role of micropeptides in innate immunity. Immunology 2024; 173:605-621. [PMID: 39188052 DOI: 10.1111/imm.13850] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 03/11/2024] [Accepted: 07/30/2024] [Indexed: 08/28/2024] Open
Abstract
The innate immune response is under selection pressures from changing environments and pathogens. While sequence evolution can be studied by comparing rates of amino acid mutations within and between species, how a gene's birth and death contribute to the evolution of immunity is less known. Short open reading frames, once regarded as untranslated or transcriptional noise, can often produce micropeptides of <100 amino acids with a wide array of biological functions. Some micropeptide sequences are well conserved, whereas others have no evolutionary conservation, potentially representing new functional compounds that arise from species-specific adaptations. To date, few reports have described the discovery of novel micropeptides of the innate immune system. The diversity of immune-related micropeptides is a blind spot for gene and functional annotation. Immune-related micropeptides represent a potential reservoir of untapped compounds for understanding and treating disease. This review consolidates what is currently known about the evolution and function of innate immune-related micropeptides to facilitate their investigation.
Affiliation(s)
- Praveena Naidu
  - Graduate Center, Programs in Biology, Biochemistry, Chemistry, City University of New York, New York, New York, USA
  - Department of Chemistry and Biochemistry, City University of New York, Hunter College, Belfer Research Building, New York, New York, USA
- Mandë Holford
  - Graduate Center, Programs in Biology, Biochemistry, Chemistry, City University of New York, New York, New York, USA
  - Department of Chemistry and Biochemistry, City University of New York, Hunter College, Belfer Research Building, New York, New York, USA
  - American Museum of Natural History, Invertebrate Zoology, Sackler Institute for Comparative Genomics, New York, New York, USA
  - Weill Cornell Medicine, Department of Biochemistry, New York, New York, USA
2
Vasylieva V, Arefiev I, Bourassa F, Trifiro FA, Brunet MA. Proteomics Can Rise to the Challenge of Pseudogenes' Coding Nature. J Proteome Res 2024. [PMID: 39486438 DOI: 10.1021/acs.jproteome.4c00116] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 11/04/2024]
Abstract
Throughout the past decade, technological advances in genomics and transcriptomics have revealed pervasive translation throughout mammalian genomes. These putative proteins are usually excluded from proteomics analyses, as they are absent from common protein repositories. A sizable portion of these noncanonical proteins is translated from pseudogenes. Pseudogenes are commonly termed defective copies of coding genes unable to produce proteins. Here, we suggest that proteomics can help in their annotation. First, we define important terms and review specific examples underlining the caveats in pseudogene annotation and their coding potential. Then, we discuss the challenges inherent to pseudogenes that have thus far made their confident identification in omics data complex. Finally, we identify recent developments in experimental procedures, instrumentation, and computational methods in proteomics that put the field in a unique position to solve the pseudogene annotation conundrum.
Affiliation(s)
- Valeriia Vasylieva
  - Pediatrics Department, Université de Sherbrooke, Sherbrooke, Québec J1K 2R1, Canada
  - Centre de Recherche du Centre hospitalier de l'université de Sherbrooke (CRCHUS), Sherbrooke, Québec J1E 4K8, Canada
- Ihor Arefiev
  - Pediatrics Department, Université de Sherbrooke, Sherbrooke, Québec J1K 2R1, Canada
  - Centre de Recherche du Centre hospitalier de l'université de Sherbrooke (CRCHUS), Sherbrooke, Québec J1E 4K8, Canada
- Francis Bourassa
  - Pediatrics Department, Université de Sherbrooke, Sherbrooke, Québec J1K 2R1, Canada
  - Centre de Recherche du Centre hospitalier de l'université de Sherbrooke (CRCHUS), Sherbrooke, Québec J1E 4K8, Canada
- Félix-Antoine Trifiro
  - Pediatrics Department, Université de Sherbrooke, Sherbrooke, Québec J1K 2R1, Canada
  - Centre de Recherche du Centre hospitalier de l'université de Sherbrooke (CRCHUS), Sherbrooke, Québec J1E 4K8, Canada
- Marie A Brunet
  - Pediatrics Department, Université de Sherbrooke, Sherbrooke, Québec J1K 2R1, Canada
  - Centre de Recherche du Centre hospitalier de l'université de Sherbrooke (CRCHUS), Sherbrooke, Québec J1E 4K8, Canada
3
Yi Q, Feng J, Lan W, Shi H, Sun W, Sun W. CircRNA and lncRNA-encoded peptide in diseases, an update review. Mol Cancer 2024; 23:214. [PMID: 39343883 PMCID: PMC11441268 DOI: 10.1186/s12943-024-02131-7] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 07/25/2024] [Accepted: 09/19/2024] [Indexed: 10/01/2024] Open
Abstract
Non-coding RNAs (ncRNAs), including circular RNAs (circRNAs) and long non-coding RNAs (lncRNAs), are unique RNA molecules widely identified in the eukaryotic genome. Their dysregulation has been found to play key roles in the pathogenesis of numerous diseases, including various cancers. Although these ncRNAs were previously considered devoid of protein-coding ability, recent research has revealed that a small number of open reading frames (ORFs) within them endow them with the potential for protein coding. These ncRNA-derived peptides or proteins have been proven to regulate various physiological and pathological processes through diverse mechanisms. Their emerging roles in disease diagnosis and targeted therapy underscore their potential utility in clinical settings. This comprehensive review aims to provide a systematic overview of proteins or peptides encoded by lncRNAs and circRNAs, elucidate their production and functional mechanisms, and explore their promising applications in cancer diagnosis, disease prediction, and targeted therapy.
Affiliation(s)
- Qian Yi
  - Department of Physiology, School of Basic Medical Sciences, Southwest Medical University, Luzhou, Sichuan, 646099, China
- Jianguo Feng
  - Department of Anesthesiology, The Affiliated Hospital of Southwest Medical University, Luzhou, Sichuan, China
- Weiwu Lan
  - Department of Orthopedics, Shenzhen Second People's Hospital/First Affiliated Hospital of Shenzhen University Health Science Center, Shenzhen, Guangdong, 518035, China
- Houyin Shi
  - Department of Orthopedics, The Affiliated Traditional Chinese Medicine Hospital of Southwest Medical University, Luzhou, 646000, China
- Wei Sun
  - Department of Orthopedics, Shenzhen Second People's Hospital/First Affiliated Hospital of Shenzhen University Health Science Center, Shenzhen, Guangdong, 518035, China
- Weichao Sun
  - Department of Orthopedics, Shenzhen Second People's Hospital/First Affiliated Hospital of Shenzhen University Health Science Center, Shenzhen, Guangdong, 518035, China
4
Zhang Y. LncRNA-encoded peptides in cancer. J Hematol Oncol 2024; 17:66. [PMID: 39135098 PMCID: PMC11320871 DOI: 10.1186/s13045-024-01591-0] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 06/28/2024] [Accepted: 08/05/2024] [Indexed: 08/15/2024] Open
Abstract
Long non-coding RNAs (lncRNAs), once considered transcriptional noise, have emerged as critical regulators of gene expression and key players in cancer biology. Recent breakthroughs have revealed that certain lncRNAs can encode small open reading frame (sORF)-derived peptides, which are now understood to contribute to the pathogenesis of various cancers. This review synthesizes current knowledge on the detection, functional roles, and clinical implications of lncRNA-encoded peptides in cancer. We discuss technological advancements in the detection and validation of sORFs, including ribosome profiling and mass spectrometry, which have facilitated the discovery of these peptides. The functional roles of lncRNA-encoded peptides in cancer processes such as gene transcription, translation regulation, signal transduction, and metabolic reprogramming are explored in various types of cancer. The clinical potential of these peptides is highlighted, with a focus on their utility as diagnostic biomarkers, prognostic indicators, and therapeutic targets. The challenges and future directions in translating these findings into clinical practice are also discussed, including the need for large-scale validation, development of sensitive detection methods, and optimization of peptide stability and delivery.
Affiliation(s)
- Yaguang Zhang
  - Laboratory of Gastrointestinal Tumor Epigenetics and Genomics, Frontiers Science Center for Disease-Related Molecular Network, West China Hospital, Sichuan University, Chengdu, 610041, People's Republic of China
5
Genth J, Schäfer K, Cassidy L, Graspeuntner S, Rupp J, Tholey A. Identification of proteoforms of short open reading frame-encoded peptides in Blautia producta under different cultivation conditions. Microbiol Spectr 2023; 11:e0252823. [PMID: 37782090 PMCID: PMC10715070 DOI: 10.1128/spectrum.02528-23] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 06/16/2023] [Accepted: 08/14/2023] [Indexed: 10/03/2023] Open
Abstract
IMPORTANCE The identification of short open reading frame-encoded peptides (SEP) and different proteoforms in single cultures of gut microbes offers new insights into a largely neglected part of the microbial proteome landscape. This is of particular importance as SEP provide various predicted functions, such as acting as antimicrobial peptides, maintaining cell homeostasis under stress conditions, or even contributing to the virulence pattern. They thus play a poorly understood role in the structure and function of microbial networks in the human body. A better understanding of SEP in the context of human health requires a precise understanding of the abundance of SEP in both commensal microbes and pathogens. For the gut-beneficial B. producta, we demonstrate the importance of specific environmental conditions for the biosynthesis of SEP, expanding previous findings about their role in microbial interactions.
Affiliation(s)
- Jerome Genth
  - Systematic Proteome Research & Bioanalytics, Institute for Experimental Medicine, Christian-Albrechts-Universität zu Kiel, Kiel, Germany
- Kathrin Schäfer
  - Department of Infectious Diseases and Microbiology, University of Lübeck, Lübeck, Germany
- Liam Cassidy
  - Systematic Proteome Research & Bioanalytics, Institute for Experimental Medicine, Christian-Albrechts-Universität zu Kiel, Kiel, Germany
- Simon Graspeuntner
  - Department of Infectious Diseases and Microbiology, University of Lübeck, Lübeck, Germany
  - German Center for Infection Research (DZIF), Partner Site Hamburg-Lübeck-Borstel-Riems, Lübeck, Germany
- Jan Rupp
  - Department of Infectious Diseases and Microbiology, University of Lübeck, Lübeck, Germany
  - German Center for Infection Research (DZIF), Partner Site Hamburg-Lübeck-Borstel-Riems, Lübeck, Germany
- Andreas Tholey
  - Systematic Proteome Research & Bioanalytics, Institute for Experimental Medicine, Christian-Albrechts-Universität zu Kiel, Kiel, Germany
6
Xie L, Bowman ME, Louie GV, Zhang C, Ardejani MS, Huang X, Chu Q, Donaldson CJ, Vaughan JM, Shan H, Powers ET, Kelly JW, Lyumkis D, Noel JP, Saghatelian A. Biochemistry and Protein Interactions of the CYREN Microprotein. Biochemistry 2023; 62:3050-3060. [PMID: 37813856 DOI: 10.1021/acs.biochem.3c00397] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 10/11/2023]
Abstract
Over the past decade, advances in genomics have identified thousands of additional protein-coding small open reading frames (smORFs) missed by traditional gene finding approaches. These smORFs encode peptides and small proteins, commonly termed micropeptides or microproteins. Several of these newly discovered microproteins have biological functions and operate through interactions with proteins and protein complexes within the cell. CYREN1 is a characterized microprotein that regulates double-strand break repair in mammalian cells through interaction with the Ku70/80 heterodimer. Ku70/80 binds to and stabilizes double-strand breaks and recruits the machinery needed for nonhomologous end joining (NHEJ) repair. In this study, we examined the biochemical properties of CYREN1 to better understand and explain its cellular protein interactions. Our findings support that CYREN1 is an intrinsically disordered microprotein and that this disordered structure allows it to enrich several proteins, including a newly discovered interaction with SF3B1 via a distinct short linear motif (SLiM) on CYREN1. Since many microproteins are predicted to be disordered, CYREN1 is an exemplar of how microproteins interact with other proteins and reveals an unknown scaffolding function of this microprotein that may link NHEJ and splicing.
Affiliation(s)
- Lina Xie
  - Clayton Foundation Peptide Biology Laboratories, The Salk Institute for Biological Studies, 10010 North Torrey Pines Road, La Jolla, California 92037, United States
- Marianne E Bowman
  - Jack H. Skirball Center for Chemical Biology and Proteomics, The Salk Institute for Biological Studies, 10010 North Torrey Pines Road, La Jolla, California 92037, United States
- Gordon V Louie
  - Jack H. Skirball Center for Chemical Biology and Proteomics, The Salk Institute for Biological Studies, 10010 North Torrey Pines Road, La Jolla, California 92037, United States
- Cheng Zhang
  - Laboratory of Genetics, The Salk Institute for Biological Studies, 10010 North Torrey Pines Road, La Jolla, California 92037, United States
- Maziar S Ardejani
  - Department of Chemistry and The Skaggs Institute for Chemical Biology, The Scripps Research Institute, La Jolla, California 92037, United States
- Xuemei Huang
  - Department of Chemistry and Biochemistry, University of California, San Diego, 9500 Gilman Drive, La Jolla, California 92037, United States
- Qian Chu
  - Department of Pharmacy, China Pharmaceutical University, Nanjing 210009, Jiangsu, China
- Cynthia J Donaldson
  - Clayton Foundation Peptide Biology Laboratories, The Salk Institute for Biological Studies, 10010 North Torrey Pines Road, La Jolla, California 92037, United States
- Joan M Vaughan
  - Clayton Foundation Peptide Biology Laboratories, The Salk Institute for Biological Studies, 10010 North Torrey Pines Road, La Jolla, California 92037, United States
- Huanqi Shan
  - Clayton Foundation Peptide Biology Laboratories, The Salk Institute for Biological Studies, 10010 North Torrey Pines Road, La Jolla, California 92037, United States
- Evan T Powers
  - Department of Chemistry and The Skaggs Institute for Chemical Biology, The Scripps Research Institute, La Jolla, California 92037, United States
- Jeffery W Kelly
  - Department of Chemistry and The Skaggs Institute for Chemical Biology, The Scripps Research Institute, La Jolla, California 92037, United States
- Dimitry Lyumkis
  - Laboratory of Genetics, The Salk Institute for Biological Studies, 10010 North Torrey Pines Road, La Jolla, California 92037, United States
- Joseph P Noel
  - Jack H. Skirball Center for Chemical Biology and Proteomics, The Salk Institute for Biological Studies, 10010 North Torrey Pines Road, La Jolla, California 92037, United States
- Alan Saghatelian
  - Clayton Foundation Peptide Biology Laboratories, The Salk Institute for Biological Studies, 10010 North Torrey Pines Road, La Jolla, California 92037, United States
7
Simoens L, Fijalkowski I, Van Damme P. Exposing the small protein load of bacterial life. FEMS Microbiol Rev 2023; 47:fuad063. [PMID: 38012116 PMCID: PMC10723866 DOI: 10.1093/femsre/fuad063] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 05/09/2023] [Revised: 11/10/2023] [Accepted: 11/24/2023] [Indexed: 11/29/2023] Open
Abstract
The ever-growing repertoire of genomic techniques continues to expand our understanding of the true diversity and richness of prokaryotic genomes. Riboproteogenomics laid the foundation for dynamic studies of previously overlooked genomic elements. Most strikingly, bacterial genomes were revealed to harbor robust repertoires of small open reading frames (sORFs) encoding a diverse and broadly expressed range of small proteins, or sORF-encoded polypeptides (SEPs). In recent years, continuous efforts led to great improvements in the annotation and characterization of such proteins, yet many challenges remain to fully comprehend the pervasive nature of small proteins and their impact on bacterial biology. In this work, we review the recent developments in the dynamic field of bacterial genome reannotation, catalog the important biological roles carried out by small proteins and identify challenges obstructing the way to full understanding of these elusive proteins.
Affiliation(s)
- Laure Simoens
  - iRIP Unit, Laboratory of Microbiology, Department of Biochemistry and Microbiology, Ghent University, K. L. Ledeganckstraat 35, 9000 Ghent, Belgium
- Igor Fijalkowski
  - iRIP Unit, Laboratory of Microbiology, Department of Biochemistry and Microbiology, Ghent University, K. L. Ledeganckstraat 35, 9000 Ghent, Belgium
- Petra Van Damme
  - iRIP Unit, Laboratory of Microbiology, Department of Biochemistry and Microbiology, Ghent University, K. L. Ledeganckstraat 35, 9000 Ghent, Belgium
8
Leblanc S, Brunet MA, Jacques JF, Lekehal AM, Duclos A, Tremblay A, Bruggeman-Gascon A, Samandi S, Brunelle M, Cohen AA, Scott MS, Roucou X. Newfound Coding Potential of Transcripts Unveils Missing Members of Human Protein Communities. GENOMICS, PROTEOMICS & BIOINFORMATICS 2023; 21:515-534. [PMID: 36183975 PMCID: PMC10787177 DOI: 10.1016/j.gpb.2022.09.008] [Citation(s) in RCA: 2] [Impact Index Per Article: 2.0] [Received: 07/01/2021] [Revised: 08/10/2022] [Accepted: 09/26/2022] [Indexed: 06/16/2023]
Abstract
Recent proteogenomic approaches have led to the discovery that regions of the transcriptome previously annotated as non-coding regions [i.e., untranslated regions (UTRs), open reading frames overlapping annotated coding sequences in a different reading frame, and non-coding RNAs] frequently encode proteins, termed alternative proteins (altProts). This suggests that previously identified protein-protein interaction (PPI) networks are partially incomplete because altProts are not present in conventional protein databases. Here, we used the proteogenomic resource OpenProt and a combined spectrum- and peptide-centric analysis for the re-analysis of a high-throughput human network proteomics dataset, thereby revealing the presence of 261 altProts in the network. We found 19 genes encoding both an annotated (reference) and an alternative protein interacting with each other. Of the 117 altProts encoded by pseudogenes, 38 are direct interactors of reference proteins encoded by their respective parental genes. Finally, we experimentally validate several interactions involving altProts. These data improve the blueprints of the human PPI network and suggest functional roles for hundreds of altProts.
Affiliation(s)
- Sébastien Leblanc
  - Department of Biochemistry and Functional Genomics, Université de Sherbrooke, Sherbrooke, QC J1E 4K8, Canada; PROTEO, Quebec Network for Research on Protein Function, Structure, and Engineering, Quebec City, QC G1V 0A6, Canada
- Marie A Brunet
  - Department of Biochemistry and Functional Genomics, Université de Sherbrooke, Sherbrooke, QC J1E 4K8, Canada; PROTEO, Quebec Network for Research on Protein Function, Structure, and Engineering, Quebec City, QC G1V 0A6, Canada
- Jean-François Jacques
  - Department of Biochemistry and Functional Genomics, Université de Sherbrooke, Sherbrooke, QC J1E 4K8, Canada; PROTEO, Quebec Network for Research on Protein Function, Structure, and Engineering, Quebec City, QC G1V 0A6, Canada
- Amina M Lekehal
  - Department of Biochemistry and Functional Genomics, Université de Sherbrooke, Sherbrooke, QC J1E 4K8, Canada; PROTEO, Quebec Network for Research on Protein Function, Structure, and Engineering, Quebec City, QC G1V 0A6, Canada
- Andréa Duclos
  - Department of Biochemistry and Functional Genomics, Université de Sherbrooke, Sherbrooke, QC J1E 4K8, Canada
- Alexia Tremblay
  - Department of Biochemistry and Functional Genomics, Université de Sherbrooke, Sherbrooke, QC J1E 4K8, Canada
- Alexis Bruggeman-Gascon
  - Department of Biochemistry and Functional Genomics, Université de Sherbrooke, Sherbrooke, QC J1E 4K8, Canada
- Sondos Samandi
  - Department of Biochemistry and Functional Genomics, Université de Sherbrooke, Sherbrooke, QC J1E 4K8, Canada; PROTEO, Quebec Network for Research on Protein Function, Structure, and Engineering, Quebec City, QC G1V 0A6, Canada
- Mylène Brunelle
  - Department of Biochemistry and Functional Genomics, Université de Sherbrooke, Sherbrooke, QC J1E 4K8, Canada; PROTEO, Quebec Network for Research on Protein Function, Structure, and Engineering, Quebec City, QC G1V 0A6, Canada
- Alan A Cohen
  - Department of Family Medicine, Université de Sherbrooke, Sherbrooke, QC J1H 5N4, Canada
- Michelle S Scott
  - Department of Biochemistry and Functional Genomics, Université de Sherbrooke, Sherbrooke, QC J1E 4K8, Canada
- Xavier Roucou
  - Department of Biochemistry and Functional Genomics, Université de Sherbrooke, Sherbrooke, QC J1E 4K8, Canada; PROTEO, Quebec Network for Research on Protein Function, Structure, and Engineering, Quebec City, QC G1V 0A6, Canada
9
Cassidy L, Kaulich PT, Tholey A. Proteoforms expand the world of microproteins and short open reading frame-encoded peptides. iScience 2023; 26:106069. [PMID: 36818287 PMCID: PMC9929600 DOI: 10.1016/j.isci.2023.106069] [Citation(s) in RCA: 11] [Impact Index Per Article: 11.0] [Indexed: 01/28/2023] Open
Abstract
Microproteins and short open reading frame-encoded peptides (SEPs) can, like all proteins, carry numerous posttranslational modifications. Together with posttranscriptional processes, this leads to a high number of possible distinct protein molecules, the proteoforms, out of a limited number of genes. The identification, quantification, and molecular characterization of proteoforms pose special challenges to established, mainly bottom-up proteomics (BUP) based analytical approaches. While BUP methods are powerful, proteins have to be inferred rather than directly identified, which hampers the detection of proteoforms. An alternative approach is top-down proteomics (TDP), which allows the identification of intact proteoforms. This perspective article provides a brief overview of modified microproteins and SEPs, introduces the proteoform terminology, and compares present BUP and TDP workflows, highlighting their major advantages and caveats. Necessary future developments in TDP to fully accentuate its potential for proteoform-centric analytics of microproteins and SEPs are discussed.
Affiliation(s)
- Liam Cassidy
  - Systematic Proteome Research & Bioanalytics, Institute for Experimental Medicine, Christian-Albrechts-Universität zu Kiel, 24105 Kiel, Germany
- Philipp T. Kaulich
  - Systematic Proteome Research & Bioanalytics, Institute for Experimental Medicine, Christian-Albrechts-Universität zu Kiel, 24105 Kiel, Germany
- Andreas Tholey
  - Systematic Proteome Research & Bioanalytics, Institute for Experimental Medicine, Christian-Albrechts-Universität zu Kiel, 24105 Kiel, Germany; corresponding author
10
Peptidomics as a tool to analyze endogenous peptides in milk and milk-related peptides. FOOD BIOSCI 2022. [DOI: 10.1016/j.fbio.2022.102199] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 11/14/2022]
11
Probing the sORF-Encoded Peptides of Deinococcus radiodurans in Response to Extreme Stress. Mol Cell Proteomics 2022; 21:100423. [PMID: 36210010 PMCID: PMC9650054 DOI: 10.1016/j.mcpro.2022.100423] [Citation(s) in RCA: 5] [Impact Index Per Article: 2.5] [Received: 05/12/2022] [Revised: 09/27/2022] [Accepted: 10/03/2022] [Indexed: 11/09/2022] Open
Abstract
Organisms have developed different mechanisms to respond to stresses. However, the roles of small ORF-encoded peptides (SEPs) in these regulatory systems remain elusive, which is partially because of the lack of comprehensive knowledge regarding these biomolecules. We chose the extremophile Deinococcus radiodurans R1 as a model species and conducted large-scale profiling of the SEPs related to the stress response. The integrated workflow consisting of multiple omics approaches for SEP identification was streamlined, and an SEPome of D. radiodurans containing 109 novel and high-confidence SEPs was drafted. Forty-four percent of these SEPs were predicted to function as antimicrobial peptides. Quantitative peptidomics analysis indicated that the expression of SEP068184 was upregulated upon oxidative treatment and gamma irradiation of the bacteria. SEP068184 was conserved in Deinococcus and exhibited negative regulation of oxidative stress resistance in a comparative phenotypic assay of its mutants. Further quantitative and interactive proteomics analyses suggested that SEP068184 might function through metabolic pathways and interact with cytoplasmic proteins. Collectively, our findings demonstrate that SEPs are involved in the regulation of oxidative resistance, and the SEPome dataset provides a rich resource for research on the molecular mechanisms of the response to extreme stress in organisms.
12
The Emerging Roles of Long Non-Coding RNAs in Intellectual Disability and Related Neurodevelopmental Disorders. Int J Mol Sci 2022; 23:ijms23116118. [PMID: 35682796 PMCID: PMC9181295 DOI: 10.3390/ijms23116118] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 04/30/2022] [Revised: 05/23/2022] [Accepted: 05/27/2022] [Indexed: 02/05/2023] Open
Abstract
In the human brain, long non-coding RNAs (lncRNAs) are widely expressed in an exquisitely temporally and spatially regulated manner, thus suggesting their contribution to normal brain development and their probable involvement in the molecular pathology of neurodevelopmental disorders (NDD). Bypassing the classic protein-centric conception of disease mechanisms, some studies have been conducted to identify and characterize the putative roles of non-coding sequences in the genetic pathogenesis and diagnosis of complex diseases. However, their involvement in NDD, and more specifically in intellectual disability (ID), is still poorly documented, and only a few genomic alterations affecting lncRNA function and/or expression have been causally linked to the disease endophenotype. Considering that a significant fraction of patients still lacks a genetic or molecular explanation, we expect that a deeper investigation of the non-coding genome will unravel novel pathogenic mechanisms, opening new translational opportunities. Here, we present evidence of the possible involvement of many lncRNAs in the etiology of different forms of ID and NDD, grouping the candidate disease-genes in the most frequently affected cellular processes in which ID-risk genes were previously collected. We also illustrate new approaches for the identification and prioritization of NDD-risk lncRNAs, together with the current strategies to exploit them in diagnosis.
13
Fijalkowski I, Willems P, Jonckheere V, Simoens L, Van Damme P. Hidden in plain sight: challenges in proteomics detection of small ORF-encoded polypeptides. MICROLIFE 2022; 3:uqac005. [PMID: 37223358 PMCID: PMC10117744 DOI: 10.1093/femsml/uqac005] [Citation(s) in RCA: 4] [Impact Index Per Article: 2.0] [Received: 11/13/2021] [Revised: 04/18/2022] [Accepted: 04/29/2022] [Indexed: 05/25/2023]
Abstract
Genomic studies of bacteria have long pointed toward widespread prevalence of small open reading frames (sORFs) encoding short proteins, <100 amino acids in length. Despite the mounting genomic evidence of their robust expression, relatively little progress has been made in their mass spectrometry-based detection and various blanket statements have been used to explain this observed discrepancy. In this study, we provide a large-scale riboproteogenomics investigation of the challenging nature of proteomic detection of such small proteins as informed by conditional translation data. A panel of physicochemical properties alongside recently developed mass spectrometry detectability metrics was interrogated to provide a comprehensive evidence-based assessment of sORF-encoded polypeptide (SEP) detectability. Moreover, a large-scale proteomics and translatomics compendium of proteins produced by Salmonella Typhimurium (S. Typhimurium), a model human pathogen, across a panel of growth conditions is presented and used in support of our in silico SEP detectability analysis. This integrative approach is used to provide a data-driven census of small proteins expressed by S. Typhimurium across growth phases and infection-relevant conditions. Taken together, our study pinpoints current limitations in proteomics-based detection of novel small proteins currently missing from bacterial genome annotations.
Collapse
Affiliation(s)
- Igor Fijalkowski
- iRIP Unit, Laboratory of Microbiology, Department of Biochemistry and Microbiology, Ghent University, 9000 Ghent, Belgium
| | - Patrick Willems
- iRIP Unit, Laboratory of Microbiology, Department of Biochemistry and Microbiology, Ghent University, 9000 Ghent, Belgium
| | - Veronique Jonckheere
- iRIP Unit, Laboratory of Microbiology, Department of Biochemistry and Microbiology, Ghent University, 9000 Ghent, Belgium
| | - Laure Simoens
- iRIP Unit, Laboratory of Microbiology, Department of Biochemistry and Microbiology, Ghent University, 9000 Ghent, Belgium
| | - Petra Van Damme
- iRIP Unit, Laboratory of Microbiology, Department of Biochemistry and Microbiology, Ghent University, 9000 Ghent, Belgium
|
14
|
Identification and characterisation of sPEPs in Cryptococcus neoformans. Fungal Genet Biol 2022; 160:103688. [PMID: 35339703 DOI: 10.1016/j.fgb.2022.103688] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 01/04/2022] [Revised: 03/02/2022] [Accepted: 03/21/2022] [Indexed: 11/24/2022]
Abstract
Short open reading frame (sORF)-encoded peptides (sPEPs) have been found across a wide range of genomic locations in a variety of species. To date, their identification, validation, and characterisation in the human fungal pathogen Cryptococcus neoformans has been limited due to a lack of standardised protocols. We have developed an enrichment process that enables sPEP detection within a protein sample from this polysaccharide-encapsulated yeast, and implemented proteogenomics to provide insights into the validity of predicted and hypothetical sORFs annotated in the C. neoformans genome. Novel sORFs were discovered within the 5' and 3' UTRs of known transcripts as well as in "non-coding" RNAs. One novel candidate, dubbed NPB1, that resided in an RNA annotated as "non-coding", was chosen for characterisation. Through the creation of both specific point mutations and a full deletion allele, the function of the new sPEP, Npb1, was shown to resemble that of the bacterial trans-translation protein SmpB.
|
15
|
Leong AZX, Lee PY, Mohtar MA, Syafruddin SE, Pung YF, Low TY. Short open reading frames (sORFs) and microproteins: an update on their identification and validation measures. J Biomed Sci 2022; 29:19. [PMID: 35300685 PMCID: PMC8928697 DOI: 10.1186/s12929-022-00802-5] [Citation(s) in RCA: 21] [Impact Index Per Article: 10.5] [Received: 12/31/2021] [Accepted: 03/09/2022] [Indexed: 12/17/2022] Open
Abstract
A short open reading frame (sORF) comprises ≤ 300 bases and encodes a microprotein, or sORF-encoded protein (SEP), of ≤ 100 amino acids. Traditionally dismissed by genome annotation pipelines as meaningless noise, sORFs were found to possess coding potential with ribosome profiling (RIBO-Seq), which unveiled sORF-based transcripts at various genome locations. Nonetheless, the existence of corresponding microproteins that are stable and functional was initially little substantiated by experimental evidence. With recent advancements in multi-omics, the identification, validation, and functional characterisation of sORFs and microproteins have become feasible. In this review, we discuss the history and development of the emerging research field of sORFs and microproteins. In particular, we focus on an array of bioinformatics and OMICS approaches used for predicting, sequencing, validating, and characterizing these recently discovered entities. These strategies include RIBO-Seq, which detects sORF transcripts via ribosome footprints, and mass spectrometry (MS)-based proteomics for sequencing the resultant microproteins. Subsequently, our discussion extends to the functional characterisation of microproteins by incorporating CRISPR/Cas9 screens and protein–protein interaction (PPI) studies. Our review not only discusses detection methodologies but also highlights the challenges and potential solutions in identifying and validating sORFs and their microproteins. The novelty of this review lies in its focus on validating the functional role of microproteins, which could contribute towards the future landscape of microproteomics.
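The size definition in this abstract (≤ 300 bases, ≤ 100 amino acids) can be illustrated with a minimal Python sketch. It is not the method of any cited study: the function name `find_sorfs`, the `min_aa` lower cutoff, the restriction to the forward strand, and the requirement of an ATG start codon are all assumptions made for illustration, and real pipelines rely on RIBO-Seq or MS evidence rather than sequence scanning alone.

```python
# Illustrative sketch only: scan the forward strand for ATG...stop ORFs
# that satisfy the size definition quoted above (<= 100 amino acids,
# i.e. <= 300 coding bases when the stop codon is not counted).
MAX_SORF_AA = 100                      # upper bound from the abstract
STOP_CODONS = {"TAA", "TAG", "TGA"}    # standard genetic code stops


def find_sorfs(seq, min_aa=2):
    """Return a list of (start, end, n_aa) tuples for candidate sORFs.

    start/end are 0-based, end-exclusive and include the stop codon.
    n_aa counts codons from ATG up to, but not including, the stop.
    min_aa is a hypothetical lower cutoff, not taken from the source.
    """
    seq = seq.upper()
    hits = []
    for frame in range(3):
        start = None
        for i in range(frame, len(seq) - 2, 3):
            codon = seq[i:i + 3]
            if start is None and codon == "ATG":
                start = i                      # first in-frame start codon
            elif start is not None and codon in STOP_CODONS:
                n_aa = (i - start) // 3        # codons before the stop
                if min_aa <= n_aa <= MAX_SORF_AA:
                    hits.append((start, i + 3, n_aa))
                start = None                   # look for the next ORF
    return hits


if __name__ == "__main__":
    demo = "CC" + "ATG" + "GCT" * 3 + "TAA" + "GG"
    print(find_sorfs(demo))
```

Applied to the toy sequence in the `__main__` block, the scan reports one candidate of four codons; an ORF of 101 codons would be rejected by the same size filter.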
Affiliation(s)
- Alyssa Zi-Xin Leong
- UKM Medical Molecular Biology Institute (UMBI), Universiti Kebangsaan Malaysia, 56000, Kuala Lumpur, Malaysia
| | - Pey Yee Lee
- UKM Medical Molecular Biology Institute (UMBI), Universiti Kebangsaan Malaysia, 56000, Kuala Lumpur, Malaysia
| | - M Aiman Mohtar
- UKM Medical Molecular Biology Institute (UMBI), Universiti Kebangsaan Malaysia, 56000, Kuala Lumpur, Malaysia
| | - Saiful Effendi Syafruddin
- UKM Medical Molecular Biology Institute (UMBI), Universiti Kebangsaan Malaysia, 56000, Kuala Lumpur, Malaysia
| | - Yuh-Fen Pung
- Division of Biomedical Science, School of Pharmacy, University of Nottingham Malaysia, Semenyih, 43500, Selangor, Malaysia
| | - Teck Yew Low
- UKM Medical Molecular Biology Institute (UMBI), Universiti Kebangsaan Malaysia, 56000, Kuala Lumpur, Malaysia.
|
16
|
Liu W, He QY, Brunet MA. Editorial: Emerging Proteins and Polypeptides Expressed by "Non-Coding RNAs". Front Cell Dev Biol 2022; 10:862870. [PMID: 35265627 PMCID: PMC8899286 DOI: 10.3389/fcell.2022.862870] [Citation(s) in RCA: 1] [Impact Index Per Article: 0.5] [Received: 01/26/2022] [Accepted: 01/31/2022] [Indexed: 11/29/2022] Open
Affiliation(s)
- Wanting Liu
- MOE Key Laboratory of Tumor Molecular Biology and Key Laboratory of Functional Protein Research of Guangdong Higher Education Institutes, Institute of Life and Health Engineering, Jinan University, Guangzhou, China
| | - Qing-Yu He
- MOE Key Laboratory of Tumor Molecular Biology and Key Laboratory of Functional Protein Research of Guangdong Higher Education Institutes, Institute of Life and Health Engineering, Jinan University, Guangzhou, China
| | - Marie A Brunet
- Department of Pediatrics, Medical Genetics Service, Université de Sherbrooke, Sherbrooke, QC, Canada.,Centre de Recherche du Centre Hospitalier Universitaire de Sherbrooke, Sherbrooke, QC, Canada
|
17
|
Small open reading frames in plant research: from prediction to functional characterization. 3 Biotech 2022; 12:76. [PMID: 35251879 PMCID: PMC8873315 DOI: 10.1007/s13205-022-03147-w] [Citation(s) in RCA: 8] [Impact Index Per Article: 4.0] [Received: 05/28/2021] [Accepted: 02/11/2022] [Indexed: 11/01/2022] Open
Abstract
Gene prediction is a laborious and time-consuming task. The advancement of sequencing technologies and bioinformatics tools, coupled with the accelerated development of ribosome profiling and mass spectrometry, has made the identification of small open reading frames (sORFs) (< 100 codons) in various plant genomes possible. The past 50 years have seen sORFs being isolated from many organisms. However, a comprehensive sORF annotation pipeline is not yet available, a gap that we address in this review. Here, we also provide current information on the classification and functions of plant sORFs and their potential applications in crop improvement programs.
|
18
|
Hu XL, Zhang J, Kaundal R, Kataria R, Labbé JL, Mitchell JC, Tschaplinski TJ, Tuskan GA, Cheng ZM(M), Yang X. Diversity and conservation of plant small secreted proteins associated with arbuscular mycorrhizal symbiosis. Horticulture Research 2022; 9:uhac043. [PMID: 35184190 PMCID: PMC8985099 DOI: 10.1093/hr/uhac043] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 08/19/2021] [Accepted: 01/18/2022] [Indexed: 05/12/2023]
Abstract
Arbuscular mycorrhizal symbiosis (AMS) is a widespread mutualistic association between plants and fungi, which plays an essential role in nutrient exchange, enhancement of plant stress resistance, host development, and ecosystem sustainability. Previous studies have shown that plant small secreted proteins (SSPs) are involved in beneficial symbiotic interactions. However, the role of SSPs in the evolution of AMS has not yet been well studied. In this study, we performed computational analysis of SSPs in 60 plant species and identified three AMS-specific ortholog groups, containing SSPs exclusively from AMS species and represented in at least 30% of the AMS species in this study, and three AMS-preferential ortholog groups, containing SSPs from both AMS and non-AMS species, with AMS species containing significantly more SSPs than non-AMS species. We found that independent lineages of monocot and eudicot plants contained genes in the AMS-specific ortholog groups and had significant expansion in the AMS-preferential ortholog groups. Also, two AMS-preferential ortholog groups showed convergent changes, between monocot and eudicot species, in gene expression in response to the arbuscular mycorrhizal fungus Rhizophagus irregularis. Furthermore, conserved cis-elements were identified in the promoter regions of the genes showing convergent gene expression. We found that the SSPs, and their closely related homologs, in each of the three AMS-preferential ortholog groups had some local variations in the protein structural alignment. We also identified genes co-expressed with the Populus trichocarpa SSP genes in the AMS-preferential ortholog groups. This first plant kingdom-wide analysis of SSPs provides insights into plant-AMS convergent evolution with specific SSP gene expression and local diversification of protein structures.
Affiliation(s)
- Xiao-Li Hu
- Department of Plant Sciences, University of Tennessee, Knoxville, TN 37996, USA
- Biosciences Division, Oak Ridge National Laboratory, Oak Ridge, TN 37831, USA
| | - Jin Zhang
- State Key Laboratory of Subtropical Silviculture, Zhejiang A&F University, Hangzhou, Zhejiang 311300, China
| | - Rakesh Kaundal
- Department of Plants, Soils and Climate, Utah State University, Logan, UT 84322, USA
| | - Raghav Kataria
- Department of Plants, Soils and Climate, Utah State University, Logan, UT 84322, USA
| | - Jesse L Labbé
- Biosciences Division, Oak Ridge National Laboratory, Oak Ridge, TN 37831, USA
| | - Julie C Mitchell
- Biosciences Division, Oak Ridge National Laboratory, Oak Ridge, TN 37831, USA
| | - Timothy J Tschaplinski
- Biosciences Division, Oak Ridge National Laboratory, Oak Ridge, TN 37831, USA
- The Center for Bioenergy Innovation, Oak Ridge National Laboratory, Oak Ridge, TN 37831, USA
| | - Gerald A Tuskan
- Biosciences Division, Oak Ridge National Laboratory, Oak Ridge, TN 37831, USA
- The Center for Bioenergy Innovation, Oak Ridge National Laboratory, Oak Ridge, TN 37831, USA
| | - Zong-Ming (Max) Cheng
- Department of Plant Sciences, University of Tennessee, Knoxville, TN 37996, USA
- College of Horticulture, Nanjing Agricultural University, Nanjing, Jiangsu 210095 China
| | - Xiaohan Yang
- Department of Plant Sciences, University of Tennessee, Knoxville, TN 37996, USA
- Biosciences Division, Oak Ridge National Laboratory, Oak Ridge, TN 37831, USA
- The Center for Bioenergy Innovation, Oak Ridge National Laboratory, Oak Ridge, TN 37831, USA
|
19
|
Senís E, Esgleas M, Najas S, Jiménez-Sábado V, Bertani C, Giménez-Alejandre M, Escriche A, Ruiz-Orera J, Hergueta-Redondo M, Jiménez M, Giralt A, Nuciforo P, Albà MM, Peinado H, Del Toro D, Hove-Madsen L, Götz M, Abad M. TUNAR lncRNA Encodes a Microprotein that Regulates Neural Differentiation and Neurite Formation by Modulating Calcium Dynamics. Front Cell Dev Biol 2022; 9:747667. [PMID: 35036403 PMCID: PMC8758570 DOI: 10.3389/fcell.2021.747667] [Citation(s) in RCA: 4] [Impact Index Per Article: 2.0] [Received: 07/26/2021] [Accepted: 11/03/2021] [Indexed: 11/13/2022] Open
Abstract
Long noncoding RNAs (lncRNAs) are regulatory molecules which have been traditionally considered as “non-coding”. Strikingly, recent evidence has demonstrated that many non-coding regions, including lncRNAs, do in fact contain small open reading frames that code for small proteins that have been called microproteins. Only a few of them have been characterized so far, but they display key functions in a wide variety of cellular processes. Here, we show that TUNAR lncRNA encodes an evolutionarily conserved microprotein expressed in the nervous system that we have named pTUNAR. pTUNAR deficiency in mouse embryonic stem cells improves their differentiation potential towards neural lineage both in vitro and in vivo. Conversely, pTUNAR overexpression impairs neuronal differentiation by reducing neurite formation in different model systems. At the subcellular level, pTUNAR is a transmembrane protein that localizes in the endoplasmic reticulum and interacts with the calcium transporter SERCA2. pTUNAR overexpression reduces cytoplasmic calcium, consistent with a possible role of pTUNAR as an activator of SERCA2. Altogether, our results suggest that our newly discovered microprotein has an important role in neural differentiation and neurite formation through the regulation of intracellular calcium. From a more general point of view, our results provide a proof of concept of the role of lncRNA-encoded microproteins in neural differentiation.
Affiliation(s)
- Elena Senís
- Cellular Plasticity and Cancer Group, Vall d'Hebron Institute of Oncology (VHIO), Barcelona, Spain
| | - Miriam Esgleas
- Physiological Genomics, Biomedical Center (BMC), Helmholtz Center Munich, Institute of Stem Cell Research, Großhaderner Str, SyNergy Excellence Cluster, Ludwig-Maximilians-Universitaet (LMU), Munich, Germany
| | - Sonia Najas
- Physiological Genomics, Biomedical Center (BMC), Helmholtz Center Munich, Institute of Stem Cell Research, Großhaderner Str, SyNergy Excellence Cluster, Ludwig-Maximilians-Universitaet (LMU), Munich, Germany
| | - Verónica Jiménez-Sábado
- Instituto de Investigación Biomédica Barcelona (IIBB-CSIC), Instituto de Investigación Biomédica Sant Pau (IIB-Sant Pau) and CIBERCV, Barcelona, Spain
| | - Camilla Bertani
- Cellular Plasticity and Cancer Group, Vall d'Hebron Institute of Oncology (VHIO), Barcelona, Spain
| | - Marta Giménez-Alejandre
- Cellular Plasticity and Cancer Group, Vall d'Hebron Institute of Oncology (VHIO), Barcelona, Spain
| | - Alba Escriche
- Cellular Plasticity and Cancer Group, Vall d'Hebron Institute of Oncology (VHIO), Barcelona, Spain
| | - Jorge Ruiz-Orera
- Cardiovascular and Metabolic Sciences, Max Delbrück Center for Molecular Medicine in the Helmholtz Association (MDC), Berlin, Germany
| | - Marta Hergueta-Redondo
- Microenvironment and Metastasis Laboratory, Molecular Oncology Programme, Spanish National Cancer Research Center (CNIO), Madrid, Spain
| | - Mireia Jiménez
- Cellular Plasticity and Cancer Group, Vall d'Hebron Institute of Oncology (VHIO), Barcelona, Spain
| | - Albert Giralt
- Department of Biological Sciences, Institute of Neurosciences, IDIBAPS, CIBERNED, University of Barcelona, Barcelona, Spain
| | - Paolo Nuciforo
- Molecular Oncology Group, Vall d'Hebron Institute of Oncology (VHIO), Barcelona, Spain
| | - M Mar Albà
- Evolutionary Genomics Group, Research Programme on Biomedical Informatics, Hospital del Mar Medical Research Institute (IMIM) and Universitat Pompeu Fabra (UPF), Barcelona, Spain.,Catalan Institution for Research and Advanced Studies (ICREA), Barcelona, Spain
| | - Héctor Peinado
- Microenvironment and Metastasis Laboratory, Molecular Oncology Programme, Spanish National Cancer Research Center (CNIO), Madrid, Spain
| | - Daniel Del Toro
- Department of Biological Sciences, Institute of Neurosciences, IDIBAPS, CIBERNED, University of Barcelona, Barcelona, Spain
| | - Leif Hove-Madsen
- Instituto de Investigación Biomédica Barcelona (IIBB-CSIC), Instituto de Investigación Biomédica Sant Pau (IIB-Sant Pau) and CIBERCV, Barcelona, Spain
| | - Magdalena Götz
- Physiological Genomics, Biomedical Center (BMC), Helmholtz Center Munich, Institute of Stem Cell Research, Großhaderner Str, SyNergy Excellence Cluster, Ludwig-Maximilians-Universitaet (LMU), Munich, Germany
| | - María Abad
- Cellular Plasticity and Cancer Group, Vall d'Hebron Institute of Oncology (VHIO), Barcelona, Spain
|
20
|
Chen L, Yang Y, Zhang Y, Li K, Cai H, Wang H, Zhao Q. The Small Open Reading Frame-Encoded Peptides: Advances in Methodologies and Functional Studies. Chembiochem 2021; 23:e202100534. [PMID: 34862721 DOI: 10.1002/cbic.202100534] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 10/06/2021] [Revised: 11/15/2021] [Indexed: 11/07/2022]
Abstract
Small open reading frames (sORFs) are an important class of genes with fewer than 100 codons. They were historically annotated as noncoding or even junk sequences. In recent years, accumulating evidence has suggested that sORFs could encode a considerable number of polypeptides, many of which play important roles in both physiology and disease pathology. However, it has been technically challenging to directly detect sORF-encoded peptides (SEPs). Here, we discuss the latest advances in methodologies for identifying SEPs with mass spectrometry, as well as the progress on functional studies of SEPs.
Affiliation(s)
- Lei Chen
- State Key Laboratory of Chemical Biology and Drug Discovery, Department of Applied Biology and Chemical Technology, Hong Kong Polytechnic University, Hung Hom, Hong Kong SAR, 999077, P. R. China.,Laboratory for Synthetic Chemistry and Chemical Biology Limited, Hong Kong Science and Technology Park, New Territories, Hong Kong SAR, 999077, P. R. China
| | - Ying Yang
- State Key Laboratory of Chemical Biology and Drug Discovery, Department of Applied Biology and Chemical Technology, Hong Kong Polytechnic University, Hung Hom, Hong Kong SAR, 999077, P. R. China
| | - Yuanliang Zhang
- State Key Laboratory of Chemical Biology and Drug Discovery, Department of Applied Biology and Chemical Technology, Hong Kong Polytechnic University, Hung Hom, Hong Kong SAR, 999077, P. R. China
| | - Kecheng Li
- State Key Laboratory of Chemical Biology and Drug Discovery, Department of Applied Biology and Chemical Technology, Hong Kong Polytechnic University, Hung Hom, Hong Kong SAR, 999077, P. R. China
| | - Hongmin Cai
- School of Computer Science and Engineering, South China University of Technology, Guangzhou, 510623, P. R. China
| | - Hongwei Wang
- State Key Laboratory of Ophthalmology, Zhongshan Ophthalmic Center, Sun Yat-sen University, Guangdong Provincial Key Laboratory of Ophthalmology and Visual Science, Guangzhou, 510623, P. R. China
| | - Qian Zhao
- State Key Laboratory of Chemical Biology and Drug Discovery, Department of Applied Biology and Chemical Technology, Hong Kong Polytechnic University, Hung Hom, Hong Kong SAR, 999077, P. R. China
|
21
|
Peeters MKR, Baggerman G, Gabriels R, Pepermans E, Menschaert G, Boonen K. Ion Mobility Coupled to a Time-of-Flight Mass Analyzer Combined With Fragment Intensity Predictions Improves Identification of Classical Bioactive Peptides and Small Open Reading Frame-Encoded Peptides. Front Cell Dev Biol 2021; 9:720570. [PMID: 34604223 PMCID: PMC8484717 DOI: 10.3389/fcell.2021.720570] [Citation(s) in RCA: 4] [Impact Index Per Article: 1.3] [Received: 06/04/2021] [Accepted: 08/25/2021] [Indexed: 12/29/2022] Open
Abstract
Bioactive peptides play key roles in a wide variety of complex processes, such as regulation of body weight, learning, aging, and the innate immune response. Next to the classical bioactive peptides, which emerge from larger precursor proteins by specific proteolytic processing, a new class of peptides originating from small open reading frames (sORFs) has been recognized as important biological regulators. But their intrinsic properties, specific expression pattern and location on presumed non-coding regions have hindered the full characterization of the repertoire of bioactive peptides, despite their predominant role in various pathways. Although the development of peptidomics has offered the opportunity to study these peptides in vivo, it remains challenging to identify the full peptidome, as the lack of cleavage enzyme specification and the large search space complicate conventional database search approaches. In this study, we introduce a proteogenomics methodology using a new type of mass spectrometry instrument and the implementation of machine learning tools toward improved identification of potential bioactive peptides in the mouse brain. The application of trapped ion mobility spectrometry (TIMS) coupled to a time-of-flight (TOF) mass analyzer offers improved sensitivity, enhanced peptide coverage, a reduction in chemical noise and a reduced occurrence of chimeric spectra. Subsequently, the machine learning tools MS2PIP, which predicts fragment ion intensities, and DeepLC, which predicts retention times, improve database searching against a large and comprehensive custom database containing both sORFs and alternative ORFs. Finally, the identification of peptides is further enhanced by applying the post-processing semi-supervised learning tool Percolator.
Applying this workflow, the first peptidomics workflow combined with spectral intensity and retention time predictions, we identified a total of 167 predicted sORF-encoded peptides, of which 48 originate from presumed non-coding locations, alongside 401 peptides from known neuropeptide precursors, linked to 66 annotated bioactive neuropeptides from 22 different families. Additional PEAKS analysis expanded the pool of SEPs at presumed non-coding locations to 84, while an additional 204 peptides completed the list of peptides from neuropeptide precursors. Altogether, this study provides insights into a new robust pipeline that fuses technological advancements from different fields, ensuring improved coverage of the neuropeptidome in the mouse brain.
Affiliation(s)
- Marlies K. R. Peeters
- BioBix, Department of Data Analysis and Mathematical Modelling, Ghent University, Ghent, Belgium
| | - Geert Baggerman
- Centre for Proteomics, University of Antwerp, Antwerp, Belgium
- Unit Environmental Risk and Health, Flemish Institute for Technological Research, Mol, Belgium
| | - Ralf Gabriels
- Department of Biomolecular Medicine, Ghent University, Ghent, Belgium
- VIB-UGent Center for Medical Biotechnology, Flanders Institute for Biotechnology, Ghent, Belgium
| | - Elise Pepermans
- Centre for Proteomics, University of Antwerp, Antwerp, Belgium
- Unit Environmental Risk and Health, Flemish Institute for Technological Research, Mol, Belgium
| | - Gerben Menschaert
- BioBix, Department of Data Analysis and Mathematical Modelling, Ghent University, Ghent, Belgium
- OHMX.bio, Ghent, Belgium
| | - Kurt Boonen
- Centre for Proteomics, University of Antwerp, Antwerp, Belgium
- Unit Environmental Risk and Health, Flemish Institute for Technological Research, Mol, Belgium
|
22
|
Verbruggen S, Gessulat S, Gabriels R, Matsaroki A, Van de Voorde H, Kuster B, Degroeve S, Martens L, Van Criekinge W, Wilhelm M, Menschaert G. Spectral Prediction Features as a Solution for the Search Space Size Problem in Proteogenomics. Mol Cell Proteomics 2021; 20:100076. [PMID: 33823297 PMCID: PMC8214147 DOI: 10.1016/j.mcpro.2021.100076] [Citation(s) in RCA: 25] [Impact Index Per Article: 8.3] [Received: 11/27/2020] [Revised: 03/04/2021] [Accepted: 03/25/2021] [Indexed: 11/17/2022] Open
Abstract
Proteogenomics approaches often struggle with the distinction between true and false peptide-to-spectrum matches as the database size enlarges. However, features extracted from tandem mass spectrometry intensity predictors can enhance the peptide identification rate and can provide extra confidence for peptide-to-spectrum matching in a proteogenomics context. To that end, features from the spectral intensity pattern predictors MS2PIP and Prosit were combined with the canonical scores from MaxQuant in the Percolator postprocessing tool for protein sequence databases constructed out of ribosome profiling and nanopore RNA-Seq analyses. The presented results provide evidence that this approach enhances both the identification rate and the validation stringency in a proteogenomic setting.
Highlights
- First proteogenomics with PSM rescoring using machine learning–predicted spectra
- Demonstrated on both ribosome profiling and nanopore RNA-Seq–derived databases
- Rescoring leads to elevated stringency and increased identification rates
- Rescoring compensates for the search space size issues in proteogenomics
Affiliation(s)
- Steven Verbruggen
- BioBix, Lab of Bioinformatics and Computational Genomics, Department of Mathematical Modeling, Statistics and Bioinformatics, Faculty of Bioscience Engineering, Ghent University, Ghent, Belgium; OHMX.bio, Ghent, Belgium
| | - Siegfried Gessulat
- Chair of Proteomics and Bioanalytics, Technical University of Munich, Freising, Germany
| | - Ralf Gabriels
- Department of Biomolecular Medicine, Faculty of Medicine and Health Sciences, Ghent University, Ghent, Belgium; VIB-UGent Center for Medical Biotechnology, VIB, Ghent, Belgium
| | - Bernhard Kuster
- Chair of Proteomics and Bioanalytics, Technical University of Munich, Freising, Germany
| | - Sven Degroeve
- Department of Biomolecular Medicine, Faculty of Medicine and Health Sciences, Ghent University, Ghent, Belgium; VIB-UGent Center for Medical Biotechnology, VIB, Ghent, Belgium
| | - Lennart Martens
- Department of Biomolecular Medicine, Faculty of Medicine and Health Sciences, Ghent University, Ghent, Belgium; VIB-UGent Center for Medical Biotechnology, VIB, Ghent, Belgium
| | - Wim Van Criekinge
- BioBix, Lab of Bioinformatics and Computational Genomics, Department of Mathematical Modeling, Statistics and Bioinformatics, Faculty of Bioscience Engineering, Ghent University, Ghent, Belgium
| | - Mathias Wilhelm
- Chair of Proteomics and Bioanalytics, Technical University of Munich, Freising, Germany
| | - Gerben Menschaert
- BioBix, Lab of Bioinformatics and Computational Genomics, Department of Mathematical Modeling, Statistics and Bioinformatics, Faculty of Bioscience Engineering, Ghent University, Ghent, Belgium; OHMX.bio, Ghent, Belgium.
|
23
|
Brunet MA, Lucier JF, Levesque M, Leblanc S, Jacques JF, Al-Saedi HRH, Guilloy N, Grenier F, Avino M, Fournier I, Salzet M, Ouangraoua A, Scott M, Boisvert FM, Roucou X. OpenProt 2021: deeper functional annotation of the coding potential of eukaryotic genomes. Nucleic Acids Res 2021; 49:D380-D388. [PMID: 33179748 PMCID: PMC7779043 DOI: 10.1093/nar/gkaa1036] [Citation(s) in RCA: 57] [Impact Index Per Article: 19.0] [Received: 09/14/2020] [Revised: 10/15/2020] [Accepted: 10/16/2020] [Indexed: 12/12/2022] Open
Abstract
OpenProt (www.openprot.org) is the first proteogenomic resource supporting a polycistronic annotation model for eukaryotic genomes. It provides a deeper annotation of open reading frames (ORFs) while mining experimental data for supporting evidence using cutting-edge algorithms. This update presents the major improvements since the initial release of OpenProt. All species support recent NCBI RefSeq and Ensembl annotations, with changes in annotations being reported in OpenProt. Using the 131 ribosome profiling datasets re-analysed by OpenProt to date, non-AUG initiation starts are reported alongside a confidence score of the initiating codon. From the 177 mass spectrometry datasets re-analysed by OpenProt to date, the uniqueness of the detected peptides is controlled at each implementation. Furthermore, to guide users, detectability statistics and protein relationships (isoforms) are now reported for each protein. Finally, to foster access to deeper ORF annotation independently of one's bioinformatics skills or computational resources, OpenProt now offers a data analysis platform. Users can submit their dataset for analysis and receive the results of the analysis from OpenProt. All data on OpenProt are freely available and downloadable for each species, with the release-based format ensuring continuous access to the data. Thus, OpenProt enables a more comprehensive annotation of eukaryotic genomes and fosters functional proteomic discoveries.
Affiliation(s)
- Marie A Brunet
- Department of Biochemistry and Functional Genomics, Université de Sherbrooke, 3201 Jean Mignault, Sherbrooke, QC J1E 4K8, Canada
- PROTEO, Quebec Network for Research on Protein Function, Structure, and Engineering, Université Laval, Quebec City, QC G1V0A6, Canada
| | - Jean-François Lucier
- Center for Computational Science, Université de Sherbrooke, Sherbrooke, QC J1K 2R1, Canada
- Biology Department, Université de Sherbrooke, Sherbrooke, QC J1K 2R1, Canada
| | - Maxime Levesque
- Center for Computational Science, Université de Sherbrooke, Sherbrooke, QC J1K 2R1, Canada
- Biology Department, Université de Sherbrooke, Sherbrooke, QC J1K 2R1, Canada
| | - Sébastien Leblanc
- Department of Biochemistry and Functional Genomics, Université de Sherbrooke, 3201 Jean Mignault, Sherbrooke, QC J1E 4K8, Canada
- PROTEO, Quebec Network for Research on Protein Function, Structure, and Engineering, Université Laval, Quebec City, QC G1V0A6, Canada
| | - Jean-Francois Jacques
- Department of Biochemistry and Functional Genomics, Université de Sherbrooke, 3201 Jean Mignault, Sherbrooke, QC J1E 4K8, Canada
- PROTEO, Quebec Network for Research on Protein Function, Structure, and Engineering, Université Laval, Quebec City, QC G1V0A6, Canada
| | - Hassan R H Al-Saedi
- Department of Biochemistry and Functional Genomics, Université de Sherbrooke, 3201 Jean Mignault, Sherbrooke, QC J1E 4K8, Canada
| | - Noé Guilloy
- Department of Biochemistry and Functional Genomics, Université de Sherbrooke, 3201 Jean Mignault, Sherbrooke, QC J1E 4K8, Canada
- PROTEO, Quebec Network for Research on Protein Function, Structure, and Engineering, Université Laval, Quebec City, QC G1V0A6, Canada
| | - Frederic Grenier
- Center for Computational Science, Université de Sherbrooke, Sherbrooke, QC J1K 2R1, Canada
- Biology Department, Université de Sherbrooke, Sherbrooke, QC J1K 2R1, Canada
| | - Mariano Avino
- Department of Biochemistry and Functional Genomics, Université de Sherbrooke, 3201 Jean Mignault, Sherbrooke, QC J1E 4K8, Canada
| | - Isabelle Fournier
- INSERM U1192, Laboratoire Protéomique, Réponse Inflammatoire & Spectrométrie de Masse (PRISM), Université de Lille, F-59000 Lille, France
| | - Michel Salzet
- INSERM U1192, Laboratoire Protéomique, Réponse Inflammatoire & Spectrométrie de Masse (PRISM), Université de Lille, F-59000 Lille, France
| | - Aïda Ouangraoua
- Informatics Department, Université de Sherbrooke, Sherbrooke, QC J1K 2R1, Canada
| | - Michelle S Scott
- Department of Biochemistry and Functional Genomics, Université de Sherbrooke, 3201 Jean Mignault, Sherbrooke, QC J1E 4K8, Canada
| | - François-Michel Boisvert
- Department of Immunology and Cellular Biology, Université de Sherbrooke, Sherbrooke, QC J1E 4K8, Canada
| | - Xavier Roucou
- Department of Biochemistry and Functional Genomics, Université de Sherbrooke, 3201 Jean Mignault, Sherbrooke, QC J1E 4K8, Canada
- PROTEO, Quebec Network for Research on Protein Function, Structure, and Engineering, Université Laval, Quebec City, QC G1V0A6, Canada
|
24
|
Cardon T, Fournier I, Salzet M. Shedding Light on the Ghost Proteome. Trends Biochem Sci 2020; 46:239-250. [PMID: 33246829 DOI: 10.1016/j.tibs.2020.10.003] [Citation(s) in RCA: 13] [Impact Index Per Article: 3.3] [Received: 06/02/2020] [Revised: 10/21/2020] [Accepted: 10/22/2020] [Indexed: 01/19/2023]
Abstract
Conventionally, eukaryotic mRNAs were thought to be monocistronic, leading to the translation of a single protein. However, large-scale proteomics has led to the identification of proteins translated from alternative open reading frames (AltORFs) in mRNAs. AltORFs are found in addition to predicted reference ORFs and noncoding RNA. Alternative proteins are not represented in the conventional protein databases, and this 'Ghost proteome' was not considered until recently. Some of these proteins are functional, and there is growing evidence that they are involved in central functions in physiological and physiopathological contexts. Here, we review how this Ghost proteome fills the gap in our understanding of signaling pathways, establishes new markers of pathologies, and highlights therapeutic targets.
Affiliation(s)
- Tristan Cardon
- Laboratoire Protéomique, Réponse Inflammatoire Spectrométrie de Masse (PRISM), Inserm U1192, University of Lille, CHU Lille, F-59000 Lille, France.
| | - Isabelle Fournier
- Laboratoire Protéomique, Réponse Inflammatoire Spectrométrie de Masse (PRISM), Inserm U1192, University of Lille, CHU Lille, F-59000 Lille, France; Institut Universitaire de France, Paris, France.
| | - Michel Salzet
- Laboratoire Protéomique, Réponse Inflammatoire Spectrométrie de Masse (PRISM), Inserm U1192, University of Lille, CHU Lille, F-59000 Lille, France; Institut Universitaire de France, Paris, France.
|
25
|
The hidden world of non-canonical ORFs. Exp Cell Res 2020; 396:112267. [PMID: 32926940 DOI: 10.1016/j.yexcr.2020.112267] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 10/23/2022]
|