1
Tufail MA, Jordan B, Hadjeras L, Gelhausen R, Cassidy L, Habenicht T, Gutt M, Hellwig L, Backofen R, Tholey A, Sharma CM, Schmitz RA. Uncovering the small proteome of Methanosarcina mazei using Ribo-seq and peptidomics under different nitrogen conditions. Nat Commun 2024; 15:8659. [PMID: 39370430 PMCID: PMC11456600 DOI: 10.1038/s41467-024-53008-8] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 10/08/2023] [Accepted: 09/25/2024] [Indexed: 10/08/2024]
Abstract
The mesophilic methanogenic archaeal model organism Methanosarcina mazei strain Gö1 is crucial for climate and environmental research due to its ability to produce methane. Here, we establish a Ribo-seq protocol for M. mazei strain Gö1 under two growth conditions (nitrogen sufficiency and limitation). The translation of 93 previously annotated and 314 unannotated small ORFs, coding for proteins ≤ 70 amino acids, is predicted with high confidence based on Ribo-seq data. LC-MS analysis validates the translation for 62 annotated small ORFs and 26 unannotated small ORFs. Epitope tagging followed by immunoblotting analysis confirms the translation of 13 out of 16 selected unannotated small ORFs. A comprehensive differential transcription and translation analysis reveals that 29 of 314 unannotated small ORFs are differentially regulated in response to nitrogen availability at the transcriptional and 49 at the translational level. A high number of reported small RNAs are emerging as dual-function RNAs, including sRNA154, the central regulatory small RNA of nitrogen metabolism. Several unannotated small ORFs are conserved in Methanosarcina species and overproducing several (small ORF encoded) small proteins suggests key physiological functions. Overall, the comprehensive analysis opens an avenue to elucidate the function(s) of multitudinous small proteins and dual-function RNAs in M. mazei.
Affiliation(s)
- Britta Jordan
  - Institute for General Microbiology, Kiel University, 24118, Kiel, Germany
- Lydia Hadjeras
  - Institute of Molecular Infection Biology, University of Würzburg, 97080, Würzburg, Germany
- Rick Gelhausen
  - Bioinformatics Group, Department of Computer Science, University of Freiburg, 79110, Freiburg, Germany
- Liam Cassidy
  - Systematic Proteome Research & Bioanalytics, Institute for Experimental Medicine, Kiel University, 24105, Kiel, Germany
- Tim Habenicht
  - Institute for General Microbiology, Kiel University, 24118, Kiel, Germany
- Miriam Gutt
  - Institute for General Microbiology, Kiel University, 24118, Kiel, Germany
- Lisa Hellwig
  - Institute for General Microbiology, Kiel University, 24118, Kiel, Germany
- Rolf Backofen
  - Bioinformatics Group, Department of Computer Science, University of Freiburg, 79110, Freiburg, Germany
- Andreas Tholey
  - Systematic Proteome Research & Bioanalytics, Institute for Experimental Medicine, Kiel University, 24105, Kiel, Germany
- Cynthia M Sharma
  - Institute of Molecular Infection Biology, University of Würzburg, 97080, Würzburg, Germany
- Ruth A Schmitz
  - Institute for General Microbiology, Kiel University, 24118, Kiel, Germany
2
Beals J, Hu H, Li X. A survey of experimental and computational identification of small proteins. Brief Bioinform 2024; 25:bbae345. [PMID: 39007598 PMCID: PMC11247407 DOI: 10.1093/bib/bbae345] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 04/18/2024] [Revised: 05/27/2024] [Accepted: 07/02/2024] [Indexed: 07/16/2024]
Abstract
Small proteins (SPs) are typically characterized as eukaryotic proteins shorter than 100 amino acids and prokaryotic proteins shorter than 50 amino acids. Historically, they were disregarded because of the arbitrary size thresholds to define proteins. However, recent research has revealed the existence of many SPs and their crucial roles. Despite this, the identification of SPs and the elucidation of their functions are still in their infancy. To pave the way for future SP studies, we briefly introduce the limitations and advancements in experimental techniques for SP identification. We then provide an overview of available computational tools for SP identification, their constraints, and their evaluation. Additionally, we highlight existing resources for SP research. This survey aims to initiate further exploration into SPs and encourage the development of more sophisticated computational tools for SP identification in prokaryotes and microbiomes.
Affiliation(s)
- Joshua Beals
  - Burnett School of Biomedical Science, University of Central Florida, 4000 Central Florida Blvd, Orlando, FL 32816, United States
- Haiyan Hu
  - Department of Computer Science, University of Central Florida, 4000 Central Florida Blvd, Orlando, FL 32816, United States
- Xiaoman Li
  - Burnett School of Biomedical Science, University of Central Florida, 4000 Central Florida Blvd, Orlando, FL 32816, United States
3
Fesenko I, Sahakyan H, Shabalina SA, Koonin EV. The Cryptic Bacterial Microproteome. bioRxiv 2024:2024.02.17.580829. [PMID: 38903115 PMCID: PMC11188072 DOI: 10.1101/2024.02.17.580829] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 06/22/2024]
Abstract
Microproteins encoded by small open reading frames (smORFs) comprise the "dark matter" of proteomes. Although functional microproteins were identified in diverse organisms from all three domains of life, bacterial smORFs remain poorly characterized. In this comprehensive study of intergenic smORFs (ismORFs, 15-70 codons) in 5,668 bacterial genomes of the family Enterobacteriaceae, we identified 67,297 clusters of ismORFs subject to purifying selection. The ismORFs mainly code for hydrophobic, potentially transmembrane, unstructured, or minimally structured microproteins. Using AlphaFold Multimer, we predicted interactions of some of the predicted microproteins encoded by transcribed ismORFs with proteins encoded by neighboring genes, revealing the potential of microproteins to regulate the activity of various proteins, particularly, under stress. We compiled a catalog of predicted microprotein families with different levels of evidence from synteny analysis, structure prediction, and transcription and translation data. This study offers a resource for investigation of biological functions of microproteins.
Affiliation(s)
- Igor Fesenko
  - National Center for Biotechnology Information, National Library of Medicine, National Institutes of Health, Bethesda, MD 20894, USA
- Harutyun Sahakyan
  - National Center for Biotechnology Information, National Library of Medicine, National Institutes of Health, Bethesda, MD 20894, USA
- Svetlana A. Shabalina
  - National Center for Biotechnology Information, National Library of Medicine, National Institutes of Health, Bethesda, MD 20894, USA
- Eugene V. Koonin
  - National Center for Biotechnology Information, National Library of Medicine, National Institutes of Health, Bethesda, MD 20894, USA
4
Schumacher K, Gelhausen R, Kion-Crosby W, Barquist L, Backofen R, Jung K. Ribosome profiling reveals the fine-tuned response of Escherichia coli to mild and severe acid stress. mSystems 2023; 8:e0103723. [PMID: 37909716 PMCID: PMC10746267 DOI: 10.1128/msystems.01037-23] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 09/26/2023] [Accepted: 09/28/2023] [Indexed: 11/03/2023]
Abstract
IMPORTANCE: Bacteria react very differently to survive in acidic environments, such as the human gastrointestinal tract. Escherichia coli is one of the extremely acid-resistant bacteria and has a variety of acid-defense mechanisms. Here, we provide the first genome-wide overview of the adaptations of E. coli K-12 to mild and severe acid stress at both the transcriptional and translational levels. Using ribosome profiling and RNA sequencing, we uncover novel adaptations to different degrees of acidity, including previously hidden stress-induced small proteins and novel key transcription factors for acid defense, and report mRNAs with pH-dependent differential translation efficiency. In addition, we distinguish between acid-specific adaptations and general stress response mechanisms using denoising autoencoders. This workflow represents a powerful approach that takes advantage of next-generation sequencing techniques and machine learning to systematically analyze bacterial stress responses.
Affiliation(s)
- Kilian Schumacher
  - Faculty of Biology, Microbiology, Ludwig-Maximilians-Universität München, Martinsried, Germany
- Rick Gelhausen
  - Bioinformatics Group, Department of Computer Science, Albert-Ludwigs-Universität Freiburg, Freiburg, Germany
- Willow Kion-Crosby
  - Helmholtz Institute for RNA-based Infection Research (HIRI)/Helmholtz Centre for Infection Research (HZI), Würzburg, Germany
  - University of Würzburg, Faculty of Medicine, Würzburg, Germany
- Lars Barquist
  - Helmholtz Institute for RNA-based Infection Research (HIRI)/Helmholtz Centre for Infection Research (HZI), Würzburg, Germany
  - University of Würzburg, Faculty of Medicine, Würzburg, Germany
- Rolf Backofen
  - Bioinformatics Group, Department of Computer Science, Albert-Ludwigs-Universität Freiburg, Freiburg, Germany
  - Signalling Research Centres BIOSS and CIBSS, University of Freiburg, Freiburg, Germany
- Kirsten Jung
  - Faculty of Biology, Microbiology, Ludwig-Maximilians-Universität München, Martinsried, Germany
5
Fuchs S, Engelmann S. Small proteins in bacteria - Big challenges in prediction and identification. Proteomics 2023; 23:e2200421. [PMID: 37609810 DOI: 10.1002/pmic.202200421] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 05/31/2023] [Revised: 08/03/2023] [Accepted: 08/10/2023] [Indexed: 08/24/2023]
Abstract
Proteins with up to 100 amino acids have been largely overlooked due to the challenges associated with predicting and identifying them using traditional methods. Recent advances in bioinformatics and machine learning, DNA sequencing, RNA and Ribo-seq technologies, and mass spectrometry (MS) have greatly facilitated the detection and characterisation of these elusive proteins in recent years. This has revealed their crucial role in various cellular processes including regulation, signalling and transport, as toxins and as folding helpers for protein complexes. Consequently, the systematic identification and characterisation of these proteins in bacteria have emerged as a prominent field of interest within the microbial research community. This review provides an overview of different strategies for predicting and identifying these proteins on a large scale, leveraging the power of these advanced technologies. Furthermore, the review offers insights into the future developments that may be expected in this field.
Affiliation(s)
- Stephan Fuchs
  - Genome Competence Center (MF1), Department MFI, Robert-Koch-Institut, Berlin, Germany
- Susanne Engelmann
  - Institute for Microbiology, Technische Universität Braunschweig, Braunschweig, Germany
  - Microbial Proteomics, Helmholtzzentrum für Infektionsforschung GmbH, Braunschweig, Germany
6
Simoens L, Fijalkowski I, Van Damme P. Exposing the small protein load of bacterial life. FEMS Microbiol Rev 2023; 47:fuad063. [PMID: 38012116 PMCID: PMC10723866 DOI: 10.1093/femsre/fuad063] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 05/09/2023] [Revised: 11/10/2023] [Accepted: 11/24/2023] [Indexed: 11/29/2023]
Abstract
The ever-growing repertoire of genomic techniques continues to expand our understanding of the true diversity and richness of prokaryotic genomes. Riboproteogenomics laid the foundation for dynamic studies of previously overlooked genomic elements. Most strikingly, bacterial genomes were revealed to harbor robust repertoires of small open reading frames (sORFs) encoding a diverse and broadly expressed range of small proteins, or sORF-encoded polypeptides (SEPs). In recent years, continuous efforts led to great improvements in the annotation and characterization of such proteins, yet many challenges remain to fully comprehend the pervasive nature of small proteins and their impact on bacterial biology. In this work, we review the recent developments in the dynamic field of bacterial genome reannotation, catalog the important biological roles carried out by small proteins and identify challenges obstructing the way to full understanding of these elusive proteins.
Affiliation(s)
- Laure Simoens
  - iRIP Unit, Laboratory of Microbiology, Department of Biochemistry and Microbiology, Ghent University, K. L. Ledeganckstraat 35, 9000 Ghent, Belgium
- Igor Fijalkowski
  - iRIP Unit, Laboratory of Microbiology, Department of Biochemistry and Microbiology, Ghent University, K. L. Ledeganckstraat 35, 9000 Ghent, Belgium
- Petra Van Damme
  - iRIP Unit, Laboratory of Microbiology, Department of Biochemistry and Microbiology, Ghent University, K. L. Ledeganckstraat 35, 9000 Ghent, Belgium
7
Brantl S, Ul Haq I. Small proteins in Gram-positive bacteria. FEMS Microbiol Rev 2023; 47:fuad064. [PMID: 38052429 PMCID: PMC10730256 DOI: 10.1093/femsre/fuad064] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 10/25/2023] [Revised: 11/27/2023] [Accepted: 12/04/2023] [Indexed: 12/07/2023]
Abstract
Small proteins comprising less than 100 amino acids have been often ignored in bacterial genome annotations. About 10 years ago, focused efforts started to investigate whole peptidomes, which resulted in the discovery of a multitude of small proteins, but only a number of them have been characterized in detail. Generally, small proteins can be either membrane or cytosolic proteins. The latter interact with larger proteins, RNA or even metal ions. Here, we summarize our current knowledge on small proteins from Gram-positive bacteria with a special emphasis on the model organism Bacillus subtilis. Our examples include membrane-bound toxins of type I toxin-antitoxin systems, proteins that block the assembly of higher order structures, regulate sporulation or modulate the RNA degradosome. We do not consider antimicrobial peptides. Furthermore, we present methods for the identification and investigation of small proteins.
Affiliation(s)
- Sabine Brantl
  - AG Bakteriengenetik, Matthias-Schleiden-Institut, Friedrich-Schiller-Universität Jena, Philosophenweg 12, Jena D-07743, Germany
- Inam Ul Haq
  - AG Bakteriengenetik, Matthias-Schleiden-Institut, Friedrich-Schiller-Universität Jena, Philosophenweg 12, Jena D-07743, Germany
8
Prensner JR, Abelin JG, Kok LW, Clauser KR, Mudge JM, Ruiz-Orera J, Bassani-Sternberg M, Moritz RL, Deutsch EW, van Heesch S. What Can Ribo-Seq, Immunopeptidomics, and Proteomics Tell Us About the Noncanonical Proteome? Mol Cell Proteomics 2023; 22:100631. [PMID: 37572790 PMCID: PMC10506109 DOI: 10.1016/j.mcpro.2023.100631] [Citation(s) in RCA: 13] [Impact Index Per Article: 6.5] [Received: 05/14/2023] [Revised: 07/21/2023] [Accepted: 08/08/2023] [Indexed: 08/14/2023]
Abstract
Ribosome profiling (Ribo-Seq) has proven transformative for our understanding of the human genome and proteome by illuminating thousands of noncanonical sites of ribosome translation outside the currently annotated coding sequences (CDSs). A conservative estimate suggests that at least 7000 noncanonical ORFs are translated, which, at first glance, has the potential to expand the number of human protein CDSs by 30%, from ∼19,500 annotated CDSs to over 26,000 annotated CDSs. Yet, additional scrutiny of these ORFs has raised numerous questions about what fraction of them truly produce a protein product and what fraction of those can be understood as proteins according to conventional understanding of the term. Adding further complication is the fact that published estimates of noncanonical ORFs vary widely by around 30-fold, from several thousand to several hundred thousand. The summation of this research has left the genomics and proteomics communities both excited by the prospect of new coding regions in the human genome but searching for guidance on how to proceed. Here, we discuss the current state of noncanonical ORF research, databases, and interpretation, focusing on how to assess whether a given ORF can be said to be "protein coding."
Affiliation(s)
- John R Prensner
  - Division of Pediatric Hematology/Oncology, Department of Pediatrics, University of Michigan Medical School, Ann Arbor, Michigan, USA
  - Department of Biological Chemistry, University of Michigan Medical School, Ann Arbor, Michigan, USA
- Leron W Kok
  - Princess Máxima Center for Pediatric Oncology, Utrecht, The Netherlands
- Karl R Clauser
  - Broad Institute of MIT and Harvard, Cambridge, Massachusetts, USA
- Jonathan M Mudge
  - European Molecular Biology Laboratory, European Bioinformatics Institute, Wellcome Genome Campus, Cambridge, UK
- Jorge Ruiz-Orera
  - Cardiovascular and Metabolic Sciences, Max Delbrück Center for Molecular Medicine in the Helmholtz Association (MDC), Berlin, Germany
- Michal Bassani-Sternberg
  - Ludwig Institute for Cancer Research, Agora Center Bugnon 25A, University of Lausanne, Lausanne, Switzerland
  - Department of Oncology, Centre Hospitalier Universitaire Vaudois (CHUV), Lausanne, Switzerland
  - Agora Cancer Research Centre, Lausanne, Switzerland
- Robert L Moritz
  - Institute for Systems Biology (ISB), Seattle, Washington, USA
- Eric W Deutsch
  - Institute for Systems Biology (ISB), Seattle, Washington, USA
9
Prensner JR, Abelin JG, Kok LW, Clauser KR, Mudge JM, Ruiz-Orera J, Bassani-Sternberg M, Deutsch EW, van Heesch S. What can Ribo-seq and proteomics tell us about the non-canonical proteome? bioRxiv 2023:2023.05.16.541049. [PMID: 37292611 PMCID: PMC10245706 DOI: 10.1101/2023.05.16.541049] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 06/10/2023]
Abstract
Ribosome profiling (Ribo-seq) has proven transformative for our understanding of the human genome and proteome by illuminating thousands of non-canonical sites of ribosome translation outside of the currently annotated coding sequences (CDSs). A conservative estimate suggests that at least 7,000 non-canonical open reading frames (ORFs) are translated, which, at first glance, has the potential to expand the number of human protein-coding sequences by 30%, from ∼19,500 annotated CDSs to over 26,000. Yet, additional scrutiny of these ORFs has raised numerous questions about what fraction of them truly produce a protein product and what fraction of those can be understood as proteins according to conventional understanding of the term. Adding further complication is the fact that published estimates of non-canonical ORFs vary widely by around 30-fold, from several thousand to several hundred thousand. The summation of this research has left the genomics and proteomics communities both excited by the prospect of new coding regions in the human genome, but searching for guidance on how to proceed. Here, we discuss the current state of non-canonical ORF research, databases, and interpretation, focusing on how to assess whether a given ORF can be said to be "protein-coding".
In brief: The human genome encodes thousands of non-canonical open reading frames (ORFs) in addition to protein-coding genes. As a nascent field, many questions remain regarding non-canonical ORFs. How many exist? Do they encode proteins? What level of evidence is needed for their verification? Central to these debates has been the advent of ribosome profiling (Ribo-seq) as a method to discern genome-wide ribosome occupancy, and immunopeptidomics as a method to detect peptides that are processed and presented by MHC molecules and not observed in traditional proteomics experiments. This article provides a synthesis of the current state of non-canonical ORF research and proposes standards for their future investigation and reporting.
Highlights:
- Combined use of Ribo-seq and proteomics-based methods enables optimal confidence in detecting non-canonical ORFs and their protein products.
- Ribo-seq can provide more sensitive detection of non-canonical ORFs, but data quality and analytical pipelines will impact results.
- Non-canonical ORF catalogs are diverse and span both high-stringency and low-stringency ORF nominations.
- A framework for standardized non-canonical ORF evidence will advance the research field.
Affiliation(s)
- John R. Prensner
  - Department of Pediatrics, Division of Pediatric Hematology/Oncology, University of Michigan Medical School, Ann Arbor, MI 48109, USA
- Leron W. Kok
  - Princess Máxima Center for Pediatric Oncology, Heidelberglaan 25, 3584 CS, Utrecht, the Netherlands
- Karl R. Clauser
  - Broad Institute of MIT and Harvard, Cambridge, MA, 02142, USA
- Jonathan M. Mudge
  - European Molecular Biology Laboratory, European Bioinformatics Institute, Wellcome Genome Campus, Hinxton, Cambridge CB10 1SD, UK
- Jorge Ruiz-Orera
  - Cardiovascular and Metabolic Sciences, Max Delbrück Center for Molecular Medicine in the Helmholtz Association (MDC), 13125 Berlin, Germany
- Michal Bassani-Sternberg
  - Ludwig Institute for Cancer Research, University of Lausanne, Agora Center Bugnon 25A, 1005 Lausanne, Switzerland
  - Department of Oncology, Centre hospitalier universitaire vaudois (CHUV), Rue du Bugnon 46, 1005 Lausanne, Switzerland
  - Agora Cancer Research Centre, 1011 Lausanne, Switzerland
- Eric W. Deutsch
  - Institute for Systems Biology (ISB), Seattle, Washington 98109, USA
- Sebastiaan van Heesch
  - Princess Máxima Center for Pediatric Oncology, Heidelberglaan 25, 3584 CS, Utrecht, the Netherlands
10
Hadjeras L, Heiniger B, Maaß S, Scheuer R, Gelhausen R, Azarderakhsh S, Barth-Weber S, Backofen R, Becher D, Ahrens CH, Sharma CM, Evguenieva-Hackenberg E. Unraveling the small proteome of the plant symbiont Sinorhizobium meliloti by ribosome profiling and proteogenomics. microLife 2023; 4:uqad012. [PMID: 37223733 PMCID: PMC10117765 DOI: 10.1093/femsml/uqad012] [Citation(s) in RCA: 7] [Impact Index Per Article: 3.5] [Received: 11/15/2022] [Revised: 02/08/2023] [Accepted: 03/07/2023] [Indexed: 05/25/2023]
Abstract
The soil-dwelling plant symbiont Sinorhizobium meliloti is a major model organism of Alphaproteobacteria. Despite numerous detailed OMICS studies, information about small open reading frame (sORF)-encoded proteins (SEPs) is largely missing, because sORFs are poorly annotated and SEPs are hard to detect experimentally. However, given that SEPs can fulfill important functions, identification of translated sORFs is critical for analyzing their roles in bacterial physiology. Ribosome profiling (Ribo-seq) can detect translated sORFs with high sensitivity, but is not yet routinely applied to bacteria because it must be adapted for each species. Here, we established a Ribo-seq procedure for S. meliloti 2011 based on RNase I digestion and detected translation for 60% of the annotated coding sequences during growth in minimal medium. Using ORF prediction tools based on Ribo-seq data, subsequent filtering, and manual curation, the translation of 37 non-annotated sORFs with ≤ 70 amino acids was predicted with confidence. The Ribo-seq data were supplemented by mass spectrometry (MS) analyses from three sample preparation approaches and two integrated proteogenomic search database (iPtgxDB) types. Searches against standard and 20-fold smaller Ribo-seq data-informed custom iPtgxDBs confirmed 47 annotated SEPs and identified 11 additional novel SEPs. Epitope tagging and Western blot analysis confirmed the translation of 15 out of 20 SEPs selected from the translatome map. Overall, by combining MS and Ribo-seq approaches, the small proteome of S. meliloti was substantially expanded by 48 novel SEPs. Several of them are part of predicted operons and/or are conserved from Rhizobiaceae to Bacteria, suggesting important physiological functions.
Affiliation(s)
- Lydia Hadjeras
  - Institute of Molecular Infection Biology, University of Würzburg, 97080 Würzburg, Germany
- Benjamin Heiniger
  - Molecular Ecology, Agroscope and SIB Swiss Institute of Bioinformatics, 8046 Zurich, Switzerland
- Sandra Maaß
  - Institute of Microbiology, University of Greifswald, 17489 Greifswald, Germany
- Robina Scheuer
  - Institute of Microbiology and Molecular Biology, University of Giessen, 35392 Giessen, Germany
- Rick Gelhausen
  - Bioinformatics Group, Department of Computer Science, University of Freiburg, 79110 Freiburg, Germany
- Saina Azarderakhsh
  - Institute of Microbiology and Molecular Biology, University of Giessen, 35392 Giessen, Germany
- Susanne Barth-Weber
  - Institute of Microbiology and Molecular Biology, University of Giessen, 35392 Giessen, Germany
- Rolf Backofen
  - Bioinformatics Group, Department of Computer Science, University of Freiburg, 79110 Freiburg, Germany
- Dörte Becher
  - Institute of Microbiology, University of Greifswald, 17489 Greifswald, Germany
- Christian H Ahrens
  - Molecular Ecology, Agroscope and SIB Swiss Institute of Bioinformatics, 8046 Zurich, Switzerland
- Cynthia M Sharma
  - Institute of Molecular Infection Biology, University of Würzburg, 97080 Würzburg, Germany
11
Hadjeras L, Bartel J, Maier LK, Maaß S, Vogel V, Svensson SL, Eggenhofer F, Gelhausen R, Müller T, Alkhnbashi OS, Backofen R, Becher D, Sharma CM, Marchfelder A. Revealing the small proteome of Haloferax volcanii by combining ribosome profiling and small-protein optimized mass spectrometry. microLife 2023; 4:uqad001. [PMID: 37223747 PMCID: PMC10117724 DOI: 10.1093/femsml/uqad001] [Citation(s) in RCA: 5] [Impact Index Per Article: 2.5] [Received: 09/01/2022] [Revised: 11/29/2022] [Accepted: 01/13/2023] [Indexed: 05/25/2023]
Abstract
In contrast to extensively studied prokaryotic 'small' transcriptomes (encompassing all small noncoding RNAs), small proteomes (here defined as including proteins ≤70 aa) are only now entering the limelight. The absence of a complete small protein catalogue in most prokaryotes precludes our understanding of how these molecules affect physiology. So far, archaeal genomes have not yet been analyzed broadly with a dedicated focus on small proteins. Here, we present a combinatorial approach, integrating experimental data from small protein-optimized mass spectrometry (MS) and ribosome profiling (Ribo-seq), to generate a high confidence inventory of small proteins in the model archaeon Haloferax volcanii. We demonstrate by MS and Ribo-seq that 67% of the 317 annotated small open reading frames (sORFs) are translated under standard growth conditions. Furthermore, annotation-independent analysis of Ribo-seq data showed ribosomal engagement for 47 novel sORFs in intergenic regions. A total of seven of these were also detected by proteomics, in addition to an eighth novel small protein solely identified by MS. We also provide independent experimental evidence in vivo for the translation of 12 sORFs (annotated and novel) using epitope tagging and western blotting, underlining the validity of our identification scheme. Several novel sORFs are conserved in Haloferax species and might have important functions. Based on our findings, we conclude that the small proteome of H. volcanii is larger than previously appreciated, and that combining MS with Ribo-seq is a powerful approach for the discovery of novel small protein coding genes in archaea.
Affiliation(s)
- Lydia Hadjeras
  - Department of Molecular Infection Biology II, Institute of Molecular Infection Biology (IMIB), University of Würzburg, Josef-Schneider-Straße 2 / D15, 97080 Würzburg, Germany
- Jürgen Bartel
  - Department of Microbial Proteomics, Institute of Microbiology, University of Greifswald, Felix-Hausdorff-Str. 8, 17489 Greifswald, Germany
- Sandra Maaß
  - Department of Microbial Proteomics, Institute of Microbiology, University of Greifswald, Felix-Hausdorff-Str. 8, 17489 Greifswald, Germany
- Verena Vogel
  - Biology II, Ulm University, Albert-Einstein-Allee 11, 89081 Ulm, Germany
- Sarah L Svensson
  - Department of Molecular Infection Biology II, Institute of Molecular Infection Biology (IMIB), University of Würzburg, Josef-Schneider-Straße 2 / D15, 97080 Würzburg, Germany
- Florian Eggenhofer
  - Bioinformatics Group, Department of Computer Science, University of Freiburg, Georges-Koehler-Allee 106, 79110 Freiburg, Germany
- Rick Gelhausen
  - Bioinformatics Group, Department of Computer Science, University of Freiburg, Georges-Koehler-Allee 106, 79110 Freiburg, Germany
- Teresa Müller
  - Bioinformatics Group, Department of Computer Science, University of Freiburg, Georges-Koehler-Allee 106, 79110 Freiburg, Germany
- Omer S Alkhnbashi
  - Information and Computer Science Department, King Fahd University of Petroleum and Minerals, Dhahran 31261, Saudi Arabia
- Rolf Backofen
  - Bioinformatics Group, Department of Computer Science, University of Freiburg, Georges-Koehler-Allee 106, 79110 Freiburg, Germany
  - Signalling Research Centres BIOSS and CIBSS, University of Freiburg, Schaenzlestr. 18, 79104 Freiburg, Germany
- Dörte Becher
  - Department of Microbial Proteomics, Institute of Microbiology, University of Greifswald, Felix-Hausdorff-Str. 8, 17489 Greifswald, Germany
- Cynthia M Sharma
  - Department of Molecular Infection Biology II, Institute of Molecular Infection Biology (IMIB), University of Würzburg, Josef-Schneider-Straße 2 / D15, 97080 Würzburg, Germany
- Anita Marchfelder
  - Biology II, Ulm University, Albert-Einstein-Allee 11, 89081 Ulm, Germany
12
Fijalkowski I, Willems P, Jonckheere V, Simoens L, Van Damme P. Hidden in plain sight: challenges in proteomics detection of small ORF-encoded polypeptides. microLife 2022; 3:uqac005. [PMID: 37223358 PMCID: PMC10117744 DOI: 10.1093/femsml/uqac005] [Citation(s) in RCA: 10] [Impact Index Per Article: 3.3] [Received: 11/13/2021] [Revised: 04/18/2022] [Accepted: 04/29/2022] [Indexed: 05/25/2023]
Abstract
Genomic studies of bacteria have long pointed toward widespread prevalence of small open reading frames (sORFs) encoding for short proteins, <100 amino acids in length. Despite the mounting genomic evidence of their robust expression, relatively little progress has been made in their mass spectrometry-based detection and various blanket statements have been used to explain this observed discrepancy. In this study, we provide a large-scale riboproteogenomics investigation of the challenging nature of proteomic detection of such small proteins as informed by conditional translation data. A panel of physiochemical properties alongside recently developed mass spectrometry detectability metrics was interrogated to provide a comprehensive evidence-based assessment of sORF-encoded polypeptide (SEP) detectability. Moreover, a large-scale proteomics and translatomics compendium of proteins produced by Salmonella Typhimurium (S. Typhimurium), a model human pathogen, across a panel of growth conditions is presented and used in support of our in silico SEP detectability analysis. This integrative approach is used to provide a data-driven census of small proteins expressed by S. Typhimurium across growth phases and infection-relevant conditions. Taken together, our study pinpoints current limitations in proteomics-based detection of novel small proteins currently missing from bacterial genome annotations.
Affiliation(s)
- Igor Fijalkowski
- iRIP Unit, Laboratory of Microbiology, Department of Biochemistry and Microbiology, Ghent University, 9000 Ghent, Belgium
- Patrick Willems
- iRIP Unit, Laboratory of Microbiology, Department of Biochemistry and Microbiology, Ghent University, 9000 Ghent, Belgium
- Veronique Jonckheere
- iRIP Unit, Laboratory of Microbiology, Department of Biochemistry and Microbiology, Ghent University, 9000 Ghent, Belgium
- Laure Simoens
- iRIP Unit, Laboratory of Microbiology, Department of Biochemistry and Microbiology, Ghent University, 9000 Ghent, Belgium
- Petra Van Damme
- iRIP Unit, Laboratory of Microbiology, Department of Biochemistry and Microbiology, Ghent University, 9000 Ghent, Belgium
|
13
|
Zhang Z, Li Y, Yuan W, Wang Z, Wan C. Proteomic-driven identification of short open reading frame-encoded peptides. Proteomics 2022; 22:e2100312. [PMID: 35384297 DOI: 10.1002/pmic.202100312] [Citation(s) in RCA: 2] [Impact Index Per Article: 0.7] [Received: 02/25/2022] [Revised: 03/29/2022] [Accepted: 03/30/2022] [Indexed: 11/10/2022]
Abstract
Accumulating evidence has shown that a large number of short open reading frames (sORFs) also have the ability to encode proteins. The discovery of sORFs opens up a new research area, leading to the identification and functional study of sORF-encoded peptides (SEPs) at the omics level. Besides bioinformatics prediction and ribosome profiling, mass spectrometry (MS) has become a significant tool as it directly detects the sequence of SEPs. Though MS-based proteomics methods have proved to be effective for qualitative and quantitative analysis of SEPs, the detection of SEPs is still a great challenge due to their low abundance and short sequence. To illustrate the progress in method development, we describe and discuss the main steps of large-scale proteomics identification of SEPs, including SEP extraction and enrichment, MS detection, data processing and quality control, quantification, and function prediction and validation methods.
Affiliation(s)
- Zheng Zhang
- School of Life Sciences and Hubei Key Laboratory of Genetic Regulation and Integrative Biology, Central China Normal University, Wuhan, Hubei, 430079, People's Republic of China
- Yujie Li
- School of Life Sciences and Hubei Key Laboratory of Genetic Regulation and Integrative Biology, Central China Normal University, Wuhan, Hubei, 430079, People's Republic of China
- Wenqian Yuan
- School of Life Sciences and Hubei Key Laboratory of Genetic Regulation and Integrative Biology, Central China Normal University, Wuhan, Hubei, 430079, People's Republic of China
- Zhiwei Wang
- School of Life Sciences and Hubei Key Laboratory of Genetic Regulation and Integrative Biology, Central China Normal University, Wuhan, Hubei, 430079, People's Republic of China
- Cuihong Wan
- School of Life Sciences and Hubei Key Laboratory of Genetic Regulation and Integrative Biology, Central China Normal University, Wuhan, Hubei, 430079, People's Republic of China
|