1
Fijalkowski I, Snauwaert V, Van Damme P. Proteins à la carte: riboproteogenomic exploration of bacterial N-terminal proteoform expression. mBio 2024; 15:e0033324. [PMID: 38511928 PMCID: PMC11005335 DOI: 10.1128/mbio.00333-24] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 02/03/2024] [Accepted: 02/28/2024] [Indexed: 03/22/2024] Open
Abstract
In recent years, it has become evident that the true complexity of bacterial proteomes remains underestimated. Gene annotation tools are known to propagate biases and overlook certain classes of truly expressed proteins, particularly proteoforms (protein isoforms arising from a single gene). Recent (re-)annotation efforts rely heavily on ribosome profiling, which provides a direct readout of translation, to fully describe bacterial proteomes. In this study, we employ a robust riboproteogenomic pipeline to conduct a systematic census of expressed N-terminal proteoform pairs in Salmonella, each pair representing two isoforms encoded by a single gene and arising from annotated and alternative translation initiation. Intriguingly, condition-dependent changes in the relative utilization of annotated and alternative translation initiation sites (TIS) were observed in several cases. This suggests that TIS selection is subject to regulatory control, adding yet another layer of complexity to our understanding of bacterial proteomes. IMPORTANCE With the emerging theme of genes within genes comprising the existence of alternative open reading frames (ORFs) generated by translation initiation at in-frame start codons, mechanisms that control the relative utilization of annotated and alternative TIS need to be unraveled and our molecular understanding of resulting proteoforms broadened. Utilizing complementary ribosome profiling strategies to map ORF boundaries, we uncovered dual-encoding ORFs generated by in-frame TIS usage in Salmonella. Besides demonstrating that alternative TIS usage may generate proteoforms with different characteristics, such as differential localization and specialized function, quantitative aspects of conditional retapamulin-assisted ribosome profiling (Ribo-RET) translation initiation maps offer unprecedented insights into the relative utilization of annotated and alternative TIS, enabling the exploration of gene regulatory mechanisms that control TIS usage and, consequently, the translation of N-terminal proteoform pairs.
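The idea of "relative utilization" of an annotated and an alternative TIS can be illustrated with a minimal Python sketch. It is not the pipeline used in the study: the names TisCounts, alternative_fraction and utilization_shift, the pseudocount of 0.5, and the example read counts are all assumptions made for illustration. The sketch only shows how Ribo-RET read counts at the two start sites of one gene might be turned into a fraction that can be compared across conditions.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class TisCounts:
    """Hypothetical Ribo-RET read counts at the two start sites of one gene."""
    annotated: int
    alternative: int


def alternative_fraction(counts: TisCounts, pseudocount: float = 0.5) -> float:
    """Fraction of initiation signal at the alternative TIS.

    The pseudocount (an assumption of this sketch) keeps the ratio defined
    when both sites have zero reads.
    """
    annotated = counts.annotated + pseudocount
    alternative = counts.alternative + pseudocount
    return alternative / (annotated + alternative)


def utilization_shift(condition_a: TisCounts, condition_b: TisCounts) -> float:
    """Change in alternative-TIS usage going from condition A to condition B."""
    return alternative_fraction(condition_b) - alternative_fraction(condition_a)


if __name__ == "__main__":
    # Illustrative counts only, not data from the study.
    exponential = TisCounts(annotated=30, alternative=10)
    stationary = TisCounts(annotated=10, alternative=30)
    print(round(utilization_shift(exponential, stationary), 3))
```

A positive shift means the alternative TIS carries a larger share of the initiation signal in the second condition. A real analysis would also need normalization for library size and replicate-aware statistics, which this sketch leaves out.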
Affiliation(s)
- Igor Fijalkowski
  - iRIP Unit, Laboratory of Microbiology, Department of Biochemistry and Microbiology, Ghent University, Ghent, Belgium
- Valdes Snauwaert
  - iRIP Unit, Laboratory of Microbiology, Department of Biochemistry and Microbiology, Ghent University, Ghent, Belgium
- Petra Van Damme
  - iRIP Unit, Laboratory of Microbiology, Department of Biochemistry and Microbiology, Ghent University, Ghent, Belgium
2
O’Connor PBF, Mahony J, Casey E, Baranov PV, van Sinderen D, Yordanova MM. Ribosome profiling reveals downregulation of UMP biosynthesis as the major early response to phage infection. Microbiol Spectr 2024; 12:e0398923. [PMID: 38451091 PMCID: PMC10986495 DOI: 10.1128/spectrum.03989-23] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 11/23/2023] [Accepted: 02/14/2024] [Indexed: 03/08/2024] Open
Abstract
Bacteria have evolved diverse defense mechanisms to counter bacteriophage attacks. Genetic programs activated upon infection characterize phage-host molecular interactions and ultimately determine the outcome of the infection. In this study, we applied ribosome profiling to monitor protein synthesis during the early stages of sk1 bacteriophage infection in Lactococcus cremoris. Our analysis revealed major changes in gene expression within 5 minutes of sk1 infection. Notably, we observed a specific and severe downregulation of several pyr operons which encode enzymes required for uridine monophosphate biosynthesis. Consistent with previous findings, this is likely an attempt by the host to starve the phage of nucleotides it requires for propagation. We also observed a gene expression response that we expect to benefit the phage. This included the upregulation of 40 ribosomal proteins that likely increased the host's translational capacity, concurrent with a downregulation of genes that promote translational fidelity (lepA and raiA). In addition to the characterization of host-phage gene expression responses, the obtained ribosome profiling data enabled us to identify two putative recoding events as well as dozens of loci currently annotated as pseudogenes that are actively translated. Furthermore, our study elucidated alterations in the dynamics of the translation process, as indicated by time-dependent changes in the metagene profile, suggesting global shifts in translation rates upon infection. Additionally, we observed consistent modifications in the ribosome profiles of individual genes, which were apparent as early as 2 minutes post-infection. The study emphasizes our ability to capture rapid alterations of gene expression during phage infection through ribosome profiling. IMPORTANCE The ribosome profiling technology has provided invaluable insights for understanding cellular translation and eukaryotic viral infections. However, its potential for investigating host-phage interactions remains largely untapped. Here, we applied ribosome profiling to Lactococcus cremoris cultures infected with sk1, a major infectious agent in dairy fermentation processes. This revealed a profound downregulation of genes involved in pyrimidine nucleotide synthesis at an early stage of phage infection, suggesting an anti-phage program aimed at restricting nucleotide availability and, consequently, phage propagation. This is consistent with recent findings and contributes to our growing appreciation for the role of nucleotide limitation as an anti-viral strategy. In addition to capturing rapid alterations in gene expression levels, we identified translation occurring outside annotated regions, as well as signatures of non-standard translation mechanisms. The gene profiles revealed specific changes in ribosomal densities upon infection, reflecting alterations in the dynamics of the translation process.
Affiliation(s)
- Patrick B. F. O’Connor
  - School of Biochemistry and Cell Biology, University College Cork, Cork, Ireland
  - EIRNA Bio, Bioinnovation Hub, Cork, Ireland
- Jennifer Mahony
  - School of Microbiology and APC Microbiome Ireland, University College Cork, Cork, Ireland
- Eoghan Casey
  - School of Microbiology and APC Microbiome Ireland, University College Cork, Cork, Ireland
- Pavel V. Baranov
  - School of Biochemistry and Cell Biology, University College Cork, Cork, Ireland
- Douwe van Sinderen
  - School of Microbiology and APC Microbiome Ireland, University College Cork, Cork, Ireland
3
Simoens L, Fijalkowski I, Van Damme P. Exposing the small protein load of bacterial life. FEMS Microbiol Rev 2023; 47:fuad063. [PMID: 38012116 PMCID: PMC10723866 DOI: 10.1093/femsre/fuad063] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 05/09/2023] [Revised: 11/10/2023] [Accepted: 11/24/2023] [Indexed: 11/29/2023] Open
Abstract
The ever-growing repertoire of genomic techniques continues to expand our understanding of the true diversity and richness of prokaryotic genomes. Riboproteogenomics laid the foundation for dynamic studies of previously overlooked genomic elements. Most strikingly, bacterial genomes were revealed to harbor robust repertoires of small open reading frames (sORFs) encoding a diverse and broadly expressed range of small proteins, or sORF-encoded polypeptides (SEPs). In recent years, continuous efforts led to great improvements in the annotation and characterization of such proteins, yet many challenges remain to fully comprehend the pervasive nature of small proteins and their impact on bacterial biology. In this work, we review the recent developments in the dynamic field of bacterial genome reannotation, catalog the important biological roles carried out by small proteins and identify challenges obstructing the way to full understanding of these elusive proteins.
Affiliation(s)
- Laure Simoens
  - iRIP Unit, Laboratory of Microbiology, Department of Biochemistry and Microbiology, Ghent University, K. L. Ledeganckstraat 35, 9000 Ghent, Belgium
- Igor Fijalkowski
  - iRIP Unit, Laboratory of Microbiology, Department of Biochemistry and Microbiology, Ghent University, K. L. Ledeganckstraat 35, 9000 Ghent, Belgium
- Petra Van Damme
  - iRIP Unit, Laboratory of Microbiology, Department of Biochemistry and Microbiology, Ghent University, K. L. Ledeganckstraat 35, 9000 Ghent, Belgium
4
Fijalkowski I, Willems P, Jonckheere V, Simoens L, Van Damme P. Hidden in plain sight: challenges in proteomics detection of small ORF-encoded polypeptides. MICROLIFE 2022; 3:uqac005. [PMID: 37223358 PMCID: PMC10117744 DOI: 10.1093/femsml/uqac005] [Citation(s) in RCA: 4] [Impact Index Per Article: 2.0] [Received: 11/13/2021] [Revised: 04/18/2022] [Accepted: 04/29/2022] [Indexed: 05/25/2023]
Abstract
Genomic studies of bacteria have long pointed toward a widespread prevalence of small open reading frames (sORFs) encoding short proteins of fewer than 100 amino acids. Despite the mounting genomic evidence of their robust expression, relatively little progress has been made in their mass spectrometry-based detection, and various blanket statements have been used to explain this observed discrepancy. In this study, we provide a large-scale riboproteogenomics investigation of the challenging nature of proteomic detection of such small proteins as informed by conditional translation data. A panel of physicochemical properties alongside recently developed mass spectrometry detectability metrics was interrogated to provide a comprehensive evidence-based assessment of sORF-encoded polypeptide (SEP) detectability. Moreover, a large-scale proteomics and translatomics compendium of proteins produced by Salmonella Typhimurium (S. Typhimurium), a model human pathogen, across a panel of growth conditions is presented and used in support of our in silico SEP detectability analysis. This integrative approach is used to provide a data-driven census of small proteins expressed by S. Typhimurium across growth phases and infection-relevant conditions. Taken together, our study pinpoints current limitations in proteomics-based detection of novel small proteins currently missing from bacterial genome annotations.
Affiliation(s)
- Igor Fijalkowski
  - iRIP Unit, Laboratory of Microbiology, Department of Biochemistry and Microbiology, Ghent University, 9000 Ghent, Belgium
- Patrick Willems
  - iRIP Unit, Laboratory of Microbiology, Department of Biochemistry and Microbiology, Ghent University, 9000 Ghent, Belgium
- Veronique Jonckheere
  - iRIP Unit, Laboratory of Microbiology, Department of Biochemistry and Microbiology, Ghent University, 9000 Ghent, Belgium
- Laure Simoens
  - iRIP Unit, Laboratory of Microbiology, Department of Biochemistry and Microbiology, Ghent University, 9000 Ghent, Belgium
- Petra Van Damme
  - iRIP Unit, Laboratory of Microbiology, Department of Biochemistry and Microbiology, Ghent University, 9000 Ghent, Belgium
5
Gelhausen R, Müller T, Svensson SL, Alkhnbashi OS, Sharma CM, Eggenhofer F, Backofen R. RiboReport - benchmarking tools for ribosome profiling-based identification of open reading frames in bacteria. Brief Bioinform 2022; 23:bbab549. [PMID: 35037022 PMCID: PMC8921622 DOI: 10.1093/bib/bbab549] [Citation(s) in RCA: 7] [Impact Index Per Article: 3.5] [Received: 07/25/2021] [Revised: 11/22/2021] [Accepted: 11/29/2021] [Indexed: 11/19/2022] Open
Abstract
Small proteins encoded by short open reading frames (ORFs) with 50 codons or fewer are emerging as an important class of cellular macromolecules in diverse organisms. However, they often evade detection by proteomics or in silico methods. Ribosome profiling (Ribo-seq) has revealed widespread translation in genomic regions previously thought to be non-coding, driving the development of ORF detection tools using Ribo-seq data. However, only a handful of tools have been designed for bacteria, and these have not yet been systematically compared. Here, we aimed to identify tools that use Ribo-seq data to correctly determine the translational status of annotated bacterial ORFs and also discover novel translated regions with high sensitivity. To this end, we generated a large set of annotated ORFs from four diverse bacterial organisms, manually labeled for their translation status based on Ribo-seq data, which are available for future benchmarking studies. This set was used to investigate the predictive performance of seven Ribo-seq-based ORF detection tools (REPARATION_blast, DeepRibo, Ribo-TISH, PRICE, smORFer, ribotricer and SPECtre), as well as IRSOM, which uses coding potential and RNA-seq coverage only. DeepRibo and REPARATION_blast robustly predicted translated ORFs, including sORFs, with no significant difference for ORFs in close proximity to other genes versus stand-alone genes. However, no tool predicted a set of novel, experimentally verified sORFs with high sensitivity. Start codon predictions with smORFer show the value of initiation site profiling data to further improve the sensitivity of ORF prediction tools in bacteria. Overall, we find that bacterial tools perform well for sORF detection, although there is potential for improving their performance, applicability, usability and reproducibility.
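The kind of comparison described above, scoring a tool's predicted ORFs against a manually labeled set, can be sketched in Python as follows. This is a simplified illustration and not the RiboReport implementation: the function name benchmark, the use of plain ORF identifiers for matching, and the example labels are assumptions of the sketch.

```python
def benchmark(labels: dict[str, bool], predicted: set[str]) -> dict[str, float]:
    """Score predicted ORF identifiers against manual translated/not-translated labels.

    `labels` maps each ORF identifier to True (translated) or False (not translated);
    `predicted` holds the identifiers a tool called as translated.
    """
    tp = fp = tn = fn = 0
    for orf_id, is_translated in labels.items():
        called = orf_id in predicted
        if is_translated and called:
            tp += 1
        elif is_translated:
            fn += 1
        elif called:
            fp += 1
        else:
            tn += 1

    def ratio(numerator: int, denominator: int) -> float:
        # Return NaN rather than raising when a class is absent from the labels.
        return numerator / denominator if denominator else float("nan")

    return {
        "sensitivity": ratio(tp, tp + fn),
        "specificity": ratio(tn, tn + fp),
        "precision": ratio(tp, tp + fp),
    }


if __name__ == "__main__":
    # Illustrative labels only, not benchmark data.
    manual_labels = {"orf1": True, "orf2": True, "orf3": False, "orf4": False}
    tool_calls = {"orf1", "orf3"}
    print(benchmark(manual_labels, tool_calls))
```

Matching by identifier sidesteps the question of how much coordinate overlap counts as a hit, which a real benchmark has to define explicitly.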
Affiliation(s)
- Rick Gelhausen
  - Bioinformatics Group, Department of Computer Science, University of Freiburg, Georges-Köhler-Allee 106, 79110, Freiburg, Germany
- Teresa Müller
  - Bioinformatics Group, Department of Computer Science, University of Freiburg, Georges-Köhler-Allee 106, 79110, Freiburg, Germany
- Sarah L Svensson
  - Department of Molecular Infection Biology II, Institute of Molecular Infection Biology (IMIB), University of Würzburg, Josef-Schneider-Str. 2 / D15, 97080, Würzburg, Germany
- Omer S Alkhnbashi
  - Information and Computer Science Department, King Fahd University of Petroleum and Minerals, Saudi Arabia
  - SDAIA-KFUPM Joint Research Center for Artificial Intelligence (JRC-AI), King Fahd University of Petroleum and Minerals, Saudi Arabia
- Cynthia M Sharma
  - Department of Molecular Infection Biology II, Institute of Molecular Infection Biology (IMIB), University of Würzburg, Josef-Schneider-Str. 2 / D15, 97080, Würzburg, Germany
- Florian Eggenhofer
  - Bioinformatics Group, Department of Computer Science, University of Freiburg, Georges-Köhler-Allee 106, 79110, Freiburg, Germany
- Rolf Backofen
  - Bioinformatics Group, Department of Computer Science, University of Freiburg, Georges-Köhler-Allee 106, 79110, Freiburg, Germany
  - Signalling Research Centres BIOSS and CIBSS, University of Freiburg, Schänzlestr. 18, 79104, Freiburg, Germany
6
Hwang S, Lee N, Choe D, Lee Y, Kim W, Kim JH, Kim G, Kim H, Ahn NH, Lee BH, Palsson BO, Cho BK. System-Level Analysis of Transcriptional and Translational Regulatory Elements in Streptomyces griseus. Front Bioeng Biotechnol 2022; 10:844200. [PMID: 35284422 PMCID: PMC8914203 DOI: 10.3389/fbioe.2022.844200] [Citation(s) in RCA: 1] [Impact Index Per Article: 0.5] [Received: 12/27/2021] [Accepted: 02/10/2022] [Indexed: 11/13/2022] Open
Abstract
Bacteria belonging to Streptomyces have the ability to produce a wide range of secondary metabolites through a shift from primary to secondary metabolism regulated by complex networks activated after vegetative growth terminates. Despite considerable effort to understand the regulatory elements governing gene expression related to primary and secondary metabolism in Streptomyces, system-level information remains limited. In this study, we integrated four multi-omics datasets from Streptomyces griseus NBRC 13350: RNA-seq, ribosome profiling, dRNA-seq, and Term-Seq, to analyze the regulatory elements of transcription and translation of differentially expressed genes during cell growth. With the functional enrichment of gene expression in different growth phases, one sigma factor regulon and four transcription factor regulons governing differential gene transcription patterns were found. In addition, the regulatory elements of transcription termination and post-transcriptional processing at transcript 3'-end positions were elucidated, including their conserved motifs, stem-loop RNA structures, and non-terminal locations within the polycistronic operons, and the potential regulatory elements of translation initiation and elongation such as 5'-UTR length, RNA structures at ribosome-bound sites, and codon usage were investigated. This comprehensive genetic information provides a foundational genetic resource for strain engineering to enhance secondary metabolite production in Streptomyces.
Affiliation(s)
- Soonkyu Hwang
  - Department of Biological Sciences, Korea Advanced Institute of Science and Technology, Daejeon, South Korea
  - KAIST Institute for the BioCentury, Korea Advanced Institute of Science and Technology, Daejeon, South Korea
- Namil Lee
  - Department of Biological Sciences, Korea Advanced Institute of Science and Technology, Daejeon, South Korea
  - KAIST Institute for the BioCentury, Korea Advanced Institute of Science and Technology, Daejeon, South Korea
- Donghui Choe
  - Department of Biological Sciences, Korea Advanced Institute of Science and Technology, Daejeon, South Korea
  - KAIST Institute for the BioCentury, Korea Advanced Institute of Science and Technology, Daejeon, South Korea
- Yongjae Lee
  - Department of Biological Sciences, Korea Advanced Institute of Science and Technology, Daejeon, South Korea
  - KAIST Institute for the BioCentury, Korea Advanced Institute of Science and Technology, Daejeon, South Korea
- Woori Kim
  - Department of Biological Sciences, Korea Advanced Institute of Science and Technology, Daejeon, South Korea
  - KAIST Institute for the BioCentury, Korea Advanced Institute of Science and Technology, Daejeon, South Korea
- Ji Hun Kim
  - Department of Biological Sciences, Korea Advanced Institute of Science and Technology, Daejeon, South Korea
  - KAIST Institute for the BioCentury, Korea Advanced Institute of Science and Technology, Daejeon, South Korea
- Gahyeon Kim
  - Department of Biological Sciences, Korea Advanced Institute of Science and Technology, Daejeon, South Korea
  - KAIST Institute for the BioCentury, Korea Advanced Institute of Science and Technology, Daejeon, South Korea
- Hyeseong Kim
  - Department of Biological Sciences, Korea Advanced Institute of Science and Technology, Daejeon, South Korea
  - KAIST Institute for the BioCentury, Korea Advanced Institute of Science and Technology, Daejeon, South Korea
- Neung-Ho Ahn
  - Biological and Genetic Resources Assessment Division, National Institute of Biological Resources, Incheon, South Korea
- Byoung-Hee Lee
  - Biological and Genetic Resources Assessment Division, National Institute of Biological Resources, Incheon, South Korea
- Bernhard O. Palsson
  - Department of Bioengineering, University of California, San Diego, La Jolla, CA, United States
  - Department of Pediatrics, University of California, San Diego, La Jolla, CA, United States
  - Novo Nordisk Foundation Center for Biosustainability, Technical University of Denmark, Lyngby, Denmark
- Byung-Kwan Cho
  - Department of Biological Sciences, Korea Advanced Institute of Science and Technology, Daejeon, South Korea
  - KAIST Institute for the BioCentury, Korea Advanced Institute of Science and Technology, Daejeon, South Korea
7
Fijalkowski I, Peeters MKR, Van Damme P. Small Protein Enrichment Improves Proteomics Detection of sORF Encoded Polypeptides. Front Genet 2021; 12:713400. [PMID: 34721520 PMCID: PMC8554064 DOI: 10.3389/fgene.2021.713400] [Citation(s) in RCA: 9] [Impact Index Per Article: 3.0] [Received: 05/22/2021] [Accepted: 10/01/2021] [Indexed: 11/13/2022] Open
Abstract
With the rapid growth in the number of sequenced genomes, genome annotation efforts have become almost exclusively reliant on automated pipelines. Despite their unquestionable utility, these methods have been shown to underestimate the true complexity of the studied genomes, with small open reading frames (sORFs; ORFs typically considered shorter than 300 nucleotides) and, in consequence, their protein products (sORF encoded polypeptides or SEPs) being the primary example of a poorly annotated and highly underexplored class of genomic elements. With the advent of advanced translatomics such as ribosome profiling, reannotation efforts have progressed a great deal in providing translation evidence for numerous, previously unannotated sORFs. However, proteomics validation of these riboproteogenomics discoveries remains challenging due to their short length and often highly variable physicochemical properties. In this work we evaluate and compare tailored, yet easily adaptable, protein extraction methodologies for their efficacy in the extraction and concomitant proteomics detection of SEPs expressed in the prokaryotic model pathogen Salmonella typhimurium (S. typhimurium). Further, an optimized protocol for the enrichment and efficient detection of SEPs, making use of the amphipathic polymer amphipol A8-35 and relying on differential peptide vs. protein solubility, was developed and compared with global extraction methods making use of chaotropic agents. Given the versatile biological functions SEPs have been shown to exert, this work provides an accessible protocol for proteomics exploration of this fascinating class of small proteins.
Affiliation(s)
- Igor Fijalkowski
  - iRIP Unit, Laboratory of Microbiology, Department of Biochemistry and Microbiology, Ghent University, Gent, Belgium
- Marlies K. R. Peeters
  - BioBix, Department of Data Analysis and Mathematical Modelling, Ghent University, Gent, Belgium
- Petra Van Damme
  - iRIP Unit, Laboratory of Microbiology, Department of Biochemistry and Microbiology, Ghent University, Gent, Belgium
8
Fijalkowska D, Fijalkowski I, Willems P, Van Damme P. Bacterial riboproteogenomics: the era of N-terminal proteoform existence revealed. FEMS Microbiol Rev 2021; 44:418-431. [PMID: 32386204 DOI: 10.1093/femsre/fuaa013] [Citation(s) in RCA: 7] [Impact Index Per Article: 2.3] [Received: 12/19/2019] [Accepted: 05/07/2020] [Indexed: 12/17/2022] Open
Abstract
With the rapid increase in the number of sequenced prokaryotic genomes, relying on automated gene annotation became a necessity. Multiple lines of evidence, however, suggest that current bacterial genome annotations may contain inconsistencies and are incomplete, even for so-called well-annotated genomes. We here discuss underexplored sources of protein diversity and new methodologies for high-throughput genome reannotation. The expression of multiple molecular forms of proteins (proteoforms) from a single gene, particularly driven by alternative translation initiation, is gaining interest as a prominent contributor to bacterial protein diversity. In consequence, riboproteogenomic pipelines were proposed to comprehensively capture proteoform expression in prokaryotes by the complementary use of (positional) proteomics and the direct readout of translated genomic regions using ribosome profiling. To complement these discoveries, tailored strategies are required for the functional characterization of newly discovered bacterial proteoforms.
Affiliation(s)
- Daria Fijalkowska
  - Department of Biochemistry and Microbiology, Ghent University, K. L. Ledeganckstraat 35, B-9000 Ghent, Belgium
- Igor Fijalkowski
  - Department of Biochemistry and Microbiology, Ghent University, K. L. Ledeganckstraat 35, B-9000 Ghent, Belgium
- Patrick Willems
  - Department of Biochemistry and Microbiology, Ghent University, K. L. Ledeganckstraat 35, B-9000 Ghent, Belgium
- Petra Van Damme
  - Department of Biochemistry and Microbiology, Ghent University, K. L. Ledeganckstraat 35, B-9000 Ghent, Belgium
9
Oh E. Monitoring Bacterial Translation Rates Genome-Wide. Methods Mol Biol 2021; 2252:3-26. [PMID: 33765269 DOI: 10.1007/978-1-0716-1150-0_1] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 06/12/2023]
Abstract
Modern DNA sequencing technologies have allowed for the sequencing of tens of thousands of bacterial genomes. While this explosion of information has brought about new insights into the diversity of the prokaryotic world, much less is known of the identity of proteins encoded within these genomes, as well as their rates of production. The advent of ribosome profiling, or the deep sequencing of ribosome-protected footprints, has recently enabled the systematic evaluation of every protein-coding region in a given experimental condition, the rates of protein production for each gene, and the variability in translation rates across each message. Here, I provide an update to the bacterial ribosome profiling approach, with a particular emphasis on a simplified strategy to reduce cloning time.
Affiliation(s)
- Eugene Oh
  - Center for Cancer Research, Massachusetts General Hospital, Charlestown, MA, USA
  - Department of Medicine, Harvard Medical School, Boston, MA, USA
10
Meydan S, Klepacki D, Mankin AS, Vázquez-Laslop N. Identification of Translation Start Sites in Bacterial Genomes. Methods Mol Biol 2021; 2252:27-55. [PMID: 33765270 DOI: 10.1007/978-1-0716-1150-0_2] [Citation(s) in RCA: 6] [Impact Index Per Article: 2.0] [Indexed: 01/09/2023]
Abstract
Knowledge of translation start sites is crucial for the annotation of genes in bacterial genomes. However, systematic mapping of start codons in bacterial genes has mainly relied on predictions based on protein conservation and mRNA sequence features which, although useful, are not always accurate. We recently found that the pleuromutilin antibiotic retapamulin (RET) is a specific inhibitor of translation initiation that traps ribosomes specifically at start codons, and we used it in combination with ribosome profiling to map start codons in the Escherichia coli genome. This genome-wide strategy, named Ribo-RET, not only verifies the position of start codons in already annotated genes but also enables identification of previously unannotated open reading frames and reveals the presence of internal start sites within genes. Here, we provide a detailed Ribo-RET protocol for E. coli. Ribo-RET can be adapted for mapping the start codons of the protein-coding sequences in a variety of bacterial species.
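The three outcomes named above (confirming an annotated start, finding an internal start within a gene, and finding a start outside any annotation) can be sketched as a simple classification step in Python. This is an illustrative sketch, not part of the published protocol: the Gene class, the classify_peak function, the coordinate convention, and the example genes are assumptions, and the sketch presumes that peak positions have already been converted to the first nucleotide of the trapped start codon.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Gene:
    name: str
    five_prime: int   # coordinate of the first nucleotide of the annotated start codon
    three_prime: int  # coordinate of the last nucleotide of the ORF
    strand: str       # "+" or "-"


def classify_peak(peak: int, genes: list[Gene], strand: str) -> str:
    """Classify a hypothetical Ribo-RET peak relative to annotated genes."""
    direction = 1 if strand == "+" else -1
    for gene in genes:
        if gene.strand != strand:
            continue
        # Distance from the annotated start codon, measured 5' to 3' on the gene's strand.
        offset = (peak - gene.five_prime) * direction
        length = abs(gene.three_prime - gene.five_prime) + 1
        if offset == 0:
            return f"annotated start of {gene.name}"
        if 0 < offset < length:
            frame = "in-frame" if offset % 3 == 0 else "out-of-frame"
            return f"internal {frame} start within {gene.name}"
    return "unannotated"


if __name__ == "__main__":
    # Made-up coordinates for illustration.
    genes = [Gene("geneA", 100, 399, "+"), Gene("geneB", 900, 601, "-")]
    for position, strand in [(100, "+"), (130, "+"), (131, "+"), (870, "-"), (500, "+")]:
        print(position, strand, classify_peak(position, genes, strand))
```

An internal in-frame start would yield an N-terminally truncated form of the annotated protein, while an out-of-frame start would encode an unrelated polypeptide. The sketch returns the first matching gene and ignores overlapping annotations and positional tolerance around the peak.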
Affiliation(s)
- Sezen Meydan
  - National Institute of Diabetes and Digestive and Kidney Diseases, National Institutes of Health, Bethesda, MD, USA
- Dorota Klepacki
  - Center for Biomolecular Sciences, University of Illinois at Chicago, Chicago, IL, USA
- Alexander S Mankin
  - Center for Biomolecular Sciences, University of Illinois at Chicago, Chicago, IL, USA
- Nora Vázquez-Laslop
  - Center for Biomolecular Sciences, University of Illinois at Chicago, Chicago, IL, USA
11
Lost and Found: Re-searching and Re-scoring Proteomics Data Aids Genome Annotation and Improves Proteome Coverage. mSystems 2020; 5:e00833-20. [PMID: 33109751 PMCID: PMC7593589 DOI: 10.1128/msystems.00833-20] [Citation(s) in RCA: 9] [Impact Index Per Article: 2.3] [Indexed: 01/12/2023] Open
Abstract
Prokaryotic genome annotation is heavily dependent on automated gene annotation pipelines that are prone to propagate errors and underestimate genome complexity. We describe an optimized proteogenomic workflow that uses ribosome profiling (ribo-seq) and proteomic data for Salmonella enterica serovar Typhimurium to identify unannotated proteins or alternative protein forms. This data analysis encompasses the searching of cofragmenting peptides and postprocessing with extended peptide-to-spectrum quality features, including comparison to predicted fragment ion intensities. When this strategy is applied, an enhanced proteome depth is achieved, as well as greater confidence for unannotated peptide hits. We demonstrate the general applicability of our pipeline by reanalyzing public Deinococcus radiodurans data sets. Taken together, our results show that systematic reanalysis using available prokaryotic (proteome) data sets holds great promise to assist in experimentally based genome annotation. IMPORTANCE Delineation of open reading frames (ORFs) causes persistent inconsistencies in prokaryote genome annotation. We demonstrate that by advanced (re)analysis of omics data, a higher proteome coverage and sensitive detection of unannotated ORFs can be achieved, which can be exploited for conditional bacterial genome (re)annotation, which is especially relevant in view of annotating the wealth of sequenced prokaryotic genomes obtained in recent years.
12
Goel N, Singh S, Aseri TC. Global sequence features based translation initiation site prediction in human genomic sequences. Heliyon 2020; 6:e04825. [PMID: 32964155 PMCID: PMC7490824 DOI: 10.1016/j.heliyon.2020.e04825] [Citation(s) in RCA: 3] [Impact Index Per Article: 0.8] [Received: 01/20/2019] [Revised: 05/25/2020] [Accepted: 08/26/2020] [Indexed: 11/26/2022] Open
Abstract
Gene prediction has become increasingly important in genome annotation due to advancements in sequencing technology. Genome annotation further helps in determining the structure and function of these genes. Translation initiation site (TIS) prediction in human genomic sequences is one of the fundamental and essential steps in gene prediction. Thus, accurate prediction of TIS in these sequences is highly desirable. Although many computational methods were developed for this problem, none of them focused on finding these sites in human genomic sequences. In this paper, a new TIS prediction method is proposed by incorporating global sequence-based features. A support vector machine is used to assess the predictive power of these features. The proposed method achieved an accuracy of above 90% when tested on genomic as well as cDNA sequences. The experimental results indicate that the method works well for both genomic and cDNA sequences. The method can be integrated into a gene prediction system in the future.
Affiliation(s)
- Neelam Goel
- Department of Information Technology, University Institute of Engineering and Technology, Sector-25, Panjab University, Chandigarh 160014, India
- Shailendra Singh
- Department of Computer Science and Engineering, Punjab Engineering College (Deemed to be University), Sector-12, Chandigarh 160012, India
- Trilok Chand Aseri
- Department of Computer Science and Engineering, Punjab Engineering College (Deemed to be University), Sector-12, Chandigarh 160012, India
|
13
|
Glaub A, Huptas C, Neuhaus K, Ardern Z. Recommendations for bacterial ribosome profiling experiments based on bioinformatic evaluation of published data. J Biol Chem 2020; 295:8999-9011. [PMID: 32385111 PMCID: PMC7335797 DOI: 10.1074/jbc.ra119.012161] [Citation(s) in RCA: 8] [Impact Index Per Article: 2.0] [Received: 12/09/2019] [Revised: 05/05/2020] [Indexed: 02/03/2023] Open
Abstract
Ribosome profiling (RIBO-Seq) has improved our understanding of bacterial translation, including finding many unannotated genes. However, protocols for RIBO-Seq and corresponding data analysis are not yet standardized. Here, we analyzed 48 RIBO-Seq samples from nine studies of Escherichia coli K12 grown in lysogeny broth medium and particularly focused on the size-selection step. We show that for conventional expression analysis, a size range between 22 and 30 nucleotides is sufficient to obtain protein-coding fragments, which has the advantage of removing many unwanted rRNA and tRNA reads. More specific analyses may require longer reads and a corresponding improvement in rRNA/tRNA depletion. There is no consensus about the appropriate sequencing depth for RIBO-Seq experiments in prokaryotes, and studies vary significantly in total read number. Our analysis suggests that 20 million reads that are not mapping to rRNA/tRNA are required for global detection of translated annotated genes. We also highlight the influence of drug-induced ribosome stalling, which causes bias at translation start sites. The resulting accumulation of reads at the start site may be especially useful for detecting weakly expressed genes. As different methods suit different questions, it may not be possible to produce a "one-size-fits-all" ribosome profiling data set. Therefore, experiments should be carefully designed in light of the scientific questions of interest. We propose some basic characteristics that should be reported with any new RIBO-Seq data sets. Careful attention to the factors discussed should improve prokaryotic gene detection and the comparability of ribosome profiling data sets.
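The two numeric recommendations in this abstract (a 22 to 30 nucleotide size range and roughly 20 million reads not mapping to rRNA/tRNA) can be illustrated with a minimal sketch. The function names and the toy read lengths below are hypothetical and are not taken from the authors' analysis; only the two thresholds come from the abstract.

```python
from collections import Counter

MIN_LEN, MAX_LEN = 22, 30        # size range quoted in the abstract
MIN_USABLE_READS = 20_000_000    # non-rRNA/tRNA reads suggested in the abstract


def select_footprints(read_lengths, min_len=MIN_LEN, max_len=MAX_LEN):
    """Return a length histogram restricted to the selected size range."""
    counts = Counter(read_lengths)
    return {n: c for n, c in sorted(counts.items()) if min_len <= n <= max_len}


def depth_is_sufficient(usable_reads, threshold=MIN_USABLE_READS):
    """Check a count of non-rRNA/tRNA reads against the suggested depth."""
    return usable_reads >= threshold


# Toy input: lengths of eight hypothetical reads.
kept = select_footprints([18, 22, 24, 24, 27, 30, 31, 35])
print(kept)
print(depth_is_sufficient(sum(kept.values())))
```

In practice the lengths would come from aligned reads after rRNA/tRNA removal; the sketch only shows where the two thresholds would be applied.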
Affiliation(s)
- Alina Glaub
- Chair for Microbial Ecology, Technical University of Munich, Freising, Germany
- Christopher Huptas
- Chair for Microbial Ecology, Technical University of Munich, Freising, Germany
- Klaus Neuhaus
- Chair for Microbial Ecology, Technical University of Munich, Freising, Germany; Core Facility Microbiome, ZIEL Institute for Food and Health, Technical University of Munich, Freising, Germany
- Zachary Ardern
- Chair for Microbial Ecology, Technical University of Munich, Freising, Germany
|
14
|
Clauwaert J, Menschaert G, Waegeman W. DeepRibo: a neural network for precise gene annotation of prokaryotes by combining ribosome profiling signal and binding site patterns. Nucleic Acids Res 2019; 47:e36. [PMID: 30753697 PMCID: PMC6451124 DOI: 10.1093/nar/gkz061] [Citation(s) in RCA: 43] [Impact Index Per Article: 8.6] [Received: 09/26/2018] [Revised: 01/02/2019] [Accepted: 01/30/2019] [Indexed: 12/13/2022] Open
Abstract
Annotation of gene expression in prokaryotes often finds itself corrected due to small variations of the annotated gene regions observed between different (sub)-species. It has become apparent that traditional sequence alignment algorithms, used for the curation of genomes, are not able to map the full complexity of the genomic landscape. We present DeepRibo, a novel neural network utilizing features extracted from ribosome profiling information and binding site sequence patterns that shows to be a precise tool for the delineation and annotation of expressed genes in prokaryotes. The neural network combines recurrent memory cells and convolutional layers, adapting the information gained from both the high-throughput ribosome profiling data and ribosome binding translation initiation sequence region into one model. DeepRibo is designed as a single model trained on a variety of ribosome profiling experiments, used for the identification of open reading frames in prokaryotes without a priori knowledge of the translational landscape. Through extensive validation of the model trained on various sets of data, multiple species sequence similarity, mass spectrometry and Edman degradation verified proteins, the effectiveness of DeepRibo is highlighted.
Affiliation(s)
- Jim Clauwaert
- KERMIT, Department of Data Analysis and Mathematical Modelling, Ghent University, Coupure Links 653, 9000 Gent, Belgium
- Gerben Menschaert
- Biobix, Department of Data Analysis and Mathematical Modelling, Ghent University, Coupure Links 653, 9000 Gent, Belgium
- Willem Waegeman
- KERMIT, Department of Data Analysis and Mathematical Modelling, Ghent University, Coupure Links 653, 9000 Gent, Belgium
|
15
|
Korandla DR, Wozniak JM, Campeau A, Gonzalez DJ, Wright ES. AssessORF: combining evolutionary conservation and proteomics to assess prokaryotic gene predictions. Bioinformatics 2019; 36:1022-1029. [PMID: 31532487 PMCID: PMC7998711 DOI: 10.1093/bioinformatics/btz714] [Citation(s) in RCA: 6] [Impact Index Per Article: 1.2] [Received: 06/18/2019] [Revised: 09/05/2019] [Accepted: 09/13/2019] [Indexed: 01/31/2023] Open
Abstract
MOTIVATION A core task of genomics is to identify the boundaries of protein coding genes, which may cover over 90% of a prokaryote's genome. Several programs are available for gene finding, yet it is currently unclear how well these programs perform and whether any offers superior accuracy. This is in part because there is no universal benchmark for gene finding and, therefore, most developers select their own benchmarking strategy. RESULTS Here, we introduce AssessORF, a new approach for benchmarking prokaryotic gene predictions based on evidence from proteomics data and the evolutionary conservation of start and stop codons. We applied AssessORF to compare gene predictions offered by GenBank, GeneMarkS-2, Glimmer and Prodigal on genomes spanning the prokaryotic tree of life. Gene predictions were 88-95% in agreement with the available evidence, with Glimmer performing the worst but no clear winner. All programs were biased towards selecting start codons that were upstream of the actual start. Given these findings, there remains considerable room for improvement, especially in the detection of correct start sites. AVAILABILITY AND IMPLEMENTATION AssessORF is available as an R package via the Bioconductor package repository. SUPPLEMENTARY INFORMATION Supplementary data are available at Bioinformatics online.
Affiliation(s)
- Deepank R Korandla
- Department of Biological Sciences, USA; Computational Biology Department, Carnegie Mellon University, Pittsburgh, PA 15213, USA; Department of Biomedical Informatics, University of Pittsburgh School of Medicine, Pittsburgh, PA 15219, USA
- Jacob M Wozniak
- Department of Pharmacology, University of California San Diego, La Jolla, CA 92093, USA; Skaggs School of Pharmacy and Pharmaceutical Sciences, University of California San Diego, La Jolla, CA 92093, USA
- Anaamika Campeau
- Department of Pharmacology, University of California San Diego, La Jolla, CA 92093, USA; Skaggs School of Pharmacy and Pharmaceutical Sciences, University of California San Diego, La Jolla, CA 92093, USA
- David J Gonzalez
- Department of Pharmacology, University of California San Diego, La Jolla, CA 92093, USA; Skaggs School of Pharmacy and Pharmaceutical Sciences, University of California San Diego, La Jolla, CA 92093, USA
|
16
|
Retapamulin-Assisted Ribosome Profiling Reveals the Alternative Bacterial Proteome. Mol Cell 2019; 74:481-493.e6. [PMID: 30904393 DOI: 10.1016/j.molcel.2019.02.017] [Citation(s) in RCA: 101] [Impact Index Per Article: 20.2] [Received: 11/19/2018] [Revised: 01/25/2019] [Accepted: 02/12/2019] [Indexed: 12/21/2022]
Abstract
The use of alternative translation initiation sites enables production of more than one protein from a single gene, thereby expanding the cellular proteome. Although several such examples have been serendipitously found in bacteria, genome-wide mapping of alternative translation start sites has been unattainable. We found that the antibiotic retapamulin specifically arrests initiating ribosomes at start codons of the genes. Retapamulin-enhanced Ribo-seq analysis (Ribo-RET) not only allowed mapping of conventional initiation sites at the beginning of the genes, but strikingly, it also revealed putative internal start sites in a number of Escherichia coli genes. Experiments demonstrated that the internal start codons can be recognized by the ribosomes and direct translation initiation in vitro and in vivo. Proteins, whose synthesis is initiated at internal in-frame and out-of-frame start sites, can be functionally important and contribute to the "alternative" bacterial proteome. The internal start sites may also play regulatory roles in gene expression.
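The distinction between internal in-frame and out-of-frame start sites can be made concrete with a short sketch that lists candidate start codons inside an annotated ORF. This is sequence scanning only and is not the Ribo-RET method, which relies on ribosome footprints from retapamulin-treated cells; the start-codon set, the function name, and the toy sequence are assumptions made for illustration.

```python
START_CODONS = {"ATG", "GTG", "TTG"}  # assumption: common bacterial start codons


def internal_starts(orf):
    """List (position, codon, frame) for start codons inside an ORF.

    Position is 0-based relative to the annotated start. Frame 0 means
    in-frame with the annotated start; frames 1 and 2 are out-of-frame.
    The annotated start itself (position 0) is skipped.
    """
    orf = orf.upper()
    hits = []
    for pos in range(1, len(orf) - 2):
        codon = orf[pos:pos + 3]
        if codon in START_CODONS:
            hits.append((pos, codon, pos % 3))
    return hits


# Hypothetical toy ORF with one in-frame and one out-of-frame candidate.
for pos, codon, frame in internal_starts("ATGAAAGTGCATGCCTAA"):
    label = "in-frame" if frame == 0 else "out-of-frame"
    print(pos, codon, label)
```

A candidate from such a scan would only count as an initiation site if it were supported by experimental evidence such as a footprint peak at that position.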
|
17
|
Low TY, Mohtar MA, Ang MY, Jamal R. Connecting Proteomics to Next‐Generation Sequencing: Proteogenomics and Its Current Applications in Biology. Proteomics 2018; 19:e1800235. [DOI: 10.1002/pmic.201800235] [Citation(s) in RCA: 23] [Impact Index Per Article: 3.8] [Received: 06/11/2018] [Revised: 10/09/2018] [Indexed: 12/17/2022]
Affiliation(s)
- Teck Yew Low
- UKM Medical Molecular Biology Institute (UMBI), Universiti Kebangsaan Malaysia, 56000 Kuala Lumpur, Malaysia
- M. Aiman Mohtar
- UKM Medical Molecular Biology Institute (UMBI), Universiti Kebangsaan Malaysia, 56000 Kuala Lumpur, Malaysia
- Mia Yang Ang
- UKM Medical Molecular Biology Institute (UMBI), Universiti Kebangsaan Malaysia, 56000 Kuala Lumpur, Malaysia
- Rahman Jamal
- UKM Medical Molecular Biology Institute (UMBI), Universiti Kebangsaan Malaysia, 56000 Kuala Lumpur, Malaysia
|
18
|
Ten-Caten F, Vêncio RZN, Lorenzetti APR, Zaramela LS, Santana AC, Koide T. Internal RNAs overlapping coding sequences can drive the production of alternative proteins in archaea. RNA Biol 2018; 15:1119-1132. [PMID: 30175688 PMCID: PMC6161675 DOI: 10.1080/15476286.2018.1509661] [Citation(s) in RCA: 4] [Impact Index Per Article: 0.7] [Indexed: 11/11/2022] Open
Abstract
Prokaryotic genomes show a high level of information compaction often with different molecules transcribed from the same locus. Although antisense RNAs have been relatively well studied, RNAs in the same strand, internal RNAs (intraRNAs), are still poorly understood. The question of how common is the translation of overlapping reading frames remains open. We address this question in the model archaeon Halobacterium salinarum. In the present work we used differential RNA-seq (dRNA-seq) in H. salinarum NRC-1 to locate intraRNA signals in subsets of internal transcription start sites (iTSS) and establish the open reading frames associated to them (intraORFs). Using C-terminally flagged proteins, we experimentally observed isoforms accurately predicted by intraRNA translation for kef1, acs3 and orc4 genes. We also recovered from the literature and mass spectrometry databases several instances of protein isoforms consistent with intraRNA translation such as the gas vesicle protein gene gvpC1. We found evidence for intraRNAs in horizontally transferred genes such as the chaperone dnaK and the aerobic respiration related cydA in both H. salinarum and Escherichia coli. Also, intraRNA translation evidence in H. salinarum, E. coli and yeast of a universal elongation factor (aEF-2, fusA and eEF-2) suggests that this is an ancient phenomenon present in all domains of life.
Affiliation(s)
- Felipe Ten-Caten
- Department of Biochemistry and Immunology, Ribeirão Preto Medical School, University of São Paulo, Ribeirão Preto, Brazil
- Ricardo Z N Vêncio
- Department of Computation and Mathematics, Faculdade de Filosofia, Ciências e Letras de Ribeirão Preto, University of São Paulo, Ribeirão Preto, Brazil
- Alan Péricles R Lorenzetti
- Department of Biochemistry and Immunology, Ribeirão Preto Medical School, University of São Paulo, Ribeirão Preto, Brazil
- Livia Soares Zaramela
- Department of Biochemistry and Immunology, Ribeirão Preto Medical School, University of São Paulo, Ribeirão Preto, Brazil
- Ana Carolina Santana
- Department of Cell and Molecular Biology and Pathogenic Bioagents, Ribeirão Preto Medical School, University of São Paulo, Ribeirão Preto, Brazil
- Tie Koide
- Department of Biochemistry and Immunology, Ribeirão Preto Medical School, University of São Paulo, Ribeirão Preto, Brazil
|
19
|
Birkeland Å, Chyżyńska K, Valen E. Shoelaces: an interactive tool for ribosome profiling processing and visualization. BMC Genomics 2018; 19:543. [PMID: 30021517 PMCID: PMC6052522 DOI: 10.1186/s12864-018-4912-6] [Citation(s) in RCA: 8] [Impact Index Per Article: 1.3] [Received: 05/22/2018] [Accepted: 07/02/2018] [Indexed: 01/23/2023] Open
Abstract
Background The emergence of ribosome profiling to map actively translating ribosomes has laid the foundation for a diverse range of studies on translational regulation. The data obtained with different variations of this assay is typically manually processed, which has created a need for tools that would streamline and standardize processing steps. Results We present Shoelaces, a toolkit for ribosome profiling experiments automating read selection and filtering to obtain genuine translating footprints. Based on periodicity, favoring enrichment over the coding regions, it determines the read lengths corresponding to bona fide ribosome protected fragments. The specific codon under translation (P-site) is determined by automatic offset calculations resulting in sub-codon resolution. Shoelaces provides both a user-friendly graphical interface for interactive visualisation in a genome browser-like fashion and a command line interface for integration into automated pipelines. We process 79 libraries and show that studies typically discard excessive amounts of quality data in their manual analysis pipelines. Conclusions Shoelaces streamlines ribosome profiling analysis offering automation of the processing, a range of interactive visualization features and export of the data into standard formats. Shoelaces stores all processing steps performed in an XML file that can be used by other groups to exactly reproduce the processing of a given study. We therefore anticipate that Shoelaces can aid researchers by automating what is typically performed manually and contribute to the overall reproducibility of studies. The tool is freely distributed as a Python package, with additional instructions, tutorial and demo datasets available at https://bitbucket.org/valenlab/shoelaces.
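The idea of judging read lengths by periodicity after applying a P-site offset can be sketched as follows. This is an illustrative calculation under stated assumptions, not the Shoelaces implementation or its API: the function name, the offset of 12 nucleotides, and the toy coordinates are all hypothetical.

```python
from collections import Counter


def frame_fractions(five_prime_positions, offset):
    """Fraction of reads whose inferred P-site falls in frames 0, 1 and 2.

    five_prime_positions: 5' end coordinates relative to the first
    nucleotide of the start codon (0-based; may be negative).
    offset: assumed distance from the read 5' end to the P-site.
    """
    if not five_prime_positions:
        raise ValueError("no reads supplied")
    frames = Counter((pos + offset) % 3 for pos in five_prime_positions)
    total = sum(frames.values())
    return [frames.get(f, 0) / total for f in range(3)]


# Hypothetical 5' ends for reads of a single length, with an assumed offset.
fractions = frame_fractions([-12, -12, -9, -6, -5, -3, 0, 3], offset=12)
print(fractions)
```

A read length whose footprints concentrate in one frame shows the three-nucleotide periodicity expected of translating ribosomes, whereas a roughly even split across frames would argue against keeping that length.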
Affiliation(s)
- Åsmund Birkeland
- Department of Informatics, University of Bergen, Bergen, 5008, Norway
- Katarzyna Chyżyńska
- Computational Biology Unit, Department of Informatics, University of Bergen, Bergen, 5008, Norway
- Eivind Valen
- Computational Biology Unit, Department of Informatics, University of Bergen, Bergen, 5008, Norway; Sars International Centre for Marine Molecular Biology, University of Bergen, Bergen, 5008, Norway
|