51
Fijalkowska D, Fijalkowski I, Willems P, Van Damme P. Bacterial riboproteogenomics: the era of N-terminal proteoform existence revealed. FEMS Microbiol Rev 2021; 44:418-431. [PMID: 32386204 DOI: 10.1093/femsre/fuaa013] [Citation(s) in RCA: 7] [Impact Index Per Article: 2.3] [Received: 12/19/2019] [Accepted: 05/07/2020] [Indexed: 12/17/2022]
Abstract
With the rapid increase in the number of sequenced prokaryotic genomes, relying on automated gene annotation became a necessity. Multiple lines of evidence, however, suggest that current bacterial genome annotations may contain inconsistencies and are incomplete, even for so-called well-annotated genomes. We here discuss underexplored sources of protein diversity and new methodologies for high-throughput genome reannotation. The expression of multiple molecular forms of proteins (proteoforms) from a single gene, particularly driven by alternative translation initiation, is gaining interest as a prominent contributor to bacterial protein diversity. In consequence, riboproteogenomic pipelines were proposed to comprehensively capture proteoform expression in prokaryotes by the complementary use of (positional) proteomics and the direct readout of translated genomic regions using ribosome profiling. To complement these discoveries, tailored strategies are required for the functional characterization of newly discovered bacterial proteoforms.
Affiliation(s)
- Daria Fijalkowska
- Department of Biochemistry and Microbiology, Ghent University, K. L. Ledeganckstraat 35, B-9000 Ghent, Belgium
- Igor Fijalkowski
- Department of Biochemistry and Microbiology, Ghent University, K. L. Ledeganckstraat 35, B-9000 Ghent, Belgium
- Patrick Willems
- Department of Biochemistry and Microbiology, Ghent University, K. L. Ledeganckstraat 35, B-9000 Ghent, Belgium
- Petra Van Damme
- Department of Biochemistry and Microbiology, Ghent University, K. L. Ledeganckstraat 35, B-9000 Ghent, Belgium
52
Yang M, Shang X, Zhou Y, Wang C, Wei G, Tang J, Zhang M, Liu Y, Cao J, Zhang Q. Full-Length Transcriptome Analysis of Plasmodium falciparum by Single-Molecule Long-Read Sequencing. Front Cell Infect Microbiol 2021; 11:631545. [PMID: 33708645 PMCID: PMC7942025 DOI: 10.3389/fcimb.2021.631545] [Citation(s) in RCA: 20] [Impact Index Per Article: 6.7] [Received: 11/20/2020] [Accepted: 01/05/2021] [Indexed: 11/25/2022]
Abstract
Malaria, an infectious disease caused by Plasmodium parasites, has continued to cause a substantial number of deaths each year over recent decades. Despite the significance of Plasmodium falciparum as a model organism for malaria parasites, our understanding of gene expression in this parasite remains limited, since much of the progress on its genome and transcriptome is based on assembly from short sequencing reads. Herein, we report a new version of the transcriptome dataset containing full-length transcripts across the whole asexual blood stages, obtained by adopting a full-length sequencing approach with optimized experimental conditions for cDNA library preparation. We identified a total of 393 alternative splicing (AS) events, 3,623 long non-coding RNAs (lncRNAs), 1,555 alternative polyadenylation (APA) events, 57 transcription factors (TF), and 1,721 fusion transcripts in P. falciparum. Furthermore, shotgun proteomics was performed to validate the full-length transcriptome of P. falciparum. More importantly, integration of the full-length transcriptomic and proteomic data identified 160 novel small proteins in lncRNA regions. Collectively, this high-quality, accurate full-length transcriptome dataset and the shotgun proteome analyses shed light on the complex gene expression in malaria parasites and provide a valuable resource for related functional and mechanistic research on P. falciparum genes.
Affiliation(s)
- Mengquan Yang
- Research Center for Translational Medicine, Key Laboratory of Arrhythmias of the Ministry of Education of China, East Hospital, Tongji University School of Medicine, Shanghai, China; State Key Laboratory of Drug Research, Shanghai Institute of Materia Medica, Chinese Academy of Sciences, Shanghai, China; CAS Key Laboratory of Synthetic Biology, CAS Center for Excellence in Molecular Plant Sciences, Chinese Academy of Sciences, Shanghai, China
- Xiaomin Shang
- Research Center for Translational Medicine, Key Laboratory of Arrhythmias of the Ministry of Education of China, East Hospital, Tongji University School of Medicine, Shanghai, China
- Yiqing Zhou
- CAS Key Laboratory of Synthetic Biology, CAS Center for Excellence in Molecular Plant Sciences, Chinese Academy of Sciences, Shanghai, China
- Changhong Wang
- Research Center for Translational Medicine, Key Laboratory of Arrhythmias of the Ministry of Education of China, East Hospital, Tongji University School of Medicine, Shanghai, China
- Guiying Wei
- Research Center for Translational Medicine, Key Laboratory of Arrhythmias of the Ministry of Education of China, East Hospital, Tongji University School of Medicine, Shanghai, China
- Jianxia Tang
- National Health Commission Key Laboratory of Parasitic Disease Control and Prevention, Jiangsu Provincial Key Laboratory on Parasite and Vector Control Technology, Jiangsu Institute of Parasitic Diseases, Wuxi, China
- Meihua Zhang
- National Health Commission Key Laboratory of Parasitic Disease Control and Prevention, Jiangsu Provincial Key Laboratory on Parasite and Vector Control Technology, Jiangsu Institute of Parasitic Diseases, Wuxi, China
- Yaobao Liu
- National Health Commission Key Laboratory of Parasitic Disease Control and Prevention, Jiangsu Provincial Key Laboratory on Parasite and Vector Control Technology, Jiangsu Institute of Parasitic Diseases, Wuxi, China
- Jun Cao
- National Health Commission Key Laboratory of Parasitic Disease Control and Prevention, Jiangsu Provincial Key Laboratory on Parasite and Vector Control Technology, Jiangsu Institute of Parasitic Diseases, Wuxi, China; Center for Global Health, School of Public Health, Nanjing Medical University, Nanjing, China
- Qingfeng Zhang
- Research Center for Translational Medicine, Key Laboratory of Arrhythmias of the Ministry of Education of China, East Hospital, Tongji University School of Medicine, Shanghai, China
53
Neville MDC, Kohze R, Erady C, Meena N, Hayden M, Cooper DN, Mort M, Prabakaran S. A platform for curated products from novel open reading frames prompts reinterpretation of disease variants. Genome Res 2021; 31:327-336. [PMID: 33468550 PMCID: PMC7849405 DOI: 10.1101/gr.263202.120] [Citation(s) in RCA: 14] [Impact Index Per Article: 4.7] [Received: 03/05/2020] [Accepted: 08/26/2020] [Indexed: 11/29/2022]
Abstract
Recent evidence from proteomics and deep massively parallel sequencing studies have revealed that eukaryotic genomes contain substantial numbers of as-yet-uncharacterized open reading frames (ORFs). We define these uncharacterized ORFs as novel ORFs (nORFs). nORFs in humans are mostly under 100 codons and are found in diverse regions of the genome, including in long noncoding RNAs, pseudogenes, 3' UTRs, 5' UTRs, and alternative reading frames of canonical protein coding exons. There is therefore a pressing need to evaluate the potential functional importance of these unannotated transcripts and proteins in biological pathways and human disease on a larger scale, rather than one at a time. In this study, we outline the creation of a valuable nORFs data set with experimental evidence of translation for the community, use measures of heritability and selection that reveal signals for functional importance, and show the potential implications for functional interpretation of genetic variants in nORFs. Our results indicate that some variants that were previously classified as being benign or of uncertain significance may have to be reinterpreted.
Affiliation(s)
- Matthew D C Neville
- Department of Genetics, University of Cambridge, Cambridge CB2 3EH, United Kingdom
- Robin Kohze
- Department of Genetics, University of Cambridge, Cambridge CB2 3EH, United Kingdom
- Chaitanya Erady
- Department of Genetics, University of Cambridge, Cambridge CB2 3EH, United Kingdom
- Narendra Meena
- Department of Biology, Indian Institute of Science Education and Research, Pune, Maharashtra 411008, India
- Matthew Hayden
- Institute of Medical Genetics, Cardiff University, Heath Park, Cardiff CF14 4XN, United Kingdom
- David N Cooper
- Institute of Medical Genetics, Cardiff University, Heath Park, Cardiff CF14 4XN, United Kingdom
- Matthew Mort
- Institute of Medical Genetics, Cardiff University, Heath Park, Cardiff CF14 4XN, United Kingdom
- Sudhakaran Prabakaran
- Department of Genetics, University of Cambridge, Cambridge CB2 3EH, United Kingdom
- Department of Biology, Indian Institute of Science Education and Research, Pune, Maharashtra 411008, India
- St Edmund's College, University of Cambridge, Cambridge CB3 0BN, United Kingdom
54
Gul I, Le W, Jie Z, Ruiqin F, Bilal M, Tang L. Recent advances on engineered enzyme-conjugated biosensing modalities and devices for halogenated compounds. Trends Analyt Chem 2021. [DOI: 10.1016/j.trac.2020.116145] [Citation(s) in RCA: 9] [Impact Index Per Article: 3.0] [Indexed: 12/19/2022]
55
Stringer A, Smith C, Mangano K, Wade JT. Identification of novel translated small ORFs in Escherichia coli using complementary ribosome profiling approaches. J Bacteriol 2021; 204:JB0035221. [PMID: 34662240 PMCID: PMC8765432 DOI: 10.1128/jb.00352-21] [Citation(s) in RCA: 7] [Impact Index Per Article: 2.3] [Received: 07/01/2021] [Accepted: 10/12/2021] [Indexed: 11/20/2022]
Abstract
Small proteins of <51 amino acids are abundant across all domains of life but are often overlooked because their small size makes them difficult to predict computationally, and they are refractory to standard proteomic approaches. Ribosome profiling has been used to infer the existence of small proteins by detecting the translation of the corresponding open reading frames (ORFs). Detection of translated short ORFs by ribosome profiling can be improved by treating cells with drugs that stall ribosomes at specific codons. Here, we combine the analysis of ribosome profiling data for Escherichia coli cells treated with antibiotics that stall ribosomes at either start or stop codons. Thus, we identify ribosome-occupied start and stop codons with high sensitivity for ∼400 novel putative ORFs. The newly discovered ORFs are mostly short, with 365 encoding proteins of <51 amino acids. We validate translation of several selected short ORFs, and show that many likely encode unstable proteins. Moreover, we present evidence that most of the newly identified short ORFs are not under purifying selection, suggesting they do not impact cell fitness, although a small subset have the hallmarks of functional ORFs.
IMPORTANCE Small proteins of <51 amino acids are abundant across all domains of life but are often overlooked because their small size makes them difficult to predict computationally, and they are refractory to standard proteomic approaches. Recent studies have discovered small proteins by mapping the location of translating ribosomes on RNA using a technique known as ribosome profiling. Discovery of translated sORFs using ribosome profiling can be improved by treating cells with drugs that trap initiating ribosomes. Here, we show that combining these data with equivalent data for cells treated with a drug that stalls terminating ribosomes facilitates the discovery of small proteins. We use this approach to discover 365 putative genes that encode small proteins in Escherichia coli.
Affiliation(s)
- Anne Stringer
- Wadsworth Center, New York State Department of Health, Albany, New York, USA
- Carol Smith
- Wadsworth Center, New York State Department of Health, Albany, New York, USA
- Kyle Mangano
- Center for Biomolecular Sciences, University of Illinois, Chicago, Illinois, USA
- Joseph T. Wade
- Wadsworth Center, New York State Department of Health, Albany, New York, USA
- Department of Biomedical Sciences, School of Public Health, University at Albany, Albany, New York, USA
56
Santos-Júnior CD, Pan S, Zhao XM, Coelho LP. Macrel: antimicrobial peptide screening in genomes and metagenomes. PeerJ 2020; 8:e10555. [PMID: 33384902 PMCID: PMC7751412 DOI: 10.7717/peerj.10555] [Citation(s) in RCA: 28] [Impact Index Per Article: 7.0] [Received: 07/24/2020] [Accepted: 11/22/2020] [Indexed: 12/21/2022]
Abstract
Motivation: Antimicrobial peptides (AMPs) have the potential to tackle multidrug-resistant pathogens in both clinical and non-clinical contexts. The recent growth in the availability of genomes and metagenomes provides an opportunity for in silico prediction of novel AMP molecules. However, due to the small size of these peptides, standard gene prospection methods cannot be applied in this domain and alternative approaches are necessary. In particular, standard gene prediction methods have low precision for short peptides, and functional classification by homology results in low recall.
Results: Here, we present Macrel (for metagenomic AMP classification and retrieval), which is an end-to-end pipeline for the prospection of high-quality AMP candidates from (meta)genomes. For this, we introduce a novel set of 22 peptide features. These were used to build classifiers which perform similarly to the state-of-the-art in the prediction of both antimicrobial and hemolytic activity of peptides, but with enhanced precision (using standard benchmarks as well as a stricter testing regime). We demonstrate that Macrel recovers high-quality AMP candidates using realistic simulations and real data.
Availability: Macrel is implemented in Python 3. It is available as open source at https://github.com/BigDataBiology/macrel and through bioconda. Classification of peptides or prediction of AMPs in contigs can also be performed on the webserver: https://big-data-biology.org/software/macrel.
Affiliation(s)
- Célio Dias Santos-Júnior
- Institute of Science and Technology for Brain-Inspired Intelligence, Fudan University, Shanghai, China; Ministry of Education, Key Laboratory of Computational Neuroscience and Brain-Inspired Intelligence, Shanghai, China
- Shaojun Pan
- Institute of Science and Technology for Brain-Inspired Intelligence, Fudan University, Shanghai, China; Ministry of Education, Key Laboratory of Computational Neuroscience and Brain-Inspired Intelligence, Shanghai, China
- Xing-Ming Zhao
- Institute of Science and Technology for Brain-Inspired Intelligence, Fudan University, Shanghai, China; Ministry of Education, Key Laboratory of Computational Neuroscience and Brain-Inspired Intelligence, Shanghai, China
- Luis Pedro Coelho
- Institute of Science and Technology for Brain-Inspired Intelligence, Fudan University, Shanghai, China; Ministry of Education, Key Laboratory of Computational Neuroscience and Brain-Inspired Intelligence, Shanghai, China
57
Automated Prediction and Annotation of Small Open Reading Frames in Microbial Genomes. Cell Host Microbe 2020; 29:121-131.e4. [PMID: 33290720 DOI: 10.1016/j.chom.2020.11.002] [Citation(s) in RCA: 13] [Impact Index Per Article: 3.3] [Received: 07/27/2020] [Revised: 09/26/2020] [Accepted: 11/07/2020] [Indexed: 01/21/2023]
Abstract
Small open reading frames (smORFs) and their encoded microproteins play central roles in microbes. However, there is a vast unexplored space of smORFs within human-associated microbes. A recent bioinformatic analysis used evolutionary conservation signals to enhance prediction of small protein families. To facilitate the annotation of specific smORFs, we introduce SmORFinder. This tool combines profile hidden Markov models of each smORF family and deep learning models that better generalize to smORF families not seen in the training set, resulting in predictions enriched for Ribo-seq translation signals. Feature importance analysis reveals that the deep learning models learn to identify Shine-Dalgarno sequences, deprioritize the wobble position in each codon, and group codon synonyms found in the codon table. A core-genome analysis of 26 bacterial species identifies several core smORFs of unknown function. We pre-compute smORF annotations for thousands of RefSeq isolate genomes and Human Microbiome Project metagenomes and provide these data through a public web portal.
58
Matteau D, Lachance J, Grenier F, Gauthier S, Daubenspeck JM, Dybvig K, Garneau D, Knight TF, Jacques P, Rodrigue S. Integrative characterization of the near-minimal bacterium Mesoplasma florum. Mol Syst Biol 2020; 16:e9844. [PMID: 33331123 PMCID: PMC7745072 DOI: 10.15252/msb.20209844] [Citation(s) in RCA: 7] [Impact Index Per Article: 1.8] [Received: 07/06/2020] [Revised: 11/02/2020] [Accepted: 11/03/2020] [Indexed: 12/11/2022]
Abstract
The near-minimal bacterium Mesoplasma florum is an interesting model for synthetic genomics and systems biology due to its small genome (~ 800 kb), fast growth rate, and lack of pathogenic potential. However, fundamental aspects of its biology remain largely unexplored. Here, we report a broad yet remarkably detailed characterization of M. florum by combining a wide variety of experimental approaches. We investigated several physical and physiological parameters of this bacterium, including cell size, growth kinetics, and biomass composition of the cell. We also performed the first genome-wide analysis of its transcriptome and proteome, notably revealing a conserved promoter motif, the organization of transcription units, and the transcription and protein expression levels of all protein-coding sequences. We converted gene transcription and expression levels into absolute molecular abundances using biomass quantification results, generating an unprecedented view of the M. florum cellular composition and functions. These characterization efforts provide a strong experimental foundation for the development of a genome-scale model for M. florum and will guide future genome engineering endeavors in this simple organism.
Affiliation(s)
- Dominick Matteau
- Département de biologie, Université de Sherbrooke, Sherbrooke, QC, Canada
- Frédéric Grenier
- Département de biologie, Université de Sherbrooke, Sherbrooke, QC, Canada
- Samuel Gauthier
- Département de biologie, Université de Sherbrooke, Sherbrooke, QC, Canada
- Kevin Dybvig
- Department of Genetics, University of Alabama at Birmingham, Birmingham, AL, USA
- Daniel Garneau
- Département de biologie, Université de Sherbrooke, Sherbrooke, QC, Canada
59
Burgos R, Weber M, Martinez S, Lluch‐Senar M, Serrano L. Protein quality control and regulated proteolysis in the genome-reduced organism Mycoplasma pneumoniae. Mol Syst Biol 2020; 16:e9530. [PMID: 33320415 PMCID: PMC7737663 DOI: 10.15252/msb.20209530] [Citation(s) in RCA: 9] [Impact Index Per Article: 2.3] [Received: 02/19/2020] [Revised: 11/04/2020] [Accepted: 11/08/2020] [Indexed: 12/14/2022]
Abstract
Protein degradation is a crucial cellular process in all-living systems. Here, using Mycoplasma pneumoniae as a model organism, we defined the minimal protein degradation machinery required to maintain proteome homeostasis. Then, we conditionally depleted the two essential ATP-dependent proteases. Whereas depletion of Lon results in increased protein aggregation and decreased heat tolerance, FtsH depletion induces cell membrane damage, suggesting a role in quality control of membrane proteins. An integrative comparative study combining shotgun proteomics and RNA-seq revealed 62 and 34 candidate substrates, respectively. Cellular localization of substrates and epistasis studies supports separate functions for Lon and FtsH. Protein half-life measurements also suggest a role for Lon-modulated protein decay. Lon plays a key role in protein quality control, degrading misfolded proteins and those not assembled into functional complexes. We propose that regulating complex assembly and degradation of isolated proteins is a mechanism that coordinates important cellular processes like cell division. Finally, by considering the entire set of proteases and chaperones, we provide a fully integrated view of how a minimal cell regulates protein folding and degradation.
Affiliation(s)
- Raul Burgos
- Centre for Genomic Regulation (CRG), The Barcelona Institute of Science and Technology, Barcelona, Spain
- Marc Weber
- Centre for Genomic Regulation (CRG), The Barcelona Institute of Science and Technology, Barcelona, Spain
- Sira Martinez
- Centre for Genomic Regulation (CRG), The Barcelona Institute of Science and Technology, Barcelona, Spain
- Maria Lluch‐Senar
- Centre for Genomic Regulation (CRG), The Barcelona Institute of Science and Technology, Barcelona, Spain
- Luis Serrano
- Centre for Genomic Regulation (CRG), The Barcelona Institute of Science and Technology, Barcelona, Spain
- Universitat Pompeu Fabra (UPF), Barcelona, Spain
- ICREA, Barcelona, Spain
60
Brandenburg F, Klähn S. Small but Smart: On the Diverse Role of Small Proteins in the Regulation of Cyanobacterial Metabolism. Life (Basel) 2020; 10:E322. [PMID: 33271798 PMCID: PMC7760959 DOI: 10.3390/life10120322] [Citation(s) in RCA: 10] [Impact Index Per Article: 2.5] [Received: 10/28/2020] [Revised: 11/25/2020] [Accepted: 11/26/2020] [Indexed: 12/17/2022]
Abstract
Over the past few decades, bioengineered cyanobacteria have become a major focus of research for the production of energy carriers and high value chemical compounds. Besides improvements in cultivation routines and reactor technology, the integral understanding of the regulation of metabolic fluxes is the key to designing production strains that are able to compete with established industrial processes. In cyanobacteria, many enzymes and metabolic pathways are regulated differently compared to other bacteria. For instance, while glutamine synthetase in proteobacteria is mainly regulated by covalent enzyme modifications, the same enzyme in cyanobacteria is controlled by the interaction with unique small proteins. Other prominent examples, such as the small protein CP12 which controls the Calvin-Benson cycle, indicate that the regulation of enzymes and/or pathways via the attachment of small proteins might be a widespread mechanism in cyanobacteria. Accordingly, this review highlights the diverse role of small proteins in the control of cyanobacterial metabolism, focusing on well-studied examples as well as those most recently described. Moreover, it will discuss their potential to implement metabolic engineering strategies in order to make cyanobacteria more definable for biotechnological applications.
Affiliation(s)
- Stephan Klähn
- Helmholtz Centre for Environmental Research—UFZ, 04318 Leipzig, Germany;
61
Lost and Found: Re-searching and Re-scoring Proteomics Data Aids Genome Annotation and Improves Proteome Coverage. mSystems 2020; 5:5/5/e00833-20. [PMID: 33109751 PMCID: PMC7593589 DOI: 10.1128/msystems.00833-20] [Citation(s) in RCA: 9] [Impact Index Per Article: 2.3] [Indexed: 01/12/2023]
Abstract
Delineation of open reading frames (ORFs) causes persistent inconsistencies in prokaryote genome annotation. We demonstrate that by advanced (re)analysis of omics data, a higher proteome coverage and sensitive detection of unannotated ORFs can be achieved, which can be exploited for conditional bacterial genome (re)annotation, which is especially relevant in view of annotating the wealth of sequenced prokaryotic genomes obtained in recent years. Prokaryotic genome annotation is heavily dependent on automated gene annotation pipelines that are prone to propagate errors and underestimate genome complexity. We describe an optimized proteogenomic workflow that uses ribosome profiling (ribo-seq) and proteomic data for Salmonella enterica serovar Typhimurium to identify unannotated proteins or alternative protein forms. This data analysis encompasses the searching of cofragmenting peptides and postprocessing with extended peptide-to-spectrum quality features, including comparison to predicted fragment ion intensities. When this strategy is applied, an enhanced proteome depth is achieved, as well as greater confidence for unannotated peptide hits. We demonstrate the general applicability of our pipeline by reanalyzing public Deinococcus radiodurans data sets. Taken together, our results show that systematic reanalysis using available prokaryotic (proteome) data sets holds great promise to assist in experimentally based genome annotation.
62
Cassidy L, Helbig AO, Kaulich PT, Weidenbach K, Schmitz RA, Tholey A. Multidimensional separation schemes enhance the identification and molecular characterization of low molecular weight proteomes and short open reading frame-encoded peptides in top-down proteomics. J Proteomics 2020; 230:103988. [PMID: 32949814 DOI: 10.1016/j.jprot.2020.103988] [Citation(s) in RCA: 22] [Impact Index Per Article: 5.5] [Received: 07/08/2020] [Revised: 08/17/2020] [Accepted: 09/14/2020] [Indexed: 12/13/2022]
Abstract
Short open reading frame-encoded peptides (SEP) represent a widely undiscovered part of the proteome. The detailed analysis of SEP has, despite inherent limitations such as incomplete sequence coverage, challenges encountered with protein inference, the identification of posttranslational modifications and the assignment of potential N- and C-terminal truncations, predominantly been assessed using bottom-up proteomic workflows. The use of top-down based proteomic workflows is capable of providing an unparalleled level of characterization information, which is of increased importance in the case of alternatively encoded protein products. However, top-down based analysis is not without its own limitations, for which efficient separation prior to MS analysis is a major issue. We established a sample preparation approach for the combined bottom-up and top-down proteomic analysis of SEP. Key improvements were made by the application of solid phase extraction (SPE), which supported enrichment of proteins below ca. 20 kDa, followed by 2D-LC-MS top-down analysis encompassing both HCD and EThcD ion activation. Bottom-up experiments were used to support and confirm top-down data interpretation. This strategy allowed for the top-down characterization of 36 proteoforms mapping to 12 SEP from the archaeon Methanosarcina mazei strain Gö1, with the concurrent detection and identification of several posttranslational modifications in SEP.
BIOLOGICAL SIGNIFICANCE: Small or short open reading frames (sORF) have been widely neglected in genome research in the past. With their increasing discovery, the question about the presence and molecular function of their translation products, the short open reading frame-encoded peptides (SEP), arises. As these small proteins are usually below the 10 kDa range, the number of peptides identifiable by bottom-up proteomics is limited which hampers both the identification and the recognition of potential posttranslational modifications. The presented top-down approach allowed for the detection of full length SEP, as well as of terminally truncated proteoforms, and further enabled the identification of disulfide bonds in these small proteins. This demonstrates, that this yet widely undiscovered part of the proteome undergoes the same modifications as classical proteins which is an essential step for future understanding of the biological functions of these molecules.
Affiliation(s)
- Liam Cassidy
- Systematic Proteome Research & Bioanalytics, Institute for Experimental Medicine, Christian-Albrechts-Universität zu Kiel, 24105 Kiel, Germany
- Andreas O Helbig
- Systematic Proteome Research & Bioanalytics, Institute for Experimental Medicine, Christian-Albrechts-Universität zu Kiel, 24105 Kiel, Germany
- Philipp T Kaulich
- Systematic Proteome Research & Bioanalytics, Institute for Experimental Medicine, Christian-Albrechts-Universität zu Kiel, 24105 Kiel, Germany
- Kathrin Weidenbach
- Institute for General Microbiology, Christian-Albrechts-Universität zu Kiel, 24118 Kiel, Germany
- Ruth A Schmitz
- Institute for General Microbiology, Christian-Albrechts-Universität zu Kiel, 24118 Kiel, Germany
- Andreas Tholey
- Systematic Proteome Research & Bioanalytics, Institute for Experimental Medicine, Christian-Albrechts-Universität zu Kiel, 24105 Kiel, Germany
63
Ardern Z, Neuhaus K, Scherer S. Are Antisense Proteins in Prokaryotes Functional? Front Mol Biosci 2020; 7:187. [PMID: 32923454 PMCID: PMC7457138 DOI: 10.3389/fmolb.2020.00187] [Citation(s) in RCA: 10] [Impact Index Per Article: 2.5] [Received: 02/20/2020] [Accepted: 07/16/2020] [Indexed: 12/16/2022]
Abstract
Many prokaryotic RNAs are transcribed from loci outside of annotated protein coding genes. Across bacterial species hundreds of short open reading frames antisense to annotated genes show evidence of both transcription and translation, for instance in ribosome profiling data. Determining the functional fraction of these protein products awaits further research, including insights from studies of molecular interactions and detailed evolutionary analysis. There are multiple lines of evidence, however, that many of these newly discovered proteins are of use to the organism. Condition-specific phenotypes have been characterized for a few. These proteins should be added to genome annotations, and the methods for predicting them standardized. Evolutionary analysis of these typically young sequences also may provide important insights into gene evolution. This research should be prioritized for its exciting potential to uncover large numbers of novel proteins with extremely diverse potential practical uses, including applications in synthetic biology and responding to pathogens.
Affiliation(s)
- Zachary Ardern
- Chair for Microbial Ecology, Technical University of Munich, Munich, Germany
64
Garai P, Blanc‐Potard A. Uncovering small membrane proteins in pathogenic bacteria: Regulatory functions and therapeutic potential. Mol Microbiol 2020; 114:710-720. [DOI: 10.1111/mmi.14564] [Citation(s) in RCA: 18] [Impact Index Per Article: 4.5] [Received: 05/10/2020] [Revised: 06/19/2020] [Accepted: 06/20/2020] [Indexed: 01/01/2023]
Affiliation(s)
- Preeti Garai
- Laboratory of Pathogen‐Host Interactions, Université de Montpellier, CNRS‐UMR5235, Montpellier, France
- Anne Blanc‐Potard
- Laboratory of Pathogen‐Host Interactions, Université de Montpellier, CNRS‐UMR5235, Montpellier, France
65
Glaub A, Huptas C, Neuhaus K, Ardern Z. Recommendations for bacterial ribosome profiling experiments based on bioinformatic evaluation of published data. J Biol Chem 2020; 295:8999-9011. [PMID: 32385111 PMCID: PMC7335797 DOI: 10.1074/jbc.ra119.012161] [Citation(s) in RCA: 8] [Impact Index Per Article: 2.0] [Received: 12/09/2019] [Revised: 05/05/2020] [Indexed: 02/03/2023] Open
Abstract
Ribosome profiling (RIBO-Seq) has improved our understanding of bacterial translation, including finding many unannotated genes. However, protocols for RIBO-Seq and corresponding data analysis are not yet standardized. Here, we analyzed 48 RIBO-Seq samples from nine studies of Escherichia coli K12 grown in lysogeny broth medium and particularly focused on the size-selection step. We show that for conventional expression analysis, a size range between 22 and 30 nucleotides is sufficient to obtain protein-coding fragments, which has the advantage of removing many unwanted rRNA and tRNA reads. More specific analyses may require longer reads and a corresponding improvement in rRNA/tRNA depletion. There is no consensus about the appropriate sequencing depth for RIBO-Seq experiments in prokaryotes, and studies vary significantly in total read number. Our analysis suggests that 20 million reads that are not mapping to rRNA/tRNA are required for global detection of translated annotated genes. We also highlight the influence of drug-induced ribosome stalling, which causes bias at translation start sites. The resulting accumulation of reads at the start site may be especially useful for detecting weakly expressed genes. As different methods suit different questions, it may not be possible to produce a "one-size-fits-all" ribosome profiling data set. Therefore, experiments should be carefully designed in light of the scientific questions of interest. We propose some basic characteristics that should be reported with any new RIBO-Seq data sets. Careful attention to the factors discussed should improve prokaryotic gene detection and the comparability of ribosome profiling data sets.
Affiliation(s)
- Alina Glaub
- Chair for Microbial Ecology, Technical University of Munich, Freising, Germany
- Christopher Huptas
- Chair for Microbial Ecology, Technical University of Munich, Freising, Germany
- Klaus Neuhaus
- Chair for Microbial Ecology, Technical University of Munich, Freising, Germany; Core Facility Microbiome, ZIEL Institute for Food and Health, Technical University of Munich, Freising, Germany
- Zachary Ardern
- Chair for Microbial Ecology, Technical University of Munich, Freising, Germany
66
Abstract
No method exists to measure large-scale translation of genes in uncultured organisms in microbiomes. To overcome this limitation, we develop MetaRibo-Seq, a method for simultaneous ribosome profiling of tens to hundreds of organisms in microbiome samples. MetaRibo-Seq was benchmarked against gold-standard Ribo-Seq in a mock microbial community and applied to five different human fecal samples. Unlike RNA-Seq, Ribo-Seq signal of a predicted gene suggests it encodes a translated protein. We demonstrate two applications of this technique: First, MetaRibo-Seq identifies small genes, whose identification until now has been challenging. For example, MetaRibo-Seq identifies 2,091 translated, previously unannotated small protein families from five fecal samples, more than doubling the number of small proteins predicted to exist in this niche. Second, the combined application of RNA-Seq and MetaRibo-Seq identifies differences in the translation of transcripts. In summary, MetaRibo-Seq enables comprehensive translational profiling in microbiomes and identifies previously unannotated small proteins.
Defining the functions of individual organisms or communities within microbiomes is a challenging task. Here, the authors develop MetaRibo-Seq, a method for simultaneous high-throughput ribosome profiling of organisms in uncultured microbiome samples.
67
The Archaeal Proteome Project advances knowledge about archaeal cell biology through comprehensive proteomics. Nat Commun 2020; 11:3145. [PMID: 32561711 PMCID: PMC7305310 DOI: 10.1038/s41467-020-16784-7] [Citation(s) in RCA: 31] [Impact Index Per Article: 7.8] [Received: 12/30/2019] [Accepted: 05/18/2020] [Indexed: 11/08/2022] Open
Abstract
While many aspects of archaeal cell biology remain relatively unexplored, systems biology approaches like mass spectrometry (MS) based proteomics offer an opportunity for rapid advances. Unfortunately, the enormous amount of MS data generated often remains incompletely analyzed due to a lack of sophisticated bioinformatic tools and field-specific biological expertise for data interpretation. Here we present the initiation of the Archaeal Proteome Project (ArcPP), a community-based effort to comprehensively analyze archaeal proteomes. Starting with the model archaeon Haloferax volcanii, we reanalyze MS datasets from various strains and culture conditions. Optimized peptide spectrum matching, with strict control of false discovery rates, facilitates identifying > 72% of the reference proteome, with a median protein sequence coverage of 51%. These analyses, together with expert knowledge in diverse aspects of cell biology, provide meaningful insights into processes such as N-terminal protein maturation, N-glycosylation, and metabolism. Altogether, ArcPP serves as an invaluable blueprint for comprehensive prokaryotic proteomics.
68
Brunet MA, Leblanc S, Roucou X. Reconsidering proteomic diversity with functional investigation of small ORFs and alternative ORFs. Exp Cell Res 2020; 393:112057. [PMID: 32387289 DOI: 10.1016/j.yexcr.2020.112057] [Citation(s) in RCA: 14] [Impact Index Per Article: 3.5] [Received: 10/19/2019] [Revised: 04/21/2020] [Accepted: 05/02/2020] [Indexed: 12/13/2022]
Abstract
The discovery of functional yet non-annotated open reading frames (ORFs) throughout the genome of several species presents an unprecedented challenge in current genome annotation. These novel ORFs are shorter than annotated ones and many can be found on the same RNA, in opposition to current assumptions in annotation methodologies. Whilst the literature lacks consensus, these novel ORFs are commonly referred to as small ORFs (sORFs) or alternative ORFs (alt-ORFs). Unannotated ORFs represent an overlooked layer of complexity in the coding potential of genomes and are transforming our current vision of the nature of coding genes. In this review, we outline what constitutes a sORF or an alt-ORF and emphasize differences between both nomenclatures. We then describe complementary large-scale methods to accurately discover novel ORFs as well as yield functional insights on the novel proteins they encode. While serendipitous discoveries highlighted the functional importance of some novel ORFs, omics methods facilitate and improve their characterization to better understand physiological and pathological pathways. Functional annotation of sORFs, alt-ORFs and their corresponding microproteins will likely help fundamental and clinical research.
Affiliation(s)
- Marie A Brunet
- Department of Biochemistry and Functional Genomics, Université de Sherbrooke, Sherbrooke, Québec, Canada; PROTEO, Quebec Network for Research on Protein Function, Structure, and Engineering, Canada
- Sebastien Leblanc
- Department of Biochemistry and Functional Genomics, Université de Sherbrooke, Sherbrooke, Québec, Canada; PROTEO, Quebec Network for Research on Protein Function, Structure, and Engineering, Canada
- Xavier Roucou
- Department of Biochemistry and Functional Genomics, Université de Sherbrooke, Sherbrooke, Québec, Canada; PROTEO, Quebec Network for Research on Protein Function, Structure, and Engineering, Canada
69
Orr MW, Mao Y, Storz G, Qian SB. Alternative ORFs and small ORFs: shedding light on the dark proteome. Nucleic Acids Res 2020; 48:1029-1042. [PMID: 31504789 DOI: 10.1093/nar/gkz734] [Citation(s) in RCA: 146] [Impact Index Per Article: 36.5] [Received: 07/04/2019] [Revised: 08/03/2019] [Accepted: 08/15/2019] [Indexed: 02/06/2023] Open
Abstract
Traditional annotation of protein-encoding genes relied on assumptions, such as one open reading frame (ORF) encodes one protein and minimal lengths for translated proteins. With the serendipitous discoveries of translated ORFs encoded upstream and downstream of annotated ORFs, from alternative start sites nested within annotated ORFs and from RNAs previously considered noncoding, it is becoming clear that these initial assumptions are incorrect. The findings have led to the realization that genetic information is more densely coded and that the proteome is more complex than previously anticipated. As such, interest in the identification and characterization of the previously ignored 'dark proteome' is increasing, though we note that research in eukaryotes and bacteria has largely progressed in isolation. To bridge this gap and illustrate exciting findings emerging from studies of the dark proteome, we highlight recent advances in both eukaryotic and bacterial cells. We discuss progress in the detection of alternative ORFs as well as in the understanding of functions and the regulation of their expression and posit questions for future work.
Affiliation(s)
- Mona Wu Orr
- Division of Molecular and Cellular Biology, Eunice Kennedy Shriver National Institute of Child Health and Human Development, Bethesda, MD 20892, USA
- Yuanhui Mao
- Division of Nutritional Sciences, Cornell University, Ithaca, NY 14853, USA
- Gisela Storz
- Division of Molecular and Cellular Biology, Eunice Kennedy Shriver National Institute of Child Health and Human Development, Bethesda, MD 20892, USA
- Shu-Bing Qian
- Division of Nutritional Sciences, Cornell University, Ithaca, NY 14853, USA
70
Varadarajan AR, Goetze S, Pavlou MP, Grosboillot V, Shen Y, Loessner MJ, Ahrens CH, Wollscheid B. A Proteogenomic Resource Enabling Integrated Analysis of Listeria Genotype-Proteotype-Phenotype Relationships. J Proteome Res 2020; 19:1647-1662. [PMID: 32091902 DOI: 10.1021/acs.jproteome.9b00842] [Citation(s) in RCA: 7] [Impact Index Per Article: 1.8] [Indexed: 12/19/2022]
Abstract
Listeria monocytogenes is an opportunistic foodborne pathogen responsible for listeriosis, a potentially fatal foodborne disease. Many different Listeria strains and serotypes exist, but a proteogenomic resource that bridges the gap in our molecular understanding of the relationships between the Listeria genotypes and phenotypes via proteotypes is still missing. Here, we devised a next-generation proteogenomics strategy that enables the community to rapidly proteotype Listeria strains and relate this information back to the genotype. Based on sequencing and de novo assembly of the two most commonly used Listeria model strains, EGD-e and ScottA, we established two comprehensive Listeria proteogenomic databases. A genome comparison established core- and strain-specific genes potentially responsible for virulence differences. Next, we established a DIA/SWATH-based proteotyping strategy, including a new and robust sample preparation workflow, that enables the reproducible, sensitive, and relative quantitative measurement of Listeria proteotypes. This reusable and publicly available DIA/SWATH library covers 70% of open reading frames of Listeria and represents the most extensive spectral library for Listeria proteotype analysis to date. We used these two new resources to investigate the Listeria proteotype in states mimicking the upper gastrointestinal passage. Exposure of Listeria to bile salts at 37 °C, which simulates conditions encountered in the duodenum, showed significant proteotype perturbations including an increase of FlaA, the structural protein of flagella. Given that Listeria is known to lose its flagella above 30 °C, this was an unexpected finding. The formation of flagella, which might have implications on infectivity, was validated by parallel reaction monitoring and light and scanning electron microscopy. flaA transcript levels did not change significantly upon exposure to bile salts at 37 °C, suggesting regulation at the post-transcriptional level. 
Together, these analyses provide a comprehensive proteogenomic resource and toolbox for the Listeria community enabling the analysis of Listeria genotype-proteotype-phenotype relationships.
Affiliation(s)
- Adithi R Varadarajan
- Department of Health Sciences and Technology (D-HEST), ETH Zürich, 8092 Zürich, Switzerland; Agroscope, Molecular Diagnostics, Genomics & Bioinformatics, 8820 Wädenswil, Switzerland; Swiss Institute of Bioinformatics (SIB), 1015 Lausanne, Switzerland
- Sandra Goetze
- Department of Health Sciences and Technology (D-HEST), ETH Zürich, 8092 Zürich, Switzerland; Swiss Institute of Bioinformatics (SIB), 1015 Lausanne, Switzerland; Institute of Translational Medicine (ITM), ETH Zürich, 8093 Zürich, Switzerland
- Maria P Pavlou
- Department of Health Sciences and Technology (D-HEST), ETH Zürich, 8092 Zürich, Switzerland; Institute of Translational Medicine (ITM), ETH Zürich, 8093 Zürich, Switzerland
- Virginie Grosboillot
- Department of Health Sciences and Technology (D-HEST), ETH Zürich, 8092 Zürich, Switzerland; Institute of Food, Nutrition and Health (IFNH), ETH Zürich, 8092 Zürich, Switzerland
- Yang Shen
- Department of Health Sciences and Technology (D-HEST), ETH Zürich, 8092 Zürich, Switzerland; Institute of Food, Nutrition and Health (IFNH), ETH Zürich, 8092 Zürich, Switzerland
- Martin J Loessner
- Department of Health Sciences and Technology (D-HEST), ETH Zürich, 8092 Zürich, Switzerland; Institute of Food, Nutrition and Health (IFNH), ETH Zürich, 8092 Zürich, Switzerland
- Christian H Ahrens
- Agroscope, Molecular Diagnostics, Genomics & Bioinformatics, 8820 Wädenswil, Switzerland; Swiss Institute of Bioinformatics (SIB), 1015 Lausanne, Switzerland
- Bernd Wollscheid
- Department of Health Sciences and Technology (D-HEST), ETH Zürich, 8092 Zürich, Switzerland; Swiss Institute of Bioinformatics (SIB), 1015 Lausanne, Switzerland; Institute of Translational Medicine (ITM), ETH Zürich, 8093 Zürich, Switzerland
71
Korandla DR, Wozniak JM, Campeau A, Gonzalez DJ, Wright ES. AssessORF: combining evolutionary conservation and proteomics to assess prokaryotic gene predictions. Bioinformatics 2019; 36:1022-1029. [PMID: 31532487 PMCID: PMC7998711 DOI: 10.1093/bioinformatics/btz714] [Citation(s) in RCA: 6] [Impact Index Per Article: 1.2] [Received: 06/18/2019] [Revised: 09/05/2019] [Accepted: 09/13/2019] [Indexed: 01/31/2023] Open
Abstract
MOTIVATION: A core task of genomics is to identify the boundaries of protein coding genes, which may cover over 90% of a prokaryote's genome. Several programs are available for gene finding, yet it is currently unclear how well these programs perform and whether any offers superior accuracy. This is in part because there is no universal benchmark for gene finding and, therefore, most developers select their own benchmarking strategy. RESULTS: Here, we introduce AssessORF, a new approach for benchmarking prokaryotic gene predictions based on evidence from proteomics data and the evolutionary conservation of start and stop codons. We applied AssessORF to compare gene predictions offered by GenBank, GeneMarkS-2, Glimmer and Prodigal on genomes spanning the prokaryotic tree of life. Gene predictions were 88-95% in agreement with the available evidence, with Glimmer performing the worst but no clear winner. All programs were biased towards selecting start codons that were upstream of the actual start. Given these findings, there remains considerable room for improvement, especially in the detection of correct start sites. AVAILABILITY AND IMPLEMENTATION: AssessORF is available as an R package via the Bioconductor package repository. SUPPLEMENTARY INFORMATION: Supplementary data are available at Bioinformatics online.
Affiliation(s)
- Deepank R Korandla
- Department of Biological Sciences, USA; Computational Biology Department, Carnegie Mellon University, Pittsburgh, PA 15213, USA; Department of Biomedical Informatics, University of Pittsburgh School of Medicine, Pittsburgh, PA 15219, USA
- Jacob M Wozniak
- Department of Pharmacology, University of California San Diego, La Jolla, CA 92093, USA; Skaggs School of Pharmacy and Pharmaceutical Sciences, University of California San Diego, La Jolla, CA 92093, USA
- Anaamika Campeau
- Department of Pharmacology, University of California San Diego, La Jolla, CA 92093, USA; Skaggs School of Pharmacy and Pharmaceutical Sciences, University of California San Diego, La Jolla, CA 92093, USA
- David J Gonzalez
- Department of Pharmacology, University of California San Diego, La Jolla, CA 92093, USA; Skaggs School of Pharmacy and Pharmaceutical Sciences, University of California San Diego, La Jolla, CA 92093, USA
72
Benayoun BA, Lee C. MOTS-c: A Mitochondrial-Encoded Regulator of the Nucleus. Bioessays 2019; 41:e1900046. [PMID: 31378979 PMCID: PMC8224472 DOI: 10.1002/bies.201900046] [Citation(s) in RCA: 19] [Impact Index Per Article: 3.8] [Received: 03/04/2019] [Revised: 06/28/2019] [Indexed: 12/25/2022]
Abstract
Mitochondria are increasingly being recognized as information hubs that sense cellular changes and transmit messages to other cellular components, such as the nucleus, the endoplasmic reticulum (ER), the Golgi apparatus, and lysosomes. Nonetheless, the interaction between mitochondria and the nucleus is of special interest because they both host part of the cellular genome. Thus, the communication between genome-bearing organelles would likely include gene expression regulation. Multiple nuclear-encoded proteins have been known to regulate mitochondrial gene expression. On the contrary, no mitochondrial-encoded factors are known to actively regulate nuclear gene expression. MOTS-c (mitochondrial open reading frame of the 12S ribosomal RNA type-c) is a recently identified peptide encoded within the mitochondrial 12S ribosomal RNA gene that has metabolic functions. Notably, MOTS-c can translocate to the nucleus upon metabolic stress (e.g., glucose restriction and oxidative stress) and directly regulate adaptive nuclear gene expression to promote cellular homeostasis. It is hypothesized that cellular fitness requires the coevolved mitonuclear genomes to coordinate adaptive responses using gene-encoded factors that cross-regulate the opposite genome. This suggests that cellular gene expression requires the bipartite split genomes to operate as a unified system, rather than the nucleus being the sole master regulator.
Affiliation(s)
- Bérénice A Benayoun
- Leonard Davis School of Gerontology, University of Southern California, Los Angeles, CA, 90089, USA
- USC Norris Comprehensive Cancer Center, Epigenetics and Gene Regulation Program, Los Angeles, CA, 90089, USA
- USC Stem Cell Initiative, Los Angeles, CA, 90089, USA
- Changhan Lee
- Leonard Davis School of Gerontology, University of Southern California, Los Angeles, CA, 90089, USA
- USC Norris Comprehensive Cancer Center, Epigenetics and Gene Regulation Program, Los Angeles, CA, 90089, USA
- Biomedical Sciences, Graduate School, Ajou University, Suwon, 16499, Republic of Korea
73
Abstract
Exploration of tiny protein-coding sequences within the human microbiome reveals thousands of conserved gene families that have been overlooked by traditional analyses. These small proteins may play key roles in the crosstalk among bacteria within the microbiome and in interactions with their human hosts.
74
Yus E, Lloréns-Rico V, Martínez S, Gallo C, Eilers H, Blötz C, Stülke J, Lluch-Senar M, Serrano L. Determination of the Gene Regulatory Network of a Genome-Reduced Bacterium Highlights Alternative Regulation Independent of Transcription Factors. Cell Syst 2019; 9:143-158.e13. [PMID: 31445891 PMCID: PMC6721554 DOI: 10.1016/j.cels.2019.07.001] [Citation(s) in RCA: 22] [Impact Index Per Article: 4.4] [Received: 11/27/2018] [Revised: 04/14/2019] [Accepted: 06/27/2019] [Indexed: 11/30/2022]
Abstract
Here, we determined the relative importance of different transcriptional mechanisms in the genome-reduced bacterium Mycoplasma pneumoniae, by employing an array of experimental techniques under multiple genetic and environmental perturbations. Of the 143 genes tested (21% of the bacterium’s annotated proteins), only 55% showed an altered phenotype, highlighting the robustness of biological systems. We identified nine transcription factors (TFs) and their targets, representing 43% of the genome, and 16 regulators that indirectly affect transcription. Only 20% of transcriptional regulation is mediated by canonical TFs when responding to perturbations. Using a Random Forest, we quantified the non-redundant contribution of different mechanisms such as supercoiling, metabolic control, RNA degradation, and chromosome topology to transcriptional changes. Model-predicted gene changes correlate well with experimental data in 95% of the tested perturbations, explaining up to 70% of the total variance when also considering noise. This analysis highlights the importance of considering non-TF-mediated regulation when engineering bacteria.
Highlights: Full comprehensive reconstruction of a bacterial gene regulatory network achieved. Genome-reduced bacterium Mycoplasma pneumoniae is robust to genetic perturbations. Large part of transcription regulation in bacteria is transcription-factor independent. Transcription-factor-independent regulation has a smaller dynamic range.
Affiliation(s)
- Eva Yus
- Centre for Genomic Regulation (CRG), The Barcelona Institute for Science and Technology, Doctor Aiguader 88, Barcelona 08003, Spain
- Verónica Lloréns-Rico
- Centre for Genomic Regulation (CRG), The Barcelona Institute for Science and Technology, Doctor Aiguader 88, Barcelona 08003, Spain
- Sira Martínez
- Centre for Genomic Regulation (CRG), The Barcelona Institute for Science and Technology, Doctor Aiguader 88, Barcelona 08003, Spain
- Carolina Gallo
- Centre for Genomic Regulation (CRG), The Barcelona Institute for Science and Technology, Doctor Aiguader 88, Barcelona 08003, Spain
- Hinnerk Eilers
- Department for General Microbiology, Georg-August-University Göttingen, 37077 Göttingen, Germany
- Cedric Blötz
- Department for General Microbiology, Georg-August-University Göttingen, 37077 Göttingen, Germany
- Jörg Stülke
- Department for General Microbiology, Georg-August-University Göttingen, 37077 Göttingen, Germany
- Maria Lluch-Senar
- Centre for Genomic Regulation (CRG), The Barcelona Institute for Science and Technology, Doctor Aiguader 88, Barcelona 08003, Spain
- Luis Serrano
- Centre for Genomic Regulation (CRG), The Barcelona Institute for Science and Technology, Doctor Aiguader 88, Barcelona 08003, Spain; Universitat Pompeu Fabra (UPF), Barcelona, Spain; Institució Catalana de Recerca i Estudis Avançats (ICREA), Pg. Lluis Companys 23, Barcelona 08010, Spain
75
Montero-Blay A, Miravet-Verde S, Lluch-Senar M, Piñero-Lambea C, Serrano L. SynMyco transposon: engineering transposon vectors for efficient transformation of minimal genomes. DNA Res 2019; 26:327-339. [PMID: 31257417 PMCID: PMC6704405 DOI: 10.1093/dnares/dsz012] [Citation(s) in RCA: 10] [Impact Index Per Article: 2.0] [Received: 11/12/2018] [Accepted: 05/16/2019] [Indexed: 11/17/2022] Open
Abstract
Mycoplasmas are important model organisms for Systems and Synthetic Biology, and are pathogenic to a wide variety of species. Despite their relevance, many of the tools established for genome editing in other microorganisms are not available for Mycoplasmas. The Tn4001 transposon is the reference tool to work with these bacteria, but the transformation efficiencies (TEs) reported for the different species vary substantially. Here, we explore the mechanisms underlying these differences in four Mycoplasma species, Mycoplasma agalactiae, Mycoplasma feriruminatoris, Mycoplasma gallisepticum and Mycoplasma pneumoniae, selected for being representative members of each cluster of the Mycoplasma genus. We found that regulatory regions (RRs) driving the expression of the transposase and the antibiotic resistance marker have a major impact on the TEs. We then designed a synthetic RR termed SynMyco RR to control the expression of the key transposon vector elements. Using this synthetic RR, we were able to increase the TE for M. gallisepticum, M. feriruminatoris and M. agalactiae by 30-, 980- and 1036-fold, respectively. Finally, to illustrate the potential of this new transposon, we performed the first essentiality study in M. agalactiae, basing our study on more than 199,000 genome insertions.
Affiliation(s)
- Ariadna Montero-Blay
- Centre for Genomic Regulation (CRG), The Barcelona Institute of Science and Technology, Dr. Aiguader 88, Barcelona, Spain
- Samuel Miravet-Verde
- Centre for Genomic Regulation (CRG), The Barcelona Institute of Science and Technology, Dr. Aiguader 88, Barcelona, Spain
- Maria Lluch-Senar
- Centre for Genomic Regulation (CRG), The Barcelona Institute of Science and Technology, Dr. Aiguader 88, Barcelona, Spain
- Carlos Piñero-Lambea
- Centre for Genomic Regulation (CRG), The Barcelona Institute of Science and Technology, Dr. Aiguader 88, Barcelona, Spain
- Luis Serrano
- Centre for Genomic Regulation (CRG), The Barcelona Institute of Science and Technology, Dr. Aiguader 88, Barcelona, Spain; Universitat Pompeu Fabra (UPF), Barcelona, Spain; ICREA, Pg. Lluis Companys 23, Barcelona, Spain
76
Miravet-Verde S, Ferrar T, Espadas-García G, Mazzolini R, Gharrab A, Sabido E, Serrano L, Lluch-Senar M. Unraveling the hidden universe of small proteins in bacterial genomes. Mol Syst Biol 2019; 15:e8290. [PMID: 30796087 PMCID: PMC6385055 DOI: 10.15252/msb.20188290] [Citation(s) in RCA: 70] [Impact Index Per Article: 14.0] [Indexed: 12/14/2022] Open
Abstract
Identification of small open reading frames (smORFs) encoding small proteins (≤ 100 amino acids; SEPs) is a challenge in the fields of genome annotation and protein discovery. Here, by combining a novel bioinformatics tool (RanSEPs) with “‐omics” approaches, we were able to describe 109 bacterial small ORFomes. Predictions were first validated by performing an exhaustive search of SEPs present in Mycoplasma pneumoniae proteome via mass spectrometry, which illustrated the limitations of shotgun approaches. Then, RanSEPs predictions were validated and compared with other tools using proteomic datasets from different bacterial species and SEPs from the literature. We found that up to 16 ± 9% of proteins in an organism could be classified as SEPs. Integration of RanSEPs predictions with transcriptomics data showed that some annotated non‐coding RNAs could in fact encode for SEPs. A functional study of SEPs highlighted an enrichment in the membrane, translation, metabolism, and nucleotide‐binding categories. Additionally, 9.7% of the SEPs included a N‐terminus predicted signal peptide. We envision RanSEPs as a tool to unmask the hidden universe of small bacterial proteins.
Affiliation(s)
- Samuel Miravet-Verde
- EMBL/CRG Systems Biology Research Unit, Centre for Genomic Regulation (CRG), The Barcelona Institute of Science and Technology, Barcelona, Spain
- Tony Ferrar
- EMBL/CRG Systems Biology Research Unit, Centre for Genomic Regulation (CRG), The Barcelona Institute of Science and Technology, Barcelona, Spain
- Guadalupe Espadas-García
- Centre for Genomic Regulation (CRG), The Barcelona Institute of Science and Technology, Barcelona, Spain
- Rocco Mazzolini
- EMBL/CRG Systems Biology Research Unit, Centre for Genomic Regulation (CRG), The Barcelona Institute of Science and Technology, Barcelona, Spain
- Anas Gharrab
- EMBL/CRG Systems Biology Research Unit, Centre for Genomic Regulation (CRG), The Barcelona Institute of Science and Technology, Barcelona, Spain
- Eduard Sabido
- Centre for Genomic Regulation (CRG), The Barcelona Institute of Science and Technology, Barcelona, Spain
- Luis Serrano
- EMBL/CRG Systems Biology Research Unit, Centre for Genomic Regulation (CRG), The Barcelona Institute of Science and Technology, Barcelona, Spain; Universitat Pompeu Fabra (UPF), Barcelona, Spain; Institució Catalana de Recerca i Estudis Avançats (ICREA), Barcelona, Spain
- Maria Lluch-Senar
- EMBL/CRG Systems Biology Research Unit, Centre for Genomic Regulation (CRG), The Barcelona Institute of Science and Technology, Barcelona, Spain; Universitat Pompeu Fabra (UPF), Barcelona, Spain