1
Lewin LE, Daniels KG, Hurst LD. Genes for highly abundant proteins in Escherichia coli avoid 5' codons that promote ribosomal initiation. PLoS Comput Biol 2023; 19:e1011581. [PMID: 37878567 PMCID: PMC10599525 DOI: 10.1371/journal.pcbi.1011581] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 07/10/2023] [Accepted: 10/09/2023] [Indexed: 10/27/2023]
Abstract
In many species highly expressed genes (HEGs) over-employ the synonymous codons that match the more abundant iso-acceptor tRNAs. Bacterial transgene codon randomization experiments report, however, that enrichment with such "translationally optimal" codons has little to no effect on the resultant protein level. By contrast, consistent with the view that ribosomal initiation is rate limiting, synonymous codon usage following the 5' ATG greatly influences protein levels, at least in part by modifying RNA stability. For the design of bacterial transgenes, for simple codon based in silico inference of protein levels and for understanding selection on synonymous mutations, it would be valuable to computationally determine initiation optimality (IO) scores for codons for any given species. One attractive approach is to characterize the 5' codon enrichment of HEGs compared with the most lowly expressed genes, just as translational optimality scores of codons have been similarly defined employing the full gene body. Here we determine the viability of this approach employing a unique opportunity: for Escherichia coli there is both the most extensive protein abundance data for native genes and a unique large-scale transgene codon randomization experiment enabling objective definition of the 5' codons that cause, rather than just correlate with, high protein abundance (that we equate with initiation optimality, broadly defined). Surprisingly, the 5' ends of native genes that specify highly abundant proteins avoid such initiation optimal codons. We find that this is probably owing to conflicting selection pressures particular to native HEGs, including selection favouring low initiation rates, this potentially enabling high efficiency of ribosomal usage and low noise. 
While the classical HEG enrichment approach does not work, rendering simple prediction of native protein abundance from 5' codon content futile, we report evidence that initiation optimality scores derived from the transgene experiment may hold relevance for in silico transgene design for a broad spectrum of bacteria.
Affiliation(s)
- Loveday E. Lewin
- The Milner Centre for Evolution, Department of Life Sciences, University of Bath, Bath, United Kingdom
- Kate G. Daniels
- The Milner Centre for Evolution, Department of Life Sciences, University of Bath, Bath, United Kingdom
- Laurence D. Hurst
- The Milner Centre for Evolution, Department of Life Sciences, University of Bath, Bath, United Kingdom
2
Sauer DB, Wang DN. Predicting the optimal growth temperatures of prokaryotes using only genome derived features. Bioinformatics 2019; 35:3224-3231. [PMID: 30689741 DOI: 10.1093/bioinformatics/btz059] [Citation(s) in RCA: 37] [Impact Index Per Article: 9.3] [Received: 05/11/2018] [Revised: 12/20/2018] [Accepted: 01/22/2019] [Indexed: 11/14/2022]
Abstract
MOTIVATION: Optimal growth temperature is a fundamental characteristic of all living organisms. Knowledge of this temperature is central to the study of a prokaryote, the thermal stability and temperature dependent activity of its genes, and the bioprospecting of its genome for thermally adapted proteins. While high throughput sequencing methods have dramatically increased the availability of genomic information, the growth temperatures of the source organisms are often unknown. This limits the study and technological application of these species and their genomes. Here, we present a novel method for the prediction of growth temperatures of prokaryotes using only genomic sequences.
RESULTS: By applying the reverse ecology principle that an organism's genome includes identifiable adaptations to its native environment, we can predict a species' optimal growth temperature with an accuracy of 5.17°C root-mean-square error and a coefficient of determination of 0.835. The accuracy can be further improved for specific taxonomic clades or by excluding psychrophiles. This method provides a valuable tool for the rapid calculation of organism growth temperature when only the genome sequence is known.
AVAILABILITY AND IMPLEMENTATION: Source code, genomes analyzed and features calculated are available at: https://github.com/DavidBSauer/OGT_prediction.
SUPPLEMENTARY INFORMATION: Supplementary data are available at Bioinformatics online.
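The two accuracy figures quoted in this abstract (root-mean-square error and coefficient of determination) can be illustrated with a minimal sketch. This is not the authors' code. The function names and the toy temperatures are hypothetical, and the coefficient of determination is assumed here to be 1 - SS_res/SS_tot; the paper may instead use a squared correlation coefficient.

```python
import math


def rmse(observed, predicted):
    """Root-mean-square error between paired observed and predicted values."""
    if len(observed) != len(predicted) or not observed:
        raise ValueError("inputs must be non-empty and of equal length")
    squared = [(o - p) ** 2 for o, p in zip(observed, predicted)]
    return math.sqrt(sum(squared) / len(squared))


def r_squared(observed, predicted):
    """Coefficient of determination, assumed here to be 1 - SS_res / SS_tot."""
    if len(observed) != len(predicted) or not observed:
        raise ValueError("inputs must be non-empty and of equal length")
    mean_obs = sum(observed) / len(observed)
    ss_res = sum((o - p) ** 2 for o, p in zip(observed, predicted))
    ss_tot = sum((o - mean_obs) ** 2 for o in observed)
    return 1.0 - ss_res / ss_tot


# Hypothetical optimal growth temperatures in degrees C (not data from the paper).
observed_ogt = [30.0, 37.0, 55.0, 80.0]
predicted_ogt = [33.0, 35.0, 60.0, 78.0]

print(f"RMSE = {rmse(observed_ogt, predicted_ogt):.2f} C")
print(f"R^2  = {r_squared(observed_ogt, predicted_ogt):.3f}")
```

RMSE is in the same units as the temperatures, so it reads directly as a typical prediction error, whereas the coefficient of determination is unitless and depends on how widely the observed temperatures are spread.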
Affiliation(s)
- David B Sauer
- Department of Cell Biology, and The Helen L. and Martin S. Kimmel Center for Biology and Medicine, Skirball Institute of Biomolecular Medicine, New York University School of Medicine, New York, New York, USA
- Da-Neng Wang
- Department of Cell Biology, and The Helen L. and Martin S. Kimmel Center for Biology and Medicine, Skirball Institute of Biomolecular Medicine, New York University School of Medicine, New York, New York, USA
3
Panicker IS, Browning GF, Markham PF. The Effect of an Alternate Start Codon on Heterologous Expression of a PhoA Fusion Protein in Mycoplasma gallisepticum. PLoS One 2015; 10:e0127911. [PMID: 26010086 PMCID: PMC4444185 DOI: 10.1371/journal.pone.0127911] [Citation(s) in RCA: 12] [Impact Index Per Article: 1.3] [Received: 11/16/2014] [Accepted: 04/20/2015] [Indexed: 11/18/2022]
Abstract
While the genomes of many Mycoplasma species have been sequenced, there are no collated data on translational start codon usage, and the effects of alternate start codons on gene expression have not been studied. Analysis of the annotated genomes found that ATG was the most prevalent translational start codon among Mycoplasma spp. However in Mycoplasma gallisepticum a GTG start codon is commonly used in the vlhA multigene family, which encodes a highly abundant, phase variable lipoprotein adhesin. Therefore, the effect of this alternate start codon on expression of a reporter PhoA lipoprotein was examined in M. gallisepticum. Mutation of the start codon from ATG to GTG resulted in a 2.5 fold reduction in the level of transcription of the phoA reporter, but the level of PhoA activity in the transformants containing phoA with a GTG start codon was only 63% of that of the transformants with a phoA with an ATG start codon, suggesting that GTG was a more efficient translational initiation codon. The effect of swapping the translational start codon in phoA reporter gene expression was less in M. gallisepticum than has been seen previously in Escherichia coli or Bacillus subtilis, suggesting the process of translational initiation in mycoplasmas may have some significant differences from those used in other bacteria. This is the first study of translational start codon usage in mycoplasmas and the impact of the use of an alternate start codon on expression in these bacteria.
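The inference in this abstract rests on a short piece of arithmetic: transcript level fell 2.5-fold while PhoA activity fell only to 63%, so activity per transcript must have risen. A minimal sketch of that calculation follows. It assumes PhoA activity scales linearly with protein level and that protein output is proportional to transcript level times translational efficiency. The function name is hypothetical and is not taken from the paper.

```python
def relative_translation_efficiency(activity_ratio, transcript_fold_reduction):
    """Activity per transcript for the GTG construct relative to the ATG construct.

    activity_ratio: GTG/ATG reporter activity (0.63 in the abstract).
    transcript_fold_reduction: fold drop in transcript level (2.5 in the abstract).
    Assumes activity is proportional to transcript level times efficiency.
    """
    if transcript_fold_reduction <= 0:
        raise ValueError("fold reduction must be positive")
    transcript_ratio = 1.0 / transcript_fold_reduction  # 1 / 2.5 = 0.4
    return activity_ratio / transcript_ratio


efficiency = relative_translation_efficiency(0.63, 2.5)
print(f"GTG activity per transcript relative to ATG: {efficiency:.2f}")
```

Under these assumptions each GTG transcript yields roughly 1.6 times the activity of an ATG transcript, which is the basis for the abstract's suggestion that GTG initiated translation more efficiently in this system.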
Affiliation(s)
- Indu S. Panicker
- Asia-Pacific Centre for Animal Health, Faculty of Veterinary and Agricultural Sciences, The University of Melbourne, Melbourne, Victoria, Australia
- Glenn F. Browning
- Asia-Pacific Centre for Animal Health, Faculty of Veterinary and Agricultural Sciences, The University of Melbourne, Melbourne, Victoria, Australia
- * E-mail:
- Philip F. Markham
- Asia-Pacific Centre for Animal Health, Faculty of Veterinary and Agricultural Sciences, The University of Melbourne, Melbourne, Victoria, Australia
4
Zhou JH, Ding YZ, He Y, Chu YF, Zhao P, Ma LY, Wang XJ, Li XR, Liu YS. The effect of multiple evolutionary selections on synonymous codon usage of genes in the Mycoplasma bovis genome. PLoS One 2014; 9:e108949. [PMID: 25350396 PMCID: PMC4211681 DOI: 10.1371/journal.pone.0108949] [Citation(s) in RCA: 20] [Impact Index Per Article: 2.0] [Received: 05/19/2014] [Accepted: 08/26/2014] [Indexed: 11/19/2022]
Abstract
Mycoplasma bovis is a major pathogen causing arthritis, respiratory disease and mastitis in cattle. A better understanding of its genetic features and evolution might represent evidences of surviving host environments. In this study, multiple factors influencing synonymous codon usage patterns in M. bovis (three strains’ genomes) were analyzed. The overall nucleotide content of genes in the M. bovis genome is AT-rich. Although the G and C contents at the third codon position of genes in the leading strand differ from those in the lagging strand (p<0.05), the 59 synonymous codon usage patterns of genes in the leading strand are highly similar to those in the lagging strand. The over-represented codons and the under-represented codons were identified. A comparison of the synonymous codon usage pattern of M. bovis and cattle (susceptible host) indicated the independent formation of synonymous codon usage of M. bovis. Principal component analysis revealed that (i) strand-specific mutational bias fails to affect the synonymous codon usage pattern in the leading and lagging strands, (ii) mutation pressure from nucleotide content plays a role in shaping the overall codon usage, and (iii) the major trend of synonymous codon usage has a significant correlation with the gene expression level that is estimated by the codon adaptation index. The plot of the effective number of codons against the G+C content at the third codon position also reveals that mutation pressure undoubtedly contributes to the synonymous codon usage pattern of M. bovis. Additionally, the formation of the overall codon usage is determined by certain evolutionary selections for gene function classification (30S protein, 50S protein, transposase, membrane protein, and lipoprotein) and translation elongation region of genes in M. bovis. 
The information could be helpful in further investigations of evolutionary mechanisms of the Mycoplasma family and heterologous expression of its functionally important proteins.
Affiliation(s)
- Jian-hua Zhou
- State Key Laboratory of Veterinary Etiological Biology, National Foot-and-Mouth Disease Reference Laboratory, Lanzhou Veterinary Research Institute, Chinese Academy of Agricultural Sciences, Lanzhou, Gansu, P.R. China
- Yao-zhong Ding
- State Key Laboratory of Veterinary Etiological Biology, National Foot-and-Mouth Disease Reference Laboratory, Lanzhou Veterinary Research Institute, Chinese Academy of Agricultural Sciences, Lanzhou, Gansu, P.R. China
- Ying He
- State Key Laboratory of Veterinary Etiological Biology, National Foot-and-Mouth Disease Reference Laboratory, Lanzhou Veterinary Research Institute, Chinese Academy of Agricultural Sciences, Lanzhou, Gansu, P.R. China
- Yue-feng Chu
- State Key Laboratory of Veterinary Etiological Biology, National Foot-and-Mouth Disease Reference Laboratory, Lanzhou Veterinary Research Institute, Chinese Academy of Agricultural Sciences, Lanzhou, Gansu, P.R. China
- Ping Zhao
- State Key Laboratory of Veterinary Etiological Biology, National Foot-and-Mouth Disease Reference Laboratory, Lanzhou Veterinary Research Institute, Chinese Academy of Agricultural Sciences, Lanzhou, Gansu, P.R. China
- Li-ya Ma
- State Key Laboratory of Veterinary Etiological Biology, National Foot-and-Mouth Disease Reference Laboratory, Lanzhou Veterinary Research Institute, Chinese Academy of Agricultural Sciences, Lanzhou, Gansu, P.R. China
- Xin-jun Wang
- State Key Laboratory of Veterinary Etiological Biology, National Foot-and-Mouth Disease Reference Laboratory, Lanzhou Veterinary Research Institute, Chinese Academy of Agricultural Sciences, Lanzhou, Gansu, P.R. China
- Xue-rui Li
- State Key Laboratory of Veterinary Etiological Biology, National Foot-and-Mouth Disease Reference Laboratory, Lanzhou Veterinary Research Institute, Chinese Academy of Agricultural Sciences, Lanzhou, Gansu, P.R. China
- * E-mail: (XRL); (YSL)
- Yong-sheng Liu
- State Key Laboratory of Veterinary Etiological Biology, National Foot-and-Mouth Disease Reference Laboratory, Lanzhou Veterinary Research Institute, Chinese Academy of Agricultural Sciences, Lanzhou, Gansu, P.R. China
- * E-mail: (XRL); (YSL)
5
Asada M, Hirakawa H, Kuhara S. Classification of Bacteria Based on the Biases of Terminal Amino Acid Residues. Protein J 2011; 30:290-7. [DOI: 10.1007/s10930-011-9332-2] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 10/18/2022]
6
Li W, Yang B, Liang S, Wang Y, Whiteley C, Cao Y, Wang X. BLogo: a tool for visualization of bias in biological sequences. Bioinformatics 2008; 24:2254-5. [PMID: 18682425 DOI: 10.1093/bioinformatics/btn407] [Citation(s) in RCA: 13] [Impact Index Per Article: 0.8] [Indexed: 11/13/2022]
Abstract
BLogo is a web-based tool that detects and displays statistically significant position-specific sequence bias with reduced background noise. The over-represented and under-represented symbols in a particular position are shown above and below the zero line. When the sequences are in open reading frames, the background frequency of nucleotides could be calculated separately for the three positions of a codon, thus greatly reducing the background noise. The χ²-test or Fisher's exact test is used to evaluate the statistical significance of every symbol in every position and only those that are significant are highlighted in the resulting logo. The Perl source code of the program is freely available and can be run locally.
AVAILABILITY: http://acephpx.cropdb.org/blogo/, http://www.bioinformatics.org/blogo/.
Affiliation(s)
- Wencheng Li
- School of Bioscience and Bioengineering, South China University of Technology, Guangzhou 510641, China