1
Jiang R, Yuan S, Zhou Y, Wei Y, Li F, Wang M, Chen B, Yu H. Strategies to overcome the challenges of low or no expression of heterologous proteins in Escherichia coli. Biotechnol Adv 2024; 75:108417. [PMID: 39038691 DOI: 10.1016/j.biotechadv.2024.108417] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 03/21/2024] [Revised: 07/18/2024] [Accepted: 07/19/2024] [Indexed: 07/24/2024]
Abstract
Protein expression is a critical process in diverse biological systems. For Escherichia coli, a widely employed microbial host in industrial catalysis and healthcare, researchers often face significant challenges in constructing recombinant expression systems. To maximize the potential of E. coli expression systems, it is essential to address problems regarding the low or absent production of certain target proteins. This article presents viable solutions to the main factors posing challenges to heterologous protein expression in E. coli, which include protein toxicity, the intrinsic influence of gene sequences, and mRNA structure. These strategies include specialized approaches for managing toxic protein expression, addressing issues related to mRNA structure and codon bias, advanced codon optimization methodologies that consider multiple factors, and emerging optimization techniques facilitated by big data and machine learning.
Affiliation(s)
- Ruizhao Jiang
- Department of Chemical Engineering, Tsinghua University, Beijing 100084, China; Key Laboratory of Industrial Biocatalysis (Tsinghua University), the Ministry of Education, Beijing 100084, China
- Shuting Yuan
- Department of Chemical Engineering, Tsinghua University, Beijing 100084, China; Key Laboratory of Industrial Biocatalysis (Tsinghua University), the Ministry of Education, Beijing 100084, China
- Yilong Zhou
- Tanwei College, Tsinghua University, Beijing 100084, China
- Yuwen Wei
- Department of Chemical Engineering, Tsinghua University, Beijing 100084, China; Key Laboratory of Industrial Biocatalysis (Tsinghua University), the Ministry of Education, Beijing 100084, China
- Fulong Li
- Beijing Evolyzer Co., Ltd., 100176, China
- Bo Chen
- Beijing Evolyzer Co., Ltd., 100176, China
- Huimin Yu
- Department of Chemical Engineering, Tsinghua University, Beijing 100084, China; Key Laboratory of Industrial Biocatalysis (Tsinghua University), the Ministry of Education, Beijing 100084, China; Center for Synthetic and Systems Biology, Tsinghua University, Beijing 100084, China.
2
Ali F. Patterns of Change in Nucleotide Diversity Over Gene Length. Genome Biol Evol 2024; 16:evae078. [PMID: 38608148 PMCID: PMC11040516 DOI: 10.1093/gbe/evae078] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 07/18/2023] [Revised: 03/26/2024] [Accepted: 04/03/2024] [Indexed: 04/14/2024] Open
Abstract
Nucleotide diversity at a site is influenced by the relative strengths of neutral and selective population genetic processes. Therefore, attempts to estimate effective population size based on the diversity of synonymous sites demand a better understanding of their selective constraints. The nucleotide diversity of a gene was previously found to correlate with its length. In this work, I measure nucleotide diversity at synonymous sites and uncover a pattern of low diversity towards the translation initiation site of a gene. The degree of reduction in diversity at the translation initiation site and the length of this region of reduced diversity can be quantified as "Effect Size" and "Effect Length" respectively, using parameters of an asymptotic regression model. Estimates of Effect Length across bacteria covaried with recombination rates as well as with a multitude of translation-associated traits such as the avoidance of mRNA secondary structure around the translation initiation site, the number of rRNAs, and relative codon usage of ribosomal genes. Evolutionary simulations under purifying selection reproduce the observed patterns and diversity-length correlation and highlight that selective constraints on the 5'-region of a gene may be more extensive than previously believed. These results have implications for the estimation of effective population size and relative mutation rates, and for genome scans of genes under positive selection based on "silent-site" diversity.
Affiliation(s)
- Farhan Ali
- Biodesign Center for Mechanisms of Evolution, Arizona State University, Tempe, AZ 85281, USA
3
Moutinho AF, Eyre-Walker A. No Evidence that Selection on Synonymous Codon Usage Affects Patterns of Protein Evolution in Bacteria. Genome Biol Evol 2024; 16:evad232. [PMID: 38149940 PMCID: PMC10849182 DOI: 10.1093/gbe/evad232] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 07/15/2022] [Revised: 12/11/2023] [Accepted: 12/17/2023] [Indexed: 12/28/2023] Open
Abstract
Bias in synonymous codon usage has been reported across all kingdoms of life. Evidence suggests that codon usage bias is often driven by selective pressures, typically for translational efficiency. These selective pressures have been shown to depress the rate at which synonymous sites evolve. We hypothesize that selection on synonymous codon use could also slow the rate of protein evolution if a non-synonymous mutation changes the codon from being preferred to unpreferred. We test this hypothesis by looking at patterns of protein evolution using polymorphism and substitution data in two bacterial species, Escherichia coli and Streptococcus pneumoniae. We find no evidence that non-synonymous mutations that change a codon from being unpreferred to preferred are more common than the opposite. Overall, selection on codon bias seems to have little influence over non-synonymous polymorphism or substitution patterns.
4
Bajaj P, Bhasin M, Varadarajan R. Molecular bases for strong phenotypic effects of single synonymous codon substitutions in the E. coli ccdB toxin gene. BMC Genomics 2023; 24:732. [PMID: 38049728 PMCID: PMC10694988 DOI: 10.1186/s12864-023-09817-0] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 06/18/2023] [Accepted: 11/18/2023] [Indexed: 12/06/2023] Open
Abstract
BACKGROUND: Single synonymous codon mutations typically have only minor or no effects on gene function. Here, we estimate the effects on cell growth of ~200 single synonymous codon mutations in an operonic context by mutating almost all positions of ccdB, the 101-residue long cytotoxin of the ccdAB Toxin-Antitoxin (TA) operon, to most degenerate codons. Phenotypes were assayed by transforming the mutant library into CcdB sensitive and resistant E. coli strains, isolating plasmid pools, and subjecting them to deep sequencing. Since autoregulation is a hallmark of TA operons, phenotypes obtained for ccdB synonymous mutants after transformation in a RelE toxin reporter strain followed by deep sequencing provided information on the amount of CcdAB complex formed.
RESULTS: Synonymous mutations in the N-terminal region involved in translation initiation showed the strongest non-neutral phenotypic effects. We observe an interplay of numerous factors, namely, location of the codon, codon usage, tRNA abundance, formation of anti-Shine-Dalgarno sequences, predicted transcript secondary structure, and evolutionary conservation, in determining phenotypic effects of ccdB synonymous mutations. Incorporation of an N-terminal, hyperactive synonymous mutation in the background of the single synonymous codon mutant library sufficiently increased translation initiation, such that mutational effects on either folding or termination of translation became more apparent. Introduction of putative pause sites not only affects the translational rate, but might also alter the folding kinetics of the protein in vivo.
CONCLUSION: In summary, the study provides novel insights into diverse mechanisms by which synonymous mutations modulate gene function. This information is useful in optimizing heterologous gene expression in E. coli and understanding the molecular bases for alterations in gene expression that arise due to synonymous mutations.
Affiliation(s)
- Priyanka Bajaj
- Molecular Biophysics Unit, Indian Institute of Science, Bangalore, 560012, India
- Present address: Department of Bioengineering and Therapeutic Sciences, University of CA - San Francisco, San Francisco, CA, 94158, USA
- Munmun Bhasin
- Molecular Biophysics Unit, Indian Institute of Science, Bangalore, 560012, India
- Raghavan Varadarajan
- Molecular Biophysics Unit, Indian Institute of Science, Bangalore, 560012, India.
5
Lewin LE, Daniels KG, Hurst LD. Genes for highly abundant proteins in Escherichia coli avoid 5' codons that promote ribosomal initiation. PLoS Comput Biol 2023; 19:e1011581. [PMID: 37878567 PMCID: PMC10599525 DOI: 10.1371/journal.pcbi.1011581] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 07/10/2023] [Accepted: 10/09/2023] [Indexed: 10/27/2023] Open
Abstract
In many species highly expressed genes (HEGs) over-employ the synonymous codons that match the more abundant iso-acceptor tRNAs. Bacterial transgene codon randomization experiments report, however, that enrichment with such "translationally optimal" codons has little to no effect on the resultant protein level. By contrast, consistent with the view that ribosomal initiation is rate limiting, synonymous codon usage following the 5' ATG greatly influences protein levels, at least in part by modifying RNA stability. For the design of bacterial transgenes, for simple codon based in silico inference of protein levels and for understanding selection on synonymous mutations, it would be valuable to computationally determine initiation optimality (IO) scores for codons for any given species. One attractive approach is to characterize the 5' codon enrichment of HEGs compared with the most lowly expressed genes, just as translational optimality scores of codons have been similarly defined employing the full gene body. Here we determine the viability of this approach employing a unique opportunity: for Escherichia coli there is both the most extensive protein abundance data for native genes and a unique large-scale transgene codon randomization experiment enabling objective definition of the 5' codons that cause, rather than just correlate with, high protein abundance (that we equate with initiation optimality, broadly defined). Surprisingly, the 5' ends of native genes that specify highly abundant proteins avoid such initiation optimal codons. We find that this is probably owing to conflicting selection pressures particular to native HEGs, including selection favouring low initiation rates, this potentially enabling high efficiency of ribosomal usage and low noise. 
While the classical HEG enrichment approach does not work, rendering simple prediction of native protein abundance from 5' codon content futile, we report evidence that initiation optimality scores derived from the transgene experiment may hold relevance for in silico transgene design for a broad spectrum of bacteria.
Affiliation(s)
- Loveday E. Lewin
- The Milner Centre for Evolution, Department of Life Sciences, University of Bath, Bath, United Kingdom
- Kate G. Daniels
- The Milner Centre for Evolution, Department of Life Sciences, University of Bath, Bath, United Kingdom
- Laurence D. Hurst
- The Milner Centre for Evolution, Department of Life Sciences, University of Bath, Bath, United Kingdom
6
Ali F. Patterns of change in nucleotide diversity over gene length. bioRxiv 2023:2023.07.13.548940. [PMID: 37503020 PMCID: PMC10369989 DOI: 10.1101/2023.07.13.548940] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 07/29/2023]
Abstract
Nucleotide diversity at a site is influenced by the relative strengths of neutral and selective population genetic processes. Therefore, attempts to identify sites under positive selection require an understanding of the diversity expected in the absence of such selection. The nucleotide diversity of a gene was previously found to correlate with its length. In this work, I measure nucleotide diversity at synonymous sites and uncover a pattern of low diversity towards the translation initiation site (TIS) of a gene. The degree of reduction in diversity at the TIS and the length of this region of reduced diversity can be quantified as "Effect Size" and "Effect Length" respectively, using parameters of an asymptotic regression model. Estimates of Effect Length across bacteria covaried with recombination rates as well as with a multitude of fast-growth adaptations such as the avoidance of mRNA secondary structure around the TIS, the number of rRNAs, and relative codon usage of ribosomal genes. Thus, the dependence of nucleotide diversity on gene length is governed by a combination of selective and non-selective processes. These results have implications for the estimation of effective population size and relative mutation rates based on "silent-site" diversity, and for pN/pS-based prediction of genes under selection.
Affiliation(s)
- Farhan Ali
- Biodesign Institute, Arizona State University, Tempe, Arizona
7
Höllerer S, Jeschek M. Ultradeep characterisation of translational sequence determinants refutes rare-codon hypothesis and unveils quadruplet base pairing of initiator tRNA and transcript. Nucleic Acids Res 2023; 51:2377-2396. [PMID: 36727459 PMCID: PMC10018350 DOI: 10.1093/nar/gkad040] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 06/17/2022] [Revised: 12/05/2022] [Accepted: 01/13/2023] [Indexed: 02/03/2023] Open
Abstract
Translation is a key determinant of gene expression and an important biotechnological engineering target. In bacteria, 5'-untranslated region (5'-UTR) and coding sequence (CDS) are well-known mRNA parts controlling translation and thus cellular protein levels. However, the complex interaction of 5'-UTR and CDS has so far only been studied for few sequences leading to non-generalisable and partly contradictory conclusions. Herein, we systematically assess the dynamic translation from over 1.2 million 5'-UTR-CDS pairs in Escherichia coli to investigate their collective effect using a new method for ultradeep sequence-function mapping. This allows us to disentangle and precisely quantify effects of various sequence determinants of translation. We find that 5'-UTR and CDS individually account for 53% and 20% of variance in translation, respectively, and show conclusively that, contrary to a common hypothesis, tRNA abundance does not explain expression changes between CDSs with different synonymous codons. Moreover, the obtained large-scale data provide clear experimental evidence for a base-pairing interaction between initiator tRNA and mRNA beyond the anticodon-codon interaction, an effect that is often masked for individual sequences and therefore inaccessible to low-throughput approaches. Our study highlights the indispensability of ultradeep sequence-function mapping to accurately determine the contribution of parts and phenomena involved in gene regulation.
Affiliation(s)
- Simon Höllerer
- Department of Biosystems Science and Engineering, Swiss Federal Institute of Technology – ETH Zurich, Basel CH-4058, Switzerland
- Markus Jeschek
- Department of Biosystems Science and Engineering, Swiss Federal Institute of Technology – ETH Zurich, Basel CH-4058, Switzerland
- Institute of Microbiology, Synthetic Microbiology Group, University of Regensburg, Regensburg D-93053, Germany
8
Synonymous mutation rs1129293 is associated with PIK3CG expression and PI3Kγ activation in patients with chronic Chagas cardiomyopathy. Immunobiology 2022; 227:152242. [PMID: 35870262 DOI: 10.1016/j.imbio.2022.152242] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 03/30/2022] [Revised: 06/23/2022] [Accepted: 07/06/2022] [Indexed: 11/20/2022]
Abstract
Single nucleotide polymorphisms (SNPs) that do not change the composition of amino acids and cause synonymous mutations (sSNPs) were previously considered to lack any functional roles. However, sSNPs have recently been shown to interfere with protein expression owing to a myriad of factors related to the regulation of transcription, mRNA stability, and protein translation processes. In patients with Chagas disease, the presence of the synonymous mutation rs1129293 in the phosphatidylinositol-4,5-bisphosphate 3-kinase gamma (PIK3CG) gene contributes to the development of chronic Chagas cardiomyopathy (CCC), instead of the digestive or asymptomatic forms. In this study, we aimed to investigate whether rs1129293 is associated with the transcription of PIK3CG mRNA and its activity by quantifying AKT phosphorylation in the heart samples of 26 chagasic patients with CCC. Our results showed an association between rs1129293 and decreased PIK3CG mRNA expression levels in the cardiac tissues of patients with CCC. The phosphorylation levels of AKT, the protein target of PI3K, were also reduced in patients with this mutation, but were not correlated with PIK3CG mRNA expression levels. Moreover, bioinformatics analysis showed that rs1129293 and other SNPs in linkage disequilibrium (LD) were associated with the transcriptional regulatory elements, post-transcriptional modifications, and cell-specific splicing expression of PIK3CG mRNA. Therefore, our data demonstrate that the synonymous SNP rs1129293 is capable of affecting PIK3CG mRNA expression and PI3Kγ activation.
9
Abstract
Bacterial protein synthesis rates have evolved to maintain preferred stoichiometries at striking precision, from the components of protein complexes to constituents of entire pathways. Setting relative protein production rates to be well within a factor of two requires concerted tuning of transcription, RNA turnover, and translation, allowing many potential regulatory strategies to achieve the preferred output. The last decade has seen a greatly expanded capacity for precise interrogation of each step of the central dogma genome-wide. Here, we summarize how these technologies have shaped the current understanding of diverse bacterial regulatory architectures underpinning stoichiometric protein synthesis. We focus on the emerging expanded view of bacterial operons, which encode diverse primary and secondary mRNA structures for tuning protein stoichiometry. Emphasis is placed on how quantitative tuning is achieved. We discuss the challenges and open questions in the application of quantitative, genome-wide methodologies to the problem of precise protein production. Expected final online publication date for the Annual Review of Microbiology, Volume 75 is October 2021. Please see http://www.annualreviews.org/page/journal/pubdates for revised estimates.
Affiliation(s)
- James C Taggart
- Department of Biology, Massachusetts Institute of Technology, Cambridge, Massachusetts 02139, USA
- Jean-Benoît Lalanne
- Department of Biology, Massachusetts Institute of Technology, Cambridge, Massachusetts 02139, USA; Department of Physics, Massachusetts Institute of Technology, Cambridge, Massachusetts 02139, USA; Current affiliation: Department of Genome Sciences, University of Washington, Seattle, Washington 98195, USA
- Gene-Wei Li
- Department of Biology, Massachusetts Institute of Technology, Cambridge, Massachusetts 02139, USA
10
Abstract
The development of safe and effective vaccines against viruses is central to disease control. With advancements in DNA synthesis technology, the production of synthetic viral genomes has fueled many research efforts that aim to generate attenuated viruses by introducing synonymous mutations. Elucidation of the mechanisms underlying virus attenuation through synonymous mutagenesis is revealing interesting new biology that can be exploited for vaccine development. Here, we review recent advancements in this field of synthetic virology and focus on the molecular mechanisms of attenuation by genetic recoding of viruses. We highlight the action of the zinc finger antiviral protein (ZAP) and RNase L, two proteins involved in the inhibition of viruses enriched for CpG and UpA dinucleotides, that are often the products of virus recoding algorithms. Additionally, we discuss current challenges in the field as well as studies that may illuminate how other host functions, such as translation, are potentially involved in the attenuation of recoded viruses.
11
Simms CL, Yan LL, Qiu JK, Zaher HS. Ribosome Collisions Result in +1 Frameshifting in the Absence of No-Go Decay. Cell Rep 2020; 28:1679-1689.e4. [PMID: 31412239 PMCID: PMC6701860 DOI: 10.1016/j.celrep.2019.07.046] [Citation(s) in RCA: 30] [Impact Index Per Article: 7.5] [Received: 03/21/2019] [Revised: 06/17/2019] [Accepted: 07/15/2019] [Indexed: 12/22/2022] Open
Abstract
During translation, an mRNA is typically occupied by multiple ribosomes sparsely distributed across the coding sequence. This distribution, mediated by slow rates of initiation relative to elongation, ensures that they rarely collide with each other, but given the stochastic nature of protein synthesis, collision events do occur. Recent work from our lab suggested that collisions signal for mRNA degradation through no-go decay (NGD). We have explored the impact of stalling on ribosome function when NGD is compromised and found it to result in +1 frameshifting. We used reporters that limit the number of ribosomes on a transcript to show that +1 frameshifting is induced through ribosome collision in yeast and bacteria. Furthermore, we observe a positive correlation between ribosome density and frameshifting efficiency. It is thus tempting to speculate that NGD, in addition to its role in mRNA quality control, evolved to cope with stochastic collision events to prevent deleterious frameshifting events. Ribosome collisions, resulting from stalling, activate quality control processes to degrade the aberrant mRNA and the incomplete peptide. mRNA degradation proceeds through an endonucleolytic cleavage between the stacked ribosomes, which resolves the collisions. Simms et al. show that, when cleavage is inhibited, colliding ribosomes move out of frame.
Affiliation(s)
- Carrie L Simms
- Department of Biology, Washington University in St. Louis, St. Louis, MO 63130, USA
- Liewei L Yan
- Department of Biology, Washington University in St. Louis, St. Louis, MO 63130, USA
- Jessica K Qiu
- Department of Biology, Washington University in St. Louis, St. Louis, MO 63130, USA
- Hani S Zaher
- Department of Biology, Washington University in St. Louis, St. Louis, MO 63130, USA.
12
Nucleotide composition affects codon usage toward the 3'-end. PLoS One 2019; 14:e0225633. [PMID: 31800603 PMCID: PMC6892556 DOI: 10.1371/journal.pone.0225633] [Citation(s) in RCA: 7] [Impact Index Per Article: 1.4] [Received: 02/07/2019] [Accepted: 11/09/2019] [Indexed: 12/24/2022] Open
Abstract
The 3’-end of the coding sequence in several species is known to show specific codon usage bias. Several factors have been suggested to underlie this phenomenon, including selection against translation efficiency, selection for translation accuracy, and selection against RNA folding. All are supported by some evidence, but there is no general agreement as to which factors are the main determinants. Nor is it known how universal this phenomenon is, and whether the same factors explain it in different species. To answer these questions, we developed a measure that quantifies the codon usage bias at the gene end, and used it to compute this bias for 91 species that span the three domains of life. In addition, we characterized the codons in each species by features that allow discrimination between the different factors. Combining all these data, we were able to show that there is a universal trend to favor AT-rich codons toward the gene end. Moreover, we suggest that this trend is explained by avoidance of RNA secondary structure formation around the stop codon, which may interfere with normal translation termination.
13
Natural tuning of restriction endonuclease synthesis by cluster of rare arginine codons. Sci Rep 2019; 9:5808. [PMID: 30967604 PMCID: PMC6456624 DOI: 10.1038/s41598-019-42311-w] [Citation(s) in RCA: 9] [Impact Index Per Article: 1.8] [Received: 11/09/2018] [Accepted: 03/28/2019] [Indexed: 01/21/2023] Open
Abstract
Restriction–modification (R-M) systems are highly widespread among bacteria and archaea, and they appear to play a pivotal role in modulating horizontal gene transfer, as well as in protecting the host organism against viruses and other invasive DNA particles. Type II R-M systems specify two independent enzymes: a restriction endonuclease (REase) and a protective DNA methyltransferase (MTase). If the cell is to survive, the counteracting activities, as toxin and antitoxin, must be finely balanced in vivo. The molecular basis of this regulatory process remains unclear and current searches for regulatory elements in R-M modules are focused mainly on the transcription step. In this report, we show new aspects of REase control that are linked to translation. We used the EcoVIII R-M system as a model. Both the REase and MTase genes for this R-M system contain an unusually high number of rare arginine codons (AGA and AGG) when compared to the rest of the E. coli K-12 genome. Clusters of these codons near the N-terminus of the REase greatly affect the translational efficiency. Changing these to higher frequency codons for E. coli (CGC) improves the REase synthesis, making the R-M system more potent to defend its host against bacteriophages. However, this improved efficiency in synthesis reduces host fitness due to increased autorestriction. We hypothesize that expression of the endonuclease gene can be modulated depending on the host genetic context and we propose a novel post-transcriptional mode of R–M system regulation that alleviates the potential lethal action of the restriction enzyme.
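As a rough illustration of the recoding described in this abstract, the sketch below locates the rare arginine codons AGA and AGG in an in-frame coding sequence and swaps them for CGC. The function names, the `window` parameter, and the toy sequence are hypothetical and are not taken from the paper; this is not the authors' pipeline.

```python
# Rare arginine codons named in the abstract; everything else here is illustrative.
RARE_ARG_CODONS = {"AGA", "AGG"}


def split_codons(cds):
    """Split an in-frame coding sequence into codons, dropping any trailing partial codon."""
    cds = cds.upper().replace("U", "T")
    usable = len(cds) - len(cds) % 3
    return [cds[i:i + 3] for i in range(0, usable, 3)]


def rare_arg_positions(cds, window=None):
    """Return 1-based codon positions of AGA/AGG, optionally only within the first `window` codons."""
    codons = split_codons(cds)
    if window is not None:
        codons = codons[:window]
    return [i + 1 for i, codon in enumerate(codons) if codon in RARE_ARG_CODONS]


def recode_rare_arg(cds, replacement="CGC"):
    """Replace every AGA/AGG codon with a synonymous, more frequent arginine codon."""
    return "".join(replacement if codon in RARE_ARG_CODONS else codon
                   for codon in split_codons(cds))


# Toy sequence (hypothetical, not the EcoVIII REase gene): ATG AGA AGG AAA CGC AGA TAA
toy_cds = "ATGAGAAGGAAACGCAGATAA"
print(rare_arg_positions(toy_cds))
print(rare_arg_positions(toy_cds, window=4))
print(recode_rare_arg(toy_cds))
```

Restricting the scan with `window` mirrors the abstract's focus on clusters near the N-terminus; the window size itself is a free choice.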
14
Pedersen S, Terkelsen TB, Eriksen M, Hauge MK, Lund CC, Sneppen K, Mitarai N. Fast Translation within the First 45 Codons Decreases mRNA Stability and Increases Premature Transcription Termination in E. coli. J Mol Biol 2019; 431:1088-1097. [DOI: 10.1016/j.jmb.2019.01.026] [Citation(s) in RCA: 1] [Impact Index Per Article: 0.2] [Received: 11/13/2018] [Revised: 01/11/2019] [Accepted: 01/16/2019] [Indexed: 10/27/2022]
15
Mirihana Arachchilage G, Hetti Arachchilage M, Venkataraman A, Piontkivska H, Basu S. Stable G-quadruplex enabling sequences are selected against by the context-dependent codon bias. Gene 2019; 696:149-161. [PMID: 30753890 DOI: 10.1016/j.gene.2019.02.006] [Citation(s) in RCA: 11] [Impact Index Per Article: 2.2] [Received: 07/13/2018] [Revised: 01/14/2019] [Accepted: 02/05/2019] [Indexed: 12/22/2022]
Abstract
The distributions of secondary structural elements appear to differ between coding regions (CDS) of mRNAs and the untranslated regions (UTRs), presumably as a mechanism to fine-tune gene expression, including efficiency of translation. However, a systematic and comprehensive analysis of secondary structure avoidance because of potential bias in codon usage is difficult, as some of the common secondary structures, such as hairpins, can be formed by numerous sequence combinations. Using G-quadruplex (GQ) as the model secondary structure, we studied the impact of codon bias on GQs within the CDS. Because GQs can be predicted using specific consensus sequence motifs, they provide an excellent platform for investigation of the selectivity of such putative structures at the codon level. Using a bioinformatics approach, we calculated the frequencies of putative GQs within the CDS of a variety of species. Our results suggest that the most stable GQs appear to be significantly underrepresented within the CDS, through the use of specific synonymous codon combinations. Furthermore, we identified many peptide sequence motifs in which silent mutations can potentially alter translation via stable GQ formation. This work not only provides a comprehensive analysis of how stable secondary structures appear to be avoided within the CDS of mRNA, but also broadens the current understanding of synonymous codon usage as it relates to the structure-function relationship of RNA.
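The abstract notes that GQs can be predicted from consensus sequence motifs but does not state which motif was used. The sketch below assumes one widely used pattern, four runs of at least three guanines separated by loops of 1 to 7 nucleotides; the pattern, the function name, and the toy sequences are assumptions for illustration only.

```python
import re

# Assumed consensus motif (not confirmed by the abstract):
# G{3+} loop{1-7} G{3+} loop{1-7} G{3+} loop{1-7} G{3+}
PQS_PATTERN = re.compile(
    r"G{3,}[ACGTU]{1,7}G{3,}[ACGTU]{1,7}G{3,}[ACGTU]{1,7}G{3,}"
)


def find_putative_gq(seq):
    """Return (start, end, matched_text) for each non-overlapping putative GQ motif.

    Coordinates are 0-based with an exclusive end, as in Python slicing.
    """
    seq = seq.upper()
    return [(m.start(), m.end(), m.group()) for m in PQS_PATTERN.finditer(seq)]


# Toy sequences (hypothetical): the first contains four GGG runs with 1-nt loops.
print(find_putative_gq("AAGGGAGGGTGGGCGGGTT"))
print(find_putative_gq("ATGGCGACCGGTAAA"))
```

Because `finditer` reports non-overlapping matches, counts from a scan like this would be a lower bound on the number of overlapping motif instances.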
Affiliation(s)
- Aparna Venkataraman
- Department of Chemistry and Biochemistry, Kent State University, Kent, OH 44242, United States of America
- Helen Piontkivska
- Department of Biological Sciences, Kent State University, Kent, OH 44242, United States of America
- Soumitra Basu
- Department of Chemistry and Biochemistry, Kent State University, Kent, OH 44242, United States of America.
16
Abrahams L, Hurst LD. Adenine Enrichment at the Fourth CDS Residue in Bacterial Genes Is Consistent with Error Proofing for +1 Frameshifts. Mol Biol Evol 2018; 34:3064-3080. [PMID: 28961919 PMCID: PMC5850271 DOI: 10.1093/molbev/msx223] [Citation(s) in RCA: 6] [Impact Index Per Article: 1.0] [Indexed: 12/21/2022] Open
Abstract
Beyond selection for optimal protein functioning, coding sequences (CDSs) are under selection at the RNA and DNA levels. Here, we identify a possible signature of “dual-coding,” namely extensive adenine (A) enrichment at bacterial CDS fourth sites. In 99.07% of studied bacterial genomes, fourth site A use is greater than expected given genomic A-starting codon use. Arguing for nucleotide level selection, A-starting serine and arginine second codons are heavily utilized when compared with their non-A starting synonyms. Several models have the ability to explain some of this trend. In part, A-enrichment likely reduces 5′ mRNA stability, promoting translation initiation. However T/U, which may also reduce stability, is avoided. Further, +1 frameshifts on the initiating ATG encode a stop codon (TGA) provided A is the fourth residue, acting either as a frameshift “catch and destroy” or a frameshift stop and adjust mechanism and hence implicated in translation initiation. Consistent with both, genomes lacking TGA stop codons exhibit weaker fourth site A-enrichment. Sequences lacking a Shine–Dalgarno sequence and those without upstream leader genes, that may be more error prone during initiation, have greater utilization of A, again suggesting a role in initiation. The frameshift correction model is consistent with the notion that many genomic features are error-mitigation factors and provides the first evidence for site-specific out of frame stop codon selection. We conjecture that the NTG universal start codon may have evolved as a consequence of TGA being a stop codon and the ability of NTGA to rapidly terminate or adjust a ribosome.
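The frameshift argument in this abstract can be checked mechanically: after a +1 slip on an NTG start codon, the ribosome reads CDS nucleotides 2 to 4, which spell TGA exactly when the fourth residue is A. The sketch below is a minimal illustration; the function names and toy sequences are hypothetical and not from the paper.

```python
STOP_CODONS = {"TAA", "TAG", "TGA"}


def plus_one_codon(cds):
    """Return the codon read after a +1 slip on the start codon (CDS nucleotides 2-4, 1-based)."""
    cds = cds.upper().replace("U", "T")
    if len(cds) < 4:
        raise ValueError("need at least four nucleotides")
    return cds[1:4]


def plus_one_hits_stop(cds):
    """True if the +1 frame immediately encounters a stop codon."""
    return plus_one_codon(cds) in STOP_CODONS


def fourth_site_a_fraction(cds_list):
    """Fraction of sequences (length >= 4) whose fourth CDS nucleotide is A."""
    seqs = [s.upper().replace("U", "T") for s in cds_list if len(s) >= 4]
    if not seqs:
        return 0.0
    return sum(s[3] == "A" for s in seqs) / len(seqs)


# Toy CDS starts (hypothetical): ATG+A and GTG+A both yield TGA in the +1 frame.
toy_genes = ["ATGAGC", "ATGGCT", "GTGACC", "ATGAAA"]
for gene in toy_genes:
    print(gene, plus_one_codon(gene), plus_one_hits_stop(gene))
print(fourth_site_a_fraction(toy_genes))
```

A genome-scale version would compare this fraction with the expectation from genomic A-starting codon use, as the abstract describes; that comparison is not implemented here.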
Affiliation(s)
- Liam Abrahams: Department of Biology and Biochemistry, The Milner Centre for Evolution, University of Bath, Bath, United Kingdom
- Laurence D Hurst: Department of Biology and Biochemistry, The Milner Centre for Evolution, University of Bath, Bath, United Kingdom

17
Abstract
Codon usage depends on mutation bias, tRNA-mediated selection, and the need for high efficiency and accuracy in translation. One codon in a synonymous codon family is often strongly over-used, especially in highly expressed genes, which often leads to a high dN/dS ratio because dS is very small. Many different codon usage indices have been proposed to measure codon usage and codon adaptation. Sense codon could be misread by release factors and stop codons misread by tRNAs, which also contribute to codon usage in rare cases. This chapter outlines the conceptual framework on codon evolution, illustrates codon-specific and gene-specific codon usage indices, and presents their applications. A new index for codon adaptation that accounts for background mutation bias (Index of Translation Elongation) is presented and contrasted with codon adaptation index (CAI) which does not consider background mutation bias. They are used to re-analyze data from a recent paper claiming that translation elongation efficiency matters little in protein production. The reanalysis disproves the claim.
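As an illustration of a codon-specific usage index, the sketch below computes relative synonymous codon usage (RSCU), a widely used measure. It is an assumption that RSCU is a helpful stand-in here: the chapter's own Index of Translation Elongation is not reproduced, and the function name `rscu` and the example sequence are hypothetical.

```python
from collections import Counter
from itertools import product

# Standard genetic code, codons enumerated with bases in T, C, A, G order.
BASES = "TCAG"
AMINO = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {"".join(c): aa for c, aa in zip(product(BASES, repeat=3), AMINO)}


def rscu(cds: str) -> dict:
    """Relative synonymous codon usage for one coding sequence.

    RSCU(codon) = observed count * family size / total count for that
    amino acid. A value of 1.0 means no bias within the family.
    """
    seq = cds.upper().replace("U", "T")
    codons = [seq[i:i + 3] for i in range(0, len(seq) - len(seq) % 3, 3)]
    counts = Counter(c for c in codons if c in CODON_TABLE)

    families = {}
    for codon, aa in CODON_TABLE.items():
        families.setdefault(aa, []).append(codon)

    values = {}
    for aa, family in families.items():
        total = sum(counts[c] for c in family)
        if aa == "*" or total == 0:
            continue  # skip stop codons and amino acids absent from the CDS
        for codon in family:
            values[codon] = counts[codon] * len(family) / total
    return values


# Hypothetical toy sequence: four leucine codons, three of them CTG.
print(rscu("CTGCTGCTGCTA")["CTG"])
```

RSCU ignores background mutation bias, which is the limitation of CAI-style measures that the abstract raises.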

18
Bhattacharyya S, Jacobs WM, Adkar BV, Yan J, Zhang W, Shakhnovich EI. Accessibility of the Shine-Dalgarno Sequence Dictates N-Terminal Codon Bias in E. coli. Mol Cell 2018; 70:894-905.e5. [PMID: 29883608 PMCID: PMC6311106 DOI: 10.1016/j.molcel.2018.05.008] [Citation(s) in RCA: 41] [Impact Index Per Article: 6.8] [Received: 10/09/2017] [Revised: 02/14/2018] [Accepted: 05/03/2018] [Indexed: 10/14/2022]
Abstract
Despite considerable efforts, no physical mechanism has been shown to explain N-terminal codon bias in prokaryotic genomes. Using a systematic study of synonymous substitutions in two endogenous E. coli genes, we show that interactions between the coding region and the upstream Shine-Dalgarno (SD) sequence modulate the efficiency of translation initiation, affecting both intracellular mRNA and protein levels due to the inherent coupling of transcription and translation in E. coli. We further demonstrate that far-downstream mutations can also modulate mRNA levels by occluding the SD sequence through the formation of non-equilibrium secondary structures. By contrast, a non-endogenous RNA polymerase that decouples transcription and translation largely alleviates the effects of synonymous substitutions on mRNA levels. Finally, a complementary statistical analysis of the E. coli genome specifically implicates avoidance of intra-molecular base pairing with the SD sequence. Our results provide general physical insights into the coding-level features that optimize protein expression in prokaryotes.
Affiliation(s)
- Sanchari Bhattacharyya: Department of Chemistry and Chemical Biology, Harvard University, 12 Oxford Street, Cambridge, MA, USA
- William M Jacobs: Department of Chemistry and Chemical Biology, Harvard University, 12 Oxford Street, Cambridge, MA, USA
- Bharat V Adkar: Department of Chemistry and Chemical Biology, Harvard University, 12 Oxford Street, Cambridge, MA, USA
- Jin Yan: Department of Chemistry and Chemical Biology, Harvard University, 12 Oxford Street, Cambridge, MA, USA; College of Chemical Engineering, Sichuan University, Chengdu 610065, Sichuan, China
- Wenli Zhang: Department of Chemistry and Chemical Biology, Harvard University, 12 Oxford Street, Cambridge, MA, USA; State Key Laboratory of Food Science and Technology, Jiangnan University, Wuxi 214122, China
- Eugene I Shakhnovich: Department of Chemistry and Chemical Biology, Harvard University, 12 Oxford Street, Cambridge, MA, USA

19
Paul P, Malakar AK, Chakraborty S. Compositional bias coupled with selection and mutation pressure drives codon usage in Brassica campestris genes. Food Sci Biotechnol 2017; 27:725-733. [PMID: 30263798 DOI: 10.1007/s10068-017-0285-x] [Citation(s) in RCA: 4] [Impact Index Per Article: 0.6] [Received: 07/29/2017] [Revised: 11/28/2017] [Accepted: 12/03/2017] [Indexed: 11/25/2022]
Abstract
The plant Brassica campestris includes the vegetables turnip and Chinese cabbage, important plants of economic importance. Here, we have analysed the codon usage bias of B. campestris for 116 protein coding genes. Neutrality analysis showed that B. campestris had a wide range of GC3s, and a significant correlation was observed between GC12 and GC3. Nc versus GC3s plot showed a few genes on or proximate to the expected curve, but the majority of points were found to be scattered distantly from the expected curve. Correspondence analysis on codon usage revealed that the position preference of codons on multidimensional space totally depends on the presence of A and T at synonymous third codon position. These results altogether suggest that composition bias along with selection (major) and mutation pressure (minor) affects the codon usage pattern of the protein coding genes in Brassica campestris.
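The neutrality analysis mentioned above compares GC content at the first two codon positions (GC12) with GC content at the third. A minimal sketch of that calculation follows. It assumes a simplification: it computes GC3 over all third positions, whereas GC3s is normally restricted to synonymous third positions. The function name and the example sequence are hypothetical.

```python
def gc_by_codon_position(cds: str):
    """Return (GC12, GC3) for an in-frame coding sequence.

    GC12 is the GC fraction over first and second codon positions;
    GC3 is the GC fraction over third codon positions.
    """
    seq = cds.upper().replace("U", "T")
    seq = seq[: len(seq) - len(seq) % 3]  # drop any trailing partial codon
    if not seq:
        raise ValueError("sequence must contain at least one full codon")

    def gc_fraction(bases: str) -> float:
        return sum(b in "GC" for b in bases) / len(bases)

    first_two = "".join(seq[i:i + 2] for i in range(0, len(seq), 3))
    third = seq[2::3]
    return gc_fraction(first_two), gc_fraction(third)


# Hypothetical toy sequence: codons ATG, GCC, AAA.
gc12, gc3 = gc_by_codon_position("ATGGCCAAA")
print(gc12, gc3)
```

In a neutrality plot, each gene contributes one (GC3, GC12) point, and the slope of the regression across genes is used to weigh mutation pressure against selection.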
Affiliation(s)
- Prosenjit Paul: Department of Biotechnology, Assam University, Silchar, Assam 788011 India
- Arup Kumar Malakar: Department of Biotechnology, Assam University, Silchar, Assam 788011 India

20
Burkhardt DH, Rouskin S, Zhang Y, Li GW, Weissman JS, Gross CA. Operon mRNAs are organized into ORF-centric structures that predict translation efficiency. eLife 2017; 6. [PMID: 28139975 PMCID: PMC5318159 DOI: 10.7554/elife.22037] [Citation(s) in RCA: 107] [Impact Index Per Article: 15.3] [Received: 10/03/2016] [Accepted: 01/27/2017] [Indexed: 02/02/2023]
Abstract
Bacterial mRNAs are organized into operons consisting of discrete open reading frames (ORFs) in a single polycistronic mRNA. Individual ORFs on the mRNA are differentially translated, with rates varying as much as 100-fold. The signals controlling differential translation are poorly understood. Our genome-wide mRNA secondary structure analysis indicated that operonic mRNAs are comprised of ORF-wide units of secondary structure that vary across ORF boundaries such that adjacent ORFs on the same mRNA molecule are structurally distinct. ORF translation rate is strongly correlated with its mRNA structure in vivo, and correlation persists, albeit in a reduced form, with its structure when translation is inhibited and with that of in vitro refolded mRNA. These data suggest that intrinsic ORF mRNA structure encodes a rough blueprint for translation efficiency. This structure is then amplified by translation, in a self-reinforcing loop, to provide the structure that ultimately specifies the translation of each ORF.
Proteins make up much of the biological machinery inside cells and perform the essential tasks needed to keep each cell alive. Cells contain thousands of different proteins and the instructions needed to build each protein are encoded in genes. However, these instructions cannot be used directly to manufacture the proteins. Instead, a messenger molecule called mRNA is needed to carry the information stored within genes to the parts of the cell where proteins are made. In bacteria, one mRNA molecule can include information from several genes. This group of genes is called an operon and produces a set of proteins that perform a shared task. Although these proteins work together, some of them are needed in greater numbers than others. Because they are all made using information from the same mRNA, some instructions on the mRNA must be read more times than others.
It is unclear how bacterial cells control how many proteins are produced from each part of one mRNA but it is thought to relate to the three-dimensional shape of the molecule itself. Burkhardt, Rouskin, Zhang et al. have now examined the production of proteins from mRNAs in the commonly studied bacterium, Escherichia coli. The results showed that each set of instructions on the mRNA formed a three-dimensional structure that corresponds to the amount of protein produced from that portion of the mRNA. When this three-dimensional structure is more stable or rigid, the corresponding instructions tended to produce fewer proteins than if the structure was relatively simple and unstable. Further investigation showed that these three-dimensional mRNA structures could form spontaneously outside of cells, suggesting that molecules other than the mRNA itself have a relatively small role in controlling the number of proteins produced. This also suggests that the entire structure of each mRNA is important and is likely to be essential for cell survival. The next step is to understand why bacteria organise their genes in this way and how the different mRNA structures control how proteins are produced. Moreover, because many bacteria are used like biological factories to produce a variety of commercially useful molecules, these new insights have the potential to enhance a number of manufacturing processes.
Affiliation(s)
- David H Burkhardt: Graduate Group in Biophysics, University of California, San Francisco, San Francisco, United States; Department of Microbiology and Immunology, University of California, San Francisco, San Francisco, United States; California Institute of Quantitative Biology, University of California, San Francisco, San Francisco, United States
- Silvi Rouskin: California Institute of Quantitative Biology, University of California, San Francisco, San Francisco, United States; Department of Cellular and Molecular Pharmacology, Howard Hughes Medical Institute, University of California, San Francisco, San Francisco, United States; Center for RNA Systems Biology, University of California, San Francisco, San Francisco, United States
- Yan Zhang: Department of Microbiology and Immunology, University of California, San Francisco, San Francisco, United States; Department of Cell and Tissue Biology, University of California, San Francisco, San Francisco, United States
- Gene-Wei Li: California Institute of Quantitative Biology, University of California, San Francisco, San Francisco, United States; Department of Cellular and Molecular Pharmacology, Howard Hughes Medical Institute, University of California, San Francisco, San Francisco, United States; Center for RNA Systems Biology, University of California, San Francisco, San Francisco, United States
- Jonathan S Weissman: California Institute of Quantitative Biology, University of California, San Francisco, San Francisco, United States; Department of Cellular and Molecular Pharmacology, Howard Hughes Medical Institute, University of California, San Francisco, San Francisco, United States; Center for RNA Systems Biology, University of California, San Francisco, San Francisco, United States
- Carol A Gross: Department of Microbiology and Immunology, University of California, San Francisco, San Francisco, United States; California Institute of Quantitative Biology, University of California, San Francisco, San Francisco, United States; Department of Cell and Tissue Biology, University of California, San Francisco, San Francisco, United States

21
Agashe D, Sane M, Phalnikar K, Diwan GD, Habibullah A, Martinez-Gomez NC, Sahasrabuddhe V, Polachek W, Wang J, Chubiz LM, Marx CJ. Large-Effect Beneficial Synonymous Mutations Mediate Rapid and Parallel Adaptation in a Bacterium. Mol Biol Evol 2016; 33:1542-53. [PMID: 26908584 PMCID: PMC4868122 DOI: 10.1093/molbev/msw035] [Citation(s) in RCA: 46] [Impact Index Per Article: 5.8] [Indexed: 12/17/2022]
Abstract
Contrary to previous understanding, recent evidence indicates that synonymous codon changes may sometimes face strong selection. However, it remains difficult to generalize the nature, strength, and mechanism(s) of such selection. Previously, we showed that synonymous variants of a key enzyme-coding gene (fae) of Methylobacterium extorquens AM1 decreased enzyme production and reduced fitness dramatically. We now show that during laboratory evolution, these variants rapidly regained fitness via parallel yet variant-specific, highly beneficial point mutations in the N-terminal region of fae. These mutations (including four synonymous mutations) had weak but consistently positive impacts on transcript levels, enzyme production, or enzyme activity. However, none of the proposed mechanisms (including internal ribosome pause sites or mRNA structure) predicted the fitness impact of evolved or additional, engineered point mutations. This study shows that synonymous mutations can be fixed through strong positive selection, but the mechanism for their benefit varies depending on the local sequence context.
Affiliation(s)
- Deepa Agashe: National Center for Biological Sciences, Tata Institute of Fundamental Research, GKVK, Bangalore, India; Department of Organismic and Evolutionary Biology, Harvard University
- Mrudula Sane: National Center for Biological Sciences, Tata Institute of Fundamental Research, GKVK, Bangalore, India
- Kruttika Phalnikar: National Center for Biological Sciences, Tata Institute of Fundamental Research, GKVK, Bangalore, India
- Gaurav D Diwan: National Center for Biological Sciences, Tata Institute of Fundamental Research, GKVK, Bangalore, India; SASTRA University, Thanjavur, India
- Alefiyah Habibullah: National Center for Biological Sciences, Tata Institute of Fundamental Research, GKVK, Bangalore, India
- Vinaya Sahasrabuddhe: National Center for Biological Sciences, Tata Institute of Fundamental Research, GKVK, Bangalore, India
- William Polachek: Department of Organismic and Evolutionary Biology, Harvard University
- Jue Wang: Department of Organismic and Evolutionary Biology, Harvard University; Systems Biology Graduate Program, Harvard University
- Lon M Chubiz: Department of Organismic and Evolutionary Biology, Harvard University
- Christopher J Marx: Department of Organismic and Evolutionary Biology, Harvard University; Faculty of Arts and Sciences Center for Systems Biology, Harvard University; Department of Biological Sciences, University of Idaho; Institute for Bioinformatics and Evolutionary Studies, University of Idaho

22
Abstract
Synonymous mutations do not change the sequence of the polypeptide but they may still influence fitness. We investigated in Salmonella enterica how four synonymous mutations in the rpsT gene (encoding ribosomal protein S20) reduce fitness (i.e., growth rate) and the mechanisms by which this cost can be genetically compensated. The reduced growth rates of the synonymous mutants were correlated with reduced levels of the rpsT transcript and S20 protein. In an adaptive evolution experiment, these fitness impairments could be compensated by mutations that either caused up-regulation of S20 through increased gene dosage (due to duplications), increased transcription of the rpsT gene (due to an rpoD mutation or mutations in rpsT), or increased translation from the rpsT transcript (due to rpsT mutations). We suggest that the reduced levels of S20 in the synonymous mutants result in production of a defective subpopulation of 30S subunits lacking S20 that reduce protein synthesis and bacterial growth and that the compensatory mutations restore S20 levels and the number of functional ribosomes. Our results demonstrate how specific synonymous mutations can cause substantial fitness reductions and that many different types of intra- and extragenic compensatory mutations can efficiently restore fitness. Furthermore, this study highlights that also synonymous sites can be under strong selection, which may have implications for the use of dN/dS ratios as signature for selection.
Affiliation(s)
- Anna Knöppel: Department of Medical Biochemistry and Microbiology, Uppsala University, Uppsala, Sweden
- Joakim Näsvall: Department of Medical Biochemistry and Microbiology, Uppsala University, Uppsala, Sweden
- Dan I Andersson: Department of Medical Biochemistry and Microbiology, Uppsala University, Uppsala, Sweden

23
Zhang D, Chen D, Cao L, Li G, Cheng H. The Effect of Codon Mismatch on the Protein Translation System. PLoS One 2016; 11:e0148302. [PMID: 26840415 PMCID: PMC4739699 DOI: 10.1371/journal.pone.0148302] [Citation(s) in RCA: 4] [Impact Index Per Article: 0.5] [Received: 09/22/2015] [Accepted: 01/15/2016] [Indexed: 11/18/2022]
Abstract
Incorrect protein translation, caused by codon mismatch, is an important problem of living cells. In this work, a computational model was introduced to quantify the effects of codon mismatch and the model was used to study the protein translation of Saccharomyces cerevisiae. According to simulation results, the probability of codon mismatch will increase when the supply of amino acids is unbalanced, and the longer is the codon sequence, the larger is the probability for incorrect translation to occur, making the synthesis of long peptide chain difficult. By comparing to simulation results without codon mismatch effects taken into account, the fraction of mRNAs with bound ribosome decrease faster along the mRNAs, making the 5’ ramp phenomenon more obvious. It was also found in our work that the premature mechanism resulted from codon mismatch can reduce the proportion of incorrect translation when the amino acid supply is extremely unbalanced, which is one possible source of high fidelity protein synthesis after peptidyl transfer.
Affiliation(s)
- Dinglin Zhang: Laboratory of Molecular Modeling and Design, State Key Laboratory of Molecular Reaction Dynamics, Dalian Institute of Chemical Physics, Chinese Academy of Sciences, Dalian, Liaoning, 116023, China
- Danfeng Chen: Dalian City Fisheries Technical Extension Station, Dalian, Liaoning, 116025, China
- Liaoran Cao: Laboratory of Molecular Modeling and Design, State Key Laboratory of Molecular Reaction Dynamics, Dalian Institute of Chemical Physics, Chinese Academy of Sciences, Dalian, Liaoning, 116023, China
- Guohui Li: Laboratory of Molecular Modeling and Design, State Key Laboratory of Molecular Reaction Dynamics, Dalian Institute of Chemical Physics, Chinese Academy of Sciences, Dalian, Liaoning, 116023, China
- Hong Cheng: Shanghai Key Laboratory of Molecular Andrology, State Key Laboratory of Molecular Biology, Institute of Biochemistry and Cell Biology, Shanghai Institutes for Biological Sciences, Chinese Academy of Sciences, Shanghai, 200031, China

24
Uddin A, Chakraborty S. Synonymous codon usage pattern in mitochondrial CYB gene in pisces, aves, and mammals. Mitochondrial DNA A DNA Mapp Seq Anal 2015; 28:187-196. [DOI: 10.3109/19401736.2015.1115842] [Citation(s) in RCA: 24] [Impact Index Per Article: 2.7] [Indexed: 12/16/2022]
Affiliation(s)
- Arif Uddin: Department of Biotechnology, Assam University, Silchar, Assam, India

25
Camiolo S, Melito S, Porceddu A. New insights into the interplay between codon bias determinants in plants. DNA Res 2015; 22:461-70. [PMID: 26546225 PMCID: PMC4675714 DOI: 10.1093/dnares/dsv027] [Citation(s) in RCA: 26] [Impact Index Per Article: 2.9] [Received: 06/30/2015] [Accepted: 10/01/2015] [Indexed: 12/28/2022]
Abstract
Codon bias is the non-random use of synonymous codons, a phenomenon that has been observed in species as diverse as bacteria, plants and mammals. The preferential use of particular synonymous codons may reflect neutral mechanisms (e.g. mutational bias, G|C-biased gene conversion, genetic drift) and/or selection for mRNA stability, translational efficiency and accuracy. The extent to which these different factors influence codon usage is unknown, so we dissected the contribution of mutational bias and selection towards codon bias in genes from 15 eudicots, 4 monocots and 2 mosses. We analysed the frequency of mononucleotides, dinucleotides and trinucleotides and investigated whether the compositional genomic background could account for the observed codon usage profiles. Neutral forces such as mutational pressure and G|C-biased gene conversion appeared to underlie most of the observed codon bias, although there was also evidence for the selection of optimal translational efficiency and mRNA folding. Our data confirmed the compositional differences between monocots and dicots, with the former featuring in general a lower background compositional bias but a higher overall codon bias.
Affiliation(s)
- S Camiolo: Dipartimento di Agraria, SACEG, Università degli Studi di Sassari, Sassari, Italy
- S Melito: Dipartimento di Agraria, SACEG, Università degli Studi di Sassari, Sassari, Italy
- A Porceddu: Dipartimento di Agraria, SACEG, Università degli Studi di Sassari, Sassari, Italy

26
Zhang Z, Presgraves DC. Drosophila X-Linked Genes Have Lower Translation Rates than Autosomal Genes. Mol Biol Evol 2015; 33:413-28. [DOI: 10.1093/molbev/msv227] [Citation(s) in RCA: 9] [Impact Index Per Article: 1.0] [Received: 10/17/2015] [Accepted: 10/12/2015] [Indexed: 12/13/2022]

27
How do bacteria tune translation efficiency? Curr Opin Microbiol 2015; 24:66-71. [PMID: 25636133 DOI: 10.1016/j.mib.2015.01.001] [Citation(s) in RCA: 61] [Impact Index Per Article: 6.8] [Received: 11/19/2014] [Revised: 01/02/2015] [Accepted: 01/08/2015] [Indexed: 11/20/2022]
Abstract
Bacterial proteins are translated with precisely determined rates to meet cellular demand. In contrast, efforts to express recombinant proteins in bacteria are often met with large unpredictability in their levels of translation. The disconnect between translation of natural and synthetic mRNA stems from the lack of understanding of the strategy used by bacteria to tune translation efficiency (TE). The development of array-based oligonucleotide synthesis and ribosome profiling provides new approaches to address this issue. Although the major determinant for TE is still unknown, these high-throughput studies point out a statistically significant but mild contribution from the mRNA secondary structure around the start codon. Here I summarize those findings and provide a theoretical framework for measuring TE.

28
Ben-Yehezkel T, Atar S, Zur H, Diament A, Goz E, Marx T, Cohen R, Dana A, Feldman A, Shapiro E, Tuller T. Rationally designed, heterologous S. cerevisiae transcripts expose novel expression determinants. RNA Biol 2015; 12:972-84. [PMID: 26176266 PMCID: PMC4615757 DOI: 10.1080/15476286.2015.1071762] [Citation(s) in RCA: 31] [Impact Index Per Article: 3.4] [Received: 05/04/2015] [Revised: 06/30/2015] [Accepted: 07/07/2015] [Indexed: 01/23/2023]
Abstract
Deducing generic causal relations between RNA transcript features and protein expression profiles from endogenous gene expression data remains a major unsolved problem in biology. The analysis of gene expression from heterologous genes contributes significantly to solving this problem, but has been heavily biased toward the study of the effect of 5' transcript regions and to prokaryotes. Here, we employ a synthetic biology driven approach that systematically differentiates the effect of different regions of the transcript on gene expression up to 240 nucleotides into the ORF. This enabled us to discover new causal effects between features in previously unexplored regions of transcripts, and gene expression in natural regimes. We rationally designed, constructed, and analyzed 383 gene variants of the viral HRSVgp04 gene ORF, with multiple synonymous mutations at key positions along the transcript in the eukaryote S. cerevisiae. Our results show that a few silent mutations at the 5' UTR can have a dramatic effect of up to 15 fold change on protein levels, and that even synonymous mutations in positions more than 120 nucleotides downstream from the ORF 5' end can modulate protein levels up to 160%-300%. We demonstrate that the correlation between protein levels and folding energy increases with the significance of the level of selection of the latter in endogenous genes, reinforcing the notion that selection for folding strength in different parts of the ORF is related to translation regulation. Our measured protein abundance correlates notably (correlation up to r = 0.62, p = 0.0013) with mean relative codon decoding times, based on ribosomal densities (Ribo-Seq) in endogenous genes, supporting the conjecture that translation elongation and adaptation to the tRNA pool can modify protein levels in a causal/direct manner. This report provides an improved understanding of transcript evolution, design principles of gene expression regulation, and suggests simple rules for engineering synthetic gene expression in eukaryotes.
Affiliation(s)
- Tuval Ben-Yehezkel: Department of Biomedical Engineering, Tel-Aviv University, Tel-Aviv, Israel; Department of Biological Chemistry, Weizmann Institute of Science, Rehovot, Israel; Department of Applied Mathematics and Computer Science, Weizmann Institute of Science, Rehovot, Israel (contributed equally to this work)
- Shimshi Atar: Department of Biomedical Engineering, Tel-Aviv University, Tel-Aviv, Israel (contributed equally to this work)
- Hadas Zur: Department of Biomedical Engineering, Tel-Aviv University, Tel-Aviv, Israel
- Alon Diament: Department of Biomedical Engineering, Tel-Aviv University, Tel-Aviv, Israel
- Eli Goz: Department of Biomedical Engineering, Tel-Aviv University, Tel-Aviv, Israel
- Tzipy Marx: Department of Biological Chemistry, Weizmann Institute of Science, Rehovot, Israel
- Rafael Cohen: Department of Biological Chemistry, Weizmann Institute of Science, Rehovot, Israel
- Alexandra Dana: Department of Biomedical Engineering, Tel-Aviv University, Tel-Aviv, Israel
- Anna Feldman: Department of Biomedical Engineering, Tel-Aviv University, Tel-Aviv, Israel
- Ehud Shapiro: Department of Biological Chemistry, Weizmann Institute of Science, Rehovot, Israel; Department of Applied Mathematics and Computer Science, Weizmann Institute of Science, Rehovot, Israel
- Tamir Tuller: Department of Biomedical Engineering, Tel-Aviv University, Tel-Aviv, Israel; Sagol School of Neuroscience, Tel-Aviv University, Tel-Aviv, Israel

29
Tuller T, Zur H. Multiple roles of the coding sequence 5' end in gene expression regulation. Nucleic Acids Res 2014; 43:13-28. [PMID: 25505165 PMCID: PMC4288200 DOI: 10.1093/nar/gku1313] [Citation(s) in RCA: 137] [Impact Index Per Article: 13.7] [Indexed: 12/31/2022]
Abstract
The codon composition of the coding sequence's (ORF) 5′ end first few dozen codons is known to be distinct to that of the rest of the ORF. Various explanations for the unusual codon distribution in this region have been proposed in recent years, and include, among others, novel regulatory mechanisms of translation initiation and elongation. However, due to the fact that many overlapping regulatory signals are suggested to be associated with this relatively short region, its research is challenging. Here, we review the currently known signals that appear in this region, the theories related to the way they regulate translation and affect the organismal fitness, and the debates they provoke.
Affiliation(s)
- Tamir Tuller: Department of Biomedical Engineering, the Engineering Faculty, Tel Aviv University, Tel Aviv, Israel; The Sagol School of Neuroscience, Tel Aviv University, Tel Aviv 69978, Israel
- Hadas Zur: Department of Biomedical Engineering, the Engineering Faculty, Tel Aviv University, Tel Aviv, Israel

30
Faiq MA, Ali M, Dada T, Dada R, Saluja D. A novel methodology for enhanced and consistent heterologous expression of unmodified human cytochrome P450 1B1 (CYP1B1). PLoS One 2014; 9:e110473. [PMID: 25329831 PMCID: PMC4199734 DOI: 10.1371/journal.pone.0110473] [Citation(s) in RCA: 12] [Impact Index Per Article: 1.2] [Received: 06/28/2014] [Accepted: 09/08/2014] [Indexed: 12/29/2022]
Abstract
Cytochrome P450 1B1 (CYP1B1) is a universal cancer marker and is implicated in many other disorders. Mutations in CYP1B1 are also associated with childhood blindness due to primary congenital glaucoma (PCG). To understand the CYP1B1 mediated etiopathology of PCG and pathomechanism of various cancers, it is important to carry out its functional studies. Heterologous expression of CYP1B1 in prokaryotes is imperative because bacteria yield a higher amount of heterologous proteins in lesser time and so the expressed protein is ideal for functional studies. In such expression system there is no interference by other eukaryotic proteins. But the story is not that simple as expression of heterologous CYP1B1 poses many technical difficulties. Investigators have employed various modifications/deletions of CYP N-terminus to improve CYP1B1 expression. However, the drawback of these studies is that it changes the original protein and, as a result, invalidates functional studies. The present study examines the role of various conditions and reagents in successful and consistent expression of sufficient quantities of unmodified/native human CYP1B1 in E. coli. We aimed at expressing CYP1B1 in various strains of E. coli and in the course developed a protocol that results in high expression of unmodified protein sufficient for functional/biophysical studies. We examined CYP1B1 expression with respect to different expression vectors, bacterial strains, types of culture media, time, Isopropyl β-D-1-thiogalactopyranoside concentrations, temperatures, rotations per minute, conditioning reagents and the efficacy of a newly described technique called double colony selection. We report a protocol that is simple, easy and can be carried out in any laboratory without the requirement of a fermentor. Though employed for CYP1B1 expression, this protocol can ideally be used to express any eukaryotic membrane protein.
Affiliation(s)
- Muneeb A. Faiq: Dr. Rajendra Prasad Centre for Ophthalmic Sciences, All India Institute of Medical Sciences, Ansari Nagar, New Delhi, India; Medical Biotechnology Laboratory, Dr. B. R. Ambedkar Centre for Biomedical Research, University of Delhi, North Campus, Delhi, India; Laboratory for Molecular Reproduction and Genetics, Department of Anatomy, All India Institute of Medical Sciences, Ansari Nagar, India
- Mashook Ali: Medical Biotechnology Laboratory, Dr. B. R. Ambedkar Centre for Biomedical Research, University of Delhi, North Campus, Delhi, India
- Tanuj Dada: Dr. Rajendra Prasad Centre for Ophthalmic Sciences, All India Institute of Medical Sciences, Ansari Nagar, New Delhi, India
- Rima Dada: Laboratory for Molecular Reproduction and Genetics, Department of Anatomy, All India Institute of Medical Sciences, Ansari Nagar, India
- Daman Saluja: Medical Biotechnology Laboratory, Dr. B. R. Ambedkar Centre for Biomedical Research, University of Delhi, North Campus, Delhi, India

31
Kessler MD, Dean MD. Effective population size does not predict codon usage bias in mammals. Ecol Evol 2014; 4:3887-900. [PMID: 25505518 PMCID: PMC4242573 DOI: 10.1002/ece3.1249] [Citation(s) in RCA: 8] [Impact Index Per Article: 0.8] [Reference Citation Analysis] [Abstract] [Key Words] [Track Full Text] [Download PDF] [Figures] [Journal Information] [Subscribe] [Scholar Register] [Received: 07/22/2014] [Revised: 08/04/2014] [Accepted: 08/07/2014] [Indexed: 12/20/2022] Open
Abstract
Synonymous codons are not used at equal frequency throughout the genome, a phenomenon termed codon usage bias (CUB). It is often assumed that interspecific variation in the intensity of CUB is related to species differences in effective population sizes (Ne), with selection on CUB operating less efficiently in species with small Ne. Here, we specifically ask whether variation in Ne predicts differences in CUB in mammals and report two main findings. First, across 41 mammalian genomes, CUB was not correlated with two indirect proxies of Ne (body mass and generation time), even though there was statistically significant evidence of selection shaping CUB across all species. Interestingly, autosomal genes showed higher codon usage bias compared to X-linked genes, and high-recombination genes showed higher codon usage bias compared to low recombination genes, suggesting intraspecific variation in Ne predicts variation in CUB. Second, across six mammalian species with genetic estimates of Ne (human, chimpanzee, rabbit, and three mouse species: Mus musculus, M. domesticus, and M. castaneus), Ne and CUB were weakly and inconsistently correlated. At least in mammals, interspecific divergence in Ne does not strongly predict variation in CUB. One hypothesis is that each species responds to a unique distribution of selection coefficients, confounding any straightforward link between Ne and CUB.
Affiliation(s)
- Michael D Kessler
  - Molecular and Computational Biology, University of Southern California, 1050 Childs Way, Los Angeles, California, 90089
- Matthew D Dean
  - Molecular and Computational Biology, University of Southern California, 1050 Childs Way, Los Angeles, California, 90089
|
32
|
Hussmann JA, Press WH. Local correlations in codon preferences do not support a model of tRNA recycling. Cell Rep 2014; 8:1624-1629. [PMID: 25199837 DOI: 10.1016/j.celrep.2014.08.012] [Citation(s) in RCA: 2] [Impact Index Per Article: 0.2] [Reference Citation Analysis] [Abstract] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Received: 07/07/2014] [Revised: 07/28/2014] [Accepted: 08/06/2014] [Indexed: 10/24/2022] Open
Abstract
It has been proposed that patterns in the usage of synonymous codons provide evidence that individual tRNA molecules are recycled through the ribosome, translating several occurrences of the same amino acid before diffusing away. The claimed evidence is based on counting the frequency with which pairs of synonymous codons are used at nearby occurrences of the same amino acid, as compared to the frequency expected if each codon were chosen independently from a single genome-wide distribution. We show that such statistics simply measure variation in codon preferences across a genome. As a negative control on the potential contribution of pressure to exploit tRNA recycling on these signals, we examine correlations in the usage of codons that encode different amino acids. We find that these controls are statistically as strong as the claimed evidence and conclude that there is no informatic evidence that tRNA recycling is a force shaping codon usage.
Affiliation(s)
- Jeffrey A Hussmann
  - Institute for Computational Engineering and Sciences, University of Texas, Austin, TX 78712, USA; Institute for Cellular and Molecular Biology, University of Texas, Austin, TX 78712, USA
- William H Press
  - Institute for Computational Engineering and Sciences, University of Texas, Austin, TX 78712, USA; Institute for Cellular and Molecular Biology, University of Texas, Austin, TX 78712, USA
|
33
|
Abstract
Although the mapping of codon to amino acid is conserved across nearly all species, the frequency at which synonymous codons are used varies both between organisms and between genes from the same organism. This variation affects diverse cellular processes including protein expression, regulation, and folding. Here, we mathematically model an additional layer of complexity and show that individual codon usage biases follow a position-dependent exponential decay model with unique parameter fits for each codon. We use this methodology to perform an in-depth analysis on codon usage bias in the model organism Escherichia coli. Our methodology shows that lowly and highly expressed genes are more similar in their codon usage patterns in the 5′-gene regions, but that these preferences diverge at distal sites, resulting in greater positional dependency (pD, which we mathematically define later) for highly expressed genes. We show that position-dependent codon usage bias is partially explained by the structural requirements of mRNAs that result in increased usage of A/T-rich codons shortly after the gene start. However, we also show that the pD of 4- and 6-fold degenerate codons is partially related to the gene copy number of cognate tRNAs, supporting existing hypotheses that posit benefits to a region of slow translation in the beginning of coding sequences. Lastly, we demonstrate that viewing codon usage bias through a position-dependent framework has practical utility by improving accuracy of gene expression prediction when incorporating positional dependencies into the Codon Adaptation Index model.
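For readers unfamiliar with the Codon Adaptation Index mentioned in this abstract, the following is a minimal sketch of the classical, position-independent CAI: each codon gets a relative adaptiveness w (its count in a reference gene set divided by the count of the most used synonymous codon), and the CAI of a gene is the geometric mean of w over its codons. This is not the position-dependent variant the abstract describes. The function names, the toy reference sequences, and the 0.5 pseudocount for unseen codons are assumptions made for illustration.

```python
from collections import Counter, defaultdict
from itertools import product
from math import exp, log

# Standard genetic code, codons enumerated in TCAG order.
BASES = "TCAG"
AMINO = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {"".join(c): a for c, a in zip(product(BASES, repeat=3), AMINO)}


def codons(cds):
    """Split an in-frame coding sequence into codons (trailing bases dropped)."""
    seq = cds.upper().replace("U", "T")
    return [seq[i:i + 3] for i in range(0, len(seq) - len(seq) % 3, 3)]


def relative_adaptiveness(reference_cds):
    """Compute w for every codon of each multi-codon amino acid family."""
    counts = Counter()
    for cds in reference_cds:
        counts.update(codons(cds))
    families = defaultdict(list)
    for codon, aa in CODON_TABLE.items():
        families[aa].append(codon)
    w = {}
    for aa, family in families.items():
        if aa == "*" or len(family) == 1:
            continue  # stop codons, Met and Trp carry no synonymous choice
        top = max(counts[c] for c in family)
        if top == 0:
            continue  # amino acid absent from the reference set
        for c in family:
            # Assumed pseudocount of 0.5 so unseen codons do not give log(0).
            w[c] = max(counts[c], 0.5) / top
    return w


def cai(cds, w):
    """Geometric mean of w over the codons of cds that have a defined w."""
    logs = [log(w[c]) for c in codons(cds) if c in w]
    if not logs:
        raise ValueError("no scorable codons in sequence")
    return exp(sum(logs) / len(logs))


# Hypothetical toy reference set standing in for highly expressed genes.
w = relative_adaptiveness(["CTGCTGCTGCTA", "AAAAAG"])
print(round(cai("CTACTG", w), 4))
```

In practice the reference set would be a curated list of highly expressed genes from the organism of interest; the two short strings above exist only to keep the sketch self-contained.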
Affiliation(s)
- Adam J Hockenberry
  - Department of Chemical and Biological Engineering, Northwestern University; Interdepartmental Program in Biological Sciences, Northwestern University
- M Irmak Sirer
  - Department of Chemical and Biological Engineering, Northwestern University
- Luís A Nunes Amaral
  - Department of Chemical and Biological Engineering, Northwestern University; Northwestern Institute on Complex Systems, Northwestern University; Howard Hughes Medical Institute, Northwestern University
- Michael C Jewett
  - Department of Chemical and Biological Engineering, Northwestern University; Interdepartmental Program in Biological Sciences, Northwestern University; Northwestern Institute on Complex Systems, Northwestern University; Chemistry of Life Processes Institute, Northwestern University; Institute for BioNanotechnology and Medicine, Northwestern University
|
34
|
Relative specificity: all substrates are not created equal. GENOMICS PROTEOMICS & BIOINFORMATICS 2014; 12:1-7. [PMID: 24491634 PMCID: PMC4411342 DOI: 10.1016/j.gpb.2014.01.001] [Citation(s) in RCA: 2] [Impact Index Per Article: 0.2] [Reference Citation Analysis] [Abstract] [Key Words] [Track Full Text] [Download PDF] [Figures] [Subscribe] [Scholar Register] [Received: 11/26/2013] [Revised: 12/21/2013] [Accepted: 01/07/2014] [Indexed: 11/24/2022]
Abstract
A biological molecule, e.g., an enzyme, tends to interact with its many cognate substrates, targets, or partners differentially. Such a property is termed relative specificity and has been proposed to regulate important physiological functions, even though it has not been examined explicitly in most complex biochemical systems. This essay reviews several recent large-scale studies that investigate protein folding, signal transduction, RNA binding, translation and transcription in the context of relative specificity. These results and others support a pervasive role of relative specificity in diverse biological processes. It is becoming clear that relative specificity contributes fundamentally to the diversity and complexity of biological systems, which has significant implications in disease processes as well.
|
35
|
Stergachis AB, Haugen E, Shafer A, Fu W, Vernot B, Reynolds A, Raubitschek A, Ziegler S, LeProust EM, Akey JM, Stamatoyannopoulos JA. Exonic transcription factor binding directs codon choice and affects protein evolution. Science 2013; 342:1367-72. [PMID: 24337295 DOI: 10.1126/science.1243490] [Citation(s) in RCA: 201] [Impact Index Per Article: 18.3] [Reference Citation Analysis] [Abstract] [Journal Information] [Subscribe] [Scholar Register] [Indexed: 01/08/2023]
Abstract
Genomes contain both a genetic code specifying amino acids and a regulatory code specifying transcription factor (TF) recognition sequences. We used genomic deoxyribonuclease I footprinting to map nucleotide resolution TF occupancy across the human exome in 81 diverse cell types. We found that ~15% of human codons are dual-use codons ("duons") that simultaneously specify both amino acids and TF recognition sites. Duons are highly conserved and have shaped protein evolution, and TF-imposed constraint appears to be a major driver of codon usage bias. Conversely, the regulatory code has been selectively depleted of TFs that recognize stop codons. More than 17% of single-nucleotide variants within duons directly alter TF binding. Pervasive dual encoding of amino acid and regulatory information appears to be a fundamental feature of genome evolution.
Affiliation(s)
- Andrew B Stergachis
  - Department of Genome Sciences, University of Washington, Seattle, WA 98195, USA
|
36
|
Charneski CA, Hurst LD. Positive Charge Loading at Protein Termini Is Due to Membrane Protein Topology, Not a Translational Ramp. Mol Biol Evol 2013; 31:70-84. [DOI: 10.1093/molbev/mst169] [Citation(s) in RCA: 22] [Impact Index Per Article: 2.0] [Reference Citation Analysis] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Indexed: 11/13/2022] Open
|
37
|
Shah P, Ding Y, Niemczyk M, Kudla G, Plotkin J. Rate-limiting steps in yeast protein translation. Cell 2013; 153:1589-601. [PMID: 23791185 PMCID: PMC3694300 DOI: 10.1016/j.cell.2013.05.049] [Citation(s) in RCA: 334] [Impact Index Per Article: 30.4] [Reference Citation Analysis] [Abstract] [MESH Headings] [Grants] [Track Full Text] [Download PDF] [Figures] [Journal Information] [Subscribe] [Scholar Register] [Received: 06/16/2012] [Revised: 02/11/2013] [Accepted: 05/29/2013] [Indexed: 12/04/2022]
Abstract
Deep sequencing now provides detailed snapshots of ribosome occupancy on mRNAs. We leverage these data to parameterize a computational model of translation, keeping track of every ribosome, tRNA, and mRNA molecule in a yeast cell. We determine the parameter regimes in which fast initiation or high codon bias in a transgene increases protein yield and infer the initiation rates of endogenous Saccharomyces cerevisiae genes, which vary by several orders of magnitude and correlate with 5′ mRNA folding energies. Our model recapitulates the previously reported 5′-to-3′ ramp of decreasing ribosome densities, although our analysis shows that this ramp is caused by rapid initiation of short genes rather than slow codons at the start of transcripts. We conclude that protein production in healthy yeast cells is typically limited by the availability of free ribosomes, whereas protein production under periods of stress can sometimes be rescued by reducing initiation or elongation rates.
- Computational model of translation tracks all ribosomes, tRNAs, and mRNAs in a cell
- Translation is generally limited by initiation, not elongation
- Model allows inference of initiation rates for all yeast genes
- Ramp of 5′ ribosomes is caused by rapid initiation of short genes
Affiliation(s)
- Premal Shah
  - Department of Biology, University of Pennsylvania, Philadelphia, PA 19104, USA
- Yang Ding
  - Department of Biology, University of Pennsylvania, Philadelphia, PA 19104, USA
- Malwina Niemczyk
  - Wellcome Trust Centre for Cell Biology, University of Edinburgh, Edinburgh EH3 9LP, UK
- Grzegorz Kudla
  - Wellcome Trust Centre for Cell Biology, University of Edinburgh, Edinburgh EH3 9LP, UK
- Joshua B. Plotkin
  - Department of Biology, University of Pennsylvania, Philadelphia, PA 19104, USA
  - Corresponding author
|
38
|
Zhou JH, Su JH, Chen HT, Zhang J, Ma LN, Ding YZ, Stipkovits L, Szathmary S, Pejsak Z, Liu YS. Clustering of low usage codons in the translation initiation region of hepatitis C virus. INFECTION GENETICS AND EVOLUTION 2013; 18:8-12. [DOI: 10.1016/j.meegid.2013.03.043] [Citation(s) in RCA: 4] [Impact Index Per Article: 0.4] [Reference Citation Analysis] [Track Full Text] [Subscribe] [Scholar Register] [Received: 11/17/2012] [Revised: 02/26/2013] [Accepted: 03/24/2013] [Indexed: 01/02/2023]
|
39
|
Bentele K, Saffert P, Rauscher R, Ignatova Z, Blüthgen N. Efficient translation initiation dictates codon usage at gene start. Mol Syst Biol 2013; 9:675. [PMID: 23774758 PMCID: PMC3964316 DOI: 10.1038/msb.2013.32] [Citation(s) in RCA: 192] [Impact Index Per Article: 17.5] [Reference Citation Analysis] [Abstract] [Track Full Text] [Download PDF] [Figures] [Journal Information] [Subscribe] [Scholar Register] [Received: 02/26/2013] [Accepted: 05/14/2013] [Indexed: 12/18/2022] Open
Abstract
Rare codons are enriched at gene start in many genomes. Genome analysis and experimental testing show that this enrichment evolved to keep the ribosome binding site free from stable mRNA structures, in order to facilitate efficient translation initiation.
The use of rare codons coincides with suppression of mRNA structures at the ribosome binding site across genomes. There is preferential selection for synonymous codons that reduce GC-content at the beginning of genes and a stronger pressure for rare codon usage in GC-rich organisms. Amino acids encoded by AU-rich codons are preferred at gene start. Experimental results show that mRNA structure at translation start strongly influences protein yield.
The genetic code is degenerate; thus, protein evolution does not uniquely determine the coding sequence. One of the puzzles in evolutionary genetics is therefore to uncover evolutionary driving forces that result in specific codon choice. In many bacteria, the first 5–10 codons of protein-coding genes are often codons that are less frequently used in the rest of the genome, an effect that has been argued to arise from selection for slowed early elongation to reduce ribosome traffic jams. However, genome analysis across many species has demonstrated that the region shows reduced mRNA folding consistent with pressure for efficient translation initiation. This raises the possibility that unusual codon usage is a side effect of selection for reduced mRNA structure. Here we discriminate between these two competing hypotheses, and show that in bacteria selection favours codons that reduce mRNA folding around the translation start, regardless of whether these codons are frequent or rare. Experiments confirm that primarily mRNA structure, and not codon usage, at the beginning of genes determines the translation rate.
Affiliation(s)
- Kajetan Bentele
  - Institute for Theoretical Biology, Humboldt Universität zu Berlin, Berlin, Germany
|
40
|
Lind PA, Andersson DI. Fitness costs of synonymous mutations in the rpsT gene can be compensated by restoring mRNA base pairing. PLoS One 2013; 8:e63373. [PMID: 23691039 PMCID: PMC3655191 DOI: 10.1371/journal.pone.0063373] [Citation(s) in RCA: 16] [Impact Index Per Article: 1.5] [Reference Citation Analysis] [Abstract] [MESH Headings] [Track Full Text] [Download PDF] [Figures] [Journal Information] [Subscribe] [Scholar Register] [Received: 03/18/2013] [Accepted: 03/27/2013] [Indexed: 01/12/2023] Open
Abstract
We previously reported that the distributions of fitness effects for non-synonymous and synonymous mutations in Salmonella typhimurium ribosomal proteins S20 and L1 are similar, suggesting that fitness constraints are present at the level of mRNA. Here we explore the hypothesis that synonymous mutations confer their fitness-reducing effect by altering the secondary structure of the mRNA. To this end, we constructed a set of synonymous substitutions in the rpsT gene, encoding ribosomal protein S20, that are located in predicted paired regions in the mRNA and measured their effect on bacterial fitness. Our results show that for 3/9 cases tested, the reduced fitness conferred by a synonymous mutation could be fully or partly restored by introducing a second synonymous substitution that restores base pairing in an mRNA stem. In addition, random mutations in predicted paired regions had larger fitness effects than those in unpaired regions. Finally, we did not observe any correlation between fitness effects of the synonymous mutations and their rarity. These results suggest that for ribosomal protein S20, the deleterious effects of synonymous mutations are not generally due to codon usage effects, but that mRNA secondary structure is a major fitness constraint.
Affiliation(s)
- Peter A. Lind
  - Department of Medical Biochemistry and Microbiology, Uppsala University, Uppsala, Sweden
- Dan I. Andersson
  - Department of Medical Biochemistry and Microbiology, Uppsala University, Uppsala, Sweden
|
41
|
Ding Y, Shah P, Plotkin JB. Weak 5'-mRNA secondary structures in short eukaryotic genes. Genome Biol Evol 2013; 4:1046-53. [PMID: 23034215 PMCID: PMC3490412 DOI: 10.1093/gbe/evs082] [Citation(s) in RCA: 19] [Impact Index Per Article: 1.7] [Reference Citation Analysis] [Abstract] [Track Full Text] [Download PDF] [Figures] [Journal Information] [Subscribe] [Scholar Register] [Indexed: 01/20/2023] Open
Abstract
Experimental studies of translation have found that short genes tend to exhibit greater densities of ribosomes than long genes in eukaryotic species. It remains an open question whether the elevated ribosome density on short genes is due to faster initiation or slower elongation dynamics. Here, we address this question computationally using 5′-mRNA folding energy as a proxy for translation initiation rates and codon bias as a proxy for elongation rates. We report a significant trend toward reduced 5′-secondary structure in shorter coding sequences, suggesting that short genes initiate faster during translation. We also find a trend toward higher 5′-codon bias in short genes, suggesting that short genes elongate faster than long genes. Both of these trends hold across a diverse set of eukaryotic taxa. Thus, the elevated ribosome density on short eukaryotic genes is likely caused by differential rates of initiation, rather than differential rates of elongation.
Affiliation(s)
- Yang Ding
  - Department of Biology, University of Pennsylvania, PA, USA
|
42
|
Non-optimal codon usage is a mechanism to achieve circadian clock conditionality. Nature 2013; 495:116-20. [PMID: 23417065 PMCID: PMC3593822 DOI: 10.1038/nature11942] [Citation(s) in RCA: 136] [Impact Index Per Article: 12.4] [Reference Citation Analysis] [Abstract] [Track Full Text] [Download PDF] [Figures] [Journal Information] [Subscribe] [Scholar Register] [Received: 05/29/2012] [Accepted: 01/30/2013] [Indexed: 11/29/2022]
Abstract
Circadian rhythms are oscillations in biological processes that function as a key adaptation to the daily rhythms of most environments. In the model cyanobacterial circadian clock system, the core oscillator proteins are encoded by the gene cluster kaiABC [1]. Genes with high expression and functional importance like the kai genes are usually encoded by optimal codons, yet the codon usage bias of the kaiBC genes is not optimized for translational efficiency. We discovered a relationship between codon usage and a general property of circadian rhythms called conditionality; namely, that endogenous rhythmicity is robustly expressed under some environmental conditions but not under others [2]. Despite the generality of circadian conditionality, however, its molecular basis is unknown for any system. Here we show that non-optimal codon usage was selected as a post-transcriptional mechanism to switch between circadian and non-circadian regulation of gene expression as an adaptive response to environmental conditions. When the kaiBC sequence was experimentally optimized to enhance expression of the KaiB and KaiC proteins, intrinsic rhythmicity was enhanced at cool temperatures that are experienced by this organism in its natural habitat. However, fitness at those temperatures was highest in cells whose endogenous rhythms were suppressed at cool temperatures as compared with cells exhibiting high-amplitude rhythmicity. These results indicate natural selection against circadian systems in cyanobacteria that are intrinsically robust at cool temperatures. Modulation of circadian amplitude is therefore critical to its adaptive significance [3]. Moreover, these results show the direct effects of codon usage on a complex phenotype and organismal fitness. Our work also challenges the long-standing view of directional selection towards optimal codons [4–7], and provides a key example of natural selection against optimal codons to achieve adaptive responses to environmental changes.
|
43
|
A comparative analysis on the synonymous codon usage pattern in viral functional genes and their translational initiation region of ASFV. Virus Genes 2012; 46:271-9. [PMID: 23161403 DOI: 10.1007/s11262-012-0847-1] [Citation(s) in RCA: 7] [Impact Index Per Article: 0.6] [Reference Citation Analysis] [Abstract] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Received: 07/03/2012] [Accepted: 11/01/2012] [Indexed: 01/21/2023]
Abstract
The synonymous codon usage pattern of African swine fever virus (ASFV), the similarity degree of the synonymous codon usage between this virus and some organisms and the synonymous codon usage bias for the translation initiation region of viral functional genes in the whole genome of ASFV have been investigated by some simple statistical analyses. Although both GC12% (the GC content at the first and second codon positions) and GC3% (the GC content at the third codon position) of viral functional genes have a large fluctuation, the significant correlations between GC12% and GC3% and between GC3% and the first principal axis of principal component analysis on the relative synonymous codon usage of the viral functional genes imply that mutation pressure of ASFV plays an important role in the synonymous codon usage pattern. Turning to the synonymous codon usage of this virus, the codons with U/A end predominate in the synonymous codon family for the same amino acid and a weak codon usage bias in both leading and lagging strands suggests that strand compositional asymmetry does not take part in the formation of codon usage in ASFV. The interaction between the absolute codon usage bias and GC3% suggests that other selections take part in the formation of codon usage, except for the mutation pressure. It is noted that the similarity degree of codon usage between ASFV and soft tick is higher than that between the virus and the pig, suggesting that the soft tick plays a more important role than the pig in the codon usage pattern of ASFV. The translational initiation region of the viral functional genes generally has a strong tendency to select some synonymous codons with low GC content, suggesting that the synonymous codon usage bias caused by translation selection from the host takes part in modulating the translation initiation efficiency of ASFV functional genes.
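The two composition measures defined in this abstract, GC12% and GC3%, can be computed directly from an in-frame coding sequence. The sketch below is illustrative only: the function name is hypothetical, and it counts every codon, whereas published analyses often exclude stop codons or single-codon amino acids.

```python
def gc12_gc3(cds):
    """Return (GC12%, GC3%) for an in-frame coding sequence.

    GC12% is the G+C percentage over first and second codon positions;
    GC3% is the G+C percentage at third codon positions.
    """
    seq = cds.upper().replace("U", "T")
    if len(seq) == 0 or len(seq) % 3 != 0:
        raise ValueError("sequence length must be a positive multiple of 3")
    codon_list = [seq[i:i + 3] for i in range(0, len(seq), 3)]
    gc = set("GC")
    # Booleans add as 0/1, so each codon contributes 0, 1 or 2 here.
    first_second = sum((c[0] in gc) + (c[1] in gc) for c in codon_list)
    third = sum(c[2] in gc for c in codon_list)
    n = len(codon_list)
    return 100.0 * first_second / (2 * n), 100.0 * third / n


print(gc12_gc3("ATGGCC"))  # (50.0, 100.0)
```

Computing both values per gene and correlating them across a genome is the kind of GC12 versus GC3 comparison the abstract refers to.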
|
44
|
A selective force favoring increased G+C content in bacterial genes. Proc Natl Acad Sci U S A 2012; 109:14504-7. [PMID: 22908296 DOI: 10.1073/pnas.1205683109] [Citation(s) in RCA: 97] [Impact Index Per Article: 8.1] [Reference Citation Analysis] [Abstract] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Indexed: 11/18/2022] Open
Abstract
Bacteria display considerable variation in their overall base compositions, which range from 13% to over 75% G+C. This variation in genomic base compositions has long been considered to be a strictly neutral character, due solely to differences in the mutational process; however, recent sequence comparisons indicate that mutational input alone cannot produce the observed base compositions, implying a role for natural selection. Because bacterial genomes have high gene content, forces that operate on the base composition of individual genes could help shape the overall genomic base composition. To explore this possibility, we tested whether genes that encode the same protein but vary only in their base compositions at synonymous sites have effects on bacterial fitness. Escherichia coli strains harboring G+C-rich versions of genes display higher growth rates, indicating that despite a pervasive mutational bias toward A+T, a selective force, independent of adaptive codon use, is driving genes toward higher G+C contents.
|
45
|
Martincorena I, Seshasayee ASN, Luscombe NM. Evidence of non-random mutation rates suggests an evolutionary risk management strategy. Nature 2012; 485:95-8. [PMID: 22522932 DOI: 10.1038/nature10995] [Citation(s) in RCA: 119] [Impact Index Per Article: 9.9] [Reference Citation Analysis] [Abstract] [MESH Headings] [Grants] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Received: 10/27/2011] [Accepted: 02/29/2012] [Indexed: 01/12/2023]
Abstract
A central tenet in evolutionary theory is that mutations occur randomly with respect to their value to an organism; selection then governs whether they are fixed in a population. This principle has been challenged by long-standing theoretical models predicting that selection could modulate the rate of mutation itself. However, our understanding of how the mutation rate varies between different sites within a genome has been hindered by technical difficulties in measuring it. Here we present a study that overcomes previous limitations by combining phylogenetic and population genetic techniques. Upon comparing 34 Escherichia coli genomes, we observe that the neutral mutation rate varies by more than an order of magnitude across 2,659 genes, with mutational hot and cold spots spanning several kilobases. Importantly, the variation is not random: we detect a lower rate in highly expressed genes and in those undergoing stronger purifying selection. Our observations suggest that the mutation rate has been evolutionarily optimized to reduce the risk of deleterious mutations. Current knowledge of factors influencing the mutation rate—including transcription-coupled repair and context-dependent mutagenesis—do not explain these observations, indicating that additional mechanisms must be involved. The findings have important implications for our understanding of evolution and the control of mutations.
Affiliation(s)
- Iñigo Martincorena
  - EMBL-European Bioinformatics Institute, Wellcome Trust Genome Campus, Cambridge CB10 1SD, UK
|
46
|
Radomski JP, Slonimski PP. Alignment free characterization of the influenza-A hemagglutinin genes by the ISSCOR method. C R Biol 2012; 335:180-93. [PMID: 22464426 DOI: 10.1016/j.crvi.2012.01.001] [Citation(s) in RCA: 1] [Impact Index Per Article: 0.1] [Reference Citation Analysis] [Abstract] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Received: 12/07/2010] [Revised: 10/26/2011] [Accepted: 01/11/2012] [Indexed: 12/23/2022]
Abstract
Analyses and visualizations by the ISSCOR method of the influenza virus hemagglutinin genes of three different A-subtypes revealed some rather striking temporal (for A/H3N3) and spatial (for A/H5N1) relationships between groups of individual gene subsets. The application to the A/H1N1 set also revealed relationships between the seasonal H1 and the swine-like novel 2009 H1v variants in a quick and unambiguous manner. Based on these examples, we consider the application of the ISSCOR method for analysis of large sets of homologous genes a worthwhile addition to the genomics toolbox; it allows rapid diagnosis of trends, and possibly can even aid early warning of newly emerging epidemiological threats.
Affiliation(s)
- Jan P Radomski
  - Interdisciplinary Center for Mathematical and Computational Modeling, Warsaw University, Warsaw, Poland
|
47
|
Lenz G, Doron-Faigenboim A, Ron EZ, Tuller T, Gophna U. Sequence features of E. coli mRNAs affect their degradation. PLoS One 2011; 6:e28544. [PMID: 22163312 PMCID: PMC3233582 DOI: 10.1371/journal.pone.0028544] [Citation(s) in RCA: 21] [Impact Index Per Article: 1.6] [Reference Citation Analysis] [Abstract] [MESH Headings] [Track Full Text] [Download PDF] [Figures] [Journal Information] [Subscribe] [Scholar Register] [Received: 09/07/2011] [Accepted: 11/10/2011] [Indexed: 11/19/2022] Open
Abstract
Degradation of mRNA in bacteria is a regulatory mechanism, providing an efficient way to fine-tune protein abundance in response to environmental changes. While the mechanisms responsible for initiation and subsequent propagation of mRNA degradation are well studied, the mRNA features that affect its stability are yet to be elucidated. We calculated three properties for each mRNA in the E. coli transcriptome: G+C content, tRNA adaptation index (tAI) and folding energy. Each of these properties was then correlated with the experimentally measured half-life of each transcript, and significant correlations were detected. A sliding window analysis identified the regions that displayed the maximal signal. The correlation between transcript half-life and both G+C content and folding energy was strongest at the 5' termini of the mRNAs. Partial correlations showed that each of the parameters contributes separately to mRNA half-life. Notably, mRNAs of recently acquired genes in the E. coli genome, which have a distinct nucleotide composition, tend to be highly stable. This high stability may aid the evolutionary fixation of horizontally acquired genes.
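The sliding window analysis mentioned in this abstract can be pictured with a small sketch that reports G+C fraction in successive windows along a transcript. The abstract does not state the window or step sizes used, so the defaults below, like the function name, are assumptions chosen only for illustration.

```python
def sliding_gc(seq, window=50, step=10):
    """Return (start, gc_fraction) for each full window along seq.

    Windows start at 0, step, 2*step, ... and partial windows at the
    3' end are skipped. Window and step defaults are assumed values.
    """
    if window <= 0 or step <= 0:
        raise ValueError("window and step must be positive")
    s = seq.upper()
    profile = []
    for start in range(0, len(s) - window + 1, step):
        chunk = s[start:start + window]
        gc_count = sum(base in "GC" for base in chunk)
        profile.append((start, gc_count / window))
    return profile


print(sliding_gc("GGGGAAAA", window=4, step=2))
# [(0, 1.0), (2, 0.5), (4, 0.0)]
```

A per-window value such as this one would then be correlated with transcript half-life across genes, window by window, to locate the region carrying the strongest signal.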
Affiliation(s)
- Gal Lenz
  - Department of Molecular Microbiology and Biotechnology, George S. Wise Faculty of Life Sciences, Tel Aviv University, Ramat Aviv, Israel
- Adi Doron-Faigenboim
  - Department of Molecular Microbiology and Biotechnology, George S. Wise Faculty of Life Sciences, Tel Aviv University, Ramat Aviv, Israel
- Eliora Z. Ron
  - Department of Molecular Microbiology and Biotechnology, George S. Wise Faculty of Life Sciences, Tel Aviv University, Ramat Aviv, Israel
- Tamir Tuller
  - Iby and Aladar Fleischman Faculty of Engineering, Tel Aviv University, Ramat Aviv, Israel
- Uri Gophna
  - Department of Molecular Microbiology and Biotechnology, George S. Wise Faculty of Life Sciences, Tel Aviv University, Ramat Aviv, Israel
|
48
|
Lin CH, Lian CY, Hsiung CA, Chen FC. Changes in transcriptional orientation are associated with increases in evolutionary rates of enterobacterial genes. BMC Bioinformatics 2011; 12 Suppl 9:S19. [PMID: 22152004 PMCID: PMC3283321 DOI: 10.1186/1471-2105-12-s9-s19] [Citation(s) in RCA: 3] [Impact Index Per Article: 0.2] [Reference Citation Analysis] [Abstract] [Track Full Text] [Download PDF] [Figures] [Journal Information] [Subscribe] [Scholar Register] [Indexed: 12/02/2022] Open
Abstract
Background: Changes in transcriptional orientation (“CTOs”) occur frequently in prokaryotic genomes. Such changes usually result from genomic inversions, which may cause a conflict between the directions of replication and transcription and an increase in mutation rate. However, CTOs do not always lead to replication-transcription confrontation. Furthermore, CTOs may cause deleterious disruptions of operon structure and/or gene regulation. The currently existing CTOs may indicate relaxation of selection pressure. Therefore, it is of interest to investigate whether CTOs have an independent effect on the evolutionary rates of the affected genes, and whether these genes are subject to any type of selection pressure in prokaryotes.
Methods: Three closely related enterobacteria, Escherichia coli, Klebsiella pneumoniae and Salmonella enterica serovar Typhimurium, were selected for comparisons of synonymous (dS) and nonsynonymous (dN) substitution rates between the genes that have experienced changes in transcriptional orientation (changed-orientation genes, “COGs”) and those that have not (same-orientation genes, “SOGs”). The dN/dS ratio was also derived to evaluate the selection pressure on the analyzed genes. Confounding factors in the estimation of evolutionary rates, such as gene essentiality, gene expression level, replication-transcription confrontation, and decreased dS at gene terminals, were controlled in the COG-SOG comparisons.
Results: We demonstrate that COGs have significantly higher dN and dS than SOGs when a series of confounding factors are controlled. However, the dN/dS ratios are similar between the two gene groups, suggesting that the increase in dS can sufficiently explain the increase in dN in COGs. Therefore, the increases in evolutionary rates in COGs may be mainly mutation-driven.
Conclusions: Here we show that CTOs can increase the evolutionary rates of the affected genes. This effect is independent of the replication-transcription confrontation, which is suggested to be the major cause of inversion-associated evolutionary rate increases. The real cause of such evolutionary rate increases remains unclear but merits further exploration.
Affiliation(s)
- Chieh-Hua Lin
- Division of Biostatistics and Bioinformatics, Institute of Population Health Sciences, National Health Research Institutes, 35 Keyen Road, Zhunan Town, Miaoli County, Taiwan, Republic of China
49
Nørholm MHH, Light S, Virkki MTI, Elofsson A, von Heijne G, Daley DO. Manipulating the genetic code for membrane protein production: what have we learnt so far? Biochimica et Biophysica Acta - Biomembranes 2011; 1818:1091-6. [PMID: 21884679 DOI: 10.1016/j.bbamem.2011.08.018] [Citation(s) in RCA: 15] [Impact Index Per Article: 1.2] [Received: 06/20/2011] [Revised: 08/04/2011] [Accepted: 08/15/2011] [Indexed: 12/19/2022]
Abstract
With synthetic gene services, molecular cloning is as easy as ordering a pizza. However, choosing the right RNA code for efficient protein production is less straightforward, more akin to deciding on the pizza toppings. The ability to choose synonymous codons in the gene sequence has ignited a discussion that dates back 50 years: Does synonymous codon use matter? Recent studies indicate that replacing particular codons with synonymous codons can improve expression in homologous or heterologous hosts; however, this is not always successful. Furthermore, it is increasingly apparent that membrane protein biogenesis can be codon-sensitive. Single synonymous codon substitutions can influence mRNA stability, mRNA structure, translational initiation, translational elongation and even protein folding. Synonymous codon substitutions therefore need to be carefully evaluated when membrane proteins are engineered for higher production levels, and further studies are needed to fully understand how to select the codons that are optimal for higher production. This article is part of a Special Issue entitled: Protein Folding in Membranes.
Affiliation(s)
- Morten H H Nørholm
- Center for Biomembrane Research, Department of Biochemistry and Biophysics, Stockholm University, SE-106 91, Sweden.
50
Stoletzki N. The surprising negative correlation of gene length and optimal codon use--disentangling translational selection from GC-biased gene conversion in yeast. BMC Evol Biol 2011; 11:93. [PMID: 21481245 PMCID: PMC3096941 DOI: 10.1186/1471-2148-11-93] [Citation(s) in RCA: 15] [Impact Index Per Article: 1.2] [Received: 05/14/2010] [Accepted: 04/11/2011] [Indexed: 02/06/2023]
Abstract
Background: Surprisingly, in several multicellular eukaryotes optimal codon use correlates negatively with gene length. This contrasts with the expectation under selection for translational accuracy. While suggested explanations focus on variation in strength and efficiency of translational selection, it has rarely been noticed that the negative correlation is reported only in organisms whose optimal codons are biased towards codons that end with G or C (-GC). This raises the question of whether forces that affect base composition, such as GC-biased gene conversion, contribute to the negative correlation between optimal codon use and gene length.
Results: Yeast is a good organism in which to study this, as equal numbers of optimal codons end in -GC and -AT, and one may hence compare frequencies of optimal GC-ending with optimal AT-ending codons to disentangle the forces. The results of this study demonstrate that in yeast the frequencies of GC-ending (optimal and non-optimal) codons decrease with gene length and increase with recombination. A decrease of GC-ending codons along genes contributes to the negative correlation with gene length. Correlations with recombination and gene expression differentiate between GC-ending and optimal codons, and substitution patterns also support effects of GC-biased gene conversion.
Conclusion: While the general effect of GC-biased gene conversion is well known, the negative correlation of optimal codon use with gene length has not been considered in this context before. Initiation of gene conversion events in promoter regions and the presence of a gene conversion gradient most likely explain the observed decrease of GC-ending codons with gene length and gene position.
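The quantity at the centre of this abstract, the frequency of GC-ending codons and how it changes along a gene, can be computed with a short Python sketch. This is an illustration only, not the study's method: the function names (`codons`, `gc_ending_fraction`, `gc_ending_by_segment`) are hypothetical, the input is assumed to be an in-frame coding sequence, and splitting the gene into equal-sized segments is one simple assumed way to look at position along the gene.

```python
def codons(cds):
    """Split an in-frame coding sequence into codons, dropping any trailing partial codon."""
    cds = cds.upper()
    usable = len(cds) - len(cds) % 3
    return [cds[i:i + 3] for i in range(0, usable, 3)]


def fraction_gc_ending(codon_list):
    """Fraction of codons whose third base is G or C (0.0 for an empty list)."""
    if not codon_list:
        return 0.0
    return sum(codon[2] in "GC" for codon in codon_list) / len(codon_list)


def gc_ending_fraction(cds):
    """Fraction of GC-ending codons over the whole coding sequence."""
    return fraction_gc_ending(codons(cds))


def gc_ending_by_segment(cds, n_segments):
    """GC-ending codon fraction in each of n_segments equal-sized blocks along the gene.

    Codons left over after the last full block are ignored.
    """
    codon_list = codons(cds)
    size = len(codon_list) // n_segments
    if size == 0:
        raise ValueError("sequence has fewer codons than requested segments")
    return [fraction_gc_ending(codon_list[k * size:(k + 1) * size])
            for k in range(n_segments)]
```

Applied to a set of genes, `gc_ending_fraction` gives the per-gene value that could be compared against gene length, while `gc_ending_by_segment` shows whether GC-ending codons become rarer from the 5' to the 3' end of a gene.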
Affiliation(s)
- Nina Stoletzki
- Ludwig-Maximilians-Universität, Biocenter, Grosshadernerstr. 2, D-82152 Planegg-Martinsried, Germany.