1
Chakarborty S, Irshad IU, Mahima, Sharma AK. TIR predictor and optimizer: Web-tools for accurate prediction of translation initiation rate and precision gene design in Saccharomyces cerevisiae. Biotechnol J 2024; 19:e2400081. [PMID: 38719586 DOI: 10.1002/biot.202400081] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 02/10/2024] [Revised: 04/15/2024] [Accepted: 04/16/2024] [Indexed: 05/14/2024]
Abstract
Translation initiation is the primary determinant of the rate of protein production. The variation in the rate with which this step occurs can cause up to three orders of magnitude differences in cellular protein levels. Several mRNA features, including mRNA stability in proximity to the start codon, coding sequence length, and presence of specific motifs in the mRNA molecule, have been shown to influence the translation initiation rate. These molecular factors acting at different strengths allow precise control of in vivo translation initiation rate and thus the rate of protein synthesis. However, despite the paramount importance of translation initiation rate in protein synthesis, accurate prediction of the absolute values of initiation rate remains a challenge. In fact, as of now, there is no available model for predicting the initiation rate in Saccharomyces cerevisiae. To address this, we train a machine learning model for predicting the in vivo initiation rate in S. cerevisiae transcripts. The model is trained using a diverse set of mRNA transcripts, enabling the comparison of initiation rates across different transcripts. Our model exhibited excellent accuracy in predicting the translation initiation rate and demonstrated its effectiveness with both endogenous and exogenous transcripts. Then, by combining the machine learning model with the Monte-Carlo search algorithm, we have also devised a method to optimize the nucleotide sequence of any gene to achieve a specific target initiation rate. The machine learning model we've developed for predicting translation initiation rates, along with the gene optimization method, are deployed as a web server. Both web servers are accessible for free at the following link: ajeetsharmalab.com/TIRPredictor. Thus, this research advances our fundamental understanding of translation initiation processes, with direct applications in biotechnology.
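The abstract describes coupling a trained predictor with a Monte-Carlo search to reach a target initiation rate. A minimal sketch of that idea follows, under loud assumptions: `toy_predictor` is a hypothetical stand-in (a simple GC-content score), not the authors' model; the synonymous-codon table is a small subset of the standard genetic code; and the Metropolis acceptance rule, temperature, and step count are illustrative choices, not taken from the paper.

```python
import math
import random

# Subset of the standard genetic code, enough for the illustration below.
SYNONYMS = {
    "M": ["ATG"],
    "A": ["GCT", "GCC", "GCA", "GCG"],
    "G": ["GGT", "GGC", "GGA", "GGG"],
    "K": ["AAA", "AAG"],
    "L": ["TTA", "TTG", "CTT", "CTC", "CTA", "CTG"],
    "S": ["TCT", "TCC", "TCA", "TCG", "AGT", "AGC"],
}
CODON_TO_AA = {codon: aa for aa, codons in SYNONYMS.items() for codon in codons}


def toy_predictor(seq):
    """Hypothetical stand-in for a trained model: score drops as GC content rises."""
    gc = sum(base in "GC" for base in seq) / len(seq)
    return 1.0 - gc


def translate(seq):
    """Translate a coding sequence using the small codon table above."""
    return "".join(CODON_TO_AA[seq[i:i + 3]] for i in range(0, len(seq), 3))


def optimize(seq, target, predictor=toy_predictor, steps=2000,
             temperature=0.01, seed=0):
    """Metropolis search over synonymous codons toward a target predicted rate."""
    rng = random.Random(seed)
    codons = [seq[i:i + 3] for i in range(0, len(seq), 3)]
    current_err = abs(predictor("".join(codons)) - target)
    best, best_err = list(codons), current_err
    for _ in range(steps):
        pos = rng.randrange(len(codons))
        old = codons[pos]
        # Propose a synonymous codon, so the encoded protein never changes.
        codons[pos] = rng.choice(SYNONYMS[CODON_TO_AA[old]])
        new_err = abs(predictor("".join(codons)) - target)
        accept = new_err <= current_err or (
            rng.random() < math.exp((current_err - new_err) / temperature)
        )
        if accept:
            current_err = new_err
            if new_err < best_err:
                best, best_err = list(codons), new_err
        else:
            codons[pos] = old  # reject the proposal and restore the old codon
    return "".join(best), best_err


start = "ATGGCGGGGAAGCTGTCG"
optimized, error = optimize(start, target=0.6)
print(optimized, round(error, 4))
```

Restricting proposals to synonymous codons keeps the protein sequence fixed while the predicted rate moves; a real predictor would simply replace `toy_predictor` in the `predictor` argument.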
Affiliation(s)
- Mahima
- Department of Physics, Indian Institute of Technology Jammu, Jammu, India
- Ajeet K Sharma
- Department of Physics, Indian Institute of Technology Jammu, Jammu, India
- Department of Biosciences and Bioengineering, Indian Institute of Technology Jammu, Jammu, India
2
Farookhi H, Xia X. Differential Selection for Translation Efficiency Shapes Translation Machineries in Bacterial Species. Microorganisms 2024; 12:768. [PMID: 38674712 PMCID: PMC11052298 DOI: 10.3390/microorganisms12040768] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 03/05/2024] [Revised: 04/01/2024] [Accepted: 04/09/2024] [Indexed: 04/28/2024] Open
Abstract
Different bacterial species have dramatically different generation times, from 20-30 min in Escherichia coli to about two weeks in Mycobacterium leprae. The translation machinery in a cell needs to synthesize all proteins for a new cell in each generation. The three subprocesses of translation, i.e., initiation, elongation, and termination, are expected to be under stronger selection pressure to optimize in short-generation bacteria (SGB) such as Vibrio natriegens than in the long-generation Mycobacterium leprae. The initiation efficiency depends on the start codon decoded by the initiation tRNA, the optimal Shine-Dalgarno (SD) decoded by the anti-SD (aSD) sequence on small subunit rRNA, and the secondary structure that may embed the initiation signals and prevent them from being decoded. The elongation efficiency depends on the tRNA pool and codon usage. The termination efficiency in bacteria depends mainly on the nature of the stop codon and the nucleotide immediately downstream of the stop codon. By contrasting SGB with long-generation bacteria (LGB), we predict (1) SGB to have more ribosome RNA operons to produce ribosomes, and more tRNA genes for carrying amino acids to ribosomes, (2) SGB to have a higher percentage of genes using AUG as the start codon and UAA as the stop codon than LGB, (3) SGB to exhibit better codon and anticodon adaptation than LGB, and (4) SGB to have a weaker secondary structure near the translation initiation signals than LGB. These differences between SGB and LGB should be more pronounced in highly expressed genes than the rest of the genes. We present empirical evidence in support of these predictions.
Affiliation(s)
- Heba Farookhi
- Department of Biology, University of Ottawa, Ottawa, ON K1N 6N5, Canada
- Xuhua Xia
- Department of Biology, University of Ottawa, Ottawa, ON K1N 6N5, Canada
- Ottawa Institute of Systems Biology, University of Ottawa, Ottawa, ON K1H 8M5, Canada
3
Zeng J, Song K, Wang J, Wen H, Zhou J, Ni T, Lu H, Yu Y. Characterization and optimization of 5´ untranslated region containing poly-adenine tracts in Kluyveromyces marxianus using machine-learning model. Microb Cell Fact 2024; 23:7. [PMID: 38172836 PMCID: PMC10763412 DOI: 10.1186/s12934-023-02271-3] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 10/15/2023] [Accepted: 12/12/2023] [Indexed: 01/05/2024] Open
Abstract
BACKGROUND The 5´ untranslated region (5´ UTR) plays a key role in regulating translation efficiency and mRNA stability, making it a favored target in genetic engineering and synthetic biology. A common feature found in the 5´ UTR is the poly-adenine (poly(A)) tract. However, the effect of 5´ UTR poly(A) on protein production remains controversial. Machine-learning models are powerful tools for explaining the complex contributions of features, but models incorporating features of 5´ UTR poly(A) are currently lacking. Thus, our goal is to construct such a model, using natural 5´ UTRs from Kluyveromyces marxianus, a promising cell factory for producing heterologous proteins. RESULTS We constructed a mini-library consisting of 207 5´ UTRs harboring poly(A) and 34 5´ UTRs without poly(A) from K. marxianus. The effects of each 5´ UTR on the production of a GFP reporter were evaluated individually in vivo, and the resulting protein abundance spanned an approximately 450-fold range throughout. The data were used to train a multi-layer perceptron neural network (MLP-NN) model that incorporated the length and position of poly(A) as features. The model exhibited good performance in predicting protein abundance (average R² = 0.7290). The model suggests that the length of poly(A) is negatively correlated with protein production, whereas poly(A) located between 10 and 30 nt upstream of the start codon (AUG) exhibits a weak positive effect on protein abundance. Using the model as guidance, the deletion or reduction of poly(A) upstream of 30 nt preceding AUG tended to improve the production of GFP and a feruloyl esterase. Deletions of poly(A) showed inconsistent effects on mRNA levels, suggesting that poly(A) represses protein production either with or without reducing mRNA levels. CONCLUSION The effects of poly(A) on protein production depend on its length and position. Integrating poly(A) features into machine-learning models improves simulation accuracy. Deleting or reducing poly(A) upstream of 30 nt preceding AUG tends to enhance protein production. This optimization strategy can be applied to enhance the yield of K. marxianus and other microbial cell factories.
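The abstract names the length and position of poly(A) as model features. The sketch below shows one way such features could be extracted from a 5´ UTR sequence. The minimum tract length of 5 adenines and the definition of position (nucleotides between the tract's 3´ end and the start codon) are assumptions made for illustration; the abstract does not state the authors' exact definitions, and `poly_a_features` is a hypothetical helper name.

```python
import re


def poly_a_features(utr, min_len=5):
    """Return (tract_length, distance_to_start_codon) for each poly(A) tract.

    `utr` is the 5' UTR sequence ending immediately before the start codon.
    `min_len` is an assumed threshold for calling a run of A's a tract.
    """
    seq = utr.upper()
    features = []
    for match in re.finditer("A{%d,}" % min_len, seq):
        length = match.end() - match.start()
        # Number of nucleotides between the tract's 3' end and the AUG.
        distance = len(seq) - match.end()
        features.append((length, distance))
    return features


example_utr = "GG" + "A" * 7 + "CCGT" + "A" * 5 + "CC"
for length, distance in poly_a_features(example_utr):
    print(f"tract of {length} A's, {distance} nt upstream of the start codon")
```

Tuples of this kind could then be summarized (for example, longest tract, or presence of a tract within a given window upstream of AUG) before being passed to a regression model.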
Affiliation(s)
- Junyuan Zeng
- State Key Laboratory of Genetic Engineering, School of Life Sciences, Fudan University, Shanghai, China
- Shanghai Engineering Research Center of Industrial Microorganisms, Shanghai, 200438, China
- Kunfeng Song
- State Key Laboratory of Genetic Engineering, School of Life Sciences, Fudan University, Shanghai, China
- Shanghai Engineering Research Center of Industrial Microorganisms, Shanghai, 200438, China
- Jingqi Wang
- State Key Laboratory of Genetic Engineering, School of Life Sciences, Fudan University, Shanghai, China
- Shanghai Engineering Research Center of Industrial Microorganisms, Shanghai, 200438, China
- Haimei Wen
- State Key Laboratory of Genetic Engineering, School of Life Sciences, Fudan University, Shanghai, China
- Shanghai Engineering Research Center of Industrial Microorganisms, Shanghai, 200438, China
- Jungang Zhou
- State Key Laboratory of Genetic Engineering, School of Life Sciences, Fudan University, Shanghai, China
- Shanghai Engineering Research Center of Industrial Microorganisms, Shanghai, 200438, China
- Ting Ni
- State Key Laboratory of Genetic Engineering, School of Life Sciences, Fudan University, Shanghai, China
- Shanghai Engineering Research Center of Industrial Microorganisms, Shanghai, 200438, China
- Hong Lu
- State Key Laboratory of Genetic Engineering, School of Life Sciences, Fudan University, Shanghai, China
- Shanghai Engineering Research Center of Industrial Microorganisms, Shanghai, 200438, China
- Yao Yu
- State Key Laboratory of Genetic Engineering, School of Life Sciences, Fudan University, Shanghai, China
- Shanghai Engineering Research Center of Industrial Microorganisms, Shanghai, 200438, China
4
de Rozières CM, Pequeno A, Shahabi S, Lucas TM, Godula K, Ghosh G, Joseph S. PABP1 Drives the Selective Translation of Influenza A Virus mRNA. J Mol Biol 2022; 434:167460. [PMID: 35074482 PMCID: PMC8897273 DOI: 10.1016/j.jmb.2022.167460] [Citation(s) in RCA: 2] [Impact Index Per Article: 1.0] [Received: 09/20/2021] [Revised: 12/22/2021] [Accepted: 01/13/2022] [Indexed: 11/26/2022]
Abstract
Influenza A virus (IAV) is a human-infecting pathogen with a history of causing seasonal epidemics and on several occasions worldwide pandemics. Infection by IAV causes a dramatic decrease in host mRNA translation, whereas viral mRNAs are efficiently translated. The IAV mRNAs have a highly conserved 5'-untranslated region (5'UTR) that is rich in adenosine residues. We show that the human polyadenylate binding protein 1 (PABP1) binds to the 5'UTR of the viral mRNAs. The interaction of PABP1 with the viral 5'UTR makes the translation of viral mRNAs more resistant to canonical cap-dependent translation inhibition than model mRNAs. Additionally, PABP1 bound to the viral 5'UTR can recruit eIF4G in an eIF4E-independent manner. These results indicate that PABP1 bound to the viral 5'UTR may promote eIF4E-independent translation initiation.
Affiliation(s)
- Cyrus M de Rozières
- Department of Chemistry and Biochemistry, University of California, San Diego, La Jolla, CA 92093-0314, USA
- Alberto Pequeno
- Department of Chemistry and Biochemistry, University of California, San Diego, La Jolla, CA 92093-0314, USA
- Shandy Shahabi
- Department of Chemistry and Biochemistry, University of California, San Diego, La Jolla, CA 92093-0314, USA
- Taryn M Lucas
- Department of Chemistry and Biochemistry, University of California, San Diego, La Jolla, CA 92093-0314, USA
- Kamil Godula
- Department of Chemistry and Biochemistry, University of California, San Diego, La Jolla, CA 92093-0314, USA
- Gourisankar Ghosh
- Department of Chemistry and Biochemistry, University of California, San Diego, La Jolla, CA 92093-0314, USA
- Simpson Joseph
- Department of Chemistry and Biochemistry, University of California, San Diego, La Jolla, CA 92093-0314, USA
5
Rudzińska I, Płonka M, Armatowska A, Turowski TW, Boguta M. Rbs1 protein, involved in RNA polymerase III complex assembly in the yeast Saccharomyces cerevisiae, induces a Gcn4 response and forms aggregates when overproduced. Gene 2022; 809:146034. [PMID: 34688816 DOI: 10.1016/j.gene.2021.146034] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 07/13/2021] [Revised: 08/26/2021] [Accepted: 10/19/2021] [Indexed: 11/24/2022]
Abstract
We previously reported the function of Rbs1 protein in RNA polymerase III complex assembly via interactions with both proteins and mRNAs. Rbs1 is a poly(A)-binding protein. The R3H domain in Rbs1 is required for mRNA interactions. The present study utilized the results of a genome-wide analysis of RNA binding by Rbs1 to show a direct interaction between Rbs1 and the 5'-untranslated region (5'-UTR) in PCL5 mRNA. By examining Pcl5 protein levels, we found that Rbs1 overproduction inhibited the translation of PCL5 mRNA. Pcl5 is a cyclin that is associated with Pho85 kinase, which is involved in the degradation of Gcn4 transcription factor. Consequently, lower levels of Pcl5 that resulted from Rbs1 overproduction increased the Gcn4 response. The functional R3H domain in Rbs1 was required for the downregulation of Pcl5 translation and increase in the Gcn4 response, thus validating a regulatory mechanism that relies on the interaction between Rbs1 and the 5'-UTR in PCL5 mRNA. Rbs1 protein was further characterized by microscopy, which identified single Rbs1 assemblies in part of the cell population. The presence of Rbs1 aggregates was confirmed by the fractionation of cellular extracts. Altogether, our results suggest a more general role of Rbs1 in regulating cellular metabolism beyond the assembly of RNA polymerase III.
Affiliation(s)
- Izabela Rudzińska
- Laboratory of tRNA Transcription, Institute of Biochemistry and Biophysics, Polish Academy of Sciences, Pawińskiego 5A, 02-106 Warsaw, Poland
- Marta Płonka
- Laboratory of tRNA Transcription, Institute of Biochemistry and Biophysics, Polish Academy of Sciences, Pawińskiego 5A, 02-106 Warsaw, Poland
- Alicja Armatowska
- Laboratory of tRNA Transcription, Institute of Biochemistry and Biophysics, Polish Academy of Sciences, Pawińskiego 5A, 02-106 Warsaw, Poland
- Tomasz W Turowski
- Laboratory of Transcription Mechanisms, Institute of Biochemistry and Biophysics, Polish Academy of Sciences, Pawińskiego 5A, 02-106 Warsaw, Poland
- Magdalena Boguta
- Laboratory of tRNA Transcription, Institute of Biochemistry and Biophysics, Polish Academy of Sciences, Pawińskiego 5A, 02-106 Warsaw, Poland
6
Detailed Dissection and Critical Evaluation of the Pfizer/BioNTech and Moderna mRNA Vaccines. Vaccines (Basel) 2021; 9:vaccines9070734. [PMID: 34358150 PMCID: PMC8310186 DOI: 10.3390/vaccines9070734] [Citation(s) in RCA: 71] [Impact Index Per Article: 23.7] [Received: 06/13/2021] [Revised: 06/25/2021] [Accepted: 06/30/2021] [Indexed: 01/19/2023] Open
Abstract
The design of Pfizer/BioNTech and Moderna mRNA vaccines involves many different types of optimizations. Proper optimization of vaccine mRNA can reduce dosage required for each injection leading to more efficient immunization programs. The mRNA components of the vaccine need to have a 5′-UTR to load ribosomes efficiently onto the mRNA for translation initiation, optimized codon usage for efficient translation elongation, and optimal stop codon for efficient translation termination. Both 5′-UTR and the downstream 3′-UTR should be optimized for mRNA stability. The replacement of uridine by N1-methylpseudouridine (Ψ) complicates some of these optimization processes because Ψ is more versatile in wobbling than U. Different optimizations can conflict with each other, and compromises would need to be made. I highlight the similarities and differences between Pfizer/BioNTech and Moderna mRNA vaccines and discuss the advantage and disadvantage of each to facilitate future vaccine improvement. In particular, I point out a few optimizations in the design of the two mRNA vaccines that have not been performed properly.
7
De Nijs Y, De Maeseneire SL, Soetaert WK. 5' untranslated regions: the next regulatory sequence in yeast synthetic biology. Biol Rev Camb Philos Soc 2019; 95:517-529. [PMID: 31863552 DOI: 10.1111/brv.12575] [Citation(s) in RCA: 8] [Impact Index Per Article: 1.6] [Received: 07/18/2019] [Revised: 11/08/2019] [Accepted: 11/28/2019] [Indexed: 01/10/2023]
Abstract
When developing industrial biotechnology processes, Saccharomyces cerevisiae (baker's yeast or brewer's yeast) is a popular choice as a microbial host. Many tools have been developed in the fields of synthetic biology and metabolic engineering to introduce heterologous pathways and tune their expression in yeast. Such tools mainly focus on controlling transcription, whereas post-transcriptional regulation is often overlooked. Herein we discuss regulatory elements found in the 5' untranslated region (UTR) and their influence on protein synthesis. We provide not only an overall picture, but also a set of design rules on how to engineer a 5' UTR. The reader is also referred to currently available models that allow gene expression to be tuned predictably using different 5' UTRs.
Affiliation(s)
- Yatti De Nijs
- Faculty of Bioscience Engineering, Centre for Industrial Biotechnology and Biocatalysis (InBio.be), Department Biotechnology, Ghent University, Coupure Links 653, 9000, Ghent, Belgium
- Sofie L De Maeseneire
- Faculty of Bioscience Engineering, Centre for Industrial Biotechnology and Biocatalysis (InBio.be), Department Biotechnology, Ghent University, Coupure Links 653, 9000, Ghent, Belgium
- Wim K Soetaert
- Faculty of Bioscience Engineering, Centre for Industrial Biotechnology and Biocatalysis (InBio.be), Department Biotechnology, Ghent University, Coupure Links 653, 9000, Ghent, Belgium
8
Vopálenský V, Sýkora M, Mašek T, Pospíšek M. Messenger RNAs of Yeast Virus-Like Elements Contain Non-templated 5' Poly(A) Leaders, and Their Expression Is Independent of eIF4E and Pab1. Front Microbiol 2019; 10:2366. [PMID: 31736885 PMCID: PMC6831550 DOI: 10.3389/fmicb.2019.02366] [Citation(s) in RCA: 5] [Impact Index Per Article: 1.0] [Received: 04/26/2019] [Accepted: 09/30/2019] [Indexed: 02/01/2023] Open
Abstract
We employed virus-like elements (VLEs) pGKL1,2 from Kluyveromyces lactis as a model to investigate the previously neglected transcriptome of the broader group of yeast cytoplasmic linear dsDNA VLEs. We performed 5′ and 3′ RACE analyses of all pGKL1,2 mRNAs and found them not 3′ polyadenylated and containing frequently uncapped 5′ poly(A) leaders that are not complementary to VLE genomic DNA. The degree of 5′ capping and/or 5′ mRNA polyadenylation is specific to each gene and is controlled by the corresponding promoter region. The expression of pGKL1,2 transcripts is independent of eIF4E and Pab1 and is enhanced in lsm1Δ and pab1Δ strains. We suggest a model of primitive pGKL1,2 gene expression regulation in which the degree of 5′ mRNA capping and 5′ non-template polyadenylation, together with the presence of negative regulators such as Pab1 and Lsm1, play important roles. Our data also support a hypothesis of a close relationship between yeast linear VLEs and poxviruses.
Affiliation(s)
- Václav Vopálenský
- Laboratory of RNA Biochemistry, Department of Genetics and Microbiology, Faculty of Science, Charles University, Prague, Czechia
- Michal Sýkora
- Laboratory of RNA Biochemistry, Department of Genetics and Microbiology, Faculty of Science, Charles University, Prague, Czechia
- Tomáš Mašek
- Laboratory of RNA Biochemistry, Department of Genetics and Microbiology, Faculty of Science, Charles University, Prague, Czechia
- Martin Pospíšek
- Laboratory of RNA Biochemistry, Department of Genetics and Microbiology, Faculty of Science, Charles University, Prague, Czechia
9
Rollins MG, Jha S, Bartom ET, Walsh D. RACK1 evolved species-specific multifunctionality in translational control through sequence plasticity within a loop domain. J Cell Sci 2019; 132:jcs.228908. [PMID: 31118235 DOI: 10.1242/jcs.228908] [Citation(s) in RCA: 7] [Impact Index Per Article: 1.4] [Received: 12/13/2018] [Accepted: 05/14/2019] [Indexed: 01/23/2023] Open
Abstract
Receptor of activated protein C kinase 1 (RACK1) is a highly conserved eukaryotic protein that regulates several aspects of mRNA translation; yet, how it does so, remains poorly understood. Here we show that, although RACK1 consists largely of conserved β-propeller domains that mediate binding to several other proteins, a short interconnecting loop between two of these blades varies across species to control distinct RACK1 functions during translation. Mutants and chimeras revealed that the amino acid composition of the loop is optimized to regulate interactions with eIF6, a eukaryotic initiation factor that controls 60S biogenesis and 80S ribosome assembly. Separately, phylogenetics revealed that, despite broad sequence divergence of the loop, there is striking conservation of negatively charged residues amongst protists and dicot plants, which is reintroduced to mammalian RACK1 by poxviruses through phosphorylation. Although both charged and uncharged loop mutants affect eIF6 interactions, only a negatively charged plant - but not uncharged yeast or human loop - enhances translation of mRNAs with adenosine-rich 5' untranslated regions (UTRs). Our findings reveal how sequence plasticity within the RACK1 loop confers multifunctionality in translational control across species.
Affiliation(s)
- Madeline G Rollins
- Department of Microbiology-Immunology, Feinberg School of Medicine, Northwestern University, Chicago, IL 60611, USA
- Sujata Jha
- Department of Microbiology-Immunology, Feinberg School of Medicine, Northwestern University, Chicago, IL 60611, USA
- Elizabeth T Bartom
- Department of Biochemistry and Molecular Genetics, Feinberg School of Medicine, Northwestern University, Chicago, IL 60611, USA
- Derek Walsh
- Department of Microbiology-Immunology, Feinberg School of Medicine, Northwestern University, Chicago, IL 60611, USA
10
Xia X. Translation Control of HAC1 by Regulation of Splicing in Saccharomyces cerevisiae. Int J Mol Sci 2019; 20:ijms20122860. [PMID: 31212749 PMCID: PMC6627864 DOI: 10.3390/ijms20122860] [Citation(s) in RCA: 13] [Impact Index Per Article: 2.6] [Received: 04/22/2019] [Revised: 05/30/2019] [Accepted: 06/10/2019] [Indexed: 12/19/2022] Open
Abstract
Hac1p is a key transcription factor regulating the unfolded protein response (UPR) induced by abnormal accumulation of unfolded/misfolded proteins in the endoplasmic reticulum (ER) in Saccharomyces cerevisiae. The accumulation of unfolded/misfolded proteins is sensed by protein Ire1p, which then undergoes trans-autophosphorylation and oligomerization into discrete foci on the ER membrane. HAC1 pre-mRNA, which is exported to the cytoplasm but is blocked from translation by its intron sequence looping back to its 5’UTR to form base-pair interaction, is transported to the Ire1p foci to be spliced, guided by a cis-acting bipartite element at its 3’UTR (3’BE). Spliced HAC1 mRNA can be efficiently translated. The resulting Hac1p enters the nucleus and activates, together with coactivators, a large number of genes encoding proteins such as protein chaperones to restore and maintain ER homeostasis and secretory protein quality control. This review details the translation regulation of Hac1p production, mediated by the nonconventional splicing, in the broad context of translation control and summarizes the evolution and diversification of the UPR signaling pathway among fungal, metazoan and plant lineages.
Affiliation(s)
- Xuhua Xia
- Department of Biology, University of Ottawa, Marie-Curie Private, Ottawa, ON K1N 9A7, Canada
11
Sýkora M, Pospíšek M, Novák J, Mrvová S, Krásný L, Vopálenský V. Transcription apparatus of the yeast virus-like elements: Architecture, function, and evolutionary origin. PLoS Pathog 2018; 14:e1007377. [PMID: 30346988 PMCID: PMC6211774 DOI: 10.1371/journal.ppat.1007377] [Citation(s) in RCA: 8] [Impact Index Per Article: 1.3] [Received: 05/21/2018] [Revised: 11/01/2018] [Accepted: 10/03/2018] [Indexed: 11/19/2022] Open
Abstract
Extrachromosomal hereditary elements such as organelles, viruses, and plasmids are important for the cell fitness and survival. Their transcription is dependent on host cellular RNA polymerase (RNAP) or intrinsic RNAP encoded by these elements. The yeast Kluyveromyces lactis contains linear cytoplasmic DNA virus-like elements (VLEs, also known as linear plasmids) that bear genes encoding putative non-canonical two-subunit RNAP. Here, we describe the architecture and identify the evolutionary origin of this transcription machinery. We show that the two RNAP subunits interact in vivo, and this complex interacts with another two VLE-encoded proteins, namely the mRNA capping enzyme and a putative helicase. RNAP, mRNA capping enzyme and the helicase also interact with VLE-specific DNA in vivo. Further, we identify a promoter sequence element that causes 5' mRNA polyadenylation of VLE-specific transcripts via RNAP slippage at the transcription initiation site, and structural elements that precede the termination sites. As a result, we present a first model of the yeast virus-like element transcription initiation and intrinsic termination. Finally, we demonstrate that VLE RNAP and its promoters display high similarity to poxviral RNAP and promoters of early poxviral genes, respectively, thereby pointing to their evolutionary origin.
Affiliation(s)
- Michal Sýkora
- Department of Genetics and Microbiology, Faculty of Science, Charles University, Prague, Czech Republic
- Martin Pospíšek
- Department of Genetics and Microbiology, Faculty of Science, Charles University, Prague, Czech Republic
- Josef Novák
- Department of Genetics and Microbiology, Faculty of Science, Charles University, Prague, Czech Republic
- Silvia Mrvová
- Department of Genetics and Microbiology, Faculty of Science, Charles University, Prague, Czech Republic
- Libor Krásný
- Institute of Microbiology, Academy of Sciences of the Czech Republic, Prague, Czech Republic
- Václav Vopálenský
- Department of Genetics and Microbiology, Faculty of Science, Charles University, Prague, Czech Republic
12
Brambilla M, Martani F, Bertacchi S, Vitangeli I, Branduardi P. The Saccharomyces cerevisiae poly (A) binding protein (Pab1): Master regulator of mRNA metabolism and cell physiology. Yeast 2018; 36:23-34. [DOI: 10.1002/yea.3347] [Citation(s) in RCA: 9] [Impact Index Per Article: 1.5] [Received: 12/22/2017] [Revised: 06/26/2018] [Accepted: 07/06/2018] [Indexed: 12/15/2022] Open
Affiliation(s)
- Marco Brambilla
- Department of Biotechnology and Biosciences; University of Milano-Bicocca; Piazza della Scienza 2 20126 Milan Italy
- Francesca Martani
- Department of Biotechnology and Biosciences; University of Milano-Bicocca; Piazza della Scienza 2 20126 Milan Italy
- Stefano Bertacchi
- Department of Biotechnology and Biosciences; University of Milano-Bicocca; Piazza della Scienza 2 20126 Milan Italy
- Ilaria Vitangeli
- Department of Biotechnology and Biosciences; University of Milano-Bicocca; Piazza della Scienza 2 20126 Milan Italy
- Paola Branduardi
- Department of Biotechnology and Biosciences; University of Milano-Bicocca; Piazza della Scienza 2 20126 Milan Italy
13
Abstract
Codon usage depends on mutation bias, tRNA-mediated selection, and the need for high efficiency and accuracy in translation. One codon in a synonymous codon family is often strongly over-used, especially in highly expressed genes, which often leads to a high dN/dS ratio because dS is very small. Many different codon usage indices have been proposed to measure codon usage and codon adaptation. Sense codons can be misread by release factors and stop codons by tRNAs, which also contributes to codon usage in rare cases. This chapter outlines the conceptual framework on codon evolution, illustrates codon-specific and gene-specific codon usage indices, and presents their applications. A new index for codon adaptation that accounts for background mutation bias (Index of Translation Elongation) is presented and contrasted with codon adaptation index (CAI) which does not consider background mutation bias. They are used to re-analyze data from a recent paper claiming that translation elongation efficiency matters little in protein production. The reanalysis disproves the claim.
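To make the codon adaptation index (CAI) mentioned above concrete, here is a minimal sketch of its commonly used form: each codon's relative adaptiveness is its count in a reference gene set divided by the count of the most-used synonymous codon, and a gene's CAI is the geometric mean of those values. The three-family codon table, the toy reference sequences, and the 0.5 pseudocount for unobserved codons are assumptions chosen for illustration. The Index of Translation Elongation is not reproduced here because the abstract does not give its formula.

```python
import math
from collections import Counter

# Illustrative subset of synonymous codon families (standard genetic code).
FAMILIES = {
    "K": ["AAA", "AAG"],
    "E": ["GAA", "GAG"],
    "F": ["TTT", "TTC"],
}


def split_codons(gene):
    """Split an in-frame coding sequence into complete codons."""
    return [gene[i:i + 3] for i in range(0, len(gene) - 2, 3)]


def relative_adaptiveness(reference_genes):
    """w(codon) = count / count of the most-used synonymous codon."""
    counts = Counter()
    for gene in reference_genes:
        counts.update(split_codons(gene))
    weights = {}
    for codons in FAMILIES.values():
        # Assumed pseudocount of 0.5 so unobserved codons do not give w = 0.
        adjusted = {codon: counts[codon] or 0.5 for codon in codons}
        top = max(adjusted.values())
        for codon, value in adjusted.items():
            weights[codon] = value / top
    return weights


def cai(gene, weights):
    """Geometric mean of w over the codons of `gene` covered by `weights`."""
    logs = [math.log(weights[c]) for c in split_codons(gene) if c in weights]
    return math.exp(sum(logs) / len(logs)) if logs else float("nan")


reference = ["AAGAAGAAGAAA", "GAAGAAGAG"]  # toy "highly expressed" set
w = relative_adaptiveness(reference)
print(cai("AAGGAA", w), cai("AAAGAG", w))
```

Because the weights come only from the reference set, this form of CAI ignores background mutation bias, which is the limitation the abstract contrasts with its proposed index.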
14
ARSDA: A New Approach for Storing, Transmitting and Analyzing Transcriptomic Data. G3-GENES GENOMES GENETICS 2017; 7:3839-3848. [PMID: 29079682 PMCID: PMC5714481 DOI: 10.1534/g3.117.300271] [Citation(s) in RCA: 16] [Impact Index Per Article: 2.3] [Indexed: 11/18/2022]
Abstract
Two major stumbling blocks exist in high-throughput sequencing (HTS) data analysis. The first is the sheer file size, typically in gigabytes when uncompressed, causing problems in storage, transmission, and analysis. However, these files do not need to be so large, and can be reduced without loss of information. Each HTS file, either in compressed .SRA or plain text .fastq format, contains numerous identical reads stored as separate entries. For example, among 44,603,541 forward reads in the SRR4011234.sra file (from a Bacillus subtilis transcriptomic study) deposited at NCBI’s SRA database, one read has 497,027 identical copies. Instead of storing them as separate entries, one can and should store them as a single entry with the SeqID_NumCopy format (which I dub as FASTA+ format). The second is the proper allocation of reads that map equally well to paralogous genes. I illustrate in detail a new method for such allocation. I have developed ARSDA software that implements these new approaches. A number of HTS files for model species are in the process of being processed and deposited at http://coevol.rdc.uottawa.ca to demonstrate that this approach not only saves a huge amount of storage space and transmission bandwidth, but also dramatically reduces time in downstream data analysis. Instead of matching the 497,027 identical reads separately against the B. subtilis genome, one only needs to match it once. ARSDA includes functions to take advantage of HTS data in the new sequence format for downstream data analysis such as gene expression characterization. I contrasted gene expression results between ARSDA and Cufflinks so readers can better appreciate the strength of ARSDA. ARSDA is freely available for Windows, Linux, and Macintosh computers at http://dambe.bio.uottawa.ca/ARSDA/ARSDA.aspx.
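The read-collapsing idea described above (one entry per unique read, with the copy number carried in the sequence ID) can be sketched in a few lines. This is not ARSDA's code: the `Seq<index>_<copies>` header layout and the function names are hypothetical, chosen only to mirror the "SeqID_NumCopy" description in the abstract.

```python
from collections import Counter


def collapse_reads(reads):
    """Collapse identical reads into (header, sequence) records.

    The header encodes a running index and the copy number, most abundant
    read first. The exact ID layout is an assumption for illustration.
    """
    counts = Counter(reads)
    return [
        (f"Seq{index}_{copies}", seq)
        for index, (seq, copies) in enumerate(counts.most_common(), start=1)
    ]


def to_fasta_plus(records):
    """Serialize collapsed records as FASTA-style text."""
    return "".join(f">{header}\n{seq}\n" for header, seq in records)


def total_reads(records):
    """Recover the original read count from the copy numbers in the headers."""
    return sum(int(header.rsplit("_", 1)[1]) for header, _ in records)


reads = ["ACGT", "ACGT", "TTTT", "ACGT", "GGGG", "TTTT"]
records = collapse_reads(reads)
print(to_fasta_plus(records), end="")
print("reads represented:", total_reads(records))
```

A downstream mapper then needs to align each unique sequence only once and can weight the hit by the copy number parsed from the header, which is where the time saving described in the abstract comes from. Per-read quality scores are not kept in this sketch.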
|
15
|
The 5'-poly(A) leader of poxvirus mRNA confers a translational advantage that can be achieved in cells with impaired cap-dependent translation. PLoS Pathog 2017; 13:e1006602. [PMID: 28854224 PMCID: PMC5595341 DOI: 10.1371/journal.ppat.1006602] [Citation(s) in RCA: 34] [Impact Index Per Article: 4.9] [Reference Citation Analysis] [Abstract] [Download PDF] [Figures] [Journal Information] [Subscribe] [Scholar Register] [Received: 05/11/2017] [Revised: 09/12/2017] [Accepted: 08/23/2017] [Indexed: 01/11/2023] Open
Abstract
The poly(A) leader at the 5'-untranslated region (5'-UTR) is an unusually striking feature of all poxvirus mRNAs transcribed after viral DNA replication (post-replicative mRNAs). These poly(A) leaders are non-templated and of heterogeneous lengths; and their function during poxvirus infection remains a long-standing question. Here, we discovered that a 5'-poly(A) leader conferred a selective translational advantage to mRNA in poxvirus-infected cells. A constitutive and uninterrupted 5'-poly(A) leader with 12 residues was optimal. Because the most frequent lengths of the 5'-poly(A) leaders are 8-12 residues, the result suggests that the poly(A) leader has been evolutionarily optimized to boost poxvirus protein production. A 5'-poly(A) leader also could increase protein production in the bacteriophage T7 promoter-based expression system of vaccinia virus, the prototypic member of poxviruses. Interestingly, although vaccinia virus post-replicative mRNAs do have 5'-methylated guanosine caps and can use cap-dependent translation, in vaccinia virus-infected cells, mRNA with a 5'-poly(A) leader could also be efficiently translated in cells with impaired cap-dependent translation. However, the translation was not mediated through an internal ribosome entry site (IRES). These results point to a fundamental mechanism poxvirus uses to efficiently translate its post-replicative mRNAs.
|
16
|
Bai B, Peviani A, van der Horst S, Gamm M, Snel B, Bentsink L, Hanson J. Extensive translational regulation during seed germination revealed by polysomal profiling. THE NEW PHYTOLOGIST 2017; 214:233-244. [PMID: 27935038 PMCID: PMC5347915 DOI: 10.1111/nph.14355] [Citation(s) in RCA: 54] [Impact Index Per Article: 7.7] [Reference Citation Analysis] [Abstract] [Key Words] [MESH Headings] [Track Full Text] [Subscribe] [Scholar Register] [Received: 07/20/2016] [Accepted: 10/25/2016] [Indexed: 05/18/2023]
Abstract
This work investigates the extent of translational regulation during seed germination. The polysome occupancy of each gene is determined by genome-wide profiling of total mRNA and polysome-associated mRNA. This reveals extensive translational regulation during Arabidopsis thaliana seed germination. The polysome occupancy of thousands of individual mRNAs changes to a large extent during the germination process. Intriguingly, these changes are restricted to two temporal phases (shifts) during germination, seed hydration and germination. Sequence features, such as upstream open reading frame number, transcript length, mRNA stability, secondary structures, and the presence and location of specific motifs correlated with this translational regulation. These features differed significantly between the two shifts, indicating that independent mechanisms regulate translation during seed germination. This study reveals substantial translational dynamics during seed germination and identifies development-dependent sequence features and cis elements that correlate with the translation control, uncovering a novel and important layer of gene regulation during seed germination.
Affiliation(s)
- Bing Bai
- Department of Molecular Plant Physiology, Utrecht University, 3584 CH Utrecht, the Netherlands
- Wageningen Seed Laboratory, Laboratory of Plant Physiology, Wageningen University, 6708 PB Wageningen, the Netherlands
- Alessia Peviani
- Theoretical Biology and Bioinformatics, Utrecht University, 3584 CH Utrecht, the Netherlands
- Sjors van der Horst
- Wageningen Seed Laboratory, Laboratory of Plant Physiology, Wageningen University, 6708 PB Wageningen, the Netherlands
- Theoretical Biology and Bioinformatics, Utrecht University, 3584 CH Utrecht, the Netherlands
- Magdalena Gamm
- Department of Molecular Plant Physiology, Utrecht University, 3584 CH Utrecht, the Netherlands
- Berend Snel
- Theoretical Biology and Bioinformatics, Utrecht University, 3584 CH Utrecht, the Netherlands
- Leónie Bentsink
- Department of Molecular Plant Physiology, Utrecht University, 3584 CH Utrecht, the Netherlands
- Wageningen Seed Laboratory, Laboratory of Plant Physiology, Wageningen University, 6708 PB Wageningen, the Netherlands
- Johannes Hanson
- Department of Molecular Plant Physiology, Utrecht University, 3584 CH Utrecht, the Netherlands
- Umeå Plant Science Centre, Department of Plant Physiology, University of Umeå, Umeå, SE‐901 87, Sweden
|
17
|
Abstract
Bioinformatic analysis can not only accelerate drug target identification and drug candidate screening and refinement, but also facilitate characterization of side effects and predict drug resistance. High-throughput data such as genomic, epigenetic, genome architecture, cistromic, transcriptomic, proteomic, and ribosome profiling data have all made significant contributions to mechanism-based drug discovery and drug repurposing. Accumulation of protein and RNA structures, as well as development of homology modeling and protein structure simulation, coupled with large structure databases of small molecules and metabolites, paved the way for more realistic protein-ligand docking experiments and more informative virtual screening. I present the conceptual framework that drives the collection of these high-throughput data, summarize the utility and potential of mining these data in drug discovery, outline a few inherent limitations in data and software mining these data, point out new ways to refine analysis of these diverse types of data, and highlight commonly used software and databases relevant to drug discovery.
Affiliation(s)
- Xuhua Xia
- Department of Biology, Faculty of Science, University of Ottawa, Ottawa, Ontario K1N 6N5, Canada
- Ottawa Institute of Systems Biology, Ottawa K1H 8M5, Canada
|
18
|
Opposing Roles of Double-Stranded RNA Effector Pathways and Viral Defense Proteins Revealed with CRISPR-Cas9 Knockout Cell Lines and Vaccinia Virus Mutants. J Virol 2016; 90:7864-79. [PMID: 27334583 DOI: 10.1128/jvi.00869-16] [Citation(s) in RCA: 44] [Impact Index Per Article: 5.5] [Reference Citation Analysis] [Abstract] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Received: 05/03/2016] [Accepted: 06/16/2016] [Indexed: 12/15/2022] Open
Abstract
Vaccinia virus (VACV) decapping enzymes and cellular exoribonuclease Xrn1 catalyze successive steps in mRNA degradation and prevent double-stranded RNA (dsRNA) accumulation, whereas the viral E3 protein can bind dsRNA. We showed that dsRNA and E3 colocalized within cytoplasmic viral factories in cells infected with a decapping enzyme mutant as well as with wild-type VACV and that they coprecipitated with antibody. An E3 deletion mutant induced protein kinase R (PKR) and eukaryotic translation initiation factor alpha (eIF2α) phosphorylation earlier and more strongly than a decapping enzyme mutant even though less dsRNA was made, leading to more profound effects on viral gene expression. Human HAP1 and A549 cells were genetically modified by clustered regularly interspaced short palindromic repeat-Cas9 (CRISPR-Cas9) to determine whether the same pathways restrict E3 and decapping mutants. The E3 mutant replicated in PKR knockout (KO) HAP1 cells in which RNase L is intrinsically inactive but only with a double knockout (DKO) of PKR and RNase L in A549 cells, indicating that both pathways decreased replication equivalently and that no additional dsRNA pathway was crucial. In contrast, replication of the decapping enzyme mutant increased significantly (though less than that of wild-type virus) in DKO A549 cells but not in DKO HAP1 cells where a smaller increase in viral protein synthesis occurred. Xrn1 KO A549 cells were viable but nonpermissive for VACV; however, wild-type and mutant viruses replicated in triple-KO cells in which RNase L and PKR were also inactivated. Since KO of PKR and RNase L was sufficient to enable VACV replication in the absence of E3 or Xrn1, the poor replication of the decapping mutant, particularly in HAP1 DKO cells, indicated additional translational defects.
IMPORTANCE Viruses have evolved ways of preventing or counteracting the cascade of antiviral responses that double-stranded RNA (dsRNA) triggers in host cells. We showed that the dsRNA produced in excess in cells infected with a vaccinia virus (VACV) decapping enzyme mutant and by wild-type virus colocalized with the viral E3 protein in cytoplasmic viral factories. Novel human cell lines defective in either or both protein kinase R and RNase L dsRNA effector pathways and/or the cellular 5' exonuclease Xrn1 were prepared by CRISPR-Cas9 gene editing. Inactivation of both pathways was necessary and sufficient to allow full replication of the E3 mutant and reverse the defect caused by inactivation of Xrn1, whereas the decapping enzyme mutant still exhibited defects in gene expression. The study provided new insights into functions of the VACV proteins, and the well-characterized panel of CRISPR-Cas9-modified human cell lines should have broad applicability for studying innate dsRNA pathways.
|
19
|
Yang F, Ji QQ, Ruan LL, Ye Q, Wang ED. The mRNA of human cytoplasmic arginyl-tRNA synthetase recruits prokaryotic ribosomes independently. J Biol Chem 2015; 289:20953-9. [PMID: 24898251 DOI: 10.1074/jbc.m114.562454] [Citation(s) in RCA: 3] [Impact Index Per Article: 0.3] [Reference Citation Analysis] [Abstract] [MESH Headings] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Indexed: 12/31/2022] Open
Abstract
There are two isoforms of cytoplasmic arginyl-tRNA synthetase (hcArgRS) in human cells. The long form is a component of the multiple aminoacyl-tRNA synthetase complex, and the other is an N-terminal truncated form (NhcArgRS), free in the cytoplasm. It has been shown that the two forms of ArgRS arise from alternative translational initiation in a single mRNA. The short form is produced from the initiation at a downstream, in-frame AUG start codon. Interestingly, our data suggest that the alternative translational initiation of hcArgRS mRNA also takes place in Escherichia coli transformants. When the gene encoding full-length hcArgRS was overexpressed in E. coli, two forms of hcArgRS were observed. The N-terminal sequencing experiment identified that the short form was identical to the NhcArgRS in human cytoplasm. By constructing a bicistronic system, our data support that the mRNA encoding the N-terminal extension of hcArgRS has the capacity of independently recruiting E. coli ribosomes. Furthermore, two critical elements for recruiting prokaryotic ribosomes were identified, the “AGGA” core of the Shine-Dalgarno sequence and the “A-rich” sequence located just proximal to the alternative in-frame initiation site. Although the mechanisms of prokaryotic and eukaryotic translational initiation are distinct, they share some common features. The ability of the hcArgRS mRNA to recruit the prokaryotic ribosome may provide clues for shedding light on the mechanism of alternative translational initiation of hcArgRS mRNA in eukaryotic cells.
|
20
|
Prabhakaran R, Chithambaram S, Xia X. Escherichia coli and Staphylococcus phages: effect of translation initiation efficiency on differential codon adaptation mediated by virulent and temperate lifestyles. J Gen Virol 2015; 96:1169-1179. [PMID: 25614589 PMCID: PMC4631060 DOI: 10.1099/vir.0.000050] [Citation(s) in RCA: 21] [Impact Index Per Article: 2.3] [Reference Citation Analysis] [Abstract] [MESH Headings] [Track Full Text] [Download PDF] [Figures] [Journal Information] [Subscribe] [Scholar Register] [Received: 10/23/2014] [Accepted: 01/11/2015] [Indexed: 12/19/2022] Open
Abstract
Rapid biosynthesis is key to the success of bacteria and viruses. Highly expressed genes in bacteria exhibit a strong codon bias corresponding to the differential availability of tRNAs. However, a large clade of lambdoid coliphages exhibits relatively poor codon adaptation to the host translation machinery, in contrast to other coliphages that exhibit strong codon adaptation to the host. Three possible explanations were previously proposed but dismissed: (1) the phage-borne tRNA genes that reduce the dependence of phage translation on host tRNAs, (2) lack of time needed for evolving codon adaptation due to recent host switching, and (3) strong strand asymmetry with biased mutation disrupting codon adaptation. Here, we examined the possibility that phages with relatively poor codon adaptation have poor translation initiation which would weaken the selection on codon adaptation. We measured translation initiation by: (1) the strength and position of the Shine-Dalgarno (SD) sequence, and (2) the stability of the secondary structure of sequences flanking the SD and start codon known to affect accessibility of the SD sequence and start codon. Phage genes with strong codon adaptation had significantly stronger SD sequences than those with poor codon adaptation. The former also had significantly weaker secondary structure in sequences flanking the SD sequence and start codon than the latter. Thus, lambdoid phages do not exhibit strong codon adaptation because they have relatively inefficient translation initiation and would benefit little from increased elongation efficiency. We also provided evidence suggesting that phage lifestyle (virulent versus temperate) affected selection intensity on the efficiency of translation initiation and elongation.
Affiliation(s)
- Ramanandan Prabhakaran
- Department of Biology and Center for Advanced Research in Environmental Genomics, University of Ottawa, 30 Marie Curie, PO Box 450, Station A, Ottawa, Ontario K1N 6N5, Canada
- Shivapriya Chithambaram
- Department of Biology and Center for Advanced Research in Environmental Genomics, University of Ottawa, 30 Marie Curie, PO Box 450, Station A, Ottawa, Ontario K1N 6N5, Canada
- Xuhua Xia
- Department of Biology and Center for Advanced Research in Environmental Genomics, University of Ottawa, 30 Marie Curie, PO Box 450, Station A, Ottawa, Ontario K1N 6N5, Canada
- Correspondence: Xuhua Xia
|
21
|
Bioinformatics analysis of alternative polyadenylation in green alga Chlamydomonas reinhardtii using transcriptome sequences from three different sequencing platforms. G3-GENES GENOMES GENETICS 2014; 4:871-83. [PMID: 24626288 PMCID: PMC4025486 DOI: 10.1534/g3.114.010249] [Citation(s) in RCA: 20] [Impact Index Per Article: 2.0] [Reference Citation Analysis] [Abstract] [Key Words] [Download PDF] [Figures] [Subscribe] [Scholar Register] [Indexed: 12/05/2022]
Abstract
Messenger RNA 3′-end formation is an essential posttranscriptional processing step for most eukaryotic genes. Different from plants and animals where AAUAAA and its variants routinely are found as the main poly(A) signal, Chlamydomonas reinhardtii uses UGUAA as the major poly(A) signal. The advance of sequencing technology provides an enormous amount of sequencing data for us to explore the variations of poly(A) signals, alternative polyadenylation (APA), and its relationship with splicing in this algal species. Through genome-wide analysis of poly(A) sites in C. reinhardtii, we identified a large number of poly(A) sites: 21,041 from Sanger expressed sequence tags, 88,184 from 454, and 195,266 from Illumina sequence reads. In comparison with previous collections, more new poly(A) sites are found in coding sequences and intron and intergenic regions by deep-sequencing. Interestingly, G-rich signals are particularly abundant in intron and intergenic regions. The prevalence of different poly(A) signals between coding sequences and a 3′-untranslated region implies potentially different polyadenylation mechanisms. Our data suggest that the APA occurs in about 68% of C. reinhardtii genes. Using Gene Ontology analysis, we found most of the APA genes are involved in RNA regulation and metabolic process, protein synthesis, hydrolase, and ligase activities. Moreover, intronic poly(A) sites are more abundant in constitutively spliced introns than retained introns, suggesting an interplay between polyadenylation and splicing. Our results support that APA, as in higher eukaryotes, may play significant roles in increasing transcriptome diversity and gene expression regulation in this algal species. Our datasets also provide useful information for accurate annotation of transcript ends in C. reinhardtii.
|
22
|
Abstract
Studying phage codon adaptation is important not only for understanding the process of translation elongation, but also for reengineering phages for medical and industrial purposes. To evaluate the effect of mutation and selection on phage codon usage, we developed an index to measure selection imposed by host translation machinery, based on the difference in codon usage between all host genes and highly expressed host genes. We developed linear and nonlinear models to estimate the C→T mutation bias in different phage lineages and to evaluate the relative effect of mutation and host selection on phage codon usage. C→T-biased mutations occur more frequently in single-stranded DNA (ssDNA) phages than in double-stranded DNA (dsDNA) phages and affect not only synonymous codon usage, but also nonsynonymous substitutions at second codon positions, especially in ssDNA phages. The host translation machinery affects codon adaptation in both dsDNA and ssDNA phages, with a stronger effect on dsDNA phages than on ssDNA phages. Strand asymmetry with the associated local variation in mutation bias can significantly interfere with codon adaptation in both dsDNA and ssDNA phages.
|
23
|
Xia X. DAMBE5: a comprehensive software package for data analysis in molecular biology and evolution. Mol Biol Evol 2013; 30:1720-8. [PMID: 23564938 PMCID: PMC3684854 DOI: 10.1093/molbev/mst064] [Citation(s) in RCA: 739] [Impact Index Per Article: 67.2] [Reference Citation Analysis] [Abstract] [Key Words] [Track Full Text] [Download PDF] [Figures] [Journal Information] [Subscribe] [Scholar Register] [Indexed: 12/11/2022] Open
Abstract
Since its first release in 2001 as mainly a software package for phylogenetic analysis, data analysis for molecular biology and evolution (DAMBE) has gained many new functions that may be classified into six categories: 1) sequence retrieval, editing, manipulation, and conversion among more than 20 standard sequence formats including MEGA, NEXUS, PHYLIP, GenBank, and the new NeXML format for interoperability, 2) motif characterization and discovery functions such as position weight matrix and Gibbs sampler, 3) descriptive genomic analysis tools with improved versions of codon adaptation index, effective number of codons, protein isoelectric point profiling, RNA and protein secondary structure prediction and calculation of minimum folding energy, and genomic skew plots with optimized window size, 4) molecular phylogenetics including sequence alignment, testing substitution saturation, distance-based, maximum parsimony, and maximum-likelihood methods for tree reconstructions, testing the molecular clock hypothesis with either a phylogeny or with relative-rate tests, dating gene duplication and speciation events, choosing the best-fit substitution models, and estimating rate heterogeneity over sites, 5) phylogeny-based comparative methods for continuous and discrete variables, and 6) graphic functions including secondary structure display, optimized skew plot, hydrophobicity plot, and many other plots of amino acid properties along a protein sequence, tree display and drawing by dragging nodes to each other, and visual searching of the maximum parsimony tree. DAMBE features a graphic, user-friendly, and intuitive interface and is freely available from http://dambe.bio.uottawa.ca (last accessed April 16, 2013).
Affiliation(s)
- Xuhua Xia
- Department of Biology and Center for Advanced Research in Environmental Genomics, University of Ottawa, Ottawa, Ontario, Canada
|
24
|
Kramer S, Bannerman-Chukualim B, Ellis L, Boulden EA, Kelly S, Field MC, Carrington M. Differential localization of the two T. brucei poly(A) binding proteins to the nucleus and RNP granules suggests binding to distinct mRNA pools. PLoS One 2013; 8:e54004. [PMID: 23382864 PMCID: PMC3559699 DOI: 10.1371/journal.pone.0054004] [Citation(s) in RCA: 38] [Impact Index Per Article: 3.5] [Reference Citation Analysis] [Abstract] [MESH Headings] [Grants] [Track Full Text] [Download PDF] [Figures] [Journal Information] [Subscribe] [Scholar Register] [Received: 07/04/2012] [Accepted: 12/06/2012] [Indexed: 12/30/2022] Open
Abstract
The number of paralogs of proteins involved in translation initiation is larger in trypanosomes than in yeasts or many metazoan and includes two poly(A) binding proteins, PABP1 and PABP2, and four eIF4E variants. In many cases, the paralogs are individually essential and are thus unlikely to have redundant functions although, as yet, distinct functions of different isoforms have not been determined. Here, trypanosome PABP1 and PABP2 have been further characterised. PABP1 and PABP2 diverged subsequent to the differentiation of the Kinetoplastae lineage, supporting the existence of specific aspects of translation initiation regulation. PABP1 and PABP2 exhibit major differences in intracellular localization and distribution on polysome fractionation under various conditions that interfere with mRNA metabolism. Most striking are differences in localization to the four known types of inducible RNP granules. Moreover, only PABP2 but not PABP1 can accumulate in the nucleus. Taken together, these observations indicate that PABP1 and PABP2 likely associate with distinct populations of mRNAs. The differences in localization to inducible RNP granules also apply to paralogs of components of the eIF4F complex: eIF4E1 showed similar localization pattern to PABP2, whereas the localisation of eIF4E4 and eIF4G3 resembled that of PABP1. The grouping of translation initiation as either colocalizing with PABP1 or with PABP2 can be used to complement interaction studies to further define the translation initiation complexes in kinetoplastids.
Affiliation(s)
- Susanne Kramer
- Department of Biochemistry, University of Cambridge, Cambridge, United Kingdom
- Louise Ellis
- Department of Biochemistry, University of Cambridge, Cambridge, United Kingdom
- Steve Kelly
- Department of Plant Sciences, University of Oxford, and Oxford Centre for Integrative Systems Biology, Department of Biochemistry, University of Oxford, Oxford, United Kingdom
- Mark C. Field
- Department of Pathology, University of Cambridge, Cambridge, United Kingdom
- Mark Carrington
- Department of Biochemistry, University of Cambridge, Cambridge, United Kingdom
- * E-mail:
|
25
|
Motalleb G. Functional motifs in Escherichia coli NC101. INTERNATIONAL JOURNAL OF MOLECULAR AND CELLULAR MEDICINE 2013; 2:177-84. [PMID: 24551810 PMCID: PMC3927380] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Reference Citation Analysis] [Abstract] [Key Words] [Download PDF] [Figures] [Subscribe] [Scholar Register] [Received: 08/14/2013] [Accepted: 09/30/2013] [Indexed: 12/01/2022]
Abstract
Escherichia coli (E. coli) bacteria can damage DNA of the gut lining cells and may encourage the development of colon cancer according to recent reports. Genetic switches are specific sequence motifs and many of them are drug targets. It is interesting to know motifs and their location in sequences. In the present study, the Gibbs sampler algorithm was used to predict and find functional motifs in E. coli NC101 contig 1. The whole genomic sequence of Escherichia coli NC101 contig 1 was retrieved from http://www.ncbi.nlm.nih.gov (NCBI Reference sequence: NZ_AEFA01000001.1) in order to be analyzed with DAMBE software and BLAST. The results showed that the 6-mer motif is CUGGAA in most sequences (genes 1-3, 8, 9, 12, 14-18, 20-23, 25, 27, 29, 31-34), CUUGUA for gene 4, CUGUAA for gene 5, CUGAUG for gene 6, CUGAUA for gene 7, CUGAAA for genes 10, 11, 13, 26, 28, CUGGAG for gene 19, and CUGGUA for gene 30 in E. coli NC101 contig 1. It is concluded that the 6-mer motif is CUGGAA in most sequences in E. coli NC101 contig 1. The present study may help experimental studies on elucidating the pharmacological and phylogenic functions of the motifs in E. coli.
Affiliation(s)
- Gholamreza Motalleb
- Corresponding author: Department of Biology, University of Zabol, Zabol, Iran. reza.motaleb@uoz.ac.ir
|
26
|
Xia X. Position weight matrix, gibbs sampler, and the associated significance tests in motif characterization and prediction. SCIENTIFICA 2012; 2012:917540. [PMID: 24278755 PMCID: PMC3820676 DOI: 10.6064/2012/917540] [Citation(s) in RCA: 34] [Impact Index Per Article: 2.8] [Reference Citation Analysis] [Abstract] [Track Full Text] [Subscribe] [Scholar Register] [Received: 08/22/2012] [Accepted: 10/11/2012] [Indexed: 05/31/2023]
Abstract
Position weight matrix (PWM) is not only one of the most widely used bioinformatic methods, but also a key component in more advanced computational algorithms (e.g., Gibbs sampler) for characterizing and discovering motifs in nucleotide or amino acid sequences. However, few generally applicable statistical tests are available for evaluating the significance of site patterns, PWM, and PWM scores (PWMS) of putative motifs. Statistical significance tests of the PWM output, that is, site-specific frequencies, PWM itself, and PWMS, are in disparate sources and have never been collected in a single paper, with the consequence that many implementations of PWM do not include any significance test. Here I review PWM-based methods used in motif characterization and prediction (including a detailed illustration of the Gibbs sampler for de novo motif discovery), present statistical and probabilistic rationales behind statistical significance tests relevant to PWM, and illustrate their application with real data. The multiple comparison problem associated with the test of site-specific frequencies is best handled by false discovery rate methods. The test of PWM, due to the use of pseudocounts, is best done by resampling methods. The test of individual PWMS for each sequence segment should be based on the extreme value distribution.
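As a small illustration of the PWM and PWM score (PWMS) described in this abstract, the Python sketch below builds a log-odds matrix with pseudocounts and scores sequence segments against it. It is a generic textbook-style construction, not code from the paper or from DAMBE: the function names, the uniform background frequencies, the pseudocount of 1, and the four toy sites are all assumptions, and none of the significance tests reviewed in the paper are implemented.

```python
import math
from collections import Counter

ALPHABET = "ACGU"


def build_pwm(sites, background=None, pseudocount=1.0):
    """Return one dict per aligned position mapping nucleotide -> log2 odds."""
    if background is None:
        # Assumed uniform background; a real analysis would estimate it from data.
        background = {nuc: 0.25 for nuc in ALPHABET}
    length = len(sites[0])
    if any(len(site) != length for site in sites):
        raise ValueError("all sites must have the same length")
    total = len(sites) + pseudocount * len(ALPHABET)
    pwm = []
    for position in range(length):
        counts = Counter(site[position] for site in sites)
        column = {}
        for nuc in ALPHABET:
            # The pseudocount keeps unobserved nucleotides from giving log(0).
            frequency = (counts[nuc] + pseudocount) / total
            column[nuc] = math.log2(frequency / background[nuc])
        pwm.append(column)
    return pwm


def pwm_score(pwm, segment):
    """PWMS of a segment: sum of position-specific log-odds values."""
    return sum(column[nuc] for column, nuc in zip(pwm, segment))


# Toy aligned sites, made up for illustration.
sites = ["AGGA", "AGGA", "AGGU", "UGGA"]
pwm = build_pwm(sites)
consensus_score = pwm_score(pwm, "AGGA")
mismatch_score = pwm_score(pwm, "CCCC")
print(round(consensus_score, 3), round(mismatch_score, 3))
```

A segment matching the consensus receives the highest score and a segment built from unobserved nucleotides the lowest, which is the behavior that makes PWMS usable as a ranking statistic before any significance test is applied.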
Affiliation(s)
- Xuhua Xia
- Department of Biology, University of Ottawa, 30 Marie Curie, Ottawa, ON, Canada K1N 6N5
|
27
|
Sun X, Yang Q, Xia X. An improved implementation of effective number of codons (nc). Mol Biol Evol 2012; 30:191-6. [PMID: 22915832 DOI: 10.1093/molbev/mss201] [Citation(s) in RCA: 50] [Impact Index Per Article: 4.2] [Reference Citation Analysis] [Abstract] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Indexed: 11/13/2022] Open
Abstract
The effective number of codons (N(c)) is a widely used index for characterizing codon usage bias because it does not require a set of reference genes as does codon adaptation index (CAI) and because of the freely available computational tools such as CodonW. However, N(c), as originally formulated, has many problems. For example, it can have values far greater than the number of sense codons; it treats a 6-fold compound codon family as a single-codon family although it is made of a 2-fold and a 4-fold codon family that can be under dramatically different selection for codon usage bias; the existing implementations do not handle all different genetic codes; it is often biased by codon families with a small number of codons. We developed a new N(c) that has a number of advantages over the original N(c). Its maximum value equals the number of sense codons when all synonymous codons are used equally, and its minimum value equals the number of codon families when exactly one codon is used in each synonymous codon family. It handles all known genetic codes. It breaks the compound codon families (e.g., those involving amino acids coded by six synonymous codons) into 2-fold and 4-fold codon families. It reduces the effect of codon families with few codons by introducing pseudocount and weighted averages. The new N(c) correlates significantly better with CAI than the original N(c) from CodonW based on protein-coding genes from Saccharomyces cerevisiae, Caenorhabditis elegans, Drosophila melanogaster, Escherichia coli, Bacillus subtilis, Micrococcus luteus, and Mycoplasma genitalium. It also correlates better with protein abundance data from the yeast than the original N(c).
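The two boundary properties stated in this abstract (maximum equal to the number of sense codons, minimum equal to the number of codon families) can be checked with a deliberately simplified Python sketch. This is not the paper's formula: it sums the reciprocal of a plain per-family homozygosity and omits the weighted averaging across families; the two-family codon table, the function name, and the optional `pseudocount` parameter are assumptions for illustration.

```python
# Toy codon table with one 2-fold and one 4-fold family (6 sense codons in total).
FAMILIES = {
    "Lys": ["AAA", "AAG"],
    "Gly": ["GGU", "GGC", "GGA", "GGG"],
}


def effective_number_of_codons(codon_counts, families=FAMILIES, pseudocount=0.0):
    """Simplified N(c): sum over codon families of 1 / sum(p_i ** 2)."""
    nc = 0.0
    for codons in families.values():
        counts = [codon_counts.get(codon, 0) + pseudocount for codon in codons]
        total = sum(counts)
        if total == 0:
            # Assumption: an unobserved family is treated as equally used.
            nc += len(codons)
            continue
        homozygosity = sum((count / total) ** 2 for count in counts)
        nc += 1.0 / homozygosity
    return nc


# Equal usage of all synonymous codons gives the number of sense codons.
uniform_usage = {"AAA": 10, "AAG": 10, "GGU": 5, "GGC": 5, "GGA": 5, "GGG": 5}
# Exactly one codon per family gives the number of codon families.
extreme_usage = {"AAA": 20, "GGU": 20}

nc_uniform = effective_number_of_codons(uniform_usage)
nc_extreme = effective_number_of_codons(extreme_usage)
print(nc_uniform, nc_extreme)
```

In this sketch a positive `pseudocount` pulls small families toward equal usage, so the minimum is then approached rather than reached exactly; how the published method balances this is described in the paper itself.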
Affiliation(s)
- Xiaoyan Sun
- State Key Laboratory of Paleobiology and Stratigraphy, Nanjing Institute of Geology and Palaeontology, Chinese Academy of Science, Nanjing, China
|
28
|
Yang Z, Martens CA, Bruno DP, Porcella SF, Moss B. Pervasive initiation and 3'-end formation of poxvirus postreplicative RNAs. J Biol Chem 2012; 287:31050-60. [PMID: 22829601 DOI: 10.1074/jbc.m112.390054] [Citation(s) in RCA: 42] [Impact Index Per Article: 3.5] [Reference Citation Analysis] [Abstract] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Indexed: 11/06/2022] Open
Abstract
Poxviruses are large DNA viruses that replicate within the cytoplasm and encode a complete transcription system, including a multisubunit RNA polymerase, stage-specific transcription factors, capping and methylating enzymes, and a poly(A) polymerase. Expression of the more than 200 open reading frames by vaccinia virus, the prototype poxvirus, is temporally regulated: early mRNAs are synthesized immediately after infection, whereas intermediate and late mRNAs are synthesized following genome replication. The postreplicative transcripts are heterogeneous in length and overlap the entire genome, which pose obstacles for high resolution mapping. We used tag-based methods in conjunction with high throughput cDNA sequencing to determine the precise 5'-capped and 3'-polyadenylated ends of postreplicative RNAs. Polymerase slippage during initiation of intermediate and late RNA synthesis results in a 5'-poly(A) leader that allowed the unambiguous identification of true transcription start sites. Ninety RNA start sites were located just upstream of intermediate and late open reading frames, but many more appeared anomalous, occurring within coding and non-coding regions, indicating pervasive transcription initiation. We confirmed the presence of functional promoter sequences upstream of representative anomalous start sites and demonstrated that alternative start sites within open reading frames could generate truncated isoforms of proteins. In an analogous manner, poly(A) sequences allowed accurate mapping of the numerous 3'-ends of postreplicative RNAs, which were preceded by a pyrimidine-rich sequence in the DNA coding strand. The distribution of postreplicative promoter sequences throughout the genome provides enormous transcriptional complexity, and the large number of previously unmapped RNAs may have novel functions.
Affiliation(s)
- Zhilong Yang
- Laboratory of Viral Diseases, NIAID, National Institutes of Health, Bethesda, Maryland 20892-3210, USA
|
29
|
A new framework for understanding IRES-mediated translation. Gene 2012; 502:75-86. [PMID: 22555019 DOI: 10.1016/j.gene.2012.04.039] [Citation(s) in RCA: 70] [Impact Index Per Article: 5.8] [Reference Citation Analysis] [Abstract] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Received: 12/30/2011] [Revised: 03/23/2012] [Accepted: 04/17/2012] [Indexed: 01/08/2023]
Abstract
Studies over the past 5 or so years have indicated that the traditional clustering of mechanisms for translation initiation in eukaryotes into cap-dependent and cap-independent (or IRES-mediated) is far too narrow. From individual studies of a number of mRNAs encoding proteins that are regulatory in nature (i.e. likely to be needed in small amounts such as transcription factors, protein kinases, etc.), it is now evident that mRNAs exist that blur these boundaries. This review seeks to set the basic ground rules for the analysis of different initiation pathways that are associated with these new mRNAs as well as related to the more traditional mechanisms, especially the cap-dependent translational process that is the major route of initiation of mRNAs for housekeeping proteins and thus, the bulk of protein synthesis in most cells. It will become apparent that a mixture of descriptions is likely to become the norm in the near future (i.e. m(7)G-assisted internal initiation).
|
30
|
Factors affecting splicing strength of yeast genes. Comp Funct Genomics 2011; 2011:212146. [PMID: 22162666 PMCID: PMC3226532 DOI: 10.1155/2011/212146] [Citation(s) in RCA: 20] [Impact Index Per Article: 1.5] [Received: 07/30/2011] [Accepted: 09/06/2011] [Indexed: 01/30/2023] Open
Abstract
Accurate and efficient splicing is of crucial importance for highly transcribed intron-containing genes (ICGs) in rapidly replicating unicellular eukaryotes such as the budding yeast Saccharomyces cerevisiae. We characterize the 5' and 3' splice sites (ss) by position weight matrix scores (PWMSs), which are highest for the consensus sequence and lowest for splice sites differing most from it, and use PWMS as a proxy for splicing strength. HAC1, which is known to be spliced by a nonspliceosomal mechanism, has the most negative PWMS for both its 5' ss and 3' ss. Several genes that are under strong splicing regulation and require additional splicing factors also have small or negative PWMS values. Splicing strength is higher for highly transcribed ICGs than for lowly transcribed ICGs, and higher for transcripts that bind strongly to spliceosomes than for those that bind weakly. The 3' splice site features a prominent poly-U tract before the 3' AG. Our results suggest the potential of using PWMS as a screening tool for ICGs that are either spliced by a nonspliceosomal mechanism or under strong splicing regulation in yeast and other fungal species.
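The position weight matrix score used in the abstract above can be illustrated with a minimal Python sketch. It is not the authors' implementation: the function names (`build_pwm`, `pwm_score`), the toy set of 5' splice sites, the pseudocount of 1, and the uniform background frequencies are all assumptions made for illustration. The sketch builds a log-odds matrix from aligned sites and sums the per-position log-odds of a candidate site, so the consensus scores highest and strongly divergent sites score negative.

```python
import math

BASES = "ACGT"


def build_pwm(sites, pseudocount=1.0, background=None):
    """Build a log2-odds position weight matrix from aligned, equal-length sites."""
    if background is None:
        # Assumption: uniform background nucleotide frequencies.
        background = {b: 0.25 for b in BASES}
    length = len(sites[0])
    if any(len(s) != length for s in sites):
        raise ValueError("all sites must have the same length")
    pwm = []
    for pos in range(length):
        # Start every count at the pseudocount so unseen bases get a finite score.
        counts = {b: pseudocount for b in BASES}
        for site in sites:
            counts[site[pos]] += 1
        total = sum(counts.values())
        pwm.append(
            {b: math.log2((counts[b] / total) / background[b]) for b in BASES}
        )
    return pwm


def pwm_score(pwm, site):
    """Sum the per-position log-odds for one candidate site (the PWMS)."""
    if len(site) != len(pwm):
        raise ValueError("site length must match the matrix length")
    return sum(column[base] for column, base in zip(pwm, site))


# Hypothetical toy alignment of 5' splice sites (DNA alphabet), dominated by
# the yeast consensus GTATGT; real analyses would use all annotated introns.
toy_sites = ["GTATGT", "GTATGT", "GTATGT", "GTACGT", "GTAAGT", "GTATGA"]
toy_pwm = build_pwm(toy_sites)

for candidate in ["GTATGT", "GTACGT", "CCGCAC"]:
    print(candidate, round(pwm_score(toy_pwm, candidate), 2))
```

Under these assumptions, a site identical to the consensus receives the largest score, a single-mismatch site scores lower, and a site that differs at every position scores below zero, which mirrors the use of small or negative PWMS values as a flag for weak or unusual splice sites.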
|