1. Kumar P, Johnson JE, McGowan T, Chambers MC, Heydarian M, Mehta S, Easterly C, Griffin TJ, Jagtap PD. Discovering Novel Proteoforms Using Proteogenomic Workflows Within the Galaxy Bioinformatics Platform. Methods Mol Biol 2025; 2859:109-128. [PMID: 39436599 DOI: 10.1007/978-1-0716-4152-1_7] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 10/23/2024]
Abstract
Proteogenomics is a growing "multi-omics" research area that combines mass spectrometry-based proteomics and high-throughput nucleotide sequencing technologies. Proteogenomics has helped in genomic annotation for organisms whose complete genome sequences became available by using high-throughput DNA sequencing technologies. Apart from genome annotation, this multi-omics approach has also helped researchers confirm expression of variant proteins belonging to unique proteoforms that could have resulted from single-nucleotide polymorphisms (SNPs), insertions and deletions (indels), splice isoforms, or other genome or transcriptome variations. A proteogenomic study depends on a multistep informatics workflow, requiring different software at each step. These integrated steps include creating an appropriate protein sequence database, matching spectral data against these sequences, and finally identifying peptide sequences corresponding to novel proteoforms followed by variant classification and functional analysis. The disparate software required for a proteogenomic study is difficult for most researchers to access and use, especially those lacking computational expertise. Furthermore, using them disjointedly can be error-prone as it requires setting up individual parameters for each software. Consequently, reproducibility suffers. Managing output files from each software is an additional challenge. One solution for these challenges in proteogenomics is the open-source Web-based computational platform Galaxy. Its capability to create and manage workflows comprised of disparate software while recording and saving all important parameters promotes both usability and reproducibility. Here, we describe a workflow that can perform proteogenomic analysis on a Galaxy-based platform.
This Galaxy workflow facilitates matching of spectral data with a customized protein sequence database, identifying novel protein variants, assessing quality of results, and classifying variants along with visualization against the genome.
Affiliation(s)
- Praveen Kumar
  - Data Sciences & Quantitative Biology, Discovery Sciences, AstraZeneca, Waltham, MA, USA
  - Department of Biochemistry, Molecular Biology and Biophysics, University of Minnesota, Minneapolis, MN, USA
  - Bioinformatics and Computational Biology, University of Minnesota, Minneapolis, MN, USA
- James E Johnson
  - Minnesota Supercomputing Institute, University of Minnesota, Minneapolis, MN, USA
- Thomas McGowan
  - Minnesota Supercomputing Institute, University of Minnesota, Minneapolis, MN, USA
- Subina Mehta
  - Department of Biochemistry, Molecular Biology and Biophysics, University of Minnesota, Minneapolis, MN, USA
- Caleb Easterly
  - Carolina Population Center, University of North Carolina, Chapel Hill, NC, USA
- Timothy J Griffin
  - Department of Biochemistry, Molecular Biology and Biophysics, University of Minnesota, Minneapolis, MN, USA
- Pratik D Jagtap
  - Department of Biochemistry, Molecular Biology and Biophysics, University of Minnesota, Minneapolis, MN, USA
2. Ward C, Beharry A, Tennakoon R, Rozik P, Wilhelm SDP, Heinemann IU, O’Donoghue P. Mechanisms and Delivery of tRNA Therapeutics. Chem Rev 2024; 124:7976-8008. [PMID: 38801719 PMCID: PMC11212642 DOI: 10.1021/acs.chemrev.4c00142] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 02/19/2024] [Revised: 04/11/2024] [Accepted: 04/26/2024] [Indexed: 05/29/2024]
Abstract
Transfer ribonucleic acid (tRNA) therapeutics will provide personalized and mutation specific medicines to treat human genetic diseases for which no cures currently exist. The tRNAs are a family of adaptor molecules that interpret the nucleic acid sequences in our genes into the amino acid sequences of proteins that dictate cell function. Humans encode more than 600 tRNA genes. Interestingly, even healthy individuals contain some mutant tRNAs that make mistakes. Missense suppressor tRNAs insert the wrong amino acid in proteins, and nonsense suppressor tRNAs read through premature stop signals to generate full length proteins. Mutations that underlie many human diseases, including neurodegenerative diseases, cancers, and diverse rare genetic disorders, result from missense or nonsense mutations. Thus, specific tRNA variants can be strategically deployed as therapeutic agents to correct genetic defects. We review the mechanisms of tRNA therapeutic activity, the nature of the therapeutic window for nonsense and missense suppression as well as wild-type tRNA supplementation. We discuss the challenges and promises of delivering tRNAs as synthetic RNAs or as gene therapies. Together, tRNA medicines will provide novel treatments for common and rare genetic diseases in humans.
Affiliation(s)
- Cian Ward
  - Department of Biochemistry, Department of Chemistry, The University of Western Ontario, London, Ontario N6A 5C1, Canada
- Aruun Beharry
  - Department of Biochemistry, Department of Chemistry, The University of Western Ontario, London, Ontario N6A 5C1, Canada
- Rasangi Tennakoon
  - Department of Biochemistry, Department of Chemistry, The University of Western Ontario, London, Ontario N6A 5C1, Canada
- Peter Rozik
  - Department of Biochemistry, Department of Chemistry, The University of Western Ontario, London, Ontario N6A 5C1, Canada
- Sarah D. P. Wilhelm
  - Department of Biochemistry, Department of Chemistry, The University of Western Ontario, London, Ontario N6A 5C1, Canada
- Ilka U. Heinemann
  - Department of Biochemistry, Department of Chemistry, The University of Western Ontario, London, Ontario N6A 5C1, Canada
- Patrick O’Donoghue
  - Department of Biochemistry, Department of Chemistry, The University of Western Ontario, London, Ontario N6A 5C1, Canada
3. Wright DE, O’Donoghue P. Biosynthesis, Engineering, and Delivery of Selenoproteins. Int J Mol Sci 2023; 25:223. [PMID: 38203392 PMCID: PMC10778597 DOI: 10.3390/ijms25010223] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 11/15/2023] [Revised: 12/14/2023] [Accepted: 12/18/2023] [Indexed: 01/12/2024] Open
Abstract
Selenocysteine (Sec) was discovered as the 21st genetically encoded amino acid. In nature, site-directed incorporation of Sec into proteins requires specialized biosynthesis and recoding machinery that evolved distinctly in bacteria compared to archaea and eukaryotes. Many organisms, including higher plants and most fungi, lack the Sec-decoding trait. We review the discovery of Sec and its role in redox enzymes that are essential to human health and important targets in disease. We highlight recent genetic code expansion efforts to engineer site-directed incorporation of Sec in bacteria and yeast. We also review methods to produce selenoproteins with 21 or more amino acids and approaches to delivering recombinant selenoproteins to mammalian cells as new applications for selenoproteins in synthetic biology.
Affiliation(s)
- David E. Wright
  - Department of Biochemistry, The University of Western Ontario, London, ON N6A 5C1, Canada
- Patrick O’Donoghue
  - Department of Biochemistry, The University of Western Ontario, London, ON N6A 5C1, Canada
  - Department of Chemistry, The University of Western Ontario, London, ON N6A 5C1, Canada
4. Shirokikh NE, Jensen KB, Thakor N. Editorial: RNA machines. Front Genet 2023; 14:1290420. [PMID: 37829284 PMCID: PMC10565666 DOI: 10.3389/fgene.2023.1290420] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 09/07/2023] [Accepted: 09/18/2023] [Indexed: 10/14/2023] Open
Affiliation(s)
- Nikolay E. Shirokikh
  - The John Curtin School of Medical Research, The Australian National University, Canberra, ACT, Australia
- Kirk Blomquist Jensen
  - School of Biological Sciences, Faculty of Sciences, University of Adelaide, Adelaide, SA, Australia
- Nehal Thakor
  - Department of Chemistry and Biochemistry, University of Lethbridge, Lethbridge, AB, Canada
5. Hasan F, Lant JT, O'Donoghue P. Perseverance of protein homeostasis despite mistranslation of glycine codons with alanine. Philos Trans R Soc Lond B Biol Sci 2023; 378:20220029. [PMID: 36633285 PMCID: PMC9835607 DOI: 10.1098/rstb.2022.0029] [Citation(s) in RCA: 10] [Impact Index Per Article: 10.0] [Received: 05/31/2022] [Accepted: 10/05/2022] [Indexed: 01/13/2023] Open
Abstract
By linking amino acids to their codon assignments, transfer RNAs (tRNAs) are essential for protein synthesis and translation fidelity. Some human tRNA variants cause amino acid mis-incorporation at a codon or set of codons. We recently found that a naturally occurring tRNASer variant decodes phenylalanine codons with serine and inhibits protein synthesis. Here, we hypothesized that human tRNA variants that misread glycine (Gly) codons with alanine (Ala) will also disrupt protein homeostasis. The A3G mutation occurs naturally in tRNAGly variants (tRNAGlyCCC, tRNAGlyGCC) and creates an alanyl-tRNA synthetase (AlaRS) identity element (G3 : U70). Because AlaRS does not recognize the anticodon, the human tRNAAlaAGC G35C (tRNAAlaACC) variant may function similarly to mis-incorporate Ala at Gly codons. The tRNAGly and tRNAAla variants had no effect on protein synthesis in mammalian cells under normal growth conditions; however, tRNAGlyGCC A3G depressed protein synthesis in the context of proteasome inhibition. Mass spectrometry confirmed Ala mistranslation at multiple Gly codons caused by the tRNAGlyGCC A3G and tRNAAlaAGC G35C mutants, and in some cases, we observed multiple mistranslation events in the same peptide. The data reveal mistranslation of Ala at Gly codons and defects in protein homeostasis generated by natural human tRNA variants that are tolerated under normal conditions. This article is part of the theme issue 'Reactivity and mechanism in chemical and synthetic biology'.
MESH Headings
- Humans
- Alanine/genetics
- Alanine/chemistry
- Alanine/metabolism
- Alanine-tRNA Ligase/chemistry
- Alanine-tRNA Ligase/genetics
- Alanine-tRNA Ligase/metabolism
- Codon/genetics
- Glycine/genetics
- Glycine/metabolism
- Protein Biosynthesis
- Proteostasis
- RNA, Transfer/genetics
- RNA, Transfer/metabolism
- RNA, Transfer, Ala/chemistry
- RNA, Transfer, Ala/genetics
- RNA, Transfer, Ala/metabolism
- RNA, Transfer, Gly/metabolism
Affiliation(s)
- Farah Hasan
  - Department of Biochemistry, The University of Western Ontario, London, Ontario, Canada N6A 5C1
- Jeremy T. Lant
  - Department of Biochemistry, The University of Western Ontario, London, Ontario, Canada N6A 5C1
- Patrick O'Donoghue
  - Department of Biochemistry, The University of Western Ontario, London, Ontario, Canada N6A 5C1
  - Department of Chemistry, The University of Western Ontario, London, Ontario, Canada N6A 5C1
6. Guo X, Su M. The Origin of Translation: Bridging the Nucleotides and Peptides. Int J Mol Sci 2022; 24:ijms24010197. [PMID: 36613641 PMCID: PMC9820756 DOI: 10.3390/ijms24010197] [Citation(s) in RCA: 1] [Impact Index Per Article: 0.5] [Received: 10/31/2022] [Revised: 12/09/2022] [Accepted: 12/15/2022] [Indexed: 12/24/2022] Open
Abstract
Extant biology uses RNA to record genetic information and proteins to execute biochemical functions. Nucleotides are translated into amino acids via transfer RNA in the central dogma. tRNA is essential in translation as it connects the codon and the cognate amino acid. To reveal how the translation emerged in the prebiotic context, we start with the structure and dissection of tRNA, followed by the theory and hypothesis of tRNA and amino acid recognition. Last, we review how amino acids assemble on the tRNA and further form peptides. Understanding the origin of life will also promote our knowledge of artificial living systems.
Affiliation(s)
- Xuyuan Guo
  - School of Genetics and Microbiology, Trinity College Dublin, The University of Dublin, College Green, Dublin 2, D02 PN40 Dublin, Ireland
- Meng Su
  - MRC Laboratory of Molecular Biology, Cambridge CB2 0QH, UK
7. Progress in and Opportunities for Applying Information Theory to Computational Biology and Bioinformatics. ENTROPY 2022; 24:e24070925. [PMID: 35885148 PMCID: PMC9323281 DOI: 10.3390/e24070925] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 05/26/2022] [Revised: 06/27/2022] [Accepted: 06/30/2022] [Indexed: 11/25/2022]
8. Siddika T, Heinemann IU, O'Donoghue P. Expanding codon size. eLife 2022; 11:78869. [PMID: 35543705 PMCID: PMC9094744 DOI: 10.7554/elife.78869] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 11/13/2022] Open
Abstract
Engineering transfer RNAs to read codons consisting of four bases requires changes in tRNA that go beyond the anticodon sequence.
Affiliation(s)
- Tarana Siddika
  - Department of Biochemistry, The University of Western Ontario, London, Canada
- Ilka U Heinemann
  - Department of Biochemistry, The University of Western Ontario, London, Canada
- Patrick O'Donoghue
  - Department of Biochemistry, Department of Chemistry, The University of Western Ontario, London, Canada
9. Rozik P, Szabla R, Lant JT, Kiri R, Wright DE, Junop M, O'Donoghue P. A novel fluorescent reporter sensitive to serine mis-incorporation. RNA Biol 2022; 19:221-233. [PMID: 35167412 PMCID: PMC8855846 DOI: 10.1080/15476286.2021.2015173] [Citation(s) in RCA: 6] [Impact Index Per Article: 3.0] [Indexed: 11/23/2022] Open
Abstract
High-fidelity translation was considered a requirement for living cells. The frozen accident theory suggested that any deviation from the standard genetic code should result in the production of so much mis-made and non-functional proteins that cells cannot remain viable. Studies in bacterial, yeast, and mammalian cells show that significant levels of mistranslation (1–10% per codon) can be tolerated or even beneficial under conditions of oxidative stress. Single tRNA mutants, which occur naturally in the human population, can lead to amino acid mis-incorporation at a codon or set of codons. The rate or level of mistranslation can be difficult or impossible to measure in live cells. We developed a novel red fluorescent protein reporter that is sensitive to serine (Ser) mis-incorporation at proline (Pro) codons. The mCherry Ser151Pro mutant is efficiently produced in Escherichia coli but non-fluorescent. We demonstrated in cells and with purified mCherry protein that the fluorescence of mCherry Ser151Pro is rescued by two different tRNASer gene variants that were mutated to contain the Pro (UGG) anticodon. Ser mis-incorporation was confirmed by mass spectrometry. Remarkably, E. coli tolerated mistranslation rates of ~10% per codon with negligible reduction in growth rate. Conformational sampling simulations revealed that the Ser151Pro mutant leads to significant changes in the conformational freedom of the chromophore precursor, which is indicative of a defect in chromophore maturation. Together our data suggest that the mCherry Ser151 mutants may be used to report Ser mis-incorporation at multiple other codons, further expanding the ability to measure mistranslation in living cells.
Affiliation(s)
- Peter Rozik
  - Department of Biochemistry, The University of Western Ontario, London, Ontario, Canada
- Robert Szabla
  - Department of Biochemistry, The University of Western Ontario, London, Ontario, Canada
- Jeremy T Lant
  - Department of Biochemistry, The University of Western Ontario, London, Ontario, Canada
- Rashmi Kiri
  - Department of Biochemistry, The University of Western Ontario, London, Ontario, Canada
- David E Wright
  - Department of Biochemistry, The University of Western Ontario, London, Ontario, Canada
- Murray Junop
  - Department of Biochemistry, The University of Western Ontario, London, Ontario, Canada
- Patrick O'Donoghue
  - Department of Biochemistry, The University of Western Ontario, London, Ontario, Canada
  - Department of Chemistry, The University of Western Ontario, London, Ontario, Canada
10. Gurzeler LA, Ziegelmüller J, Mühlemann O, Karousis ED. Production of human translation-competent lysates using dual centrifugation. RNA Biol 2022; 19:78-88. [PMID: 34965175 PMCID: PMC8815625 DOI: 10.1080/15476286.2021.2014695] [Citation(s) in RCA: 8] [Impact Index Per Article: 4.0] [Received: 08/12/2021] [Accepted: 12/02/2021] [Indexed: 11/17/2022] Open
Abstract
Protein synthesis is a central process in gene expression and the development of efficient in vitro translation systems has been the focus of scientific efforts for decades. The production of translation-competent lysates originating from human cells or tissues remains challenging, mainly due to the variability of cell lysis conditions. Here we present a robust and fast method based on dual centrifugation that allows for detergent-free cell lysis under controlled mechanical forces. We optimized the lysate preparation to yield cytoplasm-enriched extracts from human cells that efficiently translate mRNAs in a cap-dependent as well as in an IRES-mediated way. Reduction of the phosphorylation state of eIF2α using recombinant GADD34 and 2-aminopurine considerably boosts the protein output, reinforcing the potential of this method to produce recombinant proteins from human lysates.
Affiliation(s)
- Lukas-Adrian Gurzeler
  - Department of Chemistry, Biochemistry and Pharmaceutical Sciences, University of Bern, Bern, Switzerland
  - Graduate School for Cellular and Biomedical Sciences, University of Bern, Bern, Switzerland
- Jana Ziegelmüller
  - Department of Chemistry, Biochemistry and Pharmaceutical Sciences, University of Bern, Bern, Switzerland
  - Graduate School for Cellular and Biomedical Sciences, University of Bern, Bern, Switzerland
- Oliver Mühlemann
  - Department of Chemistry, Biochemistry and Pharmaceutical Sciences, University of Bern, Bern, Switzerland
- Evangelos D. Karousis
  - Department of Chemistry, Biochemistry and Pharmaceutical Sciences, University of Bern, Bern, Switzerland
11. Berg MD, Isaacson JR, Cozma E, Genereaux J, Lajoie P, Villén J, Brandl CJ. Regulating Expression of Mistranslating tRNAs by Readthrough RNA Polymerase II Transcription. ACS Synth Biol 2021; 10:3177-3189. [PMID: 34726901 PMCID: PMC8765249 DOI: 10.1021/acssynbio.1c00461] [Citation(s) in RCA: 1] [Impact Index Per Article: 0.3] [Indexed: 12/02/2022]
Abstract
Transfer RNA (tRNA) variants that alter the genetic code increase protein diversity and have many applications in synthetic biology. Since the tRNA variants can cause a loss of proteostasis, regulating their expression is necessary to achieve high levels of novel protein. Mechanisms to positively regulate transcription with exogenous activator proteins like those often used to regulate RNA polymerase II (RNAP II)-transcribed genes are not applicable to tRNAs as their expression by RNA polymerase III requires elements internal to the tRNA. Here, we show that tRNA expression is repressed by overlapping transcription from an adjacent RNAP II promoter. Regulating the expression of the RNAP II promoter allows inverse regulation of the tRNA. Placing either Gal4- or TetR–VP16-activated promoters downstream of a mistranslating tRNASer variant that misincorporates serine at proline codons in Saccharomyces cerevisiae allows mistranslation at a level not otherwise possible because of the toxicity of the unregulated tRNA. Using this inducible tRNA system, we explore the proteotoxic effects of mistranslation on yeast cells. High levels of mistranslation cause cells to arrest in the G1 phase. These cells are impermeable to propidium iodide, yet growth is not restored upon repressing tRNA expression. High levels of mistranslation increase cell size and alter cell morphology. This regulatable tRNA expression system can be applied to study how native tRNAs and tRNA variants affect the proteome and other biological processes. Variations of this inducible tRNA system should be applicable to other eukaryotic cell types.
Affiliation(s)
- Matthew D. Berg
  - Department of Biochemistry, The University of Western Ontario, London, Ontario N6A 5C1, Canada
  - Department of Genome Sciences, University of Washington, Seattle, Washington 98195, United States
- Joshua R. Isaacson
  - Department of Biology, The University of Western Ontario, London, Ontario N6A 5C1, Canada
- Ecaterina Cozma
  - Department of Biochemistry, The University of Western Ontario, London, Ontario N6A 5C1, Canada
- Julie Genereaux
  - Department of Biochemistry, The University of Western Ontario, London, Ontario N6A 5C1, Canada
- Patrick Lajoie
  - Department of Anatomy and Cell Biology, The University of Western Ontario, London, Ontario N6A 5C1, Canada
- Judit Villén
  - Department of Genome Sciences, University of Washington, Seattle, Washington 98195, United States
- Christopher J. Brandl
  - Department of Biochemistry, The University of Western Ontario, London, Ontario N6A 5C1, Canada
12. Komar AA. A Code Within a Code: How Codons Fine-Tune Protein Folding in the Cell. BIOCHEMISTRY (MOSCOW) 2021; 86:976-991. [PMID: 34488574 DOI: 10.1134/s0006297921080083] [Citation(s) in RCA: 6] [Impact Index Per Article: 2.0] [Indexed: 11/22/2022]
Abstract
The genetic code sets the correspondence between the sequence of a given nucleotide triplet in an mRNA molecule, called a codon, and the amino acid that is added to the growing polypeptide chain during protein synthesis. With four bases (A, G, U, and C), there are 64 possible triplet codons: 61 sense codons (encoding amino acids) and 3 nonsense codons (so-called stop codons that define termination of translation). In most organisms, there are 20 common/standard amino acids used in protein synthesis; thus, the genetic code is redundant, with most amino acids (with the exception of Met and Trp) being encoded by more than one (synonymous) codon. Synonymous codons were initially presumed to have entirely equivalent functions; however, the finding that synonymous codons are not present at equal frequencies in mRNA suggested that the specific codon choice might have functional implications beyond coding for an amino acid. Observation of nonequivalent use of codons in mRNAs implied a possibility of the existence of auxiliary information in the genetic code. Indeed, it has been found that the genetic code contains several layers of such additional information and that synonymous codons are strategically placed within mRNAs to ensure particular translation kinetics, facilitating and fine-tuning co-translational protein folding in the cell via step-wise/sequential structuring of distinct regions of the polypeptide chain emerging from the ribosome at different points in time. This review summarizes key findings in the field that have identified the role of synonymous codons and their usage in protein folding in the cell.
Affiliation(s)
- Anton A Komar
  - Center for Gene Regulation in Health and Disease and Department of Biological, Geological and Environmental Sciences, Cleveland State University, Cleveland, OH 44115, USA
  - Department of Biochemistry and Center for RNA Science and Therapeutics, Case Western Reserve University, Cleveland, OH 44106, USA
  - Genomic Medicine Institute, Lerner Research Institute, Cleveland Clinic, Cleveland, OH 44195, USA
  - DAPCEL, Inc., Cleveland, OH 44106, USA
13. Ji Y, Zhou Z, Liu H, Davuluri RV. DNABERT: pre-trained Bidirectional Encoder Representations from Transformers model for DNA-language in genome. Bioinformatics 2021; 37:2112-2120. [PMID: 33538820 PMCID: PMC11025658 DOI: 10.1093/bioinformatics/btab083] [Citation(s) in RCA: 227] [Impact Index Per Article: 75.7] [Received: 09/10/2020] [Revised: 12/31/2020] [Accepted: 02/01/2021] [Indexed: 12/19/2022] Open
Abstract
MOTIVATION Deciphering the language of non-coding DNA is one of the fundamental problems in genome research. Gene regulatory code is highly complex due to the existence of polysemy and distant semantic relationship, which previous informatics methods often fail to capture especially in data-scarce scenarios. RESULTS To address this challenge, we developed a novel pre-trained bidirectional encoder representation, named DNABERT, to capture global and transferrable understanding of genomic DNA sequences based on up and downstream nucleotide contexts. We compared DNABERT to the most widely used programs for genome-wide regulatory elements prediction and demonstrate its ease of use, accuracy and efficiency. We show that the single pre-trained transformers model can simultaneously achieve state-of-the-art performance on prediction of promoters, splice sites and transcription factor binding sites, after easy fine-tuning using small task-specific labeled data. Further, DNABERT enables direct visualization of nucleotide-level importance and semantic relationship within input sequences for better interpretability and accurate identification of conserved sequence motifs and functional genetic variant candidates. Finally, we demonstrate that pre-trained DNABERT with human genome can even be readily applied to other organisms with exceptional performance. We anticipate that the pre-trained DNABERT model can be fine-tuned to many other sequence analysis tasks. AVAILABILITY AND IMPLEMENTATION The source code, pretrained and finetuned model for DNABERT are available at GitHub (https://github.com/jerryji1993/DNABERT). SUPPLEMENTARY INFORMATION Supplementary data are available at Bioinformatics online.
Affiliation(s)
- Yanrong Ji
  - Division of Health and Biomedical Informatics, Department of Preventive Medicine, Northwestern University Feinberg School of Medicine, Chicago, IL 60611, USA
- Zhihan Zhou
  - Department of Computer Science, Northwestern University, Evanston, IL 60208, USA
- Han Liu
  - Department of Computer Science, Northwestern University, Evanston, IL 60208, USA
- Ramana V Davuluri
  - Department of Biomedical Informatics, Stony Brook University, Stony Brook, NY 11794, USA
14. Role of Synonymous Mutations in the Evolution of TEM β-Lactamase Genes. Antimicrob Agents Chemother 2021; 65:AAC.00018-21. [PMID: 33820762 DOI: 10.1128/aac.00018-21] [Citation(s) in RCA: 3] [Impact Index Per Article: 1.0] [Received: 01/05/2021] [Accepted: 03/23/2021] [Indexed: 01/25/2023] Open
Abstract
Nonsynonymous mutations are well documented in TEM β-lactamases. The resulting amino acid changes often alter the conferred phenotype from broad spectrum (2b) conferred by TEM-1 to extended spectrum (2be), inhibitor resistant (2br), or both extended spectrum and inhibitor resistant (2ber). The encoding bla TEM genes also deviate in numerous synonymous mutations, which are not well understood. bla TEM-3 (2be), bla TEM-33 (2br), and bla TEM-109 (2ber) were studied in comparison to bla TEM-1. bla TEM-33 was chosen for more detailed studies because it deviates from bla TEM-1 by a single nonsynonymous mutation and three additional synonymous mutations. Genes encoding the enzymes with only nonsynonymous or all (including synonymous) mutations plus all permutations between bla TEM-1 and bla TEM-33 were expressed in Escherichia coli cells. In disc diffusion assays, genes encoding TEM-3, TEM-33, and TEM-109 with all synonymous mutations resulted in higher resistance levels than genes without synonymous mutations. Disc diffusion assays with the 16 genes carrying all possible nucleotide change combinations between bla TEM-1 and bla TEM-33 indicated different susceptibilities for different variants. Nucleotide BLAST searches did not identify genes without synonymous mutations but did identify some without nonsynonymous mutations. Energies of possible secondary mRNA structures calculated with mfold are generally higher with synonymous mutations, suggesting that their role could be to destabilize the mRNA and facilitate its unfolding for efficient translation. In summary, our data indicate that transition from bla TEM-1 to other variant genes by simply acquiring the nonsynonymous mutations is not favored. Instead, synonymous mutations seem to support the transition to other variant genes with nonsynonymous mutations leading to different phenotypes.
15.
Abstract
Microbiology began as a unified science using the principles of chemistry to understand living systems. The unified view quickly split into the subdisciplines of medical microbiology, molecular biology, and environmental microbiology. The advent of a universal phylogeny and culture-independent approaches have helped tear down the boundaries separating the subdisciplines. The vision for the future is that the study of the fundamental roles of microbes in ecology and evolution will lead to an integrated biology with no boundary between microbiology and macrobiology. Expected final online publication date for the Annual Review of Microbiology, Volume 75 is October 2021. Please see http://www.annualreviews.org/page/journal/pubdates for revised estimates.
Affiliation(s)
- Roberto Kolter
  - Department of Microbiology, Harvard Medical School, Boston, Massachusetts 02115, USA
16. Goaillard JM, Marder E. Ion Channel Degeneracy, Variability, and Covariation in Neuron and Circuit Resilience. Annu Rev Neurosci 2021; 44:335-357. [PMID: 33770451 DOI: 10.1146/annurev-neuro-092920-121538] [Citation(s) in RCA: 85] [Impact Index Per Article: 28.3] [Indexed: 11/09/2022]
Abstract
The large number of ion channels found in all nervous systems poses fundamental questions concerning how the characteristic intrinsic properties of single neurons are determined by the specific subsets of channels they express. All neurons display many different ion channels with overlapping voltage- and time-dependent properties. We speculate that these overlapping properties promote resilience in neuronal function. Individual neurons of the same cell type show variability in ion channel conductance densities even though they can generate reliable and similar behavior. This complicates a simple assignment of function to any conductance and is associated with variable responses of neurons of the same cell type to perturbations, deletions, and pharmacological manipulation. Ion channel genes often show strong positively correlated expression, which may result from the molecular and developmental rules that determine which ion channels are expressed in a given cell type.
Affiliation(s)
- Eve Marder
- Volen Center and Department of Biology, Brandeis University, Waltham, Massachusetts 02454, USA
|
17
|
Berg MD, Brandl CJ. Transfer RNAs: diversity in form and function. RNA Biol 2021; 18:316-339. [PMID: 32900285 PMCID: PMC7954030 DOI: 10.1080/15476286.2020.1809197] [Citation(s) in RCA: 45] [Impact Index Per Article: 15.0] [Received: 07/01/2020] [Revised: 07/31/2020] [Accepted: 08/08/2020] [Indexed: 12/11/2022] Open
Abstract
As the adaptor that decodes mRNA sequence into protein, the basic aspects of tRNA structure and function are central to all studies of biology. Yet the complexities of their properties and cellular roles go beyond the view of tRNAs as static participants in protein synthesis. Detailed analyses through more than 60 years of study have revealed tRNAs to be a fascinatingly diverse group of molecules in form and function, impacting cell biology, physiology, disease and synthetic biology. This review analyzes tRNA structure, biosynthesis and function, and includes topics that demonstrate their diversity and growing importance.
Affiliation(s)
- Matthew D. Berg
- Department of Biochemistry, The University of Western Ontario, London, Canada
|
18
|
Muthugobal BKN, Ramesh G, Parthasarathy S, Suvaithenamudhan S, Muthuvel Prasath K. Gray code representation of the universal genetic code: Generation of never born protein sequences using Toeplitz matrix approach. Biosystems 2020; 198:104280. [PMID: 33161051 DOI: 10.1016/j.biosystems.2020.104280] [Citation(s) in RCA: 1] [Impact Index Per Article: 0.3] [Received: 06/08/2020] [Revised: 10/23/2020] [Accepted: 10/23/2020] [Indexed: 01/21/2023]
Abstract
In this paper, we identify all possible Gray Code and Partitioned Gray Code representations of the Universal Genetic Code for n = 2-bit and 3-bit binary numbers. We analyse the Hamming Distance matrices of all these Gray Code and Partitioned Gray Code possibilities, for which we obtain the Toeplitz and Partitioned Toeplitz Matrices, respectively. We use these Gray Code and Partitioned Gray Code representations of the Universal Genetic Code, combined with the novel Toeplitz matrix approach, to generate many Never Born Protein (NBP) sequences, which exhibit intrinsic structural stability. In general, Never Born Protein sequences may have many potential applications in synthetic biology and may open a new vista in understanding this new subset of proteins for better applications in drug discovery, synthesis of fine chemicals, etc.
Affiliation(s)
- Ganapathy Ramesh
- Ramanujan Research Centre, Department of Mathematics, Government Arts College (Autonomous), Kumbakonam, 612 001, Tamil Nadu, India
- Subbiah Parthasarathy
- Department of Bioinformatics, School of Life Sciences, Bharathidasan University, Tiruchirappalli, 620 024, Tamil Nadu, India
- Suvaiyarasan Suvaithenamudhan
- Department of Bioinformatics, School of Life Sciences, Bharathidasan University, Tiruchirappalli, 620 024, Tamil Nadu, India
- Karuppasamy Muthuvel Prasath
- Department of Bioinformatics, School of Life Sciences, Bharathidasan University, Tiruchirappalli, 620 024, Tamil Nadu, India
|
19
|
Abstract
Our understanding of the human genome has continuously expanded since its draft publication in 2001. Over the years, novel assays have allowed us to progressively overlay layers of knowledge above the raw sequence of A's, T's, G's, and C's. The reference human genome sequence is now a complex knowledge base maintained under the shared stewardship of multiple specialist communities. Its complexity stems from the fact that it is simultaneously a template for transcription, a record of evolution, a vehicle for genetics, and a functional molecule. In short, the human genome serves as a frame of reference at the intersection of a diversity of scientific fields. In recent years, the progressive fall in sequencing costs has given increasing importance to the quality of the human reference genome, as hundreds of thousands of individuals are being sequenced yearly, often for clinical applications. Also, novel sequencing-based assays shed light on novel functions of the genome, especially with respect to gene expression regulation. Keeping the human genome annotation up to date and accurate is therefore an ongoing partnership between reference annotation projects and the greater community worldwide.
Affiliation(s)
- Daniel R Zerbino
- European Molecular Biology Laboratory, European Bioinformatics Institute, Hinxton CB10 1SD, United Kingdom
- Adam Frankish
- European Molecular Biology Laboratory, European Bioinformatics Institute, Hinxton CB10 1SD, United Kingdom
- Paul Flicek
- European Molecular Biology Laboratory, European Bioinformatics Institute, Hinxton CB10 1SD, United Kingdom
|
20
|
Patel U, Gautam S, Chatterji D. Unraveling the Role of Silent Mutation in the ω-Subunit of Escherichia coli RNA Polymerase: Structure Transition Inhibits Transcription. ACS Omega 2019; 4:17714-17725. [PMID: 31681877 PMCID: PMC6822122 DOI: 10.1021/acsomega.9b02103] [Citation(s) in RCA: 3] [Impact Index Per Article: 0.6] [Received: 07/09/2019] [Accepted: 09/05/2019] [Indexed: 05/07/2023]
Abstract
The bacterial RNA polymerase is a multi-subunit enzyme complex composed of six subunits, α2ββ'σω. The function of this enzyme is to transcribe the DNA base sequence to the RNA intermediate, which is ultimately translated to protein. Though the contribution of each subunit in RNA synthesis has been clearly elucidated, the role of the smallest ω-subunit is still unclear despite several studies. Recently, a study on a dominant negative mutant of rpoZ has been reported in which the mutant was shown to render the RNA polymerase defective in transcription initiation (ω6, N60D) and gave an insight on the function of ω in RNA polymerase. Serendipitously, we also obtained a silent mutant, and the mutant was found to be lethal during the isolation of toxic mutants. The primary focus of this study is to understand the mechanistic details of this lethality. Isolated ω shows a predominantly unstructured circular dichroism profile and becomes α-helical in the enzyme complex. This structural transition is perhaps the reason for this lack of function. Subsequently, we generated several silent mutants of ω to investigate the role of codon bias and the effect of rare codons with respect to their position in rpoZ. Not all silent mutations affect the structure. RNA polymerase when reconstituted with structurally altered silent mutants of ω is transcriptionally inactive. The CodonPlus strain, which has surplus tRNA, was used to assess for the rescue of the phenotype in lethal silent mutants.
Affiliation(s)
- Sudhanshu Gautam
- Molecular Biophysics Unit, Indian Institute of Science, Bangalore 560012, India
- Dipankar Chatterji
- Molecular Biophysics Unit, Indian Institute of Science, Bangalore 560012, India
|
21
|
Kinney JB, McCandlish DM. Massively Parallel Assays and Quantitative Sequence-Function Relationships. Annu Rev Genomics Hum Genet 2019; 20:99-127. [PMID: 31091417 DOI: 10.1146/annurev-genom-083118-014845] [Citation(s) in RCA: 81] [Impact Index Per Article: 16.2] [Indexed: 11/09/2022]
Abstract
Over the last decade, a rich variety of massively parallel assays have revolutionized our understanding of how biological sequences encode quantitative molecular phenotypes. These assays include deep mutational scanning, high-throughput SELEX, and massively parallel reporter assays. Here, we review these experimental methods and how the data they produce can be used to quantitatively model sequence-function relationships. In doing so, we touch on a diverse range of topics, including the identification of clinically relevant genomic variants, the modeling of transcription factor binding to DNA, the functional and evolutionary landscapes of proteins, and cis-regulatory mechanisms in both transcription and mRNA splicing. We further describe a unified conceptual framework and a core set of mathematical modeling strategies that studies in these diverse areas can make use of. Finally, we highlight key aspects of experimental design and mathematical modeling that are important for the results of such studies to be interpretable and reproducible.
Affiliation(s)
- Justin B Kinney
- Simons Center for Quantitative Biology, Cold Spring Harbor Laboratory, Cold Spring Harbor, New York 11724, USA
- David M McCandlish
- Simons Center for Quantitative Biology, Cold Spring Harbor Laboratory, Cold Spring Harbor, New York 11724, USA
|
22
|
Compositional dynamics and codon usage pattern of BRCA1 gene across nine mammalian species. Genomics 2019; 111:167-176. [DOI: 10.1016/j.ygeno.2018.01.013] [Citation(s) in RCA: 14] [Impact Index Per Article: 2.8] [Received: 09/01/2017] [Revised: 12/22/2017] [Accepted: 01/22/2018] [Indexed: 11/19/2022]
|
23
|
Lant JT, Berg MD, Heinemann IU, Brandl CJ, O'Donoghue P. Pathways to disease from natural variations in human cytoplasmic tRNAs. J Biol Chem 2019; 294:5294-5308. [PMID: 30643023 DOI: 10.1074/jbc.rev118.002982] [Citation(s) in RCA: 52] [Impact Index Per Article: 10.4] [Indexed: 12/21/2022] Open
Abstract
Perfectly accurate translation of mRNA into protein is not a prerequisite for life. Resulting from errors in protein synthesis, mistranslation occurs in all cells, including human cells. The human genome encodes >600 tRNA genes, providing both the raw material for genetic variation and a buffer to ensure that resulting translation errors occur at tolerable levels. On the basis of data from the 1000 Genomes Project, we highlight the unanticipated prevalence of mistranslating tRNA variants in the human population and review studies on synthetic and natural tRNA mutations that cause mistranslation or de-regulate protein synthesis. Although mitochondrial tRNA variants are well known to drive human diseases, including developmental disorders, few studies have revealed a role for human cytoplasmic tRNA mutants in disease. In the context of the unexpectedly large number of tRNA variants in the human population, the emerging literature suggests that human diseases may be affected by natural tRNA variants that cause mistranslation or de-regulate tRNA expression and nucleotide modification. This review highlights examples relevant to genetic disorders, cancer, and neurodegeneration in which cytoplasmic tRNA variants directly cause or exacerbate disease and disease-linked phenotypes in cells, animal models, and humans. In the near future, tRNAs may be recognized as useful genetic markers to predict the onset or severity of human disease.
Affiliation(s)
- Patrick O'Donoghue
- From the Departments of Biochemistry and Chemistry, The University of Western Ontario, London, Ontario N6A 5C1, Canada
|
24
|
Dimitrakopoulos L, Prassas I, Diamandis EP, Charames GS. Onco-proteogenomics: Multi-omics level data integration for accurate phenotype prediction. Crit Rev Clin Lab Sci 2017; 54:414-432. [DOI: 10.1080/10408363.2017.1384446] [Citation(s) in RCA: 12] [Impact Index Per Article: 1.7] [Indexed: 01/10/2023]
Affiliation(s)
- Lampros Dimitrakopoulos
- Department of Laboratory Medicine and Pathobiology, University of Toronto, Toronto, ON, Canada
- Department of Pathology and Laboratory Medicine, Mount Sinai Hospital, Joseph and Wolf Lebovic Health Complex, Toronto, ON, Canada
- Lunenfeld-Tanenbaum Research Institute, Sinai Health System, Toronto, ON, Canada
- Ioannis Prassas
- Department of Pathology and Laboratory Medicine, Mount Sinai Hospital, Joseph and Wolf Lebovic Health Complex, Toronto, ON, Canada
- Eleftherios P. Diamandis
- Department of Laboratory Medicine and Pathobiology, University of Toronto, Toronto, ON, Canada
- Department of Pathology and Laboratory Medicine, Mount Sinai Hospital, Joseph and Wolf Lebovic Health Complex, Toronto, ON, Canada
- Lunenfeld-Tanenbaum Research Institute, Sinai Health System, Toronto, ON, Canada
- Department of Clinical Biochemistry, University Health Network, Toronto, ON, Canada
- George S. Charames
- Department of Laboratory Medicine and Pathobiology, University of Toronto, Toronto, ON, Canada
- Department of Pathology and Laboratory Medicine, Mount Sinai Hospital, Joseph and Wolf Lebovic Health Complex, Toronto, ON, Canada
- Lunenfeld-Tanenbaum Research Institute, Sinai Health System, Toronto, ON, Canada
|
25
|
The Genetic Codes: Mathematical Formulae and an Inverse Symmetry-Information Relationship. Information 2016. [DOI: 10.3390/info8010006] [Citation(s) in RCA: 1] [Impact Index Per Article: 0.1] [Indexed: 11/16/2022] Open
|
26
|
Grosjean H, Westhof E. An integrated, structure- and energy-based view of the genetic code. Nucleic Acids Res 2016; 44:8020-40. [PMID: 27448410 PMCID: PMC5041475 DOI: 10.1093/nar/gkw608] [Citation(s) in RCA: 185] [Impact Index Per Article: 23.1] [Received: 05/08/2016] [Revised: 06/11/2016] [Accepted: 06/17/2016] [Indexed: 12/25/2022] Open
Abstract
The principles of mRNA decoding are conserved among all extant life forms. We present an integrative view of all the interaction networks between mRNA, tRNA and rRNA: the intrinsic stability of codon-anticodon duplex, the conformation of the anticodon hairpin, the presence of modified nucleotides, the occurrence of non-Watson-Crick pairs in the codon-anticodon helix and the interactions with bases of rRNA at the A-site decoding site. We derive a more information-rich, alternative representation of the genetic code, that is circular with an unsymmetrical distribution of codons leading to a clear segregation between GC-rich 4-codon boxes and AU-rich 2:2-codon and 3:1-codon boxes. All tRNA sequence variations can be visualized, within an internal structural and energy framework, for each organism, and each anticodon of the sense codons. The multiplicity and complexity of nucleotide modifications at positions 34 and 37 of the anticodon loop segregate meaningfully, and correlate well with the necessity to stabilize AU-rich codon-anticodon pairs and to avoid miscoding in split codon boxes. The evolution and expansion of the genetic code is viewed as being originally based on GC content with progressive introduction of A/U together with tRNA modifications. The representation we present should help the engineering of the genetic code to include non-natural amino acids.
Affiliation(s)
- Henri Grosjean
- Institute for Integrative Biology of the Cell (I2BC), CEA, CNRS, Univ Paris-Sud, Université Paris-Saclay, 91198 Gif-sur-Yvette, France
- Eric Westhof
- Architecture et Réactivité de l'ARN, Université de Strasbourg, Institut de biologie moléculaire et cellulaire du CNRS, 15 rue René Descartes, 67084 Strasbourg, France
|
27
|
Komar AA. The "periodic table" of the genetic code: A new way to look at the code and the decoding process. Translation (Austin) 2016; 4:e1234431. [PMID: 28090420 DOI: 10.1080/21690731.2016.1234431] [Citation(s) in RCA: 3] [Impact Index Per Article: 0.4] [Received: 08/22/2016] [Accepted: 09/05/2016] [Indexed: 01/18/2023]
Abstract
Henri Grosjean and Eric Westhof recently presented an information-rich, alternative view of the genetic code, which takes into account current knowledge of the decoding process, including the complex nature of interactions between mRNA, tRNA and rRNA that take place during protein synthesis on the ribosome, and it also better reflects the evolution of the code. The new asymmetrical circular genetic code has a number of advantages over the traditional codon table and the previous circular diagrams (with a symmetrical/clockwise arrangement of the U, C, A, G bases). Most importantly, all sequence co-variances can be visualized and explained based on the internal logic of the thermodynamics of codon-anticodon interactions.
Affiliation(s)
- Anton A Komar
- Center for Gene Regulation in Health and Disease and Department of Biological, Geological and Environmental Sciences, Cleveland State University, Cleveland, OH, USA; Department of Biochemistry and Center for RNA Molecular Biology, Case Western Reserve University, Cleveland, OH, USA; Genomic Medicine Institute, Lerner Research Institute, Cleveland Clinic, Cleveland, OH, USA
|
28
|
Görnerup O, Gillblad D, Vasiloudis T. Domain-agnostic discovery of similarities and concepts at scale. Knowl Inf Syst 2016. [DOI: 10.1007/s10115-016-0984-2] [Citation(s) in RCA: 1] [Impact Index Per Article: 0.1] [Indexed: 11/29/2022]
|
29
|
Komar AA. The Yin and Yang of codon usage. Hum Mol Genet 2016; 25:R77-R85. [PMID: 27354349 DOI: 10.1093/hmg/ddw207] [Citation(s) in RCA: 98] [Impact Index Per Article: 12.3] [Received: 04/12/2016] [Accepted: 06/24/2016] [Indexed: 01/07/2023] Open
Abstract
The genetic code is degenerate. With the exception of two amino acids (Met and Trp), all other amino acid residues are each encoded by multiple, so-called synonymous codons. Synonymous codons were initially presumed to have entirely equivalent functions; however, the finding that synonymous codons are not present at equal frequencies in genes/genomes suggested that codon choice might have functional implications beyond amino acid coding. The pattern of non-uniform codon use (known as codon usage bias) varies between organisms and represents a unique feature of an organism. Organism-specific codon choice is related to organism-specific differences in populations of cognate tRNAs. This implies that, in a given organism, frequently used codons will be translated more rapidly than infrequently used ones and vice versa. A theory of codon-tRNA co-evolution (necessary to balance accurate and efficient protein production) was put forward to explain the existence of codon usage bias. This model suggests that selection favours preferred (frequent) over un-preferred (rare) codons in order to sustain efficient protein production in cells and that a given un-preferred codon will have the same effect on an organism's fitness regardless of its position within an mRNA's open reading frame. However, many recent studies refute this prediction. Un-preferred codons have been found to have important functional roles and their effects appeared to be position-dependent. Synonymous codon usage affects the efficiency/stringency of mRNA decoding, mRNA biogenesis/stability, and protein secretion and folding. This review summarizes recent developments in the field that have identified novel functions of synonymous codons and their usage.
Affiliation(s)
- Anton A Komar
- Center for Gene Regulation in Health and Disease and Department of Biological, Geological and Environmental Sciences, Cleveland State University, Cleveland, Ohio, USA; Department of Biochemistry and Center for RNA Molecular Biology, Case Western Reserve University, Cleveland, Ohio, USA; Genomic Medicine Institute, Lerner Research Institute, Cleveland Clinic, Cleveland, Ohio, USA
|
30
|
|
31
|
Mazumder TH, Uddin A, Chakraborty S. Transcription factor gene GATA2: Association of leukemia and nonsynonymous to the synonymous substitution rate across five mammals. Genomics 2016; 107:155-61. [PMID: 26850985 DOI: 10.1016/j.ygeno.2016.02.001] [Citation(s) in RCA: 15] [Impact Index Per Article: 1.9] [Received: 12/07/2015] [Revised: 01/15/2016] [Accepted: 02/01/2016] [Indexed: 11/29/2022]
Abstract
GATA2 gene encodes a member of the GATA family of zinc-finger transcription factors that play a pivotal role during the transition of primitive blood forming cells into white blood cells. Mutation in GATA2 results in the loss of function or even gain of function, including abnormal proliferation of white blood cells that may predispose to acute myeloid leukemia. Our results showed that the codon usage in GATA2 has been influenced by GC mutation bias where nature has highly favored fourteen most over represented codons but disfavored the ATA codon across five mammals. Purifying natural selection has affected GATA2 gene in human and other mammals to maintain its protein function during the period of evolution. Our findings report an insight into the codon usage patterns in gaining the clues for codon optimization to alter the translational efficiency as well as for the functional conservation of gene expression and the significance of nucleotide composition in GATA2 gene within mammals.
Affiliation(s)
- Arif Uddin
- Department of Biotechnology, Assam University, Silchar 788011, Assam, India
- Supriyo Chakraborty
- Department of Biotechnology, Assam University, Silchar 788011, Assam, India
|
32
|
Ling J, O'Donoghue P, Söll D. Genetic code flexibility in microorganisms: novel mechanisms and impact on physiology. Nat Rev Microbiol 2015; 13:707-721. [PMID: 26411296 DOI: 10.1038/nrmicro3568] [Citation(s) in RCA: 89] [Impact Index Per Article: 9.9] [Indexed: 12/20/2022]
Abstract
The genetic code, initially thought to be universal and immutable, is now known to contain many variations, including biased codon usage, codon reassignment, ambiguous decoding and recoding. As a result of recent advances in the areas of genome sequencing, biochemistry, bioinformatics and structural biology, our understanding of genetic code flexibility has advanced substantially in the past decade. In this Review, we highlight the prevalence, evolution and mechanistic basis of genetic code variations in microorganisms, and we discuss how this flexibility of the genetic code affects microbial physiology.
Affiliation(s)
- Jiqiang Ling
- Department of Microbiology and Molecular Genetics, University of Texas Health Science Center, Houston, Texas 77030, USA
- Patrick O'Donoghue
- Department of Biochemistry, The University of Western Ontario, London, Ontario N6A 5C1, Canada; Department of Chemistry, The University of Western Ontario, London, Ontario N6A 5C1, Canada
- Dieter Söll
- Department of Molecular Biophysics and Biochemistry, Yale University, New Haven, Connecticut 06520-8114, USA; Department of Chemistry, Yale University, New Haven, Connecticut 06520-8114, USA
|
33
|
Abstract
E. coli's hardiness, versatility, broad palate and ease of handling have made it the most intensively studied and best understood organism on the planet. However, research on E. coli has primarily examined it as a model organism, one that is abstracted from any natural history. But E. coli is far more than just a microbial lab rat. Rather, it is a highly diverse organism with a complex, multi-faceted niche in the wild. Recent studies of 'wild' E. coli have, for example, revealed a great deal about its presence in the environment, its diversity and genomic evolution, as well as its role in the human microbiome and disease. These findings have shed light on aspects of its biology and ecology that pose far-reaching questions and illustrate how an appreciation of E. coli's natural history can expand its value as a model organism.
Affiliation(s)
- Zachary D Blount
- Department of Microbiology and Molecular Genetics, Michigan State University, East Lansing, United States; BEACON Center for the Study of Evolution in Action, East Lansing, United States
|
34
|
Bouchelion A, Zhang Z, Li Y, Qian H, Mukherjee AB. Mice homozygous for c.451C>T mutation in Cln1 gene recapitulate INCL phenotype. Ann Clin Transl Neurol 2014; 1:1006-23. [PMID: 25574475 PMCID: PMC4284126 DOI: 10.1002/acn3.144] [Citation(s) in RCA: 19] [Impact Index Per Article: 1.9] [Received: 10/11/2014] [Accepted: 10/16/2014] [Indexed: 11/21/2022] Open
Abstract
Objective: Nonsense mutations account for 5–70% of all genetic disorders. In the United States, nonsense mutations in the CLN1/PPT1 gene underlie >40% of the patients with infantile neuronal ceroid lipofuscinosis (INCL), a devastating neurodegenerative lysosomal storage disease. We sought to generate a reliable mouse model of INCL carrying the most common Ppt1 nonsense mutation (c.451C>T) found in the United States patient population to provide a platform for evaluating nonsense suppressors in vivo. Methods: We knocked in the c.451C>T nonsense mutation in the Ppt1 gene in C57 embryonic stem (ES) cells using a targeting vector in which LoxP flanked the Neo cassette, which was removed from targeted ES cells by electroporating Cre. Two independently targeted ES clones were injected into blastocysts to generate syngeneic C57 knock-in mice, obviating the necessity for extensive backcrossing. Results: Generation of Ppt1-KI mice was confirmed by DNA sequencing, which showed the presence of the c.451C>T mutation in the Ppt1 gene. These mice are viable and fertile, although they developed spasticity (a “clasping” phenotype) at a median age of 6 months. Autofluorescent storage materials accumulated throughout the brain regions and in visceral organs. Electron microscopic analysis of the brain and the spleen showed granular osmiophilic deposits. Increased neuronal apoptosis was particularly evident in cerebral cortex, and abnormal histopathological and electroretinographic (ERG) analyses attested striking retinal degeneration. Progressive deterioration of motor coordination and behavioral parameters continued until eventual death. Interpretation: Our findings show that Ppt1-KI mice reliably recapitulate the INCL phenotype, providing a platform for testing the efficacy of existing and novel nonsense suppressors in vivo.
Affiliation(s)
- Ashleigh Bouchelion
- Program on Developmental Endocrinology and Genetics, Section on Developmental Genetics, Eunice Kennedy-Shriver National Institute of Child Health and Human Development, Bethesda, Maryland
- Zhongjian Zhang
- Program on Developmental Endocrinology and Genetics, Section on Developmental Genetics, Eunice Kennedy-Shriver National Institute of Child Health and Human Development, Bethesda, Maryland
- Yichao Li
- Visual Function Core (HNW2-L), National Eye Institute, National Institutes of Health, Bethesda, Maryland, 20892-1830
- Haohua Qian
- Visual Function Core (HNW2-L), National Eye Institute, National Institutes of Health, Bethesda, Maryland, 20892-1830
- Anil B Mukherjee
- Program on Developmental Endocrinology and Genetics, Section on Developmental Genetics, Eunice Kennedy-Shriver National Institute of Child Health and Human Development, Bethesda, Maryland
|
35
|
Aerni HR, Shifman MA, Rogulina S, O'Donoghue P, Rinehart J. Revealing the amino acid composition of proteins within an expanded genetic code. Nucleic Acids Res 2014; 43:e8. [PMID: 25378305 PMCID: PMC4333366 DOI: 10.1093/nar/gku1087] [Citation(s) in RCA: 66] [Impact Index Per Article: 6.6] [Indexed: 11/23/2022] Open
Abstract
The genetic code can be manipulated to reassign codons for the incorporation of non-standard amino acids (NSAA). Deletion of release factor 1 in Escherichia coli enhances translation of UAG (Stop) codons, yet may also extend protein synthesis at natural UAG-terminated messenger RNAs. The fidelity of protein synthesis at reassigned UAG codons and the purity of the NSAA-containing proteins produced require careful examination. Proteomics would be an ideal tool for these tasks, but conventional proteomic analyses cannot readily identify the extended proteins and accurately discover multiple amino acid (AA) insertions at a single UAG. To address these challenges, we created a new proteomic workflow that enabled the detection of UAG readthrough in native proteins in E. coli strains in which UAG was reassigned to encode phosphoserine. The method also enabled quantitation of NSAA and natural AA incorporation at UAG in a recombinant reporter protein. As a proof-of-principle, we measured the fidelity and purity of the phosphoserine orthogonal translation system (OTS) and used this information to improve its performance. Our results show a surprising diversity of natural AAs at reassigned stop codons. Our method can be used to improve OTSs and to quantify amino acid purity at reassigned codons in organisms with expanded genetic codes.
Affiliation(s)
- Hans R Aerni
- Department of Cellular & Molecular Physiology, Yale University, New Haven, CT 06520, USA; Systems Biology Institute, Yale University, West Haven, CT 06516, USA
- Mark A Shifman
- Keck Biotechnology Resource Laboratory, Yale University, New Haven, CT 06511, USA
- Svetlana Rogulina
- Department of Cellular & Molecular Physiology, Yale University, New Haven, CT 06520, USA; Systems Biology Institute, Yale University, West Haven, CT 06516, USA
- Patrick O'Donoghue
- Departments of Biochemistry and Chemistry, The University of Western Ontario, London, ON N6A 5C1, Canada
- Jesse Rinehart
- Department of Cellular & Molecular Physiology, Yale University, New Haven, CT 06520, USA; Systems Biology Institute, Yale University, West Haven, CT 06516, USA
|
36
|
Sanbonmatsu KY. Flipping through the Genetic Code: New Developments in Discrimination between Cognate and Near-Cognate tRNAs and the Effect of Antibiotics. J Mol Biol 2014; 426:3197-3200. [DOI: 10.1016/j.jmb.2014.07.005] [Citation(s) in RCA: 6] [Impact Index Per Article: 0.6] [Indexed: 01/19/2023]
|
37
|
Brosius J. The persistent contributions of RNA to eukaryotic gen(om)e architecture and cellular function. Cold Spring Harb Perspect Biol 2014; 6:a016089. [PMID: 25081515 DOI: 10.1101/cshperspect.a016089] [Citation(s) in RCA: 9] [Impact Index Per Article: 0.9] [Indexed: 02/01/2023]
Abstract
Currently, the best scenario for earliest forms of life is based on RNA molecules as they have the proven ability to catalyze enzymatic reactions and harbor genetic information. Evolutionary principles valid today become apparent in such models already. Furthermore, many features of eukaryotic genome architecture might have their origins in an RNA or RNA/protein (RNP) world, including the onset of a further transition, when DNA replaced RNA as the genetic bookkeeper of the cell. Chromosome maintenance, splicing, and regulatory function via RNA may be deeply rooted in the RNA/RNP worlds. Mostly in eukaryotes, conversion from RNA to DNA is still ongoing, which greatly impacts the plasticity of extant genomes. Raw material for novel genes encoding protein or RNA, or parts of genes including regulatory elements that selection can act on, continues to enter the evolutionary lottery.
Affiliation(s)
- Jürgen Brosius
- Institute of Experimental Pathology (ZMBE), University of Münster, D-48149 Münster, Germany
|
38
|
Ivanova NN, Schwientek P, Tripp HJ, Rinke C, Pati A, Huntemann M, Visel A, Woyke T, Kyrpides NC, Rubin EM. Stop codon reassignments in the wild. Science 2014; 344:909-13. [PMID: 24855270 DOI: 10.1126/science.1250691] [Citation(s) in RCA: 92] [Impact Index Per Article: 9.2] [Indexed: 12/24/2022]
Abstract
The canonical genetic code is assumed to be deeply conserved across all domains of life with very few exceptions. By scanning 5.6 trillion base pairs of metagenomic data for stop codon reassignment events, we detected recoding in a substantial fraction of the >1700 environmental samples examined. We observed extensive opal and amber stop codon reassignments in bacteriophages and of opal in bacteria. Our data indicate that bacteriophages can infect hosts with a different genetic code and demonstrate phage-host antagonism based on code differences. The abundance and diversity of genetic codes present in environmental organisms should be considered in the design of engineered organisms with altered genetic codes in order to preclude the exchange of genetic information with naturally occurring species.
Affiliation(s)
- Natalia N Ivanova
- Department of Energy Joint Genome Institute (DOE JGI), Walnut Creek, CA 94598, USA
| | - Patrick Schwientek
- Department of Energy Joint Genome Institute (DOE JGI), Walnut Creek, CA 94598, USA
| | - H James Tripp
- Department of Energy Joint Genome Institute (DOE JGI), Walnut Creek, CA 94598, USA
| | - Christian Rinke
- Department of Energy Joint Genome Institute (DOE JGI), Walnut Creek, CA 94598, USA
| | - Amrita Pati
- Department of Energy Joint Genome Institute (DOE JGI), Walnut Creek, CA 94598, USA
| | - Marcel Huntemann
- Department of Energy Joint Genome Institute (DOE JGI), Walnut Creek, CA 94598, USA
| | - Axel Visel
- Department of Energy Joint Genome Institute (DOE JGI), Walnut Creek, CA 94598, USA. Genomics Division, Lawrence Berkeley National Laboratory, Berkeley, CA 94720, USA. School of Natural Sciences, University of California, Merced, CA 95343, USA
| | - Tanja Woyke
- Department of Energy Joint Genome Institute (DOE JGI), Walnut Creek, CA 94598, USA
| | - Nikos C Kyrpides
- Department of Energy Joint Genome Institute (DOE JGI), Walnut Creek, CA 94598, USA
| | - Edward M Rubin
- Department of Energy Joint Genome Institute (DOE JGI), Walnut Creek, CA 94598, USA. Genomics Division, Lawrence Berkeley National Laboratory, Berkeley, CA 94720, USA.
|
39
|
Rosandić M, Paar V. Codon sextets with leading role of serine create "ideal" symmetry classification scheme of the genetic code. Gene 2014; 543:45-52. [PMID: 24709107 DOI: 10.1016/j.gene.2014.04.009] [Citation(s) in RCA: 8] [Impact Index Per Article: 0.8] [Received: 03/06/2014] [Accepted: 04/03/2014] [Indexed: 11/17/2022]
Abstract
The standard classification scheme of the genetic code is organized for alphabetic ordering of nucleotides. Here we introduce the new, "ideal" classification scheme in compact form, for the first time generated by codon sextets encoding Ser, Arg and Leu amino acids. The new scheme creates the known purine/pyrimidine, codon-anticodon, and amino/keto type symmetries and a novel A+U rich/C+G rich symmetry. This scheme is built from "leading" and "nonleading" groups of 32 codons each. In the ensuing 4 × 16 scheme, based on trinucleotide quadruplets, Ser has a central role as initial generator. Six codons encoding Ser and six encoding Arg extend continuously along a linear array in the "leading" group, and together with four of six Leu codons uniquely define construction of the "leading" group. The remaining two Leu codons enable construction of the "nonleading" group. The "ideal" genetic code suggests the evolution of genetic code with serine as an initiator.
Affiliation(s)
- Marija Rosandić
- Croatian Academy of Sciences and Arts, Zrinski trg 11, 10000 Zagreb, Croatia
| | - Vladimir Paar
- Croatian Academy of Sciences and Arts, Zrinski trg 11, 10000 Zagreb, Croatia; Faculty of Science, University of Zagreb, Bijenička 32, 10000 Zagreb, Croatia.
|
40
|
McCarthy DJ, Humburg P, Kanapin A, Rivas MA, Gaulton K, Cazier JB, Donnelly P. Choice of transcripts and software has a large effect on variant annotation. Genome Med 2014; 6:26. [PMID: 24944579 PMCID: PMC4062061 DOI: 10.1186/gm543] [Citation(s) in RCA: 131] [Impact Index Per Article: 13.1] [Received: 10/07/2013] [Accepted: 03/20/2014] [Indexed: 12/19/2022] Open
Abstract
Background: Variant annotation is a crucial step in the analysis of genome sequencing data. Functional annotation results can have a strong influence on the ultimate conclusions of disease studies. Incorrect or incomplete annotations can cause researchers both to overlook potentially disease-relevant DNA variants and to dilute interesting variants in a pool of false positives. Researchers are aware of these issues in general, but the extent of the dependency of final results on the choice of transcripts and software used for annotation has not been quantified in detail. Methods: This paper quantifies the extent of differences in annotation of 80 million variants from a whole-genome sequencing study. We compare results using the RefSeq and Ensembl transcript sets as the basis for variant annotation with the software Annovar, and also compare the results from two annotation software packages, Annovar and VEP (Ensembl's Variant Effect Predictor), when using Ensembl transcripts. Results: We found only 44% agreement in annotations for putative loss-of-function variants when using the RefSeq and Ensembl transcript sets as the basis for annotation with Annovar. The rate of matching annotations for loss-of-function and nonsynonymous variants combined was 79% and for all exonic variants it was 83%. When comparing results from Annovar and VEP using Ensembl transcripts, matching annotations were seen for only 65% of loss-of-function variants and 87% of all exonic variants, with splicing variants revealed as the category with the greatest discrepancy. Using these comparisons, we characterised the types of apparent errors made by Annovar and VEP and discuss their impact on the analysis of DNA variants in genome sequencing studies. Conclusions: Variant annotation is not yet a solved problem. Choice of transcript set can have a large effect on the ultimate variant annotations obtained in a whole-genome sequencing study. Choice of annotation software can also have a substantial effect. The annotation step in the analysis of a genome sequencing study must therefore be considered carefully, and a conscious choice made as to which transcript set and software are used for annotation.
Affiliation(s)
- Davis J McCarthy
- Department of Statistics, University of Oxford, South Parks Road, Oxford, UK; Wellcome Trust Centre for Human Genetics, University of Oxford, Roosevelt Drive, Oxford, UK
| | - Peter Humburg
- Wellcome Trust Centre for Human Genetics, University of Oxford, Roosevelt Drive, Oxford, UK
| | - Alexander Kanapin
- Wellcome Trust Centre for Human Genetics, University of Oxford, Roosevelt Drive, Oxford, UK
| | - Manuel A Rivas
- Wellcome Trust Centre for Human Genetics, University of Oxford, Roosevelt Drive, Oxford, UK
| | - Kyle Gaulton
- Wellcome Trust Centre for Human Genetics, University of Oxford, Roosevelt Drive, Oxford, UK
| | | | - Peter Donnelly
- Department of Statistics, University of Oxford, South Parks Road, Oxford, UK; Wellcome Trust Centre for Human Genetics, University of Oxford, Roosevelt Drive, Oxford, UK
|
41
|
Ullah AMMS, D'Addona D, Arai N. DNA based computing for understanding complex shapes. Biosystems 2014; 117:40-53. [PMID: 24447435 DOI: 10.1016/j.biosystems.2014.01.003] [Citation(s) in RCA: 9] [Impact Index Per Article: 0.9] [Received: 04/02/2013] [Revised: 11/14/2013] [Accepted: 01/07/2014] [Indexed: 11/30/2022]
Abstract
This study deals with a computing method called DNA based computing (DBC) that takes inspiration from the Central Dogma of Molecular Biology. The proposed DBC uses a set of user-defined rules to create a DNA-like sequence from a given piece of problem-relevant information (e.g., image data) in a dry-media (i.e., in an ordinary computer). It then uses another set of user-defined rules to create an mRNA-like sequence from the DNA. Finally, it uses the genetic code to translate the mRNA (or directly the DNA) to a protein-like sequence (a sequence of amino acids). The informational characteristics of the protein (entropy, absence, presence, abundance of some selected amino acids, and relationships among their likelihoods) can be used to solve problems (e.g., to understand complex shapes from their image data). Two case studies ((1) fractal geometry generated shape of a fern-leaf and (2) machining experiment generated shape of the worn-zones of a cutting tool) are presented elucidating the shape understanding ability of the proposed DBC in the presence of a great deal of variability in the image data of the respective shapes. The implication of the proposed DBC from the context of Internet-aided manufacturing system is also described. Further study can be carried out in solving other complex computational problems by using the proposed DBC and its derivatives.
Affiliation(s)
- A M M Sharif Ullah
- Department of Mechanical Engineering, Kitami Institute of Technology, 165 Koen-cho, Kitami, Hokkaido 090-8507, Japan.
| | - Doriana D'Addona
- Department of Materials and Production Engineering, University of Naples Federico II, Piazzale Tecchio 80, I - 80125 Naples, Italy
| | - Nobuyuki Arai
- Graduate School of Engineering, Kitami Institute of Technology, 165 Koen-cho, Kitami, Hokkaido 090-8507, Japan
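As a rough illustration of the translation step described in the DBC abstract above, the following Python sketch transcribes a DNA-like string, translates it with the standard genetic code, and computes the Shannon entropy of the resulting residues. The function names and the example input are assumptions made here for illustration. The paper's user-defined rules for deriving the DNA-like sequence from image data are not given in the abstract and are not reproduced.

```python
from collections import Counter
from itertools import product
from math import log2

# Standard genetic code, codons ordered with bases U, C, A, G
# (first base varies slowest); "*" marks a stop codon.
BASES = "UCAG"
AMINO_ACIDS = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {
    "".join(codon): aa
    for codon, aa in zip(product(BASES, repeat=3), AMINO_ACIDS)
}


def transcribe(dna: str) -> str:
    """Turn a DNA-like string into an mRNA-like string (T -> U)."""
    return dna.upper().replace("T", "U")


def translate(mrna: str) -> str:
    """Translate every complete codon; trailing bases are ignored."""
    codons = (mrna[i:i + 3] for i in range(0, len(mrna) - 2, 3))
    return "".join(CODON_TABLE[c] for c in codons)


def shannon_entropy(sequence: str) -> float:
    """Shannon entropy (bits) of the symbol frequencies in a sequence."""
    n = len(sequence)
    counts = Counter(sequence)
    return -sum((k / n) * log2(k / n) for k in counts.values())


# Hypothetical input; a real DBC run would derive this from image data.
protein = translate(transcribe("ATGGCCTTTTAA"))
entropy = shannon_entropy(protein)
```

Here the stop symbol is counted like any other residue. Whether the original method does so is not stated in the abstract.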
|
42
|
Foroughmand-Araabi MH, Goliaei B, Alishahi K, Sadeghi M. Dependency of codon usage on protein sequence patterns: a statistical study. Theor Biol Med Model 2014; 11:2. [PMID: 24410898 PMCID: PMC3896713 DOI: 10.1186/1742-4682-11-2] [Citation(s) in RCA: 2] [Impact Index Per Article: 0.2] [Received: 10/24/2013] [Accepted: 01/03/2014] [Indexed: 11/10/2022] Open
Abstract
BACKGROUND: Codon degeneracy and codon usage by organisms is an interesting and challenging problem. Researchers demonstrated the relation between codon usage and various functions or properties of genes and proteins, such as gene regulation, translation rate, translation efficiency, mRNA stability, splicing, and protein domains. Researchers usually represent segments of proteins responsible for specific functions or structures in a family of proteins as sequence patterns or motifs. We asked whether organisms use the same codons in pattern segments as in the rest of the sequence. METHODS: We used the likelihood ratio test, Pearson's chi-squared test, and mutual information to compare these two codon usages. RESULTS: We showed that codon usage, in segments of genes that code for a given pattern or motif in a group of proteins, varied from the rest of the gene. The codon usage in these segments was not random. Amino acids with a larger number of codons used more specific codon ratios in these segments. We studied the number of amino acids in the pattern (pattern length). As patterns got longer, there was a slight decrease in the fraction of patterns with significantly different codon usage in the pattern region as compared to codon usage in the gene region. We defined a measure of specificity of protein patterns, and studied its relation to the codon usage. The difference in the codon usage between pattern region and gene region was less for the patterns with higher specificity. CONCLUSIONS: We provided a hypothesis that there are segments on genes that affect the codon usage and thus influence protein translation speed, and these regions are the regions that code protein pattern regions.
Affiliation(s)
| | - Bahram Goliaei
- Institute of Biochemistry and Biophysics, University of Tehran, Tehran, Iran.
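The abstract above names Pearson's chi-squared test as one way to compare codon usage inside pattern segments with usage in the rest of the gene. A minimal sketch of that statistic on a 2 × k table of codon counts follows. The function name and the counts are hypothetical, and the sketch assumes every codon listed has a non-zero total and both regions contain at least one codon. It does not reproduce the authors' pipeline, data, or significance thresholds.

```python
def codon_usage_chi_squared(pattern_counts, rest_counts):
    """Pearson chi-squared statistic and degrees of freedom for a
    2 x k contingency table of codon counts (pattern vs. rest)."""
    codons = sorted(set(pattern_counts) | set(rest_counts))
    rows = [
        [counts.get(codon, 0) for codon in codons]
        for counts in (pattern_counts, rest_counts)
    ]
    row_totals = [sum(row) for row in rows]
    col_totals = [sum(col) for col in zip(*rows)]
    total = sum(row_totals)

    statistic = 0.0
    for i, row in enumerate(rows):
        for j, observed in enumerate(row):
            # Expected count under independence of region and codon.
            expected = row_totals[i] * col_totals[j] / total
            statistic += (observed - expected) ** 2 / expected

    dof = (len(rows) - 1) * (len(codons) - 1)
    return statistic, dof


# Hypothetical counts for two synonymous alanine codons.
stat, dof = codon_usage_chi_squared(
    {"GCU": 10, "GCC": 30},
    {"GCU": 30, "GCC": 10},
)
```

The statistic would then be compared against a chi-squared distribution with `dof` degrees of freedom to obtain a p-value.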
|
43
|
Krishnakumar R, Prat L, Aerni HR, Ling J, Merryman C, Glass JI, Rinehart J, Söll D. Transfer RNA misidentification scrambles sense codon recoding. Chembiochem 2013; 14:1967-72. [PMID: 24000185 DOI: 10.1002/cbic.201300444] [Citation(s) in RCA: 37] [Impact Index Per Article: 3.4] [Received: 07/06/2013] [Indexed: 12/22/2022]
Abstract
Sense codon recoding is the basis for genetic code expansion with more than two different noncanonical amino acids. It requires an unused (or rarely used) codon, and an orthogonal tRNA synthetase:tRNA pair with the complementary anticodon. The Mycoplasma capricolum genome contains just six CGG arginine codons, without a dedicated tRNA(Arg). We wanted to reassign this codon to pyrrolysine by providing M. capricolum with pyrrolysyl-tRNA synthetase, a synthetic tRNA with a CCG anticodon (tRNA(Pyl)(CCG)), and the genes for pyrrolysine biosynthesis. Here we show that tRNA(Pyl)(CCG) is efficiently recognized by the endogenous arginyl-tRNA synthetase, presumably at the anticodon. Mass spectrometry revealed that in the presence of tRNA(Pyl)(CCG), CGG codons are translated as arginine. This result is not unexpected as most tRNA synthetases use the anticodon as a recognition element. The data suggest that tRNA misidentification by endogenous aminoacyl-tRNA synthetases needs to be overcome for sense codon recoding.
Affiliation(s)
- Radha Krishnakumar
- Synthetic Biology and Bioenergy, J. Craig Venter Institute, 9704 Medical Center Drive, Rockville, MD 20850 (USA)
|
44
|
Hor CY, Yang CB, Chang CH, Tseng CT, Chen HH. A Tool Preference Choice Method for RNA Secondary Structure Prediction by SVM with Statistical Tests. Evol Bioinform Online 2013; 9:163-84. [PMID: 23641141 PMCID: PMC3629938 DOI: 10.4137/ebo.s10580] [Citation(s) in RCA: 1] [Impact Index Per Article: 0.1] [Indexed: 11/09/2022] Open
Abstract
The prediction of RNA secondary structures has drawn much attention from both biologists and computer scientists. Many useful tools have been developed for this purpose. These tools have their individual strengths and weaknesses. As a result, based on support vector machines (SVM), we propose a tool choice method which integrates three prediction tools: pknotsRG, RNAStructure, and NUPACK. Our method first extracts features from the target RNA sequence, and adopts two information-theoretic feature selection methods for feature ranking. We propose a method to combine feature selection and classifier fusion in an incremental manner. Our test data set contains 720 RNA sequences, where 225 pseudoknotted RNA sequences are obtained from PseudoBase, and 495 nested RNA sequences are obtained from RNA SSTRAND. The method serves as a preprocessing step for analyzing RNA sequences before the RNA secondary structure prediction tools are employed. In addition, the performance of various configurations is subject to statistical tests to examine their significance. The best base-pair accuracy achieved is 75.5%, which is obtained by the proposed incremental method, and is significantly higher than 68.8%, which is associated with the best predictor, pknotsRG.
Affiliation(s)
- Chiou-Yi Hor
- Department of Computer Science and Engineering, National Sun Yat-sen University, Kaohsiung, Taiwan
|
45
|
Williams RW, Xue B, Uversky VN, Dunker AK. Distribution and cluster analysis of predicted intrinsically disordered protein Pfam domains. Intrinsically Disord Proteins 2013; 1:e25724. [PMID: 28516017 PMCID: PMC5424788 DOI: 10.4161/idp.25724] [Citation(s) in RCA: 10] [Impact Index Per Article: 0.9] [Received: 04/05/2013] [Revised: 07/02/2013] [Accepted: 07/11/2013] [Indexed: 11/19/2022]
Abstract
The Pfam database groups regions of proteins by how well hidden Markov models (HMMs) can be trained to recognize similarities among them. Conservation pressure is probably in play here. The Pfam seed training set includes sequence and structure information, being drawn largely from the PDB. A long-standing hypothesis among intrinsically disordered protein (IDP) investigators has held that conservation pressures are also at play in the evolution of different kinds of intrinsic disorder, but we find that predicted intrinsic disorder (PID) is not always conserved across Pfam domains. Here we analyze distributions and clusters of PID regions in 193,024 members of the version 23.0 Pfam seed database. To include the maximum information available for proteins that remain unfolded in solution, we employ the 10 linearly independent Kidera factors [1–3] for the amino acids, combined with PONDR [4] predictions of disorder tendency, to transform the sequences of these Pfam members into an 11 column matrix where the number of rows is the length of each Pfam region. Cluster analyses of the set of all regions, including those that are folded, show 6 groupings of domains. Cluster analyses of domains with mean VSL2b scores greater than 0.5 (half predicted disorder or more) show at least 3 separated groups. It is hypothesized that grouping sets into shorter sequences with more uniform length will reveal more information about intrinsic disorder and lead to more finely structured and perhaps more accurate predictions. HMMs could be trained to include this information.
Affiliation(s)
- Robert W Williams
- Department of Biomedical Informatics; Uniformed Services University; Bethesda, MD USA
| | - Bin Xue
- Center for Computational Biology and Bioinformatics; Indiana School of Medicine; Indianapolis, IN USA. Department of Molecular Medicine; College of Medicine; University of South Florida; Tampa, FL USA
| | - Vladimir N Uversky
- Center for Computational Biology and Bioinformatics; Indiana School of Medicine; Indianapolis, IN USA. Department of Molecular Medicine; College of Medicine; University of South Florida; Tampa, FL USA. Byrd Alzheimer's Research Institute; College of Medicine; University of South Florida; Tampa, FL USA. Institute for Biological Instrumentation; Russian Academy of Sciences; Moscow Region, Russia
| | - A Keith Dunker
- Center for Computational Biology and Bioinformatics; Indiana School of Medicine; Indianapolis, IN USA
|
46
|
Abstract
The interplay of translation and mRNA turnover has helped unveil how the regulation of gene expression is a continuum in which events that occur during the birth of a transcript in the nucleus can have profound effects on subsequent steps in the cytoplasm. Exemplifying this continuum is nonsense-mediated mRNA decay (NMD), the process wherein a premature stop codon affects both translation and mRNA decay. Studies of NMD helped lead us to the therapeutic concept of treating a subset of patients suffering from multiple genetic disorders due to nonsense mutations with a single small-molecule drug that modulates the translation termination process at a premature nonsense codon. Here we review both translation termination and NMD, and our subsequent efforts over the past 15 years that led to the identification, characterization, and clinical testing of ataluren, a new therapeutic with the potential to treat a broad range of genetic disorders due to nonsense mutations.
Affiliation(s)
- Stuart W Peltz
- PTC Therapeutics, Inc., South Plainfield, New Jersey 07080, USA.
|
47
|
|
48
|
Gagarinova A, Emili A. Genome-scale genetic manipulation methods for exploring bacterial molecular biology. Mol Biosyst 2012; 8:1626-38. [PMID: 22517266 DOI: 10.1039/c2mb25040c] [Citation(s) in RCA: 20] [Impact Index Per Article: 1.7] [Indexed: 01/08/2023]
Abstract
Bacteria are diverse and abundant, playing key roles in human health and disease, the environment, and biotechnology. Despite progress in genome sequencing and bioengineering, much remains unknown about the functional organization of prokaryotes. For instance, roughly a third of the protein-coding genes of the best-studied model bacterium, Escherichia coli, currently lack experimental annotations. Systems-level experimental approaches for investigating the functional associations of bacterial genes and genetic structures are essential for defining the fundamental molecular biology of microbes, preventing the spread of antibacterial resistance in the clinic, and driving the development of future biotechnological applications. This review highlights recently introduced large-scale genetic manipulation and screening procedures for the systematic exploration of bacterial gene functions, molecular relationships, and the global organization of bacteria at the gene, pathway, and genome levels.
Affiliation(s)
- Alla Gagarinova
- Department of Molecular Genetics, University of Toronto, Toronto, Canada
|
49
|
Okada Y, Terzaghi E, Streisinger G, Emrich J, Inouye M, Tsugita A. A frame-shift mutation involving the addition of two base pairs in the lysozyme gene of phage T4. Proc Natl Acad Sci U S A 2010; 56:1692-8. [PMID: 16591406 PMCID: PMC220157 DOI: 10.1073/pnas.56.6.1692] [Citation(s) in RCA: 26] [Impact Index Per Article: 1.9] [Indexed: 11/18/2022] Open
Affiliation(s)
- Y Okada
- Institute of Molecular Biology, University of Oregon, Eugene
|
50
|
Zamir A, Leder P, Elson D. A ribosome-catalyzed reaction between N-formylmethionyl-tRNA and puromycin. Proc Natl Acad Sci U S A 2010; 56:1794-801. [PMID: 16591422 PMCID: PMC220182 DOI: 10.1073/pnas.56.6.1794] [Citation(s) in RCA: 40] [Impact Index Per Article: 2.9] [Indexed: 11/18/2022] Open
Affiliation(s)
- A Zamir
- Biochemistry Section, The Weizmann Institute of Science, Rehovoth, Israel
|