1
Nawaz MA, Pamirsky IE, Golokhvast KS. Bioinformatics in Russia: history and present-day landscape. Brief Bioinform 2024; 25:bbae513. [PMID: 39402695 PMCID: PMC11473191 DOI: 10.1093/bib/bbae513] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 05/07/2024] [Revised: 08/12/2024] [Accepted: 10/01/2024] [Indexed: 10/19/2024] Open
Abstract
Bioinformatics has become an interdisciplinary subject due to its universal role in molecular biology research. The current status of bioinformatics research in Russia is not known. Here, we review the history of bioinformatics in Russia, present the current landscape, and highlight future directions and challenges. Bioinformatics research in Russia is driven by four major industries: information technology, pharmaceuticals, biotechnology, and agriculture. Over the past three decades, despite a delayed start, the field has gained momentum, especially in protein and nucleic acid research. Dedicated and shared centers for genomics, proteomics, and bioinformatics are active in different regions of Russia. Present-day bioinformatics in Russia is characterized by research issues related to genetics, metagenomics, omics, medical informatics, computational biology, environmental informatics, and structural bioinformatics. Notable developments are in the fields of software (tools, algorithms, and pipelines), use of high computation power (e.g. by the Siberian Supercomputer Center), and large-scale sequencing projects (the sequencing of 100 000 human genomes). Government funding is increasing, policies are being changed, and a National Genomic Information Database is being established. An increased focus on eukaryotic genome sequencing, the development of a common place for developers and researchers to share tools and data, and the use of biological modeling, machine learning, and biostatistics are key areas for future focus. Universities and research institutes have started to implement bioinformatics modules. A critical mass of bioinformaticians is essential to catch up with the global pace in the discipline.
Affiliation(s)
- Muhammad A Nawaz
  - Advanced Engineering School (Agrobiotek), National Research Tomsk State University, Lenin Ave, 36, Tomsk Oblast, Tomsk 634050, Russia
  - Centre for Research in the Field of Materials and Technologies, National Research Tomsk State University, Lenin Ave, 36, Tomsk Oblast, Tomsk 634050, Russia
- Igor E Pamirsky
  - Advanced Engineering School (Agrobiotek), National Research Tomsk State University, Lenin Ave, 36, Tomsk Oblast, Tomsk 634050, Russia
  - Siberian Federal Scientific Centre of Agrobiotechnology, Centralnaya st., 2b, Presidium, Krasnoobsk, 633501, Novosibirsk Oblast, Russia
- Kirill S Golokhvast
  - Advanced Engineering School (Agrobiotek), National Research Tomsk State University, Lenin Ave, 36, Tomsk Oblast, Tomsk 634050, Russia
  - Siberian Federal Scientific Centre of Agrobiotechnology, Centralnaya st., 2b, Presidium, Krasnoobsk, 633501, Novosibirsk Oblast, Russia
2
Salman A, Biziaev N, Shuvalova E, Alkalaeva E. mRNA context and translation factors determine decoding in alternative nuclear genetic codes. Bioessays 2024; 46:e2400058. [PMID: 38724251 DOI: 10.1002/bies.202400058] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 03/14/2024] [Revised: 04/19/2024] [Accepted: 04/23/2024] [Indexed: 06/27/2024]
Abstract
The genetic code is a set of instructions that determine how the information in our genetic material is translated into amino acids. In general, it is universal for all organisms, from viruses and bacteria to humans. However, in the last few decades, exceptions to this rule have been identified in both prokaryotes and eukaryotes. In this review, we discuss the 16 described alternative eukaryotic nuclear genetic codes and survey theories of their evolutionary origin. We consider possible molecular mechanisms that allow codon reassignment. Most reassignments in nuclear genetic codes are observed for stop codons. Moreover, in several organisms, stop codons can simultaneously encode amino acids and serve as termination signals. In this case, the meaning of the codon is determined by additional factors beyond the triplet itself. A comprehensive review of various non-standard coding events in nuclear genomes provides new insight into the translation mechanism in eukaryotes.
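To make the idea of stop codon reassignment concrete, the following is a minimal sketch, not taken from the cited review. It translates one made-up mRNA with the standard code and with a variant in which UAA and UAG (together written UAR) are read as glutamine, the reassignment listed in NCBI translation table 6. The names `STANDARD`, `UAR_TO_GLN`, `translate`, and the example sequence are hypothetical, and the sketch deliberately ignores the context-dependent decoding the abstract describes.

```python
# Illustrative sketch only: all names and the example mRNA are hypothetical.
BASES = "TCAG"
# Standard genetic code, codons enumerated in TCAG order ("*" marks a stop).
AMINO = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODONS = [a + b + c for a in BASES for b in BASES for c in BASES]
STANDARD = dict(zip(CODONS, AMINO))

# Variant code: UAA and UAG (UAR) are read as glutamine; UGA remains a stop.
UAR_TO_GLN = dict(STANDARD, TAA="Q", TAG="Q")


def translate(mrna, table):
    """Translate in frame 0 until the first codon the table marks as a stop."""
    seq = mrna.upper().replace("U", "T")
    protein = []
    for i in range(0, len(seq) - 2, 3):
        amino_acid = table[seq[i:i + 3]]
        if amino_acid == "*":
            break
        protein.append(amino_acid)
    return "".join(protein)


# Codons: AUG GCU UAA GGA UAG AAA UGA
MRNA = "AUGGCUUAAGGAUAGAAAUGA"
print(translate(MRNA, STANDARD))    # terminates at the first UAA
print(translate(MRNA, UAR_TO_GLN))  # reads through UAA and UAG, stops at UGA
```

A fixed lookup table like this cannot represent codes in which the same triplet is both sense and stop; modeling that would require extra inputs such as mRNA context, which is the point the abstract makes.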
Affiliation(s)
- Ali Salman
  - Engelhardt Institute of Molecular Biology, the Russian Academy of Sciences, Moscow, Russia
- Nikita Biziaev
  - Engelhardt Institute of Molecular Biology, the Russian Academy of Sciences, Moscow, Russia
- Ekaterina Shuvalova
  - Engelhardt Institute of Molecular Biology, the Russian Academy of Sciences, Moscow, Russia
- Elena Alkalaeva
  - Engelhardt Institute of Molecular Biology, the Russian Academy of Sciences, Moscow, Russia
3
Stefanov BA, Ajuh E, Allen S, Nowacki M. Eukaryotic release factor 1 from Euplotes promotes frameshifting at premature stop codons in human cells. iScience 2024; 27:109413. [PMID: 38510117 PMCID: PMC10952039 DOI: 10.1016/j.isci.2024.109413] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 12/07/2023] [Revised: 01/23/2024] [Accepted: 02/29/2024] [Indexed: 03/22/2024] Open
Abstract
Human physiology is highly susceptible to frameshift mutations within coding regions, and many hereditary diseases and cancers are caused by such indels. Presently, therapeutic options to counteract them are limited and, in the case of direct genome editing, risky. Here, we show that eukaryotic release factor 1 (eRF1) from Euplotes, an aquatic protist known for frequent +1 frameshifts in its coding regions, can enhance +1 ribosomal frameshifting at slippery heptameric sequences in human cells without an apparent requirement for an mRNA secondary structure. We further show an increase in frameshifting rate at the premature termination sequence found in the HEXA gene of Tay-Sachs disease patients, as well as in a breast cancer cell line that harbors a tumor-driving frameshift mutation in GATA3. Although the overall increase in frameshifting would need further improvement for clinical applications, our results underscore the potential of exogenous factors, such as Eu eRF1, to increase frameshifting in human cells.
Affiliation(s)
- Elvis Ajuh
  - Institute of Cell Biology, University of Bern, Baltzerstrasse 4, 3012 Bern, Switzerland
- Sarah Allen
  - Institute of Cell Biology, University of Bern, Baltzerstrasse 4, 3012 Bern, Switzerland
- Mariusz Nowacki
  - Institute of Cell Biology, University of Bern, Baltzerstrasse 4, 3012 Bern, Switzerland
4
Rotterová J, Pánek T, Salomaki ED, Kotyk M, Táborský P, Kolísko M, Čepička I. Single cell transcriptomics reveals UAR codon reassignment in Palmarella salina (Metopida, Armophorea) and confirms Armophorida belongs to APM clade. Mol Phylogenet Evol 2024; 191:107991. [PMID: 38092322 DOI: 10.1016/j.ympev.2023.107991] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 09/09/2023] [Revised: 12/04/2023] [Accepted: 12/09/2023] [Indexed: 12/17/2023]
Abstract
Anaerobes have emerged in several major lineages of ciliates, but the number of independent transitions to anaerobiosis among ciliates is unknown. The APM clade (Armophorea, Muranotrichea, Parablepharismea) represents the largest clade of obligate anaerobes among ciliates and contains free-living marine and freshwater representatives as well as gut endobionts of animals. The evolution of the APM group has only recently started getting attention, and our knowledge of its phylogeny and genetics is still limited to a fraction of taxa. While ciliates portray a wide array of alternatives to the standard genetic code across numerous classes, the APM ciliates were considered to be the largest group using exclusively the standard nuclear genetic code. In this study, we present a pan-ciliate phylogenomic analysis with emphasis on the APM clade, bringing the first phylogenomic analysis of the family Tropidoatractidae (Armophorea) and confirming the position of Armophorida within Armophorea. We include five newly sequenced single cell transcriptomes from marine, freshwater, and endobiotic APM ciliates: Palmarella salina, Anteclevelandella constricta, Nyctotherus sp., Caenomorpha medusula, and Thigmothrix strigosa. We report the first discovery of an alternative nuclear genetic code among APM ciliates, used by Palmarella salina (Tropidoatractidae, Armophorea), but not by its close relative, Tropidoatractus sp., and provide a comparative analysis of stop codon identity and frequency indicating that UAG codon loss/reassignment preceded UAA codon reassignment in the specific ancestor of Palmarella. Comparative genomic and proteomic studies of this group may help explain the constraints that underlie UAR stop-to-sense reassignment, the most frequent type of alternative nuclear genetic code, not only in ciliates, but in eukaryotes in general.
Affiliation(s)
- Johana Rotterová
  - Department of Zoology, Faculty of Science, Charles University, Prague 128 00, Czech Republic; Department of Marine Sciences, University of Puerto Rico Mayagüez, Mayagüez, PR, USA
- Tomáš Pánek
  - Department of Zoology, Faculty of Science, Charles University, Prague 128 00, Czech Republic
- Eric D Salomaki
  - Institute of Parasitology, Biology Centre Czech Academy of Sciences, České Budějovice 370 05, Czech Republic; Center for Computational Biology of Human Disease and Center for Computation and Visualization, Brown University, Providence, Rhode Island, USA
- Michael Kotyk
  - Department of Zoology, Faculty of Science, Charles University, Prague 128 00, Czech Republic
- Petr Táborský
  - Department of Zoology, Faculty of Science, Charles University, Prague 128 00, Czech Republic
- Martin Kolísko
  - Institute of Parasitology, Biology Centre Czech Academy of Sciences, České Budějovice 370 05, Czech Republic
- Ivan Čepička
  - Department of Zoology, Faculty of Science, Charles University, Prague 128 00, Czech Republic
5
Gao X, Chen K, Xiong J, Zou D, Yang F, Ma Y, Jiang C, Gao X, Wang G, Gu S, Zhang P, Luo S, Huang K, Bao Y, Zhang Z, Ma L, Miao W. The P10K database: a data portal for the protist 10 000 genomes project. Nucleic Acids Res 2024; 52:D747-D755. [PMID: 37930867 PMCID: PMC10767852 DOI: 10.1093/nar/gkad992] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 08/15/2023] [Revised: 10/03/2023] [Accepted: 10/17/2023] [Indexed: 11/08/2023] Open
Abstract
Protists, a highly diverse group of microscopic eukaryotic organisms distinct from fungi, animals and plants, exert crucial roles within the earth's biosphere. However, the genomes of only a small fraction of known protist species have been published and made publicly accessible. To address this constraint, the Protist 10 000 Genomes Project (P10K) was initiated, implementing a specialized pipeline for single-cell genome/transcriptome assembly, decontamination and annotation of protists. The resultant P10K database (https://ngdc.cncb.ac.cn/p10k/) serves as a comprehensive platform, collating and disseminating genome sequences and annotations from diverse protist groups. Currently, the P10K database has incorporated 2959 genomes and transcriptomes, including 1101 newly sequenced datasets by P10K and 1858 publicly available datasets. Notably, it covers 45% of the protist orders, with a significant representation (53% coverage) of ciliates, featuring nearly a thousand genomes/transcriptomes. Intriguingly, analysis of the unique codon table usage among ciliates has revealed differences compared to the NCBI taxonomy system, suggesting a need to revise the codon tables used for these species. Collectively, the P10K database serves as a valuable repository of genetic resources for protist research and aims to expand its collection by incorporating more sequenced data and advanced analysis tools to benefit protist studies worldwide.
Affiliation(s)
- Xinxin Gao
  - Institute of Hydrobiology, Chinese Academy of Sciences, Wuhan 430072, China
  - University of Chinese Academy of Sciences, Beijing 100049, China
- Kai Chen
  - Institute of Hydrobiology, Chinese Academy of Sciences, Wuhan 430072, China
- Jie Xiong
  - Institute of Hydrobiology, Chinese Academy of Sciences, Wuhan 430072, China
  - Key Laboratory of Breeding Biotechnology and Sustainable Aquaculture, Chinese Academy of Sciences, Wuhan 430072, China
- Dong Zou
  - China National Center for Bioinformation, Beijing 100101, China
  - National Genomics Data Center & CAS Key Laboratory of Genome Sciences and Information, Beijing Institute of Genomics, Chinese Academy of Sciences, Beijing 100101, China
- Fangdian Yang
  - Institute of Hydrobiology, Chinese Academy of Sciences, Wuhan 430072, China
- Yingke Ma
  - China National Center for Bioinformation, Beijing 100101, China
  - National Genomics Data Center & CAS Key Laboratory of Genome Sciences and Information, Beijing Institute of Genomics, Chinese Academy of Sciences, Beijing 100101, China
- Chuanqi Jiang
  - Institute of Hydrobiology, Chinese Academy of Sciences, Wuhan 430072, China
- Xiaoxuan Gao
  - Shandong University of Technology, Zibo 255000, China
- Guangying Wang
  - Institute of Hydrobiology, Chinese Academy of Sciences, Wuhan 430072, China
- Siyu Gu
  - Institute of Hydrobiology, Chinese Academy of Sciences, Wuhan 430072, China
  - University of Chinese Academy of Sciences, Beijing 100049, China
- Peng Zhang
  - Institute of Hydrobiology, Chinese Academy of Sciences, Wuhan 430072, China
- Shuai Luo
  - Institute of Hydrobiology, Chinese Academy of Sciences, Wuhan 430072, China
- Kaiyao Huang
  - Institute of Hydrobiology, Chinese Academy of Sciences, Wuhan 430072, China
  - Key laboratory of Lake and Watershed Science for Water Security, Chinese Academy of Sciences, Nanjing 210008, China
- Yiming Bao
  - University of Chinese Academy of Sciences, Beijing 100049, China
  - China National Center for Bioinformation, Beijing 100101, China
  - National Genomics Data Center & CAS Key Laboratory of Genome Sciences and Information, Beijing Institute of Genomics, Chinese Academy of Sciences, Beijing 100101, China
- Zhang Zhang
  - University of Chinese Academy of Sciences, Beijing 100049, China
  - China National Center for Bioinformation, Beijing 100101, China
  - National Genomics Data Center & CAS Key Laboratory of Genome Sciences and Information, Beijing Institute of Genomics, Chinese Academy of Sciences, Beijing 100101, China
- Lina Ma
  - University of Chinese Academy of Sciences, Beijing 100049, China
  - China National Center for Bioinformation, Beijing 100101, China
  - National Genomics Data Center & CAS Key Laboratory of Genome Sciences and Information, Beijing Institute of Genomics, Chinese Academy of Sciences, Beijing 100101, China
- Wei Miao
  - Institute of Hydrobiology, Chinese Academy of Sciences, Wuhan 430072, China
  - Key laboratory of Lake and Watershed Science for Water Security, Chinese Academy of Sciences, Nanjing 210008, China
  - Hubei Hongshan Laboratory, Wuhan 430070, China
6
Xiao Y, Li J, Wang R, Fan Y, Han X, Fu Y, Alepuz P, Wang W, Liang A. eIF5A promotes +1 programmed ribosomal frameshifting in Euplotes octocarinatus. Int J Biol Macromol 2024; 254:127743. [PMID: 38287569 DOI: 10.1016/j.ijbiomac.2023.127743] [Citation(s) in RCA: 1] [Impact Index Per Article: 1.0] [Received: 06/29/2023] [Revised: 10/25/2023] [Accepted: 10/26/2023] [Indexed: 01/31/2024]
Abstract
Programmed ribosomal frameshifting (PRF) exists in all branches of life and regulates gene expression at the translational level. The single-celled eukaryote Euplotes exhibits a high frequency of PRF. However, the molecular mechanism of modulating Euplotes PRF remains largely unknown. Here, we identified two novel eIF5A genes, eIF5A1 and eIF5A2, in Euplotes octocarinatus and found that the Eo-eIF5A2 gene requires a -1 PRF to produce a complete protein product. Although both Eo-eIF5As showed significant structural similarity with yeast eIF5A, neither of them could functionally replace yeast eIF5A. Eo-eIF5A knockdown inhibited +1 PRF of the η-tubulin gene. Using an in vitro reconstituted translation system, we found that hypusinated Eo-eIF5A (Eo-eIF5AH) can promote +1 PRF at the canonical AAA_UAA frameshifting site of Euplotes. The results showed that eIF5A is a novel trans-regulator of PRF in Euplotes and has an evolutionarily conserved role in regulating +1 PRF in eukaryotes.
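As a rough illustration of what a +1 frameshift at an AAA_UAA site does to the reading frame, here is a minimal sketch that is not the authors' method. It assumes, purely for illustration, that the ribosome shifts with 100% efficiency whenever an in-frame AAA is followed by UAA, by skipping one nucleotide and continuing in the new frame. The function name `translate`, the flag `allow_plus1`, and the example mRNA are hypothetical.

```python
# Illustrative sketch only: a toy model, with hypothetical names and sequence.
BASES = "TCAG"
# Standard genetic code, codons enumerated in TCAG order ("*" marks a stop).
AMINO = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODONS = [a + b + c for a in BASES for b in BASES for c in BASES]
STANDARD = dict(zip(CODONS, AMINO))


def translate(mrna, allow_plus1=False):
    """Translate from position 0; optionally shift +1 at every AAA_UAA site.

    Assumption: the shift always happens when allowed. Real frameshifting
    efficiency is below 100% and depends on trans-acting factors.
    """
    seq = mrna.upper().replace("U", "T")
    protein = []
    i = 0
    while i + 3 <= len(seq):
        codon = seq[i:i + 3]
        amino_acid = STANDARD[codon]
        if amino_acid == "*":
            break
        protein.append(amino_acid)
        i += 3
        # At AAA followed by UAA, skip one nucleotide to enter the +1 frame.
        if allow_plus1 and codon == "AAA" and seq[i:i + 3] == "TAA":
            i += 1
    return "".join(protein)


# Frame 0: AUG AAA UAA ...        -> terminates at UAA
# +1 frame after AAA: (U) AAG CUU UAG -> continues until UAG
MRNA = "AUGAAAUAAGCUUUAG"
print(translate(MRNA))                    # no frameshift
print(translate(MRNA, allow_plus1=True))  # with the toy +1 frameshift
```

The all-or-nothing switch is the main simplification; a more faithful model would treat the shift as a probability that factors such as eIF5A could raise or lower.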
Affiliation(s)
- Yu Xiao
  - Key Laboratory of Chemical Biology and Molecular Engineering of Ministry of Education, Institute of Biotechnology, Shanxi University, Taiyuan 030006, China
- Jia Li
  - Key Laboratory of Chemical Biology and Molecular Engineering of Ministry of Education, Institute of Biotechnology, Shanxi University, Taiyuan 030006, China
- Ruanlin Wang
  - Key Laboratory of Chemical Biology and Molecular Engineering of Ministry of Education, Institute of Biotechnology, Shanxi University, Taiyuan 030006, China
- Yajiao Fan
  - Key Laboratory of Chemical Biology and Molecular Engineering of Ministry of Education, Institute of Biotechnology, Shanxi University, Taiyuan 030006, China
- Xiaxia Han
  - Key Laboratory of Chemical Biology and Molecular Engineering of Ministry of Education, Institute of Biotechnology, Shanxi University, Taiyuan 030006, China
- Yuejun Fu
  - Key Laboratory of Chemical Biology and Molecular Engineering of Ministry of Education, Institute of Biotechnology, Shanxi University, Taiyuan 030006, China
- Paula Alepuz
  - Instituto de Biotecnología y Biomedicina (Biotecmed) and Departamento de Bioquímica y Biología Molecular, Universitat de València, Spain
- Wei Wang
  - Key Laboratory of Chemical Biology and Molecular Engineering of Ministry of Education, Institute of Biotechnology, Shanxi University, Taiyuan 030006, China
- Aihua Liang
  - Key Laboratory of Chemical Biology and Molecular Engineering of Ministry of Education, Institute of Biotechnology, Shanxi University, Taiyuan 030006, China
7
Antonov IV, O’Loughlin S, Gorohovski AN, O’Connor PB, Baranov PV, Atkins JF. Streptomyces rare codon UUA: from features associated with 2 adpA related locations to candidate phage regulatory translational bypassing. RNA Biol 2023; 20:926-942. [PMID: 37968863 PMCID: PMC10732093 DOI: 10.1080/15476286.2023.2270812] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 11/17/2022] [Accepted: 10/02/2023] [Indexed: 11/17/2023] Open
Abstract
In Streptomyces species, the cell cycle involves a switch from an early and vegetative state to a later phase where secondary products including antibiotics are synthesized, aerial hyphae form and sporulation occurs. AdpA, which has two domains, activates the expression of numerous genes involved in the switch from the vegetative growth phase. The adpA mRNA of many Streptomyces species has a UUA codon in a linker region between the 5' sequence encoding one domain and the 3' sequence encoding the other, C-terminal domain. UUA codons are exceptionally rare in Streptomyces, and their functional cognate tRNA is not present in a fully modified and acylated form in the early and vegetative phase of the cell cycle, though it is aminoacylated later. Here, we report candidate recoding signals that may influence decoding of the linker region UUA. Additionally, a short ORF 5' of the main ORF has been identified with a GUG at, or near, its 5' end and an in-frame UUA near its 3' end. The latter is commonly 5 nucleotides 5' of the main ORF start. Ribosome profiling data show translation of that 5' region. Ten years ago, UUA-mediated translational bypassing was proposed as a sensor by a Streptomyces phage of its host's cell cycle stage and an effector of its lytic/lysogeny switch. We provide the first experimental evidence supportive of this proposal.
Affiliation(s)
- Ivan V. Antonov
  - Russian Academy of Science, Institute of Bioengineering, Research Center of Biotechnology, Moscow, Russia
  - Laboratory of Bioinformatics, Faculty of Computer Science, National Research University Higher School of Economics, Moscow, Russia
  - School of Biochemistry and Cell Biology, University College Cork, Cork, Ireland
- Sinéad O’Loughlin
  - School of Biochemistry and Cell Biology, University College Cork, Cork, Ireland
- Alessandro N. Gorohovski
  - Russian Academy of Science, Institute of Bioengineering, Research Center of Biotechnology, Moscow, Russia
  - Structural Biology and BioComputing Program, Spanish National Cancer Research Centre (CNIO), Madrid, Spain
- Pavel V. Baranov
  - School of Biochemistry and Cell Biology, University College Cork, Cork, Ireland
- John F. Atkins
  - School of Biochemistry and Cell Biology, University College Cork, Cork, Ireland