1. Kang YJ, Li JY, Ke L, Jiang S, Yang DC, Hou M, Gao G. Quantitative model suggests both intrinsic and contextual features contribute to the transcript coding ability determination in cells. Brief Bioinform 2021; 23:6445106. [PMID: 34849565 DOI: 10.1093/bib/bbab483] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 07/12/2021] [Revised: 10/18/2021] [Accepted: 10/23/2021] [Indexed: 11/13/2022]
Abstract
Gene transcription and protein translation are two key steps of the 'central dogma.' It is still a major challenge to quantitatively deconvolute the factors contributing to the coding ability of transcripts in mammals. Here, we propose ribosome calculator (RiboCalc) for quantitatively modeling the coding ability of RNAs in the human genome. In addition to predicting the experimentally confirmed coding abundance from sequence and transcription features with high accuracy, RiboCalc provides interpretable parameters with biological information. Large-scale analysis further revealed a number of transcripts whose coding ability varies across distinct cell types (i.e. context-dependent coding transcripts), suggesting that, contrary to conventional wisdom, a transcript's coding ability should be modeled as a continuous spectrum with a context-dependent nature.
Affiliation(s)
- Yu-Jian Kang
- Biomedical Pioneering Innovation Center (BIOPIC), Beijing Advanced Innovation Center for Genomics (ICG), Center for Bioinformatics (CBI), and State Key Laboratory of Protein and Plant Gene Research at School of Life Sciences, Peking University, Beijing, 100871, China
- Jing-Yi Li
- Biomedical Pioneering Innovation Center (BIOPIC), Beijing Advanced Innovation Center for Genomics (ICG), Center for Bioinformatics (CBI), and State Key Laboratory of Protein and Plant Gene Research at School of Life Sciences, Peking University, Beijing, 100871, China
- Lan Ke
- Biomedical Pioneering Innovation Center (BIOPIC), Beijing Advanced Innovation Center for Genomics (ICG), Center for Bioinformatics (CBI), and State Key Laboratory of Protein and Plant Gene Research at School of Life Sciences, Peking University, Beijing, 100871, China
- Shuai Jiang
- Biomedical Pioneering Innovation Center (BIOPIC), Beijing Advanced Innovation Center for Genomics (ICG), Center for Bioinformatics (CBI), and State Key Laboratory of Protein and Plant Gene Research at School of Life Sciences, Peking University, Beijing, 100871, China
- De-Chang Yang
- Biomedical Pioneering Innovation Center (BIOPIC), Beijing Advanced Innovation Center for Genomics (ICG), Center for Bioinformatics (CBI), and State Key Laboratory of Protein and Plant Gene Research at School of Life Sciences, Peking University, Beijing, 100871, China
- Mei Hou
- Biomedical Pioneering Innovation Center (BIOPIC), Beijing Advanced Innovation Center for Genomics (ICG), Center for Bioinformatics (CBI), and State Key Laboratory of Protein and Plant Gene Research at School of Life Sciences, Peking University, Beijing, 100871, China
- Ge Gao
- Biomedical Pioneering Innovation Center (BIOPIC), Beijing Advanced Innovation Center for Genomics (ICG), Center for Bioinformatics (CBI), and State Key Laboratory of Protein and Plant Gene Research at School of Life Sciences, Peking University, Beijing, 100871, China
2. Volkova OA, Kondrakhin YV, Kashapov TA, Sharipov RN. Comparative analysis of protein-coding and long non-coding transcripts based on RNA sequence features. J Bioinform Comput Biol 2019; 16:1840013. [PMID: 29739305 DOI: 10.1142/s0219720018400139] [Citation(s) in RCA: 2] [Impact Index Per Article: 0.4] [Indexed: 11/18/2022]
Abstract
RNA plays an important role in intracellular life and in the organism in general. Besides the well-established protein-coding RNAs (messenger RNAs, mRNAs), long non-coding RNAs (lncRNAs) have recently gained the attention of researchers. Although lncRNAs have been classified as non-coding, some authors have reported the presence of corresponding sequences in ribosome profiling data (Ribo-seq). Ribo-seq technology is a powerful experimental tool used to characterize RNA translation in cells, with a focus on initiation (harringtonine, lactimidomycin) and elongation (cycloheximide). By exploiting translation starts obtained from Ribo-seq experiments, we developed a novel position weight matrix model for the prediction of translation starts. This model allowed us to achieve 96% accuracy in discriminating between human mRNAs and lncRNAs. When the same model was used for the prediction of putative ORFs in RNAs, we discovered that the majority of lncRNAs contained only small ORFs ([Formula: see text][Formula: see text]nt), in contrast to mRNAs.
Affiliation(s)
- Oxana A Volkova
- * Laboratory of Gene Engineering, The Federal Research Center Institute of Cytology and Genetics, The Siberian Branch of the Russian Academy of Sciences, Prosp. Acad. Lavrentyeva, 10, Novosibirsk 630090, Russia
- Yury V Kondrakhin
- † Laboratory of Bioinformatics, Institute of Computational Technologies, The Siberian Branch of the Russian Academy of Sciences, Ul. Acad. Rzhanova, 6, Novosibirsk 630090, Russia.,‡ BIOSOFT.RU, Ltd, Ul. Russkaya, 41/1 Novosibirsk 630058, Russia
- Timur A Kashapov
- ‡ BIOSOFT.RU, Ltd, Ul. Russkaya, 41/1 Novosibirsk 630058, Russia
- Ruslan N Sharipov
- ‡ BIOSOFT.RU, Ltd, Ul. Russkaya, 41/1 Novosibirsk 630058, Russia.,§ Novosibirsk State University, Ul. Pirogova, 2, Novosibirsk 630090, Russia
3. Gelfand M. Introduction to selected papers from MCCMB 2015. J Bioinform Comput Biol 2016. [DOI: 10.1142/s0219720016020030] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 11/18/2022]