1. Sutanto K, Turcotte M. Assessing Global-Local Secondary Structure Fingerprints to Classify RNA Sequences With Deep Learning. IEEE/ACM Transactions on Computational Biology and Bioinformatics 2023; 20:2736-2747. [PMID: 34633933 DOI: 10.1109/tcbb.2021.3118358] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 06/13/2023]
Abstract
RNA elements that are transcribed but not translated into proteins are called non-coding RNAs (ncRNAs). They play wide-ranging roles in biological processes and disorders. Just like proteins, their structure is often intimately linked to their function. Many examples have been documented where structure is conserved across taxa despite sequence divergence. Thus, structure is often used to identify function. Specifically, the secondary structure is predicted and ncRNAs with similar structures are assumed to have the same or similar functions. However, a strand of RNA can fold into multiple possible structures, and some strands even fold differently in vivo and in vitro. Furthermore, ncRNAs often function as RNA-protein complexes, which can affect structure. Because of this, we hypothesized that using one structure per sequence may discard information, possibly resulting in poorer classification accuracy. Therefore, we propose using secondary structure fingerprints, comprising two categories: a higher-level representation derived from RNA-As-Graphs (RAG), and free energy fingerprints based on a curated repertoire of small structural motifs. The fingerprints take into account the difference between global and local structural matches. We also evaluated our deep learning architecture with k-mers. By combining our global-local fingerprints with 6-mers, we achieved an accuracy, precision, and recall of 91.04%, 91.10%, and 91.00%, respectively.
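The abstract above mentions k-mer (6-mer) features alongside the structural fingerprints. As a rough illustration only, and not the authors' implementation, the following sketch shows one common way to turn an RNA sequence into a fixed-length vector of normalized k-mer frequencies. The function name `kmer_frequencies`, the `ACGU` alphabet, and the choice to skip k-mers containing ambiguous bases are all assumptions made for this example.

```python
from collections import Counter
from itertools import product


def kmer_frequencies(seq, k=6, alphabet="ACGU"):
    """Return normalized k-mer frequencies as a list of len(alphabet)**k floats.

    Hypothetical helper for illustration; k-mers containing characters
    outside the alphabet (for example N) are ignored.
    """
    # Treat DNA-style input as RNA so that T and U are counted together.
    seq = seq.upper().replace("T", "U")
    # Count every overlapping window of length k.
    counts = Counter(seq[i:i + k] for i in range(len(seq) - k + 1))
    # Enumerate all possible k-mers in a fixed (lexicographic) order.
    kmers = ["".join(p) for p in product(alphabet, repeat=k)]
    total = sum(counts[m] for m in kmers)
    return [counts[m] / total if total else 0.0 for m in kmers]


if __name__ == "__main__":
    vec = kmer_frequencies("ACGUACGUACGU", k=2)
    print(len(vec), round(sum(vec), 6))
```

With `k=6` the vector has 4096 entries, which could then be concatenated with other features before being passed to a classifier.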
2. Zhou S, Gu Y, Yu H, Yang X, Gao S. RUE: A Robust Personalized Cost Assignment Strategy for Class Imbalance Cost-sensitive Learning. Journal of King Saud University - Computer and Information Sciences 2023. [DOI: 10.1016/j.jksuci.2023.03.001] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 03/08/2023]
4. Zhang Y, Lin M, Yang Y, Ding C. A Hybrid Ensemble and Evolutionary Algorithm for Imbalanced Classification and its Application on Bioinformatics. Comput Biol Chem 2022; 98:107646. [DOI: 10.1016/j.compbiolchem.2022.107646] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 11/17/2021] [Revised: 02/15/2022] [Accepted: 02/21/2022] [Indexed: 11/03/2022]
5. Breuer R, Gomes-Filho JV, Randau L. Conservation of Archaeal C/D Box sRNA-Guided RNA Modifications. Front Microbiol 2021; 12:654029. [PMID: 33776983 PMCID: PMC7994747 DOI: 10.3389/fmicb.2021.654029] [Citation(s) in RCA: 6] [Impact Index Per Article: 2.0] [Received: 01/15/2021] [Accepted: 02/19/2021] [Indexed: 12/18/2022]
Abstract
Post-transcriptional modifications fulfill many important roles during ribosomal RNA maturation in all three domains of life. Ribose 2'-O-methylations constitute the most abundant chemical rRNA modification and are, for example, involved in RNA folding and stabilization. In archaea, these modification sites are determined by variable sets of C/D box sRNAs that guide the activity of the rRNA 2'-O-methyltransferase fibrillarin. Each C/D box sRNA contains two guide sequences that can act in coordination to bridge rRNA sequences. Here, we will review the landscape of archaeal C/D box sRNA genes and their target sites. One focus is placed on the apparent accelerated evolution of guide sequences and the varied pairing of the two individual guides, which results in different rRNA modification patterns and RNA chaperone activities.
Affiliation(s)
- Lennart Randau: Prokaryotic RNA Biology, Philipps-Universität Marburg, Marburg, Germany
6. Georgakilas GK, Grioni A, Liakos KG, Chalupova E, Plessas FC, Alexiou P. Multi-branch Convolutional Neural Network for Identification of Small Non-coding RNA genomic loci. Sci Rep 2020; 10:9486. [PMID: 32528107 PMCID: PMC7289789 DOI: 10.1038/s41598-020-66454-3] [Citation(s) in RCA: 9] [Impact Index Per Article: 2.3] [Received: 12/11/2019] [Accepted: 05/21/2020] [Indexed: 02/03/2023]
Abstract
Genomic regions that encode small RNA genes exhibit characteristic patterns in their sequence, secondary structure, and evolutionary conservation. Convolutional Neural Networks are a family of algorithms that can classify data based on learned patterns. Here we present MuStARD, an application of Convolutional Neural Networks that can learn patterns associated with user-defined sets of genomic regions, and scan large genomic areas for novel regions exhibiting similar characteristics. We demonstrate that MuStARD is a generic method that can be trained on different classes of human small RNA genomic loci, without the need for domain-specific knowledge, due to the automated feature and background selection processes built into the model. We also demonstrate the ability of MuStARD to identify functional elements across species by predicting mouse small RNAs (pre-miRNAs and snoRNAs) using models trained on the human genome. MuStARD can be used to filter small RNA-Seq datasets for identification of novel small RNA loci, intra- and inter-species, as demonstrated in three use cases of human, mouse, and fly pre-miRNA prediction. MuStARD is easy to deploy and extend to a variety of genomic classification questions. Code and trained models are freely available at gitlab.com/RBP_Bioinformatics/mustard.
Affiliation(s)
- Andrea Grioni: Central European Institute of Technology, Brno, Czech Republic
- Konstantinos G Liakos: Department of Electrical and Computer Engineering, School of Engineering, University of Thessaly, Volos, Greece
- Eliska Chalupova: Faculty of Science, National Centre for Biomolecular Research, Masaryk University, Brno, Czech Republic
- Fotis C Plessas: Department of Electrical and Computer Engineering, School of Engineering, University of Thessaly, Volos, Greece
7. Bratkovič T, Božič J, Rogelj B. Functional diversity of small nucleolar RNAs. Nucleic Acids Res 2020; 48:1627-1651. [PMID: 31828325 PMCID: PMC7038934 DOI: 10.1093/nar/gkz1140] [Citation(s) in RCA: 132] [Impact Index Per Article: 33.0] [Received: 10/04/2019] [Revised: 11/17/2019] [Accepted: 12/05/2019] [Indexed: 12/22/2022]
Abstract
Small nucleolar RNAs (snoRNAs) are short non-protein-coding RNAs with a long-recognized role in tuning ribosomal and spliceosomal function by guiding ribose methylation and pseudouridylation at targeted nucleotide residues of ribosomal and small nuclear RNAs, respectively. SnoRNAs are increasingly being implicated in regulation of new types of post-transcriptional processes, for example rRNA acetylation, modulation of splicing patterns, control of mRNA abundance and translational efficiency, or they themselves are processed to shorter stable RNA species that seem to be the principal or alternative bioactive isoform. Intriguingly, some display unusual cellular localization under exogenous stimuli, or tissue-specific distribution. Here, we discuss the new and unforeseen roles attributed to snoRNAs, focusing on the presumed mechanisms of action. Furthermore, we review the experimental approaches to study snoRNA function, including high resolution RNA:protein and RNA:RNA interaction mapping, techniques for analyzing modifications on targeted RNAs, and cellular and animal models used in snoRNA biology research.
Affiliation(s)
- Tomaž Bratkovič: University of Ljubljana, Faculty of Pharmacy, Aškerčeva cesta 7, SI1000 Ljubljana, Slovenia
- Janja Božič: Jozef Stefan Institute, Department of Biotechnology, Jamova cesta 39, SI1000 Ljubljana, Slovenia; Biomedical Research Institute BRIS, Puhova ulica 10, SI1000 Ljubljana, Slovenia
- Boris Rogelj: University of Ljubljana, Faculty of Pharmacy, Aškerčeva cesta 7, SI1000 Ljubljana, Slovenia; Jozef Stefan Institute, Department of Biotechnology, Jamova cesta 39, SI1000 Ljubljana, Slovenia; Biomedical Research Institute BRIS, Puhova ulica 10, SI1000 Ljubljana, Slovenia; University of Ljubljana, Faculty of Chemistry and Chemical Technology, Večna pot 113, SI1000 Ljubljana, Slovenia
8. Wang HT, Xiao FH, Li GH, Kong QP. Identification of DNA N6-methyladenine sites by integration of sequence features. Epigenetics Chromatin 2020; 13:8. [PMID: 32093759 PMCID: PMC7038560 DOI: 10.1186/s13072-020-00330-2] [Citation(s) in RCA: 10] [Impact Index Per Article: 2.5] [Received: 10/14/2019] [Accepted: 02/03/2020] [Indexed: 02/21/2023]
Abstract
Background: An increasing number of nucleic acid modifications have been profiled with the development of sequencing technologies. DNA N6-methyladenine (6mA), which is a prevalent epigenetic modification, plays important roles in a series of biological processes. So far, identification of DNA 6mA relies primarily on time-consuming and expensive experimental approaches. However, in silico methods can be implemented to conduct preliminary screening to save experimental resources and time, especially given the rapid accumulation of sequencing data.
Results: In this study, we constructed a 6mA predictor, p6mA, from a series of sequence-based features, including physicochemical properties, position-specific triple-nucleotide propensity (PSTNP), and electron–ion interaction pseudopotential (EIIP). We performed maximum relevance maximum distance (MRMD) analysis to select key features and used the Extreme Gradient Boosting (XGBoost) algorithm to build our predictor. Results demonstrated that p6mA outperformed other existing predictors using different datasets.
Conclusions: p6mA can predict the methylation status of DNA adenines, using only sequence files. It may be used as a tool to help the study of 6mA distribution patterns. Users can download it from https://github.com/Konglab404/p6mA.
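One of the sequence-based features named in the abstract above is the electron-ion interaction pseudopotential (EIIP). As a minimal sketch, not p6mA's actual feature code, the snippet below maps each DNA base to its commonly tabulated EIIP value. The function names `eiip_encode` and `mean_eiip`, and the choice to map unknown bases to 0.0, are assumptions made for this illustration.

```python
# Commonly tabulated EIIP values for the four DNA nucleotides.
EIIP = {"A": 0.1260, "C": 0.1340, "G": 0.0806, "T": 0.1335}


def eiip_encode(seq):
    """Map each base to its EIIP value (hypothetical helper).

    Bases outside A/C/G/T (for example N) are mapped to 0.0, which is an
    assumption of this sketch rather than an established convention.
    """
    return [EIIP.get(base, 0.0) for base in seq.upper()]


def mean_eiip(seq):
    """Average EIIP over the sequence; returns 0.0 for an empty sequence."""
    values = eiip_encode(seq)
    return sum(values) / len(values) if values else 0.0


if __name__ == "__main__":
    window = "GATCAGTC"
    print(eiip_encode(window))
    print(mean_eiip(window))
```

A numeric encoding like this could serve as one block of a larger feature vector that is then filtered by a feature-selection step before training a gradient-boosted classifier.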
Affiliation(s)
- Hao-Tian Wang: State Key Laboratory of Genetic Resources and Evolution/Key Laboratory of Healthy Aging Research of Yunnan Province, Kunming Institute of Zoology, Chinese Academy of Sciences, Kunming, 650223, China; Center for Excellence in Animal Evolution and Genetics, Chinese Academy of Sciences, Kunming, 650223, China; Kunming Key Laboratory of Healthy Aging Study, Kunming, 650223, China; Kunming College of Life Science, University of Chinese Academy of Sciences, Beijing, 100049, China
- Fu-Hui Xiao: State Key Laboratory of Genetic Resources and Evolution/Key Laboratory of Healthy Aging Research of Yunnan Province, Kunming Institute of Zoology, Chinese Academy of Sciences, Kunming, 650223, China; Center for Excellence in Animal Evolution and Genetics, Chinese Academy of Sciences, Kunming, 650223, China; Kunming Key Laboratory of Healthy Aging Study, Kunming, 650223, China
- Gong-Hua Li: State Key Laboratory of Genetic Resources and Evolution/Key Laboratory of Healthy Aging Research of Yunnan Province, Kunming Institute of Zoology, Chinese Academy of Sciences, Kunming, 650223, China; Center for Excellence in Animal Evolution and Genetics, Chinese Academy of Sciences, Kunming, 650223, China; Kunming Key Laboratory of Healthy Aging Study, Kunming, 650223, China
- Qing-Peng Kong: State Key Laboratory of Genetic Resources and Evolution/Key Laboratory of Healthy Aging Research of Yunnan Province, Kunming Institute of Zoology, Chinese Academy of Sciences, Kunming, 650223, China; Center for Excellence in Animal Evolution and Genetics, Chinese Academy of Sciences, Kunming, 650223, China; Kunming Key Laboratory of Healthy Aging Study, Kunming, 650223, China; KIZ/CUHK Joint Laboratory of Bioresources and Molecular Research in Common Diseases, Kunming, 650223, China
9. Dong YM, Bi JH, He QE, Song K. ESDA: An Improved Approach to Accurately Identify Human snoRNAs for Precision Cancer Therapy. Curr Bioinform 2020. [DOI: 10.2174/1574893614666190424162230] [Citation(s) in RCA: 7] [Impact Index Per Article: 1.8] [Indexed: 01/14/2023]
Abstract
Background: SnoRNAs (small nucleolar RNAs) are small RNA molecules approximately 60-300 nucleotides in length. They have been shown to play important roles in cancer occurrence and progression. It is of great clinical importance to identify new snoRNAs as quickly and accurately as possible.
Objective: A novel algorithm, ESDA (Elastically Sparse Partial Least Squares Discriminant Analysis), was proposed to improve the speed and performance of distinguishing snoRNAs from other RNAs in human genomes.
Methods: In the ESDA algorithm, to optimize the extracted information, kernel features were selected from the variables extracted from both primary sequences and secondary structures. They were then used by the SPLSDA (sparse partial least squares discriminant analysis) algorithm as input variables for training the final classification model to distinguish snoRNA sequences from other human RNAs. Because no prior biological knowledge is required to optimize the classification model, ESDA is a very practical method, especially for completely new sequences.
Results: 89 human H/ACA snoRNAs and 269 human C/D snoRNAs were used as positive samples and 3403 non-snoRNAs as negative samples to test the identification performance of the proposed ESDA. For H/ACA snoRNA identification, the sensitivity and specificity were as high as 99.6% and 98.8%, respectively. For C/D snoRNAs, they were 96.1% and 98.3%, respectively. Furthermore, we compared ESDA with other widely used algorithms and classifiers: SnoReport, RF (Random Forest), DWD (Distance Weighted Discrimination) and SVM (Support Vector Machine). The highest improvement in accuracy obtained by ESDA was 25.1%.
Conclusion: The results strongly demonstrate the superior performance of ESDA and make it promising for identifying snoRNAs in the further development of precision medicine for cancers.
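The abstract above reports sensitivity and specificity for a binary snoRNA classifier. For readers less familiar with these metrics, the following sketch computes both from confusion-matrix counts. The function name and the counts in the usage example are hypothetical and are not taken from the ESDA study.

```python
def sensitivity_specificity(tp, fn, tn, fp):
    """Return (sensitivity, specificity) from confusion-matrix counts.

    sensitivity = TP / (TP + FN): fraction of true positives recovered.
    specificity = TN / (TN + FP): fraction of true negatives rejected.
    A metric is reported as 0.0 when its denominator is zero.
    """
    sensitivity = tp / (tp + fn) if (tp + fn) else 0.0
    specificity = tn / (tn + fp) if (tn + fp) else 0.0
    return sensitivity, specificity


if __name__ == "__main__":
    # Hypothetical counts, chosen only to illustrate the arithmetic.
    sens, spec = sensitivity_specificity(tp=90, fn=10, tn=950, fp=50)
    print(f"sensitivity={sens:.3f} specificity={spec:.3f}")
```

Reporting both values matters for data sets like this one, where negatives greatly outnumber positives and accuracy alone can look high even when few positives are found.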
Affiliation(s)
- Yan-mei Dong: School of Chemical Engineering & Technology, Tianjin University, 300072 Tianjin, China
- Jia-hao Bi: School of Chemical Engineering & Technology, Tianjin University, 300072 Tianjin, China
- Qi-en He: School of Chemical Engineering & Technology, Tianjin University, 300072 Tianjin, China
- Kai Song: School of Chemical Engineering & Technology, Tianjin University, 300072 Tianjin, China
10. Deryusheva S, Talhouarne GJS, Gall JG. "Lost and Found": snoRNA Annotation in the Xenopus Genome and Implications for Evolutionary Studies. Mol Biol Evol 2020; 37:149-166. [PMID: 31553476 PMCID: PMC6984369 DOI: 10.1093/molbev/msz209] [Citation(s) in RCA: 10] [Impact Index Per Article: 2.5] [Indexed: 12/16/2022]
Abstract
Small nucleolar RNAs (snoRNAs) function primarily as guide RNAs for posttranscriptional modification of rRNAs and spliceosomal snRNAs, both of which are functionally important and evolutionarily conserved molecules. It is commonly believed that snoRNAs and the modifications they mediate are highly conserved across species. However, most relevant data on snoRNA annotation and RNA modification are limited to studies on human and yeast. Here, we used RNA-sequencing data from the giant oocyte nucleus of the frog Xenopus tropicalis to annotate a nearly complete set of snoRNAs. We compared the frog data with snoRNA sets from human and other vertebrate genomes, including mammals, birds, reptiles, and fish. We identified many Xenopus-specific (or nonhuman) snoRNAs and Xenopus-specific domains in snoRNAs from conserved RNA families. We predicted that some of these nonhuman snoRNAs and domains mediate modifications at unexpected positions in rRNAs and snRNAs. These modifications were mapped as predicted when RNA modification assays were applied to RNA from nine vertebrate species: frogs X. tropicalis and X. laevis, newt Notophthalmus viridescens, axolotl Ambystoma mexicanum, whiptail lizard Aspidoscelis neomexicana, zebrafish Danio rerio, chicken, mouse, and human. This analysis revealed that only a subset of RNA modifications is evolutionarily conserved and that modification patterns may vary even between closely related species. We speculate that each functional domain in snoRNAs (half of an snoRNA) may evolve independently and shuffle between different snoRNAs.
Affiliation(s)
- Joseph G Gall: Department of Embryology, Carnegie Institution for Science, Baltimore, MD
11. Dual-initiation promoters with intertwined canonical and TCT/TOP transcription start sites diversify transcript processing. Nat Commun 2020; 11:168. [PMID: 31924754 PMCID: PMC6954239 DOI: 10.1038/s41467-019-13687-0] [Citation(s) in RCA: 21] [Impact Index Per Article: 5.3] [Received: 11/22/2018] [Accepted: 11/19/2019] [Indexed: 12/26/2022]
Abstract
Variations in transcription start site (TSS) selection reflect diversity of preinitiation complexes and can impact on post-transcriptional RNA fates. Most metazoan polymerase II-transcribed genes carry canonical initiation with pyrimidine/purine (YR) dinucleotide, while translation machinery-associated genes carry polypyrimidine initiator (5’-TOP or TCT). By addressing the developmental regulation of TSS selection in zebrafish we uncovered a class of dual-initiation promoters in thousands of genes, including snoRNA host genes. 5’-TOP/TCT initiation is intertwined with canonical initiation and used divergently in hundreds of dual-initiation promoters during maternal to zygotic transition. Dual-initiation in snoRNA host genes selectively generates host and snoRNA with often different spatio-temporal expression. Dual-initiation promoters are pervasive in human and fruit fly, reflecting evolutionary conservation. We propose that dual-initiation on shared promoters represents a composite promoter architecture, which can function both coordinately and divergently to diversify RNAs. The functional significance of start site choice in promoter architectures is little understood. Here the authors identify in zebrafish development and mammalian cells a class of dual-initiation promoters, in which non-canonical YC dinucleotides reflecting 5’-TOP/TCT initiation are intertwined with canonical YR-initiation.
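The abstract above contrasts canonical pyrimidine/purine (YR) initiation with non-canonical YC initiation. As a small illustration of that distinction, and not the authors' analysis pipeline, the sketch below labels a two-base initiator. It assumes the dinucleotide is given as the bases at positions -1 and +1 around the transcription start site; that convention and the function name `classify_initiator` are assumptions of this example.

```python
PYRIMIDINES = set("CT")  # IUPAC code Y
PURINES = set("AG")      # IUPAC code R


def classify_initiator(dinucleotide):
    """Label a (-1, +1) dinucleotide as 'YR', 'YC', or 'other'.

    Hypothetical helper: 'YR' is a pyrimidine followed by a purine,
    'YC' is a pyrimidine followed by cytosine.
    """
    d = dinucleotide.upper().replace("U", "T")
    if len(d) != 2 or d[0] not in PYRIMIDINES:
        return "other"
    if d[1] in PURINES:
        return "YR"
    if d[1] == "C":
        return "YC"
    return "other"


if __name__ == "__main__":
    for dinuc in ("CA", "TG", "TC", "CC", "AG", "TT"):
        print(dinuc, classify_initiator(dinuc))
```

Counting these labels across the start sites of one promoter would be one simple way to see whether both initiation types occur together.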
12. Bahrami AA, Payandeh Z, Khalili S, Zakeri A, Bandehpour M. Immunoinformatics: In Silico Approaches and Computational Design of a Multi-epitope, Immunogenic Protein. Int Rev Immunol 2019; 38:307-322. [PMID: 31478759 DOI: 10.1080/08830185.2019.1657426] [Citation(s) in RCA: 60] [Impact Index Per Article: 12.0] [Indexed: 10/26/2022]
Abstract
Immunoinformatics is a new and critical field with several tools and databases that guide experimental selection, facilitate analysis of the large amount of immunologic data obtained from experimental research, and help to design and introduce new hypotheses. Given these features, immunoinformatics appears to be a way to develop and advance immunological research. Bioinformatics methods and applications are successfully employed in vaccine informatics to assist different sites of the preclinical, clinical, and post-licensure vaccine enterprises. Moreover, with the progress of molecular biology and immunology, epitope vaccines have become the focus of research on molecular vaccines. In addition, reverse vaccinology could improve vaccine production and vaccination protocols by in silico prediction of protein-vaccine candidates from genome sequences. B- and T-cell immune epitopes could be predicted by immunoinformatics algorithms and computational methods to improve vaccine design, protective immunity analysis, assessment of vaccine safety and efficacy, and immunization modeling. This review aims to discuss the power of computational approaches in vaccine design and their relevance to the development of effective vaccines. Furthermore, the various divisions of this field and the tools available in each are introduced and reviewed.
Affiliation(s)
- Armina Alagheband Bahrami: Department of Biotechnology, School of Advanced Technologies in Medicine, Shahid Beheshti University of Medical Sciences, Tehran, Iran
- Zahra Payandeh: Immunology Research Center, Tabriz University of Medical Sciences, Tabriz, Iran
- Saeed Khalili: Department of Biology Sciences, Shahid Rajaee Teacher Training University, Tehran, Iran
- Alireza Zakeri: Department of Biology Sciences, Shahid Rajaee Teacher Training University, Tehran, Iran
- Mojgan Bandehpour: Department of Biotechnology, School of Advanced Technologies in Medicine, Shahid Beheshti University of Medical Sciences, Tehran, Iran
13. Yu H, Yang X, Zheng S, Sun C. Active Learning From Imbalanced Data: A Solution of Online Weighted Extreme Learning Machine. IEEE Transactions on Neural Networks and Learning Systems 2019; 30:1088-1103. [PMID: 30137013 DOI: 10.1109/tnnls.2018.2855446] [Citation(s) in RCA: 35] [Impact Index Per Article: 7.0] [Indexed: 06/08/2023]
Abstract
It is well known that active learning can simultaneously improve the quality of the classification model and decrease the complexity of training instances. However, several previous studies have indicated that the performance of active learning is easily disrupted by an imbalanced data distribution. Some existing imbalanced active learning approaches also suffer from either low performance or high time consumption. To address these problems, this paper describes an efficient solution based on the extreme learning machine (ELM) classification model, called active online-weighted ELM (AOW-ELM). The main contributions of this paper include: 1) the reasons why active learning can be disrupted by an imbalanced instance distribution and its influencing factors are discussed in detail; 2) the hierarchical clustering technique is adopted to select initially labeled instances in order to avoid the missed cluster effect and cold start phenomenon as much as possible; 3) the weighted ELM (WELM) is selected as the base classifier to guarantee the impartiality of instance selection in the procedure of active learning, and an efficient online updated mode of WELM is deduced in theory; and 4) an early stopping criterion that is similar to but more flexible than the margin exhaustion criterion is presented. The experimental results on 32 binary-class data sets with different imbalance ratios demonstrate that the proposed AOW-ELM algorithm is more effective and efficient than several state-of-the-art active learning algorithms that are specifically designed for the class imbalance scenario.
14. Expression profiling of snoRNAs in normal hematopoiesis and AML. Blood Adv 2019; 2:151-163. [PMID: 29365324 DOI: 10.1182/bloodadvances.2017006668] [Citation(s) in RCA: 33] [Impact Index Per Article: 6.6] [Received: 03/13/2017] [Accepted: 12/21/2017] [Indexed: 12/13/2022]
Abstract
Small nucleolar RNAs (snoRNAs) are noncoding RNAs that contribute to ribosome biogenesis and RNA splicing by modifying ribosomal RNA and spliceosome RNAs, respectively. We optimized a next-generation sequencing approach and a custom analysis pipeline to identify and quantify expression of snoRNAs in acute myeloid leukemia (AML) and normal hematopoietic cell populations. We show that snoRNAs are expressed in a lineage- and development-specific fashion during hematopoiesis. The most striking examples involve snoRNAs located in 2 imprinted loci, which are highly expressed in hematopoietic progenitors and downregulated during myeloid differentiation. Although most snoRNAs are expressed at similar levels in AML cells compared with CD34+, a subset of snoRNAs showed consistent differential expression, with the great majority of these being decreased in the AML samples. Analysis of host gene expression, splicing patterns, and whole-genome sequence data for mutational events did not identify transcriptional patterns or genetic alterations that account for these expression differences. These data provide a comprehensive analysis of the snoRNA transcriptome in normal and leukemic cells and should be helpful in the design of studies to define the contribution of snoRNAs to normal and malignant hematopoiesis.
15. Noncoding RNA Transcripts during Differentiation of Induced Pluripotent Stem Cells into Hepatocytes. Stem Cells Int 2018; 2018:5692840. [PMID: 30210551 PMCID: PMC6120260 DOI: 10.1155/2018/5692840] [Citation(s) in RCA: 4] [Impact Index Per Article: 0.7] [Received: 01/15/2018] [Revised: 05/20/2018] [Accepted: 06/12/2018] [Indexed: 02/01/2023]
Abstract
Recent advances in the stem cell field allow many human tissues to be obtained in vitro. However, hepatic differentiation of induced pluripotent stem cells (iPSCs) remains challenging. Hepatocyte-like cells (HLCs) obtained after differentiation more closely resemble fetal liver hepatocytes. MicroRNAs (miRNAs) play an important role in the differentiation process. Here, we analysed noncoding RNA profiles from the last stages of differentiation and compared them to hepatocytes. Our results show that HLCs maintain an epithelial character and express miRNAs that can block hepatocyte maturation by inhibiting the epithelial-mesenchymal transition (EMT). Additionally, we identified differentially expressed small nucleolar RNAs (snoRNAs) and discovered novel noncoding RNA (ncRNA) genes.
16. Arias-Carrasco R, Vásquez-Morán Y, Nakaya HI, Maracaja-Coutinho V. StructRNAfinder: an automated pipeline and web server for RNA families prediction. BMC Bioinformatics 2018; 19:55. [PMID: 29454313 PMCID: PMC5816368 DOI: 10.1186/s12859-018-2052-2] [Citation(s) in RCA: 29] [Impact Index Per Article: 4.8] [Received: 10/13/2017] [Accepted: 02/02/2018] [Indexed: 01/11/2023]
Abstract
Background: The function of many noncoding RNAs (ncRNAs) depends upon their secondary structures. Over the last decades, several methodologies have been developed to predict such structures or to use them to functionally annotate RNAs into RNA families. However, to fully perform this analysis, researchers must use multiple tools, which requires constant parsing and processing of several intermediate files. This makes the large-scale prediction and annotation of RNAs a daunting task even for researchers with good computational or bioinformatics skills.
Results: We present an automated pipeline named StructRNAfinder that predicts and annotates RNA families in transcript or genome sequences. This single tool not only displays the sequence/structural consensus alignments for each RNA family, according to the Rfam database, but also provides a taxonomic overview for each assigned functional RNA. Moreover, we implemented a user-friendly web service that allows researchers to upload their own nucleotide sequences in order to perform the whole analysis. Finally, we provided a stand-alone version of StructRNAfinder to be used in large-scale projects. The tool was developed under the GNU General Public License (GPLv3) and is freely available at http://structrnafinder.integrativebioinformatics.me.
Conclusions: The main advantage of StructRNAfinder lies in the large-scale processing and integration of the data obtained by each tool and database employed along the workflow, from which several files are generated and displayed in user-friendly reports, useful for downstream analyses and data exploration.
Affiliation(s)
- Raúl Arias-Carrasco: Centro de Genómica y Bioinformática, Facultad de Ciencias, Universidad Mayor, 8580745, Santiago, Chile; Programa de Doctorado en Genómica Integrativa, Vicerrectoría de Investigación, Universidad Mayor, 8580745, Santiago, Chile
- Yessenia Vásquez-Morán: Centro de Genómica y Bioinformática, Facultad de Ciencias, Universidad Mayor, 8580745, Santiago, Chile
- Helder I Nakaya: Faculdade de Ciências Farmacêuticas, Universidade de São Paulo, São Paulo, 05508-900, Brazil
- Vinicius Maracaja-Coutinho: Centro de Genómica y Bioinformática, Facultad de Ciencias, Universidad Mayor, 8580745, Santiago, Chile; Instituto Vandique, João Pessoa, 58000-000, Brazil; Beagle Bioinformatics, 8320000, Santiago, Chile; Advanced Center for Chronic Diseases (ACCDiS), Facultad de Ciencias Químicas y Farmacéuticas, Universidad de Chile, 8380492, Santiago, Chile
17. Yadav S, Shekhawat M, Jahagirdar D, Kumar Sharma N. Natural and artificial small RNAs: a promising avenue of nucleic acid therapeutics for cancer. Cancer Biol Med 2017; 14:242-253. [PMID: 28884041 PMCID: PMC5570601 DOI: 10.20892/j.issn.2095-3941.2017.0038] [Citation(s) in RCA: 10] [Impact Index Per Article: 1.4] [Received: 04/12/2017] [Accepted: 05/22/2017] [Indexed: 01/02/2023]
Abstract
Since the failure of traditional therapy, gene therapy using functional DNA sequence and small RNA/DNA molecules (oligonucleotide) has become a promising avenue for cancer treatment. The discovery of RNA molecules has impelled researchers to investigate small regulatory RNA from various natural and artificial sources and determine a cogent target for controlling tumor progression. Small regulatory RNAs are used for therapeutic silencing of oncogenes and aberrant DNA repair response genes. Despite their advantages, therapies based on small RNAs exhibit limitations in terms of stability of therapeutic drugs, precision-based delivery in tissues, precision-based intercellular and intracellular targeting, and tumor heterogeneity-based responses. In this study, we summarize the potential and drawbacks of small RNAs in nucleic acid therapeutics for cancer.
Collapse
Affiliation(s)
- Sunny Yadav
- Cancer and Translational Research Lab, Dr. D.Y Patil Biotechnology & Bioinformatics Institute, Dr. D. Y. Patil Vidyapeeth, Pune 411033, Maharashtra, India
| | - Mamta Shekhawat
- Cancer and Translational Research Lab, Dr. D.Y Patil Biotechnology & Bioinformatics Institute, Dr. D. Y. Patil Vidyapeeth, Pune 411033, Maharashtra, India
| | - Devashree Jahagirdar
- Cancer and Translational Research Lab, Dr. D.Y Patil Biotechnology & Bioinformatics Institute, Dr. D. Y. Patil Vidyapeeth, Pune 411033, Maharashtra, India
| | - Nilesh Kumar Sharma
- Cancer and Translational Research Lab, Dr. D.Y Patil Biotechnology & Bioinformatics Institute, Dr. D. Y. Patil Vidyapeeth, Pune 411033, Maharashtra, India
|
18
|
A Review on Recent Computational Methods for Predicting Noncoding RNAs. BIOMED RESEARCH INTERNATIONAL 2017; 2017:9139504. [PMID: 28553651 PMCID: PMC5434267 DOI: 10.1155/2017/9139504] [Citation(s) in RCA: 18] [Impact Index Per Article: 2.6] [Received: 11/29/2016] [Revised: 02/06/2017] [Accepted: 02/15/2017] [Indexed: 12/20/2022]
Abstract
Noncoding RNAs (ncRNAs) play important roles in various cellular activities and diseases. In this paper, we presented a comprehensive review of computational methods for ncRNA prediction, which are generally grouped into four categories: (1) homology-based methods, that is, comparative methods involving evolutionarily conserved RNA sequences and structures, (2) de novo methods using RNA sequence and structure features, (3) transcriptional sequencing and assembling based methods, that is, methods designed for single and pair-ended reads generated from next-generation RNA sequencing, and (4) RNA family specific methods, for example, methods specific for microRNAs and long noncoding RNAs. In the end, we summarized the advantages and limitations of these methods and pointed out a few possible future directions for ncRNA prediction. In conclusion, many computational methods have been demonstrated to be effective in predicting ncRNAs for further experimental validation. They are critical in reducing the huge number of potential ncRNAs and pointing the community to high-confidence candidates. In the future, highly efficient mapping technology and more intrinsic sequence features (e.g., motif and k-mer frequencies) and structure features (e.g., minimum free energy, conserved stem-loop, or graph structures) are suggested to be combined with the next- and third-generation sequencing platforms to improve ncRNA prediction.
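The abstract above names k-mer frequencies as one intrinsic sequence feature. As a rough illustration only, and not the method of any tool covered by the review, such a feature vector might be computed as in the following Python sketch. The function name `kmer_frequencies`, the default `k=3`, and the choice to skip k-mers containing symbols outside the alphabet are all assumptions made for this example.

```python
from collections import Counter
from itertools import product


def kmer_frequencies(seq, k=3, alphabet="ACGU"):
    """Relative frequency of every k-mer over `alphabet` in `seq`.

    Hypothetical helper for illustration; DNA input is mapped to RNA (T -> U).
    """
    seq = seq.upper().replace("T", "U")
    # Count every overlapping window of length k.
    counts = Counter(seq[i:i + k] for i in range(len(seq) - k + 1))
    # Fixed feature order: all |alphabet|**k possible k-mers.
    kmers = ["".join(p) for p in product(alphabet, repeat=k)]
    # Windows containing other symbols (e.g. N) are left out of the total.
    total = sum(counts[m] for m in kmers)
    return {m: (counts[m] / total if total else 0.0) for m in kmers}


if __name__ == "__main__":
    freqs = kmer_frequencies("ACGUACGU", k=2)
    print(freqs["AC"], len(freqs))
```

Returning a value for every possible k-mer, including those with zero count, keeps the feature vector the same length for every sequence, which is what most classifiers expect.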
|
19
|
Vieira LM, Grativol C, Thiebaut F, Carvalho TG, Hardoim PR, Hemerly A, Lifschitz S, Ferreira PCG, Walter MEMT. PlantRNA_Sniffer: A SVM-Based Workflow to Predict Long Intergenic Non-Coding RNAs in Plants. Noncoding RNA 2017; 3:ncrna3010011. [PMID: 29657283 PMCID: PMC5831995 DOI: 10.3390/ncrna3010011] [Citation(s) in RCA: 12] [Impact Index Per Article: 1.7] [Received: 12/29/2016] [Revised: 02/19/2017] [Accepted: 02/24/2017] [Indexed: 12/17/2022] Open
Abstract
Non-coding RNAs (ncRNAs) constitute an important set of transcripts produced in the cells of organisms. Among them, there is a large number of a particular class of long ncRNAs that are difficult to predict, the so-called long intergenic ncRNAs (lincRNAs), which might play essential roles in gene regulation and other cellular processes. Despite the importance of these lincRNAs, there is still a lack of biological knowledge and, currently, the few computational methods considered are so specific that they cannot be successfully applied to species other than those for which they were originally designed. Prediction of lncRNAs has been performed with machine learning techniques. Particularly, for lincRNA prediction, supervised learning methods have been explored in recent literature. As far as we know, there are no methods or workflows specially designed to predict lincRNAs in plants. In this context, this work proposes a workflow to predict lincRNAs in plants that combines known bioinformatics tools with machine learning techniques, here a support vector machine (SVM). We discuss two case studies that allowed us to identify novel lincRNAs in sugarcane (Saccharum spp.) and in maize (Zea mays). From the results, we could also identify differentially expressed lincRNAs in sugarcane and maize plants exposed to pathogenic and beneficial microorganisms.
Affiliation(s)
- Lucas Maciel Vieira
- Departamento de Ciência da Computação, Universidade de Brasília, Brasília-DF 70910-900, Brasil.
| | - Clicia Grativol
- Laboratório de Química e Função de Proteínas e Peptídeos, Universidade Estadual do Norte Fluminense, Campos dos Goytacazes-RJ 28013-602, Brazil.
| | - Flavia Thiebaut
- Instituto de Bioquímica Médica Leopoldo de Meis, Universidade Federal do Rio de Janeiro, Rio de Janeiro-RJ 21941-901, Brazil.
| | - Thais G Carvalho
- Instituto de Bioquímica Médica Leopoldo de Meis, Universidade Federal do Rio de Janeiro, Rio de Janeiro-RJ 21941-901, Brazil.
| | - Pablo R Hardoim
- Instituto de Bioquímica Médica Leopoldo de Meis, Universidade Federal do Rio de Janeiro, Rio de Janeiro-RJ 21941-901, Brazil.
| | - Adriana Hemerly
- Instituto de Bioquímica Médica Leopoldo de Meis, Universidade Federal do Rio de Janeiro, Rio de Janeiro-RJ 21941-901, Brazil.
| | - Sergio Lifschitz
- Departamento de Informática, Pontifícia Universidade Católica do Rio de Janeiro, Rio de Janeiro-RJ 22451-900, Brazil.
| | - Paulo Cavalcanti Gomes Ferreira
- Instituto de Bioquímica Médica Leopoldo de Meis, Universidade Federal do Rio de Janeiro, Rio de Janeiro-RJ 21941-901, Brazil.
| | - Maria Emilia M T Walter
- Departamento de Ciência da Computação, Universidade de Brasília, Brasília-DF 70910-900, Brasil.
|
20
|
de Araujo Oliveira JV, Costa F, Backofen R, Stadler PF, Machado Telles Walter ME, Hertel J. SnoReport 2.0: new features and a refined Support Vector Machine to improve snoRNA identification. BMC Bioinformatics 2016; 17:464. [PMID: 28105919 PMCID: PMC5249026 DOI: 10.1186/s12859-016-1345-6] [Citation(s) in RCA: 16] [Impact Index Per Article: 2.0] [Indexed: 10/12/2024] Open
Abstract
Background
snoReport uses RNA secondary structure prediction combined with machine learning as the basis to identify the two main classes of small nucleolar RNAs, the box H/ACA snoRNAs and the box C/D snoRNAs. Here, we present snoReport 2.0, which substantially improves and extends the original method by: extracting new features for both box C/D and H/ACA box snoRNAs; developing a more sophisticated technique in the SVM training phase with recent data from vertebrate organisms and a careful choice of the SVM parameters C and γ; and using updated versions of tools and databases used for the construction of the original version of snoReport. To validate the new version and to demonstrate its improved performance, we tested snoReport 2.0 in different organisms.
Results
Results of the training and test phases of boxes H/ACA and C/D snoRNAs, in both versions of snoReport, are discussed. Validation on real data was performed to evaluate the predictions of snoReport 2.0. Our program was applied to a set of previously annotated sequences, some of them experimentally confirmed, of humans, nematodes, drosophilids, platypus, chickens and leishmania. We significantly improved the predictions for vertebrates, since the training phase used information from these organisms, but H/ACA box snoRNA identification was also improved for the other organisms.
Conclusion
We presented snoReport 2.0, to predict H/ACA box and C/D box snoRNAs, an efficient method to find true positives and avoid false positives in vertebrate organisms. The H/ACA box snoRNA classifier showed an F-score of 93 % (an improvement of 10 % over the previous version), while the C/D box snoRNA classifier showed an F-score of 94 % (an improvement of 14 %). Besides, both classifiers exhibited performance measures above 90 %. These results show that snoReport 2.0 avoids false positives and false negatives, allowing snoRNAs to be predicted with high quality. In the validation phase, snoReport 2.0 predicted 67.43 % of vertebrate organisms for both classes. For Nematodes and Drosophilids, 69 % and 76.67 % of H/ACA box snoRNAs were predicted, respectively, showing that snoReport 2.0 is good at identifying snoRNAs in vertebrates and also H/ACA box snoRNAs in invertebrate organisms.
Electronic supplementary material
The online version of this article (doi:10.1186/s12859-016-1345-6) contains supplementary material, which is available to authorized users.
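The abstract reports classifier quality as F-scores but does not say how they were computed. Assuming the standard definition (the weighted harmonic mean of precision and recall, with beta = 1 giving the usual F1), the metric can be sketched in Python as below. The function name `f_score` and the example counts are illustrative only and are not taken from the paper.

```python
def f_score(tp, fp, fn, beta=1.0):
    """F-beta score from true-positive, false-positive and false-negative counts.

    Assumes the standard definition; beta=1.0 gives the F1 score.
    """
    precision = tp / (tp + fp) if (tp + fp) else 0.0
    recall = tp / (tp + fn) if (tp + fn) else 0.0
    if precision == 0.0 and recall == 0.0:
        return 0.0  # avoid division by zero when nothing was predicted correctly
    b2 = beta * beta
    return (1 + b2) * precision * recall / (b2 * precision + recall)


if __name__ == "__main__":
    # Made-up counts: 90 true positives, 10 false positives, 10 false negatives.
    print(f_score(90, 10, 10))
```

Because the score combines precision and recall, a high value indicates that a classifier keeps both false positives and false negatives low, which is the property the abstract emphasises.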
Affiliation(s)
| | - Fabrizio Costa
- Bioinformatics Group, Department of Computer Science, Albert-Ludwigs-University Freiburg, Georges-Köhler-Allee 106, Freiburg, 79110, Germany
| | - Rolf Backofen
- Bioinformatics Group, Department of Computer Science, Albert-Ludwigs-University Freiburg, Georges-Köhler-Allee 106, Freiburg, 79110, Germany
| | - Peter Florian Stadler
- Bioinformatics Group, Department of Computer Science, and Interdisciplinary Center for Bioinformatics, University of Leipzig, Haertelstraße 16-18, Leipzig, D-04107, Germany.,German Centre for Integrative Biodiversity Research (iDiv), Halle-Jena-Leipzig, Germany.,Institute for Theoretical Chemistry, University of Vienna, Währingerstraße 17, Vienna, A-1090, Austria.,Center for non-coding RNA in Technology and Health, University of Copenhagen, Grønnegårdsvej 3, Frederiksberg, DK-1870, Denmark.,Max Planck Institute for Mathematics in the Sciences, Inselstraße 22, Leipzig, D-04103, Germany.,RNomics Group, Fraunhofer Institut for Cell Therapy and Immunology, Perlickstraße 1, Leipzig, D-04103, Germany.,Santa Fe Institute, 1399 Hyde Park Rd., Santa Fe, NM87501, USA.,Young Investigators Group Bioinformatics & Transcriptomics, Helmholtz Centre for Environmental Research - UFZ, Permoserstraße 15, Leipzig, D-04318, Germany
| | | | - Jana Hertel
- Bioinformatics Group, Department of Computer Science, and Interdisciplinary Center for Bioinformatics, University of Leipzig, Haertelstraße 16-18, Leipzig, D-04107, Germany
|
21
|
Jorjani H, Kehr S, Jedlinski DJ, Gumienny R, Hertel J, Stadler PF, Zavolan M, Gruber AR. An updated human snoRNAome. Nucleic Acids Res 2016; 44:5068-82. [PMID: 27174936 PMCID: PMC4914119 DOI: 10.1093/nar/gkw386] [Citation(s) in RCA: 178] [Impact Index Per Article: 22.3] [Received: 11/13/2015] [Accepted: 04/23/2016] [Indexed: 12/18/2022] Open
Abstract
Small nucleolar RNAs (snoRNAs) are a class of non-coding RNAs that guide the post-transcriptional processing of other non-coding RNAs (mostly ribosomal RNAs), but have also been implicated in processes ranging from microRNA-dependent gene silencing to alternative splicing. In order to construct an up-to-date catalog of human snoRNAs we have combined data from various databases, de novo prediction and extensive literature review. In total, we list more than 750 curated genomic loci that give rise to snoRNA and snoRNA-like genes. Utilizing small RNA-seq data from the ENCODE project, our study characterizes the plasticity of snoRNA expression, identifying both constitutively expressed and cell-type-specific snoRNAs. In particular, the comparison of malignant to non-malignant tissues and cell types shows a dramatic perturbation of the snoRNA expression profile. Finally, we developed a high-throughput variant of the reverse-transcriptase-based method for identifying 2'-O-methyl modifications in RNAs, termed RimSeq. Using the data from this and other high-throughput protocols together with previously reported modification sites and state-of-the-art target prediction methods, we re-estimate the snoRNA target RNA interaction network. Our current results assign a reliable modification site to 83% of the canonical snoRNAs, leaving only 76 snoRNA sequences as orphans.
Affiliation(s)
- Hadi Jorjani
- Computational and Systems Biology, Biozentrum, University of Basel and Swiss Institute of Bioinformatics, Basel CH-4056, Switzerland
| | - Stephanie Kehr
- Bioinformatics Group, Department of Computer Science, and Interdisciplinary Center for Bioinformatics, University of Leipzig, D-04107 Leipzig, Germany
| | - Dominik J Jedlinski
- Computational and Systems Biology, Biozentrum, University of Basel and Swiss Institute of Bioinformatics, Basel CH-4056, Switzerland
| | - Rafal Gumienny
- Computational and Systems Biology, Biozentrum, University of Basel and Swiss Institute of Bioinformatics, Basel CH-4056, Switzerland
| | - Jana Hertel
- Bioinformatics Group, Department of Computer Science, and Interdisciplinary Center for Bioinformatics, University of Leipzig, D-04107 Leipzig, Germany
| | - Peter F Stadler
- Bioinformatics Group, Department of Computer Science, and Interdisciplinary Center for Bioinformatics, University of Leipzig, D-04107 Leipzig, Germany; Max Planck Institute for Mathematics in the Sciences, D-04103 Leipzig, Germany; RNomics Group, Fraunhofer Institute for Cell Therapy and Immunology, D-04103 Leipzig, Germany; Department of Theoretical Chemistry, University of Vienna, A-1090 Vienna, Austria; Santa Fe Institute, NM-87501 Santa Fe, USA
| | - Mihaela Zavolan
- Computational and Systems Biology, Biozentrum, University of Basel and Swiss Institute of Bioinformatics, Basel CH-4056, Switzerland
| | - Andreas R Gruber
- Computational and Systems Biology, Biozentrum, University of Basel and Swiss Institute of Bioinformatics, Basel CH-4056, Switzerland
|
22
|
Meng Y, Yi X, Li X, Hu C, Wang J, Bai L, Czajkowsky DM, Shao Z. The non-coding RNA composition of the mitotic chromosome by 5'-tag sequencing. Nucleic Acids Res 2016; 44:4934-46. [PMID: 27016738 PMCID: PMC4889943 DOI: 10.1093/nar/gkw195] [Citation(s) in RCA: 14] [Impact Index Per Article: 1.8] [Received: 08/17/2015] [Accepted: 03/15/2016] [Indexed: 12/16/2022] Open
Abstract
Mitotic chromosomes are one of the most commonly recognized sub-cellular structures in eukaryotic cells. Yet basic information necessary to understand their structure and assembly, such as their composition, is still lacking. Recent proteomic studies have begun to fill this void, identifying hundreds of RNA-binding proteins bound to mitotic chromosomes. However, by contrast, there are only two RNA species (U3 snRNA and rRNA) that are known to be associated with the mitotic chromosome, suggesting that there are many mitotic chromosome-associated RNAs (mCARs) not yet identified. Here, using a targeted protocol based on 5'-tag sequencing to profile the mammalian mCAR population, we report the identification of 1279 mCARs, the majority of which are ncRNAs, including lncRNAs that exhibit greater conservation across 60 vertebrate species than the entire population of lncRNAs. There is also a significant enrichment of snoRNAs and specific SINE RNAs. Finally, ∼40% of the mCARs are presently unannotated, many of which are as abundant as the annotated mCARs, suggesting that there are also many novel ncRNAs in the mCARs. Overall, the mCARs identified here, together with the previous proteomic and genomic data, constitute the first comprehensive catalogue of the molecular composition of the eukaryotic mitotic chromosomes.
Affiliation(s)
- Yicong Meng
- Shanghai Center for Systems Biomedicine, Shanghai Jiao Tong University, Shanghai 200240, China
| | - Xianfu Yi
- School of Biomedical Engineering, Tianjin Medical University, Tianjin 300070, China
| | - Xinhui Li
- Bio-ID Center, School of Biomedical Engineering, Shanghai Jiao Tong University, Shanghai 200240, China
| | - Chuansheng Hu
- Bio-ID Center, School of Biomedical Engineering, Shanghai Jiao Tong University, Shanghai 200240, China
| | - Ju Wang
- School of Biomedical Engineering, Tianjin Medical University, Tianjin 300070, China
| | - Ling Bai
- Bio-ID Center, School of Biomedical Engineering, Shanghai Jiao Tong University, Shanghai 200240, China
| | - Daniel M Czajkowsky
- Bio-ID Center, School of Biomedical Engineering, Shanghai Jiao Tong University, Shanghai 200240, China
| | - Zhifeng Shao
- Bio-ID Center, School of Biomedical Engineering, Shanghai Jiao Tong University, Shanghai 200240, China; State Key Laboratory of Oncogenes & Related Genes, Shanghai Jiao Tong University, Shanghai 200240, China
|
23
|
Qu G, Kruszka K, Plewka P, Yang SY, Chiou TJ, Jarmolowski A, Szweykowska-Kulinska Z, Echeverria M, Karlowski WM. Promoter-based identification of novel non-coding RNAs reveals the presence of dicistronic snoRNA-miRNA genes in Arabidopsis thaliana. BMC Genomics 2015; 16:1009. [PMID: 26607788 PMCID: PMC4660826 DOI: 10.1186/s12864-015-2221-x] [Citation(s) in RCA: 15] [Impact Index Per Article: 1.7] [Received: 08/24/2015] [Accepted: 11/16/2015] [Indexed: 11/18/2022] Open
Abstract
Background
In the past few decades, non-coding RNAs (ncRNAs) have emerged as important regulators of gene expression in eukaryotes. Most studies of ncRNAs in plants have focused on the identification of silencing microRNAs (miRNAs) and small interfering RNAs (siRNAs). Another important family of ncRNAs that has been well characterized in plants is the small nucleolar RNAs (snoRNAs) and the related small Cajal body-specific RNAs (scaRNAs). Both target chemical modifications of ribosomal RNAs (rRNAs) and small nuclear RNAs (snRNAs). In plants, the snoRNA genes are organized in clusters, transcribed by RNA Pol II from a common promoter and subsequently processed into mature molecules. The promoter regions of snoRNA polycistronic genes in plants are highly enriched in two conserved cis-regulatory elements (CREs), Telo-box and Site II, which coordinate the expression of snoRNAs and ribosomal protein coding genes throughout the cell cycle.
Results
In order to identify novel ncRNA genes, we have used the snoRNA Telo-box/Site II motif combination as a functional promoter indicator to screen the Arabidopsis genome. The predictions generated by this process were tested by detailed exploration of available RNA-Seq and expression data sets and experimental validation. As a result, we have identified several snoRNAs, scaRNAs and 'orphan' snoRNAs. We also show evidence for 16 novel ncRNAs that lack similarity to any reported RNA family. Finally, we have identified two dicistronic genes encoding precursors that are processed to mature snoRNA and miRNA molecules. We discuss the evolutionary consequences of this result in the context of a tight link between snoRNAs and miRNAs in eukaryotes.
Conclusions
We present an alternative computational approach for non-coding RNA detection. Instead of depending on sequence or structure similarity in whole-genome screenings, we have explored the properties of promoter regions of well-characterized ncRNAs. Interestingly, besides the expected ncRNA predictions, we were also able to recover a single-precursor arrangement for snoRNA-miRNA. Accompanied by analyses performed on rice sequences, we conclude that such an arrangement might have interesting functional and evolutionary consequences and discuss this result in the context of a tight link between snoRNAs and miRNAs in eukaryotes.
Electronic supplementary material
The online version of this article (doi:10.1186/s12864-015-2221-x) contains supplementary material, which is available to authorized users.
Affiliation(s)
- Ge Qu
- Department of Computational Biology, Institute of Molecular Biology and Biotechnology, Faculty of Biology, Adam Mickiewicz University in Poznan, Umultowska 89, 61-614, Poznan, Poland.
| | - Katarzyna Kruszka
- Department of Gene Expression, Institute of Molecular Biology and Biotechnology, Faculty of Biology, Adam Mickiewicz University in Poznan, Umultowska 89, Poznan, 61-614, Poland.
| | - Patrycja Plewka
- Department of Gene Expression, Institute of Molecular Biology and Biotechnology, Faculty of Biology, Adam Mickiewicz University in Poznan, Umultowska 89, Poznan, 61-614, Poland.
| | - Shu-Yi Yang
- Agricultural Biotechnology Research Center, Academia Sinica, No. 128 Academia Rd. Sec. 2, Taipei, 115, Taiwan.
| | - Tzyy-Jen Chiou
- Agricultural Biotechnology Research Center, Academia Sinica, No. 128 Academia Rd. Sec. 2, Taipei, 115, Taiwan.
| | - Artur Jarmolowski
- Department of Gene Expression, Institute of Molecular Biology and Biotechnology, Faculty of Biology, Adam Mickiewicz University in Poznan, Umultowska 89, Poznan, 61-614, Poland.
| | - Zofia Szweykowska-Kulinska
- Department of Gene Expression, Institute of Molecular Biology and Biotechnology, Faculty of Biology, Adam Mickiewicz University in Poznan, Umultowska 89, Poznan, 61-614, Poland.
| | - Manuel Echeverria
- Faculté des Sciences, Université de Perpignan via Domitia, 52, Av Paul Alduy, Perpignan, 66860, France.
| | - Wojciech M Karlowski
- Department of Computational Biology, Institute of Molecular Biology and Biotechnology, Faculty of Biology, Adam Mickiewicz University in Poznan, Umultowska 89, 61-614, Poznan, Poland.
|
24
|
Arruda WC, Souza DS, Ralha CG, Walter MEMT, Raiol T, Brigido MM, Stadler PF. Knowledge-based reasoning to annotate noncoding RNA using multi-agent system. J Bioinform Comput Biol 2015. [PMID: 26223200 DOI: 10.1142/s0219720015500213] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 11/18/2022]
Abstract
Noncoding RNAs (ncRNAs) have been the focus of intense research over the last few years. Since characteristics and signals of ncRNAs are not entirely known, researchers use different computational tools together with their biological knowledge to predict putative ncRNAs. In this context, this work presents ncRNA-Agents, a multi-agent system to annotate ncRNAs based on the output of different tools, using inference rules to simulate biologists' reasoning. Experiments with data from the fungus Saccharomyces cerevisiae allowed us to measure the performance of ncRNA-Agents, which showed better sensitivity than Infernal, a widely used tool for annotating ncRNA. Besides, data from the fungi Schizosaccharomyces pombe and Paracoccidioides brasiliensis yielded novel putative ncRNAs, which demonstrated the usefulness of our approach. NcRNA-Agents can be found at: http://www.biomol.unb.br/ncrna-agents.
Affiliation(s)
- Wosley C Arruda
- Department of Computer Science, University of Brasília, Campus Universitário Darcy Ribeiro Prédio CIC/EST, ASA Norte, Brasília-DF, CEP: 70910-900, Brazil
| | - Daniel S Souza
- Department of Computer Science, University of Brasília, Campus Universitário Darcy Ribeiro Prédio CIC/EST, ASA Norte, Brasília-DF, CEP: 70910-900, Brazil
| | - Célia G Ralha
- Department of Computer Science, University of Brasília, Campus Universitário Darcy Ribeiro Prédio CIC/EST, ASA Norte, Brasília-DF, CEP: 70910-900, Brazil
| | - Maria Emilia M T Walter
- Department of Computer Science, University of Brasília, Campus Universitário Darcy Ribeiro Prédio CIC/EST, ASA Norte, Brasília-DF, CEP: 70910-900, Brazil
| | - Tainá Raiol
- Leônidas and Maria Deane Research Center (Fiocruz Amazônia), Rua Teresina, 476 Adrianópolis, Manaus-AM, CEP: 69027-070, Brazil
| | - Marcelo M Brigido
- Department of Cellular Biology, Institute of Biology, University of Brasília, Campus Universitário Darcy Ribeiro, Prédio do Institute de Biologia, ASA Norte, Brasília-DF, CEP: 70910-900, Brazil
| | - Peter F Stadler
- Department of Computer Science and the Interdisciplinary Center for Bioinformatics, University of Leipzig, Härtelstrasse 16-18, D-04107, Leipzig, Germany
|
25
|
Agrisani A, Tafer H, Stadler PF, Furia M. Unusual Novel SnoRNA-Like RNAs in Drosophila melanogaster. Noncoding RNA 2015; 1:139-150. [PMID: 29861420 PMCID: PMC5932544 DOI: 10.3390/ncrna1020139] [Citation(s) in RCA: 3] [Impact Index Per Article: 0.3] [Received: 06/09/2015] [Revised: 07/06/2015] [Accepted: 07/09/2015] [Indexed: 12/12/2022] Open
Abstract
A computational screen for novel small nucleolar RNAs in Drosophila melanogaster uncovered 15 novel snoRNAs and snoRNA-like long non-coding RNAs. In contrast to earlier surveys, the novel sequences are mostly poorly conserved and originate from unusual genomic locations. The majority derive from precursors antisense to well-known protein-coding genes, and four of the candidates are produced from exon-coding regions. Only a minority of the new sequences appear to have canonical target sites in ribosomal or small nuclear RNAs. Taken together, these evolutionarily young, poorly conserved, and genomically atypical sequences point to a class of snoRNA-like transcripts with predominantly regulatory functions in the fruit fly genome.
Affiliation(s)
- Alberto Agrisani
- Department of Biology, University of Naples "Federico II", Complesso Universitario Monte Santangelo, via Cinthia, I-80126 Napoli, Italy.
| | - Hakim Tafer
- Institut für Biotechnologie, Universität für Bodenkultur, Muthgasse 18, A-1190 Wien, Austria.
| | - Peter F Stadler
- Bioinformatics Group, Department Computer Science, German Centre for Integrative Biodiversity Research (iDiv), Halle-Jena-Leipzig; University Leipzig, Härtelstrasse 16-18, D-04107 Leipzig, Germany.
- Max Planck Institute for Mathematics in the Sciences, Inselstraße 22, D-04103 Leipzig, Germany.
- Fraunhofer Institute for Cell Therapy and Immunology, Perlickstrasse 1, D-04103 Leipzig, Germany.
- Department of Theoretical Chemistry, University of Vienna, Währingerstrasse 17, A-1090 Vienna, Austria.
- Center for RNA in Technology and Health, University of Copenhagen, Grønnegårdsvej 3, DK-1870 Frederiksberg C, Denmark.
- Santa Fe Institute, 1399 Hyde Park Road, Santa Fe, NM 87501, USA.
| | - Maria Furia
- Department of Biology, University of Naples "Federico II", Complesso Universitario Monte Santangelo, via Cinthia, I-80126 Napoli, Italy.
|
26
|
Severino P, Oliveira LS, Andreghetto FM, Torres N, Curioni O, Cury PM, Toporcov TN, Paschoal AR, Durham AM. Small RNAs in metastatic and non-metastatic oral squamous cell carcinoma. BMC Med Genomics 2015; 8:31. [PMID: 26104160 PMCID: PMC4479233 DOI: 10.1186/s12920-015-0102-4] [Citation(s) in RCA: 30] [Impact Index Per Article: 3.3] [Received: 10/14/2014] [Accepted: 05/29/2015] [Indexed: 12/26/2022] Open
Abstract
BACKGROUND
Small non-coding regulatory RNAs control cellular functions at the transcriptional and post-transcriptional levels. Oral squamous cell carcinoma is among the leading cancers in the world and the presence of cervical lymph node metastases is currently its strongest prognostic factor. In this work we aimed at finding small RNAs expressed in oral squamous cell carcinoma that could be associated with the presence of lymph node metastasis.
METHODS
Small RNA libraries from metastatic and non-metastatic oral squamous cell carcinomas were sequenced for the identification and quantification of known small RNAs. Selected markers were validated in plasma samples. Additionally, we used in silico analysis to investigate possible new molecules, not previously described, involved in the metastatic process.
RESULTS
Global expression patterns were not associated with cervical metastases. MiR-21, miR-203 and miR-205 were highly expressed throughout samples, in agreement with their role in epithelial cell biology, but disagreeing with studies correlating these molecules with cancer invasion. Eighteen microRNAs, but no other small RNA class, varied consistently between metastatic and non-metastatic samples. Nine of these microRNAs had been previously detected in human plasma, eight of which presented consistent results between tissue and plasma samples. MiR-31 and miR-130b, known to inhibit several steps in the metastatic process, were over-expressed in non-metastatic samples and the expression of miR-130b was confirmed in plasma of patients showing no metastasis. MiR-181 and miR-296 were detected in metastatic tumors and the expression of miR-296 was confirmed in plasma of patients presenting metastasis. A novel microRNA-like molecule was also associated with non-metastatic samples, potentially targeting cell-signaling mechanisms.
CONCLUSIONS
We corroborate literature data on the role of small RNAs in cancer metastasis and suggest the detection of microRNAs as a tool that may assist in the evaluation of oral squamous cell carcinoma metastatic potential.
Affiliation(s)
- Patricia Severino
- Albert Einstein Research and Education Institute, Hospital Israelita Albert Einstein, Sao Paulo, SP, Brazil.
| | - Liliane Santana Oliveira
- Albert Einstein Research and Education Institute, Hospital Israelita Albert Einstein, Sao Paulo, SP, Brazil.
| | - Flávia Maziero Andreghetto
- Albert Einstein Research and Education Institute, Hospital Israelita Albert Einstein, Sao Paulo, SP, Brazil.
| | - Natalia Torres
- Albert Einstein Research and Education Institute, Hospital Israelita Albert Einstein, Sao Paulo, SP, Brazil.
| | - Otávio Curioni
- Hospital Heliopolis, Departamento de Cirurgia e Otorrinolaringologia, Sao Paulo, SP, Brazil.
| | | | - Tatiana Natasha Toporcov
- Departamento de Epidemiologia, Faculdade de Saúde Pública, University of Sao Paulo, Sao Paulo, SP, Brazil.
| | | | - Alan Mitchell Durham
- Instituto de Matemática e Estatística, University of Sao Paulo, Sao Paulo, SP, Brazil.
|
27
|
García-López J, Alonso L, Cárdenas DB, Artaza-Alvarez H, Hourcade JDD, Martínez S, Brieño-Enríquez MA, Del Mazo J. Diversity and functional convergence of small noncoding RNAs in male germ cell differentiation and fertilization. RNA (NEW YORK, N.Y.) 2015; 21:946-962. [PMID: 25805854 PMCID: PMC4408801 DOI: 10.1261/rna.048215.114] [Citation(s) in RCA: 50] [Impact Index Per Article: 5.6] [Received: 09/22/2014] [Accepted: 01/15/2015] [Indexed: 06/04/2023]
Abstract
Small noncoding RNAs (sncRNAs) are considered key post-transcriptional regulators of male germ cell development. In addition to microRNAs (miRNAs) and PIWI-interacting RNAs (piRNAs), other sncRNAs generated from the processing of small nucleolar RNAs (snoRNAs), tRNAs, or rRNAs may also play important regulatory roles in spermatogenesis. By next-generation sequencing (NGS), we characterized the sncRNA populations detected at three milestone stages in male germ differentiation: primordial germ cells (PGCs), pubertal spermatogonia cells, and mature spermatozoa. To assess their potential transmission through the spermatozoa during fertilization, the sncRNAs of mouse oocytes and zygotes were also analyzed. Both microRNAs and snoRNA-derived small RNAs are abundantly expressed in PGCs but transiently replaced by piRNAs in spermatozoa and endo-siRNAs in oocytes and zygotes. Exhaustive analysis of miRNA sequence variants also shows an increase in noncanonical microRNA forms along male germ cell differentiation. RNAs derived from tRNAs and rRNAs interacting with PIWI proteins are not generated by the ping-pong pathway and could be a source of primary piRNAs. Moreover, our results strongly suggest that the small RNAs derived from tRNAs and rRNAs interact with PIWI proteins, and specifically with MILI. Finally, computational analysis revealed their potential involvement in post-transcriptional regulation of mRNA transcripts, suggesting functional convergence among different small RNA classes in germ cells and zygotes.
Affiliation(s)
- Jesús García-López
- Department of Cellular and Molecular Biology, Centro de Investigaciones Biológicas (CSIC), 28040 Madrid, Spain
| | - Lola Alonso
- Department of Bioinformatics Service, Centro de Investigaciones Biológicas (CSIC), 28040 Madrid, Spain
| | - David B Cárdenas
- Department of Cellular and Molecular Biology, Centro de Investigaciones Biológicas (CSIC), 28040 Madrid, Spain
| | - Haydeé Artaza-Alvarez
- Department of Bioinformatics Service, Centro de Investigaciones Biológicas (CSIC), 28040 Madrid, Spain
| | - Juan de Dios Hourcade
- Department of Cellular and Molecular Biology, Centro de Investigaciones Biológicas (CSIC), 28040 Madrid, Spain
| | - Sergio Martínez
- Department of Cellular and Molecular Biology, Centro de Investigaciones Biológicas (CSIC), 28040 Madrid, Spain
| | - Miguel A Brieño-Enríquez
- Department of Cellular and Molecular Biology, Centro de Investigaciones Biológicas (CSIC), 28040 Madrid, Spain
| | - Jesús Del Mazo
- Department of Cellular and Molecular Biology, Centro de Investigaciones Biológicas (CSIC), 28040 Madrid, Spain
|
28
|
Dupuis-Sandoval F, Poirier M, Scott MS. The emerging landscape of small nucleolar RNAs in cell biology. WILEY INTERDISCIPLINARY REVIEWS-RNA 2015; 6:381-97. [PMID: 25879954 PMCID: PMC4696412 DOI: 10.1002/wrna.1284] [Citation(s) in RCA: 155] [Impact Index Per Article: 17.2] [Received: 01/16/2015] [Revised: 03/18/2015] [Accepted: 03/20/2015] [Indexed: 01/07/2023]
Abstract
Small nucleolar RNAs (snoRNAs) are a large class of small noncoding RNAs present in all eukaryotes sequenced thus far. As a family, they have been well characterized as playing a central role in ribosome biogenesis, guiding either the sequence-specific chemical modification of pre-rRNA (ribosomal RNA) or its processing. However, in higher eukaryotes, numerous orphan snoRNAs were described over a decade ago, with no known target or ascribed function, suggesting the possibility of alternative cellular functionality. In recent years, thanks in great part to advances in sequencing methodologies, we have seen many examples of the diversity that exists in the snoRNA family on multiple levels. In this review, we discuss the identification of novel snoRNA members, of unexpected binding partners, as well as the clarification and extension of the snoRNA target space and the characterization of diverse new noncanonical functions, painting a new and extended picture of the snoRNA landscape. Under the deluge of novel features and functions that have recently come to light, snoRNAs emerge as a central, dynamic, and highly versatile group of small regulatory RNAs. WIREs RNA 2015, 6:381–397. doi: 10.1002/wrna.1284
Affiliation(s)
- Fabien Dupuis-Sandoval
- Biochemistry Department, Faculty of Medicine and Health Sciences, University of Sherbrooke, Sherbrooke, Canada
| | - Mikaël Poirier
- Biochemistry Department, Faculty of Medicine and Health Sciences, University of Sherbrooke, Sherbrooke, Canada
| | - Michelle S Scott
- Biochemistry Department and RNA Group, Faculty of Medicine and Health Sciences, University of Sherbrooke, Sherbrooke, Canada
|
29
|
Abstract
Riboswitches present a ubiquitous genetic regulatory mechanism for prokaryotes and have been found in HIV1, fungi, plants, and even H. sapiens. We present an overview of approaches to predict riboswitch aptamers and, more generally, RNA conformational switches.
Affiliation(s)
- P Clote
- Biology Department, Boston College, Boston, Massachusetts, USA.
|
30
|
Khisamutdinov EF, Bui MNH, Jasinski D, Zhao Z, Cui Z, Guo P. Simple Method for Constructing RNA Triangle, Square, Pentagon by Tuning Interior RNA 3WJ Angle from 60° to 90° or 108°. Methods Mol Biol 2015; 1316:181-93. [PMID: 25967062 DOI: 10.1007/978-1-4939-2730-2_15] [Citation(s) in RCA: 16] [Impact Index Per Article: 1.8] [Indexed: 05/27/2023]
Abstract
Precise control over the shape of nanometer-scale architectures is intriguing but extremely challenging. RNA has recently emerged as a unique material and thermostable building block for nanoparticle construction. Here, we describe a simple method, from design to synthesis, for building RNA triangles, squares, and pentagons by stretching the native angle of the RNA three-way junction (3WJ) of the pRNA from the bacteriophage phi29 dsDNA packaging motor from 60° to 90° and 108°. These methods for constructing polygons can be applied to other RNA building blocks, including RNA 4-way, 5-way, and other multi-way junctions.
Affiliation(s)
- Emil F Khisamutdinov
- Department of Pharmaceutical Sciences, College of Pharmacy, University of Kentucky, Lexington, KY, 40536, USA
|
31
|
Dritsou V, Deligianni E, Dialynas E, Allen J, Poulakakis N, Louis C, Lawson D, Topalis P. Non-coding RNA gene families in the genomes of anopheline mosquitoes. BMC Genomics 2014; 15:1038. [PMID: 25432596 PMCID: PMC4300560 DOI: 10.1186/1471-2164-15-1038] [Citation(s) in RCA: 9] [Impact Index Per Article: 0.9] [Received: 08/18/2014] [Accepted: 11/19/2014] [Indexed: 12/12/2022]
Abstract
Background: Only a small fraction of the mosquito species of the genus Anopheles are able to transmit malaria, one of the biggest killer diseases of poverty, which is mostly prevalent in the tropics. This diversity has genetic, yet unknown, causes. In a further attempt to contribute to the elucidation of these variances, the international “Anopheles Genomes Cluster Consortium” project (a.k.a. “16 Anopheles genomes project”) was established, aiming at a comprehensive genomic analysis of several anopheline species, most of which are malaria vectors. In the frame of the international consortium carrying out this project our team studied the genes encoding families of non-coding RNAs (ncRNAs), concentrating on four classes: microRNA (miRNA), ribosomal RNA (rRNA), small nuclear RNA (snRNA), and in particular small nucleolar RNA (snoRNA) and, finally, transfer RNA (tRNA).
Results: Our analysis was carried out using, exclusively, computational approaches, and evaluating both the primary NGS reads as well as the respective genome assemblies produced by the consortium and stored in VectorBase; moreover, the results of RNAseq surveys in cases in which these were available and meaningful were also accessed in order to obtain supplementary data, as were “pre-genomic era” sequence data stored in nucleic acid databases. The investigation included the identification and analysis, in most species studied, of ncRNA genes belonging to several families, as well as the analysis of the evolutionary relations of some of those genes in cross-comparisons to other members of the genus Anopheles.
Conclusions: Our study led to the identification of members of these gene families in the majority of twenty different anopheline taxa. A set of tools for the study of the evolution and molecular biology of important disease vectors has, thus, been obtained.
Electronic supplementary material: The online version of this article (doi:10.1186/1471-2164-15-1038) contains supplementary material, which is available to authorized users.
Affiliation(s)
- Pantelis Topalis
- Institute of Molecular Biology and Biotechnology, FORTH, Heraklion, Greece.
|
32
|
Neafsey DE, Waterhouse RM, Abai MR, Aganezov SS, Alekseyev MA, Allen JE, Amon J, Arcà B, Arensburger P, Artemov G, Assour LA, Basseri H, Berlin A, Birren BW, Blandin SA, Brockman AI, Burkot TR, Burt A, Chan CS, Chauve C, Chiu JC, Christensen M, Costantini C, Davidson VLM, Deligianni E, Dottorini T, Dritsou V, Gabriel SB, Guelbeogo WM, Hall AB, Han MV, Hlaing T, Hughes DST, Jenkins AM, Jiang X, Jungreis I, Kakani EG, Kamali M, Kemppainen P, Kennedy RC, Kirmitzoglou IK, Koekemoer LL, Laban N, Langridge N, Lawniczak MKN, Lirakis M, Lobo NF, Lowy E, MacCallum RM, Mao C, Maslen G, Mbogo C, McCarthy J, Michel K, Mitchell SN, Moore W, Murphy KA, Naumenko AN, Nolan T, Novoa EM, O'Loughlin S, Oringanje C, Oshaghi MA, Pakpour N, Papathanos PA, Peery AN, Povelones M, Prakash A, Price DP, Rajaraman A, Reimer LJ, Rinker DC, Rokas A, Russell TL, Sagnon N, Sharakhova MV, Shea T, Simão FA, Simard F, Slotman MA, Somboon P, Stegniy V, Struchiner CJ, Thomas GWC, Tojo M, Topalis P, Tubio JMC, Unger MF, Vontas J, Walton C, Wilding CS, Willis JH, Wu YC, Yan G, Zdobnov EM, Zhou X, Catteruccia F, Christophides GK, Collins FH, Cornman RS, Crisanti A, Donnelly MJ, Emrich SJ, Fontaine MC, Gelbart W, Hahn MW, Hansen IA, Howell PI, Kafatos FC, Kellis M, Lawson D, Louis C, Luckhart S, Muskavitch MAT, Ribeiro JM, Riehle MA, Sharakhov IV, Tu Z, Zwiebel LJ, Besansky NJ. Mosquito genomics. Highly evolvable malaria vectors: the genomes of 16 Anopheles mosquitoes. Science 2014; 347:1258522. [PMID: 25554792 DOI: 10.1126/science.1258522] [Citation(s) in RCA: 369] [Impact Index Per Article: 36.9] [Indexed: 11/02/2022]
Abstract
Variation in vectorial capacity for human malaria among Anopheles mosquito species is determined by many factors, including behavior, immunity, and life history. To investigate the genomic basis of vectorial capacity and explore new avenues for vector control, we sequenced the genomes of 16 anopheline mosquito species from diverse locations spanning ~100 million years of evolution. Comparative analyses show faster rates of gene gain and loss, elevated gene shuffling on the X chromosome, and more intron losses, relative to Drosophila. Some determinants of vectorial capacity, such as chemosensory genes, do not show elevated turnover but instead diversify through protein-sequence changes. This dynamism of anopheline genes and genomes may contribute to their flexible capacity to take advantage of new ecological niches, including adapting to humans as primary hosts.
Affiliation(s)
- Daniel E Neafsey
- Genome Sequencing and Analysis Program, Broad Institute, 415 Main Street, Cambridge, MA 02142, USA.
| | - Robert M Waterhouse
- Computer Science and Artificial Intelligence Laboratory, Massachusetts Institute of Technology, 32 Vassar Street, Cambridge, MA 02139, USA. The Broad Institute of Massachusetts Institute of Technology and Harvard, 415 Main Street, Cambridge, MA 02142, USA. Department of Genetic Medicine and Development, University of Geneva Medical School, Rue Michel-Servet 1, 1211 Geneva, Switzerland. Swiss Institute of Bioinformatics, Rue Michel-Servet 1, 1211 Geneva, Switzerland
| | - Mohammad R Abai
- Department of Medical Entomology and Vector Control, School of Public Health and Institute of Health Researches, Tehran University of Medical Sciences, Tehran, Iran
| | - Sergey S Aganezov
- George Washington University, Department of Mathematics and Computational Biology Institute, 45085 University Drive, Ashburn, VA 20147, USA
| | - Max A Alekseyev
- George Washington University, Department of Mathematics and Computational Biology Institute, 45085 University Drive, Ashburn, VA 20147, USA
| | - James E Allen
- European Molecular Biology Laboratory, European Bioinformatics Institute, EMBL-EBI, Wellcome Trust Genome Campus, Hinxton, Cambridge CB10 1SD, UK
| | - James Amon
- National Vector Borne Disease Control Programme, Ministry of Health, Tafea Province, Vanuatu
| | - Bruno Arcà
- Department of Public Health and Infectious Diseases, Division of Parasitology, Sapienza University of Rome, Piazzale Aldo Moro 5, 00185 Rome, Italy
| | - Peter Arensburger
- Department of Biological Sciences, California State Polytechnic-Pomona, 3801 West Temple Avenue, Pomona, CA 91768, USA
| | - Gleb Artemov
- Tomsk State University, 36 Lenina Avenue, Tomsk, Russia
| | - Lauren A Assour
- Department of Computer Science and Engineering, Eck Institute for Global Health, 211B Cushing Hall, University of Notre Dame, Notre Dame, IN 46556, USA
| | - Hamidreza Basseri
- Department of Medical Entomology and Vector Control, School of Public Health and Institute of Health Researches, Tehran University of Medical Sciences, Tehran, Iran
| | - Aaron Berlin
- Genome Sequencing and Analysis Program, Broad Institute, 415 Main Street, Cambridge, MA 02142, USA
| | - Bruce W Birren
- Genome Sequencing and Analysis Program, Broad Institute, 415 Main Street, Cambridge, MA 02142, USA
| | - Stephanie A Blandin
- Inserm, U963, F-67084 Strasbourg, France. CNRS, UPR9022, IBMC, F-67084 Strasbourg, France
| | - Andrew I Brockman
- Department of Life Sciences, Imperial College London, South Kensington Campus, London SW7 2AZ, UK
| | - Thomas R Burkot
- Faculty of Medicine, Health and Molecular Science, Australian Institute of Tropical Health Medicine, James Cook University, Cairns 4870, Australia
| | - Austin Burt
- Department of Life Sciences, Imperial College London, Silwood Park Campus, Ascot SL5 7PY, UK
| | - Clara S Chan
- Computer Science and Artificial Intelligence Laboratory, Massachusetts Institute of Technology, 32 Vassar Street, Cambridge, MA 02139, USA. The Broad Institute of Massachusetts Institute of Technology and Harvard, 415 Main Street, Cambridge, MA 02142, USA
| | - Cedric Chauve
- Department of Mathematics, Simon Fraser University, 8888 University Drive, Burnaby, BC V5A 1S6, Canada
| | - Joanna C Chiu
- Department of Entomology and Nematology, One Shields Avenue, University of California-Davis, Davis, CA 95616, USA
| | - Mikkel Christensen
- European Molecular Biology Laboratory, European Bioinformatics Institute, EMBL-EBI, Wellcome Trust Genome Campus, Hinxton, Cambridge CB10 1SD, UK
| | - Carlo Costantini
- Institut de Recherche pour le Développement, Unités Mixtes de Recherche Maladies Infectieuses et Vecteurs Écologie, Génétique, Évolution et Contrôle, 911, Avenue Agropolis, BP 64501 Montpellier, France
| | - Victoria L M Davidson
- Division of Biology, Kansas State University, 271 Chalmers Hall, Manhattan, KS 66506, USA
| | - Elena Deligianni
- Institute of Molecular Biology and Biotechnology, Foundation for Research and Technology, Hellas, Nikolaou Plastira 100 GR-70013, Heraklion, Crete, Greece
| | - Tania Dottorini
- Department of Life Sciences, Imperial College London, South Kensington Campus, London SW7 2AZ, UK
| | - Vicky Dritsou
- Centre of Functional Genomics, University of Perugia, Perugia, Italy
| | - Stacey B Gabriel
- Genomics Platform, Broad Institute, 415 Main Street, Cambridge, MA 02142, USA
| | - Wamdaogo M Guelbeogo
- Centre National de Recherche et de Formation sur le Paludisme, Ouagadougou 01 BP 2208, Burkina Faso
| | - Andrew B Hall
- Program of Genetics, Bioinformatics, and Computational Biology, Virginia Polytechnic Institute and State University, Blacksburg, VA 24061, USA
| | - Mira V Han
- School of Life Sciences, University of Nevada, Las Vegas, NV 89154, USA
| | - Thaung Hlaing
- Department of Medical Research, No. 5 Ziwaka Road, Dagon Township, Yangon 11191, Myanmar
| | - Daniel S T Hughes
- European Molecular Biology Laboratory, European Bioinformatics Institute, EMBL-EBI, Wellcome Trust Genome Campus, Hinxton, Cambridge CB10 1SD, UK. Baylor College of Medicine, 1 Baylor Plaza, Houston, TX 77030, USA
| | - Adam M Jenkins
- Boston College, 140 Commonwealth Avenue, Chestnut Hill, MA 02467, USA
| | - Xiaofang Jiang
- Department of Biochemistry, Virginia Polytechnic Institute and State University, Blacksburg, VA 24061, USA. Program of Genetics, Bioinformatics, and Computational Biology, Virginia Polytechnic Institute and State University, Blacksburg, VA 24061, USA
| | - Irwin Jungreis
- Computer Science and Artificial Intelligence Laboratory, Massachusetts Institute of Technology, 32 Vassar Street, Cambridge, MA 02139, USA. The Broad Institute of Massachusetts Institute of Technology and Harvard, 415 Main Street, Cambridge, MA 02142, USA
| | - Evdoxia G Kakani
- Harvard School of Public Health, Department of Immunology and Infectious Diseases, Boston, MA 02115, USA. Dipartimento di Medicina Sperimentale e Scienze Biochimiche, Università degli Studi di Perugia, Perugia, Italy
| | - Maryam Kamali
- Department of Entomology, Virginia Polytechnic Institute and State University, Blacksburg, VA 24061, USA
| | - Petri Kemppainen
- Computational Evolutionary Biology Group, Faculty of Life Sciences, University of Manchester, Oxford Road, Manchester M13 9PT, UK
| | - Ryan C Kennedy
- Department of Bioengineering and Therapeutic Sciences, University of California, San Francisco, CA 94143, USA
| | - Ioannis K Kirmitzoglou
- Department of Life Sciences, Imperial College London, South Kensington Campus, London SW7 2AZ, UK. Bioinformatics Research Laboratory, Department of Biological Sciences, New Campus, University of Cyprus, CY 1678 Nicosia, Cyprus
| | - Lizette L Koekemoer
- Wits Research Institute for Malaria, Faculty of Health Sciences, and Vector Control Reference Unit, National Institute for Communicable Diseases of the National Health Laboratory Service, Sandringham 2131, Johannesburg, South Africa
| | - Njoroge Laban
- National Museums of Kenya, P.O. Box 40658-00100, Nairobi, Kenya
| | - Nicholas Langridge
- European Molecular Biology Laboratory, European Bioinformatics Institute, EMBL-EBI, Wellcome Trust Genome Campus, Hinxton, Cambridge CB10 1SD, UK
| | - Mara K N Lawniczak
- Department of Life Sciences, Imperial College London, South Kensington Campus, London SW7 2AZ, UK
| | - Manolis Lirakis
- Department of Biology, University of Crete, 700 13 Heraklion, Greece
| | - Neil F Lobo
- Eck Institute for Global Health and Department of Biological Sciences, University of Notre Dame, 317 Galvin Life Sciences Building, Notre Dame, IN 46556, USA
| | - Ernesto Lowy
- European Molecular Biology Laboratory, European Bioinformatics Institute, EMBL-EBI, Wellcome Trust Genome Campus, Hinxton, Cambridge CB10 1SD, UK
| | - Robert M MacCallum
- Department of Life Sciences, Imperial College London, South Kensington Campus, London SW7 2AZ, UK
| | - Chunhong Mao
- Virginia Bioinformatics Institute, 1015 Life Science Circle, Virginia Polytechnic Institute and State University, Blacksburg, VA 24061, USA
| | - Gareth Maslen
- European Molecular Biology Laboratory, European Bioinformatics Institute, EMBL-EBI, Wellcome Trust Genome Campus, Hinxton, Cambridge CB10 1SD, UK
| | - Charles Mbogo
- Kenya Medical Research Institute-Wellcome Trust Research Programme, Centre for Geographic Medicine Research - Coast, P.O. Box 230-80108, Kilifi, Kenya
| | - Jenny McCarthy
- Department of Biological Sciences, California State Polytechnic-Pomona, 3801 West Temple Avenue, Pomona, CA 91768, USA
| | - Kristin Michel
- Division of Biology, Kansas State University, 271 Chalmers Hall, Manhattan, KS 66506, USA
| | - Sara N Mitchell
- Harvard School of Public Health, Department of Immunology and Infectious Diseases, Boston, MA 02115, USA
| | - Wendy Moore
- Department of Entomology, 1140 East South Campus Drive, Forbes 410, University of Arizona, Tucson, AZ 85721, USA
| | - Katherine A Murphy
- Department of Entomology and Nematology, One Shields Avenue, University of California-Davis, Davis, CA 95616, USA
| | - Anastasia N Naumenko
- Department of Entomology, Virginia Polytechnic Institute and State University, Blacksburg, VA 24061, USA
| | - Tony Nolan
- Department of Life Sciences, Imperial College London, South Kensington Campus, London SW7 2AZ, UK
| | - Eva M Novoa
- Computer Science and Artificial Intelligence Laboratory, Massachusetts Institute of Technology, 32 Vassar Street, Cambridge, MA 02139, USA. The Broad Institute of Massachusetts Institute of Technology and Harvard, 415 Main Street, Cambridge, MA 02142, USA
| | - Samantha O'Loughlin
- Department of Life Sciences, Imperial College London, Silwood Park Campus, Ascot SL5 7PY, UK
| | - Chioma Oringanje
- Department of Entomology, 1140 East South Campus Drive, Forbes 410, University of Arizona, Tucson, AZ 85721, USA
| | - Mohammad A Oshaghi
- Department of Medical Entomology and Vector Control, School of Public Health and Institute of Health Researches, Tehran University of Medical Sciences, Tehran, Iran
| | - Nazzy Pakpour
- Department of Medical Microbiology and Immunology, School of Medicine, University of California Davis, One Shields Avenue, Davis, CA 95616, USA
| | - Philippos A Papathanos
- Department of Life Sciences, Imperial College London, South Kensington Campus, London SW7 2AZ, UK. Centre of Functional Genomics, University of Perugia, Perugia, Italy
| | - Ashley N Peery
- Department of Entomology, Virginia Polytechnic Institute and State University, Blacksburg, VA 24061, USA
| | - Michael Povelones
- Department of Pathobiology, University of Pennsylvania School of Veterinary Medicine, 3800 Spruce Street, Philadelphia, PA 19104, USA
| | - Anil Prakash
- Regional Medical Research Centre NE, Indian Council of Medical Research, P.O. Box 105, Dibrugarh-786 001, Assam, India
| | - David P Price
- Department of Biology, New Mexico State University, Las Cruces, NM 88003, USA. Molecular Biology Program, New Mexico State University, Las Cruces, NM 88003, USA
| | - Ashok Rajaraman
- Department of Mathematics, Simon Fraser University, 8888 University Drive, Burnaby, BC V5A 1S6, Canada
| | - Lisa J Reimer
- Department of Vector Biology, Liverpool School of Tropical Medicine, Pembroke Place, Liverpool, L3 5QA, UK
| | - David C Rinker
- Center for Human Genetics Research, Vanderbilt University Medical Center, Nashville, TN 37235, USA
| | - Antonis Rokas
- Center for Human Genetics Research, Vanderbilt University Medical Center, Nashville, TN 37235, USA. Department of Biological Sciences, Vanderbilt University, Nashville, TN 37235, USA
| | - Tanya L Russell
- Faculty of Medicine, Health and Molecular Science, Australian Institute of Tropical Health Medicine, James Cook University, Cairns 4870, Australia
| | - N'Fale Sagnon
- Centre National de Recherche et de Formation sur le Paludisme, Ouagadougou 01 BP 2208, Burkina Faso
| | - Maria V Sharakhova
- Department of Entomology, Virginia Polytechnic Institute and State University, Blacksburg, VA 24061, USA
| | - Terrance Shea
- Genome Sequencing and Analysis Program, Broad Institute, 415 Main Street, Cambridge, MA 02142, USA
| | - Felipe A Simão
- Department of Genetic Medicine and Development, University of Geneva Medical School, Rue Michel-Servet 1, 1211 Geneva, Switzerland. Swiss Institute of Bioinformatics, Rue Michel-Servet 1, 1211 Geneva, Switzerland
| | - Frederic Simard
- Institut de Recherche pour le Développement, Unités Mixtes de Recherche Maladies Infectieuses et Vecteurs Écologie, Génétique, Évolution et Contrôle, 911, Avenue Agropolis, BP 64501 Montpellier, France
| | - Michel A Slotman
- Department of Entomology, Texas A&M University, College Station, TX 77807, USA
| | - Pradya Somboon
- Department of Parasitology, Faculty of Medicine, Chiang Mai University, Chiang Mai 50200, Thailand
| | - Claudio J Struchiner
- Fundação Oswaldo Cruz, Avenida Brasil 4365, RJ Brazil. Instituto de Medicina Social, Universidade do Estado do Rio de Janeiro, Rio de Janeiro, Brazil
| | - Gregg W C Thomas
- School of Informatics and Computing, Indiana University, Bloomington, IN 47405, USA
| | - Marta Tojo
- Department of Physiology, School of Medicine, Center for Research in Molecular Medicine and Chronic Diseases, Instituto de Investigaciones Sanitarias, University of Santiago de Compostela, Santiago de Compostela, A Coruña, Spain
| | - Pantelis Topalis
- Institute of Molecular Biology and Biotechnology, Foundation for Research and Technology, Hellas, Nikolaou Plastira 100 GR-70013, Heraklion, Crete, Greece
| | - José M C Tubio
- Wellcome Trust Sanger Institute, Hinxton, Cambridgeshire, CB10 1SA, UK
| | - Maria F Unger
- Eck Institute for Global Health and Department of Biological Sciences, University of Notre Dame, 317 Galvin Life Sciences Building, Notre Dame, IN 46556, USA
| | - John Vontas
- Department of Biology, University of Crete, 700 13 Heraklion, Greece
| | - Catherine Walton
- Computational Evolutionary Biology Group, Faculty of Life Sciences, University of Manchester, Oxford Road, Manchester M13 9PT, UK
| | - Craig S Wilding
- School of Natural Sciences and Psychology, Liverpool John Moores University, Liverpool L3 3AF, UK
| | - Judith H Willis
- Department of Cellular Biology, University of Georgia, Athens, GA 30602, USA
| | - Yi-Chieh Wu
- Computer Science and Artificial Intelligence Laboratory, Massachusetts Institute of Technology, 32 Vassar Street, Cambridge, MA 02139, USA. The Broad Institute of Massachusetts Institute of Technology and Harvard, 415 Main Street, Cambridge, MA 02142, USA. Department of Computer Science, Harvey Mudd College, Claremont, CA 91711, USA
| | - Guiyun Yan
- Program in Public Health, College of Health Sciences, University of California, Irvine, Hewitt Hall, Irvine, CA 92697, USA
| | - Evgeny M Zdobnov
- Department of Genetic Medicine and Development, University of Geneva Medical School, Rue Michel-Servet 1, 1211 Geneva, Switzerland. Swiss Institute of Bioinformatics, Rue Michel-Servet 1, 1211 Geneva, Switzerland
| | - Xiaofan Zhou
- Department of Biological Sciences, Vanderbilt University, Nashville, TN 37235, USA
| | - Flaminia Catteruccia
- Harvard School of Public Health, Department of Immunology and Infectious Diseases, Boston, MA 02115, USA. Dipartimento di Medicina Sperimentale e Scienze Biochimiche, Università degli Studi di Perugia, Perugia, Italy
| | - George K Christophides
- Department of Life Sciences, Imperial College London, South Kensington Campus, London SW7 2AZ, UK
| | - Frank H Collins
- Eck Institute for Global Health and Department of Biological Sciences, University of Notre Dame, 317 Galvin Life Sciences Building, Notre Dame, IN 46556, USA
| | - Robert S Cornman
- Department of Cellular Biology, University of Georgia, Athens, GA 30602, USA
| | - Andrea Crisanti
- Department of Life Sciences, Imperial College London, South Kensington Campus, London SW7 2AZ, UK. Centre of Functional Genomics, University of Perugia, Perugia, Italy
| | - Martin J Donnelly
- Department of Vector Biology, Liverpool School of Tropical Medicine, Pembroke Place, Liverpool, L3 5QA, UK. Malaria Programme, Wellcome Trust Sanger Institute, Cambridge CB10 1SJ, UK
| | - Scott J Emrich
- Department of Computer Science and Engineering, Eck Institute for Global Health, 211B Cushing Hall, University of Notre Dame, Notre Dame, IN 46556, USA
| | - Michael C Fontaine
- Eck Institute for Global Health and Department of Biological Sciences, University of Notre Dame, 317 Galvin Life Sciences Building, Notre Dame, IN 46556, USA. Centre of Evolutionary and Ecological Studies (Marine Evolution and Conservation group), University of Groningen, Nijenborgh 7, NL-9747 AG Groningen, Netherlands
| | - William Gelbart
- Department of Molecular and Cellular Biology, Harvard University, 16 Divinity Avenue, Cambridge, MA 02138, USA
| | - Matthew W Hahn
- Department of Biology, Indiana University, Bloomington, IN 47405, USA. School of Informatics and Computing, Indiana University, Bloomington, IN 47405, USA
| | - Immo A Hansen
- Department of Biology, New Mexico State University, Las Cruces, NM 88003, USA. Molecular Biology Program, New Mexico State University, Las Cruces, NM 88003, USA
| | - Paul I Howell
- Centers for Disease Control and Prevention, 1600 Clifton Road NE MSG49, Atlanta, GA 30329, USA
| | - Fotis C Kafatos
- Department of Life Sciences, Imperial College London, South Kensington Campus, London SW7 2AZ, UK
| | - Manolis Kellis
- Computer Science and Artificial Intelligence Laboratory, Massachusetts Institute of Technology, 32 Vassar Street, Cambridge, MA 02139, USA. The Broad Institute of Massachusetts Institute of Technology and Harvard, 415 Main Street, Cambridge, MA 02142, USA
| | - Daniel Lawson
- European Molecular Biology Laboratory, European Bioinformatics Institute, EMBL-EBI, Wellcome Trust Genome Campus, Hinxton, Cambridge CB10 1SD, UK
| | - Christos Louis
- Department of Biology, University of Crete, 700 13 Heraklion, Greece. Institute of Molecular Biology and Biotechnology, Foundation for Research and Technology, Hellas, Nikolaou Plastira 100 GR-70013, Heraklion, Crete, Greece. Centre of Functional Genomics, University of Perugia, Perugia, Italy
| | - Shirley Luckhart
- Department of Medical Microbiology and Immunology, School of Medicine, University of California Davis, One Shields Avenue, Davis, CA 95616, USA
| | - Marc A T Muskavitch
- Boston College, 140 Commonwealth Avenue, Chestnut Hill, MA 02467, USA. Biogen Idec, 14 Cambridge Center, Cambridge, MA 02142, USA
| | - José M Ribeiro
- Laboratory of Malaria and Vector Research, National Institute of Allergy and Infectious Diseases, 12735 Twinbrook Parkway, Rockville, MD 20852, USA
| | - Michael A Riehle
- Department of Entomology, 1140 East South Campus Drive, Forbes 410, University of Arizona, Tucson, AZ 85721, USA
| | - Igor V Sharakhov
- Department of Entomology, Virginia Polytechnic Institute and State University, Blacksburg, VA 24061, USA. Program of Genetics, Bioinformatics, and Computational Biology, Virginia Polytechnic Institute and State University, Blacksburg, VA 24061, USA
| | - Zhijian Tu
- Program of Genetics, Bioinformatics, and Computational Biology, Virginia Polytechnic Institute and State University, Blacksburg, VA 24061, USA. Department of Biochemistry, Virginia Polytechnic Institute and State University, Blacksburg, VA 24061, USA
| | - Laurence J Zwiebel
- Departments of Biological Sciences and Pharmacology, Institutes for Chemical Biology, Genetics and Global Health, Vanderbilt University and Medical Center, Nashville, TN 37235, USA
| | - Nora J Besansky
- Eck Institute for Global Health and Department of Biological Sciences, University of Notre Dame, 317 Galvin Life Sciences Building, Notre Dame, IN 46556, USA.
|
33
|
Ortogero N, Schuster AS, Oliver DK, Riordan CR, Hong AS, Hennig GW, Luong D, Bao J, Bhetwal BP, Ro S, McCarrey JR, Yan W. A novel class of somatic small RNAs similar to germ cell pachytene PIWI-interacting small RNAs. J Biol Chem 2014; 289:32824-34. [PMID: 25320077 DOI: 10.1074/jbc.m114.613232] [Citation(s) in RCA: 24] [Impact Index Per Article: 2.4] [Indexed: 12/28/2022]
Abstract
PIWI-interacting RNAs (piRNAs) are small noncoding RNAs that bind PIWI family proteins exclusively expressed in the germ cells of mammalian gonads. MIWI2-associated piRNAs are essential for silencing transposons during primordial germ cell development, and MIWI-bound piRNAs are required for normal spermatogenesis during adulthood in mice. Although piRNAs have long been regarded as germ cell-specific, increasing lines of evidence suggest that somatic cells also express piRNA-like RNAs (pilRNAs). Here, we report the detection of abundant pilRNAs in somatic cells, which are similar to MIWI-associated piRNAs mainly expressed in pachytene spermatocytes and round spermatids in the testis. Based on small RNA deep sequencing and quantitative PCR analyses, pilRNA expression is dynamic and displays tissue specificity. Although pilRNAs are similar to pachytene piRNAs in both size and genomic origins, they have a distinct ping-pong signature. Furthermore, pilRNA biogenesis appears to utilize a yet to be identified pathway, which is different from all currently known small RNA biogenetic pathways. In addition, pilRNAs appear to preferentially target the 3'-UTRs of mRNAs in a partially complementary manner. Our data suggest that pilRNAs, as an integral component of the small RNA transcriptome in somatic cell lineages, represent a distinct population of small RNAs that may have functions similar to germ cell piRNAs.
Affiliation(s)
- Nicole Ortogero
- Department of Physiology and Cell Biology, University of Nevada School of Medicine, Reno, Nevada 89557
- Andrew S Schuster
- Department of Physiology and Cell Biology, University of Nevada School of Medicine, Reno, Nevada 89557
- Daniel K Oliver
- Department of Physiology and Cell Biology, University of Nevada School of Medicine, Reno, Nevada 89557
- Connor R Riordan
- Department of Physiology and Cell Biology, University of Nevada School of Medicine, Reno, Nevada 89557
- Annie S Hong
- Department of Physiology and Cell Biology, University of Nevada School of Medicine, Reno, Nevada 89557
- Grant W Hennig
- Department of Physiology and Cell Biology, University of Nevada School of Medicine, Reno, Nevada 89557
- Dickson Luong
- Department of Physiology and Cell Biology, University of Nevada School of Medicine, Reno, Nevada 89557
- Jianqiang Bao
- Department of Physiology and Cell Biology, University of Nevada School of Medicine, Reno, Nevada 89557
- Bhupal P Bhetwal
- Department of Physiology and Cell Biology, University of Nevada School of Medicine, Reno, Nevada 89557
- Seungil Ro
- Department of Physiology and Cell Biology, University of Nevada School of Medicine, Reno, Nevada 89557
- John R McCarrey
- Department of Biology, University of Texas at San Antonio, San Antonio, Texas 78249
- Wei Yan
- Department of Physiology and Cell Biology, University of Nevada School of Medicine, Reno, Nevada 89557
|
34
|
Gupta Y, Witte M, Möller S, Ludwig RJ, Restle T, Zillikens D, Ibrahim SM. ptRNApred: computational identification and classification of post-transcriptional RNA. Nucleic Acids Res 2014; 42:e167. [PMID: 25303994 PMCID: PMC4267668 DOI: 10.1093/nar/gku918] [Citation(s) in RCA: 6] [Impact Index Per Article: 0.6] [Indexed: 11/12/2022]
Abstract
Non-coding RNAs (ncRNAs) are known to play important functional roles in the cell. However, their identification and recognition in genomic sequences remain challenging. In silico methods, such as classification tools, offer a fast and reliable way for such screening, and multiple classifiers have already been developed to predict well-defined subfamilies of RNA. So far, however, out of all the ncRNAs, only tRNA, miRNA and snoRNA can be predicted with a satisfying sensitivity and specificity. We here present ptRNApred, a tool to detect and classify subclasses of non-coding RNA that are involved in the regulation of post-transcriptional modifications or DNA replication, which we here call post-transcriptional RNA (ptRNA). It (i) detects RNA sequences coding for post-transcriptional RNA from the genomic sequence with an overall sensitivity of 91% and a specificity of 94% and (ii) predicts ptRNA-subclasses that exist in eukaryotes: snRNA, snoRNA, RNase P, RNase MRP, Y RNA or telomerase RNA. Availability: The ptRNApred software is open for public use on http://www.ptrnapred.org/.
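For readers less familiar with the two metrics quoted in this abstract, the following minimal sketch shows how sensitivity and specificity are computed from a confusion matrix. The counts are hypothetical, chosen only so that the arithmetic reproduces the reported 91% and 94%; they are not taken from the ptRNApred paper, and the function names are illustrative.

```python
def sensitivity(true_pos: int, false_neg: int) -> float:
    """Fraction of real positives that the classifier detects."""
    return true_pos / (true_pos + false_neg)


def specificity(true_neg: int, false_pos: int) -> float:
    """Fraction of real negatives that the classifier rejects."""
    return true_neg / (true_neg + false_pos)


# Hypothetical counts for illustration only (100 positives, 100 negatives).
tp, fn = 91, 9
tn, fp = 94, 6

print(f"sensitivity = {sensitivity(tp, fn):.2f}")
print(f"specificity = {specificity(tn, fp):.2f}")
```

Reporting both values matters because a detector can trivially maximize one at the expense of the other.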
Affiliation(s)
- Yask Gupta
- Department of Dermatology, University of Lübeck, 23538 Lübeck, Germany
- Mareike Witte
- Department of Dermatology, University of Lübeck, 23538 Lübeck, Germany
- Steffen Möller
- Department of Dermatology, University of Lübeck, 23538 Lübeck, Germany
- Ralf J Ludwig
- Department of Dermatology, University of Lübeck, 23538 Lübeck, Germany
- Tobias Restle
- Institute for Molecular Medicine, University of Lübeck, 23538 Lübeck, Germany
- Detlef Zillikens
- Department of Dermatology, University of Lübeck, 23538 Lübeck, Germany
- Saleh M Ibrahim
- Department of Dermatology, University of Lübeck, 23538 Lübeck, Germany
|
35
|
Primary transcriptome map of the hyperthermophilic archaeon Thermococcus kodakarensis. BMC Genomics 2014; 15:684. [PMID: 25127548 PMCID: PMC4247193 DOI: 10.1186/1471-2164-15-684] [Citation(s) in RCA: 75] [Impact Index Per Article: 7.5] [Received: 04/28/2014] [Accepted: 07/30/2014] [Indexed: 01/02/2023]
Abstract
Background: Prokaryotes have relatively small genomes, densely packed with protein-encoding sequences. RNA sequencing has, however, revealed surprisingly complex transcriptomes and here we report the transcripts present in the model hyperthermophilic Archaeon, Thermococcus kodakarensis, under different physiological conditions. Results: Sequencing cDNA libraries, generated from RNA isolated from cells under different growth and metabolic conditions, has identified >2,700 sites of transcription initiation and established a genome-wide map of transcripts and consensus sequences for transcription initiation and post-transcription regulatory elements. The primary transcription start sites (TSS) upstream of 1,254 annotated genes, plus 644 primary TSS and their promoters within genes, are identified. Most mRNAs have a 5'-untranslated region (5'-UTR) 10 to 50 nt long (median = 16 nt), but ~20% have 5'-UTRs from 50 to 300 nt long and ~14% are leaderless. Approximately 50% of mRNAs contain a consensus ribosome binding sequence. The results identify TSS for 1,018 antisense transcripts, most with sequences complementary to either the 5'- or 3'-region of a sense mRNA, and confirm the presence of transcripts from all three CRISPR loci, the RNase P and 7S RNAs, all tRNAs and rRNAs and 69 predicted snoRNAs. Two putative riboswitch RNAs were present in growing but not in stationary phase cells. The procedure used is designed to identify TSS but, assuming that the number of cDNA reads correlates with transcript abundance, the results also provide a semi-quantitative documentation of the differences in T. kodakarensis genome expression under different growth conditions and confirm previous observations of substrate-dependent specific gene expression. Many previously unanticipated small RNAs have been identified, some with relatively low GC contents (≤50%) and sequences that do not fold readily into base-paired secondary structures, contrary to the classical expectations for non-coding RNAs in a hyperthermophile. Conclusion: The results identify >2,700 TSS, including almost all of the primary sites of transcription initiation upstream of annotated genes, plus many secondary sites, sites within genes and sites resulting in antisense transcripts. The T. kodakarensis genome is small (~2.1 Mbp) and tightly packed with protein-encoding genes, but the transcriptomes established also contain many non-coding RNAs and predict extensive RNA-based regulation in this model Archaeon.
|
36
|
Abstract
snoRNAs (small nucleolar RNAs) constitute one of the largest and best-studied classes of non-coding RNAs that confer enzymatic specificity. With associated proteins, these snoRNAs form ribonucleoprotein complexes that can direct 2'-O-methylation or pseudouridylation of target non-coding RNAs. Aided by computational methods and high-throughput sequencing, new studies have expanded the diversity of known snoRNA functions. Complexes incorporating snoRNAs have dynamic specificity, and include diverse roles in RNA silencing, telomerase maintenance and regulation of alternative splicing. Evidence that dysregulation of snoRNAs can cause human disease, including cancer, indicates that the full scope of snoRNA roles remains an unfinished story. The diversity in structure, genomic origin and function between snoRNAs found in different complexes and among different phyla illustrates the surprising plasticity of snoRNAs in evolution. The ability of snoRNAs to direct highly specific interactions with other RNAs is a consistent thread in their newly discovered functions. Because they are ubiquitous throughout Eukarya and Archaea, it is likely they were a feature of the last common ancestor of these two domains, placing their origin over two billion years ago. In the present chapter, we focus on recent advances in our understanding of these ancient, but functionally dynamic RNA-processing machines.
|
37
|
Deep profiling of the novel intermediate-size noncoding RNAs in intraerythrocytic Plasmodium falciparum. PLoS One 2014; 9:e92946. [PMID: 24713982 PMCID: PMC3979661 DOI: 10.1371/journal.pone.0092946] [Citation(s) in RCA: 11] [Impact Index Per Article: 1.1] [Received: 07/19/2013] [Accepted: 02/27/2014] [Indexed: 11/23/2022]
Abstract
Intermediate-size noncoding RNAs (is-ncRNAs) have been shown to play important regulatory roles in the development of several eukaryotic organisms. However, they have not been thoroughly explored in Plasmodium falciparum, which is the most virulent malaria parasite infecting humans. By using Illumina/Solexa paired-end sequencing of an is-ncRNA-specific library, we performed a systematic identification of novel is-ncRNAs in intraerythrocytic P. falciparum, strain 3D7. A total of 1,198 novel is-ncRNA candidates, including antisense, intergenic, and intronic is-ncRNAs, were identified. Bioinformatics analyses showed that the intergenic is-ncRNAs were the least conserved among different Plasmodium species, and antisense is-ncRNAs were more conserved than their sense counterparts. Twenty-two novel snoRNAs were identified, and eight potential novel classes of P. falciparum is-ncRNAs were revealed by clustering analysis. The expression of randomly selected novel is-ncRNAs was confirmed by RT-PCR and northern blotting assays. A clearly different expression profile of the novel is-ncRNAs between the early and late intraerythrocytic developmental stages of the parasite was observed. The expression levels of the antisense RNAs correlated with those of their cis-encoded sense RNA counterparts, suggesting that these is-ncRNAs are involved in the regulation of gene expression of the parasite. In conclusion, we accomplished a deep profiling analysis of novel is-ncRNAs in P. falciparum, analysed the conservation and structural features of these novel is-ncRNAs, and revealed their differential expression patterns during the development of the parasite. These findings provide important information for further functional characterisation of novel is-ncRNAs during the development of P. falciparum.
|
38
|
Fukunaga T, Ozaki H, Terai G, Asai K, Iwasaki W, Kiryu H. CapR: revealing structural specificities of RNA-binding protein target recognition using CLIP-seq data. Genome Biol 2014; 15:R16. [PMID: 24447569 PMCID: PMC4053987 DOI: 10.1186/gb-2014-15-1-r16] [Citation(s) in RCA: 73] [Impact Index Per Article: 7.3] [Received: 07/30/2013] [Accepted: 01/21/2014] [Indexed: 12/02/2022]
Abstract
RNA-binding proteins (RBPs) bind to their target RNA molecules by recognizing specific RNA sequences and structural contexts. The development of CLIP-seq and related protocols has made it possible to exhaustively identify RNA fragments that bind to RBPs. However, no efficient bioinformatics method exists to reveal the structural specificities of RBP–RNA interactions using these data. We present CapR, an efficient algorithm that calculates the probability that each RNA base position is located within each secondary structural context. Using CapR, we demonstrate that several RBPs bind to their target RNA molecules under specific structural contexts. CapR is available at https://sites.google.com/site/fukunagatsu/software/capr.
|
39
|
Abstract
Many different types of functional non-coding RNAs participate in a wide range of important cellular functions, but the large majority of these RNAs are not routinely annotated in published genomes. Several programs have been developed for identifying RNAs, including specific tools tailored to a particular RNA family as well as more general ones designed to work for any family. Many of these tools utilize covariance models (CMs), which are statistical models of the conserved sequence and structure of an RNA family. In this chapter, as an illustrative example, the Infernal software package and CMs from the Rfam database are used to identify RNAs in the genome of the archaeon Methanobrevibacter ruminantium, uncovering some additional RNAs not present in the genome's initial annotation. Analysis of the results and comparison with family-specific methods demonstrate some important strengths and weaknesses of this general approach.
Affiliation(s)
- Eric P Nawrocki
- Howard Hughes Medical Institute, Janelia Farm Research Campus, Ashburn, VA, 20147, USA
|
40
|
Abstract
Many RNA families, i.e., groups of homologous RNA genes, belong to RNA classes, such as tRNAs, snoRNAs, or microRNAs, that are characterized by common sequence motifs and/or common secondary structure features. The detection of new members of RNA classes, as well as the comprehensive annotation of genomes with members of RNA classes is a challenging task that goes beyond simple homology search. Computational methods addressing this problem typically use a three-tiered approach: In the first step an efficient and sensitive filter is employed. In the second step the candidate set is narrowed down using computationally expensive methods geared towards specificity. In the final step the hits are annotated with class-specific features and scored. Here we review the tools that are currently available for a diverse set of RNA classes.
|
41
|
Patra D, Fasold M, Langenberger D, Steger G, Grosse I, Stadler PF. plantDARIO: web based quantitative and qualitative analysis of small RNA-seq data in plants. FRONTIERS IN PLANT SCIENCE 2014; 5:708. [PMID: 25566282 PMCID: PMC4274896 DOI: 10.3389/fpls.2014.00708] [Citation(s) in RCA: 12] [Impact Index Per Article: 1.2] [Received: 06/30/2014] [Accepted: 11/26/2014] [Indexed: 05/11/2023]
Abstract
High-throughput sequencing techniques have made it possible to assay an organism's entire repertoire of small non-coding RNAs (ncRNAs) in an efficient and cost-effective manner. The moderate size of small RNA-seq datasets makes it feasible to provide free web services to the research community that provide many basic features of a small RNA-seq analysis, including quality control, read normalization, ncRNA quantification, and the prediction of putative novel ncRNAs. DARIO is one such system that so far has been focussed on animals. Here we introduce an extension of this system to plant short non-coding RNAs (sncRNAs). It includes major modifications to cope with plant-specific sncRNA processing. The current version of plantDARIO covers analyses of mapping files, small RNA-seq quality control, expression analyses of annotated sncRNAs, including the prediction of novel miRNAs and snoRNAs from unknown expressed loci and expression analyses of user-defined loci. At present Arabidopsis thaliana, Beta vulgaris, and Solanum lycopersicum are covered. The web tool links to a plant specific visualization browser to display the read distribution of the analyzed sample. The easy-to-use platform of plantDARIO quantifies RNA expression of annotated sncRNAs from different sncRNA databases together with new sncRNAs, annotated by our group. The plantDARIO website can be accessed at http://plantdario.bioinf.uni-leipzig.de/.
Affiliation(s)
- Deblina Patra
- Institut für Informatik, Martin-Luther-Universität Halle-Wittenberg, Halle (Saale), Germany
- Bioinformatics Group, Department of Computer Science, Interdisciplinary Center for Bioinformatics, University Leipzig, Leipzig, Germany
- Mario Fasold
- Bioinformatics Group, Department of Computer Science, Interdisciplinary Center for Bioinformatics, University Leipzig, Leipzig, Germany
- ecSeq Bioinformatics, Leipzig, Germany
- David Langenberger
- Bioinformatics Group, Department of Computer Science, Interdisciplinary Center for Bioinformatics, University Leipzig, Leipzig, Germany
- ecSeq Bioinformatics, Leipzig, Germany
- Gerhard Steger
- Institut für Physikalische Biologie, Heinrich-Heine-Universität, Düsseldorf, Germany
- Ivo Grosse
- Institut für Informatik, Martin-Luther-Universität Halle-Wittenberg, Halle (Saale), Germany
- German Centre for Integrative Biodiversity Research (iDiv) Halle-Jena-Leipzig, Leipzig, Germany
- Peter F. Stadler
- Bioinformatics Group, Department of Computer Science, Interdisciplinary Center for Bioinformatics, University Leipzig, Leipzig, Germany
- German Centre for Integrative Biodiversity Research (iDiv) Halle-Jena-Leipzig, Leipzig, Germany
- Max Planck Institute for Mathematics in the Sciences, Leipzig, Germany
- Fraunhofer Institute for Cell Therapy and Immunology, Leipzig, Germany
- Department of Theoretical Chemistry of the University of Vienna, Vienna, Austria
- Center for RNA in Technology and Health, University of Copenhagen, Frederiksberg, Denmark
- Santa Fe Institute, Santa Fe, USA
- *Correspondence: Peter F. Stadler, Bioinformatics Group, Department of Computer Science, University Leipzig, Härtelstrasse 16-18, D-04107 Leipzig, Germany e-mail:
|
42
|
Bartschat S, Kehr S, Tafer H, Stadler PF, Hertel J. snoStrip: a snoRNA annotation pipeline. Bioinformatics 2013; 30:115-6. [PMID: 24174566 DOI: 10.1093/bioinformatics/btt604] [Citation(s) in RCA: 21] [Impact Index Per Article: 1.9] [Indexed: 11/13/2022]
Abstract
Motivation: Although small nucleolar RNAs form an important class of non-coding RNAs, no comprehensive annotation efforts have been undertaken, presumably because the task is complicated by both the large number of distinct small nucleolar RNA families and their relatively rapid pace of sequence evolution. Results: With snoStrip we present an automatic annotation pipeline developed specifically for comparative genomics of small nucleolar RNAs. It makes use of sequence conservation, canonical box motifs as well as secondary structure and predicts putative targets. Availability and implementation: The snoStrip web service and the download version are available at http://snostrip.bioinf.uni-leipzig.de/
Affiliation(s)
- Sebastian Bartschat
- Bioinformatics Group, Department of Computer Science, and Interdisciplinary Center for Bioinformatics, University of Leipzig, Härtelstrasse 16-18, D-04107 Leipzig, Germany
|
43
|
Kishore S, Gruber AR, Jedlinski DJ, Syed AP, Jorjani H, Zavolan M. Insights into snoRNA biogenesis and processing from PAR-CLIP of snoRNA core proteins and small RNA sequencing. Genome Biol 2013; 14:R45. [PMID: 23706177 PMCID: PMC4053766 DOI: 10.1186/gb-2013-14-5-r45] [Citation(s) in RCA: 98] [Impact Index Per Article: 8.9] [Received: 03/18/2013] [Revised: 05/15/2013] [Accepted: 05/26/2013] [Indexed: 11/10/2022]
Abstract
Background: In recent years, a variety of small RNAs derived from other RNAs with well-known functions, such as tRNAs and snoRNAs, have been identified. The functional relevance of these RNAs is largely unknown. To gain insight into the complexity of snoRNA processing and the functional relevance of snoRNA-derived small RNAs, we sequence long and short RNAs, small RNAs that co-precipitate with the Argonaute 2 protein and RNA fragments obtained in photoreactive nucleotide-enhanced crosslinking and immunoprecipitation (PAR-CLIP) of core snoRNA-associated proteins. Results: Analysis of these data sets reveals that many loci in the human genome reproducibly give rise to C/D box-like snoRNAs, whose expression and evolutionary conservation are typically less pronounced relative to the snoRNAs that are currently cataloged. We further find that virtually all C/D box snoRNAs are specifically processed inside the regions of terminal complementarity, retaining in the mature form only 4-5 nucleotides upstream of the C box and 2-5 nucleotides downstream of the D box. Sequencing of the total and Argonaute 2-associated populations of small RNAs reveals that despite their cellular abundance, C/D box-derived small RNAs are not efficiently incorporated into the Ago2 protein. Conclusions: We conclude that the human genome encodes a large number of snoRNAs that are processed along the canonical pathway and expressed at relatively low levels. Generation of snoRNA-derived processing products with alternative, particularly miRNA-like, functions appears to be uncommon.
Affiliation(s)
- Shivendra Kishore
- Computational and Systems Biology, Biozentrum, University of Basel, Klingelbergstrasse 50-70, 4056 Basel, Switzerland
- Andreas R Gruber
- Computational and Systems Biology, Biozentrum, University of Basel, Klingelbergstrasse 50-70, 4056 Basel, Switzerland
- Dominik J Jedlinski
- Computational and Systems Biology, Biozentrum, University of Basel, Klingelbergstrasse 50-70, 4056 Basel, Switzerland
- Afzal P Syed
- Computational and Systems Biology, Biozentrum, University of Basel, Klingelbergstrasse 50-70, 4056 Basel, Switzerland
- Hadi Jorjani
- Computational and Systems Biology, Biozentrum, University of Basel, Klingelbergstrasse 50-70, 4056 Basel, Switzerland
- Mihaela Zavolan
- Computational and Systems Biology, Biozentrum, University of Basel, Klingelbergstrasse 50-70, 4056 Basel, Switzerland
|
44
|
Ortogero N, Hennig GW, Langille C, Ro S, McCarrey JR, Yan W. Computer-assisted annotation of murine Sertoli cell small RNA transcriptome. Biol Reprod 2013; 88:3. [PMID: 23136297 DOI: 10.1095/biolreprod.112.102269] [Citation(s) in RCA: 20] [Impact Index Per Article: 1.8] [Indexed: 01/02/2023]
Abstract
Mammalian genomes encode a large number of small noncoding RNAs (sncRNAs) that play regulatory roles during development and adulthood by affecting gene expression. Several sncRNA species, including microRNAs (miRNAs), piwi-interacting RNAs (piRNAs), endogenous small interfering RNAs (endo-siRNAs), and small nucleolar RNAs (snoRNAs), are abundantly expressed in the testis and required for normal testicular development and spermatogenesis. To evaluate global changes in sncRNA expression, the next-generation sequencing (NGS)-based sncRNA transcriptomic analysis has become routine, because it allows rapid determination of the small RNA transcriptome of a particular testicular cell type. However, annotation of small RNA NGS reads can be challenging due to the volume of reads obtained, which is usually in the millions. Therefore, we developed a computer-assisted sncRNA annotation protocol that could identify not only known sncRNAs but also previously uncharacterized ones. Using this protocol, we annotated NGS reads of a Sertoli cell sncRNA library, and we report to our knowledge the first comprehensive annotation of the sncRNA transcriptome of immature murine Sertoli cells. Moreover, the computer-assisted sncRNA annotation pipeline that we report is applicable for annotating NGS reads derived from other cell types and/or sequencing platforms.
Affiliation(s)
- Nicole Ortogero
- Department of Physiology and Cell Biology, University of Nevada School of Medicine, Reno, NV 89557, USA
|
45
|
Wang D, Xia Y, Li X, Hou L, Yu J. The Rice Genome Knowledgebase (RGKbase): an annotation database for rice comparative genomics and evolutionary biology. Nucleic Acids Res 2012. [PMID: 23193278 PMCID: PMC3531066 DOI: 10.1093/nar/gks1225] [Citation(s) in RCA: 25] [Impact Index Per Article: 2.1] [Indexed: 11/26/2022]
Abstract
Over the past 10 years, genomes of cultivated rice cultivars and their wild counterparts have been sequenced, although most efforts are focused on genome assembly and annotation of two major cultivated rice (Oryza sativa L.) subspecies, 93-11 (indica) and Nipponbare (japonica). To integrate information from genome assemblies and annotations for better analysis and application, we now introduce a comparative rice genome database, the Rice Genome Knowledgebase (RGKbase, http://rgkbase.big.ac.cn/RGKbase/). RGKbase is built to have three major components: (i) integrated data curation for rice genomics and molecular biology, which includes genome sequence assemblies, transcriptomic and epigenomic data, genetic variations, quantitative trait loci (QTLs) and the relevant literature; (ii) user-friendly viewers, such as Gbrowse, GeneBrowse and Circos, for genome annotations and evolutionary dynamics; and (iii) bioinformatic tools for compositional and synteny analyses, gene family classifications, gene ontology terms and pathways and gene co-expression networks. RGKbase currently includes data from five rice cultivars and species: Nipponbare (japonica), 93-11 (indica), PA64s (indica), the African rice (Oryza glaberrima) and a wild rice species (Oryza brachyantha). We are also constantly introducing new datasets from a variety of public efforts, such as two recent releases: sequence data from ∼1000 rice varieties, which are mapped into the reference genome, yielding ample high-quality single-nucleotide polymorphisms and insertions–deletions.
Affiliation(s)
- Dapeng Wang
- CAS Key Laboratory of Genome Sciences and Information, Beijing Institute of Genomics, Chinese Academy of Sciences, Beijing 100029, PR China
|
46
|
Washietl S, Will S, Hendrix DA, Goff LA, Rinn JL, Berger B, Kellis M. Computational analysis of noncoding RNAs. WILEY INTERDISCIPLINARY REVIEWS-RNA 2012; 3:759-78. [PMID: 22991327 DOI: 10.1002/wrna.1134] [Citation(s) in RCA: 43] [Impact Index Per Article: 3.6] [Indexed: 12/13/2022]
Abstract
Noncoding RNAs have emerged as important key players in the cell. Understanding their surprisingly diverse range of functions is challenging for experimental and computational biology. Here, we review computational methods to analyze noncoding RNAs. The topics covered include basic and advanced techniques to predict RNA structures, annotation of noncoding RNAs in genomic data, mining RNA-seq data for novel transcripts and prediction of transcript structures, computational aspects of microRNAs, and database resources.
Affiliation(s)
- Stefan Washietl
- Computer Science and Artificial Intelligence Laboratory, Massachusetts Institute of Technology, Cambridge, MA, USA.
|
47
|
Xiao T, Wang Y, Luo H, Liu L, Wei G, Chen X, Sun Y, Chen X, Skogerbø G, Chen R. A differential sequencing-based analysis of the C. elegans noncoding transcriptome. RNA (NEW YORK, N.Y.) 2012; 18:626-639. [PMID: 22345127 PMCID: PMC3312551 DOI: 10.1261/rna.030965.111] [Citation(s) in RCA: 10] [Impact Index Per Article: 0.8] [Received: 10/16/2011] [Accepted: 12/22/2011] [Indexed: 05/31/2023]
Abstract
Noncoding RNAs are increasingly being recognized as important players in eukaryote biology. However, despite major efforts in mapping the Caenorhabditis elegans transcriptome over the last couple of years, nonpolyadenylated and intermediate-size noncoding RNAs (is-ncRNAs) are still incompletely explored. We have combined an enzymatic approach with full-length RNA-Seq of is-ncRNAs in C. elegans. A total of 473 novel is-ncRNAs has been identified, of which a substantial fraction was associated with transcription factor binding sites and developmentally regulated expression patterns. Analysis of sequence and secondary structure permitted classification of more than 200 is-ncRNAs into several known RNA classes, while another 33 is-ncRNAs were identified as belonging to two previously uncharacterized groups of is-ncRNAs. Three of the unclassified is-ncRNAs contain the 5' Alu domain common to SRP RNAs and specifically bound with the SRP9/14 heterodimer in vitro. One of these (inc394) showed 65% sequence identity with the human, neuron-specific BC200 RNA. Structure-based clustering analysis and in vitro binding experiments supported the notion that the nematode stem-bulge RNAs (sbRNAs) are homologs (or functional analogs) of the Y RNAs. Moreover, analysis of the differential libraries showed that some mature snoRNAs undergo secondary 5' cap modification after processing of the primary transcript, thus suggesting the existence of a wider range of functional RNAs arising from processed and modified fragments of primary transcripts.
Affiliation(s)
- Tengfei Xiao
- Laboratory of Bioinformatics and Noncoding RNA, Institute of Biophysics, Chinese Academy of Sciences, Beijing 100101, China
- Graduate School of the Chinese Academy of Science, Beijing 100080, China
- Yunfei Wang
- Laboratory of Bioinformatics and Noncoding RNA, Institute of Biophysics, Chinese Academy of Sciences, Beijing 100101, China
- Huaxia Luo
- Laboratory of Bioinformatics and Noncoding RNA, Institute of Biophysics, Chinese Academy of Sciences, Beijing 100101, China
- Graduate School of the Chinese Academy of Science, Beijing 100080, China
- Lihui Liu
- Laboratory of Bioinformatics and Noncoding RNA, Institute of Biophysics, Chinese Academy of Sciences, Beijing 100101, China
- Graduate School of the Chinese Academy of Science, Beijing 100080, China
- Guifeng Wei
- Laboratory of Bioinformatics and Noncoding RNA, Institute of Biophysics, Chinese Academy of Sciences, Beijing 100101, China
- Graduate School of the Chinese Academy of Science, Beijing 100080, China
- Xiaowei Chen
- Laboratory of Bioinformatics and Noncoding RNA, Institute of Biophysics, Chinese Academy of Sciences, Beijing 100101, China
- Graduate School of the Chinese Academy of Science, Beijing 100080, China
- Yu Sun
- Laboratory of Bioinformatics and Noncoding RNA, Institute of Biophysics, Chinese Academy of Sciences, Beijing 100101, China
- Graduate School of the Chinese Academy of Science, Beijing 100080, China
- Xiaomin Chen
- Laboratory of Bioinformatics and Noncoding RNA, Institute of Biophysics, Chinese Academy of Sciences, Beijing 100101, China
- Geir Skogerbø
- Laboratory of Bioinformatics and Noncoding RNA, Institute of Biophysics, Chinese Academy of Sciences, Beijing 100101, China
- Runsheng Chen
- Laboratory of Bioinformatics and Noncoding RNA, Institute of Biophysics, Chinese Academy of Sciences, Beijing 100101, China
|
48
|
Bratkovič T, Rogelj B. Biology and applications of small nucleolar RNAs. Cell Mol Life Sci 2011; 68:3843-51. [PMID: 21748470 PMCID: PMC11114935 DOI: 10.1007/s00018-011-0762-y] [Citation(s) in RCA: 49] [Impact Index Per Article: 3.8] [Received: 02/14/2011] [Revised: 06/20/2011] [Accepted: 06/21/2011] [Indexed: 10/18/2022]
Abstract
Small nucleolar RNAs (snoRNAs) constitute a group of non-coding RNAs principally involved in posttranscriptional modification of ubiquitously expressed ribosomal and small nuclear RNAs. However, a number of tissue-specific snoRNAs have recently been identified that apparently do not target conventional substrates and are presumed to guide processing of primary transcripts of protein-coding genes, potentially expanding the diapason of regulatory RNAs that control translation of mRNA to proteins. Here, we review biogenesis of snoRNAs and redefine their function in light of recent exciting discoveries. We also discuss the potential of recombinant snoRNAs to be used in modulation of gene expression.
Affiliation(s)
- Tomaž Bratkovič
- Department of Pharmaceutical Biology, University of Ljubljana, Slovenia.
|
49
|
Lorenz R, Bernhart SH, Höner zu Siederdissen C, Tafer H, Flamm C, Stadler PF, Hofacker IL. ViennaRNA Package 2.0. Algorithms Mol Biol 2011; 6:26. [PMID: 22115189 PMCID: PMC3319429 DOI: 10.1186/1748-7188-6-26] [Citation(s) in RCA: 3003] [Impact Index Per Article: 231.0] [Received: 08/22/2011] [Accepted: 11/24/2011] [Indexed: 12/31/2022] Open
Abstract
Background: Secondary structure forms an important intermediate level of description of nucleic acids that encapsulates the dominating part of the folding energy, is often well conserved in evolution, and is routinely used as a basis to explain experimental findings. Based on carefully measured thermodynamic parameters, exact dynamic programming algorithms can be used to compute ground states, base pairing probabilities, as well as thermodynamic properties.
Results: The ViennaRNA Package has been a widely used compilation of RNA secondary structure related computer programs for nearly two decades. Major changes in the structure of the standard energy model, the Turner 2004 parameters, the pervasive use of multi-core CPUs, and an increasing number of algorithmic variants prompted a major technical overhaul of both the underlying RNAlib and the interactive user programs. New features include an expanded repertoire of tools to assess RNA-RNA interactions and restricted ensembles of structures, additional output information such as centroid structures and maximum expected accuracy structures derived from base pairing probabilities, or z-scores for locally stable secondary structures, and support for input in fasta format. Updates were implemented without compromising the computational efficiency of the core algorithms and ensuring compatibility with earlier versions.
Conclusions: The ViennaRNA Package 2.0, supporting concurrent computations via OpenMP, can be downloaded from http://www.tbi.univie.ac.at/RNA.
|
50
|
Makarova JA, Kramerov DA. SNOntology: Myriads of novel snoRNAs or just a mirage? BMC Genomics 2011; 12:543. [PMID: 22047601 PMCID: PMC3349704 DOI: 10.1186/1471-2164-12-543] [Citation(s) in RCA: 22] [Impact Index Per Article: 1.7] [Received: 03/23/2011] [Accepted: 11/03/2011] [Indexed: 12/16/2022] Open
Abstract
Background: Small nucleolar RNAs (snoRNAs) are a large group of non-coding RNAs (ncRNAs) that mainly guide 2'-O-methylation (C/D RNAs) and pseudouridylation (H/ACA RNAs) of ribosomal RNAs. The pattern of rRNA modifications and the set of snoRNAs that guide these modifications are conserved in vertebrates. Nearly all snoRNA genes in vertebrates are localized in introns of other genes and are processed from pre-mRNAs. Thus, the same promoter is used for the transcription of snoRNAs and host genes.
Results: The series of studies by Dahai Zhu and coworkers on snoRNAs and their genes were critically considered. We present evidence that dozens of species-specific snoRNAs that they described in vertebrates are experimental artifacts resulting from the improper use of Northern hybridization. The snoRNA genes with putative intrinsic promoters that were supposed to be transcribed independently proved to contain numerous substitutions and are, most likely, pseudogenes. In some cases, they are localized within introns of overlooked host genes. Finally, an increased number of snoRNA genes in mammalian genomes described by Zhu and coworkers is also an artifact resulting from two mistakes. First, numerous mammalian snoRNA pseudogenes were considered as genes, whereas most of them are localized outside of host genes and contain substitutions that question their functionality. Second, Zhu and coworkers failed to identify many snoRNA genes in non-mammalian species. As an illustration, we present 1352 C/D snoRNA genes that we have identified and annotated in vertebrates.
Conclusions: Our results demonstrate that conclusions based only on databases with automatically annotated ncRNAs can be erroneous. Special investigations aimed to distinguish true RNA genes from their pseudogenes should be done.
Zhu and coworkers, as well as most other groups studying vertebrate snoRNAs, give new names to newly described homologs of human snoRNAs, which significantly complicates comparison between different species. It seems necessary to develop a uniform nomenclature for homologs of human snoRNAs in other vertebrates, e.g., human gene names prefixed with several-letter code denoting the vertebrate species.
|