1
Ballesio F, Pepe G, Ausiello G, Novelletto A, Helmer-Citterich M, Gherardini PF. Human lncRNAs harbor conserved modules embedded in different sequence contexts. Noncoding RNA Res 2024; 9:1257-1270. [PMID: 39040814 PMCID: PMC11261117 DOI: 10.1016/j.ncrna.2024.06.013] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 01/02/2024] [Revised: 06/11/2024] [Accepted: 06/19/2024] [Indexed: 07/24/2024] Open
Abstract
We analyzed the structure of human long non-coding RNA (lncRNA) genes to investigate whether the non-coding transcriptome is organized in modular domains, as is the case for protein-coding genes. To this aim, we compared all known human lncRNA exons and identified 340 pairs of exons with high sequence and/or secondary structure similarity but embedded in a dissimilar sequence context. We grouped these pairs into 106 clusters based on their reciprocal similarities. These shared modules are highly conserved between humans and the four great ape species, display evidence of purifying selection and likely arose as a result of recent segmental duplications. Our analysis contributes to the understanding of the mechanisms driving the evolution of the non-coding genome and suggests additional strategies towards deciphering the functional complexity of this class of molecules.
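As a rough illustration of grouping similar exon pairs into clusters, the sketch below merges pairs that share an exon into connected components using a union-find structure. The abstract does not state which clustering procedure was used, so the connected-components rule, the function name `cluster_pairs`, and the exon labels are all assumptions made for this example.

```python
def cluster_pairs(pairs):
    """Group items into clusters, treating each pair as an undirected link.

    Assumption for illustration: two exons belong to the same cluster
    whenever a chain of similar pairs connects them.
    """
    parent = {}

    def find(x):
        # Register unseen items as their own root, then walk up to the root.
        parent.setdefault(x, x)
        while parent[x] != x:
            parent[x] = parent[parent[x]]  # path halving keeps trees shallow
            x = parent[x]
        return x

    for a, b in pairs:
        root_a, root_b = find(a), find(b)
        if root_a != root_b:
            parent[root_b] = root_a  # merge the two components

    clusters = {}
    for item in list(parent):
        clusters.setdefault(find(item), set()).add(item)
    return sorted(sorted(members) for members in clusters.values())


# Hypothetical exon identifiers; not data from the paper.
similar_pairs = [("exonA", "exonB"), ("exonB", "exonC"), ("exonD", "exonE")]
print(cluster_pairs(similar_pairs))
```

With this rule, a set of pairs can collapse into far fewer clusters than pairs, because every exon reachable through a chain of similar pairs lands in the same group.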
Affiliation(s)
- Francesco Ballesio
  - PhD Program in Cellular and Molecular Biology, Department of Biology, University of Rome “Tor Vergata”, Rome, Italy
- Gerardo Pepe
  - Department of Biology, University of Rome “Tor Vergata”, Rome, Italy
- Gabriele Ausiello
  - Department of Biology, University of Rome “Tor Vergata”, Rome, Italy
- Andrea Novelletto
  - Department of Biology, University of Rome “Tor Vergata”, Rome, Italy
2
Tak H, Anirudh J, Chattopadhyay A, Naick BH. Argonaute protein assisted drug discovery for miRNA-181c-5p and target gene ATM translation repression: a computational approach. Mol Divers 2024:10.1007/s11030-024-10855-3. [PMID: 39026118 DOI: 10.1007/s11030-024-10855-3] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 12/26/2023] [Accepted: 03/21/2024] [Indexed: 07/20/2024]
Abstract
The miRNA binds to AGO's seed region, prompting the exploration of small molecules that can offset miRNA repression of target mRNA. miRNA-181c-5p was found to be upregulated in chronic traumatic encephalopathy, a prevalent neurodegenerative disease in contact sports and military personnel. The research aimed to identify compounds that disrupt the AGO-assisted loop formation between miRNA-181c-5p and ATM, consequently repressing the translation of ATM. Target genes from commonly three databases (DIANA-microT-CDS, miRDB, RNA22 and TargetScan) were subjected to functional annotation and clustering analysis using the DAVID bioinformatics tool. The HADDOCK server was employed to build the miRNA-181c-5p:ATM-AGO complex. A total of 2594 small molecules were screened using Glide XP based on their highest binding affinity towards the complex, through a three-phase docking approach. The top 5 compounds (DB00674-Galantamine, DB00371-Meprobamate, DB00694-Daunorubicin, DB00837-Progabide, and DB00851-Dacarbazine) were further analyzed for stability in the miRNA-181c-5p:ATM-AGO-ligand complex interaction using GROMACS (version 2023.2). Hence, these findings suggest that these molecules hold potential for facilitating AGO-assisted repression of ATM gene translation.
Affiliation(s)
- Harshita Tak
  - Department of Sports Biosciences, School of Sports Science, Central University of Rajasthan, Ajmer, India
- Jivanage Anirudh
  - Department of Sports Biosciences, School of Sports Science, Central University of Rajasthan, Ajmer, India
- Arpan Chattopadhyay
  - Department of Sports Biosciences, School of Sports Science, Central University of Rajasthan, Ajmer, India
- B Hemanth Naick
  - Department of Sports Biosciences, School of Sports Science, Central University of Rajasthan, Ajmer, India
3
Ward S, Childs A, Staley C, Waugh C, Watts JA, Kotowska AM, Bhosale R, Borkar AN. Integrating cryo-OrbiSIMS with computational modelling and metadynamics simulations enhances RNA structure prediction at atomic resolution. Nat Commun 2024; 15:4367. [PMID: 38777820 PMCID: PMC11111741 DOI: 10.1038/s41467-024-48694-3] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 09/18/2023] [Accepted: 05/05/2024] [Indexed: 05/25/2024] Open
Abstract
The 3D architecture of RNAs governs their molecular interactions, chemical reactions, and biological functions. However, a large number of RNAs and their protein complexes remain poorly understood due to the limitations of conventional structural biology techniques in deciphering their complex structures and dynamic interactions. To address this limitation, we have benchmarked an integrated approach that combines cryogenic OrbiSIMS, a state-of-the-art solid-state mass spectrometry technique, with computational methods for modelling RNA structures at atomic resolution with enhanced precision. Furthermore, using 7SK RNP as a test case, we have successfully determined the full 3D structure of a native RNA in its apo, native and disease-remodelled states, which offers insights into the structural interactions and plasticity of the 7SK complex within these states. Overall, our study establishes cryo-OrbiSIMS as a valuable tool in the field of RNA structural biology as it enables the study of challenging, native RNA systems.
Affiliation(s)
- Shannon Ward
  - School of Veterinary Medicine and Science, University of Nottingham, Nottingham, LE12 5RD, UK
  - Wolfson Centre for Global Virus Research, University of Nottingham, Nottingham, LE12 5RD, UK
- Alex Childs
  - School of Veterinary Medicine and Science, University of Nottingham, Nottingham, LE12 5RD, UK
  - Wolfson Centre for Global Virus Research, University of Nottingham, Nottingham, LE12 5RD, UK
- Ceri Staley
  - School of Veterinary Medicine and Science, University of Nottingham, Nottingham, LE12 5RD, UK
- Christopher Waugh
  - School of Veterinary Medicine and Science, University of Nottingham, Nottingham, LE12 5RD, UK
  - Wolfson Centre for Global Virus Research, University of Nottingham, Nottingham, LE12 5RD, UK
  - RHy-X Limited, London, WC2A 2JR, UK
- Julie A Watts
  - School of Pharmacy, University of Nottingham, Nottingham, NG7 2RD, UK
- Anna M Kotowska
  - School of Pharmacy, University of Nottingham, Nottingham, NG7 2RD, UK
- Rahul Bhosale
  - School of Biosciences, University of Nottingham, Nottingham, LE12 5RD, UK
- Aditi N Borkar
  - School of Veterinary Medicine and Science, University of Nottingham, Nottingham, LE12 5RD, UK
  - Wolfson Centre for Global Virus Research, University of Nottingham, Nottingham, LE12 5RD, UK
  - RHy-X Limited, London, WC2A 2JR, UK
4
Lasher B, Hendrix DA. bpRNA-align: improved RNA secondary structure global alignment for comparing and clustering RNA structures. RNA 2023; 29:584-595. [PMID: 36759128 PMCID: PMC10159002 DOI: 10.1261/rna.079211.122] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 04/20/2022] [Accepted: 01/14/2023] [Indexed: 05/06/2023]
Abstract
Ribonucleic acid (RNA) is a polymeric molecule that is fundamental to biological processes, with structure being more highly conserved than primary sequence and often key to its function. Advances in RNA structure characterization have resulted in an increase in the number of accurate secondary structures. The task of uncovering common RNA structural motifs with a collective function through structural comparison, providing a level of similarity, remains challenging and could be used to improve RNA secondary structure databases and discover new RNA families. In this work, we present a novel secondary structure alignment method, bpRNA-align. bpRNA-align is a customized global structural alignment method, utilizing an inverted (gap extend costs more than gap open) and context-specific affine gap penalty along with a structural, feature-specific substitution matrix to provide similarity scores. We evaluate our similarity scores in comparison to other methods, using affinity propagation clustering, applied to a benchmarking data set of known structure types. bpRNA-align shows improvement in clustering performance over a broad range of structure types.
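To make the idea of an "inverted" affine gap penalty concrete, here is a minimal sketch comparing a conventional scheme (opening a gap is the costly step) with an inverted one (extending a gap is the costly step). The cost convention, the function name `affine_gap_cost`, and all numeric values are assumptions chosen for illustration; they are not the parameters of bpRNA-align, and the context-specific part of its penalty is not modelled.

```python
def affine_gap_cost(length, gap_open, gap_extend):
    """Total penalty for one contiguous gap spanning `length` positions.

    Assumed convention: the first gapped position costs `gap_open`,
    and every further position costs `gap_extend`.
    """
    if length < 0:
        raise ValueError("gap length must be non-negative")
    if length == 0:
        return 0.0
    return gap_open + (length - 1) * gap_extend


# Illustrative values only.
conventional = {"gap_open": 5.0, "gap_extend": 1.0}  # opening costs more
inverted = {"gap_open": 1.0, "gap_extend": 5.0}      # extending costs more

for gap_length in (1, 2, 4):
    print(
        gap_length,
        affine_gap_cost(gap_length, **conventional),
        affine_gap_cost(gap_length, **inverted),
    )
```

Under the inverted values, one gap of four positions costs more than four separate single-position gaps, whereas the conventional values favour the single long gap.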
Affiliation(s)
- Brittany Lasher
  - Department of Biochemistry and Biophysics, Oregon State University, Corvallis, Oregon 97331, USA
- David A Hendrix
  - Department of Biochemistry and Biophysics, Oregon State University, Corvallis, Oregon 97331, USA
  - School of Electrical Engineering and Computer Science, Oregon State University, Corvallis, Oregon 97331, USA
5
Wang H, Lu X, Zheng H, Wang W, Zhang G, Wang S, Lin P, Zhuang Y, Chen C, Chen Q, Qu J, Xu L. RNAsmc: A integrated tool for comparing RNA secondary structure and evaluating allosteric effects. Comput Struct Biotechnol J 2023; 21:965-973. [PMID: 36733704 PMCID: PMC9876829 DOI: 10.1016/j.csbj.2023.01.007] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 09/15/2022] [Revised: 01/06/2023] [Accepted: 01/07/2023] [Indexed: 01/11/2023] Open
Abstract
RNA structure plays a crucial role in gene regulation, in RNA stability and the essential biological processes. RNA secondary structure (RSS) motifs are the basic building blocks for investigating the biological mechanisms of structure. Here, we present a strategy for structural motif-based dynamic alignment, namely, RNA secondary-structural motif-comparing (RNAsmc), to identify structural motifs and quantitatively evaluate their underlying molecular functions. RNAsmc also has strong robustness to sequence length, folding protocol and RNA structural profile by chemical probing. Notably, it is also applicable to quantify structural variation in special RNA editing events (SNVs or SNPs, fragment insertion or deletion, etc.). The findings indicate that RNAsmc can uncover the heterogeneity of RNA secondary structure and score for similarities among components, which provides an impetus to cluster RNA families and evaluate allosteric effects. We find that RNAsmc exhibits remarkable detection efficiency for experimentally-derived RiboSNitches. Finally, the pipeline was assembled into an R software package to serve as an automated toolkit to explore, align, and cluster RSS. It is freely available for download at https://CRAN.R-project.org/package=RNAsmc.
Affiliation(s)
- Hong Wang
  - National Engineering Research Center of Ophthalmology and Optometry, Eye Hospital, Wenzhou Medical University, Wenzhou 325027, China
  - State Key Laboratory of Ophthalmology, Optometry and Visual Science, Eye Hospital, Wenzhou Medical University, Wenzhou 325027, China
  - Center of Optometry International Innovation of Wenzhou, Eye Valley, Wenzhou 325027, China
- Xiaoyan Lu
  - National Engineering Research Center of Ophthalmology and Optometry, Eye Hospital, Wenzhou Medical University, Wenzhou 325027, China
- Hewei Zheng
  - Wekemo Tech Group Co., Ltd. Shenzhen 518000, China
- Wencan Wang
  - National Engineering Research Center of Ophthalmology and Optometry, Eye Hospital, Wenzhou Medical University, Wenzhou 325027, China
  - State Key Laboratory of Ophthalmology, Optometry and Visual Science, Eye Hospital, Wenzhou Medical University, Wenzhou 325027, China
  - Wenzhou Realdata Medical Research Co., Ltd, Wenzhou 325027, China
- Guosi Zhang
  - National Engineering Research Center of Ophthalmology and Optometry, Eye Hospital, Wenzhou Medical University, Wenzhou 325027, China
  - State Key Laboratory of Ophthalmology, Optometry and Visual Science, Eye Hospital, Wenzhou Medical University, Wenzhou 325027, China
- Siyu Wang
  - National Engineering Research Center of Ophthalmology and Optometry, Eye Hospital, Wenzhou Medical University, Wenzhou 325027, China
  - State Key Laboratory of Ophthalmology, Optometry and Visual Science, Eye Hospital, Wenzhou Medical University, Wenzhou 325027, China
- Peng Lin
  - National Engineering Research Center of Ophthalmology and Optometry, Eye Hospital, Wenzhou Medical University, Wenzhou 325027, China
  - State Key Laboratory of Ophthalmology, Optometry and Visual Science, Eye Hospital, Wenzhou Medical University, Wenzhou 325027, China
- Youyuan Zhuang
  - National Engineering Research Center of Ophthalmology and Optometry, Eye Hospital, Wenzhou Medical University, Wenzhou 325027, China
  - State Key Laboratory of Ophthalmology, Optometry and Visual Science, Eye Hospital, Wenzhou Medical University, Wenzhou 325027, China
- Chong Chen
  - National Engineering Research Center of Ophthalmology and Optometry, Eye Hospital, Wenzhou Medical University, Wenzhou 325027, China
  - State Key Laboratory of Ophthalmology, Optometry and Visual Science, Eye Hospital, Wenzhou Medical University, Wenzhou 325027, China
- Qi Chen
  - National Engineering Research Center of Ophthalmology and Optometry, Eye Hospital, Wenzhou Medical University, Wenzhou 325027, China
  - State Key Laboratory of Ophthalmology, Optometry and Visual Science, Eye Hospital, Wenzhou Medical University, Wenzhou 325027, China
- Jia Qu
  - National Engineering Research Center of Ophthalmology and Optometry, Eye Hospital, Wenzhou Medical University, Wenzhou 325027, China
  - State Key Laboratory of Ophthalmology, Optometry and Visual Science, Eye Hospital, Wenzhou Medical University, Wenzhou 325027, China
  - Center of Optometry International Innovation of Wenzhou, Eye Valley, Wenzhou 325027, China
  - Corresponding authors at: National Engineering Research Center of Ophthalmology and Optometry, Eye Hospital, Wenzhou Medical University, Wenzhou 325027, China
- Liangde Xu
  - National Engineering Research Center of Ophthalmology and Optometry, Eye Hospital, Wenzhou Medical University, Wenzhou 325027, China
  - State Key Laboratory of Ophthalmology, Optometry and Visual Science, Eye Hospital, Wenzhou Medical University, Wenzhou 325027, China
  - Center of Optometry International Innovation of Wenzhou, Eye Valley, Wenzhou 325027, China
  - Corresponding authors at: National Engineering Research Center of Ophthalmology and Optometry, Eye Hospital, Wenzhou Medical University, Wenzhou 325027, China
6
Laverty KU, Jolma A, Pour SE, Zheng H, Ray D, Morris Q, Hughes TR. PRIESSTESS: interpretable, high-performing models of the sequence and structure preferences of RNA-binding proteins. Nucleic Acids Res 2022; 50:e111. [PMID: 36018788 DOI: 10.1093/nar/gkac694] [Citation(s) in RCA: 6] [Impact Index Per Article: 3.0] [Received: 11/17/2021] [Revised: 07/22/2022] [Accepted: 08/03/2022] [Indexed: 12/23/2022] Open
Abstract
Modelling both primary sequence and secondary structure preferences for RNA binding proteins (RBPs) remains an ongoing challenge. Current models use varied RNA structure representations and can be difficult to interpret and evaluate. To address these issues, we present a universal RNA motif-finding/scanning strategy, termed PRIESSTESS (Predictive RBP-RNA InterpretablE Sequence-Structure moTif regrESSion), that can be applied to diverse RNA binding datasets. PRIESSTESS identifies dozens of enriched RNA sequence and/or structure motifs that are subsequently reduced to a set of core motifs by logistic regression with LASSO regularization. Importantly, these core motifs are easily visualized and interpreted, and provide a measure of RBP secondary structure specificity. We used PRIESSTESS to interrogate new HTR-SELEX data for 23 RBPs with diverse RNA binding modes and captured known primary sequence and secondary structure preferences for each. Moreover, when applying PRIESSTESS to 144 RBPs across 202 RNA binding datasets, 75% showed an RNA secondary structure preference but only 10% had a preference besides unpaired bases, suggesting that most RBPs simply recognize the accessibility of primary sequences.
Affiliation(s)
- Kaitlin U Laverty
  - Department of Molecular Genetics, University of Toronto, Toronto, Canada
- Arttu Jolma
  - Department of Molecular Genetics, University of Toronto, Toronto, Canada
  - Donnelly Centre, University of Toronto, Toronto, Canada
- Sara E Pour
  - Department of Molecular Genetics, University of Toronto, Toronto, Canada
- Hong Zheng
  - Donnelly Centre, University of Toronto, Toronto, Canada
- Debashish Ray
  - Donnelly Centre, University of Toronto, Toronto, Canada
- Quaid Morris
  - Department of Molecular Genetics, University of Toronto, Toronto, Canada
  - Computational and Systems Biology, Memorial Sloan Kettering Cancer Center, New York, USA
- Timothy R Hughes
  - Department of Molecular Genetics, University of Toronto, Toronto, Canada
  - Donnelly Centre, University of Toronto, Toronto, Canada
7
Sun S, Yang J, Zhang Z. RNALigands: a database and web server for RNA-ligand interactions. RNA 2022; 28:115-122. [PMID: 34732566 PMCID: PMC8906548 DOI: 10.1261/rna.078889.121] [Citation(s) in RCA: 14] [Impact Index Per Article: 7.0] [Received: 07/05/2021] [Accepted: 10/25/2021] [Indexed: 06/13/2023]
Abstract
RNA molecules can fold into complex and stable 3D structures, allowing them to carry out important genetic, structural, and regulatory roles inside the cell. These complex structures often contain 3D pockets made up of secondary structural motifs that can be potentially targeted by small molecule ligands. Indeed, many RNA structures in PDB contain bound small molecules, and high-throughput experimental studies have generated a large number of interacting RNA and ligand pairs. There is considerable interest in developing small molecule lead compounds targeting viral RNAs or those RNAs implicated in neurological diseases or cancer. We hypothesize that RNAs that have similar secondary structural motifs may bind to similar small molecule ligands. Toward this goal, we established a database collecting RNA secondary structural motifs and bound small molecule ligands. We further developed a computational pipeline, which takes as input an RNA sequence, predicts its secondary structure, extracts structural motifs, and searches the database for similar secondary structure motifs and interacting small molecules. We demonstrated the utility of the server by querying the α-synuclein mRNA 5' UTR sequence and finding potential matches which were validated as correct. The server is publicly available at http://RNALigands.ccbr.utoronto.ca. The source code can also be downloaded at https://github.com/SaisaiSun/RNALigands.
Affiliation(s)
- Saisai Sun
  - School of Computer Science and Technology, Xidian University, Xi'an, 710071, Shaanxi, China
  - Terrence Donnelly Centre for Cellular and Biomolecular Research, University of Toronto, Toronto, Ontario M5S 3E1, Canada
  - Department of Molecular Genetics, University of Toronto, Toronto, Ontario M5S 3E1, Canada
- Jianyi Yang
  - School of Mathematical Sciences, Nankai University, Tianjin, 300071, China
- Zhaolei Zhang
  - Terrence Donnelly Centre for Cellular and Biomolecular Research, University of Toronto, Toronto, Ontario M5S 3E1, Canada
  - Department of Molecular Genetics, University of Toronto, Toronto, Ontario M5S 3E1, Canada
  - Department of Computer Science, University of Toronto, Toronto, Ontario M5S 3E1, Canada
8
Gandhi S, Witten A, De Majo F, Gilbers M, Maessen J, Schotten U, de Windt LJ, Stoll M. Evolutionarily conserved transcriptional landscape of the heart defining the chamber specific physiology. Genomics 2021; 113:3782-3792. [PMID: 34506887 DOI: 10.1016/j.ygeno.2021.09.002] [Citation(s) in RCA: 1] [Impact Index Per Article: 0.3] [Received: 04/28/2021] [Revised: 08/17/2021] [Accepted: 09/05/2021] [Indexed: 12/14/2022]
Abstract
Cardiovascular disease (CVD) remains the leading cause of death worldwide. A deeper characterization of regional transcription patterns within different heart chambers may aid to improve our understanding of the molecular mechanisms involved in myocardial function and further, our ability to develop novel therapeutic strategies. Here, we used RNA sequencing to determine differentially expressed protein coding (PC) and long non-coding (lncRNA) transcripts within the heart chambers across seven vertebrate species and identified evolutionarily conserved chamber specific genes, lncRNAs and pathways. We investigated lncRNA homologs based on sequence, secondary structure, synteny and expressional conservation and found most lncRNAs to be conserved by synteny. Regional co-expression patterns of transcripts are modulated by multiple factors, including genomic overlap, strandedness and transcript biotype. Finally, we provide a community resource designated EvoACTG, which informs researchers on the conserved yet intertwined nature of the coding and non-coding cardiac transcriptome across popular model organisms in CVD research.
Affiliation(s)
- Shrey Gandhi
  - Institute of Human Genetics, Division of Genetic Epidemiology, University of Muenster, Muenster, Germany
- Anika Witten
  - Institute of Human Genetics, Division of Genetic Epidemiology, University of Muenster, Muenster, Germany
- Federica De Majo
  - Department of Molecular Genetics, Maastricht University, Maastricht, the Netherlands
- Martijn Gilbers
  - Department of Cardiothoracic Surgery, CARIM School for Cardiovascular Diseases, Maastricht University Medical Centre+, Maastricht, the Netherlands
- Jos Maessen
  - Department of Cardiothoracic Surgery, CARIM School for Cardiovascular Diseases, Maastricht University Medical Centre+, Maastricht, the Netherlands
- Ulrich Schotten
  - Department of Physiology, CARIM School for Cardiovascular Diseases, Maastricht University, Maastricht, the Netherlands
- Leon J de Windt
  - Department of Molecular Genetics, Maastricht University, Maastricht, the Netherlands
- Monika Stoll
  - Institute of Human Genetics, Division of Genetic Epidemiology, University of Muenster, Muenster, Germany
  - Department of Biochemistry, Genetic Epidemiology and Statistical Genetics, CARIM School for Cardiovascular Diseases, Maastricht University, Maastricht, the Netherlands
9
Guarracino A, Pepe G, Ballesio F, Adinolfi M, Pietrosanto M, Sangiovanni E, Vitale I, Ausiello G, Helmer-Citterich M. BRIO: a web server for RNA sequence and structure motif scan. Nucleic Acids Res 2021; 49:W67-W71. [PMID: 34038531 PMCID: PMC8262756 DOI: 10.1093/nar/gkab400] [Citation(s) in RCA: 3] [Impact Index Per Article: 1.0] [Received: 02/21/2021] [Revised: 04/27/2021] [Accepted: 05/22/2021] [Indexed: 12/30/2022] Open
Abstract
The interaction between RNA and RNA-binding proteins (RBPs) has a key role in the regulation of gene expression, in RNA stability, and in many other biological processes. RBPs accomplish these functions by binding target RNA molecules through specific sequence and structure motifs. The identification of these binding motifs is therefore fundamental to improve our knowledge of the cellular processes and how they are regulated. Here, we present BRIO (BEAM RNA Interaction mOtifs), a new web server designed for the identification of sequence and structure RNA-binding motifs in one or more RNA molecules of interest. BRIO enables the user to scan over 2508 sequence motifs and 2296 secondary structure motifs identified in Homo sapiens and Mus musculus, in three different types of experiments (PAR-CLIP, eCLIP, HITS). The motifs are associated with the binding of 186 RBPs and 69 protein domains. The web server is freely available at http://brio.bio.uniroma2.it.
Affiliation(s)
- Andrea Guarracino
  - Department of Biology, University of Rome "Tor Vergata", Rome, Italy
- Gerardo Pepe
  - Department of Biology, University of Rome "Tor Vergata", Rome, Italy
- Marta Adinolfi
  - Department of Biology, University of Rome "Tor Vergata", Rome, Italy
- Marco Pietrosanto
  - Department of Biology, University of Rome "Tor Vergata", Rome, Italy
- Elisa Sangiovanni
  - Department of Biology, University of Rome "Tor Vergata", Rome, Italy
- Ilio Vitale
  - IIGM - Italian Institute for Genomic Medicine, c/o IRCSS Candiolo, Italy
  - Candiolo Cancer Institute, FPO - IRCCS, Candiolo, Italy
- Gabriele Ausiello
  - Department of Biology, University of Rome "Tor Vergata", Rome, Italy
10
Pietrosanto M, Ausiello G, Helmer-Citterich M. Motif Discovery from CLIP Experiments. Methods Mol Biol 2021; 2284:43-50. [PMID: 33835436 DOI: 10.1007/978-1-0716-1307-8_3] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 10/21/2022]
Abstract
RNA primary and secondary motif discovery is an important step in the annotation and characterization of unknown interaction dynamics between RNAs and RNA-Binding Proteins, and several methods have been developed to meet the need for fast and efficient discovery of interaction motifs. Recent advances have increased the amount of data produced by experimental assays, and no available method is suitable for the analysis of all types of results. Here we present a simple workflow to help choose the most appropriate method, depending on the starting situation, among the three algorithms that best cover the landscape of approaches. A detailed analysis is presented to highlight the need for different algorithms in different working settings. In conclusion, the proposed workflow depends on the nature of the starting data and on the availability of RNA annotations.
Affiliation(s)
- Marco Pietrosanto
  - Centre for Molecular Bioinformatics, Department of Biology, University of Rome Tor Vergata, Rome, Italy
- Gabriele Ausiello
  - Centre for Molecular Bioinformatics, Department of Biology, University of Rome Tor Vergata, Rome, Italy
- Manuela Helmer-Citterich
  - Centre for Molecular Bioinformatics, Department of Biology, University of Rome Tor Vergata, Rome, Italy
11
Comparative genomics in the search for conserved long noncoding RNAs. Essays Biochem 2021; 65:741-749. [PMID: 33885137 PMCID: PMC8564735 DOI: 10.1042/ebc20200069] [Citation(s) in RCA: 6] [Impact Index Per Article: 2.0] [Received: 12/18/2020] [Revised: 02/15/2021] [Accepted: 03/15/2021] [Indexed: 12/23/2022]
Abstract
Long noncoding RNAs (lncRNAs) have emerged as prominent regulators of gene expression in eukaryotes. The identification of lncRNA orthologs is essential in efforts to decipher their roles across model organisms, as homologous genes tend to have similar molecular and biological functions. The relatively high sequence plasticity of lncRNA genes compared with protein-coding genes, makes the identification of their orthologs a challenging task. This is why comparative genomics of lncRNAs requires the development of specific and, sometimes, complex approaches. Here, we briefly review current advancements and challenges associated with four levels of lncRNA conservation: genomic sequences, splicing signals, secondary structures and syntenic transcription.
12
Pietrosanto M, Adinolfi M, Guarracino A, Ferrè F, Ausiello G, Vitale I, Helmer-Citterich M. Relative Information Gain: Shannon entropy-based measure of the relative structural conservation in RNA alignments. NAR Genom Bioinform 2021; 3:lqab007. [PMID: 33615214 PMCID: PMC7884220 DOI: 10.1093/nargab/lqab007] [Citation(s) in RCA: 2] [Impact Index Per Article: 0.7] [Received: 04/21/2020] [Revised: 12/18/2020] [Accepted: 01/26/2021] [Indexed: 12/21/2022] Open
Abstract
Structural characterization of RNAs is a dynamic field, offering many modelling possibilities. RNA secondary structure models are usually characterized by an encoding that depicts structural information of the molecule through string representations or graphs. In this work, we provide a generalization of the BEAR encoding (a context-aware structural encoding we previously developed) by expanding the set of alignments used for the construction of substitution matrices and then applying it to secondary structure encodings ranging from fine-grained to more coarse-grained representations. We also introduce a re-interpretation of the Shannon Information applied on RNA alignments, proposing a new scoring metric, the Relative Information Gain (RIG). The RIG score is available for any position in an alignment, showing how different levels of detail encoded in the RNA representation can contribute differently to convey structural information. The approaches presented in this study can be used alongside state-of-the-art tools to synergistically gain insights into the structural elements that RNAs and RNA families are composed of. This additional information could potentially contribute to their improvement or increase the degree of confidence in the secondary structure of families and any set of aligned RNAs.
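For readers unfamiliar with Shannon information on alignment columns, the sketch below computes the standard per-column entropy and the corresponding information content (maximum entropy minus observed entropy). This is the textbook quantity only; the abstract does not give the RIG formula, so nothing here should be read as the RIG score itself. The toy alignment, the three-symbol alphabet, and the function names are assumptions for illustration.

```python
import math
from collections import Counter


def column_entropy(column):
    """Shannon entropy, in bits, of the symbols in one alignment column."""
    total = len(column)
    if total == 0:
        raise ValueError("column must not be empty")
    counts = Counter(column)
    return sum(-(n / total) * math.log2(n / total) for n in counts.values())


def column_information(column, alphabet_size):
    """Information content: maximum possible entropy minus observed entropy."""
    return math.log2(alphabet_size) - column_entropy(column)


# Toy alignment of structure strings over a hypothetical 3-symbol alphabet.
alignment = ["((..))", "((..))", "(.(.))", "((..))"]
columns = ["".join(symbols) for symbols in zip(*alignment)]
for index, column in enumerate(columns):
    print(index, column, round(column_information(column, alphabet_size=3), 3))
```

A fully conserved column has zero entropy and therefore the maximum information for the chosen alphabet, while a column with mixed symbols scores lower. Changing the alphabet (for example, a finer or coarser structural encoding) changes both the maximum and the observed entropy.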
Affiliation(s)
- Marco Pietrosanto
  - Centre for Molecular Bioinformatics, Department of Biology, University of Rome Tor Vergata, Via della Ricerca Scientifica snc, 00133 Rome, Italy
- Marta Adinolfi
  - Centre for Molecular Bioinformatics, Department of Biology, University of Rome Tor Vergata, Via della Ricerca Scientifica snc, 00133 Rome, Italy
- Andrea Guarracino
  - Centre for Molecular Bioinformatics, Department of Biology, University of Rome Tor Vergata, Via della Ricerca Scientifica snc, 00133 Rome, Italy
- Fabrizio Ferrè
  - Department of Pharmacy and Biotechnology (FaBiT), University of Bologna Alma Mater, Via Belmeloro 6, 40126 Bologna, Italy
- Gabriele Ausiello
  - Centre for Molecular Bioinformatics, Department of Biology, University of Rome Tor Vergata, Via della Ricerca Scientifica snc, 00133 Rome, Italy
- Ilio Vitale
  - IIGM - Italian Institute for Genomic Medicine, c/o IRCSS Candiolo, 10060 Torino, Italy
- Manuela Helmer-Citterich
  - Centre for Molecular Bioinformatics, Department of Biology, University of Rome Tor Vergata, Via della Ricerca Scientifica snc, 00133 Rome, Italy
13
Xia CQ, Pan X, Yang Y, Huang Y, Shen HB. Recent Progresses of Computational Analysis of RNA-Protein Interactions. Systems Medicine 2021. [DOI: 10.1016/b978-0-12-801238-3.11315-7] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 10/26/2022] Open
14
Li J, Zhang X, Liu C. The computational approaches of lncRNA identification based on coding potential: Status quo and challenges. Comput Struct Biotechnol J 2020; 18:3666-3677. [PMID: 33304463 PMCID: PMC7710504 DOI: 10.1016/j.csbj.2020.11.030] [Citation(s) in RCA: 20] [Impact Index Per Article: 5.0] [Received: 04/30/2020] [Revised: 11/15/2020] [Accepted: 11/16/2020] [Indexed: 12/13/2022] Open
Abstract
Long noncoding RNAs (lncRNAs) make up a large proportion of the transcriptome in eukaryotes and have been shown to have many regulatory functions in various biological processes. When studying lncRNAs, the first step is to accurately and specifically distinguish them within transcriptome data of complicated composition, which contain mRNAs, lncRNAs, small RNAs and their primary transcripts. In the face of such huge and progressively expanding transcriptome data, in-silico approaches provide a practicable scheme for effectively and rapidly filtering out lncRNA targets, using machine learning and probability statistics. In this review, we mainly discuss the characteristics of the algorithms and features used in currently developed approaches. We also outline the traits of some state-of-the-art tools for ease of operation. Finally, we point out the underlying challenges in lncRNA identification with the advent of new experimental data.
Affiliation(s)
- Jing Li
- CAS Key Laboratory of Tropical Plant Resources and Sustainable Use, Xishuangbanna Tropical Botanical Garden, Chinese Academy of Sciences, Menglun, Mengla, Yunnan 666303, China
- Center of Economic Botany, Core Botanical Gardens, Chinese Academy of Sciences, Menglun, Mengla, Yunnan 666303, China
| | - Xuan Zhang
- CAS Key Laboratory of Tropical Plant Resources and Sustainable Use, Xishuangbanna Tropical Botanical Garden, Chinese Academy of Sciences, Menglun, Mengla, Yunnan 666303, China
| | - Changning Liu
- CAS Key Laboratory of Tropical Plant Resources and Sustainable Use, Xishuangbanna Tropical Botanical Garden, Chinese Academy of Sciences, Menglun, Mengla, Yunnan 666303, China
- Center of Economic Botany, Core Botanical Gardens, Chinese Academy of Sciences, Menglun, Mengla, Yunnan 666303, China
- The Innovative Academy of Seed Design, Chinese Academy of Sciences, Menglun, Mengla, Yunnan 666303, China
| |
|
15
|
Schmidt M, Hamacher K, Reinhardt F, Lotz TS, Groher F, Suess B, Jager S. SICOR: Subgraph Isomorphism Comparison of RNA Secondary Structures. IEEE/ACM Transactions on Computational Biology and Bioinformatics 2020; 17:2189-2195. [PMID: 31295116 DOI: 10.1109/tcbb.2019.2926711] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 06/09/2023]
Abstract
RNA aptamer selection during SELEX experiments builds on secondary structural diversity. Advanced structural comparison methods can focus this diversity. We develop SICOR, which uses probabilistic subgraph isomorphisms for graph distances between RNA secondary structure graphs. SICOR outperforms other comparison methods and is applicable to many structural comparisons in experimental design.
|
16
|
LncLocation: Efficient Subcellular Location Prediction of Long Non-Coding RNA-Based Multi-Source Heterogeneous Feature Fusion. Int J Mol Sci 2020; 21:ijms21197271. [PMID: 33019721 PMCID: PMC7582431 DOI: 10.3390/ijms21197271] [Citation(s) in RCA: 8] [Impact Index Per Article: 2.0] [Received: 08/23/2020] [Revised: 09/27/2020] [Accepted: 09/28/2020] [Indexed: 12/13/2022] Open
Abstract
Recent studies show that the subcellular location of long non-coding RNAs (lncRNAs) can provide significant information on their function. Due to the lack of experimental data, the number of lncRNAs with experimentally verified subcellular localization is very limited, and the numbers of lncRNAs located in different organelles are wildly imbalanced. The prediction of the subcellular location of lncRNAs is therefore a multi-class, small-sample, imbalanced classification problem. The data imbalance leads to poor recognition by machine learning models on small data subsets, which is a puzzling and challenging problem in existing research. In this study, we integrate multi-source features to construct a sequence-based computational tool, lncLocation, to predict the subcellular location of lncRNAs. An autoencoder is used to enhance part of the features, and a binomial distribution-based filtering method and recursive feature elimination (RFE) are used to filter some of the features. This improves the representation ability of the data and reduces the problem of unbalanced multi-class data. Through comprehensive experiments on different feature combinations and machine learning models, we select the optimal features and classifier model scheme to construct the subcellular location prediction tool, lncLocation. LncLocation achieves 87.78% accuracy using 5-fold cross validation on the benchmark data, which is higher than the state-of-the-art tools, and the classification performance, especially for small class sets, is improved significantly.
|
17
|
Han S, Liang Y, Ma Q, Xu Y, Zhang Y, Du W, Wang C, Li Y. LncFinder: an integrated platform for long non-coding RNA identification utilizing sequence intrinsic composition, structural information and physicochemical property. Brief Bioinform 2020; 20:2009-2027. [PMID: 30084867 PMCID: PMC6954391 DOI: 10.1093/bib/bby065] [Citation(s) in RCA: 82] [Impact Index Per Article: 20.5] [Received: 05/19/2018] [Revised: 06/20/2018] [Indexed: 12/31/2022] Open
Abstract
Discovering new long non-coding RNAs (lncRNAs) has been a fundamental step in lncRNA-related research. Nowadays, many machine learning-based tools have been developed for lncRNA identification. However, many methods predict lncRNAs using sequence-derived features alone, which tend to display unstable performance on different species. Moreover, the majority of tools cannot be re-trained or tailored by users, nor can the features be customized or integrated to meet researchers’ requirements. In this study, features extracted from sequence-intrinsic composition, secondary structure and physicochemical property are comprehensively reviewed and evaluated. An integrated platform named LncFinder is also developed to enhance the performance and promote the research of lncRNA identification. LncFinder includes a novel lncRNA predictor using the heterologous features we designed. Experimental results show that our method outperforms several state-of-the-art tools on multiple species with more robust and satisfactory results. Researchers can additionally employ LncFinder to extract various classic features, build classifiers with numerous machine learning algorithms and evaluate classifier performance effectively and efficiently. LncFinder can reveal the properties of lncRNA and mRNA from various perspectives and further inspire lncRNA–protein interaction prediction and lncRNA evolution analysis. It is anticipated that LncFinder can significantly facilitate lncRNA-related research, especially for poorly explored species. LncFinder is released as an R package (https://CRAN.R-project.org/package=LncFinder). A web server (http://bmbl.sdstate.edu/lncfinder/) is also developed to maximize its availability.
Affiliation(s)
- Siyu Han
- College of Computer Science and Technology, Key Laboratory of Symbol Computation and Knowledge Engineering of Ministry of Education, Jilin University, Changchun, China
| | - Yanchun Liang
- College of Computer Science and Technology, Key Laboratory of Symbol Computation and Knowledge Engineering of Ministry of Education, Jilin University, Changchun, China; Zhuhai Laboratory of Key Laboratory of Symbol Computation and Knowledge Engineering of Ministry of Education, Zhuhai College of Jilin University, Zhuhai, China
| | - Qin Ma
- Bioinformatics and Mathematical Biosciences Lab, Department of Agronomy, Horticulture and Plant Science, South Dakota State University, Brookings, SD, USA; Department of Mathematics and Statistics, South Dakota State University, Brookings, SD, USA
| | - Yangyi Xu
- College of Computer Science and Technology, Key Laboratory of Symbol Computation and Knowledge Engineering of Ministry of Education, Jilin University, Changchun, China
| | - Yu Zhang
- College of Computer Science and Technology, Key Laboratory of Symbol Computation and Knowledge Engineering of Ministry of Education, Jilin University, Changchun, China
| | - Wei Du
- College of Computer Science and Technology, Key Laboratory of Symbol Computation and Knowledge Engineering of Ministry of Education, Jilin University, Changchun, China
| | - Cankun Wang
- Department of Mathematics and Statistics, South Dakota State University, Brookings, SD, USA
| | - Ying Li
- College of Computer Science and Technology, Key Laboratory of Symbol Computation and Knowledge Engineering of Ministry of Education, Jilin University, Changchun, China
| |
|
18
|
Mishra B, Balaji A, Beesetti H, Swaminathan S, Aduri R. The RNA secondary structural variation in the cyclization elements of the dengue genome and the possible implications in pathogenicity. Virusdisease 2020; 31:299-307. [PMID: 32904896 PMCID: PMC7458965 DOI: 10.1007/s13337-020-00615-w] [Citation(s) in RCA: 4] [Impact Index Per Article: 1.0] [Received: 04/15/2020] [Accepted: 07/20/2020] [Indexed: 10/23/2022] Open
Abstract
Dengue virus (DENV), the causative agent of dengue fever and severe dengue, exists as four antigenically different serotypes. These serotypes are further classified into genotypes and have varying degrees of pathogenicity. The 5' and 3' ends of the genomic RNA play a critical role in the viral life cycle. A global scale study of the RNA structural variation among the sero- and genotypes was carried out to correlate RNA structure with pathogenicity. We found that the GC rich stem and rigid loop structure of the 5' end of the genomic RNA of DENV 2 differs significantly from the others. The observed variation in base composition and base pairing may confer structural and functional advantage in highly virulent strains. This variation in the structure may influence the ease of cyclization and recruitment of viral RNA polymerase, NS5 RdRp, thereby affecting the pathogenicity of these strains.
Affiliation(s)
- Bibhudutta Mishra
- Department of Biological Sciences, Birla Institute of Technology and Science, Pilani, K K Birla Goa Campus, Zuarinagar, South Goa, Goa 403 726 India
| | - Advait Balaji
- Department of Biological Sciences, Birla Institute of Technology and Science, Pilani, K K Birla Goa Campus, Zuarinagar, South Goa, Goa 403 726 India
| | - Hemalatha Beesetti
- Department of Biological Sciences, Birla Institute of Technology and Science, Pilani, Hyderabad Campus, Jawahar Nagar, Shameerpet Mandal, Hyderabad, Telangana 500 078 India
- Present Address: Molecular Medicine Division, Recombinant Gene Products Group, International Centre for Genetic Engineering and Biotechnology, New Delhi, 110067 India
| | - Sathyamangalam Swaminathan
- Department of Biological Sciences, Birla Institute of Technology and Science, Pilani, Hyderabad Campus, Jawahar Nagar, Shameerpet Mandal, Hyderabad, Telangana 500 078 India
- Present Address: Molecular Medicine Division, Recombinant Gene Products Group, International Centre for Genetic Engineering and Biotechnology, New Delhi, 110067 India
| | - Raviprasad Aduri
- Department of Biological Sciences, Birla Institute of Technology and Science, Pilani, K K Birla Goa Campus, Zuarinagar, South Goa, Goa 403 726 India
| |
|
19
|
PRIME-3D2D is a 3D2D model to predict binding sites of protein-RNA interaction. Commun Biol 2020; 3:384. [PMID: 32678300 PMCID: PMC7366699 DOI: 10.1038/s42003-020-1114-y] [Citation(s) in RCA: 2] [Impact Index Per Article: 0.5] [Received: 12/19/2019] [Accepted: 06/29/2020] [Indexed: 11/08/2022] Open
Abstract
Protein–RNA interaction participates in many biological processes, so studying it can help us understand the function of proteins and RNAs. Although a protein–RNA 3D3D model such as PRIME is useful for building 3D structural complexes, it cannot be used genome-wide because RNA 3D structures are lacking. To take full advantage of RNA secondary structures revealed by high-throughput sequencing, we present PRIME-3D2D to predict binding sites of protein–RNA interaction. PRIME-3D2D is almost as good as PRIME at modeling protein–RNA complexes. PRIME-3D2D can be used to predict binding sites on PDB data (MCC = 0.75/0.70 for binding sites in protein/RNA) and transcriptome-wide (MCC = 0.285 for binding sites in RNA). Testing on PDB and yeast transcriptome-wide data shows that PRIME-3D2D performs better than other binding-site predictors. PRIME-3D2D can therefore be used to predict binding sites both on PDB data and genome-wide, and it is freely available. Xie et al. report a new computational method, PRIME-3D2D, to predict binding sites of protein–RNA interaction by considering protein 3D structure and RNA 2D structure. It is freely available, performs better than other binding-site predictors and is as good as PRIME at modeling protein–RNA complexes.
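The MCC values quoted in this abstract are Matthews correlation coefficients. As a minimal sketch of how such a score is computed from per-residue binding-site predictions, the function below takes two lists of 0/1 labels; the function name `mcc` and the toy labels are illustrative assumptions and are not taken from PRIME-3D2D.

```python
import math


def mcc(true_labels, predicted_labels):
    """Matthews correlation coefficient for binary (0/1) labels.

    Returns 0.0 when the denominator is zero (a common convention).
    """
    pairs = list(zip(true_labels, predicted_labels))
    tp = sum(1 for t, p in pairs if t == 1 and p == 1)  # true positives
    tn = sum(1 for t, p in pairs if t == 0 and p == 0)  # true negatives
    fp = sum(1 for t, p in pairs if t == 0 and p == 1)  # false positives
    fn = sum(1 for t, p in pairs if t == 1 and p == 0)  # false negatives

    denominator = math.sqrt((tp + fp) * (tp + fn) * (tn + fp) * (tn + fn))
    if denominator == 0:
        return 0.0
    return (tp * tn - fp * fn) / denominator


# Hypothetical per-residue labels: 1 = binding site, 0 = not a binding site.
observed = [1, 1, 0, 0, 1, 0]
predicted = [1, 0, 0, 0, 1, 1]
print(round(mcc(observed, predicted), 3))
```

MCC ranges from -1 (total disagreement) through 0 (no better than chance) to 1 (perfect prediction), which is why it is often preferred over accuracy when binding residues are much rarer than non-binding ones.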
|
20
|
Adinolfi M, Pietrosanto M, Parca L, Ausiello G, Ferrè F, Helmer-Citterich M. Discovering sequence and structure landscapes in RNA interaction motifs. Nucleic Acids Res 2019; 47:4958-4969. [PMID: 31162604 PMCID: PMC6547422 DOI: 10.1093/nar/gkz250] [Citation(s) in RCA: 13] [Impact Index Per Article: 2.6] [Received: 11/15/2018] [Revised: 02/22/2019] [Accepted: 04/09/2019] [Indexed: 12/16/2022] Open
Abstract
RNA molecules are able to bind proteins, DNA and other small or long RNAs using information at primary, secondary or tertiary structure level. Recent techniques that use cross-linking and immunoprecipitation of RNAs can detect these interactions and, if followed by high-throughput sequencing, molecules can be analysed to find recurrent elements shared by interactors, such as sequence and/or structure motifs. Many tools are able to find sequence motifs from lists of target RNAs, while others focus on structure using different approaches to find specific interaction elements. In this work, we make a systematic analysis of RBP-RNA and RNA-RNA datasets to better characterize the interaction landscape with information about multi-motifs on the same RNAs. To achieve this goal, we updated our BEAM algorithm to combine both sequence and structure information to create pairs of patterns that model motifs of interaction. This algorithm was applied to several RNA binding proteins and ncRNAs interactors, confirming already known motifs and discovering new ones. This landscape analysis on interaction variability reflects the diversity of target recognition and underlines that often both primary and secondary structure are involved in molecular recognition.
Affiliation(s)
- Marta Adinolfi
- Centre for Molecular Bioinformatics, Department of Biology, University of Rome Tor Vergata, Via della Ricerca Scientifica snc, 00133 Rome, Italy
| | - Marco Pietrosanto
- Centre for Molecular Bioinformatics, Department of Biology, University of Rome Tor Vergata, Via della Ricerca Scientifica snc, 00133 Rome, Italy
| | - Luca Parca
- Centre for Molecular Bioinformatics, Department of Biology, University of Rome Tor Vergata, Via della Ricerca Scientifica snc, 00133 Rome, Italy
| | - Gabriele Ausiello
- Centre for Molecular Bioinformatics, Department of Biology, University of Rome Tor Vergata, Via della Ricerca Scientifica snc, 00133 Rome, Italy
| | - Fabrizio Ferrè
- Department of Pharmacy and Biotechnology (FaBiT), University of Bologna Alma Mater, Via Selmi 3, 40126 Bologna, Italy
| | - Manuela Helmer-Citterich
- Centre for Molecular Bioinformatics, Department of Biology, University of Rome Tor Vergata, Via della Ricerca Scientifica snc, 00133 Rome, Italy
| |
|
21
|
Breyta R, Atkinson SD, Bartholomew JL. Evolutionary dynamics of Ceratonova species (Cnidaria: Myxozoa) reveal different host adaptation strategies. Infection, Genetics and Evolution 2019; 78:104081. [PMID: 31676446 DOI: 10.1016/j.meegid.2019.104081] [Citation(s) in RCA: 2] [Impact Index Per Article: 0.4] [Received: 06/25/2019] [Revised: 10/16/2019] [Accepted: 10/22/2019] [Indexed: 10/25/2022]
Abstract
The myxozoan parasite Ceratonova shasta is an important pathogen that infects multiple species of Pacific salmonids. Ongoing genetic surveillance has revealed stable host-parasite relationships throughout the parasite's endemic range. We applied Bayesian phylogenetics to test specific hypotheses about the evolution of these host-parasite relationships within the well-studied Klamath River watershed in Oregon and California, USA. The results provide statistical support that different genotypes of C. shasta are distinct lineages of one species, which is related to two other Ceratonova species in the same ecosystems; Ceratonova X in speckled dace and C. gasterostea in threespine stickleback. Furthermore, we found strong support for the hypothesis that C. shasta type 0 in native steelhead trout and type I in Chinook salmon each evolved with a specialist host adaptation strategy, while C. shasta type II in coho salmon resulted from a generalist host adaptation strategy. Inferred date and host species of the most recent common ancestor of extant Klamath basin types indicate that it occurred between 14,000 and 21,000 years ago, and most likely infected a native steelhead or rainbow trout host.
Affiliation(s)
- Rachel Breyta
- Department of Microbiology, Oregon State University, Corvallis, OR, USA; US Geological Survey, Western Fisheries Research Center, Seattle, WA, USA.
| | | | | |
|
22
|
Pietrosanto M, Adinolfi M, Casula R, Ausiello G, Ferrè F, Helmer-Citterich M. BEAM web server: a tool for structural RNA motif discovery. Bioinformatics 2019; 34:1058-1060. [PMID: 29095974 PMCID: PMC5860439 DOI: 10.1093/bioinformatics/btx704] [Citation(s) in RCA: 8] [Impact Index Per Article: 1.6] [Received: 06/22/2017] [Accepted: 10/30/2017] [Indexed: 11/26/2022] Open
Abstract
Motivation: RNA structural motif finding is a relevant problem that becomes computationally hard when working on high-throughput data (e.g. eCLIP, PAR-CLIP), often represented by thousands of RNA molecules. Currently, the BEAM server is the only web tool capable of handling tens of thousands of RNAs as input, with a motif discovery procedure that is limited only by current secondary structure prediction accuracies. Results: The recently developed method BEAM (BEAr Motifs finder) can analyze tens of thousands of RNA molecules and identify RNA secondary structure motifs associated with a measure of their statistical significance. BEAM is extremely fast thanks to the BEAR encoding, which transforms each RNA secondary structure into a string of characters. BEAM also exploits the evolutionary knowledge contained in a substitution matrix of secondary structure elements, extracted from the RFAM database of families of homologous RNAs. The BEAM web server has been designed to streamline data pre-processing by automatically handling folding and encoding of RNA sequences, giving users a choice of the preferred folding program. The server provides an intuitive and informative results page with the list of secondary structure motifs identified, the logo of each motif, its significance, graphic representation and information about its position in the RNA molecules sharing it. Availability and implementation: The web server is freely available at http://beam.uniroma2.it/ and is implemented in NodeJS and Python, with all major browsers supported. Supplementary information: Supplementary data are available at Bioinformatics online.
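The abstract notes that BEAM's speed comes from turning each secondary structure into a string of characters. The sketch below illustrates that general idea with a deliberately simplified, hypothetical four-letter alphabet (S = paired, H = hairpin loop, I = other enclosed unpaired base, E = external unpaired base). It is not the actual BEAR alphabet, which is richer, and the function names are assumptions made for illustration.

```python
def pair_table(dot_bracket):
    """Return partner[i] for each position of a dot-bracket string (None if unpaired)."""
    partner = [None] * len(dot_bracket)
    stack = []
    for i, symbol in enumerate(dot_bracket):
        if symbol == "(":
            stack.append(i)
        elif symbol == ")":
            if not stack:
                raise ValueError("unbalanced ')' at position %d" % i)
            j = stack.pop()
            partner[i], partner[j] = j, i
        elif symbol != ".":
            raise ValueError("unexpected symbol %r" % symbol)
    if stack:
        raise ValueError("unbalanced '(' in structure")
    return partner


def encode_structure(dot_bracket):
    """Encode a dot-bracket structure as a string over the toy alphabet S/H/I/E."""
    partner = pair_table(dot_bracket)
    open_pairs = []  # positions of '(' that currently enclose position i
    encoded = []
    for i, symbol in enumerate(dot_bracket):
        if symbol == "(":
            open_pairs.append(i)
            encoded.append("S")
        elif symbol == ")":
            open_pairs.pop()
            encoded.append("S")
        elif not open_pairs:
            encoded.append("E")  # unpaired and not enclosed by any base pair
        else:
            start = open_pairs[-1]
            enclosed = dot_bracket[start + 1:partner[start]]
            # A hairpin loop is enclosed by a pair with no further pairs inside.
            encoded.append("H" if "(" not in enclosed else "I")
    return "".join(encoded)


print(encode_structure("..((..((...))..)).."))
```

Once structures are plain strings, standard string techniques such as alignment with a substitution matrix or pattern search can be applied to them, which is the design choice the abstract credits for BEAM's scalability.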
Affiliation(s)
- Marco Pietrosanto
- Centre for Molecular Bioinformatics, Department of Biology, University of Rome Tor Vergata, 00133 Rome, Italy
| | - Marta Adinolfi
- Centre for Molecular Bioinformatics, Department of Biology, University of Rome Tor Vergata, 00133 Rome, Italy
| | - Riccardo Casula
- Centre for Molecular Bioinformatics, Department of Biology, University of Rome Tor Vergata, 00133 Rome, Italy
| | - Gabriele Ausiello
- Centre for Molecular Bioinformatics, Department of Biology, University of Rome Tor Vergata, 00133 Rome, Italy
| | - Fabrizio Ferrè
- Department of Pharmacy and Biotechnology, University of Bologna Alma Mater, 40126 Bologna, Italy
| | - Manuela Helmer-Citterich
- Centre for Molecular Bioinformatics, Department of Biology, University of Rome Tor Vergata, 00133 Rome, Italy
| |
|
23
|
Jung Y, El-Manzalawy Y, Dobbs D, Honavar VG. Partner-specific prediction of RNA-binding residues in proteins: A critical assessment. Proteins 2018; 87:198-211. [PMID: 30536635 PMCID: PMC6389706 DOI: 10.1002/prot.25639] [Citation(s) in RCA: 4] [Impact Index Per Article: 0.7] [Received: 05/01/2018] [Revised: 10/10/2018] [Accepted: 11/29/2018] [Indexed: 01/06/2023]
Abstract
RNA-protein interactions play essential roles in regulating gene expression. While some RNA-protein interactions are "specific", that is, the RNA-binding proteins preferentially bind to particular RNA sequence or structural motifs, others are "non-RNA specific." Deciphering the protein-RNA recognition code is essential for comprehending the functional implications of these interactions and for developing new therapies for many diseases. Because of the high cost of experimental determination of protein-RNA interfaces, there is a need for computational methods to identify RNA-binding residues in proteins. While most of the existing computational methods for predicting RNA-binding residues in RNA-binding proteins are oblivious to the characteristics of the partner RNA, there is growing interest in methods for partner-specific prediction of RNA binding sites in proteins. In this work, we assess the performance of two recently published partner-specific protein-RNA interface prediction tools, PS-PRIP, and PRIdictor, along with our own new tools. Specifically, we introduce a novel metric, RNA-specificity metric (RSM), for quantifying the RNA-specificity of the RNA binding residues predicted by such tools. Our results show that the RNA-binding residues predicted by previously published methods are oblivious to the characteristics of the putative RNA binding partner. Moreover, when evaluated using partner-agnostic metrics, RNA partner-specific methods are outperformed by the state-of-the-art partner-agnostic methods. We conjecture that either (a) the protein-RNA complexes in PDB are not representative of the protein-RNA interactions in nature, or (b) the current methods for partner-specific prediction of RNA-binding residues in proteins fail to account for the differences in RNA partner-specific versus partner-agnostic protein-RNA interactions, or both.
Affiliation(s)
- Yong Jung
- Bioinformatics and Genomics Graduate Program, Pennsylvania State University, University Park, Pennsylvania; Artificial Intelligence Research Laboratory, Pennsylvania State University, University Park, Pennsylvania; The Huck Institutes of the Life Sciences, Pennsylvania State University, University Park, Pennsylvania
| | - Yasser El-Manzalawy
- Artificial Intelligence Research Laboratory, Pennsylvania State University, University Park, Pennsylvania; Clinical and Translational Sciences Institute, Pennsylvania State University, University Park, Pennsylvania; College of Information Sciences and Technology, Pennsylvania State University, Pennsylvania
| | - Drena Dobbs
- Bioinformatics and Computational Biology Program, Iowa State University, Ames, Iowa; Department of Genetics, Development, and Cell Biology, Iowa State University, Ames, Iowa
| | - Vasant G Honavar
- Bioinformatics and Genomics Graduate Program, Pennsylvania State University, University Park, Pennsylvania; Artificial Intelligence Research Laboratory, Pennsylvania State University, University Park, Pennsylvania; Institute for Cyberscience, Pennsylvania State University, University Park, Pennsylvania; Clinical and Translational Sciences Institute, Pennsylvania State University, University Park, Pennsylvania; The Huck Institutes of the Life Sciences, Pennsylvania State University, University Park, Pennsylvania; College of Information Sciences and Technology, Pennsylvania State University, Pennsylvania
| |
|
24
|
Natsidis P, Kappas I, Karlowski WM. StarSeeker: an automated tool for mature duplex microRNA sequence identification based on secondary structure modeling of precursor molecule. J Biol Res (Thessalon) 2018; 25:11. [PMID: 29946534 PMCID: PMC6003123 DOI: 10.1186/s40709-018-0081-7] [Citation(s) in RCA: 3] [Impact Index Per Article: 0.5] [Received: 11/23/2017] [Accepted: 06/08/2018] [Indexed: 11/10/2022]
Abstract
Background: MicroRNAs (miRNAs) are small, non-coding RNA molecules that play a key role in gene regulation in both plants and animals. MicroRNA biogenesis involves the enzymatic processing of a primary RNA transcript. The final step is the production of a duplex molecule, often designated as miRNA:miRNA*, that will yield a functional miRNA by separation of the two strands. This miRNA will be incorporated into the RNA-induced silencing complex, which subsequently will bind to its target mRNA in order to suppress its expression. The analysis of miRNAs is still a developing area for computational biology with many open questions regarding the structure and function of this important class of molecules. Here, we present StarSeeker, a simple tool that outputs the putative miRNA* sequence given the precursor and the mature sequences. Results: We evaluated StarSeeker using a dataset consisting of all plant sequences available in miRBase (6992 precursor sequences and 8496 mature sequences). The program returned a total of 15,468 predicted miRNA* sequences. Of these, 2650 sequences were matched to annotated miRNAs (~90% of the miRBase-annotated sequences). The remaining predictions could not be verified, mainly because they do not comply with the rule requiring the two overhanging nucleotides in the duplex molecule. Conclusions: The expression pattern of some miRNAs in plants can be altered under various abiotic stress conditions. Potential miRNA* molecules that do not degrade can thus be detected and also discovered in high-throughput sequencing data, helping us to understand their role in gene regulation.
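The task described in this abstract, deriving the miRNA* strand from the precursor, the mature sequence and the rule of two overhanging nucleotides, can be sketched as follows. This is a minimal illustration under stated assumptions, not StarSeeker's implementation: the precursor structure is supplied as a dot-bracket string (in practice it would come from a folding program), coordinates are 0-based and inclusive, the handling of unpaired boundary bases assumes a symmetric mismatch, and the names `predict_star` and `opposite_position` are hypothetical.

```python
def pair_table(dot_bracket):
    """Return partner[i] for each position of a dot-bracket string (None if unpaired)."""
    partner = [None] * len(dot_bracket)
    stack = []
    for i, symbol in enumerate(dot_bracket):
        if symbol == "(":
            stack.append(i)
        elif symbol == ")":
            j = stack.pop()
            partner[i], partner[j] = j, i
    return partner


def opposite_position(partner, i, step):
    """Estimate the position facing i on the other arm of the hairpin.

    If i is unpaired, walk in direction `step` (+1 or -1) to the nearest
    paired base and correct for the distance walked (assumption: the
    unpaired stretch is a symmetric mismatch).
    """
    offset = 0
    while 0 <= i + offset < len(partner):
        if partner[i + offset] is not None:
            return partner[i + offset] + offset
        offset += step
    raise ValueError("no paired base found next to position %d" % i)


def predict_star(precursor, dot_bracket, mature_start, mature_end, overhang=2):
    """Return (star_start, star_end, star_sequence) for a mature miRNA.

    Each strand of the duplex is given a 3' overhang of `overhang` nucleotides:
    the star strand starts opposite the base `overhang` positions before the
    mature 3' end and ends `overhang` positions past the base opposite the
    mature 5' end. The same rule applies to either arm of the hairpin.
    """
    partner = pair_table(dot_bracket)
    star_start = opposite_position(partner, mature_end - overhang, -1)
    star_end = opposite_position(partner, mature_start, +1) + overhang
    star_start = max(star_start, 0)
    star_end = min(star_end, len(precursor) - 1)
    return star_start, star_end, precursor[star_start:star_end + 1]


# Toy hairpin (hypothetical, far shorter than a real precursor).
precursor = "GGACGUACGUACUUUUGUACGUACGUCC"
structure = ".." + "(" * 10 + "." * 4 + ")" * 10 + ".."
print(predict_star(precursor, structure, 2, 9))
```

In this toy hairpin the mature strand at positions 2 to 9 yields a star strand at positions 20 to 27, leaving two unpaired nucleotides at the 3' end of each strand.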
Affiliation(s)
- Paschalis Natsidis
- Department of Genetics, Development & Molecular Biology, School of Biology, Aristotle University of Thessaloniki, 54124 Thessaloniki, Greece; Present Address: School of Medicine, University of Crete, Voutes University Campus, 70013 Heraklion, Crete, Greece
| | - Ilias Kappas
- Department of Genetics, Development & Molecular Biology, School of Biology, Aristotle University of Thessaloniki, 54124 Thessaloniki, Greece
| | - Wojciech M Karlowski
- Department of Computational Biology, Institute of Molecular Biology and Biotechnology, Faculty of Biology, Adam Mickiewicz University, 61-614 Poznan, Poland
| |
|
25
|
Adjeroh D, Allaga M, Tan J, Lin J, Jiang Y, Abbasi A, Zhou X. Feature-Based and String-Based Models for Predicting RNA-Protein Interaction. Molecules 2018; 23:E697. [PMID: 29562711 PMCID: PMC6017419 DOI: 10.3390/molecules23030697] [Citation(s) in RCA: 12] [Impact Index Per Article: 2.0] [Received: 12/21/2017] [Revised: 02/17/2018] [Accepted: 02/21/2018] [Indexed: 12/13/2022] Open
Abstract
In this work, we study two approaches for the problem of RNA-Protein Interaction (RPI). In the first approach, we use a feature-based technique by combining extracted features from both sequences and secondary structures. The feature-based approach enhanced the prediction accuracy as it included much more available information about the RNA-protein pairs. In the second approach, we apply search algorithms and data structures to extract effective string patterns for prediction of RPI, using both sequence information (protein and RNA sequences), and structure information (protein and RNA secondary structures). This led to different string-based models for predicting interacting RNA-protein pairs. We show results that demonstrate the effectiveness of the proposed approaches, including comparative results against leading state-of-the-art methods.
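The abstract does not say which features the feature-based approach extracts. One common way to turn a sequence into a fixed-length numeric vector is k-mer frequency counting, sketched below purely as an illustrative assumption; the function name `kmer_frequencies` and the default alphabet are hypothetical and not taken from the paper.

```python
from itertools import product


def kmer_frequencies(sequence, k=3, alphabet="ACGU"):
    """Return the relative frequency of every possible k-mer over `alphabet`.

    The vector has len(alphabet) ** k entries in lexicographic order of the
    alphabet; windows containing other characters (e.g. 'N') are skipped.
    """
    kmers = ["".join(letters) for letters in product(alphabet, repeat=k)]
    counts = dict.fromkeys(kmers, 0)
    total = 0
    for i in range(len(sequence) - k + 1):
        window = sequence[i:i + k]
        if window in counts:
            counts[window] += 1
            total += 1
    return [counts[kmer] / total if total else 0.0 for kmer in kmers]


# Hypothetical RNA fragment; a protein sequence would use a 20-letter alphabet.
features = kmer_frequencies("ACGUACGUAC", k=2)
print(len(features), round(sum(features), 6))
```

Vectors like this for the RNA and the protein can be concatenated, optionally with the same encoding applied to secondary-structure strings, and passed to any standard classifier.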
Affiliation(s)
- Donald Adjeroh
- Lane Department of Computer Science and Electrical Engineering, West Virginia University, Morgantown, WV 26508, USA.
| | - Maen Allaga
- Lane Department of Computer Science and Electrical Engineering, West Virginia University, Morgantown, WV 26508, USA.
| | - Jun Tan
- Lane Department of Computer Science and Electrical Engineering, West Virginia University, Morgantown, WV 26508, USA.
| | - Jie Lin
- Faculty of Software, Fujian Normal University, Fuzhou 350108, China.
| | - Yue Jiang
- Faculty of Software, Fujian Normal University, Fuzhou 350108, China.
| | - Ahmed Abbasi
- McIntire School of Commerce, University of Virginia, Charlottesville, VA 22904, USA.
| | - Xiaobo Zhou
- McGovern Medical School, and School of Biomedical Informatics, The University of Texas Health Science Center at Houston (UTHealth), Houston, TX 77030, USA.
| |
|
26
|
Shen EZ, Chen H, Ozturk AR, Tu S, Shirayama M, Tang W, Ding YH, Dai SY, Weng Z, Mello CC. Identification of piRNA Binding Sites Reveals the Argonaute Regulatory Landscape of the C. elegans Germline. Cell 2018; 172:937-951.e18. [PMID: 29456082 DOI: 10.1016/j.cell.2018.02.002] [Citation(s) in RCA: 139] [Impact Index Per Article: 23.2] [Received: 01/02/2018] [Revised: 01/26/2018] [Accepted: 01/31/2018] [Indexed: 12/20/2022]
Abstract
piRNAs (Piwi-interacting small RNAs) engage Piwi Argonautes to silence transposons and promote fertility in animal germlines. Genetic and computational studies have suggested that C. elegans piRNAs tolerate mismatched pairing and in principle could target every transcript. Here we employ in vivo cross-linking to identify transcriptome-wide interactions between piRNAs and target RNAs. We show that piRNAs engage all germline mRNAs and that piRNA binding follows microRNA-like pairing rules. Targeting correlates better with binding energy than with piRNA abundance, suggesting that piRNA concentration does not limit targeting. In mRNAs silenced by piRNAs, secondary small RNAs accumulate at the center and ends of piRNA binding sites. In germline-expressed mRNAs, however, targeting by the CSR-1 Argonaute correlates with reduced piRNA binding density and suppression of piRNA-associated secondary small RNAs. Our findings reveal physiologically important and nuanced regulation of individual piRNA targets and provide evidence for a comprehensive post-transcriptional regulatory step in germline gene expression.
Affiliation(s)
- En-Zhi Shen
- RNA Therapeutics Institute, University of Massachusetts Medical School, Worcester, MA 01605, USA
| | - Hao Chen
- Program in Bioinformatics and Integrative Biology, University of Massachusetts Medical School, Worcester, MA 01605, USA; Bioinformatics Program, Boston University, Boston, MA 02215, USA
| | - Ahmet R Ozturk
- RNA Therapeutics Institute, University of Massachusetts Medical School, Worcester, MA 01605, USA
| | - Shikui Tu
- Program in Bioinformatics and Integrative Biology, University of Massachusetts Medical School, Worcester, MA 01605, USA; Department of Computer Science and Engineering, and CMaCH center, Shanghai Jiao Tong University, Shanghai, China
| | - Masaki Shirayama
- RNA Therapeutics Institute, University of Massachusetts Medical School, Worcester, MA 01605, USA; Howard Hughes Medical Institute
| | - Wen Tang
- RNA Therapeutics Institute, University of Massachusetts Medical School, Worcester, MA 01605, USA
| | - Yue-He Ding
- RNA Therapeutics Institute, University of Massachusetts Medical School, Worcester, MA 01605, USA
| | - Si-Yuan Dai
- RNA Therapeutics Institute, University of Massachusetts Medical School, Worcester, MA 01605, USA
| | - Zhiping Weng
- Program in Bioinformatics and Integrative Biology, University of Massachusetts Medical School, Worcester, MA 01605, USA
| | - Craig C Mello
- RNA Therapeutics Institute, University of Massachusetts Medical School, Worcester, MA 01605, USA; Howard Hughes Medical Institute.
| |
|
27
|
Cook KB, Vembu S, Ha KCH, Zheng H, Laverty KU, Hughes TR, Ray D, Morris QD. RNAcompete-S: Combined RNA sequence/structure preferences for RNA binding proteins derived from a single-step in vitro selection. Methods 2017; 126:18-28. [PMID: 28651966 DOI: 10.1016/j.ymeth.2017.06.024] [Citation(s) in RCA: 22] [Impact Index Per Article: 3.1] [Received: 03/16/2017] [Revised: 06/16/2017] [Accepted: 06/21/2017] [Indexed: 12/15/2022] Open
Abstract
RNA-binding proteins recognize RNA sequences and structures, but there is currently no systematic and accurate method to derive large (>12 base) motifs de novo that reflect a combination of intrinsic preference to both sequence and structure. To address this absence, we introduce RNAcompete-S, which couples a single-step competitive binding reaction with an excess of random RNA 40-mers to a custom computational pipeline for interrogation of the bound RNA sequences and derivation of SSMs (Sequence and Structure Models). RNAcompete-S confirms that HuR, QKI, and SRSF1 prefer binding sites that are single stranded, and recapitulates known 8-10 bp sequence and structure preferences for Vts1p and RBMY. We also derive an 18-base long SSM for Drosophila SLBP, which to our knowledge has not been previously determined by selections from pure random sequence, and accurately discriminates human replication-dependent histone mRNAs. Thus, RNAcompete-S enables accurate identification of large, intrinsic sequence-structure specificities with a uniform assay.
Affiliation(s)
- Kate B Cook
- Department of Molecular Genetics, University of Toronto, Toronto M5S 1A8, Canada
| | - Shankar Vembu
- Donnelly Centre, University of Toronto, Toronto M5S 3E1, Canada
| | - Kevin C H Ha
- Department of Molecular Genetics, University of Toronto, Toronto M5S 1A8, Canada
| | - Hong Zheng
- Donnelly Centre, University of Toronto, Toronto M5S 3E1, Canada
| | - Kaitlin U Laverty
- Department of Molecular Genetics, University of Toronto, Toronto M5S 1A8, Canada
| | - Timothy R Hughes
- Department of Molecular Genetics, University of Toronto, Toronto M5S 1A8, Canada; Donnelly Centre, University of Toronto, Toronto M5S 3E1, Canada.
| | - Debashish Ray
- Donnelly Centre, University of Toronto, Toronto M5S 3E1, Canada.
| | - Quaid D Morris
- Department of Molecular Genetics, University of Toronto, Toronto M5S 1A8, Canada; Donnelly Centre, University of Toronto, Toronto M5S 3E1, Canada; Department of Computer Science, University of Toronto, Toronto M5S 2E4, Canada; Department of Electrical and Computer Engineering, University of Toronto, Toronto M5S 3G4, Canada.
| |
|
28
|
Glouzon JPS, Perreault JP, Wang S. The super-n-motifs model: a novel alignment-free approach for representing and comparing RNA secondary structures. Bioinformatics 2017; 33:1169-1178. [PMID: 28088762 DOI: 10.1093/bioinformatics/btw773] [Citation(s) in RCA: 6] [Impact Index Per Article: 0.9] [Received: 12/11/2015] [Indexed: 12/13/2022] Open
Abstract
Motivation: Comparing ribonucleic acid (RNA) secondary structures of arbitrary size uncovers structural patterns that can provide a better understanding of RNA functions. However, performing fast and accurate secondary structure comparisons is challenging when we take into account the RNA configuration (i.e. linear or circular), the presence of pseudoknot and G-quadruplex (G4) motifs and the increasing number of secondary structures generated by high-throughput probing techniques. To address this challenge, we propose the super-n-motifs model based on a latent analysis of enhanced motifs comprising not only basic motifs but also adjacency relations. The super-n-motifs model computes a vector representation of secondary structures as linear combinations of these motifs. Results: We demonstrate the accuracy of our model for comparison of secondary structures from linear and circular RNA while also considering pseudoknot and G4 motifs. We show that the super-n-motifs representation effectively captures the most important structural features of secondary structures, as compared to other representations such as ordered tree, arc-annotated and string representations. Finally, we demonstrate the time efficiency of our model, which is alignment free and capable of performing large-scale comparisons of 10 000 secondary structures with an efficiency up to 4 orders of magnitude faster than existing approaches. Availability and Implementation: The super-n-motifs model was implemented in C++. Source code and Linux binary are freely available at http://jpsglouzon.github.io/supernmotifs/. Contact: Shengrui.Wang@Usherbrooke.ca. Supplementary information: Supplementary data are available at Bioinformatics online.
Affiliation(s)
- Jean-Pierre Séhi Glouzon
- Department of Computer Science, Faculty of Science, Université de Sherbrooke, Sherbrooke, QC J1H 5N4, Canada; RNA Group, Department of Biochemistry, Faculty of Medicine and Health Sciences, Applied Cancer Research Pavilion, Université de Sherbrooke, Sherbrooke, QC J1E 4K8, Canada
| | - Jean-Pierre Perreault
- RNA Group, Department of Biochemistry, Faculty of Medicine and Health Sciences, Applied Cancer Research Pavilion, Université de Sherbrooke, Sherbrooke, QC J1E 4K8, Canada
| | - Shengrui Wang
- Department of Computer Science, Faculty of Science, Université de Sherbrooke, Sherbrooke, QC J1H 5N4, Canada
| |
|
29
|
Luo J, Liu L, Venkateswaran S, Song Q, Zhou X. RPI-Bind: a structure-based method for accurate identification of RNA-protein binding sites. Sci Rep 2017; 7:614. [PMID: 28377624 PMCID: PMC5429624 DOI: 10.1038/s41598-017-00795-4] [Citation(s) in RCA: 21] [Impact Index Per Article: 3.0] [Received: 11/24/2016] [Accepted: 03/13/2017] [Indexed: 01/11/2023] Open
Abstract
RNA and protein interactions play crucial roles in multiple biological processes, while these interactions are significantly influenced by the structures and sequences of protein and RNA molecules. In this study, we first performed an analysis of RNA-protein interacting complexes, and identified interface properties of sequences and structures, which reveal the diverse nature of the binding sites. With the observations, we built a three-step prediction model, namely RPI-Bind, for the identification of RNA-protein binding regions using the sequences and structures of both proteins and RNAs. The three steps include 1) the prediction of RNA binding regions on protein, 2) the prediction of protein binding regions on RNA, and 3) the prediction of interacting regions on both RNA and protein simultaneously, with the results from steps 1) and 2). Compared with existing methods, most of which employ only sequences, our model significantly improves the prediction accuracy at each of the three steps. Especially, our model outperforms the catRAPID by >20% at the 3rd step. All of these results indicate the importance of structures in RNA-protein interactions, and suggest that the RPI-Bind model is a powerful theoretical framework for studying RNA-protein interactions.
Affiliation(s)
- Jiesi Luo
- Center for Bioinformatics and Systems Biology and Department of Radiology, Wake Forest School of Medicine, Winston-Salem, NC, 27157, USA
| | - Liang Liu
- Center for Bioinformatics and Systems Biology and Department of Radiology, Wake Forest School of Medicine, Winston-Salem, NC, 27157, USA
| | - Suresh Venkateswaran
- Center for Bioinformatics and Systems Biology and Department of Radiology, Wake Forest School of Medicine, Winston-Salem, NC, 27157, USA
| | - Qianqian Song
- Center for Bioinformatics and Systems Biology and Department of Radiology, Wake Forest School of Medicine, Winston-Salem, NC, 27157, USA
| | - Xiaobo Zhou
- Center for Bioinformatics and Systems Biology and Department of Radiology, Wake Forest School of Medicine, Winston-Salem, NC, 27157, USA.
| |
|
30
|
Li Y, Shi X, Liang Y, Xie J, Zhang Y, Ma Q. RNA-TVcurve: a Web server for RNA secondary structure comparison based on a multi-scale similarity of its triple vector curve representation. BMC Bioinformatics 2017; 18:51. [PMID: 28109252 PMCID: PMC5251234 DOI: 10.1186/s12859-017-1481-7] [Citation(s) in RCA: 8] [Impact Index Per Article: 1.1] [Received: 08/25/2016] [Accepted: 01/10/2017] [Indexed: 01/10/2023] Open
Abstract
Background: RNAs have been found to carry diverse functionalities in nature. Inferring the similarity between two given RNAs is a fundamental step to understand and interpret their functional relationship. The majority of functional RNAs show conserved secondary structures, rather than sequence conservation. Those algorithms relying on sequence-based features usually have limitations in their prediction performance. Hence, integrating RNA structure features is very critical for RNA analysis. Existing algorithms mainly fall into two categories: alignment-based and alignment-free. The alignment-free algorithms of RNA comparison usually have lower time complexity than alignment-based algorithms. Results: An alignment-free RNA comparison algorithm was proposed, in which novel numerical representations RNA-TVcurve (triple vector curve representation) of RNA sequence and corresponding secondary structure features are provided. Then a multi-scale similarity score of two given RNAs was designed based on wavelet decomposition of their numerical representation. In support of RNA mutation and phylogenetic analysis, a web server (RNA-TVcurve) was designed based on this alignment-free RNA comparison algorithm. It provides three functional modules: 1) visualization of numerical representation of RNA secondary structure; 2) detection of single-point mutation based on secondary structure; and 3) comparison of pairwise and multiple RNA secondary structures. The inputs of the web server require RNA primary sequences, while corresponding secondary structures are optional. For the primary sequences alone, the web server can compute the secondary structures using free energy minimization algorithm in terms of RNAfold tool from Vienna RNA package. Conclusion: RNA-TVcurve is the first integrated web server, based on an alignment-free method, to deliver a suite of RNA analysis functions, including visualization, mutation analysis and multiple RNAs structure comparison. The comparison results with two popular RNA comparison tools, RNApdist and RNAdistance, showcased that RNA-TVcurve can efficiently capture subtle relationships among RNAs for mutation detection and non-coding RNA classification. All the relevant results were shown in an intuitive graphical manner, and can be freely downloaded from this server. RNA-TVcurve, along with test examples and detailed documents, are available at: http://ml.jlu.edu.cn/tvcurve/. Electronic supplementary material: The online version of this article (doi:10.1186/s12859-017-1481-7) contains supplementary material, which is available to authorized users.
Affiliation(s)
- Ying Li
- College of Computer Science and Technology, Jilin University, Changchun, 130012, China; Key Laboratory of Symbolic Computation and Knowledge Engineering (Jilin University), Ministry of Education, Changchun, 130012, China
| | - Xiaohu Shi
- College of Computer Science and Technology, Jilin University, Changchun, 130012, China; Key Laboratory of Symbolic Computation and Knowledge Engineering (Jilin University), Ministry of Education, Changchun, 130012, China
| | - Yanchun Liang
- College of Computer Science and Technology, Jilin University, Changchun, 130012, China; Key Laboratory of Symbolic Computation and Knowledge Engineering (Jilin University), Ministry of Education, Changchun, 130012, China; Zhuhai Laboratory of Key Laboratory of Symbol Computation and Knowledge Engineering of Ministry of Education, Zhuhai College of Jilin University, Zhuhai, 519041, China
| | - Juan Xie
- Department of Mathematics and Statistics, South Dakota State University, Brookings, SD, 57007, USA; Bioinformatics and Mathematical Biosciences Lab, Department of Agronomy, Horticulture and Plant Science, South Dakota State University, Brookings, SD, 57007, USA; BioSNTR, Brookings, SD, USA
| | - Yu Zhang
- College of Computer Science and Technology, Jilin University, Changchun, 130012, China; Key Laboratory of Symbolic Computation and Knowledge Engineering (Jilin University), Ministry of Education, Changchun, 130012, China.
| | - Qin Ma
- Department of Mathematics and Statistics, South Dakota State University, Brookings, SD, 57007, USA; Bioinformatics and Mathematical Biosciences Lab, Department of Agronomy, Horticulture and Plant Science, South Dakota State University, Brookings, SD, 57007, USA; BioSNTR, Brookings, SD, USA.
| |
|
31
|
Pietrosanto M, Mattei E, Helmer-Citterich M, Ferrè F. A novel method for the identification of conserved structural patterns in RNA: From small scale to high-throughput applications. Nucleic Acids Res 2016; 44:8600-8609. [PMID: 27580722 PMCID: PMC5062999 DOI: 10.1093/nar/gkw750] [Citation(s) in RCA: 19] [Impact Index Per Article: 2.4] [Received: 05/23/2016] [Accepted: 08/17/2016] [Indexed: 12/21/2022] Open
Abstract
Functional RNA regions are often related to recurrent secondary structure patterns (or motifs), which can exert their role in several different ways, particularly in dictating the interaction with RNA-binding proteins, and acting in the regulation of a large number of cellular processes. Among the available motif-finding tools, the majority focuses on sequence patterns, sometimes including secondary structure as additional constraints to improve their performance. Nonetheless, secondary structures motifs may be concurrent to their sequence counterparts or even encode a stronger functional signal. Current methods for searching structural motifs generally require long pipelines and/or high computational efforts or previously aligned sequences. Here, we present BEAM (BEAr Motif finder), a novel method for structural motif discovery from a set of unaligned RNAs, taking advantage of a recently developed encoding for RNA secondary structure named BEAR (Brand nEw Alphabet for RNAs) and of evolutionary substitution rates of secondary structure elements. Tested in a varied set of scenarios, from small- to large-scale, BEAM is successful in retrieving structural motifs even in highly noisy data sets, such as those that can arise in CLIP-Seq or other high-throughput experiments.
Affiliation(s)
- Marco Pietrosanto
- Centre for Molecular Bioinformatics, Department of Biology, University of Rome Tor Vergata, Via della Ricerca Scientifica snc, 00133 Rome, Italy
| | - Eugenio Mattei
- Centre for Molecular Bioinformatics, Department of Biology, University of Rome Tor Vergata, Via della Ricerca Scientifica snc, 00133 Rome, Italy
| | - Manuela Helmer-Citterich
- Centre for Molecular Bioinformatics, Department of Biology, University of Rome Tor Vergata, Via della Ricerca Scientifica snc, 00133 Rome, Italy
| | - Fabrizio Ferrè
- Department of Pharmacy and Biotechnology (FaBiT), University of Bologna Alma Mater, Via Belmeloro 8/2, 40126 Bologna, Italy
| |
|
32
|
Abstract
Long non-coding RNAs (lncRNAs) are associated to a plethora of cellular functions, most of which require the interaction with one or more RNA-binding proteins (RBPs); similarly, RBPs are often able to bind a large number of different RNAs. The currently available knowledge is already drawing an intricate network of interactions, whose deregulation is frequently associated to pathological states. Several different techniques were developed in the past years to obtain protein–RNA binding data in a high-throughput fashion. In parallel, in silico inference methods were developed for the accurate computational prediction of the interaction of RBP–lncRNA pairs. The field is growing rapidly, and it is foreseeable that in the near future, the protein–lncRNA interaction network will rise, offering essential clues for a better understanding of lncRNA cellular mechanisms and their disease-associated perturbations.
|
33
|
Mattei E, Pietrosanto M, Ferrè F, Helmer-Citterich M. Web-Beagle: a web server for the alignment of RNA secondary structures. Nucleic Acids Res 2015; 43:W493-7. [PMID: 25977293 PMCID: PMC4489221 DOI: 10.1093/nar/gkv489] [Citation(s) in RCA: 26] [Impact Index Per Article: 2.9] [Received: 02/14/2015] [Accepted: 05/02/2015] [Indexed: 12/18/2022] Open
Abstract
Web-Beagle (http://beagle.bio.uniroma2.it) is a web server for the pairwise global or local alignment of RNA secondary structures. The server exploits a new encoding for RNA secondary structure and a substitution matrix of RNA structural elements to perform RNA structural alignments. The web server allows the user to compute up to 10 000 alignments in a single run, taking as input sets of RNA sequences and structures or primary sequences alone. In the latter case, the server computes the secondary structure prediction for the RNAs on-the-fly using RNAfold (free energy minimization). The user can also compare a set of input RNAs to one of five pre-compiled RNA datasets including lncRNAs and 3′ UTRs. All types of comparison produce in output the pairwise alignments along with structural similarity and statistical significance measures for each resulting alignment. A graphical color-coded representation of the alignments allows the user to easily identify structural similarities between RNAs. Web-Beagle can be used for finding structurally related regions in two or more RNAs, for the identification of homologous regions or for functional annotation. Benchmark tests show that Web-Beagle has lower computational complexity, running time and better performances than other available methods.
Affiliation(s)
- Eugenio Mattei
- Centre for Molecular Bioinformatics, Department of Biology, University of Rome Tor Vergata, Via della Ricerca Scientifica snc, 00133 Rome, Italy
| | - Marco Pietrosanto
- Centre for Molecular Bioinformatics, Department of Biology, University of Rome Tor Vergata, Via della Ricerca Scientifica snc, 00133 Rome, Italy
| | - Fabrizio Ferrè
- Department of Pharmacy and Biotechnology (FaBiT), University of Bologna Alma Mater, Via Belmeloro 6, 40126 Bologna, Italy
| | - Manuela Helmer-Citterich
- Centre for Molecular Bioinformatics, Department of Biology, University of Rome Tor Vergata, Via della Ricerca Scientifica snc, 00133 Rome, Italy
| |
|