1
Pietrosanto M, Ausiello G, Helmer-Citterich M. Motif Discovery from CLIP Experiments. Methods Mol Biol 2021; 2284:43-50. [PMID: 33835436 DOI: 10.1007/978-1-0716-1307-8_3] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 10/21/2022]
Abstract
RNA primary and secondary motif discovery is an important step in the annotation and characterization of unknown interaction dynamics between RNAs and RNA-Binding Proteins, and several methods have been developed to meet the need for fast and efficient discovery of interaction motifs. Recent advances have increased the amount of data produced by experimental assays, and no available method is suitable for the analysis of all types of results. Here we present a simple workflow to help choose the most appropriate method, depending on the starting situation, among the three algorithms that best cover the landscape of approaches. A detailed analysis is presented to highlight the need for different algorithms in different working settings. In conclusion, the proposed workflow depends on the nature of the starting data and on the availability of RNA annotations.
Affiliation(s)
- Marco Pietrosanto
  - Centre for Molecular Bioinformatics, Department of Biology, University of Rome Tor Vergata, Rome, Italy
- Gabriele Ausiello
  - Centre for Molecular Bioinformatics, Department of Biology, University of Rome Tor Vergata, Rome, Italy
- Manuela Helmer-Citterich
  - Centre for Molecular Bioinformatics, Department of Biology, University of Rome Tor Vergata, Rome, Italy
2
Pietrosanto M, Adinolfi M, Guarracino A, Ferrè F, Ausiello G, Vitale I, Helmer-Citterich M. Relative Information Gain: Shannon entropy-based measure of the relative structural conservation in RNA alignments. NAR Genom Bioinform 2021; 3:lqab007. [PMID: 33615214 PMCID: PMC7884220 DOI: 10.1093/nargab/lqab007] [Citation(s) in RCA: 2] [Impact Index Per Article: 0.7] [Received: 04/21/2020] [Revised: 12/18/2020] [Accepted: 01/26/2021] [Indexed: 12/21/2022] [Open Access]
Abstract
Structural characterization of RNAs is a dynamic field, offering many modelling possibilities. RNA secondary structure models are usually characterized by an encoding that depicts structural information of the molecule through string representations or graphs. In this work, we provide a generalization of the BEAR encoding (a context-aware structural encoding we previously developed) by expanding the set of alignments used for the construction of substitution matrices and then applying it to secondary structure encodings ranging from fine-grained to more coarse-grained representations. We also introduce a re-interpretation of the Shannon Information applied on RNA alignments, proposing a new scoring metric, the Relative Information Gain (RIG). The RIG score is available for any position in an alignment, showing how different levels of detail encoded in the RNA representation can contribute differently to convey structural information. The approaches presented in this study can be used alongside state-of-the-art tools to synergistically gain insights into the structural elements that RNAs and RNA families are composed of. This additional information could potentially contribute to their improvement or increase the degree of confidence in the secondary structure of families and any set of aligned RNAs.
Affiliation(s)
- Marco Pietrosanto
  - Centre for Molecular Bioinformatics, Department of Biology, University of Rome Tor Vergata, Via della Ricerca Scientifica snc, 00133 Rome, Italy
- Marta Adinolfi
  - Centre for Molecular Bioinformatics, Department of Biology, University of Rome Tor Vergata, Via della Ricerca Scientifica snc, 00133 Rome, Italy
- Andrea Guarracino
  - Centre for Molecular Bioinformatics, Department of Biology, University of Rome Tor Vergata, Via della Ricerca Scientifica snc, 00133 Rome, Italy
- Fabrizio Ferrè
  - Department of Pharmacy and Biotechnology (FaBiT), University of Bologna Alma Mater, Via Belmeloro 6, 40126 Bologna, Italy
- Gabriele Ausiello
  - Centre for Molecular Bioinformatics, Department of Biology, University of Rome Tor Vergata, Via della Ricerca Scientifica snc, 00133 Rome, Italy
- Ilio Vitale
  - IIGM - Italian Institute for Genomic Medicine, c/o IRCSS Candiolo, 10060 Torino, Italy
- Manuela Helmer-Citterich
  - Centre for Molecular Bioinformatics, Department of Biology, University of Rome Tor Vergata, Via della Ricerca Scientifica snc, 00133 Rome, Italy
3
Singh J, Hanson J, Paliwal K, Zhou Y. RNA secondary structure prediction using an ensemble of two-dimensional deep neural networks and transfer learning. Nat Commun 2019; 10:5407. [PMID: 31776342 PMCID: PMC6881452 DOI: 10.1038/s41467-019-13395-9] [Citation(s) in RCA: 152] [Impact Index Per Article: 30.4] [Received: 06/12/2019] [Accepted: 11/01/2019] [Indexed: 01/03/2023] [Open Access]
Abstract
The majority of our human genome transcribes into noncoding RNAs with unknown structures and functions. Obtaining functional clues for noncoding RNAs requires accurate base-pairing or secondary-structure prediction. However, the performance of such predictions by current folding-based algorithms has stagnated for more than a decade. Here, we propose the use of deep contextual learning for base-pair prediction, including those noncanonical and non-nested (pseudoknot) base pairs stabilized by tertiary interactions. Since only <250 nonredundant, high-resolution RNA structures are available for model training, we utilize transfer learning from a model initially trained with a recent high-quality bpRNA dataset of >10,000 nonredundant RNAs made available through comparative analysis. The resulting method achieves a large, statistically significant improvement in predicting all base pairs, noncanonical and non-nested base pairs in particular. The proposed method (SPOT-RNA), with a freely available server and standalone software, should be useful for improving RNA structure modeling, sequence alignment, and functional annotations.
Affiliation(s)
- Jaswinder Singh
  - Signal Processing Laboratory, School of Engineering and Built Environment, Griffith University, Brisbane, QLD, 4111, Australia
- Jack Hanson
  - Signal Processing Laboratory, School of Engineering and Built Environment, Griffith University, Brisbane, QLD, 4111, Australia
- Kuldip Paliwal
  - Signal Processing Laboratory, School of Engineering and Built Environment, Griffith University, Brisbane, QLD, 4111, Australia
- Yaoqi Zhou
  - Institute for Glycomics and School of Information and Communication Technology, Griffith University, Parklands Dr., Southport, QLD, 4222, Australia
4
Adinolfi M, Pietrosanto M, Parca L, Ausiello G, Ferrè F, Helmer-Citterich M. Discovering sequence and structure landscapes in RNA interaction motifs. Nucleic Acids Res 2019; 47:4958-4969. [PMID: 31162604 PMCID: PMC6547422 DOI: 10.1093/nar/gkz250] [Citation(s) in RCA: 13] [Impact Index Per Article: 2.6] [Received: 11/15/2018] [Revised: 02/22/2019] [Accepted: 04/09/2019] [Indexed: 12/16/2022] [Open Access]
Abstract
RNA molecules are able to bind proteins, DNA and other small or long RNAs using information at the primary, secondary or tertiary structure level. Recent techniques that use cross-linking and immunoprecipitation of RNAs can detect these interactions and, if followed by high-throughput sequencing, molecules can be analysed to find recurrent elements shared by interactors, such as sequence and/or structure motifs. Many tools are able to find sequence motifs from lists of target RNAs, while others focus on structure using different approaches to find specific interaction elements. In this work, we perform a systematic analysis of RBP-RNA and RNA-RNA datasets to better characterize the interaction landscape with information about multi-motifs on the same RNAs. To achieve this goal, we updated our BEAM algorithm to combine both sequence and structure information to create pairs of patterns that model motifs of interaction. This algorithm was applied to several RNA binding proteins and ncRNA interactors, confirming already known motifs and discovering new ones. This landscape analysis of interaction variability reflects the diversity of target recognition and underlines that often both primary and secondary structure are involved in molecular recognition.
Affiliation(s)
- Marta Adinolfi
  - Centre for Molecular Bioinformatics, Department of Biology, University of Rome Tor Vergata, Via della Ricerca Scientifica snc, 00133 Rome, Italy
- Marco Pietrosanto
  - Centre for Molecular Bioinformatics, Department of Biology, University of Rome Tor Vergata, Via della Ricerca Scientifica snc, 00133 Rome, Italy
- Luca Parca
  - Centre for Molecular Bioinformatics, Department of Biology, University of Rome Tor Vergata, Via della Ricerca Scientifica snc, 00133 Rome, Italy
- Gabriele Ausiello
  - Centre for Molecular Bioinformatics, Department of Biology, University of Rome Tor Vergata, Via della Ricerca Scientifica snc, 00133 Rome, Italy
- Fabrizio Ferrè
  - Department of Pharmacy and Biotechnology (FaBiT), University of Bologna Alma Mater, Via Selmi 3, 40126 Bologna, Italy
- Manuela Helmer-Citterich
  - Centre for Molecular Bioinformatics, Department of Biology, University of Rome Tor Vergata, Via della Ricerca Scientifica snc, 00133 Rome, Italy
5
Evolutionary Patterns of Non-Coding RNA in Cardiovascular Biology. Noncoding RNA 2019; 5:ncrna5010015. [PMID: 30709035 PMCID: PMC6468844 DOI: 10.3390/ncrna5010015] [Citation(s) in RCA: 13] [Impact Index Per Article: 2.6] [Received: 12/21/2018] [Revised: 01/26/2019] [Accepted: 01/29/2019] [Indexed: 12/15/2022] [Open Access]
Abstract
Cardiovascular diseases (CVDs) affect the heart and the vascular system with a high prevalence and place a huge burden on society as well as the healthcare system. These complex diseases are often the result of multiple genetic and environmental risk factors and pose a great challenge to understanding their etiology and consequences. With the advent of next generation sequencing, many non-coding RNA transcripts, especially long non-coding RNAs (lncRNAs), have been linked to the pathogenesis of CVD. Despite increasing evidence, the proper functional characterization of most of these molecules is still lacking. The exploration of conservation of sequences across related species has been used to functionally annotate protein coding genes. In contrast, the rapid evolutionary turnover and weak sequence conservation of lncRNAs make it difficult to characterize functional homologs for these sequences. Recent studies have tried to explore other dimensions of interspecies conservation to elucidate the functional role of these novel transcripts. In this review, we summarize various methodologies adopted to explore the evolutionary conservation of cardiovascular non-coding RNAs at sequence, secondary structure, syntenic, and expression level.
6
Yartseva V, Takacs CM, Vejnar CE, Lee MT, Giraldez AJ. RESA identifies mRNA-regulatory sequences at high resolution. Nat Methods 2016; 14:201-207. [PMID: 28024160 PMCID: PMC5423094 DOI: 10.1038/nmeth.4121] [Citation(s) in RCA: 30] [Impact Index Per Article: 3.8] [Received: 09/29/2016] [Accepted: 12/02/2016] [Indexed: 11/22/2022]
Abstract
Gene expression is regulated extensively at the level of mRNA stability, localization, and translation. However, decoding functional RNA regulatory features remains a limitation to understanding post-transcriptional regulation in vivo. Here, we developed RNA Element Selection Assay (RESA), a method that selects RNA elements based on their activity in vivo and uses high-throughput sequencing to provide quantitative measurement of their regulatory function with near nucleotide resolution. We implemented RESA to identify sequence elements modulating mRNA stability during zebrafish embryogenesis. RESA provides a sensitive and quantitative measure of microRNA activity in vivo and also identifies novel regulatory sequences. To uncover specific sequence requirements within regulatory elements, we developed a bisulfite-mediated nucleotide conversion strategy for large-scale mutational analysis (RESA-bisulfite). Finally, we used the versatile RESA platform to map candidate protein-RNA interactions in vivo (RESA-CLIP). The RESA platform can be broadly applicable to uncover the regulatory features shaping gene expression and cellular function.
Affiliation(s)
- Valeria Yartseva
  - Department of Genetics, Yale University School of Medicine, New Haven, Connecticut, USA
- Carter M Takacs
  - Department of Genetics, Yale University School of Medicine, New Haven, Connecticut, USA
- Charles E Vejnar
  - Department of Genetics, Yale University School of Medicine, New Haven, Connecticut, USA
- Miler T Lee
  - Department of Genetics, Yale University School of Medicine, New Haven, Connecticut, USA; Department of Biological Sciences, University of Pittsburgh, Pittsburgh, Pennsylvania, USA
- Antonio J Giraldez
  - Department of Genetics, Yale University School of Medicine, New Haven, Connecticut, USA; Yale Stem Cell Center, Yale University School of Medicine, New Haven, Connecticut, USA; Yale Cancer Center, Yale University School of Medicine, New Haven, Connecticut, USA
7
Single-stranded DNA aptamer targeting and neutralization of anti-D alloantibody: a potential therapeutic strategy for haemolytic diseases caused by Rhesus alloantibody. Blood Transfus 2016; 16:184-192. [PMID: 27893356 DOI: 10.2450/2016.0123-16] [Citation(s) in RCA: 2] [Impact Index Per Article: 0.3] [Received: 05/06/2016] [Accepted: 09/26/2016] [Indexed: 01/09/2023]
Abstract
BACKGROUND: Rhesus (Rh) D antigen is the most important antigen in the Rh blood group system because of its strong immunogenicity. When RhD-negative individuals are exposed to RhD-positive blood, they may produce anti-D alloantibody, potentially resulting in delayed haemolytic transfusion reactions and Rh haemolytic disease of the foetus and newborn, which are difficult to treat. Inhibition of the binding of anti-D antibody with RhD antigens on the surface of red blood cells may effectively prevent immune haemolytic diseases.
MATERIALS AND METHODS: In this study, single-stranded (ss) DNA aptamers, specifically binding to anti-D antibodies, were selected via systematic evolution of ligands by exponential enrichment (SELEX) technology. After 14 rounds of selection, the purified ssDNA was sequenced using a Personal Genome Machine system. Haemagglutination inhibition assays were performed to screen aptamers for biological activity in terms of blocking antigen-antibody reactions; the affinity and specificity of the aptamers were also determined.
RESULTS: In addition to high specificity, the aptamers which were selected showed high affinity for anti-D antibodies with dissociation constant (Kd) values ranging from 51.46±14.90 to 543.30±92.59 nM. By the combined use of specific ssDNA aptamer 7 and auxiliary ssDNA aptamer 2, anti-D could be effectively neutralised at low concentrations of the aptamers.
DISCUSSION: Our results demonstrate that ssDNA aptamers may be a novel, promising strategy for the treatment of delayed haemolytic transfusion reactions and Rh haemolytic disease of the foetus and newborn.
8
Systematic identification of regulatory elements in conserved 3' UTRs of human transcripts. Cell Rep 2014; 7:281-92. [PMID: 24656821 DOI: 10.1016/j.celrep.2014.03.001] [Citation(s) in RCA: 48] [Impact Index Per Article: 4.8] [Received: 12/24/2013] [Revised: 02/03/2014] [Accepted: 03/03/2014] [Indexed: 11/21/2022] [Open Access]
Abstract
Posttranscriptional regulatory programs governing diverse aspects of RNA biology remain largely uncharacterized. Understanding the functional roles of RNA cis-regulatory elements is essential for decoding complex programs that underlie the dynamic regulation of transcript stability, splicing, localization, and translation. Here, we describe a combined experimental/computational technology to reveal a catalog of functional regulatory elements embedded in 3' UTRs of human transcripts. We used a bidirectional reporter system coupled with flow cytometry and high-throughput sequencing to measure the effect of short, noncoding, vertebrate-conserved RNA sequences on transcript stability and translation. Information-theoretic motif analysis of the resulting sequence-to-gene-expression mapping revealed linear and structural RNA cis-regulatory elements that positively and negatively modulate the posttranscriptional fates of human transcripts. This combined experimental/computational strategy can be used to systematically characterize the vast landscape of posttranscriptional regulatory elements controlling physiological and pathological cellular state transitions.
9
Abstract
Transcriptomics experiments and computational predictions both enable systematic discovery of new functional RNAs. However, many putative noncoding transcripts arise instead from artifacts and biological noise, and current computational prediction methods have high false positive rates. I discuss prospects for improving computational methods for analyzing and identifying functional RNAs, with a focus on detecting signatures of conserved RNA secondary structure. An interesting new front is the application of chemical and enzymatic experiments that probe RNA structure on a transcriptome-wide scale. I review several proposed approaches for incorporating structure probing data into the computational prediction of RNA secondary structure. Using probabilistic inference formalisms, I show how all these approaches can be unified in a well-principled framework, which in turn allows RNA probing data to be easily integrated into a wide range of analyses that depend on RNA secondary structure inference. Such analyses include homology search and genome-wide detection of new structural RNAs.
Affiliation(s)
- Sean R Eddy
  - Howard Hughes Medical Institute Janelia Farm Research Campus, Ashburn, Virginia 20147
10
Milek M, Wyler E, Landthaler M. Transcriptome-wide analysis of protein–RNA interactions using high-throughput sequencing. Semin Cell Dev Biol 2012; 23:206-12. [DOI: 10.1016/j.semcdb.2011.12.001] [Citation(s) in RCA: 36] [Impact Index Per Article: 3.0] [Received: 10/28/2011] [Revised: 11/22/2011] [Accepted: 12/04/2011] [Indexed: 12/14/2022]