1
|
Liu W, Wang X. Prediction of functional microRNA targets by integrative modeling of microRNA binding and target expression data. Genome Biol 2019; 20:18. [PMID: 30670076 PMCID: PMC6341724 DOI: 10.1186/s13059-019-1629-z] [Citation(s) in RCA: 554] [Impact Index Per Article: 92.3] [Received: 11/06/2018] [Accepted: 01/13/2019] [Indexed: 02/07/2023] Open
Abstract
We perform a large-scale RNA sequencing study to experimentally identify genes that are downregulated by 25 miRNAs. This RNA-seq dataset is combined with public miRNA target binding data to systematically identify miRNA targeting features that are characteristic of both miRNA binding and target downregulation. By integrating these common features in a machine learning framework, we develop and validate an improved computational model for genome-wide miRNA target prediction. All prediction data can be accessed at miRDB (http://mirdb.org).
|
Research Support, N.I.H., Extramural |
6 |
554 |
2
|
Van Nostrand EL, Pratt GA, Yee BA, Wheeler EC, Blue SM, Mueller J, Park SS, Garcia KE, Gelboin-Burkhart C, Nguyen TB, Rabano I, Stanton R, Sundararaman B, Wang R, Fu XD, Graveley BR, Yeo GW. Principles of RNA processing from analysis of enhanced CLIP maps for 150 RNA binding proteins. Genome Biol 2020; 21:90. [PMID: 32252787 PMCID: PMC7137325 DOI: 10.1186/s13059-020-01982-9] [Citation(s) in RCA: 134] [Impact Index Per Article: 26.8] [Received: 10/14/2019] [Accepted: 03/03/2020] [Indexed: 02/08/2023] Open
Abstract
BACKGROUND: A critical step in uncovering rules of RNA processing is to study the in vivo regulatory networks of RNA binding proteins (RBPs). Crosslinking and immunoprecipitation (CLIP) methods enable mapping RBP targets transcriptome-wide, but methodological differences present challenges to large-scale analysis across datasets. The development of enhanced CLIP (eCLIP) enabled the mapping of targets for 150 RBPs in K562 and HepG2, creating a unique resource of RBP interactomes profiled with a standardized methodology in the same cell types. RESULTS: Our analysis of 223 eCLIP datasets reveals a range of binding modalities, including highly resolved positioning around splicing signals and mRNA untranslated regions that associate with distinct RBP functions. Quantification of enrichment for repetitive and abundant multicopy elements reveals 70% of RBPs have enrichment for non-mRNA element classes, enables identification of novel ribosomal RNA processing factors and sites, and suggests that association with retrotransposable elements reflects multiple RBP mechanisms of action. Analysis of spliceosomal RBPs indicates that eCLIP resolves AQR association after intronic lariat formation, enabling identification of branch points with single-nucleotide resolution, and provides genome-wide validation for a branch point-based scanning model for 3' splice site recognition. Finally, we show that eCLIP peak co-occurrences across RBPs enable the discovery of novel co-interacting RBPs. CONCLUSIONS: This work reveals novel insights into RNA biology by integrated analysis of eCLIP profiling of 150 RBPs with distinct functions. Further, our quantification of both mRNA and other element association will enable further research to identify novel roles of RBPs in regulating RNA processing.
|
Research Support, N.I.H., Extramural |
5 |
134 |
3
|
Pan X, Shen HB. RNA-protein binding motifs mining with a new hybrid deep learning based cross-domain knowledge integration approach. BMC Bioinformatics 2017; 18:136. [PMID: 28245811 PMCID: PMC5331642 DOI: 10.1186/s12859-017-1561-8] [Citation(s) in RCA: 116] [Impact Index Per Article: 14.5] [Received: 11/15/2016] [Accepted: 02/23/2017] [Indexed: 01/08/2023] Open
Abstract
Background: RNAs play key roles in cells through their interactions with RNA-binding proteins (RBPs), and the binding motifs of these proteins are crucial for understanding the post-transcriptional regulation of RNAs. How RBPs correctly recognize their target RNAs and why they bind specific positions is still far from clear. Machine learning-based algorithms are widely acknowledged to be capable of speeding up this process. Although many automatic tools have been developed to predict RNA-protein binding sites from rapidly growing multi-resource data (e.g. sequence and structure), the domain-specific features and formats of these data pose significant computational challenges. One current difficulty is that the common knowledge shared across sources sits at a higher abstraction level than the observed data, so directly integrating observed data across domains is inefficient. The other difficulty is how to interpret the prediction results. Existing approaches tend to terminate after outputting the potential discrete binding sites on the sequences, but how to assemble them into meaningful binding motifs is a topic worthy of further investigation. Results: In view of these challenges, we propose a deep learning-based framework (iDeep) that uses a novel hybrid of a convolutional neural network and a deep belief network to predict RBP interaction sites and motifs on RNAs. This new protocol transforms the original observed data into a high-level abstraction feature space using multiple layers of learning blocks, where the shared representations across different domains are integrated. To validate our iDeep method, we performed experiments on 31 large-scale CLIP-seq datasets, and our results show that by integrating multiple sources of data, the average AUC can be improved by 8% compared to the best single-source-based predictor; and through cross-domain knowledge integration at an abstraction level, it outperforms the state-of-the-art predictors by 6%. Besides the overall enhanced prediction performance, the convolutional neural network module embedded in iDeep is also able to automatically capture interpretable binding motifs for RBPs. Large-scale experiments demonstrate that these mined binding motifs agree well with experimentally verified results, suggesting iDeep is a promising approach for real-world applications. Conclusion: The iDeep framework not only achieves better performance than the state-of-the-art predictors, but also easily captures interpretable binding motifs. iDeep is available at http://www.csbio.sjtu.edu.cn/bioinf/iDeep
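The AUC values quoted in this abstract are areas under the ROC curve. As a reminder of what that metric measures, the following is a minimal sketch, not iDeep's own code: the function name `roc_auc` and the toy labels and scores are illustrative assumptions. It computes AUC as the fraction of (bound, unbound) site pairs in which the bound site receives the higher score, counting ties as one half.

```python
def roc_auc(labels, scores):
    """AUC via exhaustive pairwise comparison (Mann-Whitney U statistic)."""
    pos = [s for l, s in zip(labels, scores) if l == 1]
    neg = [s for l, s in zip(labels, scores) if l == 0]
    if not pos or not neg:
        raise ValueError("need at least one positive and one negative example")
    wins = 0.0
    for p in pos:
        for n in neg:
            if p > n:
                wins += 1.0      # positive ranked above negative
            elif p == n:
                wins += 0.5      # ties count as half a win
    return wins / (len(pos) * len(neg))


# Hypothetical toy data: 1 = bound site, 0 = unbound site.
labels = [1, 1, 0, 1, 0, 0]
scores = [0.9, 0.8, 0.7, 0.4, 0.3, 0.1]
print(roc_auc(labels, scores))
```

The pairwise form is quadratic in the number of sites, so it only suits small illustrations; for real CLIP-seq datasets a rank-based implementation would be preferable.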
|
Validation Study |
8 |
116 |
4
|
Benhalevy D, Gupta SK, Danan CH, Ghosal S, Sun HW, Kazemier HG, Paeschke K, Hafner M, Juranek SA. The Human CCHC-type Zinc Finger Nucleic Acid-Binding Protein Binds G-Rich Elements in Target mRNA Coding Sequences and Promotes Translation. Cell Rep 2017; 18:2979-2990. [PMID: 28329689 DOI: 10.1016/j.celrep.2017.02.080] [Citation(s) in RCA: 99] [Impact Index Per Article: 12.4] [Received: 01/25/2016] [Revised: 07/18/2016] [Accepted: 02/27/2017] [Indexed: 12/16/2022] Open
Abstract
The CCHC-type zinc finger nucleic acid-binding protein (CNBP/ZNF9) is conserved in eukaryotes and is essential for embryonic development in mammals. It has been implicated in transcriptional, as well as post-transcriptional, gene regulation; however, its nucleic acid ligands and molecular function remain elusive. Here, we use multiple systems-wide approaches to identify CNBP targets and function. We used photoactivatable ribonucleoside-enhanced crosslinking and immunoprecipitation (PAR-CLIP) to identify 8,420 CNBP binding sites on 4,178 mRNAs. CNBP preferentially bound G-rich elements in the target mRNA coding sequences, most of which were previously found to form G-quadruplex and other stable structures in vitro. Functional analyses, including RNA sequencing, ribosome profiling, and quantitative mass spectrometry, revealed that CNBP binding did not influence target mRNA abundance but rather increased their translational efficiency. Considering that CNBP binding prevented G-quadruplex structure formation in vitro, we hypothesize that CNBP is supporting translation by resolving stable structures on mRNAs.
|
Research Support, N.I.H., Intramural |
8 |
99 |
5
|
Madison BB, Liu Q, Zhong X, Hahn CM, Lin N, Emmett MJ, Stanger BZ, Lee JS, Rustgi AK. LIN28B promotes growth and tumorigenesis of the intestinal epithelium via Let-7. Genes Dev 2013; 27:2233-45. [PMID: 24142874 PMCID: PMC3814644 DOI: 10.1101/gad.224659.113] [Citation(s) in RCA: 99] [Impact Index Per Article: 8.3] [Indexed: 01/01/2023]
Abstract
The RNA-binding proteins LIN28A and LIN28B have diverse functions in cellular reprogramming, growth, and oncogenesis. Madison et al. discover that intestine targeted expression of LIN28B causes intestinal hypertrophy, crypt expansion, and adenocarcinoma formation. Modulation of Let-7 levels via deletion of the mirLet7c2/mirLet7b genes recapitulated these effects, and intestine-specific Let-7 expression reversed the hypertrophy and Paneth cell depletion caused by Lin28b. These results demonstrate that Let-7 miRNAs are critical for repressing intestinal tissue growth and that LIN28B can act as an oncogene. The RNA-binding proteins LIN28A and LIN28B have diverse functions in embryonic stem cells, cellular reprogramming, growth, and oncogenesis. Many of these effects occur via direct inhibition of Let-7 microRNAs (miRNAs), although Let-7-independent effects have been surmised. We report that intestine targeted expression of LIN28B causes intestinal hypertrophy, crypt expansion, and Paneth cell loss. Furthermore, LIN28B fosters intestinal polyp and adenocarcinoma formation. To examine potential Let-7-independent functions of LIN28B, we pursued ribonucleoprotein cross-linking, immunoprecipitation, and high-throughput sequencing (CLIP-seq) to identify direct RNA targets. This revealed that LIN28B bound a substantial number of mRNAs and modestly augmented protein levels of these target mRNAs in vivo. Conversely, Let-7 had a profound effect; modulation of Let-7 levels via deletion of the mirLet7c2/mirLet7b genes recapitulated effects of Lin28b overexpression. Furthermore, intestine-specific Let-7 expression could reverse hypertrophy and Paneth cell depletion caused by Lin28b. This was independent of effects on insulin–PI3K–mTOR signaling. Our study reveals that Let-7 miRNAs are critical for repressing intestinal tissue growth and promoting Paneth cell differentiation. Let-7-dependent effects of LIN28B may supersede Let-7-independent effects on intestinal tissue growth. 
In summary, LIN28B can definitively act as an oncogene in the absence of canonical genetic alterations.
|
Research Support, Non-U.S. Gov't |
12 |
99 |
6
|
Van Nostrand EL, Nguyen TB, Gelboin-Burkhart C, Wang R, Blue SM, Pratt GA, Louie AL, Yeo GW. Robust, Cost-Effective Profiling of RNA Binding Protein Targets with Single-end Enhanced Crosslinking and Immunoprecipitation (seCLIP). Methods Mol Biol 2017; 1648:177-200. [PMID: 28766298 DOI: 10.1007/978-1-4939-7204-3_14] [Citation(s) in RCA: 69] [Impact Index Per Article: 8.6] [Indexed: 12/12/2022]
Abstract
Profiling of RNA binding protein targets in vivo provides critical insights into the mechanistic roles they play in regulating RNA processing. The enhanced crosslinking and immunoprecipitation (eCLIP) methodology provides a framework for robust, reproducible identification of transcriptome-wide protein-RNA interactions, with dramatically improved efficiency over previous methods. Here we provide a step-by-step description of the eCLIP method, along with insights into optimal performance of critical steps in the protocol. In particular, we describe improvements to the adaptor strategy that enable single-end enhanced CLIP (seCLIP), which removes the requirement for paired-end sequencing of eCLIP libraries. Further, we describe the observation of contaminating RNA present in standard nitrocellulose membrane suppliers, and present options with significantly reduced contamination for sensitive applications. These notes further refine the eCLIP methodology, simplifying robust RNA binding protein studies for all users.
|
Research Support, N.I.H., Extramural |
8 |
69 |
7
|
Oh S, Flynn RA, Floor SN, Purzner J, Martin L, Do BT, Schubert S, Vaka D, Morrissy S, Li Y, Kool M, Hovestadt V, Jones DTW, Northcott PA, Risch T, Warnatz HJ, Yaspo ML, Adams CM, Leib RD, Breese M, Marra MA, Malkin D, Lichter P, Doudna JA, Pfister SM, Taylor MD, Chang HY, Cho YJ. Medulloblastoma-associated DDX3 variant selectively alters the translational response to stress. Oncotarget 2018; 7:28169-82. [PMID: 27058758 PMCID: PMC5053718 DOI: 10.18632/oncotarget.8612] [Citation(s) in RCA: 61] [Impact Index Per Article: 8.7] [Received: 03/17/2016] [Accepted: 03/26/2016] [Indexed: 12/14/2022] Open
Abstract
DDX3X encodes a DEAD-box family RNA helicase (DDX3) commonly mutated in medulloblastoma, a highly aggressive cerebellar tumor affecting both children and adults. Despite being implicated in several facets of RNA metabolism, the nature and scope of DDX3's interactions with RNA remain unclear. Here, we show DDX3 collaborates extensively with the translation initiation machinery through direct binding to 5′UTRs of nearly all coding RNAs, specific sites on the 18S rRNA, and multiple components of the translation initiation complex. Impairment of translation initiation is also evident in primary medulloblastomas harboring mutations in DDX3X, further highlighting DDX3's role in this process. Arsenite-induced stress shifts DDX3 binding from the 5′UTR into the coding region of mRNAs concomitant with a general reduction of translation, and both the shift of DDX3 on mRNA and decreased translation are blunted by expression of a catalytically-impaired, medulloblastoma-associated DDX3R534H variant. Furthermore, despite the global repression of translation induced by arsenite, translation is preserved on select genes involved in chromatin organization in DDX3R534H-expressing cells. Thus, DDX3 interacts extensively with RNA and ribosomal machinery to help remodel the translation landscape in response to stress, while cancer-related DDX3 variants adapt this response to selectively preserve translation.
|
Journal Article |
7 |
61 |
8
|
|
Editorial |
12 |
43 |
9
|
Kini HK, Silverman IM, Ji X, Gregory BD, Liebhaber SA. Cytoplasmic poly(A) binding protein-1 binds to genomically encoded sequences within mammalian mRNAs. RNA (NEW YORK, N.Y.) 2016; 22:61-74. [PMID: 26554031 PMCID: PMC4691835 DOI: 10.1261/rna.053447.115] [Citation(s) in RCA: 43] [Impact Index Per Article: 4.8] [Received: 07/21/2015] [Accepted: 10/02/2015] [Indexed: 06/05/2023]
Abstract
The functions of the major mammalian cytoplasmic poly(A) binding protein, PABPC1, have been characterized predominantly in the context of its binding to the 3' poly(A) tails of mRNAs. These interactions play important roles in post-transcriptional gene regulation by enhancing translation and mRNA stability. Here, we performed transcriptome-wide CLIP-seq analysis to identify additional PABPC1 binding sites within genomically encoded mRNA sequences that may impact on gene regulation. From this analysis, we found that PABPC1 binds directly to the canonical polyadenylation signal in thousands of mRNAs in the mouse transcriptome. PABPC1 binding also maps to translation initiation and termination sites bracketing open reading frames, exemplified most dramatically in replication-dependent histone mRNAs. Additionally, a more restricted subset of PABPC1 interaction sites comprised A-rich sequences within the 5' UTRs of mRNAs, including Pabpc1 mRNA itself. Functional analyses revealed that these PABPC1 interactions in the 5' UTR mediate both auto- and trans-regulatory translational control. In total, these findings reveal a repertoire of PABPC1 binding that is substantially broader than previously recognized with a corresponding potential to impact and coordinate post-transcriptional controls critical to a broad array of cellular functions.
|
Research Support, N.I.H., Extramural |
9 |
43 |
10
|
Li Y, Wu K, Quan W, Yu L, Chen S, Cheng C, Wu Q, Zhao S, Zhang Y, Zhou L. The dynamics of FTO binding and demethylation from the m6A motifs. RNA Biol 2019; 16:1179-1189. [PMID: 31149892 PMCID: PMC6693534 DOI: 10.1080/15476286.2019.1621120] [Citation(s) in RCA: 40] [Impact Index Per Article: 6.7] [Received: 02/14/2019] [Revised: 04/20/2019] [Accepted: 05/09/2019] [Indexed: 10/26/2022] Open
Abstract
N6-methyladenosine (m6A) is considered a reversible RNA modification that occurs more frequently in the GAC than the AAC context in vivo and regulates post-transcriptional gene expression in mammalian cells. The m6A 'writers' METTL3 and METTL14 demonstrate a strong preference for binding AC-containing motifs in living cells. However, comparable evidence is currently lacking for m6A 'erasers', leaving the dynamics of the internal m6A modification under debate. We analysed five CLIP-seq data sets, three recently published and two generated in this study, for FTO, one of the two known m6A 'erasers'. FTO binding peaks from all cell lines contain RRACH motifs. Only those from K562, 3T3-L1 and HeLa cells were enriched in AC-containing motifs, while those from HEK293 were not. The exogenously overexpressed FTO effectively binds to m6A motif-containing RNA sites. FTO overexpression specifically removed m6A modification from GGACU and RRACU motifs in a concentration-dependent manner. These findings underline the dynamics of FTO in target selection, which is predicted to contribute to both the m6A dynamics and the FTO plasticity in biological functions and diseases.
|
research-article |
6 |
40 |
11
|
Panda AC, Dudekula DB, Abdelmohsen K, Gorospe M. Analysis of Circular RNAs Using the Web Tool CircInteractome. Methods Mol Biol 2018; 1724:43-56. [PMID: 29322439 DOI: 10.1007/978-1-4939-7562-4_4] [Citation(s) in RCA: 36] [Impact Index Per Article: 5.1] [Indexed: 01/13/2023]
Abstract
Circular RNAs (circRNAs) are generated through nonlinear back splicing, during which the 5' and 3' ends are covalently joined. Consequently, the lack of free ends makes them very stable compared to their counterpart linear RNAs. By selectively interacting with microRNAs and RNA-binding proteins (RBPs), circRNAs have been shown to influence gene expression programs. We designed a web tool, CircInteractome, in order to (1) explore potential interactions of circRNAs with RBPs, (2) design specific divergent primers to detect circRNAs, (3) study tissue- and cell-specific circRNAs, (4) identify gene-specific circRNAs, (5) explore potential miRNAs interacting with circRNAs, and (6) design specific siRNAs to silence circRNAs. Here, we review the CircInteractome tool and explain recent updates to the site. The database is freely accessible at http://circinteractome.nia.nih.gov.
|
Research Support, N.I.H., Intramural |
7 |
36 |
12
|
Li YE, Xiao M, Shi B, Yang YCT, Wang D, Wang F, Marcia M, Lu ZJ. Identification of high-confidence RNA regulatory elements by combinatorial classification of RNA-protein binding sites. Genome Biol 2017; 18:169. [PMID: 28886744 PMCID: PMC5591525 DOI: 10.1186/s13059-017-1298-8] [Citation(s) in RCA: 35] [Impact Index Per Article: 4.4] [Received: 06/16/2017] [Accepted: 08/14/2017] [Indexed: 12/20/2022] Open
Abstract
Crosslinking immunoprecipitation sequencing (CLIP-seq) technologies have enabled researchers to characterize transcriptome-wide binding sites of RNA-binding proteins (RBPs) with high resolution. We apply a soft-clustering method, RBPgroup, to various CLIP-seq datasets to group together RBPs that specifically bind the same RNA sites. Such combinatorial clustering of RBPs helps interpret CLIP-seq data and suggests functional RNA regulatory elements. Furthermore, we validate two RBP–RBP interactions in cell lines. Our approach links proteins and RNA motifs known to possess similar biochemical and cellular properties and can, when used in conjunction with additional experimental data, identify high-confidence RBP groups and their associated RNA regulatory elements.
|
Research Support, Non-U.S. Gov't |
8 |
35 |
13
|
Improvements to the HITS-CLIP protocol eliminate widespread mispriming artifacts. BMC Genomics 2016; 17:338. [PMID: 27150721 PMCID: PMC4858895 DOI: 10.1186/s12864-016-2675-5] [Citation(s) in RCA: 33] [Impact Index Per Article: 3.7] [Received: 11/24/2015] [Accepted: 04/28/2016] [Indexed: 01/13/2023] Open
Abstract
Background: High-throughput sequencing of RNA isolated by crosslinking immunoprecipitation (HITS-CLIP) allows for high resolution, genome-wide mapping of RNA-binding proteins. This methodology is frequently used to validate predicted targets of microRNA binding, as well as direct targets of other RNA-binding proteins. Hence, the accuracy and sensitivity of binding site identification is critical. Results: We found that substantial mispriming during reverse transcription results in the overrepresentation of sequences complementary to the primer used for reverse transcription. Up to 45% of peaks in publicly available HITS-CLIP libraries are attributable to this mispriming artifact, and the majority of libraries have detectable levels of mispriming. We also found that standard techniques for validating microRNA-target interactions fail to differentiate between artifactual peaks and physiologically relevant peaks. Conclusions: Here, we present a modification to the HITS-CLIP protocol that effectively eliminates this artifact and improves the sensitivity and complexity of resulting libraries.
|
Research Support, N.I.H., Extramural |
9 |
33 |
14
|
Van Nostrand EL, Gelboin-Burkhart C, Wang R, Pratt GA, Blue SM, Yeo GW. CRISPR/Cas9-mediated integration enables TAG-eCLIP of endogenously tagged RNA binding proteins. Methods 2016; 118-119:50-59. [PMID: 28003131 DOI: 10.1016/j.ymeth.2016.12.007] [Citation(s) in RCA: 32] [Impact Index Per Article: 3.6] [Received: 08/27/2016] [Revised: 12/08/2016] [Accepted: 12/10/2016] [Indexed: 12/22/2022] Open
Abstract
Identification of in vivo direct RNA targets for RNA binding proteins (RBPs) provides critical insight into their regulatory activities and mechanisms. Recently, we described a methodology for enhanced crosslinking and immunoprecipitation followed by high-throughput sequencing (eCLIP) using antibodies against endogenous RNA binding proteins. However, in many cases it is desirable to profile targets of an RNA binding protein for which an immunoprecipitation-grade antibody is lacking. Here we describe a scalable method for using CRISPR/Cas9-mediated homologous recombination to insert a peptide tag into the endogenous RNA binding protein locus. Further, we show that TAG-eCLIP performed using tag-specific antibodies can yield the same robust binding profiles after proper control normalization as eCLIP with antibodies against endogenous proteins. Finally, we note that antibodies against commonly used tags can immunoprecipitate significant amounts of antibody-specific RNA, emphasizing the need for paired controls alongside each experiment for normalization. TAG-eCLIP enables eCLIP profiling of new native proteins where no suitable antibody exists, expanding the RBP-RNA interaction landscape.
|
Research Support, U.S. Gov't, Non-P.H.S. |
9 |
32 |
15
|
Abstract
Pumilio/fem-3 mRNA binding factor (PUF) proteins bind RNA with sequence specificity and modularity, and have become exemplary scaffolds in the reengineering of new RNA specificities. Here, we report the in vivo RNA binding sites of wild-type (WT) and reengineered forms of the PUF protein Saccharomyces cerevisiae Puf2p across the transcriptome. Puf2p defines an ancient protein family present throughout fungi, with divergent and distinctive PUF RNA binding domains, RNA-recognition motifs (RRMs), and prion regions. We identify sites in RNA bound to Puf2p in vivo by using two forms of UV cross-linking followed by immunopurification. The protein specifically binds more than 1,000 mRNAs, which contain multiple iterations of UAAU-binding elements. Regions outside the PUF domain, including the RRM, enhance discrimination among targets. Compensatory mutants reveal that one Puf2p molecule binds one UAAU sequence, and align the protein with the RNA site. Based on this architecture, we redesign Puf2p to bind UAAG and identify the targets of this reengineered PUF in vivo. The mutant protein finds its target site in 1,800 RNAs and yields a novel RNA network with a dramatic redistribution of binding elements. The mutant protein exhibits even greater RNA specificity than wild type. The redesigned protein decreases the abundance of RNAs in its redesigned network. These results suggest that reengineering using the PUF scaffold redirects and can even enhance specificity in vivo.
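The abstract describes target mRNAs as carrying multiple iterations of UAAU elements, and the reengineered protein as recognizing UAAG instead. As a minimal sketch of how such elements can be located in a transcript (the function name `find_motif` and the toy sequence are illustrative assumptions, not the authors' pipeline), the following scans an RNA string for every occurrence of a short motif, including overlapping ones.

```python
def find_motif(rna, motif="UAAU"):
    """Return 0-based start positions of every (possibly overlapping) motif hit."""
    rna = rna.upper().replace("T", "U")   # accept DNA-alphabet input as well
    positions = []
    start = rna.find(motif)
    while start != -1:
        positions.append(start)
        start = rna.find(motif, start + 1)  # step by 1 so overlaps are found
    return positions


# Hypothetical toy sequence with three UAAU elements, two of them overlapping.
toy = "GGUAAUCCUAAUAAUGG"
print(find_motif(toy, "UAAU"))
print(find_motif(toy, "UAAG"))
```

Counting hits per transcript in this way only indicates where a binding element could sit; it says nothing about whether the site is actually bound in vivo, which is what the cross-linking data in the study address.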
|
Research Support, Non-U.S. Gov't |
10 |
27 |
16
|
Zhou Y, Peng H, Cui Q, Zhou Y. tRFTar: Prediction of tRF-target gene interactions via systemic re-analysis of Argonaute CLIP-seq datasets. Methods 2020; 187:57-67. [PMID: 33045361 DOI: 10.1016/j.ymeth.2020.10.006] [Citation(s) in RCA: 26] [Impact Index Per Article: 5.2] [Received: 08/25/2020] [Revised: 10/04/2020] [Accepted: 10/07/2020] [Indexed: 12/21/2022] Open
Abstract
tRNA-derived fragments (tRFs), which by definition are cleaved from tRNAs, comprise a novel class of regulatory small non-coding RNAs. Recent evidence has revealed that tRFs can be loaded onto Argonaute (AGO) family proteins to perform post-transcriptional regulation via substantial tRF-target gene interactions (TGIs). However, there is no resource that systematically profiles potential AGO-mediated TGIs. To this end, we performed a systemic computational screening of potential AGO-mediated TGIs by a re-analysis of 146 crosslinking-immunoprecipitation and high-throughput sequencing (CLIP-seq) datasets, in which 920,690 TGIs between 12,102 tRFs and 5,688 target genes were identified. The predicted TGIs have a superior signal-to-noise ratio and good consistency with TGIs identified by an orthogonal technique. AGO-bound tRFs are not evenly distributed: 5'-tRFs and 3'-tRFs are enriched, and some commonly expressed tRFs are also overrepresented. The tRFs tend to target conserved regions of transcripts and to be co-expressed with their target genes. Filtering for TGIs with consistent co-expression with target genes results in a set of regulatory TGIs that contains 25,281 tRF-target pairs. Together, our results unveil the extensive regulatory interactions between tRFs and target genes. Finally, the CLIP-derived TGIs were incorporated in a user-friendly online platform termed tRFTar, where functions such as custom searching, co-expressed TGI filtering, a genome browser and TGI-based tRF functional enrichment analysis help users investigate the functions of tRFs. tRFTar is freely available at http://www.rnanut.net/tRFTar/.
|
Research Support, Non-U.S. Gov't |
5 |
26 |
17
|
Lim DH, Lee S, Han JY, Choi MS, Hong JS, Seong Y, Kwon YS, Lee YS. Ecdysone-responsive microRNA-252-5p controls the cell cycle by targeting Abi in Drosophila. FASEB J 2018. [PMID: 29543534 DOI: 10.1096/fj.201701185rr] [Citation(s) in RCA: 24] [Impact Index Per Article: 3.4] [Indexed: 11/11/2022]
Abstract
The steroid hormone ecdysone has a central role in the developmental transitions of insects through its control of responsive protein-coding and microRNA (miRNA) gene expression. However, the complete regulatory network controlling the expression of these genes remains to be elucidated. In this study, we performed cross-linking immunoprecipitation coupled with deep sequencing of endogenous Argonaute 1 (Ago1) protein, the core effector of the miRNA pathway, in Drosophila S2 cells. We found that regulatory interactions between miRNAs and their cognate targets were substantially altered by Ago1 in response to ecdysone signaling. Additionally, during the larva-to-adult metamorphosis, miR-252-5p was up-regulated via the canonical ecdysone-signaling pathway. Moreover, we provide evidence that miR-252-5p targets Abelson interacting protein (Abi) to decrease the protein levels of cyclins A and B, controlling the cell cycle. Overall, our data suggest a potential role for the ecdysone/miR-252-5p/Abi regulatory axis partly in cell-cycle control during metamorphosis in Drosophila.
|
Research Support, Non-U.S. Gov't |
7 |
24 |
18
|
Patton RD, Sanjeev M, Woodward LA, Mabin JW, Bundschuh R, Singh G. Chemical crosslinking enhances RNA immunoprecipitation for efficient identification of binding sites of proteins that photo-crosslink poorly with RNA. RNA (NEW YORK, N.Y.) 2020; 26:1216-1233. [PMID: 32467309 PMCID: PMC7430673 DOI: 10.1261/rna.074856.120] [Citation(s) in RCA: 24] [Impact Index Per Article: 4.8] [Received: 01/13/2020] [Accepted: 05/17/2020] [Indexed: 05/14/2023]
Abstract
In eukaryotic cells, proteins that associate with RNA regulate its activity to control cellular function. To fully illuminate the basis of RNA function, it is essential to identify such RNA-associated proteins, their mode of action on RNA, and their preferred RNA targets and binding sites. By analyzing catalogs of human RNA-associated proteins defined by ultraviolet light (UV)-dependent and -independent approaches, we classify these proteins into two major groups: (i) the widely recognized RNA binding proteins (RBPs), which bind RNA directly and UV-crosslink efficiently to RNA, and (ii) a new group of RBP-associated factors (RAFs), which bind RNA indirectly via RBPs and UV-crosslink poorly to RNA. As the UV crosslinking and immunoprecipitation followed by sequencing (CLIP-seq) approach will be unsuitable to identify binding sites of RAFs, we show that formaldehyde crosslinking stabilizes RAFs within ribonucleoproteins to allow for their immunoprecipitation under stringent conditions. Using an RBP (CASC3) and an RAF (RNPS1) within the exon junction complex (EJC) as examples, we show that formaldehyde crosslinking combined with RNA immunoprecipitation in tandem followed by sequencing (xRIPiT-seq) far exceeds CLIP-seq to identify binding sites of RNPS1. xRIPiT-seq reveals that RNPS1 occupancy is increased on exons immediately upstream of strong recursively spliced exons, which depend on the EJC for their inclusion.
|
Research Support, N.I.H., Extramural |
5 |
24 |
19
|
Maragkakis M, Alexiou P, Nakaya T, Mourelatos Z. CLIPSeqTools--a novel bioinformatics CLIP-seq analysis suite. RNA (NEW YORK, N.Y.) 2016; 22:1-9. [PMID: 26577377 PMCID: PMC4691824 DOI: 10.1261/rna.052167.115] [Citation(s) in RCA: 23] [Impact Index Per Article: 2.6] [Received: 04/13/2015] [Accepted: 10/18/2015] [Indexed: 05/22/2023]
Abstract
Immunoprecipitation of RNA binding proteins (RBPs) after in vivo crosslinking, coupled with sequencing of associated RNA footprints (HITS-CLIP, CLIP-seq), is a method of choice for the identification of RNA targets and binding sites for RBPs. Compared with RNA-seq, CLIP-seq analysis is highly diverse, and the approaches vary significantly depending on the RBPs analyzed, necessitating the development of flexible and efficient informatics tools. In this study, we present CLIPSeqTools, a novel, highly flexible computational suite that can perform analysis from raw sequencing data with minimal user input. It contains a wide array of tools to provide an in-depth view of CLIP-seq data sets. It supports extensive customization and promotes improvisation, a critical virtue, since CLIP-seq analysis is rarely well defined a priori. To highlight the capabilities of CLIPSeqTools, we used the suite to analyze Ago-miRNA HITS-CLIP data sets that we prepared from human brains.
|
Research Support, N.I.H., Extramural |
9 |
23 |
20
|
Yadav M, Singh RS, Hogan D, Vidhyasagar V, Yang S, Chung IYW, Kusalik A, Dmitriev OY, Cygler M, Wu Y. The KH domain facilitates the substrate specificity and unwinding processivity of DDX43 helicase. J Biol Chem 2021; 296:100085. [PMID: 33199368 PMCID: PMC7949032 DOI: 10.1074/jbc.ra120.015824] [Citation(s) in RCA: 19] [Impact Index Per Article: 4.8] [Received: 08/31/2020] [Revised: 11/03/2020] [Accepted: 11/16/2020] [Indexed: 01/21/2023] Open
Abstract
The K-homology (KH) domain is a nucleic acid-binding domain present in many proteins. Recently, we found that the DEAD-box helicase DDX43 contains a KH domain in its N-terminus; however, its function remains unknown. Here, we purified recombinant DDX43 KH domain protein and found that it prefers binding ssDNA and ssRNA. Electrophoretic mobility shift assay and NMR revealed that the KH domain favors pyrimidines over purines. Mutational analysis showed that the GXXG loop in the KH domain is involved in pyrimidine binding. Moreover, we found that an alanine residue adjacent to the GXXG loop is critical for binding. Systematic evolution of ligands by exponential enrichment, chromatin immunoprecipitation-seq, and cross-linking immunoprecipitation-seq showed that the KH domain binds C-/T-rich DNA and U-rich RNA. Bioinformatics analysis suggested that the KH domain prefers to bind promoters. Using 15N-heteronuclear single quantum coherence NMR, the optimal binding sequence was identified as TTGT. Finally, we found that the full-length DDX43 helicase prefers DNA or RNA substrates with TTGT or UUGU single-stranded tails and that the KH domain is critically important for sequence specificity and unwinding processivity. Collectively, our results demonstrated that the KH domain facilitates the substrate specificity and processivity of the DDX43 helicase.
|
research-article |
4 |
19 |
21
|
omniCLIP: probabilistic identification of protein-RNA interactions from CLIP-seq data. Genome Biol 2018; 19:183. [PMID: 30384847 PMCID: PMC6211453 DOI: 10.1186/s13059-018-1521-2] [Citation(s) in RCA: 19] [Impact Index Per Article: 2.7] [Received: 11/28/2017] [Accepted: 09/03/2018] [Indexed: 12/04/2022] Open
Abstract
CLIP-seq methods allow the generation of genome-wide maps of RNA binding protein – RNA interaction sites. However, due to differences between different CLIP-seq assays, existing computational approaches to analyze the data can only be applied to a subset of assays. Here, we present a probabilistic model called omniCLIP that can detect regulatory elements in RNAs from data of all CLIP-seq assays. omniCLIP jointly models data across replicates and can integrate background information. Therefore, omniCLIP greatly simplifies the data analysis, increases the reliability of results and paves the way for integrative studies based on data from different assays.
|
Research Support, Non-U.S. Gov't |
7 |
19 |
22
|
Brooks L, Lyons SM, Mahoney JM, Welch JD, Liu Z, Marzluff WF, Whitfield ML. A multiprotein occupancy map of the mRNP on the 3' end of histone mRNAs. RNA (NEW YORK, N.Y.) 2015; 21:1943-65. [PMID: 26377992 PMCID: PMC4604434 DOI: 10.1261/rna.053389.115] [Citation(s) in RCA: 19] [Impact Index Per Article: 1.9] [Received: 07/20/2015] [Accepted: 07/23/2015] [Indexed: 05/20/2023]
Abstract
The animal replication-dependent (RD) histone mRNAs are coordinately regulated with chromosome replication. The RD-histone mRNAs are the only known cellular mRNAs that are not polyadenylated. Instead, the mature transcripts end in a conserved stem-loop (SL) structure. This SL structure interacts with the stem-loop binding protein (SLBP), which is involved in all aspects of RD-histone mRNA metabolism. We used several genomic methods, including high-throughput sequencing of cross-linked immunoprecipitate (HITS-CLIP) to analyze the RNA-binding landscape of SLBP. SLBP was not bound to any RNAs other than histone mRNAs. We performed bioinformatic analyses of the HITS-CLIP data that included (i) clustering genes by sequencing read coverage using CVCA, (ii) mapping the bound RNA fragment termini, and (iii) mapping cross-linking induced mutation sites (CIMS) using CLIP-PyL software. These analyses allowed us to identify specific sites of molecular contact between SLBP and its RD-histone mRNA ligands. We performed in vitro crosslinking assays to refine the CIMS mapping and found that uracils one and three in the loop of the histone mRNA SL preferentially crosslink to SLBP, whereas uracil two in the loop preferentially crosslinks to a separate component, likely the 3'hExo. We also performed a secondary analysis of an iCLIP data set to map UPF1 occupancy across the RD-histone mRNAs and found that UPF1 is bound adjacent to the SLBP-binding site. Multiple proteins likely bind the 3' end of RD-histone mRNAs together with SLBP.
|
Research Support, N.I.H., Extramural |
10 |
19 |
23
|
Van Nostrand EL, Shishkin AA, Pratt GA, Nguyen TB, Yeo GW. Variation in single-nucleotide sensitivity of eCLIP derived from reverse transcription conditions. Methods 2017; 126:29-37. [PMID: 28790018 PMCID: PMC5582984 DOI: 10.1016/j.ymeth.2017.08.002] [Citation(s) in RCA: 17] [Impact Index Per Article: 2.1] [Received: 05/17/2017] [Revised: 06/15/2017] [Accepted: 08/03/2017] [Indexed: 12/20/2022] Open
Abstract
Crosslinking and immunoprecipitation (CLIP) followed by high-throughput sequencing identifies the binding sites of RNA binding proteins on RNAs. The covalent RNA-amino acid adducts produced by UV irradiation can cause premature reverse transcription termination and deletions (referred to as crosslink-induced mutation sites (CIMS)), which may decrease overall cDNA yield but are exploited in state-of-the-art CLIP methods to identify these crosslink sites at single-nucleotide resolution. Here, we show that both the rate of crosslinked-base deletions and the ratio of read-through to termination are highly dependent on the identity of the reverse transcriptase enzyme as well as on the buffer conditions used. AffinityScript and TGIRT showed a lack of deletion of the crosslinked base, with other enzymes showing variable rates, indicating that the use and interpretation of CIMS analysis require knowledge of the reverse transcriptase enzyme used. Commonly used enzymes, including Superscript III and AffinityScript, show high termination rates in standard magnesium buffer conditions, but show a single-base difference in the position of termination for TARDBP motifs. In contrast, manganese-containing buffer promoted read-through at the adduct site. These results validate the use of standard enzymes and also suggest alternative enzyme and buffer choices for particularly challenging samples that contain extensive RNA adducts or other modifications that inhibit standard reverse transcription.
|
Research Support, N.I.H., Extramural |
8 |
17 |
24
|
Reyes-Herrera PH, Ficarra E. Computational Methods for CLIP-seq Data Processing. Bioinform Biol Insights 2014; 8:199-207. [PMID: 25336930 PMCID: PMC4196881 DOI: 10.4137/bbi.s16803] [Citation(s) in RCA: 17] [Impact Index Per Article: 1.5] [Received: 05/11/2014] [Revised: 07/29/2014] [Accepted: 08/01/2014] [Indexed: 12/25/2022] Open
Abstract
RNA-binding proteins (RBPs) are at the core of post-transcriptional regulation and thus of gene expression control at the RNA level. One of the principal challenges in the field of gene expression regulation is to understand the mechanisms of action of RBPs. As a result of the recent evolution of experimental techniques, it is now possible to obtain the RNA regions recognized by RBPs on a transcriptome-wide scale. In fact, CLIP-seq protocols combine crosslinking immunoprecipitation (CLIP) with high-throughput sequencing to recover the transcriptome-wide set of interaction regions for a particular protein. Nevertheless, computational methods are necessary to process CLIP-seq experimental data and are key to advancing the understanding of gene regulatory mechanisms. Considering the importance of computational methods in this area, we present a review of the current status of computational approaches used and proposed for CLIP-seq data.
|
Review |
11 |
17 |
25
|
Le Tonquèze O, Gschloessl B, Legagneux V, Paillard L, Audic Y. Identification of CELF1 RNA targets by CLIP-seq in human HeLa cells. GENOMICS DATA 2016; 8:97-103. [PMID: 27222809 PMCID: PMC4872370 DOI: 10.1016/j.gdata.2016.04.009] [Citation(s) in RCA: 17] [Impact Index Per Article: 1.9] [Received: 01/04/2016] [Revised: 04/15/2016] [Accepted: 04/16/2016] [Indexed: 02/06/2023]
Abstract
The specific interactions between RNA-binding proteins and their target RNAs constitute an essential level of gene expression control. By combining ultraviolet cross-linking and immunoprecipitation (CLIP) with massive SOLiD sequencing, we identified the RNAs bound by the RNA-binding protein CELF1 in human HeLa cells. The CELF1 binding sites deduced from the sequence data allow the characterization of specific features of CELF1-RNA association. We therefore present the first map of CELF1 binding sites in human cells.
|
Journal Article |
9 |
17 |