1
|
Prediction of protein-RNA interactions from single-cell transcriptomic data. Nucleic Acids Res 2024; 52:e31. [PMID: 38364867 PMCID: PMC11014251 DOI: 10.1093/nar/gkae076] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 07/17/2023] [Revised: 01/12/2024] [Accepted: 01/26/2024] [Indexed: 02/18/2024] Open
Abstract
Proteins are crucial in regulating every aspect of RNA life, yet understanding their interactions with coding and noncoding RNAs remains limited. Experimental studies are typically restricted to a small number of cell lines and a limited set of RNA-binding proteins (RBPs). Although computational methods based on physico-chemical principles can predict protein-RNA interactions accurately, they often lack the ability to consider cell-type-specific gene expression and the broader context of gene regulatory networks (GRNs). Here, we assess the performance of several GRN inference algorithms in predicting protein-RNA interactions from single-cell transcriptomic data, and propose a pipeline, called scRAPID (single-cell transcriptomic-based RnA Protein Interaction Detection), that integrates these methods with the catRAPID algorithm, which can identify direct physical interactions between RBPs and RNA molecules. Our approach demonstrates that RBP-RNA interactions can be predicted from single-cell transcriptomic data, with performances comparable or superior to those achieved for the well-established task of inferring transcription factor-target interactions. The incorporation of catRAPID significantly enhances the accuracy of identifying interactions, particularly with long noncoding RNAs, and enables the identification of hub RBPs and RNAs. Additionally, we show that interactions between RBPs can be detected based on their inferred RNA targets. The software is freely available at https://github.com/tartaglialabIIT/scRAPID.
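As a rough illustration of how a gene regulatory network inference step might be combined with a physical-interaction predictor, the sketch below keeps only those inferred RBP-to-RNA edges that are also supported by a predicted interaction propensity. This is not the scRAPID implementation: the function name `filter_edges`, the tuple layout, the threshold and the toy identifiers are all assumptions made for this example.

```python
def filter_edges(grn_edges, propensity, threshold):
    """Keep inferred RBP -> RNA edges that a (hypothetical) propensity table supports.

    grn_edges:  iterable of (rbp, rna, weight) from any GRN inference method
    propensity: dict mapping (rbp, rna) -> predicted interaction score
    threshold:  minimum score required to keep an edge (assumed cutoff)
    """
    kept = []
    for rbp, rna, weight in grn_edges:
        score = propensity.get((rbp, rna))
        # Pairs without a prediction are dropped rather than guessed.
        if score is not None and score >= threshold:
            kept.append((rbp, rna, weight))
    # Rank the surviving edges by the GRN inference weight, strongest first.
    return sorted(kept, key=lambda edge: edge[2], reverse=True)


# Toy data with made-up identifiers and scores.
toy_edges = [("RBP_A", "RNA_1", 0.4), ("RBP_A", "RNA_2", 0.9), ("RBP_B", "RNA_1", 0.7)]
toy_propensity = {("RBP_A", "RNA_1"): 35.0, ("RBP_A", "RNA_2"): 5.0, ("RBP_B", "RNA_1"): 60.0}
toy_result = filter_edges(toy_edges, toy_propensity, threshold=30.0)
```

In this toy case the highest-weight edge is discarded because its predicted propensity falls below the cutoff, which is the kind of pruning of indirect associations the abstract alludes to.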
|
2
|
Predicting nuclear G-quadruplex RNA-binding proteins with roles in transcription and phase separation. Nat Commun 2024; 15:2585. [PMID: 38519458 PMCID: PMC10959947 DOI: 10.1038/s41467-024-46731-9] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 04/06/2023] [Accepted: 03/08/2024] [Indexed: 03/25/2024] Open
Abstract
RNA-binding proteins are central for many biological processes and their characterization has demonstrated a broad range of functions as well as a wide spectrum of target structures. RNA G-quadruplexes are important regulatory elements occurring in both coding and non-coding transcripts, yet our knowledge of their structure-based interactions is at present limited. Here, using theoretical predictions and experimental approaches, we show that many chromatin-binding proteins bind to RNA G-quadruplexes, and we classify them based on their RNA G-quadruplex-binding potential. Combining experimental identification of nuclear RNA G-quadruplex-binding proteins with computational approaches, we build a prediction tool that assigns a probability score for a nuclear protein to bind RNA G-quadruplexes. We show that predicted G-quadruplex RNA-binding proteins exhibit a high degree of protein disorder and hydrophilicity and suggest involvement in both transcription and phase separation into membrane-less organelles. Finally, we present the G4-Folded/UNfolded Nuclear Interaction Explorer System (G4-FUNNIES) for estimating RNA G4-binding propensities at http://service.tartaglialab.com/new_submission/G4FUNNIES .
|
3
|
The PENGUIN approach to reconstruct protein interactions at enhancer-promoter regions and its application to prostate cancer. Nat Commun 2023; 14:8084. [PMID: 38057321 PMCID: PMC10700545 DOI: 10.1038/s41467-023-43767-1] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 11/08/2022] [Accepted: 11/18/2023] [Indexed: 12/08/2023] Open
Abstract
We introduce Promoter-Enhancer-Guided Interaction Networks (PENGUIN), a method for studying protein-protein interaction (PPI) networks within enhancer-promoter interactions. PENGUIN integrates H3K27ac-HiChIP data with tissue-specific PPIs to define enhancer-promoter PPI networks (EPINs). We validated PENGUIN using cancer (LNCaP) and benign (LHSAR) prostate cell lines. Our analysis detected EPIN clusters enriched with the architectural protein CTCF, a regulator of enhancer-promoter interactions. CTCF presence was coupled with the prevalence of prostate cancer (PrCa) single nucleotide polymorphisms (SNPs) within the same EPIN clusters, suggesting functional implications in PrCa. Within the EPINs displaying enrichments in both CTCF and PrCa SNPs, we also show enrichment in oncogenes. We substantiated our identified SNPs through CRISPR/Cas9 knockout and RNAi screen experiments. Here we show that PENGUIN provides insights into the intricate interplay between enhancer-promoter interactions and PPI networks, which are crucial for identifying key genes and potential intervention targets. A dedicated server is available at https://penguin.life.bsc.es/ .
|
4
|
LINE-1 regulates cortical development by acting as long non-coding RNAs. Nat Commun 2023; 14:4974. [PMID: 37591988 PMCID: PMC10435495 DOI: 10.1038/s41467-023-40743-7] [Citation(s) in RCA: 3] [Impact Index Per Article: 3.0] [Received: 01/14/2022] [Accepted: 08/07/2023] [Indexed: 08/19/2023] Open
Abstract
Long Interspersed Nuclear Elements-1s (L1s) are transposable elements that constitute most of the genome's transcriptional output, yet their functions remain largely unknown. Here we show that L1s are required for proper mouse brain corticogenesis, operating as regulatory long non-coding RNAs. They contribute to the regulation of the balance between neuronal progenitors and differentiation, the migration of post-mitotic neurons and the proportions of different cell types. In cortical cultured neurons, L1 RNAs are mainly associated with chromatin and interact with the Polycomb Repressive Complex 2 (PRC2) protein subunits enhancer of Zeste homolog 2 (Ezh2) and suppressor of zeste 12 (Suz12). L1 RNA silencing influences PRC2's ability to bind a portion of its targets and the deposition of tri-methylated histone H3 (H3K27me3) marks. Our results position L1 RNAs as crucial signalling hubs for genome-wide chromatin remodelling, enabling the fine-tuning of gene expression during brain development and evolution.
|
5
|
ORC1 binds to cis-transcribed RNAs for efficient activation of replication origins. Nat Commun 2023; 14:4447. [PMID: 37488096 PMCID: PMC10366126 DOI: 10.1038/s41467-023-40105-3] [Citation(s) in RCA: 3] [Impact Index Per Article: 3.0] [Received: 04/21/2023] [Accepted: 07/11/2023] [Indexed: 07/26/2023] Open
Abstract
Cells must coordinate the activation of thousands of replication origins dispersed throughout their genome. Active transcription is known to favor the formation of mammalian origins, although the role that RNA plays in this process remains unclear. We show that the ORC1 subunit of the human Origin Recognition Complex interacts with RNAs transcribed from genes with origins in their transcription start sites (TSSs), displaying a positive correlation between RNA binding and origin activity. RNA depletion, or the use of an ORC1 RNA-binding mutant, results in inefficient activation of proximal origins, linked to impaired ORC1 chromatin release. ORC1 RNA binding activity resides in its intrinsically disordered region, involved in intra- and inter-molecular interactions, regulation by phosphorylation, and phase separation. We show that RNA binding favors ORC1 chromatin release, by regulating its phosphorylation and subsequent degradation. Our results unveil a non-coding function of RNA as a dynamic component of the chromatin, orchestrating the activation of replication origins.
|
6
|
MAPK/MAK/MRK overlapping kinase (MOK) controls microglial inflammatory/type-I IFN responses via Brd4 and is involved in ALS. Proc Natl Acad Sci U S A 2023; 120:e2302143120. [PMID: 37399380 PMCID: PMC10334760 DOI: 10.1073/pnas.2302143120] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 02/08/2023] [Accepted: 05/26/2023] [Indexed: 07/05/2023] Open
Abstract
Amyotrophic lateral sclerosis (ALS) is a fatal and incurable neurodegenerative disease affecting motor neurons and characterized by microglia-mediated neurotoxic inflammation whose underlying mechanisms remain incompletely understood. In this work, we reveal that MAPK/MAK/MRK overlapping kinase (MOK), with an unknown physiological substrate, displays an immune function by controlling inflammatory and type-I interferon (IFN) responses in microglia which are detrimental to primary motor neurons. Moreover, we uncover the epigenetic reader bromodomain-containing protein 4 (Brd4) as an effector protein regulated by MOK, by promoting Ser492-phospho-Brd4 levels. We further demonstrate that MOK regulates Brd4 functions by supporting its binding to cytokine gene promoters, therefore enabling innate immune responses. Remarkably, we show that MOK levels are increased in the ALS spinal cord, particularly in microglial cells, and that administration of a chemical MOK inhibitor to ALS model mice can modulate Ser492-phospho-Brd4 levels, suppress microglial activation, and modify the disease course, indicating a pathophysiological role of MOK kinase in ALS and neuroinflammation.
|
7
|
The PRALINE database: protein and Rna humAn singLe nucleotIde variaNts in condEnsates. Bioinformatics 2023; 39:6967034. [PMID: 36592044 PMCID: PMC9825767 DOI: 10.1093/bioinformatics/btac847] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 07/23/2022] [Revised: 11/16/2022] [Accepted: 12/30/2022] [Indexed: 01/03/2023] Open
Abstract
SUMMARY: Biological condensates are membraneless organelles with different material properties. Proteins and RNAs are the main components, but most of their interactions are still unknown. Here, we introduce PRALINE, a database for the interrogation of proteins and RNAs contained in stress granules, processing bodies and other assemblies including droplets and amyloids. PRALINE provides information about the predicted and experimentally validated protein-protein, protein-RNA and RNA-RNA interactions. For proteins, it reports the liquid-liquid phase separation and liquid-solid phase separation propensities. For RNAs, it provides information on predicted secondary structure content. PRALINE shows detailed information on human single-nucleotide variants, their clinical significance and presence in protein and RNA binding sites, and how they can affect condensates' physical properties. AVAILABILITY AND IMPLEMENTATION: PRALINE is freely accessible on the web at http://praline.tartaglialab.com.
|
8
|
A high-throughput approach to predict A-to-I effects on RNA structure indicates a change of double-stranded content in non-coding RNAs. IUBMB Life 2022; 75:411-426. [PMID: 36057100 DOI: 10.1002/iub.2673] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 05/31/2022] [Accepted: 08/21/2022] [Indexed: 11/09/2022]
Abstract
RNA molecules undergo a number of chemical modifications whose effects can alter their structure and molecular interactions. Previous studies have shown that RNA editing can impact the formation of ribonucleoprotein (RNP) complexes and influence the assembly of membrane-less organelles such as stress granules (SGs). For instance, N6-methyladenosine (m6A) enhances SG formation and N1-methyladenosine (m1A) prevents their transition to solid-like aggregates. Yet, very little is known about the adenosine-to-inosine (A-to-I) modification, which is very abundant in human cells and impacts not only mRNAs but also non-coding RNAs. Here, we built the CROSSalive predictor of A-to-I effects on RNA structure based on high-throughput in-cell experiments. Our method shows an accuracy of 90% in predicting the single- and double-stranded content of transcripts and identifies a general enrichment of double-stranded regions caused by A-to-I in long intergenic non-coding RNAs (lincRNAs). For the individual cases of NEAT1, NORAD and XIST, we investigated the relationship between A-to-I editing and interactions with RNA-binding proteins using available CLIP data and catRAPID predictions. We found that A-to-I editing is linked to alteration of interaction sites with proteins involved in phase separation, which suggests that RNP assembly can be influenced by A-to-I. CROSSalive is available at http://service.tartaglialab.com/new_submission/crossalive.
|
9
|
Probing TDP-43 condensation using an in silico designed aptamer. Nat Commun 2022; 13:3306. [PMID: 35739092 PMCID: PMC9226187 DOI: 10.1038/s41467-022-30944-x] [Citation(s) in RCA: 3] [Impact Index Per Article: 1.5] [Received: 10/05/2021] [Accepted: 05/23/2022] [Indexed: 12/03/2022] Open
Abstract
Aptamers are artificial oligonucleotides binding to specific molecular targets. They have a promising role in therapeutics and diagnostics but are often difficult to design. Here, we exploited the catRAPID algorithm to generate aptamers targeting TAR DNA-binding protein 43 (TDP-43), whose aggregation is associated with Amyotrophic Lateral Sclerosis. On the pathway to forming insoluble inclusions, TDP-43 adopts a heterogeneous population of assemblies, many smaller than the diffraction-limit of light. We demonstrated that our aptamers bind TDP-43 and used the tightest interactor, Apt-1, as a probe to visualize TDP-43 condensates with super-resolution microscopy. At a resolution of 10 nanometers, we tracked TDP-43 oligomers undetectable by standard approaches. In cells, Apt-1 interacts with both diffuse and condensed forms of TDP-43, indicating that Apt-1 can be exploited to follow TDP-43 phase transition. The de novo generation of aptamers and their use for microscopy opens a new page to study protein condensation.
|
10
|
Identification of long non-coding RNAs and RNA binding proteins in breast cancer subtypes. Sci Rep 2022; 12:693. [PMID: 35027621 PMCID: PMC8758778 DOI: 10.1038/s41598-021-04664-z] [Citation(s) in RCA: 11] [Impact Index Per Article: 5.5] [Received: 06/22/2021] [Accepted: 12/17/2021] [Indexed: 12/14/2022] Open
Abstract
Breast cancer (BC) is a heterogeneous disease classified into four main subtypes with different clinical outcomes, such as patient survival, prognosis, and relapse. Current genetic tests for the differential diagnosis of BC subtypes show poor reproducibility. Therefore, an early and correct diagnosis of molecular subtypes is one of the challenges in the clinic. In the present study, we identified differentially expressed genes, long non-coding RNAs and RNA binding proteins (RBPs) for each BC subtype from a public dataset by applying bioinformatics algorithms. In addition, we investigated their interactions and proposed interacting biomarkers as a potential signature specific to each BC subtype. We found a network of only 2 RBPs (RBM20 and PCDH20) and 2 genes (HOXB3 and RASSF7) for luminal A, a network of 21 RBPs and 53 genes for luminal B, a HER2-specific network of 14 RBPs and 30 genes, and a network of 54 RBPs and 302 genes for basal BC. We validated the signature by considering their expression levels on an independent dataset and evaluating their ability to classify the different molecular subtypes with a machine learning approach. Overall, we achieved good classification performance, with an accuracy >0.80. In addition, we found some interesting novel prognostic biomarkers such as RASSF7 for luminal A, DCTPP1 for luminal B, DHRS11, KLC3, NAGS, and TMEM98 for HER2, and ABHD14A and ADSSL1 for basal. The findings could provide preliminary evidence to identify putative new prognostic biomarkers and therapeutic targets for individual breast cancer subtypes.
|
11
|
Abstract
Motivation: Thermal properties of proteins are of great importance for a number of theoretical and practical implications. Predicting the thermal stability of a protein is a difficult and still scarcely addressed task. Results: Here, we introduce Thermometer, a webserver to assess the thermal stability of a protein using structural information. Thermometer is implemented as a publicly available, user-friendly interface. Availability and implementation: Our server can be found at the following link (all major browsers supported): http://service.tartaglialab.com/new_submission/thermometer_file. Supplementary information: Supplementary data are available at Bioinformatics online.
|
12
|
catRAPID omics v2.0: going deeper and wider in the prediction of protein-RNA interactions. Nucleic Acids Res 2021; 49:W72-W79. [PMID: 34086933 PMCID: PMC8262727 DOI: 10.1093/nar/gkab393] [Citation(s) in RCA: 57] [Impact Index Per Article: 19.0] [Received: 03/08/2021] [Revised: 04/26/2021] [Accepted: 04/29/2021] [Indexed: 12/12/2022] Open
Abstract
Prediction of protein-RNA interactions is important to understand post-transcriptional events taking place in the cell. Here we introduce catRAPID omics v2.0, an update of our web server dedicated to the computation of protein-RNA interaction propensities at the transcriptome- and RNA-binding proteome-level in 8 model organisms. The server accepts multiple input protein or RNA sequences and computes their catRAPID interaction scores on updated precompiled libraries. Additionally, it is now possible to predict the interactions between a custom protein set and a custom RNA set. Considerable effort has been put into the generation of a new database of RNA-binding motifs that are searched within the predicted RNA targets of proteins. In this update, the sequence fragmentation scheme of the catRAPID fragment module has been included, which allows the server to handle long linear RNAs and to analyse circular RNAs. For the top-scoring protein-RNA pairs, the web server shows the predicted binding sites in both protein and RNA sequences and reports whether the predicted interactions are conserved in orthologous protein-RNA pairs. The catRAPID omics v2.0 web server is a powerful tool for the characterization and classification of RNA-protein interactions and is freely available at http://service.tartaglialab.com/page/catrapid_omics2_group along with documentation and tutorial.
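To make the idea of sequence fragmentation concrete, the sketch below cuts an RNA into overlapping windows and, for circular RNAs, lets windows wrap around the junction. It is a generic sliding-window illustration, not the catRAPID fragment scheme itself: the function name `fragment`, the window and step sizes, and the tail-handling rule are assumptions made for this example.

```python
def fragment(seq, window, step, circular=False):
    """Split a sequence into overlapping (start, fragment) windows.

    For linear sequences a final window is added so the 3' end is always covered.
    For circular sequences windows may wrap around the back-splice junction.
    """
    if window <= 0 or step <= 0:
        raise ValueError("window and step must be positive")
    n = len(seq)
    if n <= window:
        return [(0, seq)]
    fragments = []
    if circular:
        # Append the first window-1 bases so windows can span the junction.
        extended = seq + seq[: window - 1]
        for start in range(0, n, step):
            fragments.append((start, extended[start:start + window]))
    else:
        for start in range(0, n - window + 1, step):
            fragments.append((start, seq[start:start + window]))
        last = n - window
        if fragments[-1][0] != last:
            fragments.append((last, seq[last:]))
    return fragments


linear_fragments = fragment("ABCDEFG", window=4, step=2)
circular_fragments = fragment("ABCDEF", window=4, step=2, circular=True)
```

Each fragment could then be scored independently against a protein, which is one way a predictor limited to short inputs can be applied to long or circular transcripts.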
|
13
|
Aggregation is a Context-Dependent Constraint on Protein Evolution. Front Mol Biosci 2021; 8:678115. [PMID: 34222334 PMCID: PMC8249573 DOI: 10.3389/fmolb.2021.678115] [Citation(s) in RCA: 4] [Impact Index Per Article: 1.3] [Received: 03/08/2021] [Accepted: 05/13/2021] [Indexed: 12/27/2022] Open
Abstract
Solubility is a requirement for many cellular processes. Loss of solubility and aggregation can lead to the partial or complete abrogation of protein function. Thus, understanding the relationship between protein evolution and aggregation is an important goal. Here, we analysed two deep mutational scanning experiments to investigate the role of protein aggregation in molecular evolution. In one data set, mutants of a protein involved in RNA biogenesis and processing, human TAR DNA binding protein 43 (TDP-43), were expressed in S. cerevisiae. In the other data set, mutants of a bacterial enzyme that controls resistance to penicillins and cephalosporins, TEM-1 beta-lactamase, were expressed in E. coli under the selective pressure of an antibiotic treatment. We found that aggregation differentiates the effects of mutations in the two different cellular contexts. Specifically, aggregation was found to be associated with increased cell fitness in the case of TDP-43 mutations, as it protects the host from aberrant interactions. By contrast, in the case of TEM-1 beta-lactamase mutations, aggregation is linked to a decreased cell fitness due to inactivation of protein function. Our study shows that aggregation is an important context-dependent constraint of molecular evolution and opens up new avenues to investigate the role of aggregation in the cell.
|
14
|
RNA-protein interactions: Central players in coordination of regulatory networks. Bioessays 2020; 43:e2000118. [PMID: 33284474 DOI: 10.1002/bies.202000118] [Citation(s) in RCA: 9] [Impact Index Per Article: 2.3] [Received: 05/18/2020] [Revised: 09/30/2020] [Accepted: 10/01/2020] [Indexed: 12/12/2022]
Abstract
Changes in the abundance of protein and RNA molecules can impair the formation of complexes in the cell, leading to toxicity and death. Here we exploit the information contained in protein, RNA and DNA interaction networks to provide a comprehensive view of the regulation layers controlling the concentration-dependent formation of assemblies in the cell. We present the emerging concept that RNAs can act as scaffolds to promote the formation of ribonucleoprotein complexes and coordinate the post-transcriptional layer of gene regulation. We describe the structural and interaction network properties that characterize the ability of protein and RNA molecules to interact and phase separate in liquid-like compartments. Finally, we show that the presence of structurally disordered regions in proteins correlates with the propensity to undergo liquid-to-solid phase transitions and cause human diseases. Also see the video abstract here https://youtu.be/kfpqibsNfS0.
|
16
|
Structural analysis of SARS-CoV-2 genome and predictions of the human interactome. Nucleic Acids Res 2020; 48:11270-11283. [PMID: 33068416 PMCID: PMC7672441 DOI: 10.1093/nar/gkaa864] [Citation(s) in RCA: 54] [Impact Index Per Article: 13.5] [Received: 08/19/2020] [Revised: 09/15/2020] [Accepted: 09/25/2020] [Indexed: 12/17/2022] Open
Abstract
Specific elements of viral genomes regulate interactions within host cells. Here, we calculated the secondary structure content of >2000 coronaviruses and computed >100 000 human protein interactions with severe acute respiratory syndrome coronavirus 2 (SARS-CoV-2). The genomic regions display different degrees of conservation. SARS-CoV-2 domain encompassing nucleotides 22 500-23 000 is conserved both at the sequence and structural level. The regions upstream and downstream, however, vary significantly. This part of the viral sequence codes for the Spike S protein that interacts with the human receptor angiotensin-converting enzyme 2 (ACE2). Thus, variability of Spike S is connected to different levels of viral entry in human cells within the population. Our predictions indicate that the 5' end of SARS-CoV-2 is highly structured and interacts with several human proteins. The binding proteins are involved in viral RNA processing, include double-stranded RNA specific editases and ATP-dependent RNA-helicases and have strong propensity to form stress granules and phase-separated assemblies. We propose that these proteins, also implicated in viral infections such as HIV, are selectively recruited by SARS-CoV-2 genome to alter transcriptional and post-transcriptional regulation of host cells and to promote viral replication.
|
17
|
RNA-binding and prion domains: the Yin and Yang of phase separation. Nucleic Acids Res 2020; 48:9491-9504. [PMID: 32857852 PMCID: PMC7515694 DOI: 10.1093/nar/gkaa681] [Citation(s) in RCA: 47] [Impact Index Per Article: 11.8] [Received: 04/24/2020] [Revised: 07/08/2020] [Accepted: 08/05/2020] [Indexed: 12/17/2022] Open
Abstract
Proteins and RNAs assemble in membrane-less organelles that organize intracellular spaces and regulate biochemical reactions. The ability of proteins and RNAs to form condensates is encoded in their sequences, yet it is unknown which domains drive the phase separation (PS) process and what their specific roles are. Here, we systematically investigated the human and yeast proteomes to find regions promoting condensation. Using advanced computational methods to predict the PS propensity of proteins, we designed a set of experiments to investigate the contributions of Prion-Like Domains (PrLDs) and RNA-binding domains (RBDs). We found that one PrLD is sufficient to drive PS, whereas multiple RBDs are needed to modulate the dynamics of the assemblies. In the case of stress granule protein Pub1 we show that the PrLD promotes sequestration of protein partners and the RBD confers liquid-like behaviour to the condensate. Our work sheds light on the fine interplay between RBDs and PrLDs in regulating the formation of membrane-less organelles, opening up avenues for their manipulation.
|
18
|
RNAct: Protein-RNA interaction predictions for model organisms with supporting experimental data. Nucleic Acids Res 2020; 47:D601-D606. [PMID: 30445601 PMCID: PMC6324028 DOI: 10.1093/nar/gky967] [Citation(s) in RCA: 62] [Impact Index Per Article: 15.5] [Received: 08/15/2018] [Accepted: 10/11/2018] [Indexed: 01/15/2023] Open
Abstract
Protein-RNA interactions are implicated in a number of physiological roles as well as diseases, with molecular mechanisms ranging from defects in RNA splicing, localization and translation to the formation of aggregates. Currently, ∼1400 human proteins have experimental evidence of RNA-binding activity. However, only ∼250 of these proteins currently have experimental data on their target RNAs from various sequencing-based methods such as eCLIP. To bridge this gap, we used an established, computationally expensive protein-RNA interaction prediction method, catRAPID, to populate a large database, RNAct. RNAct allows easy lookup of known and predicted interactions and enables global views of the human, mouse and yeast protein-RNA interactomes, expanding them in a genome-wide manner far beyond experimental data (http://rnact.crg.eu).
|
20
|
The moonlighting RNA-binding activity of cytosolic serine hydroxymethyltransferase contributes to control compartmentalization of serine metabolism. Nucleic Acids Res 2019; 47:4240-4254. [PMID: 30809670 PMCID: PMC6486632 DOI: 10.1093/nar/gkz129] [Citation(s) in RCA: 26] [Impact Index Per Article: 5.2] [Received: 09/05/2018] [Revised: 02/01/2019] [Accepted: 02/15/2019] [Indexed: 12/30/2022] Open
Abstract
Enzymes of intermediary metabolism are often reported to have moonlighting functions as RNA-binding proteins and have regulatory roles beyond their primary activities. Human serine hydroxymethyltransferase (SHMT) is essential for the one-carbon metabolism, which sustains growth and proliferation in normal and tumour cells. Here, we characterize the RNA-binding function of cytosolic SHMT (SHMT1) in vitro and using cancer cell models. We show that SHMT1 controls the expression of its mitochondrial counterpart (SHMT2) by binding to the 5'untranslated region of the SHMT2 transcript (UTR2). Importantly, binding to RNA is modulated by metabolites in vitro and the formation of the SHMT1-UTR2 complex inhibits the serine cleavage activity of the SHMT1, without affecting the reverse reaction. Transfection of UTR2 in cancer cells controls SHMT1 activity and reduces cell viability. We propose a novel mechanism of SHMT regulation, which interconnects RNA and metabolites levels to control the cross-talk between cytosolic and mitochondrial compartments of serine metabolism.
|
21
|
CROSSalive: a web server for predicting the in vivo structure of RNA molecules. Bioinformatics 2019; 36:940-941. [PMID: 31504168 PMCID: PMC9883674 DOI: 10.1093/bioinformatics/btz666] [Citation(s) in RCA: 7] [Impact Index Per Article: 1.4] [Received: 05/24/2019] [Revised: 05/24/2019] [Accepted: 08/26/2019] [Indexed: 02/02/2023] Open
Abstract
MOTIVATION: RNA structure is difficult to predict in vivo due to interactions with enzymes and other molecules. Here we introduce CROSSalive, an algorithm to predict the single- and double-stranded regions of RNAs in vivo using predictions of protein interactions. RESULTS: Trained on icSHAPE data in presence (m6a+) and absence of N6 methyladenosine modification (m6a-), CROSSalive achieves cross-validation accuracies between 0.70 and 0.88 in identifying high-confidence single- and double-stranded regions. The algorithm was applied to the long non-coding RNA Xist (17 900 nt, not present in the training) and shows an area under the ROC curve of 0.83 in predicting structured regions. AVAILABILITY AND IMPLEMENTATION: CROSSalive webserver is freely accessible at http://service.tartaglialab.com/new_submission/crossalive. SUPPLEMENTARY INFORMATION: Supplementary data are available at Bioinformatics online.
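For readers unfamiliar with the metric quoted above, the area under the ROC curve can be computed as the fraction of (positive, negative) pairs in which the positive example receives the higher score, with ties counted as half. The sketch below shows that pairwise definition on toy values; the function name `roc_auc` and the example labels and scores are invented for illustration and are unrelated to the CROSSalive data.

```python
def roc_auc(labels, scores):
    """Area under the ROC curve via the pairwise (Mann-Whitney) definition.

    labels: 1 for positives (e.g. double-stranded), 0 for negatives
    scores: predicted scores, higher meaning more likely positive
    """
    positives = [s for label, s in zip(labels, scores) if label == 1]
    negatives = [s for label, s in zip(labels, scores) if label == 0]
    if not positives or not negatives:
        raise ValueError("need at least one positive and one negative example")
    wins = 0.0
    for p in positives:
        for q in negatives:
            if p > q:
                wins += 1.0
            elif p == q:
                wins += 0.5  # ties contribute half a win
    return wins / (len(positives) * len(negatives))


# Toy example: three of the four positive/negative pairs are ranked correctly.
toy_auc = roc_auc([1, 1, 0, 0], [0.9, 0.4, 0.5, 0.1])
```

A value of 0.5 corresponds to random ranking and 1.0 to perfect separation, which gives a scale for reading the 0.83 reported for Xist.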
|
22
|
Abstract
The combination of high-throughput sequencing and in vivo crosslinking approaches is progressively uncovering the complex interdependence between the cellular transcriptome and proteome. Yet the molecular determinants governing interactions in protein-RNA networks are not well understood. Here we investigated the relationship between the structure of an RNA and its ability to interact with proteins. Analysing in silico, in vitro and in vivo experiments, we find that the amount of double-stranded regions in an RNA correlates with the number of protein contacts. This relationship, which we call structure-driven protein interactivity, allows classification of RNA types, plays a role in gene regulation and could have implications for the formation of phase-separated ribonucleoprotein assemblies. We validate our hypothesis by showing that a highly structured RNA can rearrange the composition of a protein aggregate. We report that the tendency of proteins to phase-separate is reduced by interactions with specific RNAs.
|
23
|
A Method for RNA Structure Prediction Shows Evidence for Structure in lncRNAs. Front Mol Biosci 2018; 5:111. [PMID: 30560136 PMCID: PMC6286970 DOI: 10.3389/fmolb.2018.00111] [Citation(s) in RCA: 12] [Impact Index Per Article: 2.0] [Received: 08/16/2018] [Accepted: 11/16/2018] [Indexed: 12/18/2022]
Abstract
To compare the secondary structure profiles of RNA molecules, we developed the CROSSalign method. CROSSalign combines the Computational Recognition Of Secondary Structure (CROSS) algorithm, which predicts the RNA secondary structure profile at single-nucleotide resolution, with the Dynamic Time Warping (DTW) method, which aligns profiles of different lengths. We applied CROSSalign to investigate the structural conservation of long non-coding RNAs such as XIST and HOTAIR as well as ssRNA viruses including HIV. CROSSalign performs pair-wise comparisons and can find homologs among thousands of matches, identifying the exact regions of similarity between profiles of different lengths. In a pool of sequences with the same secondary structure, CROSSalign accurately recognizes repeat A of XIST and domain D2 of HOTAIR and outperforms other methods based on covariance modeling. The algorithm is freely available at the webpage http://service.tartaglialab.com//new_submission/crossalign.
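As an illustration of the DTW step mentioned in this abstract, the following is a minimal sketch of textbook dynamic time warping between two numeric profiles of different lengths. It is not the CROSSalign implementation: the function name `dtw_distance`, the absolute-difference local cost, and the toy profiles are assumptions made for the example.

```python
from math import inf


def dtw_distance(a, b):
    """Textbook DTW distance between two numeric profiles (hypothetical helper)."""
    n, m = len(a), len(b)
    # cost[i][j] holds the minimal cumulative cost of aligning a[:i] with b[:j].
    cost = [[inf] * (m + 1) for _ in range(n + 1)]
    cost[0][0] = 0.0
    for i in range(1, n + 1):
        for j in range(1, m + 1):
            # Local cost: absolute difference (an assumption for this sketch).
            d = abs(a[i - 1] - b[j - 1])
            # Extend the cheapest of the three allowed predecessor alignments.
            cost[i][j] = d + min(cost[i - 1][j], cost[i][j - 1], cost[i - 1][j - 1])
    return cost[n][m]


# Toy "structure profiles": the same shape sampled at different lengths.
short_profile = [0.1, 0.9, 0.9, 0.1]
long_profile = [0.1, 0.1, 0.9, 0.9, 0.9, 0.1]
distance = dtw_distance(short_profile, long_profile)
```

Because DTW may stretch or compress either profile along its length, two profiles with the same shape but different lengths can align at zero cost, which is what makes the method suitable for comparing RNAs of unequal size.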
|
24
|
Abstract
SUMMARY: Here we introduce omiXcore, a server for calculations of protein binding to large RNAs (>500 nucleotides). Our webserver allows (i) use of both protein and RNA sequences without size restriction, (ii) exploration of human long intergenic RNA interactions through a pre-compiled library and (iii) prediction of binding sites. RESULTS: omiXcore was trained and tested on enhanced UV Cross-Linking and ImmunoPrecipitation data. The method discriminates interacting from non-interacting protein-RNA pairs and identifies RNA binding sites with areas under the ROC curve >0.80, which suggests that the tool is particularly useful for prioritizing candidates for further experimental validation. AVAILABILITY AND IMPLEMENTATION: omiXcore is freely accessible on the web at http://service.tartaglialab.com/grant_submission/omixcore. SUPPLEMENTARY INFORMATION: Supplementary data are available at Bioinformatics online.
|
25
|
A high-throughput approach to profile RNA structure. Nucleic Acids Res 2017; 45:e35. [PMID: 27899588 PMCID: PMC5389523 DOI: 10.1093/nar/gkw1094] [Citation(s) in RCA: 33] [Impact Index Per Article: 4.7] [Received: 08/02/2016] [Accepted: 10/28/2016] [Indexed: 11/12/2022]
Abstract
Here we introduce the Computational Recognition of Secondary Structure (CROSS) method to calculate the structural profile of an RNA sequence (single- or double-stranded state) at single-nucleotide resolution and without sequence length restrictions. We trained CROSS using data from high-throughput experiments such as Selective 2'-Hydroxyl Acylation analyzed by Primer Extension (SHAPE; Mouse and HIV transcriptomes) and Parallel Analysis of RNA Structure (PARS; Human and Yeast transcriptomes) as well as high-quality NMR/X-ray structures (PDB database). The algorithm uses primary structure information alone to predict experimental structural profiles with >80% accuracy, showing high performance on large RNAs such as Xist (17 900 nucleotides; area under the ROC curve, AUC, of 0.75 on dimethyl sulfate (DMS) experiments). We integrated CROSS into thermodynamics-based methods to predict secondary structure and observed an increase in their predictive power of up to 30%.
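For readers less familiar with the AUC figures quoted in these abstracts, the sketch below computes the area under the ROC curve as the fraction of positive/negative pairs in which the positive item receives the higher score (ties count as half). The function name `roc_auc` and the toy scores and labels are assumptions for illustration and are not data from the study.

```python
def roc_auc(scores, labels):
    """AUC as the probability that a positive outranks a negative (hypothetical helper)."""
    pos = [s for s, y in zip(scores, labels) if y == 1]
    neg = [s for s, y in zip(scores, labels) if y == 0]
    if not pos or not neg:
        raise ValueError("need at least one positive and one negative example")
    wins = 0.0
    for p in pos:
        for n in neg:
            if p > n:
                wins += 1.0   # positive correctly ranked above negative
            elif p == n:
                wins += 0.5   # ties count as half a win
    return wins / (len(pos) * len(neg))


# Toy example: label 1 = double-stranded, label 0 = single-stranded (made-up values).
toy_scores = [0.9, 0.2, 0.8, 0.1]
toy_labels = [1, 1, 0, 0]
toy_auc = roc_auc(toy_scores, toy_labels)
```

Under this reading, an AUC of 0.5 corresponds to random ranking and 1.0 to perfect separation of the two classes.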
|