1
Qiao Y, Yang R, Liu Y, Chen J, Zhao L, Huo P, Wang Z, Bu D, Wu Y, Zhao Y. DeepFusion: A deep bimodal information fusion network for unraveling protein-RNA interactions using in vivo RNA structures. Comput Struct Biotechnol J 2024; 23:617-625. [PMID: 38274994 PMCID: PMC10808905 DOI: 10.1016/j.csbj.2023.12.040] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 09/06/2023] [Revised: 12/04/2023] [Accepted: 12/26/2023] [Indexed: 01/27/2024] Open
Abstract
RNA-binding proteins (RBPs) are key post-transcriptional regulators, and the malfunctions of RBP-RNA binding lead to diverse human diseases. However, prediction of RBP binding sites is largely based on RNA sequence features, whereas in vivo RNA structural features based on high-throughput sequencing are rarely incorporated. Here, we designed a deep bimodal information fusion network called DeepFusion for unraveling protein-RNA interactions by incorporating structural features derived from DMS-seq data. DeepFusion integrates two sub-models to extract local motif-like information and long-term context information. We show that DeepFusion performs best compared with other cutting-edge methods with only sequence inputs on two datasets. DeepFusion's performance is further improved with bimodal input after adding in vivo DMS-seq structural features. Furthermore, DeepFusion can be used for analyzing RNA degradation, demonstrating significantly different RBP-binding scores in genes with slow degradation rates versus those with rapid degradation rates. DeepFusion thus provides enhanced abilities for further analysis of functional RNAs. DeepFusion's code and data are available at http://bioinfo.org/deepfusion/.
Affiliation(s)
- Yixuan Qiao
- Research Center for Ubiquitous Computing Systems, Institute of Computing Technology, Chinese Academy of Sciences, Beijing 100190, China
- University of Chinese Academy of Sciences, Beijing 100049, China
- Rui Yang
- Research Center for Ubiquitous Computing Systems, Institute of Computing Technology, Chinese Academy of Sciences, Beijing 100190, China
- University of Chinese Academy of Sciences, Beijing 100049, China
- Yang Liu
- Research Center for Ubiquitous Computing Systems, Institute of Computing Technology, Chinese Academy of Sciences, Beijing 100190, China
- University of Chinese Academy of Sciences, Beijing 100049, China
- Jiaxin Chen
- Research Center for Ubiquitous Computing Systems, Institute of Computing Technology, Chinese Academy of Sciences, Beijing 100190, China
- Lianhe Zhao
- Research Center for Ubiquitous Computing Systems, Institute of Computing Technology, Chinese Academy of Sciences, Beijing 100190, China
- Peipei Huo
- Research Center for Ubiquitous Computing Systems, Institute of Computing Technology, Chinese Academy of Sciences, Beijing 100190, China
- Zhihao Wang
- Research Center for Ubiquitous Computing Systems, Institute of Computing Technology, Chinese Academy of Sciences, Beijing 100190, China
- Dechao Bu
- Research Center for Ubiquitous Computing Systems, Institute of Computing Technology, Chinese Academy of Sciences, Beijing 100190, China
- Yang Wu
- Research Center for Ubiquitous Computing Systems, Institute of Computing Technology, Chinese Academy of Sciences, Beijing 100190, China
- Yi Zhao
- Research Center for Ubiquitous Computing Systems, Institute of Computing Technology, Chinese Academy of Sciences, Beijing 100190, China
- University of Chinese Academy of Sciences, Beijing 100049, China
2
Cao X, Zhang Y, Ding Y, Wan Y. Identification of RNA structures and their roles in RNA functions. Nat Rev Mol Cell Biol 2024; 25:784-801. [PMID: 38926530 DOI: 10.1038/s41580-024-00748-6] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Accepted: 05/28/2024] [Indexed: 06/28/2024]
Abstract
The development of high-throughput RNA structure profiling methods in the past decade has greatly facilitated our ability to map and characterize different aspects of RNA structures transcriptome-wide in cell populations, single cells and single molecules. The resulting high-resolution data have provided insights into the static and dynamic nature of RNA structures, revealing their complexity as they perform their respective functions in the cell. In this Review, we discuss recent technical advances in the determination of RNA structures, and the roles of RNA structures in RNA biogenesis and functions, including in transcription, processing, translation, degradation, localization and RNA structure-dependent condensates. We also discuss the current understanding of how RNA structures could guide drug design for treating genetic diseases and battling pathogenic viruses, and highlight existing challenges and future directions in RNA structure research.
Affiliation(s)
- Xinang Cao
- Stem Cell and Regenerative Biology, Genome Institute of Singapore, Singapore, Singapore
- Yueying Zhang
- Department of Cell and Developmental Biology, John Innes Centre, Norwich, UK
- Yiliang Ding
- Department of Cell and Developmental Biology, John Innes Centre, Norwich, UK.
- Yue Wan
- Stem Cell and Regenerative Biology, Genome Institute of Singapore, Singapore, Singapore.
- Department of Biochemistry, Yong Loo Lin School of Medicine, National University of Singapore, Singapore, Singapore.
3
Guarnacci M, Zhang PH, Kanchi M, Hung YT, Lin H, Shirokikh NE, Yang L, Preiss T. Substrate diversity of NSUN enzymes and links of 5-methylcytosine to mRNA translation and turnover. Life Sci Alliance 2024; 7:e202402613. [PMID: 38986569 PMCID: PMC11235314 DOI: 10.26508/lsa.202402613] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 01/22/2024] [Revised: 06/28/2024] [Accepted: 06/28/2024] [Indexed: 07/12/2024] Open
Abstract
Maps of the RNA modification 5-methylcytosine (m5C) often diverge markedly not only because of differences in detection methods, data depth and analysis pipelines but also biological factors. We re-analysed bisulfite RNA sequencing datasets from five human cell lines and seven tissues using a coherent m5C site calling pipeline. With the resulting union list of 6,393 m5C sites, we studied site distribution, enzymology, interaction with RNA-binding proteins and molecular function. We confirmed tRNA:m5C methyltransferases NSUN2 and NSUN6 as the main mRNA m5C "writers," but further showed that the rRNA:m5C methyltransferase NSUN5 can also modify mRNA. Each enzyme recognises mRNA features that strongly resemble their canonical substrates. By analysing proximity between mRNA m5C sites and footprints of RNA-binding proteins, we identified new candidates for functional interactions, including the RNA helicases DDX3X, involved in mRNA translation, and UPF1, an mRNA decay factor. We found that lack of NSUN2 in HeLa cells affected both steady-state levels of, and UPF1-binding to, target mRNAs. Our studies emphasise the emerging diversity of m5C writers and readers and their effect on mRNA function.
Affiliation(s)
- Marco Guarnacci
- https://ror.org/019wvm592 Shine-Dalgarno Centre for RNA Innovation, Division of Genome Science and Cancer, John Curtin School of Medical Research, Australian National University, Canberra, Australia
- Pei-Hong Zhang
- Shanghai Institute of Nutrition and Health, University of Chinese Academy of Sciences, Chinese Academy of Sciences, Shanghai, China
- Center for Molecular Medicine, Children's Hospital, Shanghai Key Laboratory of Medical Epigenetics, International Laboratory of Medical Epigenetics and Metabolism, Institutes of Biomedical Sciences, Fudan University, Shanghai, China
- Madhu Kanchi
- https://ror.org/019wvm592 Shine-Dalgarno Centre for RNA Innovation, Division of Genome Science and Cancer, John Curtin School of Medical Research, Australian National University, Canberra, Australia
- Yu-Ting Hung
- https://ror.org/019wvm592 Shine-Dalgarno Centre for RNA Innovation, Division of Genome Science and Cancer, John Curtin School of Medical Research, Australian National University, Canberra, Australia
- Hanrong Lin
- https://ror.org/019wvm592 Shine-Dalgarno Centre for RNA Innovation, Division of Genome Science and Cancer, John Curtin School of Medical Research, Australian National University, Canberra, Australia
- Nikolay E Shirokikh
- https://ror.org/019wvm592 Shine-Dalgarno Centre for RNA Innovation, Division of Genome Science and Cancer, John Curtin School of Medical Research, Australian National University, Canberra, Australia
- Li Yang
- Center for Molecular Medicine, Children's Hospital, Shanghai Key Laboratory of Medical Epigenetics, International Laboratory of Medical Epigenetics and Metabolism, Institutes of Biomedical Sciences, Fudan University, Shanghai, China
- Thomas Preiss
- https://ror.org/019wvm592 Shine-Dalgarno Centre for RNA Innovation, Division of Genome Science and Cancer, John Curtin School of Medical Research, Australian National University, Canberra, Australia
- Victor Chang Cardiac Research Institute, Sydney, Australia
4
Lu L, Zhang X, Zhou Y, Shi Z, Xie X, Zhang X, Gao L, Fu A, Liu C, He B, Xiong X, Yin Y, Wang Q, Yi C, Li X. Base-resolution m5C profiling across the mammalian transcriptome by bisulfite-free enzyme-assisted chemical labeling approach. Mol Cell 2024; 84:2984-3000.e8. [PMID: 39002544 DOI: 10.1016/j.molcel.2024.06.021] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 12/04/2023] [Revised: 06/03/2024] [Accepted: 06/20/2024] [Indexed: 07/15/2024]
Abstract
5-methylcytosine (m5C) is a prevalent RNA modification crucial for gene expression regulation. However, accurate and sensitive m5C sites identification remains challenging due to severe RNA degradation and reduced sequence complexity during bisulfite sequencing (BS-seq). Here, we report m5C-TAC-seq, a bisulfite-free approach combining TET-assisted m5C-to-f5C oxidation with selective chemical labeling, therefore enabling direct base-resolution m5C detection through pre-enrichment and C-to-T transitions at m5C sites. With m5C-TAC-seq, we comprehensively profiled the m5C methylomes in human and mouse cells, identifying a substantially larger number of confident m5C sites. Through perturbing potential m5C methyltransferases, we deciphered the responsible enzymes for most m5C sites, including the characterization of NSUN5's involvement in mRNA m5C deposition. Additionally, we characterized m5C dynamics during mESC differentiation. Notably, the mild reaction conditions and preservation of nucleotide composition in m5C-TAC-seq allow m5C detection in chromatin-associated RNAs. The accurate and robust m5C-TAC-seq will advance research into m5C methylation functional investigation.
Affiliation(s)
- Liang Lu
- Department of Biochemistry and Department of Gastroenterology of the Second Affiliated Hospital, Zhejiang University School of Medicine, Hangzhou 310058, China; Institute of Immunology, Zhejiang University School of Medicine, Hangzhou 310058, China
- Xiaoting Zhang
- State Key Laboratory of Protein and Plant Gene Research, School of Life Sciences, Peking University, Beijing 100871, China
- Yuenan Zhou
- Department of Biochemistry and Department of Gastroenterology of the Second Affiliated Hospital, Zhejiang University School of Medicine, Hangzhou 310058, China; Department of Cell Biology and Department of Cardiology of the Second Affiliated Hospital, Zhejiang University School of Medicine, Hangzhou 310058, China
- Zuokun Shi
- Department of Biochemistry and Department of Gastroenterology of the Second Affiliated Hospital, Zhejiang University School of Medicine, Hangzhou 310058, China; Department of Cell Biology and Department of Cardiology of the Second Affiliated Hospital, Zhejiang University School of Medicine, Hangzhou 310058, China
- Xiwen Xie
- Department of Biochemistry and Department of Gastroenterology of the Second Affiliated Hospital, Zhejiang University School of Medicine, Hangzhou 310058, China
- Xinyue Zhang
- Department of Cell Biology and Department of Cardiology of the Second Affiliated Hospital, Zhejiang University School of Medicine, Hangzhou 310058, China
- Liaoliao Gao
- Department of Biochemistry and Department of Gastroenterology of the Second Affiliated Hospital, Zhejiang University School of Medicine, Hangzhou 310058, China
- Anbo Fu
- Department of Biochemistry and Department of Gastroenterology of the Second Affiliated Hospital, Zhejiang University School of Medicine, Hangzhou 310058, China
- Cong Liu
- State Key Laboratory of Protein and Plant Gene Research, School of Life Sciences, Peking University, Beijing 100871, China
- Bo He
- Peking-Tsinghua Center for Life Sciences, Peking University, Beijing 100871, China
- Xushen Xiong
- The Second Affiliated Hospital and Liangzhu Laboratory, Zhejiang University School of Medicine, Hangzhou 311121, China
- Yafei Yin
- Department of Cell Biology and Department of Cardiology of the Second Affiliated Hospital, Zhejiang University School of Medicine, Hangzhou 310058, China
- Qingqing Wang
- Institute of Immunology, Zhejiang University School of Medicine, Hangzhou 310058, China
- Chengqi Yi
- State Key Laboratory of Protein and Plant Gene Research, School of Life Sciences, Peking University, Beijing 100871, China; Peking-Tsinghua Center for Life Sciences, Peking University, Beijing 100871, China; Department of Chemical Biology and Synthetic and Functional Biomolecules Center, College of Chemistry and Molecular Engineering, Peking University, Beijing 100871, China.
- Xiaoyu Li
- Department of Biochemistry and Department of Gastroenterology of the Second Affiliated Hospital, Zhejiang University School of Medicine, Hangzhou 310058, China.
5
Her H, Rothamel KL, Nguyen GG, Boyle EA, Yeo GW. Mudskipper detects combinatorial RNA binding protein interactions in multiplexed CLIP data. Cell Genomics 2024; 4:100603. [PMID: 38955188 DOI: 10.1016/j.xgen.2024.100603] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 11/22/2023] [Revised: 04/08/2024] [Accepted: 06/07/2024] [Indexed: 07/04/2024]
Abstract
The uncovering of protein-RNA interactions enables a deeper understanding of RNA processing. Recent multiplexed crosslinking and immunoprecipitation (CLIP) technologies such as antibody-barcoded eCLIP (ABC) dramatically increase the throughput of mapping RNA binding protein (RBP) binding sites. However, multiplex CLIP datasets are multivariate, and each RBP suffers non-uniform signal-to-noise ratio. To address this, we developed Mudskipper, a versatile computational suite comprising two components: a Dirichlet multinomial mixture model to account for the multivariate nature of ABC datasets and a softmasking approach that identifies and removes non-specific protein-RNA interactions in RBPs with low signal-to-noise ratio. Mudskipper demonstrates superior precision and recall over existing tools on multiplex datasets and supports analysis of repetitive elements and small non-coding RNAs. Our findings unravel splicing outcomes and variant-associated disruptions, enabling higher-throughput investigations into diseases and regulation mediated by RBPs.
Affiliation(s)
- Hsuanlin Her
- Department of Cellular and Molecular Medicine, University of California, San Diego, La Jolla, CA 92093, USA; Sanford Stem Cell Institute Innovation Center and Stem Cell Program, University of California, San Diego, La Jolla, CA 92093, USA; Institute for Genomic Medicine, University of California, San Diego, La Jolla, CA 92093, USA; Bioinformatics and Systems Biology Graduate Program, University of California, San Diego, La Jolla, CA 92093, USA
- Katherine L Rothamel
- Department of Cellular and Molecular Medicine, University of California, San Diego, La Jolla, CA 92093, USA; Sanford Stem Cell Institute Innovation Center and Stem Cell Program, University of California, San Diego, La Jolla, CA 92093, USA; Institute for Genomic Medicine, University of California, San Diego, La Jolla, CA 92093, USA
- Grady G Nguyen
- Department of Cellular and Molecular Medicine, University of California, San Diego, La Jolla, CA 92093, USA; Sanford Stem Cell Institute Innovation Center and Stem Cell Program, University of California, San Diego, La Jolla, CA 92093, USA; Institute for Genomic Medicine, University of California, San Diego, La Jolla, CA 92093, USA
- Evan A Boyle
- Department of Cellular and Molecular Medicine, University of California, San Diego, La Jolla, CA 92093, USA; Sanford Stem Cell Institute Innovation Center and Stem Cell Program, University of California, San Diego, La Jolla, CA 92093, USA; Institute for Genomic Medicine, University of California, San Diego, La Jolla, CA 92093, USA
- Gene W Yeo
- Department of Cellular and Molecular Medicine, University of California, San Diego, La Jolla, CA 92093, USA; Sanford Stem Cell Institute Innovation Center and Stem Cell Program, University of California, San Diego, La Jolla, CA 92093, USA; Institute for Genomic Medicine, University of California, San Diego, La Jolla, CA 92093, USA.
6
Scholten NR, Haandrikman D, Tolhuis JO, Morandi E, Incarnato D. SHAPEwarp-web: sequence-agnostic search for structurally homologous RNA regions across databases of chemical probing data. Nucleic Acids Res 2024; 52:W362-W367. [PMID: 38709889 PMCID: PMC11223795 DOI: 10.1093/nar/gkae348] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 03/18/2024] [Revised: 04/08/2024] [Accepted: 04/19/2024] [Indexed: 05/08/2024] Open
Abstract
RNA molecules perform a variety of functions in cells, many of which rely on their secondary and tertiary structures. Chemical probing methods coupled with high-throughput sequencing have significantly accelerated the mapping of RNA structures, and increasingly large datasets of transcriptome-wide RNA chemical probing data are becoming available. Analogously to what has been done for decades in the protein world, this RNA structural information can be leveraged to aid the discovery of structural similarity to a known RNA (or RNA family), which, in turn, can inform about the function of transcripts. We have previously developed SHAPEwarp, a sequence-agnostic method for the search of structurally homologous RNA segments in a database of reactivity profiles derived from chemical probing experiments. In its original implementation, however, SHAPEwarp required substantial computational resources, even for moderately sized databases, as well as significant Linux command line know-how. To address these limitations, we introduce here SHAPEwarp-web, a user-friendly web interface to rapidly query large databases of RNA chemical probing data for structurally similar RNAs. Aside from featuring a completely rewritten core, which speeds up by orders of magnitude the search inside large databases, the web server hosts several high-quality chemical probing databases across multiple species. SHAPEwarp-web is available from https://shapewarp.incarnatolab.com.
Affiliation(s)
- Niek R Scholten
- School for Life Science and Technology, Hanze University of Applied Sciences, 9747 AS Groningen, The Netherlands
- Dennis Haandrikman
- School for Life Science and Technology, Hanze University of Applied Sciences, 9747 AS Groningen, The Netherlands
- Joshua O Tolhuis
- School for Life Science and Technology, Hanze University of Applied Sciences, 9747 AS Groningen, The Netherlands
- Edoardo Morandi
- Department of Molecular Genetics, Groningen Biomolecular Sciences and Biotechnology Institute (GBB), University of Groningen, 9747 AG Groningen, The Netherlands
- Danny Incarnato
- Department of Molecular Genetics, Groningen Biomolecular Sciences and Biotechnology Institute (GBB), University of Groningen, 9747 AG Groningen, The Netherlands
7
Hwang H, Jeon H, Yeo N, Baek D. Big data and deep learning for RNA biology. Exp Mol Med 2024; 56:1293-1321. [PMID: 38871816 PMCID: PMC11263376 DOI: 10.1038/s12276-024-01243-w] [Citation(s) in RCA: 1] [Impact Index Per Article: 1.0] [Received: 01/31/2024] [Revised: 02/27/2024] [Accepted: 03/05/2024] [Indexed: 06/15/2024] Open
Abstract
The exponential growth of big data in RNA biology (RB) has led to the development of deep learning (DL) models that have driven crucial discoveries. As constantly evidenced by DL studies in other fields, the successful implementation of DL in RB depends heavily on the effective utilization of large-scale datasets from public databases. In achieving this goal, data encoding methods, learning algorithms, and techniques that align well with biological domain knowledge have played pivotal roles. In this review, we provide guiding principles for applying these DL concepts to various problems in RB by demonstrating successful examples and associated methodologies. We also discuss the remaining challenges in developing DL models for RB and suggest strategies to overcome these challenges. Overall, this review aims to illuminate the compelling potential of DL for RB and ways to apply this powerful technology to investigate the intriguing biology of RNA more effectively.
Affiliation(s)
- Hyeonseo Hwang
- School of Biological Sciences, Seoul National University, Seoul, Republic of Korea
- Hyeonseong Jeon
- Interdisciplinary Program in Bioinformatics, Seoul National University, Seoul, Republic of Korea
- Genome4me Inc., Seoul, Republic of Korea
- Nagyeong Yeo
- School of Biological Sciences, Seoul National University, Seoul, Republic of Korea
- Daehyun Baek
- School of Biological Sciences, Seoul National University, Seoul, Republic of Korea.
- Interdisciplinary Program in Bioinformatics, Seoul National University, Seoul, Republic of Korea.
- Genome4me Inc., Seoul, Republic of Korea.
8
Rennie S. Deep Learning for Elucidating Modifications to RNA-Status and Challenges Ahead. Genes (Basel) 2024; 15:629. [PMID: 38790258 PMCID: PMC11121098 DOI: 10.3390/genes15050629] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 04/15/2024] [Revised: 05/11/2024] [Accepted: 05/11/2024] [Indexed: 05/26/2024] Open
Abstract
RNA-binding proteins and chemical modifications to RNA play vital roles in the co- and post-transcriptional regulation of genes. In order to fully decipher their biological roles, it is an essential task to catalogue their precise target locations along with their preferred contexts and sequence-based determinants. Recently, deep learning approaches have significantly advanced in this field. These methods can predict the presence or absence of modification at specific genomic regions based on diverse features, particularly sequence and secondary structure, allowing us to decipher the highly non-linear sequence patterns and structures that underlie site preferences. This article provides an overview of how deep learning is being applied to this area, with a particular focus on the problem of mRNA-RBP binding, while also considering other types of chemical modification to RNA. It discusses how different types of model can handle sequence-based and/or secondary-structure-based inputs, the process of model training, including choice of negative regions and separating sets for testing and training, and offers recommendations for developing biologically relevant models. Finally, it highlights four key areas that are crucial for advancing the field.
Affiliation(s)
- Sarah Rennie
- Section for Computational and RNA Biology, Department of Biology, University of Copenhagen, 2200 Copenhagen, Denmark
9
Qian J, Zhang S, Wang F, Li J, Zhang J. What makes SARS-CoV-2 unique? Focusing on the spike protein. Cell Biol Int 2024; 48:404-430. [PMID: 38263600 DOI: 10.1002/cbin.12130] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 10/09/2023] [Revised: 12/25/2023] [Accepted: 01/02/2024] [Indexed: 01/25/2024]
Abstract
Severe acute respiratory syndrome-related coronavirus 2 (SARS-CoV-2) seriously threatens public health and safety. Genetic variants determine the expression of SARS-CoV-2 structural proteins, which are associated with enhanced transmissibility, enhanced virulence, and immune escape. Vaccination is encouraged as a public health intervention, and different types of vaccines are used worldwide. However, new variants continue to emerge, especially the Omicron complex, and the neutralizing antibody responses are diminished significantly. In this review, we outlined the uniqueness of SARS-CoV-2 from three perspectives. First, we described the detailed structure of the spike (S) protein, which is highly susceptible to mutations and contributes to the distinct infection cycle of the virus. Second, we systematically summarized the immunoglobulin G epitopes of SARS-CoV-2 and highlighted the central role of the nonconserved regions of the S protein in adaptive immune escape. Third, we provided an overview of the vaccines targeting the S protein and discussed the impact of the nonconserved regions on vaccine effectiveness. The characterization and identification of the structure and genomic organization of SARS-CoV-2 will help elucidate its mechanisms of viral mutation and infection and provide a basis for the selection of optimal treatments. The leaps in advancements regarding improved diagnosis, targeted vaccines and therapeutic remedies provide sound evidence showing that scientific understanding, research, and technology evolved at the pace of the pandemic.
Affiliation(s)
- Jingbo Qian
- Department of Laboratory Medicine, The First Affiliated Hospital with Nanjing Medical University, Nanjing, China
- Branch of National Clinical Research Center for Laboratory Medicine, Nanjing, China
- Shichang Zhang
- Department of Clinical Laboratory Medicine, Shenzhen Hospital of Southern Medical University, Shenzhen, China
- Fang Wang
- Department of Laboratory Medicine, The First Affiliated Hospital with Nanjing Medical University, Nanjing, China
- Branch of National Clinical Research Center for Laboratory Medicine, Nanjing, China
- Jinming Li
- National Center for Clinical Laboratories, Institute of Geriatric Medicine, Chinese Academy of Medical Sciences, Beijing Hospital/National Center of Gerontology, Beijing, China
- National Center for Clinical Laboratories, Chinese Academy of Medical Sciences & Peking Union Medical College, Beijing, China
- Beijing Engineering Research Center of Laboratory Medicine, Beijing Hospital, Beijing, China
- Jiexin Zhang
- Department of Laboratory Medicine, The First Affiliated Hospital with Nanjing Medical University, Nanjing, China
- Branch of National Clinical Research Center for Laboratory Medicine, Nanjing, China
10
Xu Q, Bao X, Lin Z, Tang L, He LN, Ren J, Zuo Z, Hu K. AStruct: detection of allele-specific RNA secondary structure in structuromic probing data. BMC Bioinformatics 2024; 25:91. [PMID: 38429654 PMCID: PMC11264973 DOI: 10.1186/s12859-024-05704-x] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 10/23/2023] [Accepted: 02/14/2024] [Indexed: 03/03/2024] Open
Abstract
BACKGROUND: Uncovering functional genetic variants from an allele-specific perspective is of paramount importance in advancing our understanding of gene regulation and genetic diseases. Recently, various allele-specific events, such as allele-specific gene expression, allele-specific methylation, and allele-specific binding, have been explored on a genome-wide scale due to the development of high-throughput sequencing methods. RNA secondary structure, which plays a crucial role in multiple RNA-associated processes like RNA modification, translation and splicing, has emerged as an essential focus of relevant research. However, tools to identify genetic variants associated with allele-specific RNA secondary structures are still lacking. RESULTS: Here, we develop a computational tool called 'AStruct' that enables us to detect allele-specific RNA secondary structure (ASRS) from RT-stop based structuromic probing data. AStruct shows robust performance in both simulated datasets and public icSHAPE datasets. We reveal that single nucleotide polymorphisms (SNPs) with higher AStruct scores are enriched in coding regions and tend to be functional. These SNPs are highly conservative, have the potential to disrupt sites involved in m6A modification or protein binding, and are frequently associated with disease. CONCLUSIONS: AStruct is a tool dedicated to invoke allele-specific RNA secondary structure events at heterozygous SNPs in RT-stop based structuromic probing data. It utilizes allelic variants, base pairing and RT-stop information under different cell conditions to detect dynamic and functional ASRS. Compared to sequence-based tools, AStruct considers dynamic cell conditions and outperforms in detecting functional variants. AStruct is implemented in JAVA and is freely accessible at: https://github.com/canceromics/AStruct.
Affiliation(s)
- Qingru Xu
- State Key Laboratory of Oncology in South China, Cancer Center, Collaborative Innovation Center for Cancer Medicine, School of Life Sciences, Sun Yat-sen University, Guangzhou, 510060, China
- Department of Data Science, Dana-Farber Cancer Institute, Boston, MA, 02215, USA
- Xiaoqiong Bao
- State Key Laboratory of Oncology in South China, Cancer Center, Collaborative Innovation Center for Cancer Medicine, School of Life Sciences, Sun Yat-sen University, Guangzhou, 510060, China
- Zhuobin Lin
- Guangdong Key Laboratory of Liver Disease Research, The Third Affiliated Hospital of Sun Yat-sen University, Guangzhou, China
- Lin Tang
- State Key Laboratory of Oncology in South China, Cancer Center, Collaborative Innovation Center for Cancer Medicine, School of Life Sciences, Sun Yat-sen University, Guangzhou, 510060, China
- Li-Na He
- State Key Laboratory of Oncology in South China, Cancer Center, Collaborative Innovation Center for Cancer Medicine, School of Life Sciences, Sun Yat-sen University, Guangzhou, 510060, China
- Jian Ren
- State Key Laboratory of Oncology in South China, Cancer Center, Collaborative Innovation Center for Cancer Medicine, School of Life Sciences, Sun Yat-sen University, Guangzhou, 510060, China
- Zhixiang Zuo
- State Key Laboratory of Oncology in South China, Cancer Center, Collaborative Innovation Center for Cancer Medicine, School of Life Sciences, Sun Yat-sen University, Guangzhou, 510060, China.
- Kunhua Hu
- Guangdong Key Laboratory of Liver Disease Research, The Third Affiliated Hospital of Sun Yat-sen University, Guangzhou, China.
| |
Collapse
|
11
|
Bose R, Saleem I, Mustoe AM. Causes, functions, and therapeutic possibilities of RNA secondary structure ensembles and alternative states. Cell Chem Biol 2024; 31:17-35. [PMID: 38199037 PMCID: PMC10842484 DOI: 10.1016/j.chembiol.2023.12.010] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 09/11/2023] [Revised: 11/21/2023] [Accepted: 12/12/2023] [Indexed: 01/12/2024]
Abstract
RNA secondary structure plays essential roles in encoding RNA regulatory fate and function. Most RNAs populate ensembles of alternatively paired states and are continually unfolded and refolded by cellular processes. Measuring these structural ensembles and their contributions to cellular function has traditionally posed major challenges, but new methods and conceptual frameworks are beginning to fill this void. In this review, we provide a mechanism- and function-centric compendium of the roles of RNA secondary structural ensembles and minority states in regulating the RNA life cycle, from transcription to degradation. We further explore how dysregulation of RNA structural ensembles contributes to human disease and discuss the potential of drugging alternative RNA states to therapeutically modulate RNA activity. The emerging paradigm of RNA structural ensembles as central to RNA function provides a foundation for a deeper understanding of RNA biology and new therapeutic possibilities.
Affiliation(s)
- Ritwika Bose
  - Therapeutic Innovation Center (THINC), Department of Biochemistry and Molecular Pharmacology, Baylor College of Medicine, Houston, TX, USA
- Irfana Saleem
  - Therapeutic Innovation Center (THINC), Department of Biochemistry and Molecular Pharmacology, Baylor College of Medicine, Houston, TX, USA
- Anthony M Mustoe
  - Therapeutic Innovation Center (THINC), Department of Biochemistry and Molecular Pharmacology, Baylor College of Medicine, Houston, TX, USA; Department of Molecular and Human Genetics, Baylor College of Medicine, Houston, TX, USA

12
Kuhle B, Chen Q, Schimmel P. tRNA renovatio: Rebirth through fragmentation. Mol Cell 2023; 83:3953-3971. [PMID: 37802077 PMCID: PMC10841463 DOI: 10.1016/j.molcel.2023.09.016] [Citation(s) in RCA: 1] [Impact Index Per Article: 1.0] [Received: 05/23/2023] [Revised: 08/15/2023] [Accepted: 09/12/2023] [Indexed: 10/08/2023]
Abstract
tRNA function is based on unique structures that enable mRNA decoding using anticodon trinucleotides. These structures interact with specific aminoacyl-tRNA synthetases and ribosomes using 3D shape and sequence signatures. Beyond translation, tRNAs serve as versatile signaling molecules interacting with other RNAs and proteins. Through evolutionary processes, tRNA fragmentation emerges as not merely random degradation but an act of recreation, generating specific shorter molecules called tRNA-derived small RNAs (tsRNAs). These tsRNAs exploit their linear sequences and newly arranged 3D structures for unexpected biological functions, epitomizing the tRNA "renovatio" (from Latin, meaning renewal, renovation, and rebirth). Emerging methods to uncover full tRNA/tsRNA sequences and modifications, combined with techniques to study RNA structures and to integrate AI-powered predictions, will enable comprehensive investigations of tRNA fragmentation products and new interaction potentials in relation to their biological functions. We anticipate that these directions will herald a new era for understanding biological complexity and advancing pharmaceutical engineering.
Affiliation(s)
- Bernhard Kuhle
  - Department of Molecular Medicine, The Scripps Research Institute, La Jolla, CA, USA; Department of Cellular Biochemistry, University Medical Center Göttingen, Göttingen, Germany
- Qi Chen
  - Molecular Medicine Program, Department of Human Genetics, and Division of Urology, Department of Surgery, University of Utah School of Medicine, Salt Lake City, UT, USA
- Paul Schimmel
  - Department of Molecular Medicine, The Scripps Research Institute, La Jolla, CA, USA

13
Yang M, Chen S, Huang Z, Gao S, Yu T, Du T, Zhang H, Li X, Liu CM, Chen S, Li H. Deep learning-enabled discovery and characterization of HKT genes in Spartina alterniflora. Plant J 2023; 116:690-705. [PMID: 37494542 DOI: 10.1111/tpj.16397] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 04/10/2023] [Revised: 07/03/2023] [Accepted: 07/11/2023] [Indexed: 07/28/2023]
Abstract
Spartina alterniflora is a halophyte that can survive in high-salinity environments, and it is phylogenetically close to important cereal crops, such as maize and rice. It is of scientific interest to understand why S. alterniflora can live under such extremely stressful conditions. The molecular mechanism underlying its high-saline tolerance is still largely unknown. Here we investigated the possibility that high-affinity K+ transporters (HKTs), which function in salt tolerance and maintenance of ion homeostasis in plants, are responsible for salt tolerance in S. alterniflora. To overcome the imprecision and instability of gene screening based on conventional sequence alignment, we used a deep learning method, DeepGOPlus, to automatically extract sequence and protein characteristics from our newly assembled S. alterniflora genome to identify SaHKTs. A total of 16 HKT genes were identified. The number of S. alterniflora HKTs (SaHKTs) is larger than that in all other investigated plant species except wheat. Phylogenetically related SaHKT members had similar gene structures, conserved protein domains and cis-elements. Expression profiling showed that most SaHKT genes are expressed in specific tissues and are differentially expressed under salt stress. Yeast complementation expression analysis showed that type I members SaHKT1;2, SaHKT1;3 and SaHKT1;8 and type II members SaHKT2;1, SaHKT2;3 and SaHKT2;4 had low-affinity K+ uptake ability, that type II members showed stronger K+ affinity than rice and Arabidopsis HKTs, and that most SaHKTs showed a preference for Na+ transport. We believe that deep learning-based methods are powerful approaches to uncovering new functional genes, and that the SaHKT genes identified are important resources for breeding new varieties of salt-tolerant crops.
Affiliation(s)
- Maogeng Yang
  - State Key Laboratory of Crop Gene Resources and Breeding, Institute of Crop Sciences, Chinese Academy of Agricultural Sciences (CAAS), Beijing, China
  - Nanfan Research Institute, CAAS, Sanya, Hainan, China
  - Key Laboratory of Plant Molecular & Developmental Biology, College of Life Sciences, Yantai University, Yantai, Shandong, China
- Shoukun Chen
  - State Key Laboratory of Crop Gene Resources and Breeding, Institute of Crop Sciences, Chinese Academy of Agricultural Sciences (CAAS), Beijing, China
  - Nanfan Research Institute, CAAS, Sanya, Hainan, China
  - Hainan Yazhou Bay Seed Laboratory, Sanya, Hainan, China
- Zhangping Huang
  - State Key Laboratory of Crop Gene Resources and Breeding, Institute of Crop Sciences, Chinese Academy of Agricultural Sciences (CAAS), Beijing, China
  - Nanfan Research Institute, CAAS, Sanya, Hainan, China
- Shang Gao
  - State Key Laboratory of Crop Gene Resources and Breeding, Institute of Crop Sciences, Chinese Academy of Agricultural Sciences (CAAS), Beijing, China
  - Nanfan Research Institute, CAAS, Sanya, Hainan, China
- Tingxi Yu
  - State Key Laboratory of Crop Gene Resources and Breeding, Institute of Crop Sciences, Chinese Academy of Agricultural Sciences (CAAS), Beijing, China
  - Nanfan Research Institute, CAAS, Sanya, Hainan, China
- Tingting Du
  - State Key Laboratory of Crop Gene Resources and Breeding, Institute of Crop Sciences, Chinese Academy of Agricultural Sciences (CAAS), Beijing, China
  - Nanfan Research Institute, CAAS, Sanya, Hainan, China
- Hao Zhang
  - State Key Laboratory of Crop Gene Resources and Breeding, Institute of Crop Sciences, Chinese Academy of Agricultural Sciences (CAAS), Beijing, China
  - Nanfan Research Institute, CAAS, Sanya, Hainan, China
- Xiang Li
  - State Key Laboratory of Plant Genomics and National Center for Plant Gene Research, Institute of Genetics and Developmental Biology, Innovation Academy for Seed Design, Chinese Academy of Sciences, Beijing, China
- Chun-Ming Liu
  - State Key Laboratory of Crop Gene Resources and Breeding, Institute of Crop Sciences, Chinese Academy of Agricultural Sciences (CAAS), Beijing, China
  - Key Laboratory of Plant Molecular Physiology, Institute of Botany, Chinese Academy of Sciences, Beijing, China
  - College of Life Sciences, University of Chinese Academy of Sciences, Beijing, China
  - School of Advanced Agricultural Sciences, Peking University, Beijing, China
- Shihua Chen
  - Key Laboratory of Plant Molecular & Developmental Biology, College of Life Sciences, Yantai University, Yantai, Shandong, China
- Huihui Li
  - State Key Laboratory of Crop Gene Resources and Breeding, Institute of Crop Sciences, Chinese Academy of Agricultural Sciences (CAAS), Beijing, China
  - Nanfan Research Institute, CAAS, Sanya, Hainan, China

14
Zhu H, Yang Y, Wang Y, Wang F, Huang Y, Chang Y, Wong KC, Li X. Dynamic characterization and interpretation for protein-RNA interactions across diverse cellular conditions using HDRNet. Nat Commun 2023; 14:6824. [PMID: 37884495 PMCID: PMC10603054 DOI: 10.1038/s41467-023-42547-1] [Citation(s) in RCA: 3] [Impact Index Per Article: 3.0] [Received: 02/23/2023] [Accepted: 10/13/2023] [Indexed: 10/28/2023] [Open Access]
Abstract
RNA-binding proteins play crucial roles in the regulation of gene expression, and understanding the interactions between RNAs and RBPs in distinct cellular conditions forms the basis for comprehending the underlying RNA function. However, current computational methods pose challenges to the cross-prediction of RNA-protein binding events across diverse cell lines and tissue contexts. Here, we develop HDRNet, an end-to-end deep learning-based framework to precisely predict dynamic RBP binding events under diverse cellular conditions. Our results demonstrate that HDRNet can accurately and efficiently identify binding sites, particularly for dynamic prediction, outperforming other state-of-the-art models on 261 linear RNA datasets from both eCLIP and CLIP-seq, supplemented with additional tissue data. Moreover, we conduct motif and interpretation analyses to provide fresh insights into the pathological mechanisms underlying RNA-RBP interactions from various perspectives. Our functional genomic analysis further explores the gene-human disease associations, uncovering previously uncharacterized observations for a broad range of genetic disorders.
Affiliation(s)
- Haoran Zhu
  - School of Artificial Intelligence, Jilin University, 130012, Changchun, China
- Yuning Yang
  - Donnelly Centre for Cellular and Biomolecular Research, University of Toronto, Toronto, ON, Canada
- Yunhe Wang
  - School of Artificial Intelligence, Hebei University of Technology, Tianjin, China
- Fuzhou Wang
  - Department of Computer Science, City University of Hong Kong, Hong Kong, Hong Kong SAR
- Yujian Huang
  - College of Computer Science and Cyber Security, Chengdu University of Technology, 610059, Chengdu, China
- Yi Chang
  - School of Artificial Intelligence, Jilin University, 130012, Changchun, China
- Ka-Chun Wong
  - Department of Computer Science, City University of Hong Kong, Hong Kong, Hong Kong SAR
- Xiangtao Li
  - School of Artificial Intelligence, Jilin University, 130012, Changchun, China

15
Schmidt N, Ganskih S, Wei Y, Gabel A, Zielinski S, Keshishian H, Lareau CA, Zimmermann L, Makroczyova J, Pearce C, Krey K, Hennig T, Stegmaier S, Moyon L, Horlacher M, Werner S, Aydin J, Olguin-Nava M, Potabattula R, Kibe A, Dölken L, Smyth RP, Caliskan N, Marsico A, Krempl C, Bodem J, Pichlmair A, Carr SA, Chlanda P, Erhard F, Munschauer M. SND1 binds SARS-CoV-2 negative-sense RNA and promotes viral RNA synthesis through NSP9. Cell 2023; 186:4834-4850.e23. [PMID: 37794589 PMCID: PMC10617981 DOI: 10.1016/j.cell.2023.09.002] [Citation(s) in RCA: 2] [Impact Index Per Article: 2.0] [Received: 11/13/2022] [Revised: 07/13/2023] [Accepted: 09/01/2023] [Indexed: 10/06/2023]
Abstract
Regulation of viral RNA biogenesis is fundamental to productive SARS-CoV-2 infection. To characterize host RNA-binding proteins (RBPs) involved in this process, we biochemically identified proteins bound to genomic and subgenomic SARS-CoV-2 RNAs. We find that the host protein SND1 binds the 5' end of negative-sense viral RNA and is required for SARS-CoV-2 RNA synthesis. SND1-depleted cells form smaller replication organelles and display diminished virus growth kinetics. We discover that NSP9, a viral RBP and direct SND1 interaction partner, is covalently linked to the 5' ends of positive- and negative-sense RNAs produced during infection. These linkages occur at replication-transcription initiation sites, consistent with NSP9 priming viral RNA synthesis. Mechanistically, SND1 remodels NSP9 occupancy and alters the covalent linkage of NSP9 to initiating nucleotides in viral RNA. Our findings implicate NSP9 in the initiation of SARS-CoV-2 RNA synthesis and unravel an unsuspected role of a cellular protein in orchestrating viral RNA production.
Affiliation(s)
- Nora Schmidt
  - Helmholtz Institute for RNA-based Infection Research (HIRI), Helmholtz Centre for Infection Research (HZI), Würzburg, Germany
- Sabina Ganskih
  - Helmholtz Institute for RNA-based Infection Research (HIRI), Helmholtz Centre for Infection Research (HZI), Würzburg, Germany
- Yuanjie Wei
  - Helmholtz Institute for RNA-based Infection Research (HIRI), Helmholtz Centre for Infection Research (HZI), Würzburg, Germany
- Alexander Gabel
  - Helmholtz Institute for RNA-based Infection Research (HIRI), Helmholtz Centre for Infection Research (HZI), Würzburg, Germany
- Sebastian Zielinski
  - Helmholtz Institute for RNA-based Infection Research (HIRI), Helmholtz Centre for Infection Research (HZI), Würzburg, Germany
- Caleb A Lareau
  - Program in Computational and System Biology, Memorial Sloan Kettering Cancer Center, New York, NY, USA
- Liv Zimmermann
  - Schaller Research Group, Department of Infectious Diseases, Virology, Heidelberg University Hospital, Heidelberg, Germany
- Jana Makroczyova
  - Schaller Research Group, Department of Infectious Diseases, Virology, Heidelberg University Hospital, Heidelberg, Germany
- Karsten Krey
  - School of Medicine, Institute of Virology, Technical University of Munich, Munich, Germany
- Thomas Hennig
  - Institute for Virology and Immunobiology, Julius-Maximilians-University Würzburg, Würzburg, Germany
- Sebastian Stegmaier
  - Helmholtz Institute for RNA-based Infection Research (HIRI), Helmholtz Centre for Infection Research (HZI), Würzburg, Germany
- Lambert Moyon
  - Computational Health Center, Helmholtz Center Munich, Munich, Germany
- Marc Horlacher
  - Computational Health Center, Helmholtz Center Munich, Munich, Germany
- Simone Werner
  - Helmholtz Institute for RNA-based Infection Research (HIRI), Helmholtz Centre for Infection Research (HZI), Würzburg, Germany
- Jens Aydin
  - Helmholtz Institute for RNA-based Infection Research (HIRI), Helmholtz Centre for Infection Research (HZI), Würzburg, Germany
- Marco Olguin-Nava
  - Helmholtz Institute for RNA-based Infection Research (HIRI), Helmholtz Centre for Infection Research (HZI), Würzburg, Germany
- Ramya Potabattula
  - Institute of Human Genetics, Julius-Maximilians-University Würzburg, Würzburg, Germany
- Anuja Kibe
  - Helmholtz Institute for RNA-based Infection Research (HIRI), Helmholtz Centre for Infection Research (HZI), Würzburg, Germany
- Lars Dölken
  - Institute for Virology and Immunobiology, Julius-Maximilians-University Würzburg, Würzburg, Germany
- Redmond P Smyth
  - Helmholtz Institute for RNA-based Infection Research (HIRI), Helmholtz Centre for Infection Research (HZI), Würzburg, Germany
- Neva Caliskan
  - Helmholtz Institute for RNA-based Infection Research (HIRI), Helmholtz Centre for Infection Research (HZI), Würzburg, Germany
- Annalisa Marsico
  - Computational Health Center, Helmholtz Center Munich, Munich, Germany
- Christine Krempl
  - Institute for Virology and Immunobiology, Julius-Maximilians-University Würzburg, Würzburg, Germany
- Jochen Bodem
  - Institute for Virology and Immunobiology, Julius-Maximilians-University Würzburg, Würzburg, Germany
- Andreas Pichlmair
  - School of Medicine, Institute of Virology, Technical University of Munich, Munich, Germany; German Center for Infection Research (DZIF), Munich Partner Site, Munich, Germany
- Steven A Carr
  - Broad Institute of MIT and Harvard, Cambridge, MA, USA
- Petr Chlanda
  - Schaller Research Group, Department of Infectious Diseases, Virology, Heidelberg University Hospital, Heidelberg, Germany
- Florian Erhard
  - Institute for Virology and Immunobiology, Julius-Maximilians-University Würzburg, Würzburg, Germany; Faculty for Computer and Data Science, University of Regensburg, Regensburg, Germany
- Mathias Munschauer
  - Helmholtz Institute for RNA-based Infection Research (HIRI), Helmholtz Centre for Infection Research (HZI), Würzburg, Germany; Faculty of Medicine, Julius-Maximilians-University Würzburg, Würzburg, Germany

16
Ballarino M, Pepe G, Helmer-Citterich M, Palma A. Exploring the landscape of tools and resources for the analysis of long non-coding RNAs. Comput Struct Biotechnol J 2023; 21:4706-4716. [PMID: 37841333 PMCID: PMC10568309 DOI: 10.1016/j.csbj.2023.09.041] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 08/12/2023] [Revised: 09/28/2023] [Accepted: 09/28/2023] [Indexed: 10/17/2023] [Open Access]
Abstract
In recent years, research on long non-coding RNAs (lncRNAs) has gained considerable attention due to the increasing number of newly identified transcripts. Several characteristics make their functional evaluation challenging, creating an urgent need to combine molecular biology with other disciplines, including bioinformatics. Indeed, the recent development of computational pipelines and resources has greatly facilitated both the discovery of lncRNAs and the study of their mechanisms of action. In this review, we present a curated collection of the most recent computational resources, which have been categorized into distinct groups: databases and annotation, identification and classification, interaction prediction, and structure prediction. As the repertoire of lncRNAs and their analysis tools continues to expand over the years, standardizing the computational pipelines and improving the existing annotation of lncRNAs will be crucial to facilitate functional genomics studies.
Affiliation(s)
- Monica Ballarino
  - Department of Biology and Biotechnologies “Charles Darwin”, Sapienza University of Rome, Piazzale Aldo Moro 5, 00161 Rome, Italy
- Gerardo Pepe
  - Department of Biology, University of Rome Tor Vergata, Via della Ricerca Scientifica, 1, 00133 Rome, Italy
- Manuela Helmer-Citterich
  - Department of Biology, University of Rome Tor Vergata, Via della Ricerca Scientifica, 1, 00133 Rome, Italy
- Alessandro Palma
  - Department of Biology and Biotechnologies “Charles Darwin”, Sapienza University of Rome, Piazzale Aldo Moro 5, 00161 Rome, Italy

17
Vaculík O, Chalupová E, Grešová K, Majtner T, Alexiou P. Transfer Learning Allows Accurate RBP Target Site Prediction with Limited Sample Sizes. Biology (Basel) 2023; 12:1276. [PMID: 37886986 PMCID: PMC10604046 DOI: 10.3390/biology12101276] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 08/15/2023] [Revised: 09/19/2023] [Accepted: 09/21/2023] [Indexed: 10/28/2023]
Abstract
RNA-binding proteins are vital regulators in numerous biological processes. Their dysfunction can result in diverse diseases, such as cancer or neurodegenerative disorders, which makes predicting their binding sites highly important. Deep learning (DL) has brought about a revolution in various biological domains, including the field of protein-RNA interactions. Nonetheless, several challenges persist, such as the limited availability of experimentally validated binding sites to train well-performing DL models for the majority of proteins. Here, we present a novel training approach based on transfer learning (TL) to address the issue of limited data. Employing a sophisticated and interpretable architecture, we compare the performance of our method trained using two distinct approaches: training from scratch (SCR) and utilizing TL. Additionally, we benchmark our results against the current state-of-the-art methods. Furthermore, we tackle the challenges associated with selecting appropriate input features and determining optimal interval sizes. Our results show that TL enhances model performance, particularly in datasets with minimal training data, where satisfactory results can be achieved with just a few hundred RNA binding sites. Moreover, we demonstrate that integrating both sequence and evolutionary conservation information leads to superior performance. Additionally, we showcase how incorporating an attention layer into the model facilitates the interpretation of predictions within a biologically relevant context.
Affiliation(s)
- Ondřej Vaculík
  - Central European Institute of Technology (CEITEC), Masaryk University, 625 00 Brno, Czech Republic
  - Faculty of Science, National Centre for Biomolecular Research, Masaryk University, 625 00 Brno, Czech Republic
- Eliška Chalupová
  - Faculty of Science, National Centre for Biomolecular Research, Masaryk University, 625 00 Brno, Czech Republic
- Katarína Grešová
  - Central European Institute of Technology (CEITEC), Masaryk University, 625 00 Brno, Czech Republic
  - Faculty of Science, National Centre for Biomolecular Research, Masaryk University, 625 00 Brno, Czech Republic
- Tomáš Majtner
  - Central European Institute of Technology (CEITEC), Masaryk University, 625 00 Brno, Czech Republic
  - Department of Molecular Sociology, Max Planck Institute of Biophysics, 60439 Frankfurt am Main, Germany
- Panagiotis Alexiou
  - Central European Institute of Technology (CEITEC), Masaryk University, 625 00 Brno, Czech Republic
  - Department of Applied Biomedical Science, Faculty of Health Sciences, University of Malta, MSD 2080 Msida, Malta
  - Centre for Molecular Medicine & Biobanking, University of Malta, MSD 2080 Msida, Malta

18
Horlacher M, Cantini G, Hesse J, Schinke P, Goedert N, Londhe S, Moyon L, Marsico A. A systematic benchmark of machine learning methods for protein-RNA interaction prediction. Brief Bioinform 2023; 24:bbad307. [PMID: 37635383 PMCID: PMC10516373 DOI: 10.1093/bib/bbad307] [Citation(s) in RCA: 4] [Impact Index Per Article: 4.0] [Received: 01/31/2023] [Revised: 06/15/2023] [Accepted: 07/18/2023] [Indexed: 08/29/2023] [Open Access]
Abstract
RNA-binding proteins (RBPs) are central actors of RNA post-transcriptional regulation. Experiments to profile binding sites of RBPs in vivo are limited to transcripts expressed in the experimental cell type, creating the need for computational methods to infer missing binding information. While numerous machine learning-based methods have been developed for this task, their use of heterogeneous training and evaluation datasets across different sets of RBPs and CLIP-seq protocols makes a direct comparison of their performance difficult. Here, we compile a set of 37 machine learning (primarily deep learning) methods for in vivo RBP-RNA interaction prediction and systematically benchmark a subset of 11 representative methods across hundreds of CLIP-seq datasets and RBPs. Using homogenized sample pre-processing and two negative-class sample generation strategies, we evaluate methods in terms of predictive performance and assess the impact of neural network architectures and input modalities on model performance. We believe that this study will not only enable researchers to choose the optimal prediction method for their tasks at hand, but also aid method developers in developing novel, high-performing methods by introducing a standardized framework for their evaluation.
Affiliation(s)
- Marc Horlacher
  - Computational Health Center, Helmholtz Center Munich, Germany
  - School of Computation, Information and Technology, Technical University Munich (TUM), Germany
- Giulia Cantini
  - Computational Health Center, Helmholtz Center Munich, Germany
- Julian Hesse
  - Computational Health Center, Helmholtz Center Munich, Germany
- Patrick Schinke
  - Computational Health Center, Helmholtz Center Munich, Germany
- Nicolas Goedert
  - Computational Health Center, Helmholtz Center Munich, Germany
- Lambert Moyon
  - Computational Health Center, Helmholtz Center Munich, Germany

19
Xiang Y, Huang W, Tan L, Chen T, He Y, Irving PS, Weeks KM, Zhang QC, Dong X. Pervasive downstream RNA hairpins dynamically dictate start-codon selection. Nature 2023; 621:423-430. [PMID: 37674078 PMCID: PMC10499604 DOI: 10.1038/s41586-023-06500-y] [Citation(s) in RCA: 6] [Impact Index Per Article: 6.0] [Received: 06/29/2022] [Accepted: 07/31/2023] [Indexed: 09/08/2023]
Abstract
Translational reprogramming allows organisms to adapt to changing conditions. Upstream start codons (uAUGs), which are prevalently present in mRNAs, have crucial roles in regulating translation by providing alternative translation start sites [1-4]. However, what determines this selective initiation of translation between conditions remains unclear. Here, by integrating transcriptome-wide translational and structural analyses during pattern-triggered immunity in Arabidopsis, we found that transcripts with immune-induced translation are enriched with upstream open reading frames (uORFs). Without infection, these uORFs are selectively translated owing to hairpins immediately downstream of uAUGs, presumably by slowing and engaging the scanning preinitiation complex. Modelling using deep learning provides unbiased support for these recognizable double-stranded RNA structures downstream of uAUGs (which we term uAUG-ds) being responsible for the selective translation of uAUGs, and allows the prediction and rational design of translating uAUG-ds. We found that uAUG-ds-mediated regulation can be generalized to human cells. Moreover, uAUG-ds-mediated start-codon selection is dynamically regulated. After immune challenge in plants, induced RNA helicases that are homologous to Ded1p in yeast and DDX3X in humans resolve these structures, allowing ribosomes to bypass uAUGs to translate downstream defence proteins. This study shows that mRNA structures dynamically regulate start-codon selection. The prevalence of this RNA structural feature and the conservation of RNA helicases across kingdoms suggest that mRNA structural remodelling is a general feature of translational reprogramming.
Affiliation(s)
- Yezi Xiang
  - Department of Biology, Duke University, Durham, NC, USA
  - Howard Hughes Medical Institute, Duke University, Durham, NC, USA
- Wenze Huang
  - MOE Key Laboratory of Bioinformatics, Center for Synthetic and Systems Biology, School of Life Sciences, Tsinghua University, Beijing, China
  - Beijing Frontier Research Center for Biological Structures, Beijing Advanced Innovation Center for Structural Biology, Tsinghua University, Beijing, China
  - Tsinghua-Peking Center for Life Sciences, Beijing, China
- Lianmei Tan
  - Department of Pharmacology and Cancer Biology, Duke Medical Center, Duke University, Durham, NC, USA
- Tianyuan Chen
  - Department of Biology, Duke University, Durham, NC, USA
  - Howard Hughes Medical Institute, Duke University, Durham, NC, USA
- Yang He
  - Department of Biology, Duke University, Durham, NC, USA
  - Howard Hughes Medical Institute, Duke University, Durham, NC, USA
- Patrick S Irving
  - Department of Chemistry, University of North Carolina, Chapel Hill, NC, USA
- Kevin M Weeks
  - Department of Chemistry, University of North Carolina, Chapel Hill, NC, USA
- Qiangfeng Cliff Zhang
  - MOE Key Laboratory of Bioinformatics, Center for Synthetic and Systems Biology, School of Life Sciences, Tsinghua University, Beijing, China
  - Beijing Frontier Research Center for Biological Structures, Beijing Advanced Innovation Center for Structural Biology, Tsinghua University, Beijing, China
  - Tsinghua-Peking Center for Life Sciences, Beijing, China
- Xinnian Dong
  - Department of Biology, Duke University, Durham, NC, USA
  - Howard Hughes Medical Institute, Duke University, Durham, NC, USA

20
Liu X, Duan Y, Hong X, Xie J, Liu S. Challenges in structural modeling of RNA-protein interactions. Curr Opin Struct Biol 2023; 81:102623. [PMID: 37301066 DOI: 10.1016/j.sbi.2023.102623] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 01/31/2023] [Revised: 05/14/2023] [Accepted: 05/16/2023] [Indexed: 06/12/2023]
Abstract
In the past few years, the number of known RNA-binding proteins (RBPs) and RNA-RBP interactions has increased significantly. Here, we review recent developments in the methodology for protein-RNA and protein-protein complex structure modeling with deep learning and co-evolution, and discuss the challenges and opportunities for building a reliable approach for protein-RNA complex structure modeling. Protein Data Bank (PDB) and cross-linking immunoprecipitation (CLIP) data could be combined and used to infer the 2D geometry of protein-RNA interactions by deep learning.
Affiliation(s)
- Xudong Liu
  - School of Physics, Huazhong University of Science and Technology, Wuhan, Hubei, 430074, China
- Yingtian Duan
  - School of Physics, Huazhong University of Science and Technology, Wuhan, Hubei, 430074, China
- Xu Hong
  - School of Physics, Huazhong University of Science and Technology, Wuhan, Hubei, 430074, China
- Juan Xie
  - School of Physics, Huazhong University of Science and Technology, Wuhan, Hubei, 430074, China
- Shiyong Liu
  - School of Physics, Huazhong University of Science and Technology, Wuhan, Hubei, 430074, China

21
Choi SR, Lee M. Transformer Architecture and Attention Mechanisms in Genome Data Analysis: A Comprehensive Review. Biology (Basel) 2023; 12:1033. [PMID: 37508462 PMCID: PMC10376273 DOI: 10.3390/biology12071033] [Citation(s) in RCA: 3] [Impact Index Per Article: 3.0] [Received: 06/20/2023] [Revised: 07/18/2023] [Accepted: 07/21/2023] [Indexed: 07/30/2023]
Abstract
The emergence and rapid development of deep learning, specifically transformer-based architectures and attention mechanisms, have had transformative implications across several domains, including bioinformatics and genome data analysis. The analogous nature of genome sequences to language texts has enabled the application of techniques that have exhibited success in fields ranging from natural language processing to genomic data. This review provides a comprehensive analysis of the most recent advancements in the application of transformer architectures and attention mechanisms to genome and transcriptome data. The focus of this review is on the critical evaluation of these techniques, discussing their advantages and limitations in the context of genome data analysis. With the swift pace of development in deep learning methodologies, it becomes vital to continually assess and reflect on the current standing and future direction of the research. Therefore, this review aims to serve as a timely resource for both seasoned researchers and newcomers, offering a panoramic view of the recent advancements and elucidating the state-of-the-art applications in the field. Furthermore, this review paper serves to highlight potential areas of future investigation by critically evaluating studies from 2019 to 2023, thereby acting as a stepping-stone for further research endeavors.
Affiliation(s)
| | - Minhyeok Lee
- School of Electrical and Electronics Engineering, Chung-Ang University, Seoul 06974, Republic of Korea;
|
22
|
Riley AT, Robson JM, Green AA. Generative and predictive neural networks for the design of functional RNA molecules. BIORXIV : THE PREPRINT SERVER FOR BIOLOGY 2023:2023.07.14.549043. [PMID: 37503279 PMCID: PMC10370010 DOI: 10.1101/2023.07.14.549043] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 07/29/2023]
Abstract
RNA is a remarkably versatile molecule that has been engineered for applications in therapeutics, diagnostics, and in vivo information-processing systems. However, the complex relationship between the sequence and structural properties of an RNA molecule and its ability to perform specific functions often necessitates extensive experimental screening of candidate sequences. Here we present a generalized neural network architecture that utilizes the sequence and structure of RNA molecules (SANDSTORM) to inform functional predictions. We demonstrate that this approach achieves state-of-the-art performance across several distinct RNA prediction tasks, while learning interpretable abstractions of RNA secondary structure. We paired these predictive models with generative adversarial RNA design networks (GARDN), allowing the generative modelling of novel mRNA 5' untranslated regions and toehold switch riboregulators exhibiting a predetermined fitness. This approach enabled the design of novel toehold switches with a 43-fold increase in experimentally characterized dynamic range compared to those designed using classic thermodynamic algorithms. SANDSTORM and GARDN thus represent powerful new predictive and generative tools for the development of diagnostic and therapeutic RNA molecules with improved function.
Affiliation(s)
- Aidan T. Riley
- Department of Biomedical Engineering, Boston University, Boston, MA 02215, USA
- Biological Design Center, Boston University, Boston, MA 02215, USA
| | - James M. Robson
- Department of Biomedical Engineering, Boston University, Boston, MA 02215, USA
- Biological Design Center, Boston University, Boston, MA 02215, USA
| | - Alexander A. Green
- Department of Biomedical Engineering, Boston University, Boston, MA 02215, USA
- Biological Design Center, Boston University, Boston, MA 02215, USA
- Molecular Biology, Cell Biology & Biochemistry Program, Graduate School of Arts and Sciences, Boston University, Boston, MA 02215, USA
|
23
|
Lin Z, Zhao S, Li X, Miao Z, Cao J, Chen Y, Shi Z, Zhang J, Wang D, Chen S, Wang L, Gu A, Chen F, Yang T, Sun K, Han Y, Xie L, Chen H, Ji Y. Cathepsin B S-nitrosylation promotes ADAR1-mediated editing of its own mRNA transcript via an ADD1/MATR3 regulatory axis. Cell Res 2023; 33:546-561. [PMID: 37156877 PMCID: PMC10313700 DOI: 10.1038/s41422-023-00812-4] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 09/22/2022] [Accepted: 04/07/2023] [Indexed: 05/10/2023] Open
Abstract
Genetic information is generally transferred from RNA to protein according to the classic "Central Dogma". Here, we made a striking discovery that post-translational modification of a protein specifically regulates the editing of its own mRNA. We show that S-nitrosylation of cathepsin B (CTSB) exclusively alters the adenosine-to-inosine (A-to-I) editing of its own mRNA. Mechanistically, CTSB S-nitrosylation promotes the dephosphorylation and nuclear translocation of ADD1, leading to the recruitment of MATR3 and ADAR1 to CTSB mRNA. ADAR1-mediated A-to-I RNA editing enables the binding of HuR to CTSB mRNA, resulting in increased CTSB mRNA stability and subsequently higher steady-state levels of CTSB protein. Together, we uncovered a unique feedforward mechanism of protein expression regulation mediated by the ADD1/MATR3/ADAR1 regulatory axis. Our study demonstrates a novel reverse flow of information from the post-translational modification of a protein back to the post-transcriptional regulation of its own mRNA precursor. We coined this process as "Protein-directed EDiting of its Own mRNA by ADAR1 (PEDORA)" and suggest that this constitutes an additional layer of protein expression control. "PEDORA" could represent a currently hidden mechanism in eukaryotic gene expression regulation.
Affiliation(s)
- Zhe Lin
- Key Laboratory of Cardiovascular and Cerebrovascular Medicine, Key Laboratory of Targeted Intervention of Cardiovascular Disease, Collaborative Innovation Center for Cardiovascular Disease Translational Medicine, State Key Laboratory of Reproductive Medicine, School of Pharmacy, the Affiliated Suzhou Hospital of Nanjing Medical University, Gusu School, Nanjing Medical University, Nanjing, Jiangsu, China
| | - Shuang Zhao
- Key Laboratory of Cardiovascular and Cerebrovascular Medicine, Key Laboratory of Targeted Intervention of Cardiovascular Disease, Collaborative Innovation Center for Cardiovascular Disease Translational Medicine, State Key Laboratory of Reproductive Medicine, School of Pharmacy, the Affiliated Suzhou Hospital of Nanjing Medical University, Gusu School, Nanjing Medical University, Nanjing, Jiangsu, China
| | - Xuesong Li
- Key Laboratory of Cardiovascular and Cerebrovascular Medicine, Key Laboratory of Targeted Intervention of Cardiovascular Disease, Collaborative Innovation Center for Cardiovascular Disease Translational Medicine, State Key Laboratory of Reproductive Medicine, School of Pharmacy, the Affiliated Suzhou Hospital of Nanjing Medical University, Gusu School, Nanjing Medical University, Nanjing, Jiangsu, China
| | - Zian Miao
- Key Laboratory of Cardiovascular and Cerebrovascular Medicine, Key Laboratory of Targeted Intervention of Cardiovascular Disease, Collaborative Innovation Center for Cardiovascular Disease Translational Medicine, State Key Laboratory of Reproductive Medicine, School of Pharmacy, the Affiliated Suzhou Hospital of Nanjing Medical University, Gusu School, Nanjing Medical University, Nanjing, Jiangsu, China
| | - Jiawei Cao
- Key Laboratory of Cardiovascular and Cerebrovascular Medicine, Key Laboratory of Targeted Intervention of Cardiovascular Disease, Collaborative Innovation Center for Cardiovascular Disease Translational Medicine, State Key Laboratory of Reproductive Medicine, School of Pharmacy, the Affiliated Suzhou Hospital of Nanjing Medical University, Gusu School, Nanjing Medical University, Nanjing, Jiangsu, China
| | - Yurong Chen
- Key Laboratory of Cardiovascular and Cerebrovascular Medicine, Key Laboratory of Targeted Intervention of Cardiovascular Disease, Collaborative Innovation Center for Cardiovascular Disease Translational Medicine, State Key Laboratory of Reproductive Medicine, School of Pharmacy, the Affiliated Suzhou Hospital of Nanjing Medical University, Gusu School, Nanjing Medical University, Nanjing, Jiangsu, China
| | - Zhiguang Shi
- Key Laboratory of Cardiovascular and Cerebrovascular Medicine, Key Laboratory of Targeted Intervention of Cardiovascular Disease, Collaborative Innovation Center for Cardiovascular Disease Translational Medicine, State Key Laboratory of Reproductive Medicine, School of Pharmacy, the Affiliated Suzhou Hospital of Nanjing Medical University, Gusu School, Nanjing Medical University, Nanjing, Jiangsu, China
| | - Jia Zhang
- Key Laboratory of Cardiovascular and Cerebrovascular Medicine, Key Laboratory of Targeted Intervention of Cardiovascular Disease, Collaborative Innovation Center for Cardiovascular Disease Translational Medicine, State Key Laboratory of Reproductive Medicine, School of Pharmacy, the Affiliated Suzhou Hospital of Nanjing Medical University, Gusu School, Nanjing Medical University, Nanjing, Jiangsu, China
| | - Dongjin Wang
- Department of Thoracic and Cardiovascular Surgery, the Affiliated Drum Tower Hospital of Nanjing University Medical School, Nanjing Drum Tower Hospital Clinical College of Nanjing Medical University, Institute of Cardiothoracic Vascular Disease, Nanjing University, Nanjing, Jiangsu, China
| | - Shaoliang Chen
- Department of Cardiology, Nanjing First Hospital, Nanjing Medical University, Nanjing, Jiangsu, China
| | - Liansheng Wang
- Department of Cardiology, the First Affiliated Hospital of Nanjing Medical University, Nanjing, Jiangsu, China
| | - Aihua Gu
- State Key Laboratory of Reproductive Medicine, Institute of Toxicology, School of Public Health, Nanjing Medical University, Nanjing, Jiangsu, China
| | - Feng Chen
- Department of Forensic Medicine, Nanjing Medical University, Nanjing, Jiangsu, China
| | - Tao Yang
- Department of Endocrinology and Metabolism, the First Affiliated Hospital of Nanjing Medical University, Nanjing, Jiangsu, China
| | - Kangyun Sun
- Department of Cardiology, the Affiliated Suzhou Hospital of Nanjing Medical University, Suzhou Municipal Hospital, Gusu School, Nanjing Medical University, Suzhou, Jiangsu, China
| | - Yi Han
- Department of Geriatrics, the First Affiliated Hospital of Nanjing Medical University, Nanjing, Jiangsu, China.
| | - Liping Xie
- Key Laboratory of Cardiovascular and Cerebrovascular Medicine, Key Laboratory of Targeted Intervention of Cardiovascular Disease, Collaborative Innovation Center for Cardiovascular Disease Translational Medicine, State Key Laboratory of Reproductive Medicine, School of Pharmacy, the Affiliated Suzhou Hospital of Nanjing Medical University, Gusu School, Nanjing Medical University, Nanjing, Jiangsu, China.
| | - Hongshan Chen
- Key Laboratory of Cardiovascular and Cerebrovascular Medicine, Key Laboratory of Targeted Intervention of Cardiovascular Disease, Collaborative Innovation Center for Cardiovascular Disease Translational Medicine, State Key Laboratory of Reproductive Medicine, School of Pharmacy, the Affiliated Suzhou Hospital of Nanjing Medical University, Gusu School, Nanjing Medical University, Nanjing, Jiangsu, China.
| | - Yong Ji
- Key Laboratory of Cardiovascular and Cerebrovascular Medicine, Key Laboratory of Targeted Intervention of Cardiovascular Disease, Collaborative Innovation Center for Cardiovascular Disease Translational Medicine, State Key Laboratory of Reproductive Medicine, School of Pharmacy, the Affiliated Suzhou Hospital of Nanjing Medical University, Gusu School, Nanjing Medical University, Nanjing, Jiangsu, China.
- National Key Laboratory of Frigid Zone Cardiovascular Diseases (NKLFZCD), Department of Pharmacology (State-Province Key Laboratories of Biomedicine-Pharmaceutics of China), College of Pharmacy, Key Laboratory of Cardiovascular Medicine Research and Key Laboratory of Myocardial Ischemia, Chinese Ministry of Education, NHC Key Laboratory of Cell Transplantation, the Central Laboratory of the First Affiliated Hospital, Harbin Medical University, Harbin, Heilongjiang, China.
|
24
|
Xu Y, Zhu J, Huang W, Xu K, Yang R, Zhang QC, Sun L. PrismNet: predicting protein-RNA interaction using in vivo RNA structural information. Nucleic Acids Res 2023:7151359. [PMID: 37140045 DOI: 10.1093/nar/gkad353] [Citation(s) in RCA: 3] [Impact Index Per Article: 3.0] [Received: 02/14/2023] [Revised: 04/13/2023] [Accepted: 04/26/2023] [Indexed: 05/05/2023] Open
Abstract
Fundamental to post-transcriptional regulation, the in vivo binding of RNA binding proteins (RBPs) on their RNA targets heavily depends on RNA structures. To date, most methods for RBP-RNA interaction prediction are based on RNA structures predicted from sequences, which do not consider the various intracellular environments and thus cannot predict cell type-specific RBP-RNA interactions. Here, we present a web server PrismNet that uses a deep learning tool to integrate in vivo RNA secondary structures measured by icSHAPE experiments with RBP binding site information from UV cross-linking and immunoprecipitation in the same cell lines to predict cell type-specific RBP-RNA interactions. Taking an RBP and an RNA region with sequential and structural information as input ('Sequence & Structure' mode), PrismNet outputs the binding probability of the RBP and this RNA region, together with a saliency map and a sequence-structure integrative motif. The web server is freely available at http://prismnetweb.zhanglab.net.
Affiliation(s)
- Yiran Xu
- MOE Key Laboratory of Bioinformatics, Beijing Advanced Innovation Center for Structural Biology & Frontier Research Center for Biological Structure, Center for Synthetic and Systems Biology, School of Life Sciences, Tsinghua University, Beijing 100084, China
- Tsinghua-Peking Center for Life Sciences, Beijing 100084, China
| | - Jianghui Zhu
- MOE Key Laboratory of Bioinformatics, Beijing Advanced Innovation Center for Structural Biology & Frontier Research Center for Biological Structure, Center for Synthetic and Systems Biology, School of Life Sciences, Tsinghua University, Beijing 100084, China
- Tsinghua-Peking Center for Life Sciences, Beijing 100084, China
| | - Wenze Huang
- MOE Key Laboratory of Bioinformatics, Beijing Advanced Innovation Center for Structural Biology & Frontier Research Center for Biological Structure, Center for Synthetic and Systems Biology, School of Life Sciences, Tsinghua University, Beijing 100084, China
- Tsinghua-Peking Center for Life Sciences, Beijing 100084, China
| | - Kui Xu
- MOE Key Laboratory of Bioinformatics, Beijing Advanced Innovation Center for Structural Biology & Frontier Research Center for Biological Structure, Center for Synthetic and Systems Biology, School of Life Sciences, Tsinghua University, Beijing 100084, China
- Tsinghua-Peking Center for Life Sciences, Beijing 100084, China
| | - Rui Yang
- MOE Key Laboratory of Bioinformatics, Beijing Advanced Innovation Center for Structural Biology & Frontier Research Center for Biological Structure, Center for Synthetic and Systems Biology, School of Life Sciences, Tsinghua University, Beijing 100084, China
- Tsinghua-Peking Center for Life Sciences, Beijing 100084, China
| | - Qiangfeng Cliff Zhang
- MOE Key Laboratory of Bioinformatics, Beijing Advanced Innovation Center for Structural Biology & Frontier Research Center for Biological Structure, Center for Synthetic and Systems Biology, School of Life Sciences, Tsinghua University, Beijing 100084, China
- Tsinghua-Peking Center for Life Sciences, Beijing 100084, China
| | - Lei Sun
- Shandong Provincial Key Laboratory of Animal Cell and Developmental Biology, School of Life Sciences, Shandong University, Qingdao 266237, China
|
25
|
Zhao R, Fang X, Mai Z, Chen X, Mo J, Lin Y, Xiao R, Bao X, Weng X, Zhou X. Transcriptome-wide identification of single-stranded RNA binding proteins. Chem Sci 2023; 14:4038-4047. [PMID: 37063799 PMCID: PMC10094363 DOI: 10.1039/d3sc00957b] [Citation(s) in RCA: 1] [Impact Index Per Article: 1.0] [Received: 02/20/2023] [Accepted: 03/07/2023] [Indexed: 04/18/2023] Open
Abstract
RNA-protein interactions are precisely regulated by RNA secondary structures in various biological processes. Large-scale identification of proteins that interact with particular RNA structures is important for characterizing the RBPome. Herein, a kethoxal-assisted single-stranded RNA interactome capture (KASRIC) strategy was developed to globally identify single-stranded RNA binding proteins (ssRBPs). This approach combines RNA secondary structure probing technology with the conventional method of RNA-binding protein profiling, enabling the transcriptome-wide identification of ssRBPs. Applying KASRIC, we identified 3180 candidate RBPs and 244 candidate ssRBPs in HeLa cells. Importantly, the 244 candidate ssRBPs contained 55 previously reported ssRBPs and 189 novel ssRBPs. Functional analysis of the candidate ssRBPs showed enrichment in cellular processes related to RNA splicing and RNA degradation. The KASRIC strategy will facilitate the investigation of RNA-protein interactions.
Affiliation(s)
- Ruiqi Zhao
- College of Chemistry and Molecular Sciences, Key Laboratory of Biomedical Polymers-Ministry of Education, Wuhan University Wuhan Hubei 430072 P. R. China
| | - Xin Fang
- College of Chemistry and Molecular Sciences, Key Laboratory of Biomedical Polymers-Ministry of Education, Wuhan University Wuhan Hubei 430072 P. R. China
| | - Zhibiao Mai
- Laboratory of RNA Molecular Biology, Guangdong Provincial Key Laboratory of Stem Cell and Regenerative Medicine, CAS Key Laboratory of Regenerative Biology, GIBH-CUHK Joint Research Laboratory on Stem Cell and Regenerative Medicine, Guangzhou Institutes of Biomedicine and Health, Chinese Academy of Sciences Guangzhou Guangdong Province 510530 China
| | - Xi Chen
- College of Chemistry and Molecular Sciences, Key Laboratory of Biomedical Polymers-Ministry of Education, Wuhan University Wuhan Hubei 430072 P. R. China
| | - Jing Mo
- College of Chemistry and Molecular Sciences, Key Laboratory of Biomedical Polymers-Ministry of Education, Wuhan University Wuhan Hubei 430072 P. R. China
| | - Yingying Lin
- Laboratory of RNA Molecular Biology, Guangdong Provincial Key Laboratory of Stem Cell and Regenerative Medicine, CAS Key Laboratory of Regenerative Biology, GIBH-CUHK Joint Research Laboratory on Stem Cell and Regenerative Medicine, Guangzhou Institutes of Biomedicine and Health, Chinese Academy of Sciences Guangzhou Guangdong Province 510530 China
| | - Rui Xiao
- Frontier Science Center for Immunology and Metabolism, Medical Research Institute, Wuhan University Wuhan Hubei 430071 China
- TaiKang Center for Life and Medical Sciences, Wuhan University Wuhan Hubei 430071 China
| | - Xichen Bao
- Laboratory of RNA Molecular Biology, Guangdong Provincial Key Laboratory of Stem Cell and Regenerative Medicine, CAS Key Laboratory of Regenerative Biology, GIBH-CUHK Joint Research Laboratory on Stem Cell and Regenerative Medicine, Guangzhou Institutes of Biomedicine and Health, Chinese Academy of Sciences Guangzhou Guangdong Province 510530 China
| | - Xiaocheng Weng
- College of Chemistry and Molecular Sciences, Key Laboratory of Biomedical Polymers-Ministry of Education, Wuhan University Wuhan Hubei 430072 P. R. China
| | - Xiang Zhou
- College of Chemistry and Molecular Sciences, Key Laboratory of Biomedical Polymers-Ministry of Education, Wuhan University Wuhan Hubei 430072 P. R. China
- TaiKang Center for Life and Medical Sciences, Wuhan University Wuhan Hubei 430071 China
|
26
|
How does precursor RNA structure influence RNA processing and gene expression? Biosci Rep 2023; 43:232489. [PMID: 36689327 PMCID: PMC9977717 DOI: 10.1042/bsr20220149] [Citation(s) in RCA: 1] [Impact Index Per Article: 1.0] [Received: 06/20/2022] [Revised: 01/17/2023] [Accepted: 01/23/2023] [Indexed: 01/24/2023] Open
Abstract
RNA is a fundamental biomolecule that has many purposes within cells. Due to its single-stranded and flexible nature, RNA naturally folds into complex and dynamic structures. Recent technological and computational advances have produced an explosion of RNA structural data. Many RNA structures have regulatory and functional properties. Studying the structure of nascent RNAs is particularly challenging due to their low abundance and long length, but their structures are important because they can influence RNA processing. Precursor RNA processing is a nexus of pathways that determines mature isoform composition and that controls gene expression. In this review, we examine what is known about human nascent RNA structure and the influence of RNA structure on processing of precursor RNAs. These known structures provide examples of how other nascent RNAs may be structured and show how novel RNA structures may influence RNA processing including splicing and polyadenylation. RNA structures can be targeted therapeutically to treat disease.
|
27
|
Yan C, Meng Y, Yang J, Chen J, Jiang W. Translational landscape in human early neural fate determination. Development 2023; 150:297188. [PMID: 36846898 DOI: 10.1242/dev.201177] [Citation(s) in RCA: 2] [Impact Index Per Article: 2.0] [Received: 07/31/2022] [Accepted: 02/19/2023] [Indexed: 03/01/2023]
Abstract
Gene expression regulation in eukaryotes is a multi-level process, including transcription, mRNA translation and protein turnover. Many studies have reported sophisticated transcriptional regulation during neural development, but the global translational dynamics are still ambiguous. Here, we differentiate human embryonic stem cells (ESCs) into neural progenitor cells (NPCs) with high efficiency and perform ribosome sequencing and RNA sequencing on both ESCs and NPCs. Data analysis reveals that translational controls engage in many crucial pathways and contribute significantly to regulation of neural fate determination. Furthermore, we show that the sequence characteristics of the untranslated region (UTR) might regulate translation efficiency. Specifically, genes with short 5'UTR and intense Kozak sequence are associated with high translation efficiency in human ESCs, whereas genes with long 3'UTR are related to high translation efficiency in NPCs. In addition, we have identified four biasedly used codons (GAC, GAT, AGA and AGG) and dozens of short open reading frames during neural progenitor differentiation. Thus, our study reveals the translational landscape during early human neural differentiation and provides insights into the regulation of cell fate determination at the translational level.
Affiliation(s)
- Chenchao Yan
- Department of Biological Repositories, Frontier Science Center for Immunology and Metabolism, Medical Research Institute, Zhongnan Hospital of Wuhan University, Wuhan 430071, China
| | - Yajing Meng
- Department of Biological Repositories, Frontier Science Center for Immunology and Metabolism, Medical Research Institute, Zhongnan Hospital of Wuhan University, Wuhan 430071, China
| | - Jie Yang
- Department of Biological Repositories, Frontier Science Center for Immunology and Metabolism, Medical Research Institute, Zhongnan Hospital of Wuhan University, Wuhan 430071, China
| | - Jian Chen
- Chinese Institute for Brain Research (Beijing), Research Unit of Medical Neurobiology, Chinese Academy of Medical Sciences, Beijing 102206, China
| | - Wei Jiang
- Department of Biological Repositories, Frontier Science Center for Immunology and Metabolism, Medical Research Institute, Zhongnan Hospital of Wuhan University, Wuhan 430071, China
- Human Genetics Resource Preservation Center of Wuhan University, Wuhan 430071, China
|
28
|
Zhang L, Lu C, Zeng M, Li Y, Wang J. CRMSS: predicting circRNA-RBP binding sites based on multi-scale characterizing sequence and structure features. Brief Bioinform 2023; 24:6889442. [PMID: 36511222 DOI: 10.1093/bib/bbac530] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 07/21/2022] [Revised: 11/01/2022] [Accepted: 11/07/2022] [Indexed: 12/14/2022] Open
Abstract
Circular RNAs (circRNAs) are reverse-spliced and covalently closed RNAs. Their interactions with RNA-binding proteins (RBPs) have multiple effects on the progress of many diseases. Some computational methods are proposed to identify RBP binding sites on circRNAs but suffer from insufficient accuracy, robustness and explanation. In this study, we first take the characteristics of both RNA and RBP into consideration. We propose a method for discriminating circRNA-RBP binding sites based on multi-scale characterizing sequence and structure features, called CRMSS. For circRNAs, we use sequence k-mer embedding and the forming probabilities of local secondary structures as features. For RBPs, we combine sequence and structure frequencies of RNA-binding domain regions to generate features. We capture binding patterns with multi-scale residual blocks. With BiLSTM and attention mechanism, we obtain the contextual information of high-level representation for circRNA-RBP binding. To validate the effectiveness of CRMSS, we compare its predictive performance with other methods on 37 RBPs. Taking the properties of both circRNAs and RBPs into account, CRMSS achieves superior performance over state-of-the-art methods. In the case study, our model provides reliable predictions and correctly identifies experimentally verified circRNA-RBP pairs. The code of CRMSS is freely available at https://github.com/BioinformaticsCSU/CRMSS.
Affiliation(s)
- Lishen Zhang
- Hunan Provincial Key Lab on Bioinformatics, School of Computer Science and Engineering, Central South University, China
| | - Chengqian Lu
- Hunan Provincial Key Lab on Bioinformatics, School of Computer Science and Engineering, Central South University, China
| | - Min Zeng
- Hunan Provincial Key Lab on Bioinformatics, School of Computer Science and Engineering, Central South University, China
| | - Yaohang Li
- Department of Computer Science at Old Dominion University, USA
| | - Jianxin Wang
- Hunan Provincial Key Lab on Bioinformatics, School of Computer Science and Engineering, Central South University, China
|
29
|
Wang H, Lu X, Zheng H, Wang W, Zhang G, Wang S, Lin P, Zhuang Y, Chen C, Chen Q, Qu J, Xu L. RNAsmc: A integrated tool for comparing RNA secondary structure and evaluating allosteric effects. Comput Struct Biotechnol J 2023; 21:965-973. [PMID: 36733704 PMCID: PMC9876829 DOI: 10.1016/j.csbj.2023.01.007] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 09/15/2022] [Revised: 01/06/2023] [Accepted: 01/07/2023] [Indexed: 01/11/2023] Open
Abstract
RNA structure plays a crucial role in gene regulation, RNA stability and essential biological processes. RNA secondary structure (RSS) motifs are the basic building blocks for investigating the biological mechanisms of structure. Here, we present a strategy for structural motif-based dynamic alignment, namely, RNA secondary-structural motif-comparing (RNAsmc), to identify structural motifs and quantitatively evaluate their underlying molecular functions. RNAsmc is also robust to sequence length, folding protocol and RNA structural profiles obtained by chemical probing. Notably, it is also applicable to quantifying structural variation in special RNA editing events (SNVs or SNPs, fragment insertion or deletion, etc.). The findings indicate that RNAsmc can uncover the heterogeneity of RNA secondary structure and score similarities among components, which provides an impetus to cluster RNA families and evaluate allosteric effects. We find that RNAsmc exhibits remarkable detection efficiency for experimentally derived RiboSNitches. Finally, the pipeline was assembled into an R software package to serve as an automated toolkit to explore, align, and cluster RSS. It is freely available for download at https://CRAN.R-project.org/package=RNAsmc.
Affiliation(s)
- Hong Wang
- National Engineering Research Center of Ophthalmology and Optometry, Eye Hospital, Wenzhou Medical University, Wenzhou 325027, China
- State Key Laboratory of Ophthalmology, Optometry and Visual Science, Eye Hospital, Wenzhou Medical University, Wenzhou 325027, China
- Center of Optometry International Innovation of Wenzhou, Eye Valley, Wenzhou 325027, China
| | - Xiaoyan Lu
- National Engineering Research Center of Ophthalmology and Optometry, Eye Hospital, Wenzhou Medical University, Wenzhou 325027, China
| | - Hewei Zheng
- Wekemo Tech Group Co., Ltd. Shenzhen 518000, China
| | - Wencan Wang
- National Engineering Research Center of Ophthalmology and Optometry, Eye Hospital, Wenzhou Medical University, Wenzhou 325027, China
- State Key Laboratory of Ophthalmology, Optometry and Visual Science, Eye Hospital, Wenzhou Medical University, Wenzhou 325027, China
- Wenzhou Realdata Medical Research Co., Ltd, Wenzhou 325027, China
| | - Guosi Zhang
- National Engineering Research Center of Ophthalmology and Optometry, Eye Hospital, Wenzhou Medical University, Wenzhou 325027, China
- State Key Laboratory of Ophthalmology, Optometry and Visual Science, Eye Hospital, Wenzhou Medical University, Wenzhou 325027, China
| | - Siyu Wang
- National Engineering Research Center of Ophthalmology and Optometry, Eye Hospital, Wenzhou Medical University, Wenzhou 325027, China
- State Key Laboratory of Ophthalmology, Optometry and Visual Science, Eye Hospital, Wenzhou Medical University, Wenzhou 325027, China
| | - Peng Lin
- National Engineering Research Center of Ophthalmology and Optometry, Eye Hospital, Wenzhou Medical University, Wenzhou 325027, China
- State Key Laboratory of Ophthalmology, Optometry and Visual Science, Eye Hospital, Wenzhou Medical University, Wenzhou 325027, China
| | - Youyuan Zhuang
- National Engineering Research Center of Ophthalmology and Optometry, Eye Hospital, Wenzhou Medical University, Wenzhou 325027, China
- State Key Laboratory of Ophthalmology, Optometry and Visual Science, Eye Hospital, Wenzhou Medical University, Wenzhou 325027, China
| | - Chong Chen
- National Engineering Research Center of Ophthalmology and Optometry, Eye Hospital, Wenzhou Medical University, Wenzhou 325027, China
- State Key Laboratory of Ophthalmology, Optometry and Visual Science, Eye Hospital, Wenzhou Medical University, Wenzhou 325027, China
| | - Qi Chen
- National Engineering Research Center of Ophthalmology and Optometry, Eye Hospital, Wenzhou Medical University, Wenzhou 325027, China
- State Key Laboratory of Ophthalmology, Optometry and Visual Science, Eye Hospital, Wenzhou Medical University, Wenzhou 325027, China
| | - Jia Qu
- National Engineering Research Center of Ophthalmology and Optometry, Eye Hospital, Wenzhou Medical University, Wenzhou 325027, China
- State Key Laboratory of Ophthalmology, Optometry and Visual Science, Eye Hospital, Wenzhou Medical University, Wenzhou 325027, China
- Center of Optometry International Innovation of Wenzhou, Eye Valley, Wenzhou 325027, China
- Corresponding authors at: National Engineering Research Center of Ophthalmology and Optometry, Eye Hospital, Wenzhou Medical University, Wenzhou 325027, China
| | - Liangde Xu
- National Engineering Research Center of Ophthalmology and Optometry, Eye Hospital, Wenzhou Medical University, Wenzhou 325027, China
- State Key Laboratory of Ophthalmology, Optometry and Visual Science, Eye Hospital, Wenzhou Medical University, Wenzhou 325027, China
- Center of Optometry International Innovation of Wenzhou, Eye Valley, Wenzhou 325027, China
- Corresponding authors at: National Engineering Research Center of Ophthalmology and Optometry, Eye Hospital, Wenzhou Medical University, Wenzhou 325027, China
|
30
|
Huang W, Zhang QC. Prediction of Dynamic RBP-RNA Interactions Using PrismNet. Methods Mol Biol 2023; 2568:123-132. [PMID: 36227565 DOI: 10.1007/978-1-0716-2687-0_8] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 06/16/2023]
Abstract
A capacity to detect the binding profiles of RNA targets for an RNA-binding protein (RBP) under different cellular conditions is essential to understand the functions of the RBP in posttranscriptional regulation. However, the prediction of RBP binding sites in vivo remains challenging. Tools that predict RBP-RNA interactions using sequence and/or predicted structures cannot reflect the exact state of RNA in vivo. PrismNet, which uses both sequences and in vivo RNA structure information from probing experiments, can accurately predict RBP binding under different cellular conditions by deep learning, and can be applied for functional studies of RBPs. Here, we provide a detailed protocol showing how to train a PrismNet model of RBP-RNA interactions for an RBP, and how to apply the model for predictions of the RBP binding under different conditions.
Affiliation(s)
- Wenze Huang
- MOE Key Laboratory of Bioinformatics, Center for Synthetic and Systems Biology, School of Life Sciences, Tsinghua University, Beijing, China
- Beijing Advanced Innovation Center for Structural Biology & Frontier Research Center for Biological Structure, School of Life Sciences, Tsinghua University, Beijing, China
- Tsinghua-Peking Center for Life Sciences, Beijing, China
| | - Qiangfeng Cliff Zhang
- MOE Key Laboratory of Bioinformatics, Center for Synthetic and Systems Biology, School of Life Sciences, Tsinghua University, Beijing, China.
- Beijing Advanced Innovation Center for Structural Biology & Frontier Research Center for Biological Structure, School of Life Sciences, Tsinghua University, Beijing, China.
- Tsinghua-Peking Center for Life Sciences, Beijing, China.
|
31
|
Sanofi-Cell Research outstanding paper award of 2021. Cell Res 2022; 32:1035. [PMID: 36380149 PMCID: PMC9715938 DOI: 10.1038/s41422-022-00749-0] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 11/16/2022] Open
Affiliation(s)
- Cell Research Editorial Team
- Shanghai Institute of Biochemistry and Cell Biology, Center for Excellence in Molecular Cell Science, Chinese Academy of Sciences, Shanghai, China
|
32
|
Yu B, Li P, Zhang QC, Hou L. Differential analysis of RNA structure probing experiments at nucleotide resolution: uncovering regulatory functions of RNA structure. Nat Commun 2022; 13:4227. [PMID: 35869080 PMCID: PMC9307511 DOI: 10.1038/s41467-022-31875-3] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 08/14/2021] [Accepted: 07/05/2022] [Indexed: 11/09/2022] Open
Abstract
RNAs perform their function by forming specific structures, which can change across cellular conditions. Structure probing experiments combined with next generation sequencing technology have enabled transcriptome-wide analysis of RNA secondary structure in various cellular conditions. Differential analysis of structure probing data in different conditions can reveal the RNA structurally variable regions (SVRs), which is important for understanding RNA functions. Here, we propose DiffScan, a computational framework for normalization and differential analysis of structure probing data in high resolution. DiffScan preprocesses structure probing datasets to remove systematic bias, and then scans the transcripts to identify SVRs and adaptively determines their lengths and locations. The proposed approach is compatible with most structure probing platforms (e.g., icSHAPE, DMS-seq). When evaluated with simulated and benchmark datasets, DiffScan identifies structurally variable regions at nucleotide resolution, with substantial improvement in accuracy compared with existing SVR detection methods. Moreover, the improvement is robust when tested in multiple structure probing platforms. Application of DiffScan in a dataset of multi-subcellular RNA structurome and a subsequent motif enrichment analysis suggest potential links of RNA structural variation and mRNA abundance, possibly mediated by RNA binding proteins such as the serine/arginine rich splicing factors. This work provides an effective tool for differential analysis of RNA secondary structure, reinforcing the power of structure probing experiments in deciphering the dynamic RNA structurome. The authors present DiffScan, an advanced tool for normalization and differential analysis of RNA structure probing experiments, combining their power in deciphering the dynamic RNA structurome and facilitating the discovery of RNA regulatory functions.
|
33
|
Pepe G, Appierdo R, Carrino C, Ballesio F, Helmer-Citterich M, Gherardini PF. Artificial intelligence methods enhance the discovery of RNA interactions. Front Mol Biosci 2022; 9:1000205. [PMID: 36275611 PMCID: PMC9585310 DOI: 10.3389/fmolb.2022.1000205] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 07/21/2022] [Accepted: 09/20/2022] [Indexed: 11/13/2022] Open
Abstract
Understanding how RNAs interact with proteins, RNAs, or other molecules remains a challenge of main interest in biology, given the importance of these complexes in both normal and pathological cellular processes. Since experimental datasets are starting to be available for hundreds of functional interactions between RNAs and other biomolecules, several machine learning and deep learning algorithms have been proposed for predicting RNA-RNA or RNA-protein interactions. However, most of these approaches were evaluated on a single dataset, making performance comparisons difficult. With this review, we aim to summarize recent computational methods, developed in this broad research area, highlighting feature encoding and machine learning strategies adopted. Given the magnitude of the effect that dataset size and quality have on performance, we explored the characteristics of these datasets. Additionally, we discuss multiple approaches to generate datasets of negative examples for training. Finally, we describe the best-performing methods to predict interactions between proteins and specific classes of RNA molecules, such as circular RNAs (circRNAs) and long non-coding RNAs (lncRNAs), and methods to predict RNA-RNA or RNA-RBP interactions independently of the RNA type.
Affiliation(s)
- G Pepe
- Department of Biology, University of Rome “Tor Vergata”, Rome, Italy
- *Correspondence: G Pepe; M Helmer-Citterich
| | - R Appierdo
- Department of Biology, University of Rome “Tor Vergata”, Rome, Italy
| | - C Carrino
- PhD Program in Cellular and Molecular Biology, Department of Biology, University of Rome “Tor Vergata”, Rome, Italy
| | - F Ballesio
- PhD Program in Cellular and Molecular Biology, Department of Biology, University of Rome “Tor Vergata”, Rome, Italy
| | - M Helmer-Citterich
- Department of Biology, University of Rome “Tor Vergata”, Rome, Italy
- *Correspondence: G Pepe; M Helmer-Citterich
| | - PF Gherardini
- Department of Biology, University of Rome “Tor Vergata”, Rome, Italy
|
34
|
Zhang J, Fei Y, Sun L, Zhang QC. Advances and opportunities in RNA structure experimental determination and computational modeling. Nat Methods 2022; 19:1193-1207. [PMID: 36203019 DOI: 10.1038/s41592-022-01623-y] [Citation(s) in RCA: 38] [Impact Index Per Article: 19.0] [Received: 06/10/2022] [Accepted: 08/23/2022] [Indexed: 11/09/2022]
Abstract
Beyond transferring genetic information, RNAs are molecules with diverse functions that include catalyzing biochemical reactions and regulating gene expression. Most of these activities depend on RNAs' specific structures. Therefore, accurately determining RNA structure is integral to advancing our understanding of RNA functions. Here, we summarize the state-of-the-art experimental and computational technologies developed to evaluate RNA secondary and tertiary structures. We also highlight how the rapid increase of experimental data facilitates the integrative modeling approaches for better resolving RNA structures. Finally, we provide our thoughts on the latest advances and challenges in RNA structure determination methods, as well as on future directions for both experimental approaches and artificial intelligence-based computational tools to model RNA structure. Ultimately, we hope the technological advances will deepen our understanding of RNA biology and facilitate RNA structure-based biomedical research such as designing specific RNA structures for therapeutics and deploying RNA-targeting small-molecule drugs.
Affiliation(s)
- Jinsong Zhang
- MOE Key Laboratory of Bioinformatics, Center for Synthetic and Systems Biology, School of Life Sciences, Tsinghua University, Beijing, China
- Beijing Advanced Innovation Center for Structural Biology & Frontier Research Center for Biological Structure, School of Life Sciences, Tsinghua University, Beijing, China
- Tsinghua-Peking Center for Life Sciences, Beijing, China
| | - Yuhan Fei
- MOE Key Laboratory of Bioinformatics, Center for Synthetic and Systems Biology, School of Life Sciences, Tsinghua University, Beijing, China
- Beijing Advanced Innovation Center for Structural Biology & Frontier Research Center for Biological Structure, School of Life Sciences, Tsinghua University, Beijing, China
- Tsinghua-Peking Center for Life Sciences, Beijing, China
| | - Lei Sun
- MOE Key Laboratory of Bioinformatics, Center for Synthetic and Systems Biology, School of Life Sciences, Tsinghua University, Beijing, China
- Beijing Advanced Innovation Center for Structural Biology & Frontier Research Center for Biological Structure, School of Life Sciences, Tsinghua University, Beijing, China
- Tsinghua-Peking Center for Life Sciences, Beijing, China
| | - Qiangfeng Cliff Zhang
- MOE Key Laboratory of Bioinformatics, Center for Synthetic and Systems Biology, School of Life Sciences, Tsinghua University, Beijing, China
- Beijing Advanced Innovation Center for Structural Biology & Frontier Research Center for Biological Structure, School of Life Sciences, Tsinghua University, Beijing, China
- Tsinghua-Peking Center for Life Sciences, Beijing, China
|
35
|
Waldern JM, Kumar J, Laederach A. Disease-associated human genetic variation through the lens of precursor and mature RNA structure. Hum Genet 2022; 141:1659-1672. [PMID: 34741198 PMCID: PMC9072596 DOI: 10.1007/s00439-021-02395-9] [Citation(s) in RCA: 9] [Impact Index Per Article: 4.5] [Received: 06/29/2021] [Accepted: 10/26/2021] [Indexed: 12/14/2022]
Abstract
Disease-associated variants (DAVs) are commonly considered either through a genomic lens that describes variant function at the DNA level, or at the protein function level if the variant is translated. Although the genomic and proteomic effects of variation are well-characterized, genetic variants disrupting post-transcriptional regulation is another mechanism of disease that remains understudied. Specific RNA sequence motifs mediate post-transcriptional regulation both in the nucleus and cytoplasm of eukaryotic cells, often by binding to RNA-binding proteins or other RNAs. However, many DAVs map far from these motifs, which suggests deeper layers of post-transcriptional mechanistic control. Here, we consider a transcriptomic framework to outline the importance of post-transcriptional regulation as a mechanism of disease-causing single-nucleotide variation in the human genome. We first describe the composition of the human transcriptome and the importance of abundant yet overlooked components such as introns and untranslated regions (UTRs) of messenger RNAs (mRNAs). We present an analysis of Human Gene Mutation Database variants mapping to mRNAs and examine the distribution of causative disease-associated variation across the transcriptome. Although our analysis confirms the importance of post-transcriptional regulatory motifs, a majority of DAVs do not directly map to known regulatory motifs. Therefore, we review evidence that regions outside these well-characterized motifs can regulate function by RNA structure-mediated mechanisms in all four elements of an mRNA: exons, introns, 5' and 3' UTRs. To this end, we review published examples of riboSNitches, which are single-nucleotide variants that result in a change in RNA structure that is causative of the disease phenotype. 
In this review, we present the current state of knowledge of how DAVs act at the transcriptome level, both through altering post-transcriptional regulatory motifs and by the effects of RNA structure.
Affiliation(s)
- Justin M Waldern
- Department of Biology, University of North Carolina, Chapel Hill, NC, 27599, USA
| | - Jayashree Kumar
- Department of Biology, University of North Carolina, Chapel Hill, NC, 27599, USA
- Curriculum in Bioinformatics and Computational Biology, University of North Carolina at Chapel Hill, Chapel Hill, NC, 27599, USA
| | - Alain Laederach
- Department of Biology, University of North Carolina, Chapel Hill, NC, 27599, USA.
- Curriculum in Bioinformatics and Computational Biology, University of North Carolina at Chapel Hill, Chapel Hill, NC, 27599, USA.
|
36
|
Kuret K, Amalietti AG, Jones DM, Capitanchik C, Ule J. Positional motif analysis reveals the extent of specificity of protein-RNA interactions observed by CLIP. Genome Biol 2022; 23:191. [PMID: 36085079 PMCID: PMC9461102 DOI: 10.1186/s13059-022-02755-2] [Citation(s) in RCA: 14] [Impact Index Per Article: 7.0] [Received: 12/07/2021] [Accepted: 08/22/2022] [Indexed: 12/01/2022] Open
Abstract
BACKGROUND: Crosslinking and immunoprecipitation (CLIP) is a method used to identify in vivo RNA-protein binding sites on a transcriptome-wide scale. With the increasing amounts of available data for RNA-binding proteins (RBPs), it is important to understand to what degree the enriched motifs specify the RNA-binding profiles of RBPs in cells. RESULTS: We develop positionally enriched k-mer analysis (PEKA), a computational tool for efficient analysis of enriched motifs from individual CLIP datasets, which minimizes the impact of technical and regional genomic biases by internal data normalization. We cross-validate PEKA with mCross and show that the use of input control for background correction is not required to yield high specificity of enriched motifs. We identify motif classes with common enrichment patterns across eCLIP datasets and across RNA regions, while also observing variations in the specificity and the extent of motif enrichment across eCLIP datasets, between variant CLIP protocols, and between CLIP and in vitro binding data. Thereby, we gain insights into the contributions of technical and regional genomic biases to the enriched motifs, and find how motif enrichment features relate to the domain composition and low-complexity regions of the studied proteins. CONCLUSIONS: Our study provides insights into the overall contributions of regional binding preferences, protein domains, and low-complexity regions to the specificity of protein-RNA interactions, and shows the value of cross-motif and cross-RBP comparison for data interpretation. Our results are presented for exploratory analysis via an online platform in an RBP-centric and motif-centric manner (https://imaps.goodwright.com/apps/peka/).
Affiliation(s)
- Klara Kuret
- National Institute of Chemistry, Hajdrihova 19, SI-1001 Ljubljana, Slovenia
- Jozef Stefan International Postgraduate School, Jamova cesta 39, 1000 Ljubljana, Slovenia
- The Francis Crick Institute, 1 Midland Road, London, NW1 1AT UK
| | - Aram Gustav Amalietti
- National Institute of Chemistry, Hajdrihova 19, SI-1001 Ljubljana, Slovenia
- The Francis Crick Institute, 1 Midland Road, London, NW1 1AT UK
| | - D. Marc Jones
- The Francis Crick Institute, 1 Midland Road, London, NW1 1AT UK
- UK Dementia Research Institute, King’s College London, London, UK
| | - Charlotte Capitanchik
- The Francis Crick Institute, 1 Midland Road, London, NW1 1AT UK
- UK Dementia Research Institute, King’s College London, London, UK
| | - Jernej Ule
- National Institute of Chemistry, Hajdrihova 19, SI-1001 Ljubljana, Slovenia
- The Francis Crick Institute, 1 Midland Road, London, NW1 1AT UK
- UK Dementia Research Institute, King’s College London, London, UK
|
37
|
Laverty KU, Jolma A, Pour SE, Zheng H, Ray D, Morris Q, Hughes TR. PRIESSTESS: interpretable, high-performing models of the sequence and structure preferences of RNA-binding proteins. Nucleic Acids Res 2022; 50:e111. [PMID: 36018788 DOI: 10.1093/nar/gkac694] [Citation(s) in RCA: 6] [Impact Index Per Article: 3.0] [Received: 11/17/2021] [Revised: 07/22/2022] [Accepted: 08/03/2022] [Indexed: 12/23/2022] Open
Abstract
Modelling both primary sequence and secondary structure preferences for RNA binding proteins (RBPs) remains an ongoing challenge. Current models use varied RNA structure representations and can be difficult to interpret and evaluate. To address these issues, we present a universal RNA motif-finding/scanning strategy, termed PRIESSTESS (Predictive RBP-RNA InterpretablE Sequence-Structure moTif regrESSion), that can be applied to diverse RNA binding datasets. PRIESSTESS identifies dozens of enriched RNA sequence and/or structure motifs that are subsequently reduced to a set of core motifs by logistic regression with LASSO regularization. Importantly, these core motifs are easily visualized and interpreted, and provide a measure of RBP secondary structure specificity. We used PRIESSTESS to interrogate new HTR-SELEX data for 23 RBPs with diverse RNA binding modes and captured known primary sequence and secondary structure preferences for each. Moreover, when applying PRIESSTESS to 144 RBPs across 202 RNA binding datasets, 75% showed an RNA secondary structure preference but only 10% had a preference besides unpaired bases, suggesting that most RBPs simply recognize the accessibility of primary sequences.
Affiliation(s)
- Kaitlin U Laverty
- Department of Molecular Genetics, University of Toronto, Toronto, Canada
| | - Arttu Jolma
- Department of Molecular Genetics, University of Toronto, Toronto, Canada
- Donnelly Centre, University of Toronto, Toronto, Canada
| | - Sara E Pour
- Department of Molecular Genetics, University of Toronto, Toronto, Canada
| | - Hong Zheng
- Donnelly Centre, University of Toronto, Toronto, Canada
| | - Debashish Ray
- Donnelly Centre, University of Toronto, Toronto, Canada
| | - Quaid Morris
- Department of Molecular Genetics, University of Toronto, Toronto, Canada
- Computational and Systems Biology, Memorial Sloan Kettering Cancer Center, New York, USA
| | - Timothy R Hughes
- Department of Molecular Genetics, University of Toronto, Toronto, Canada
- Donnelly Centre, University of Toronto, Toronto, Canada
|
38
|
Ma H, Wen H, Xue Z, Li G, Zhang Z. RNANetMotif: Identifying sequence-structure RNA network motifs in RNA-protein binding sites. PLoS Comput Biol 2022; 18:e1010293. [PMID: 35819951 PMCID: PMC9275694 DOI: 10.1371/journal.pcbi.1010293] [Citation(s) in RCA: 5] [Impact Index Per Article: 2.5] [Received: 09/16/2021] [Accepted: 06/09/2022] [Indexed: 11/19/2022] Open
Abstract
RNA molecules can adopt stable secondary and tertiary structures, which are essential in mediating physical interactions with other partners such as RNA binding proteins (RBPs) and in carrying out their cellular functions. In vitro and in vivo experiments such as RNAcompete and eCLIP have revealed the in vitro binding preferences of RBPs for RNA oligomers and their in vivo binding sites in cells. Analysis of these binding data showed that the structural properties of the RNAs in these binding sites are important determinants of the binding events; however, it has been a challenge to incorporate the structure information into an interpretable model. Here we describe a new approach, RNANetMotif, which takes the predicted secondary structures of thousands of RNA sequences bound by an RBP as input and uses a graph theory approach to recognize enriched subgraphs. These enriched subgraphs are in essence shared sequence-structure elements that are important in RBP-RNA binding. To validate our approach, we performed RNA structure modeling via coarse-grained molecular dynamics folding simulations for four selected RBPs, and RNA-protein docking for LIN28B. The simulation results, e.g., solvent accessibility and energetics, further support the biological relevance of the discovered network subgraphs.
RNA binding proteins (RBPs) regulate every aspect of RNA biology, including splicing, translation, transportation, and degradation. High-throughput technologies such as eCLIP have identified thousands of binding sites for a given RBP throughout the genome. It has been shown by earlier studies that, in addition to nucleotide sequences, the structure and conformation of RNAs also play an important role in RBP-RNA interactions. Analogous to protein-protein interactions or protein-DNA interactions, it is likely that there exist intrinsic sequence-structure motifs common to these RNAs that underlie their binding specificity to specific RBPs.
It is known that RNAs form energetically favorable secondary structures, which can be represented as graphs, with nucleotides being nodes and backbone covalent bonds and base-pairing hydrogen bonds representing edges. We hypothesize that these graphs can be mined by graph theory approaches to identify sequence-structure motifs as enriched subgraphs. In this article, we describe the details of this approach, termed RNANetMotif, and the associated new concepts, namely EKS (Extended K-mer Subgraph) and the GraphK graph algorithm. To test the utility of our approach, we conducted 3D structure modeling of selected RNA sequences through molecular dynamics (MD) folding simulation and evaluated the significance of the discovered RNA motifs by comparing their spatial exposure with other regions on the RNA. We believe that this approach has the novelty of treating the RNA sequence as a graph and RBP binding sites as enriched subgraphs, which has broader applications beyond RBP-RNA interactions.
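The graph representation described in this abstract (nucleotides as nodes, backbone bonds and base pairs as edges) can be illustrated with a minimal sketch. It assumes the secondary structure is supplied in dot-bracket notation without pseudoknots; the function name `structure_graph` and the edge labels are hypothetical and are not taken from RNANetMotif, and the sketch does not implement the EKS or GraphK concepts.

```python
def structure_graph(sequence, dot_bracket):
    """Return a dict mapping an edge (i, j), i < j, to its type.

    Nodes are 0-based nucleotide positions. Edges are either "backbone"
    (consecutive positions) or "basepair" (matching brackets).
    Assumption: plain dot-bracket input with '(', ')' and '.' only.
    """
    if len(sequence) != len(dot_bracket):
        raise ValueError("sequence and structure must have equal length")

    edges = {}
    # Backbone covalent bonds connect each nucleotide to its successor.
    for i in range(len(sequence) - 1):
        edges[(i, i + 1)] = "backbone"

    # Base pairs: match each ')' with the most recent unmatched '('.
    open_positions = []
    for i, symbol in enumerate(dot_bracket):
        if symbol == "(":
            open_positions.append(i)
        elif symbol == ")":
            if not open_positions:
                raise ValueError("unbalanced structure: extra ')'")
            edges[(open_positions.pop(), i)] = "basepair"
        elif symbol != ".":
            raise ValueError(f"unsupported symbol: {symbol!r}")
    if open_positions:
        raise ValueError("unbalanced structure: extra '('")
    return edges


# Toy hairpin: a 3-bp stem closing a 3-nt loop (illustrative input only).
edges = structure_graph("GGGAAACCC", "(((...)))")
basepairs = sorted(e for e, kind in edges.items() if kind == "basepair")
```

With a graph in this form, subgraph enumeration or motif counting can be layered on top using any graph library; that step is left out here because the abstract does not specify it.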
Affiliation(s)
- Hongli Ma
- Research Center for Mathematics and Interdisciplinary Sciences, Shandong University, Qingdao, China
- Terrence Donnelly Centre for Cellular and Biomolecular Research, University of Toronto, Toronto, Ontario, Canada
- Department of Molecular Genetics, University of Toronto, Toronto, Ontario, Canada
- School of Mathematics, Shandong University, Jinan, China
| | - Han Wen
- Terrence Donnelly Centre for Cellular and Biomolecular Research, University of Toronto, Toronto, Ontario, Canada
| | - Zhiyuan Xue
- West China Biomedical Big Data Center, West China Hospital, Sichuan University, Chengdu, China
| | - Guojun Li
- Research Center for Mathematics and Interdisciplinary Sciences, Shandong University, Qingdao, China
- School of Mathematics, Shandong University, Jinan, China
- School of Mathematical Science, Liaocheng University, Liaocheng, China
| | - Zhaolei Zhang
- Terrence Donnelly Centre for Cellular and Biomolecular Research, University of Toronto, Toronto, Ontario, Canada
- Department of Molecular Genetics, University of Toronto, Toronto, Ontario, Canada
- Department of Computer Science, University of Toronto, Toronto, Ontario, Canada
|
39
|
Krueger A, Łyszkiewicz M, Heissmeyer V. Post-transcriptional control of T-cell development in the thymus. Immunol Lett 2022; 247:1-12. [DOI: 10.1016/j.imlet.2022.04.009] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 12/17/2021] [Revised: 03/18/2022] [Accepted: 04/26/2022] [Indexed: 11/05/2022]
|
40
|
Xu B, Zhu Y, Cao C, Chen H, Jin Q, Li G, Ma J, Yang SL, Zhao J, Zhu J, Ding Y, Fang X, Jin Y, Kwok CK, Ren A, Wan Y, Wang Z, Xue Y, Zhang H, Zhang QC, Zhou Y. Recent advances in RNA structurome. Sci China Life Sci 2022; 65:1285-1324. [PMID: 35717434 PMCID: PMC9206424 DOI: 10.1007/s11427-021-2116-2] [Citation(s) in RCA: 16] [Impact Index Per Article: 8.0] [Received: 03/10/2022] [Accepted: 04/01/2022] [Indexed: 12/27/2022]
Abstract
RNA structures are essential to support RNA functions and regulation in various biological processes. Recently, a range of novel technologies have been developed to decode genome-wide RNA structures and novel modes of functionality across a wide range of species. In this review, we summarize key strategies for probing the RNA structurome and discuss the pros and cons of representative technologies. In particular, these new technologies have been applied to dissect the structural landscape of the SARS-CoV-2 RNA genome. We also summarize the functionalities of RNA structures discovered in different regulatory layers (including RNA processing, transport, localization, and mRNA translation) across viruses, bacteria, animals, and plants. We review many versatile RNA structural elements in the context of different physiological and pathological processes (e.g., cell differentiation, stress response, and viral replication). Finally, we discuss future prospects for RNA structural studies to map the RNA structurome at higher resolution and at the single-molecule and single-cell level, and to decipher novel modes of RNA structures and functions for innovative applications.
Affiliation(s)
- Bingbing Xu
- MOE Laboratory of Biosystems Homeostasis & Protection, Innovation Center for Cell Signaling Network, College of Life Sciences, Zhejiang University, Hangzhou, 310058, China
| | - Yanda Zhu
- MOE Laboratory of Biosystems Homeostasis & Protection, Innovation Center for Cell Signaling Network, College of Life Sciences, Zhejiang University, Hangzhou, 310058, China
| | - Changchang Cao
- Key Laboratory of RNA Biology, Institute of Biophysics, Chinese Academy of Sciences, Beijing, 100101, China
| | - Hao Chen
- Life Sciences Institute, Zhejiang University, Hangzhou, 310058, China
| | - Qiongli Jin
- State Key Laboratory of Plant Physiology and Biochemistry, College of Life Sciences, Zhejiang University, Hangzhou, 310058, China
| | - Guangnan Li
- State Key Laboratory of Virology, College of Life Sciences, Wuhan University, Wuhan, 430072, China
| | - Junfeng Ma
- Beijing Advanced Innovation Center for Structural Biology, School of Life Sciences, Tsinghua University, Beijing, 100084, China
| | - Siwy Ling Yang
- Stem Cell and Regenerative Biology, Genome Institute of Singapore, A*STAR, Singapore, Singapore
| | - Jieyu Zhao
- Department of Chemistry, and State Key Laboratory of Marine Pollution, City University of Hong Kong, Kowloon Tong, Hong Kong SAR, China
| | - Jianghui Zhu
- MOE Key Laboratory of Bioinformatics, Beijing Advanced Innovation Center for Structural Biology and Frontier Research Center for Biological Structure, Center for Synthetic and Systems Biology, School of Life Sciences, Tsinghua University, Beijing, 100084, China
- Tsinghua-Peking Center for Life Sciences, Beijing, 100084, China
| | - Yiliang Ding
- Department of Cell and Developmental Biology, John Innes Centre, Norwich Research Park, Norwich, NR4 7UH, United Kingdom.
| | - Xianyang Fang
- Beijing Advanced Innovation Center for Structural Biology, School of Life Sciences, Tsinghua University, Beijing, 100084, China.
| | - Yongfeng Jin
- MOE Laboratory of Biosystems Homeostasis & Protection, Innovation Center for Cell Signaling Network, College of Life Sciences, Zhejiang University, Hangzhou, 310058, China.
| | - Chun Kit Kwok
- Department of Chemistry, and State Key Laboratory of Marine Pollution, City University of Hong Kong, Kowloon Tong, Hong Kong SAR, China.
- Shenzhen Research Institute of City University of Hong Kong, Shenzhen, 518057, China.
| | - Aiming Ren
- Life Sciences Institute, Zhejiang University, Hangzhou, 310058, China.
| | - Yue Wan
- Stem Cell and Regenerative Biology, Genome Institute of Singapore, A*STAR, Singapore, Singapore.
| | - Zhiye Wang
- State Key Laboratory of Plant Physiology and Biochemistry, College of Life Sciences, Zhejiang University, Hangzhou, 310058, China.
| | - Yuanchao Xue
- Key Laboratory of RNA Biology, Institute of Biophysics, Chinese Academy of Sciences, Beijing, 100101, China.
- University of Chinese Academy of Sciences, Beijing, 100101, China.
| | - Huakun Zhang
- Key Laboratory of Molecular Epigenetics of the Ministry of Education, Northeast Normal University, Changchun, 130024, China.
| | - Qiangfeng Cliff Zhang
- MOE Key Laboratory of Bioinformatics, Beijing Advanced Innovation Center for Structural Biology and Frontier Research Center for Biological Structure, Center for Synthetic and Systems Biology, School of Life Sciences, Tsinghua University, Beijing, 100084, China.
- Tsinghua-Peking Center for Life Sciences, Beijing, 100084, China.
| | - Yu Zhou
- State Key Laboratory of Virology, College of Life Sciences, Wuhan University, Wuhan, 430072, China.
|
41
|
Yu H, Qi Y, Ding Y. Deep Learning in RNA Structure Studies. Front Mol Biosci 2022; 9:869601. [PMID: 35677883 PMCID: PMC9168262 DOI: 10.3389/fmolb.2022.869601] [Citation(s) in RCA: 2] [Impact Index Per Article: 1.0] [Received: 02/04/2022] [Accepted: 05/04/2022] [Indexed: 01/27/2023] Open
Abstract
Deep learning, or artificial neural networks, is a type of machine learning algorithm that can decipher underlying relationships from large volumes of data and has been successfully applied to solve structural biology questions, such as RNA structure. RNA can fold into complex RNA structures by forming hydrogen bonds, thereby playing an essential role in biological processes. While experimental effort has enabled resolving RNA structure at the genome-wide scale, deep learning has been more recently introduced for studying RNA structure and its functionality. Here, we discuss successful applications of deep learning to solve RNA problems, including predictions of RNA structures, non-canonical G-quadruplex, RNA-protein interactions and RNA switches. Following these cases, we give a general guide to deep learning for solving RNA structure problems.
Affiliation(s)
- Haopeng Yu
- Department of Cell and Developmental Biology, John Innes Centre, Norwich Research Park, Norwich, United Kingdom
- Yiliang Ding
- Department of Cell and Developmental Biology, John Innes Centre, Norwich Research Park, Norwich, United Kingdom
|
42
|
Ferrero-Serrano Á, Sylvia MM, Forstmeier PC, Olson AJ, Ware D, Bevilacqua PC, Assmann SM. Experimental demonstration and pan-structurome prediction of climate-associated riboSNitches in Arabidopsis. Genome Biol 2022; 23:101. [PMID: 35440059 PMCID: PMC9017077 DOI: 10.1186/s13059-022-02656-4] [Citation(s) in RCA: 11] [Impact Index Per Article: 5.5] [Received: 10/31/2021] [Accepted: 03/20/2022] [Indexed: 11/23/2022] Open
Abstract
Background: Genome-wide association studies (GWAS) aim to correlate phenotypic changes with genotypic variation. Upon transcription, single nucleotide variants (SNVs) may alter mRNA structure, with potential impacts on transcript stability, macromolecular interactions, and translation. However, plant genomes have not been assessed for the presence of these structure-altering polymorphisms or “riboSNitches.”
Results: We experimentally demonstrate the presence of riboSNitches in transcripts of two Arabidopsis genes, ZINC RIBBON 3 (ZR3) and COTTON GOLGI-RELATED 3 (CGR3), which are associated with continentality and temperature variation in the natural environment. These riboSNitches are also associated with differences in the abundance of their respective transcripts, implying a role in regulating the gene's expression in adaptation to local climate conditions. We then computationally predict riboSNitches transcriptome-wide in mRNAs of 879 naturally inbred Arabidopsis accessions. We characterize correlations between SNPs/riboSNitches in these accessions and 434 climate descriptors of their local environments, suggesting a role of these variants in local adaptation. We integrate this information in CLIMtools V2.0 and provide a new web resource, T-CLIM, that reveals associations between transcript abundance variation and local environmental variation.
Conclusion: We functionally validate two plant riboSNitches and, for the first time, demonstrate riboSNitch conditionality dependent on temperature, coining the term “conditional riboSNitch.” We provide the first pan-genome-wide prediction of riboSNitches in plants. We expand our previous CLIMtools web resource with riboSNitch information and with 1868 additional Arabidopsis genomes and 269 additional climate conditions, which will greatly facilitate in silico studies of natural genetic variation, its phenotypic consequences, and its role in local adaptation.
Supplementary Information: The online version contains supplementary material available at 10.1186/s13059-022-02656-4.
Affiliation(s)
- Ángel Ferrero-Serrano
- Department of Biology, Pennsylvania State University, University Park, State College, PA, 16802, USA.
- Megan M Sylvia
- Department of Biology, Pennsylvania State University, University Park, State College, PA, 16802, USA
- Peter C Forstmeier
- Department of Biochemistry, Microbiology, and Molecular Biology, Pennsylvania State University, University Park, State College, PA, 16802, USA
- Andrew J Olson
- Cold Spring Harbor Laboratory, Cold Spring Harbor, NY, 11724, USA
- Doreen Ware
- Cold Spring Harbor Laboratory, Cold Spring Harbor, NY, 11724, USA; USDA ARS NAA Robert W. Holley Center for Agriculture and Health, Ithaca, NY, 14853, USA
- Philip C Bevilacqua
- Department of Biochemistry, Microbiology, and Molecular Biology, Pennsylvania State University, University Park, State College, PA, 16802, USA; Department of Chemistry, Pennsylvania State University, University Park, State College, PA, 16802, USA; Center for RNA Molecular Biology, Pennsylvania State University, University Park, State College, PA, 16802, USA
- Sarah M Assmann
- Department of Biology, Pennsylvania State University, University Park, State College, PA, 16802, USA; Center for RNA Molecular Biology, Pennsylvania State University, University Park, State College, PA, 16802, USA.
|
43
|
Marcia M. The multiple molecular dimensions of long noncoding RNAs that regulate gene expression and tumorigenesis. Curr Opin Oncol 2022; 34:141-147. [PMID: 35025816 DOI: 10.1097/cco.0000000000000813] [Citation(s) in RCA: 5] [Impact Index Per Article: 2.5] [Indexed: 11/26/2022]
Abstract
PURPOSE OF REVIEW: LncRNAs are emerging as key regulators of gene expression and they ensure homeostasis during cell differentiation and development, replication, and adaptation to the environment. Because of their key central role in regulating the biology of living cells, it is crucial to characterize how lncRNAs function at the genetic, transcriptomic, and mechanistic level.
RECENT FINDINGS: The low endogenous abundance and high molecular complexity of lncRNAs pose unique challenges for their characterization but new methodological advances in biochemistry, biophysics and cell biology have recently made it possible to characterize an increasing number of these transcripts, including oncogenic and tumor suppressor lncRNAs. These recent studies specifically address important issues that had remained controversial, such as the selectivity of lncRNA mechanisms of action, the functional importance of lncRNA sequences, secondary and tertiary structures, and the specificity of lncRNA interactions with proteins.
SUMMARY: These recent achievements, coupled to population-wide medical and genomic approaches that connect lncRNAs with human diseases and to recent advances in RNA-targeted drug development, open unprecedented new perspectives for exploiting lncRNAs as pharmacological targets or biomarkers to monitor and cure cancer, in addition to metabolic, developmental and cardiovascular diseases.
Affiliation(s)
- Marco Marcia
- European Molecular Biology Laboratory (EMBL) Grenoble, Grenoble, France
|
44
|
Comparison of viral RNA-host protein interactomes across pathogenic RNA viruses informs rapid antiviral drug discovery for SARS-CoV-2. Cell Res 2022; 32:9-23. [PMID: 34737357 PMCID: PMC8566969 DOI: 10.1038/s41422-021-00581-y] [Citation(s) in RCA: 52] [Impact Index Per Article: 26.0] [Received: 06/02/2021] [Accepted: 09/23/2021] [Indexed: 12/27/2022] Open
Abstract
In contrast to the extensive research about viral protein-host protein interactions that has revealed major insights about how RNA viruses engage with host cells during infection, few studies have examined interactions between host factors and viral RNAs (vRNAs). Here, we profiled vRNA-host protein interactomes for three RNA virus pathogens (SARS-CoV-2, Zika, and Ebola viruses) using ChIRP-MS. Comparative interactome analyses discovered both common and virus-specific host responses and vRNA-associated proteins that variously promote or restrict viral infection. In particular, SARS-CoV-2 binds and hijacks the host factor IGF2BP1 to stabilize vRNA and augment viral translation. Our interactome-informed drug repurposing efforts identified several FDA-approved drugs (e.g., Cepharanthine) as broad-spectrum antivirals in cells and hACE2 transgenic mice. A co-treatment comprising Cepharanthine and Trifluoperazine was highly potent against the newly emerged SARS-CoV-2 B.1.351 variant. Thus, our study illustrates the scientific and medical discovery utility of adopting a comparative vRNA-host protein interactome perspective.
|
45
|
Wang J, Zhang T, Yu Z, Tan WT, Wen M, Shen Y, Lambert FRP, Huber RG, Wan Y. Genome-wide RNA structure changes during human neurogenesis modulate gene regulatory networks. Mol Cell 2021; 81:4942-4953.e8. [PMID: 34655516 DOI: 10.1016/j.molcel.2021.09.027] [Citation(s) in RCA: 10] [Impact Index Per Article: 3.3] [Received: 03/18/2021] [Revised: 08/01/2021] [Accepted: 09/24/2021] [Indexed: 11/18/2022]
Abstract
The distribution, dynamics, and function of RNA structures in human development are under-explored. Here, we systematically assayed RNA structural dynamics and their relationship with gene expression, translation, and decay during human neurogenesis. We observed that the human ESC transcriptome is globally more structurally accessible than differentiated cells and undergoes extensive RNA structure changes, particularly in the 3' UTR. Additionally, RNA structure changes during differentiation are associated with translation and decay. We observed that RBP and miRNA binding is associated with RNA structural changes during early neuronal differentiation, and splicing is associated during later neuronal differentiation. Furthermore, our analysis suggests that RBPs are major factors in structure remodeling and co-regulate additional RBPs and miRNAs through structure. We demonstrated an example of this by showing that PUM2-induced structure changes on LIN28A enable miR-30 binding. This study deepens our understanding of the widespread and complex role of RNA-based gene regulation during human development.
Affiliation(s)
- Jiaxu Wang
- Stem Cell and Regenerative Biology, Genome Institute of Singapore, A*STAR, Singapore 138672, Singapore
- Tong Zhang
- Stem Cell and Regenerative Biology, Genome Institute of Singapore, A*STAR, Singapore 138672, Singapore
- Zhang Yu
- Stem Cell and Regenerative Biology, Genome Institute of Singapore, A*STAR, Singapore 138672, Singapore
- Wen Ting Tan
- Stem Cell and Regenerative Biology, Genome Institute of Singapore, A*STAR, Singapore 138672, Singapore
- Ming Wen
- Stem Cell and Regenerative Biology, Genome Institute of Singapore, A*STAR, Singapore 138672, Singapore
- Yang Shen
- Stem Cell and Regenerative Biology, Genome Institute of Singapore, A*STAR, Singapore 138672, Singapore
- Finnlay R P Lambert
- Stem Cell and Regenerative Biology, Genome Institute of Singapore, A*STAR, Singapore 138672, Singapore; Division of Biomedical Sciences, Warwick Medical School, University of Warwick, Coventry, UK
- Roland G Huber
- Bioinformatics Institute, A*STAR, Singapore 138671, Singapore
- Yue Wan
- Stem Cell and Regenerative Biology, Genome Institute of Singapore, A*STAR, Singapore 138672, Singapore; Department of Biochemistry, National University of Singapore, Singapore 117596, Singapore.
|
46
|
Zhao S, Hamada M. Multi-resBind: a residual network-based multi-label classifier for in vivo RNA binding prediction and preference visualization. BMC Bioinformatics 2021; 22:554. [PMID: 34781902 PMCID: PMC8594109 DOI: 10.1186/s12859-021-04430-y] [Citation(s) in RCA: 4] [Impact Index Per Article: 1.3] [Received: 05/24/2021] [Accepted: 10/06/2021] [Indexed: 11/10/2022] Open
Abstract
BACKGROUND: Protein-RNA interactions play key roles in many processes regulating gene expression. To understand the underlying binding preference, ultraviolet cross-linking and immunoprecipitation (CLIP)-based methods have been used to identify the binding sites for hundreds of RNA-binding proteins (RBPs) in vivo. Using these large-scale experimental data to infer RNA binding preference and predict missing binding sites has become a great challenge. Some existing deep-learning models have demonstrated high prediction accuracy for individual RBPs. However, it remains difficult to avoid significant bias due to the experimental protocol. The DeepRiPe method was recently developed to solve this problem via introducing multi-task or multi-label learning into this field. However, this method has not reached an ideal level of prediction power due to the weak neural network architecture.
RESULTS: Compared to the DeepRiPe approach, our Multi-resBind method demonstrated substantial improvements using the same large-scale PAR-CLIP dataset with respect to an increase in the area under the receiver operating characteristic curve and average precision. We conducted extensive experiments to evaluate the impact of various types of input data on the final prediction accuracy. The same approach was used to evaluate the effect of loss functions. Finally, a modified integrated gradient was employed to generate attribution maps. The patterns disentangled from relative contributions according to context offer biological insights into the underlying mechanism of protein-RNA interactions.
CONCLUSIONS: Here, we propose Multi-resBind as a new multi-label deep-learning approach to infer protein-RNA binding preferences and predict novel interactions. The results clearly demonstrate that Multi-resBind is a promising tool to predict unknown binding sites in vivo and gain biology insights into why the neural network makes a given prediction.
Affiliation(s)
- Shitao Zhao
- Waseda Research Institute for Science and Engineering, Waseda University, 3-4-1 Okubo Shinjuku-ku, Tokyo, 169-8555, Japan.
- Michiaki Hamada
- Department of Electrical Engineering and Bioscience, Faculty of Science and Engineering, Waseda University, 3-4-1 Okubo Shinjuku-ku, Tokyo, 169-8555, Japan; Computational Bio Big-Data Open Innovation Laboratory (CBBD-OIL), National Institute of Advanced Industrial Science and Technology, 3-4-1 Okubo Shinjuku-ku, Tokyo, 169-8555, Japan; Graduate School of Medicine, Nippon Medical School, 1-1-5 Sendagi, Bunkyo-ku, Tokyo, 113-8602, Japan.
|
47
|
Manigrasso J, Marcia M, De Vivo M. Computer-aided design of RNA-targeted small molecules: A growing need in drug discovery. Chem 2021. [DOI: 10.1016/j.chempr.2021.05.021] [Citation(s) in RCA: 10] [Impact Index Per Article: 3.3] [Indexed: 11/15/2022]
|
48
|
Radecki P, Uppuluri R, Deshpande K, Aviran S. Accurate detection of RNA stem-loops in structurome data reveals widespread association with protein binding sites. RNA Biol 2021; 18:521-536. [PMID: 34606413 PMCID: PMC8677038 DOI: 10.1080/15476286.2021.1971382] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 10/31/2022] Open
Abstract
RNA molecules are known to fold into specific structures which often play a central role in their functions and regulation. In silico folding of RNA transcripts, especially when assisted with structure profiling (SP) data, is capable of accurately elucidating relevant structural conformations. However, such methods scale poorly to the swaths of SP data generated by transcriptome-wide experiments, which are becoming more commonplace and advancing our understanding of RNA structure and its regulation at global and local levels. This has created a need for tools capable of rapidly deriving structural assessments from SP data in a scalable manner. One such tool we previously introduced that aims to process such data is patteRNA, a statistical learning algorithm capable of rapidly mining big SP datasets for structural elements. Here, we present a reformulation of patteRNA's pattern recognition scheme that sees significantly improved precision without major compromises to computational overhead. Specifically, we developed a data-driven logistic classifier which interprets patteRNA's statistical characterizations of SP data in addition to local sequence properties as measured with a nearest neighbour thermodynamic model. Application of the classifier to human structurome data reveals a marked association between detected stem-loops and RNA binding protein (RBP) footprints. The results of our application demonstrate that upwards of 30% of RBP footprints occur within loops of stable stem-loop elements. Overall, our work arrives at a rapid and accurate method for automatically detecting families of RNA structure motifs and demonstrates the functional relevance of identifying them transcriptome-wide.
Affiliation(s)
- Pierce Radecki
- Biomedical Engineering Department and Genome Center, University of California, Davis, CA, USA
- Rahul Uppuluri
- Biomedical Engineering Department and Genome Center, University of California, Davis, CA, USA
- Kaustubh Deshpande
- Biomedical Engineering Department and Genome Center, University of California, Davis, CA, USA
- Sharon Aviran
- Biomedical Engineering Department and Genome Center, University of California, Davis, CA, USA
|
49
|
Wang XW, Liu CX, Chen LL, Zhang QC. RNA structure probing uncovers RNA structure-dependent biological functions. Nat Chem Biol 2021; 17:755-766. [PMID: 34172967 DOI: 10.1038/s41589-021-00805-7] [Citation(s) in RCA: 49] [Impact Index Per Article: 16.3] [Received: 10/05/2020] [Accepted: 04/23/2021] [Indexed: 01/22/2023]
Abstract
RNA molecules fold into complex structures that enable their diverse functions in cells. Recent revolutionary innovations in transcriptome-wide RNA structural probing of living cells have ushered in a new era in understanding RNA functions. Here, we summarize the latest technological advances for probing RNA secondary structures and discuss striking discoveries that have linked RNA regulation and biological processes through interrogation of RNA structures. In particular, we highlight how different long noncoding RNAs form into distinct secondary structures that determine their modes of interactions with protein partners to realize their unique functions. These dynamic structures mediate RNA regulatory functions through altering interactions with proteins and other RNAs. We also outline current methodological hurdles and speculate about future directions for development of the next generation of RNA structure-probing technologies of higher sensitivity and resolution, which could then be applied in increasingly physiologically relevant studies.
Affiliation(s)
- Xi-Wen Wang
- MOE Key Laboratory of Bioinformatics, Beijing Advanced Innovation Center for Structural Biology and Frontier Research Center for Biological Structure, Center for Synthetic and Systems Biology, School of Life Sciences, Tsinghua University, Beijing, China; Tsinghua-Peking Center for Life Sciences, Beijing, China
- Chu-Xiao Liu
- State Key Laboratory of Molecular Biology, Shanghai Key Laboratory of Molecular Andrology, CAS Center for Excellence in Molecular Cell Science, Shanghai Institute of Biochemistry and Cell Biology, University of the Chinese Academy of Sciences, Chinese Academy of Sciences, Shanghai, China
- Ling-Ling Chen
- State Key Laboratory of Molecular Biology, Shanghai Key Laboratory of Molecular Andrology, CAS Center for Excellence in Molecular Cell Science, Shanghai Institute of Biochemistry and Cell Biology, University of the Chinese Academy of Sciences, Chinese Academy of Sciences, Shanghai, China; School of Life Science and Technology, ShanghaiTech University, Shanghai, China; School of Life Sciences, Hangzhou Institute for Advanced Study, University of the Chinese Academy of Sciences, Hangzhou, China.
- Qiangfeng Cliff Zhang
- MOE Key Laboratory of Bioinformatics, Beijing Advanced Innovation Center for Structural Biology and Frontier Research Center for Biological Structure, Center for Synthetic and Systems Biology, School of Life Sciences, Tsinghua University, Beijing, China; Tsinghua-Peking Center for Life Sciences, Beijing, China.
|
50
|
Rapidly Growing Protein-Centric Technologies to Extensively Identify Protein-RNA Interactions: Application to the Analysis of Co-Transcriptional RNA Processing. Int J Mol Sci 2021; 22:ijms22105312. [PMID: 34070162 PMCID: PMC8158511 DOI: 10.3390/ijms22105312] [Citation(s) in RCA: 4] [Impact Index Per Article: 1.3] [Received: 04/20/2021] [Revised: 05/14/2021] [Accepted: 05/15/2021] [Indexed: 12/11/2022] Open
Abstract
During mRNA transcription, diverse RNA-binding proteins (RBPs) are recruited to RNA polymerase II (RNAP II) transcription machinery. These RBPs bind to distinct sites of nascent RNA to co-transcriptionally operate mRNA processing. Recent studies have revealed a close relationship between transcription and co-transcriptional RNA processing, where one affects the other’s activity, indicating an essential role of protein–RNA interactions for the fine-tuning of mRNA production. Owing to their limited amount in cells, the detection of protein–RNA interactions specifically assembled on the transcribing RNAP II machinery still remains challenging. Currently, cross-linking and immunoprecipitation (CLIP) has become a standard method to detect in vivo protein–RNA interactions, although it requires a large amount of input materials. Several improved methods, such as infrared-CLIP (irCLIP), enhanced CLIP (eCLIP), and target RNA immunoprecipitation (tRIP), have shown remarkable enhancements in the detection efficiency. Furthermore, the utilization of an RNA editing mechanism or proximity labeling strategy has achieved the detection of faint protein–RNA interactions in cells without depending on crosslinking. This review aims to explore various methods being developed to detect endogenous protein–RNA interaction sites and discusses how they may be applied to the analysis of co-transcriptional RNA processing.
|