1. Zhou Y, Chen SJ. Advances in machine-learning approaches to RNA-targeted drug design. Artificial Intelligence Chemistry 2024; 2:100053. [PMID: 38434217 PMCID: PMC10904028 DOI: 10.1016/j.aichem.2024.100053] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 03/05/2024]
Abstract
RNA molecules play multifaceted functional and regulatory roles within cells and have garnered significant attention in recent years as promising therapeutic targets. With remarkable successes achieved by artificial intelligence (AI) in different fields such as computer vision and natural language processing, there is a growing imperative to harness AI's potential in computer-aided drug design (CADD) to discover novel drug compounds that target RNA. Although machine-learning (ML) approaches have been widely adopted in the discovery of small molecules targeting proteins, the application of ML approaches to model interactions between RNA and small molecule is still in its infancy. Compared to protein-targeted drug discovery, the major challenges in ML-based RNA-targeted drug discovery stem from the scarcity of available data resources. With the growing interest and the development of curated databases focusing on interactions between RNA and small molecule, the field anticipates a rapid growth and the opening of a new avenue for disease treatment. In this review, we aim to provide an overview of recent advancements in computationally modeling RNA-small molecule interactions within the context of RNA-targeted drug discovery, with a particular emphasis on methodologies employing ML techniques.
Affiliation(s)
- Yuanzhe Zhou
  - Department of Physics and Astronomy, University of Missouri, Columbia, MO 65211-7010, USA
- Shi-Jie Chen
  - Department of Physics and Astronomy, Department of Biochemistry, Institute of Data Sciences and Informatics, University of Missouri, Columbia, MO 65211-7010, USA
2. Morishita EC, Nakamura S. Recent applications of artificial intelligence in RNA-targeted small molecule drug discovery. Expert Opin Drug Discov 2024; 19:415-431. [PMID: 38321848 DOI: 10.1080/17460441.2024.2313455] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 10/31/2023] [Accepted: 01/30/2024] [Indexed: 02/08/2024]
Abstract
INTRODUCTION: Targeting RNAs with small molecules offers an alternative to the conventional protein-targeted drug discovery and can potentially address unmet and emerging medical needs. The recent rise of interest in the strategy has already resulted in large amounts of data on disease-associated RNAs, as well as on small molecules that bind to such RNAs. Artificial intelligence (AI) approaches, including machine learning and deep learning, present an opportunity to speed up the discovery of RNA-targeted small molecules by improving decision-making efficiency and quality.
AREAS COVERED: The topics described in this review include the recent applications of AI in the identification of RNA targets, RNA structure determination, screening of chemical compound libraries, and hit-to-lead optimization. The impact and limitations of the recent AI applications are discussed, along with an outlook on the possible applications of next-generation AI tools for the discovery of novel RNA-targeted small molecule drugs.
EXPERT OPINION: Key areas for improvement include developing AI tools for understanding RNA dynamics and RNA-small molecule interactions. High-quality and comprehensive data still need to be generated, especially on the biological activity of small molecules that target RNAs.
3. Sun S, Gao L. Contrastive pre-training and 3D convolution neural network for RNA and small molecule binding affinity prediction. Bioinformatics 2024; 40:btae155. [PMID: 38507691 PMCID: PMC11007238 DOI: 10.1093/bioinformatics/btae155] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 01/18/2024] [Revised: 02/23/2024] [Accepted: 03/18/2024] [Indexed: 03/22/2024]
Abstract
MOTIVATION: The diverse structures and functions inherent in RNAs present a wealth of potential drug targets. Some small molecules are anticipated to serve as leading compounds, providing guidance for the development of novel RNA-targeted therapeutics. Consequently, the determination of RNA-small molecule binding affinity is a critical undertaking in the landscape of RNA-targeted drug discovery and development. Nevertheless, to date, only one computational method for RNA-small molecule binding affinity prediction has been proposed. The prediction of RNA-small molecule binding affinity remains a significant challenge. The development of a computational model is deemed essential to effectively extract relevant features and predict RNA-small molecule binding affinity accurately.
RESULTS: In this study, we introduced RLaffinity, a novel deep learning model designed for the prediction of RNA-small molecule binding affinity based on 3D structures. RLaffinity integrated information from RNA pockets and small molecules, utilizing a 3D convolutional neural network (3D-CNN) coupled with a contrastive learning-based self-supervised pre-training model. To the best of our knowledge, RLaffinity was the first deep learning-based method for the prediction of RNA-small molecule binding affinity. Our experimental results showed that RLaffinity outperformed the baseline methods on all metrics. The efficacy of RLaffinity underscores the capability of 3D-CNN to accurately extract both global pocket information and local neighbor nucleotide information within RNAs. Notably, the integration of a self-supervised pre-training model significantly enhanced predictive performance. Finally, RLaffinity also proved to be a potential tool for virtual screening of RNA-targeted drugs.
AVAILABILITY AND IMPLEMENTATION: https://github.com/SaisaiSun/RLaffinity.
Affiliation(s)
- Saisai Sun
  - School of Computer Science and Technology, Xidian University, No.266 Xinglong Section of Xi Feng Road, Xi’an, Shaanxi, 710126, China
- Lin Gao
  - School of Computer Science and Technology, Xidian University, No.266 Xinglong Section of Xi Feng Road, Xi’an, Shaanxi, 710126, China
4. Zhang L, Xiao K, Kong L. A computational method for small molecule-RNA binding sites identification by utilizing position specificity and complex network information. Biosystems 2024; 235:105094. [PMID: 38056591 DOI: 10.1016/j.biosystems.2023.105094] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 09/03/2023] [Revised: 11/23/2023] [Accepted: 11/24/2023] [Indexed: 12/08/2023]
Abstract
Several computational methods have been proposed for identifying small molecule-RNA binding sites, a task that plays a significant role in elucidating biological function. However, designing an accurate model remains challenging, especially as measured by the Matthews correlation coefficient (MCC). We designed a feature extraction technique based on two aspects: position specificity and complex network information. Specifically, a complex network was employed to represent the spatial topological structure and sequence position information in order to improve prediction performance. The features fusing position specificity and complex network information were then input into a random forest classifier for model construction. AUC values of 88.22%, 77.92% and 81.46% were obtained on three independent datasets (RB19, CS71, RB78). Compared with existing methods, the best MCC values were obtained on all three datasets, exceeding those of the state-of-the-art prediction methods by 8.19%, 0.59% and 4.35%, respectively. This performance shows that our method is a powerful tool for identifying RNA binding sites and can aid the design of RNA-targeting small molecule drugs. The data and source code are available at https://github.com/Kangxiaoneuq/PCN_RNAsite.
Affiliation(s)
- Lichao Zhang
  - School of Mathematics and Statistics, Northeastern University at Qinhuangdao, Qinhuangdao, 066000, PR China; Hebei Innovation Center for Smart Perception and Applied Technology of Agricultural Data, Qinhuangdao, 066000, PR China
- Kang Xiao
  - School of Mathematics and Statistics, Northeastern University at Qinhuangdao, Qinhuangdao, 066000, PR China
- Liang Kong
  - Hebei Innovation Center for Smart Perception and Applied Technology of Agricultural Data, Qinhuangdao, 066000, PR China; School of Mathematics and Information Science & Technology, Hebei Normal University of Science & Technology, Qinhuangdao, 066000, PR China
5. Liu H, Jian Y, Hou J, Zeng C, Zhao Y. RNet: a network strategy to predict RNA binding preferences. Brief Bioinform 2023; 25:bbad482. [PMID: 38145947 PMCID: PMC10749790 DOI: 10.1093/bib/bbad482] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 09/05/2023] [Revised: 11/15/2023] [Accepted: 12/05/2023] [Indexed: 12/27/2023]
Abstract
Determining the RNA binding preferences remains challenging because of the bottleneck of the binding interactions accompanied by subtle RNA flexibility. Typically, designing RNA inhibitors involves screening thousands of potential candidates for binding. Accurate binding site information can increase the number of successful hits even with few candidates. There are two main issues regarding RNA binding preference: binding site prediction and binding dynamical behavior prediction. Here, we propose one interpretable network-based approach, RNet, to acquire precise binding site and binding dynamical behavior information. RNetsite employs a machine learning-based network decomposition algorithm to predict RNA binding sites by analyzing the local and global network properties. Our research focuses on large RNAs with 3D structures without considering smaller regulatory RNAs, which are too small and dynamic. Our study shows that RNetsite outperforms existing methods, achieving precision values as high as 0.701 on TE18 and 0.788 on RB9 tests. In addition, RNetsite demonstrates remarkable robustness regarding perturbations in RNA structures. We also developed RNetdyn, a distance-based dynamical graph algorithm, to characterize the interface dynamical behavior consequences upon inhibitor binding. The simulation testing of competitive inhibitors indicates that RNetdyn outperforms the traditional method by 30%. The benchmark testing results demonstrate that RNet is highly accurate and robust. Our interpretable network algorithms can assist in predicting RNA binding preferences and accelerating RNA inhibitor design, providing valuable insights to the RNA research community.
Affiliation(s)
- Haoquan Liu
  - Institute of Biophysics and Department of Physics, Central China Normal University, Wuhan, 430079, China
- Yiren Jian
  - Department of Computer Science, Dartmouth College, Hanover, NH 03755, USA
- Jinxuan Hou
  - Department of Thyroid and Breast Surgery, Zhongnan Hospital of Wuhan University, Wuhan 430071, China
- Chen Zeng
  - Department of Physics, The George Washington University, Washington, DC 20052, USA
- Yunjie Zhao
  - Institute of Biophysics and Department of Physics, Central China Normal University, Wuhan, 430079, China
6. Fan R, Ji X, Li J, Cui Q, Cui C. Defining the single base importance of human mRNAs and lncRNAs. Brief Bioinform 2023; 24:bbad321. [PMID: 37668090 DOI: 10.1093/bib/bbad321] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 05/17/2023] [Revised: 07/28/2023] [Accepted: 08/16/2023] [Indexed: 09/06/2023]
Abstract
As the fundamental unit of a gene and its transcripts, nucleotides have enormous impacts on the gene function and evolution, and thus on phenotypes and diseases. In order to identify the key nucleotides of one specific gene, it is quite crucial to quantitatively measure the importance of each base on the gene. However, there are still no sequence-based methods of doing that. Here, we proposed Base Importance Calculator (BIC), an algorithm to calculate the importance score of each single base based on sequence information of human mRNAs and long noncoding RNAs (lncRNAs). We then confirmed its power by applying BIC to three different tasks. Firstly, we revealed that BIC can effectively evaluate the pathogenicity of both genes and single bases through single nucleotide variations. Moreover, the BIC score in The Cancer Genome Atlas somatic mutations is able to predict the prognosis of some cancers. Finally, we show that BIC can also precisely predict the transmissibility of SARS-CoV-2. The above results indicate that BIC is a useful tool for evaluating the single base importance of human mRNAs and lncRNAs.
Affiliation(s)
- Rui Fan
  - Department of Biomedical Informatics, Department of Physiology and Pathophysiology, Center for Noncoding RNA Medicine, State Key Lab of Vascular Homeostasis and Remodeling, School of Basic Medical Sciences, Peking University, 38 Xueyuan Rd, Beijing, 100191, China
- Xiangwen Ji
  - Department of Biomedical Informatics, Department of Physiology and Pathophysiology, Center for Noncoding RNA Medicine, State Key Lab of Vascular Homeostasis and Remodeling, School of Basic Medical Sciences, Peking University, 38 Xueyuan Rd, Beijing, 100191, China
- Jianwei Li
  - Institute of Computational Medicine, School of Artificial Intelligence, Hebei University of Technology, Tianjin, 300401, China
- Qinghua Cui
  - Department of Biomedical Informatics, Department of Physiology and Pathophysiology, Center for Noncoding RNA Medicine, State Key Lab of Vascular Homeostasis and Remodeling, School of Basic Medical Sciences, Peking University, 38 Xueyuan Rd, Beijing, 100191, China
  - School of Sports Medicine, Wuhan Institute of Physical Education, No.461 Luoyu Rd. Wuchang District, Wuhan 430079, Hubei Province, China
- Chunmei Cui
  - Department of Biomedical Informatics, Department of Physiology and Pathophysiology, Center for Noncoding RNA Medicine, State Key Lab of Vascular Homeostasis and Remodeling, School of Basic Medical Sciences, Peking University, 38 Xueyuan Rd, Beijing, 100191, China
7. Wang K, Zhou R, Wu Y, Li M. RLBind: a deep learning method to predict RNA-ligand binding sites. Brief Bioinform 2023; 24:6832814. [PMID: 36398911 DOI: 10.1093/bib/bbac486] [Citation(s) in RCA: 4] [Impact Index Per Article: 4.0] [Received: 06/15/2022] [Revised: 09/28/2022] [Accepted: 10/14/2022] [Indexed: 11/19/2022]
Abstract
Identification of RNA-small molecule binding sites plays an essential role in RNA-targeted drug discovery and development. These small molecules are expected to be leading compounds that guide the development of new types of RNA-targeted therapeutics, in contrast to regular therapeutics targeting proteins. RNAs can provide many potential drug targets with diverse structures and functions. However, only a few methods have been proposed so far, and predicting RNA-small molecule binding sites remains a big challenge. A new computational model is required to better extract the features and predict RNA-small molecule binding sites more accurately. In this paper, a deep learning model, RLBind, was proposed to predict RNA-small molecule binding sites from sequence-dependent and structure-dependent properties by combining a global RNA sequence channel and a local neighbor nucleotides channel. To the best of our knowledge, this research was the first to develop a convolutional neural network for RNA-small molecule binding site prediction. Furthermore, RLBind can also be used as a potential tool when the experimental RNA tertiary structure is not available. The experimental results show that RLBind outperforms other state-of-the-art methods in predicting binding sites. Therefore, our study demonstrates that combining global information from full-length sequences with local information from a limited set of neighboring nucleotides in RNAs can improve the model's predictive performance for binding site prediction. All datasets and source code are available at https://github.com/KailiWang1/RLBind.
Affiliation(s)
- Kaili Wang
  - School of Computer Science and Engineering, Central South University, Changsha 410083, China
- Renyi Zhou
  - School of Computer Science and Engineering, Central South University, Changsha 410083, China
- Yifan Wu
  - School of Computer Science and Engineering, Central South University, Changsha 410083, China
- Min Li
  - School of Computer Science and Engineering, Central South University, Changsha 410083, China
8. Möller L, Guerci L, Isert C, Atz K, Schneider G. Translating from proteins to ribonucleic acids for ligand-binding site detection. Mol Inform 2022; 41:e2200059. [PMID: 35577762 DOI: 10.1002/minf.202200059] [Citation(s) in RCA: 3] [Impact Index Per Article: 1.5] [Received: 03/11/2022] [Accepted: 05/16/2022] [Indexed: 11/10/2022]
Abstract
Identifying druggable ligand-binding sites on the surface of the macromolecular targets is an important process in structure-based drug discovery. Deep-learning models have been shown to successfully predict ligand-binding sites of proteins. As a step toward predicting binding sites in RNA and RNA-protein complexes, we employ three-dimensional convolutional neural networks. We introduce a dataset splitting approach to minimize structure-related bias in training data, and investigate the influence of protein-based neural network pre-training before fine-tuning on RNA structures. Models that were pre-trained on proteins considerably outperformed the models that were trained exclusively on RNA structures. Overall, 71% of the known RNA binding sites were correctly located within 4 Å of their true centres with a structural overlap of at least 25%.
9. Zhou Y, Jiang Y, Chen SJ. RNA-ligand molecular docking: advances and challenges. Wiley Interdisciplinary Reviews. Computational Molecular Science 2022; 12:e1571. [PMID: 37293430 PMCID: PMC10250017 DOI: 10.1002/wcms.1571] [Citation(s) in RCA: 23] [Impact Index Per Article: 11.5] [Received: 03/26/2021] [Accepted: 07/20/2021] [Indexed: 12/16/2022]
Abstract
With rapid advances in computer algorithms and hardware, fast and accurate virtual screening has led to a drastic acceleration in selecting potent small molecules as drug candidates. Computational modeling of RNA-small molecule interactions has become an indispensable tool for RNA-targeted drug discovery. The current models for RNA-ligand binding have mainly focused on the docking-and-scoring method. Accurate docking and scoring should tackle four crucial problems: (1) conformational flexibility of ligand, (2) conformational flexibility of RNA, (3) efficient sampling of binding sites and binding poses, and (4) accurate scoring of different binding modes. Moreover, compared with the problem of protein-ligand docking, predicting ligand binding to RNA, a negatively charged polymer, is further complicated by additional effects such as metal ion effects. Thermodynamic models based on physics-based and knowledge-based scoring functions have shown highly encouraging success in predicting ligand binding poses and binding affinities. Recently, kinetic models for ligand binding have further suggested that including dissociation kinetics (residence time) in ligand docking would result in improved performance in estimating in vivo drug efficacy. More recently, the rise of deep-learning approaches has led to new tools for predicting RNA-small molecule binding. In this review, we present an overview of the recently developed computational methods for RNA-ligand docking and their advantages and disadvantages.
Affiliation(s)
- Yuanzhe Zhou
  - Department of Physics and Astronomy, Department of Biochemistry, Institute of Data Sciences and Informatics, University of Missouri, Columbia, MO 65211-7010, USA
- Yangwei Jiang
  - Department of Physics and Astronomy, Department of Biochemistry, Institute of Data Sciences and Informatics, University of Missouri, Columbia, MO 65211-7010, USA
- Shi-Jie Chen
  - Department of Physics and Astronomy, Department of Biochemistry, Institute of Data Sciences and Informatics, University of Missouri, Columbia, MO 65211-7010, USA
10. Kozlovskii I, Popov P. Structure-based deep learning for binding site detection in nucleic acid macromolecules. NAR Genom Bioinform 2021; 3:lqab111. [PMID: 34859211 PMCID: PMC8633674 DOI: 10.1093/nargab/lqab111] [Citation(s) in RCA: 5] [Impact Index Per Article: 1.7] [Received: 08/19/2021] [Revised: 10/14/2021] [Accepted: 11/09/2021] [Indexed: 12/30/2022]
Abstract
Structure-based drug design (SBDD) targeting nucleic acid macromolecules, particularly RNA, is a research direction gaining momentum that has already resulted in several FDA-approved compounds. As with proteins, one of the critical components in SBDD for RNA is the correct identification of the binding sites for putative drug candidates. RNAs share a common structural organization that, together with the dynamic nature of these molecules, makes it challenging to recognize binding sites for small molecules. Moreover, there is a need for structure-based approaches, as sequence information alone does not account for the conformational plasticity of nucleic acid macromolecules. Deep learning holds great promise for solving the binding site detection problem, but requires a large amount of structural data, which is very limited for nucleic acids compared to proteins. In this study, we composed a set of ∼2000 nucleic acid-small molecule structures comprising ∼2500 binding sites, which is ∼40 times larger than the previously used one, and demonstrated the first structure-based deep learning approach, BiteNetN, to detect binding sites in nucleic acid structures. BiteNetN operates with arbitrary nucleic acid complexes, shows state-of-the-art performance, and can be helpful in the analysis of different conformations and mutant variants, as we demonstrated for HIV-1 TAR RNA and ATP-aptamer case studies.
Affiliation(s)
- Igor Kozlovskii
  - iMolecule, Skolkovo Institute of Science and Technology, Moscow, 121205, Russia
- Petr Popov
  - iMolecule, Skolkovo Institute of Science and Technology, Moscow, 121205, Russia
11. Feng Y, Yan Y, He J, Tao H, Wu Q, Huang SY. Docking and scoring for nucleic acid-ligand interactions: Principles and current status. Drug Discov Today 2021; 27:838-847. [PMID: 34718205 DOI: 10.1016/j.drudis.2021.10.013] [Citation(s) in RCA: 18] [Impact Index Per Article: 6.0] [Received: 05/07/2021] [Revised: 09/06/2021] [Accepted: 10/20/2021] [Indexed: 12/24/2022]
Abstract
Nucleic acid (NA)-ligand interactions have crucial roles in many cellular processes and, thus, are increasingly attracting therapeutic interest in drug discovery. Molecular docking is a valuable tool for studying molecular interactions. However, because NAs differ significantly from proteins in both their physical and chemical properties, traditional docking algorithms and scoring functions for protein-ligand interactions might not be applicable to NA-ligand docking. Therefore, various sampling strategies and scoring functions for NA-ligand interactions have been developed. Here, we review the basic principles and current status of docking algorithms and scoring functions for DNA/RNA-ligand interactions. We also discuss challenges and limitations of current docking and scoring approaches.
Affiliation(s)
- Yuyu Feng
  - School of Physics, Huazhong University of Science and Technology, Wuhan, Hubei 430074, PR China
- Yumeng Yan
  - School of Physics, Huazhong University of Science and Technology, Wuhan, Hubei 430074, PR China
- Jiahua He
  - School of Physics, Huazhong University of Science and Technology, Wuhan, Hubei 430074, PR China
- Huanyu Tao
  - School of Physics, Huazhong University of Science and Technology, Wuhan, Hubei 430074, PR China
- Qilong Wu
  - School of Physics, Huazhong University of Science and Technology, Wuhan, Hubei 430074, PR China
- Sheng-You Huang
  - School of Physics, Huazhong University of Science and Technology, Wuhan, Hubei 430074, PR China
12. Jiang Z, Xiao SR, Liu R. Dissecting and predicting different types of binding sites in nucleic acids based on structural information. Brief Bioinform 2021; 23:6384399. [PMID: 34624074 PMCID: PMC8769709 DOI: 10.1093/bib/bbab411] [Citation(s) in RCA: 4] [Impact Index Per Article: 1.3] [Received: 07/07/2021] [Revised: 08/26/2021] [Accepted: 09/07/2021] [Indexed: 12/16/2022]
Abstract
The biological functions of DNA and RNA generally depend on their interactions with other molecules, such as small ligands, proteins and nucleic acids. However, our knowledge of the nucleic acid binding sites for different interaction partners is very limited, and identification of these critical binding regions is not a trivial work. Herein, we performed a comprehensive comparison between binding and nonbinding sites and among different categories of binding sites in these two nucleic acid classes. From the structural perspective, RNA may interact with ligands through forming binding pockets and contact proteins and nucleic acids using protruding surfaces, while DNA may adopt regions closer to the middle of the chain to make contacts with other molecules. Based on structural information, we established a feature-based ensemble learning classifier to identify the binding sites by fully using the interplay among different machine learning algorithms, feature spaces and sample spaces. Meanwhile, we designed a template-based classifier by exploiting structural conservation. The complementarity between the two classifiers motivated us to build an integrative framework for improving prediction performance. Moreover, we utilized a post-processing procedure based on the random walk algorithm to further correct the integrative predictions. Our unified prediction framework yielded promising results for different binding sites and outperformed existing methods.
Affiliation(s)
- Zheng Jiang
  - College of Informatics, Huazhong Agricultural University, Wuhan, P. R. China
- Si-Rui Xiao
  - College of Informatics, Huazhong Agricultural University, Wuhan, P. R. China
- Rong Liu
  - College of Informatics, Huazhong Agricultural University, Wuhan, P. R. China
13. Xie J, Frank AT. Mining for Ligandable Cavities in RNA. ACS Med Chem Lett 2021; 12:928-934. [PMID: 34141071 DOI: 10.1021/acsmedchemlett.1c00068] [Citation(s) in RCA: 6] [Impact Index Per Article: 2.0] [Received: 02/01/2021] [Accepted: 05/18/2021] [Indexed: 12/23/2022]
Abstract
Identifying potential ligand binding cavities is a critical step in structure-based screening of biomolecular targets. Cavity mapping methods can detect such binding cavities; however, for ribonucleic acid (RNA) targets, determining which of the detected cavities are "ligandable" remains an unsolved challenge. In this study, we trained a set of machine learning classifiers to distinguish ligandable RNA cavities from decoy cavities. Application of our classifiers to two independent test sets demonstrated that we could recover ligandable cavities from decoys with an AUC > 0.83. Interestingly, when we applied our classifiers to a library of modeled structures of the HIV-1 transactivation response (TAR) element RNA, we found that several of the conformers that harbored cavities with high ligandability scores resembled known holo-TAR structures. On the basis of our results, we envision that our classifiers could find utility as a tool to parse RNA structures and prospectively mine for ligandable binding cavities and, in so doing, facilitate structure-based virtual screening efforts against RNA drug targets.
Affiliation(s)
- Jingru Xie
  - Department of Physics, University of Michigan, Ann Arbor, Michigan 48109, United States
- Aaron T. Frank
  - Biophysics Program, University of Michigan, Ann Arbor, Michigan 48109, United States
  - Department of Chemistry, University of Michigan, Ann Arbor, Michigan 48109, United States
14. Su H, Peng Z, Yang J. Recognition of small molecule-RNA binding sites using RNA sequence and structure. Bioinformatics 2021; 37:36-42. [PMID: 33416863 PMCID: PMC8034527 DOI: 10.1093/bioinformatics/btaa1092] [Citation(s) in RCA: 19] [Impact Index Per Article: 6.3] [Received: 11/12/2019] [Revised: 12/12/2020] [Accepted: 12/23/2020] [Indexed: 11/22/2022]
Abstract
Motivation: RNA molecules have become attractive small molecule drug targets for treating disease in recent years. Computer-aided drug design can be facilitated by detecting the RNA sites that bind small molecules. However, very limited progress has been reported for the prediction of small molecule–RNA binding sites.
Results: We developed a novel method, RNAsite, to predict small molecule–RNA binding sites using sequence profile- and structure-based descriptors. RNAsite was shown to be competitive with the state-of-the-art methods on the experimental structures of two independent test sets. When predicted structure models were used, RNAsite outperformed other methods by a large margin. The possibility of improving RNAsite by geometry-based binding pocket detection was investigated. The influence of RNA structural flexibility and of the conformational changes caused by ligand binding on RNAsite was also discussed. RNAsite is anticipated to be a useful tool for the design of RNA-targeting small molecule drugs.
Availability and implementation: http://yanglab.nankai.edu.cn/RNAsite.
Supplementary information: Supplementary data are available at Bioinformatics online.
Affiliation(s)
- Hong Su
  - School of Mathematical Sciences, Nankai University, Tianjin, 300071, China
- Zhenling Peng
  - Center for Applied Mathematics, Tianjin University, Tianjin, 300072, China
- Jianyi Yang
  - School of Mathematical Sciences, Nankai University, Tianjin, 300071, China
15. Wang H, Zhao Y. RBinds: A user-friendly server for RNA binding site prediction. Comput Struct Biotechnol J 2020; 18:3762-3765. [PMID: 34136090 PMCID: PMC8164131 DOI: 10.1016/j.csbj.2020.10.043] [Citation(s) in RCA: 3] [Impact Index Per Article: 0.8] [Received: 09/17/2020] [Revised: 10/27/2020] [Accepted: 10/31/2020] [Indexed: 12/03/2022]
Abstract
RNA performs various biological functions by interacting with other molecules. Knowledge of RNA binding sites is essential for understanding RNA-protein and RNA-ligand complex structures and their mechanisms. However, RNA binding site prediction currently requires tedious scripting and manual handling, and a user-friendly bioinformatics tool for the task has been missing. This limitation motivated us to develop RBinds, a user-friendly web server that predicts RNA binding sites through a simple graphical user interface. The advanced features implemented in RBinds include (1) transforming the RNA structure into a network automatically; (2) analyzing the structural network properties to predict binding sites; (3) constructing an annotated force-directed network; (4) providing a visualization tool for users to scale and rotate the structure; (5) offering related tools to predict or simulate RNA structures. The RBinds web server is a reliable and user-friendly tool that facilitates the study of RNA binding sites without installing programs locally. RBinds is freely accessible at http://zhaoserver.com.cn/RBinds/RBinds.html.
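Features (1) and (2) above can be illustrated with a minimal sketch. It is not the RBinds implementation: the abstract does not say how the network is built or which network properties are analyzed, so the one-point-per-nucleotide input, the 8.0 distance cutoff, and the choice of degree and closeness centrality are all assumptions made for illustration, and the function names are hypothetical.

```python
import math
from collections import deque


def contact_network(coords, cutoff=8.0):
    """Build an undirected contact network (hypothetical helper).

    coords: one (x, y, z) point per nucleotide (assumed representation).
    cutoff: distance threshold for drawing an edge (assumed value).
    Returns an adjacency list indexed by nucleotide position.
    """
    n = len(coords)
    adj = [[] for _ in range(n)]
    for i in range(n):
        for j in range(i + 1, n):
            if math.dist(coords[i], coords[j]) <= cutoff:
                adj[i].append(j)
                adj[j].append(i)
    return adj


def degree(adj):
    """Number of contacts of each nucleotide."""
    return [len(neighbours) for neighbours in adj]


def closeness(adj):
    """Closeness centrality within each node's connected component.

    Computed as (reachable nodes - 1) / (sum of shortest-path lengths);
    isolated nodes get 0.0.
    """
    scores = []
    for source in range(len(adj)):
        dist = {source: 0}
        queue = deque([source])
        while queue:  # breadth-first search from the source node
            u = queue.popleft()
            for v in adj[u]:
                if v not in dist:
                    dist[v] = dist[u] + 1
                    queue.append(v)
        total = sum(dist.values())
        scores.append((len(dist) - 1) / total if total else 0.0)
    return scores


if __name__ == "__main__":
    # Toy coordinates, not taken from any real structure.
    toy = [(0, 0, 0), (5, 0, 0), (10, 0, 0), (100, 0, 0)]
    net = contact_network(toy)
    print(degree(net), closeness(net))
```

In a sketch like this, nucleotides scoring high on the chosen properties would be reported as candidate binding sites; how RBinds actually combines and thresholds its network properties is not stated in the abstract.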
Affiliation(s)
- Huiwen Wang
- Institute of Biophysics and Department of Physics, Central China Normal University, Wuhan 430079, China
- Yunjie Zhao
- Institute of Biophysics and Department of Physics, Central China Normal University, Wuhan 430079, China
|
16
|
Wang K, Jian Y, Wang H, Zeng C, Zhao Y. RBind: computational network method to predict RNA binding sites. Bioinformatics 2018; 34:3131-3136. [PMID: 29718097 DOI: 10.1093/bioinformatics/bty345] [Citation(s) in RCA: 27] [Impact Index Per Article: 5.4] [Received: 12/01/2017] [Accepted: 04/24/2018] [Indexed: 12/21/2022]
Abstract
Motivation: Non-coding RNA molecules play essential roles by interacting with other molecules to perform various biological functions. However, it is difficult to determine RNA structures due to their flexibility. At present, the number of experimentally solved RNA-ligand and RNA-protein structures is still insufficient. Therefore, binding site prediction for non-coding RNAs is required to understand their functions. Results: Current RNA binding site prediction algorithms produce many false-positive nucleotides that are distant from the binding sites. Here, we present a network approach, RBind, to predict RNA binding sites. We benchmarked RBind on RNA-ligand and RNA-protein datasets. The average accuracy of 0.82 in RNA-ligand and 0.63 in RNA-protein testing showed that this network strategy predicts binding sites with reliable accuracy. Availability and implementation: The codes and datasets are available at https://zhaolab.com.cn/RBind. Supplementary information: Supplementary data are available at Bioinformatics online.
Affiliation(s)
- Kaili Wang
- Institute of Biophysics and Department of Physics, Central China Normal University, Wuhan, China
- Yiren Jian
- Department of Physics, The George Washington University, Washington, DC, USA
- Huiwen Wang
- Institute of Biophysics and Department of Physics, Central China Normal University, Wuhan, China
- Chen Zeng
- Institute of Biophysics and Department of Physics, Central China Normal University, Wuhan, China; Department of Physics, The George Washington University, Washington, DC, USA
- Yunjie Zhao
- Institute of Biophysics and Department of Physics, Central China Normal University, Wuhan, China
|
17
|
Rsite2: an efficient computational method to predict the functional sites of noncoding RNAs. Sci Rep 2016; 6:19016. [PMID: 26751501 PMCID: PMC4707467 DOI: 10.1038/srep19016] [Citation(s) in RCA: 12] [Impact Index Per Article: 1.5] [Received: 08/27/2015] [Accepted: 12/02/2015] [Indexed: 01/11/2023]
Abstract
Noncoding RNAs (ncRNAs) represent a large class of important RNA molecules. Given the large number of ncRNAs, identifying their functional sites is becoming one of the most important topics in the post-genomic era, but available computational methods are limited. To this end, we previously presented a tertiary structure based method, Rsite, which first calculates the distance metrics defined in Methods from the tertiary structure of an ncRNA and then identifies the nucleotides located at the extreme points of the distance curve as the functional sites of the given ncRNA. However, the application of Rsite is severely restricted by the small number of available RNA tertiary structures. Here we present a secondary structure based computational method, Rsite2, based on the observation that the secondary structure based nucleotide distance is strongly positively correlated with that derived from tertiary structure. This makes it reasonable to replace tertiary structure with secondary structure, which is much easier to obtain and process. Moreover, we applied Rsite2 to three ncRNAs (tRNA (Lys), Diels-Alder ribozyme, and RNase P) and a list of human mitochondria transcripts. The results show that Rsite2 achieves nearly the same accuracy as Rsite but is much more feasible and efficient. Finally, a web server, the source code, and the dataset of Rsite2 are available at http://www.cuialb.cn/rsite2.
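The idea of a secondary structure based distance curve with extreme points can be sketched as follows. This is an illustration under stated assumptions, not the Rsite2 algorithm: the abstract defers the actual distance metric to the paper's Methods, so the dot-bracket input, the graph of backbone and base-pair edges, the mean shortest-path distance per nucleotide, and the strict local-extremum rule are all choices made here, and the function names are hypothetical.

```python
from collections import deque


def parse_pairs(dot_bracket):
    """Map each paired position to its partner in a dot-bracket string.

    Only '(' and ')' are treated as pairs (no pseudoknot brackets);
    an unmatched ')' raises IndexError.
    """
    stack, pairs = [], {}
    for i, ch in enumerate(dot_bracket):
        if ch == "(":
            stack.append(i)
        elif ch == ")":
            j = stack.pop()
            pairs[i], pairs[j] = j, i
    return pairs


def build_graph(dot_bracket):
    """Adjacency sets with backbone (i, i+1) and base-pair edges (assumed model)."""
    n = len(dot_bracket)
    adj = [set() for _ in range(n)]
    for i in range(n - 1):
        adj[i].add(i + 1)
        adj[i + 1].add(i)
    for i, j in parse_pairs(dot_bracket).items():
        adj[i].add(j)
    return adj


def mean_distance_curve(adj):
    """Mean shortest-path distance from each nucleotide to all others.

    An assumed stand-in for the paper's distance metric; needs >= 2 nodes.
    """
    n = len(adj)
    curve = []
    for source in range(n):
        dist = [-1] * n
        dist[source] = 0
        queue = deque([source])
        while queue:  # breadth-first search; the backbone keeps the graph connected
            u = queue.popleft()
            for v in adj[u]:
                if dist[v] < 0:
                    dist[v] = dist[u] + 1
                    queue.append(v)
        curve.append(sum(dist) / (n - 1))
    return curve


def local_extrema(curve):
    """Indices strictly above or strictly below both neighbours."""
    hits = []
    for i in range(1, len(curve) - 1):
        left, mid, right = curve[i - 1], curve[i], curve[i + 1]
        if (mid > left and mid > right) or (mid < left and mid < right):
            hits.append(i)
    return hits


if __name__ == "__main__":
    structure = "((((....))))"  # toy hairpin, not a real ncRNA
    curve = mean_distance_curve(build_graph(structure))
    print(local_extrema(curve))
```

The positions returned by `local_extrema` play the role of candidate functional sites in this sketch; plateaus and sequence ends are ignored by the strict comparison, which is a simplification.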
|