1
He Y, Ning Z, Zhu X, Zhang Y, Liu C, Jiang S, Yuan Z, Zhang H. Plant lncRNA-miRNA Interaction Prediction Based on Counterfactual Heterogeneous Graph Attention Network. Interdiscip Sci 2024:10.1007/s12539-024-00652-9. [PMID: 39382820 DOI: 10.1007/s12539-024-00652-9] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 04/03/2024] [Revised: 08/10/2024] [Accepted: 08/12/2024] [Indexed: 10/10/2024]
Abstract
Identifying interactions between long non-coding RNAs (lncRNAs) and microRNAs (miRNAs) provides a new perspective for understanding regulatory relationships in plant life processes. Recently, computational methods based on graph neural networks (GNNs) have been widely employed to predict lncRNA-miRNA interactions (LMIs), compensating for the limitations of biological experiments. However, the limited semantic information and the noise in the graph restrict the performance of existing GNN-based methods. In this paper, we develop a novel Counterfactual Heterogeneous Graph Attention Network (CFHAN) to improve robustness against noise and the prediction of plant LMIs. First, we construct a lncRNA-miRNA (L-M) heterogeneous network based on real-world data. Second, CFHAN uses node-level attention, semantic-level attention, and counterfactual links to enhance the learning of node embeddings. Finally, these embeddings are used as inputs to a multilayer perceptron (MLP) that predicts the interactions between lncRNAs and miRNAs. Evaluated on a benchmark dataset of plant LMIs, CFHAN outperforms five state-of-the-art methods and achieves an average AUC and average ACC of 0.9953 and 0.9733, respectively. This demonstrates CFHAN's ability to predict plant LMIs and its promising cross-species prediction ability, offering valuable insights for experimental LMI research.
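The abstract above reports performance as AUC and ACC. As a minimal sketch of how these two metrics can be computed for a binary interaction predictor, the following uses made-up labels and scores; the function names and the 0.5 decision threshold are assumptions for illustration and are not taken from the CFHAN paper or its code.

```python
def accuracy(y_true, scores, threshold=0.5):
    """Fraction of pairs whose thresholded score matches the true label.

    The 0.5 threshold is an assumption for this sketch.
    """
    preds = [1 if s >= threshold else 0 for s in scores]
    return sum(p == t for p, t in zip(preds, y_true)) / len(y_true)


def roc_auc(y_true, scores):
    """AUC as the probability that a random positive pair is scored
    higher than a random negative pair (ties count as one half)."""
    pos = [s for s, t in zip(scores, y_true) if t == 1]
    neg = [s for s, t in zip(scores, y_true) if t == 0]
    wins = sum((p > n) + 0.5 * (p == n) for p in pos for n in neg)
    return wins / (len(pos) * len(neg))


# Hypothetical labels (1 = interacting lncRNA-miRNA pair) and model scores.
labels = [1, 1, 0, 0, 1, 0]
scores = [0.9, 0.8, 0.7, 0.2, 0.6, 0.4]
print("ACC:", accuracy(labels, scores))
print("AUC:", roc_auc(labels, scores))
```

The pairwise form of AUC is quadratic in the number of pairs, which is fine for illustration; a rank-based computation would be the usual choice for large benchmarks.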
Affiliation(s)
- Yu He
- College of Information and Intelligence, Hunan Agricultural University, Changsha, 410128, China
- ZiLan Ning
- College of Information and Intelligence, Hunan Agricultural University, Changsha, 410128, China
- XingHui Zhu
- College of Information and Intelligence, Hunan Agricultural University, Changsha, 410128, China
- YinQiong Zhang
- College of Information and Intelligence, Hunan Agricultural University, Changsha, 410128, China
- ChunHai Liu
- Hunan Engineering & Technology Research Center for Agricultural Big Data Analysis & Decision-Making, College of Plant Protection, Hunan Agricultural University, Changsha, 410128, China
- SiWei Jiang
- College of Information and Intelligence, Hunan Agricultural University, Changsha, 410128, China
- ZheMing Yuan
- Hunan Engineering & Technology Research Center for Agricultural Big Data Analysis & Decision-Making, College of Plant Protection, Hunan Agricultural University, Changsha, 410128, China.
- HongYan Zhang
- College of Information and Intelligence, Hunan Agricultural University, Changsha, 410128, China.
2
Wang XF, Huang L, Wang Y, Guan RC, You ZH, Sheng N, Xie XP, Hou WJ. Multi-view learning framework for predicting unknown types of cancer markers via directed graph neural networks fitting regulatory networks. Brief Bioinform 2024; 25:bbae546. [PMID: 39470307 PMCID: PMC11514060 DOI: 10.1093/bib/bbae546] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 07/03/2024] [Revised: 09/02/2024] [Accepted: 10/11/2024] [Indexed: 10/30/2024] Open
Abstract
The discovery of diagnostic and therapeutic biomarkers for complex diseases, especially cancer, has always been a central and long-term challenge in molecular association prediction research, offering promising avenues for advancing the understanding of complex diseases. To this end, researchers have developed various network-based prediction techniques targeting specific molecular associations. However, limitations imposed by reductionism and network representation learning have led existing studies to focus narrowly on high prediction efficiency within a single association type, thereby glossing over the discovery of unknown types of associations. Additionally, effectively utilizing network structure to fit the interaction properties of regulatory networks and combining specific case biomarker validations remains an unresolved issue in cancer biomarker prediction methods. To overcome these limitations, we propose a multi-view learning framework, CeRVE, based on directed graph neural networks (DGNN) for predicting cancer biomarkers of unknown types. CeRVE effectively extracts and integrates subgraph information through multi-view feature learning. Subsequently, CeRVE utilizes DGNN to simulate the entire regulatory network, propagating node attribute features and extracting various interaction relationships between molecules. Furthermore, CeRVE constructed a comparative analysis matrix of three cancers and adjacent normal tissues through The Cancer Genome Atlas and identified multiple types of potential cancer biomarkers through differential expression analysis of mRNA, microRNA, and long noncoding RNA. Computational testing of multiple types of biomarkers for 72 cancers demonstrates that CeRVE exhibits superior performance in cancer biomarker prediction, providing a powerful tool and insightful approach for AI-assisted disease biomarker discovery.
Affiliation(s)
- Xin-Fei Wang
- Key Laboratory of Symbol Computation and Knowledge Engineering of Ministry of Education, College of Computer Science and Technology, Jilin University, No. 2699, Qianjin Street, Changchun, 130012, China
- Lan Huang
- Key Laboratory of Symbol Computation and Knowledge Engineering of Ministry of Education, College of Computer Science and Technology, Jilin University, No. 2699, Qianjin Street, Changchun, 130012, China
- Yan Wang
- Key Laboratory of Symbol Computation and Knowledge Engineering of Ministry of Education, College of Computer Science and Technology, Jilin University, No. 2699, Qianjin Street, Changchun, 130012, China
- Ren-Chu Guan
- Key Laboratory of Symbol Computation and Knowledge Engineering of Ministry of Education, College of Computer Science and Technology, Jilin University, No. 2699, Qianjin Street, Changchun, 130012, China
- Zhu-Hong You
- School of Computer Science, Northwestern Polytechnical University, Youyi West Road, Xi’an, 710072, China
- Nan Sheng
- Key Laboratory of Symbol Computation and Knowledge Engineering of Ministry of Education, College of Computer Science and Technology, Jilin University, No. 2699, Qianjin Street, Changchun, 130012, China
- Xu-Ping Xie
- Key Laboratory of Symbol Computation and Knowledge Engineering of Ministry of Education, College of Computer Science and Technology, Jilin University, No. 2699, Qianjin Street, Changchun, 130012, China
- Wen-Ju Hou
- Key Laboratory of Symbol Computation and Knowledge Engineering of Ministry of Education, College of Computer Science and Technology, Jilin University, No. 2699, Qianjin Street, Changchun, 130012, China
3
Diao B, Luo J, Guo Y. A comprehensive survey on deep learning-based identification and predicting the interaction mechanism of long non-coding RNAs. Brief Funct Genomics 2024; 23:314-324. [PMID: 38576205 DOI: 10.1093/bfgp/elae010] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 12/06/2023] [Revised: 02/25/2024] [Accepted: 03/14/2024] [Indexed: 04/06/2024] Open
Abstract
Long noncoding RNAs (lncRNAs) have been discovered to be extensively involved in eukaryotic epigenetic, transcriptional, and post-transcriptional regulatory processes with the advancements in sequencing technology and genomics research. Therefore, they play crucial roles in the body's normal physiology and various disease outcomes. Presently, numerous unknown lncRNA sequencing data require exploration. Establishing deep learning-based prediction models for lncRNAs provides valuable insights for researchers, substantially reducing time and costs associated with trial and error and facilitating the disease-relevant lncRNA identification for prognosis analysis and targeted drug development as the era of artificial intelligence progresses. However, most lncRNA-related researchers lack awareness of the latest advancements in deep learning models and model selection and application in functional research on lncRNAs. Thus, we elucidate the concept of deep learning models, explore several prevalent deep learning algorithms and their data preferences, conduct a comprehensive review of recent literature studies with exemplary predictive performance over the past 5 years in conjunction with diverse prediction functions, critically analyze and discuss the merits and limitations of current deep learning models and solutions, while also proposing prospects based on cutting-edge advancements in lncRNA research.
Affiliation(s)
- Biyu Diao
- Department of Breast Surgery, The First Affiliated Hospital of Ningbo University, No. 59, Liuting Street, Haishu District, Ningbo 315000, China
- Jin Luo
- Department of Breast Surgery, The First Affiliated Hospital of Ningbo University, No. 59, Liuting Street, Haishu District, Ningbo 315000, China
- Yu Guo
- Department of Breast Surgery, The First Affiliated Hospital of Ningbo University, No. 59, Liuting Street, Haishu District, Ningbo 315000, China
4
Zhang H, Zhou Y, Zhang Z, Sun H, Pan Z, Mou M, Zhang W, Ye Q, Hou T, Li H, Hsieh CY, Zhu F. Large Language Model-Based Natural Language Encoding Could Be All You Need for Drug Biomedical Association Prediction. Anal Chem 2024. [PMID: 39011990 DOI: 10.1021/acs.analchem.4c01793] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 07/17/2024]
Abstract
Analyzing drug-related interactions in the field of biomedicine has been a critical aspect of drug discovery and development. While various artificial intelligence (AI)-based tools have been proposed to analyze drug biomedical associations (DBAs), their feature encoding did not adequately account for crucial biomedical functions and semantic concepts, thereby still hindering their progress. Since the advent of ChatGPT by OpenAI in 2022, large language models (LLMs) have demonstrated rapid growth and significant success across various applications. Herein, LEDAP was introduced, which uniquely leveraged LLM-based biotext feature encoding for predicting drug-disease associations, drug-drug interactions, and drug-side effect associations. Benefiting from the large-scale knowledgebase pre-training, LLMs had great potential in drug development analysis owing to their holistic understanding of natural language and human topics. LEDAP illustrated its notable competitiveness in comparison with other popular DBA analysis tools. Specifically, even in simple conjunction with classical machine learning methods, LLM-based feature representations consistently enabled satisfactory performance across diverse DBA tasks like binary classification, multiclass classification, and regression. Our findings underpinned the considerable potential of LLMs in drug development research, indicating a catalyst for further progress in related fields.
Affiliation(s)
- Hanyu Zhang
- Innovation Institute for Artificial Intelligence in Medicine of Zhejiang University, Alibaba-Zhejiang University Joint Research Center of Future Digital Healthcare, Hangzhou 330110, China
- College of Pharmaceutical Sciences, The Second Affiliated Hospital, Zhejiang University School of Medicine, State Key Laboratory of Advanced Drug Delivery and Release Systems, Zhejiang University, Hangzhou 310058, China
- Yuan Zhou
- College of Pharmaceutical Sciences, Zhejiang University, Hangzhou 310058, China
- Zhichao Zhang
- College of Pharmaceutical Sciences, Zhejiang University, Hangzhou 310058, China
- Huaicheng Sun
- College of Pharmaceutical Sciences, Zhejiang University, Hangzhou 310058, China
- Ziqi Pan
- College of Pharmaceutical Sciences, Zhejiang University, Hangzhou 310058, China
- Minjie Mou
- College of Pharmaceutical Sciences, Zhejiang University, Hangzhou 310058, China
- Wei Zhang
- College of Pharmaceutical Sciences, Zhejiang University, Hangzhou 310058, China
- Qing Ye
- College of Pharmaceutical Sciences, Zhejiang University, Hangzhou 310058, China
- Tingjun Hou
- College of Pharmaceutical Sciences, Zhejiang University, Hangzhou 310058, China
- Honglin Li
- Innovation Center for AI and Drug Discovery, East China Normal University, Shanghai 200062, China
- Chang-Yu Hsieh
- College of Pharmaceutical Sciences, Zhejiang University, Hangzhou 310058, China
- Feng Zhu
- Innovation Institute for Artificial Intelligence in Medicine of Zhejiang University, Alibaba-Zhejiang University Joint Research Center of Future Digital Healthcare, Hangzhou 330110, China
- College of Pharmaceutical Sciences, The Second Affiliated Hospital, Zhejiang University School of Medicine, State Key Laboratory of Advanced Drug Delivery and Release Systems, Zhejiang University, Hangzhou 310058, China
5
Nie Z, Gao M, Jin X, Rao Y, Zhang X. MFPINC: prediction of plant ncRNAs based on multi-source feature fusion. BMC Genomics 2024; 25:531. [PMID: 38816689 PMCID: PMC11137975 DOI: 10.1186/s12864-024-10439-3] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 11/15/2023] [Accepted: 05/21/2024] [Indexed: 06/01/2024] Open
Abstract
Non-coding RNAs (ncRNAs) are recognized as pivotal players in the regulation of essential physiological processes such as nutrient homeostasis, development, and stress responses in plants. Common methods for predicting ncRNAs are strongly affected by experimental conditions and by the computational methods used, and therefore require a significant investment of time and resources. We therefore constructed an ncRNA predictor, MFPINC, to predict potential ncRNAs in plants; it is based on the PINC tool proposed in our previous studies. Specifically, sequence features were refined using variance thresholding and F-test methods, while deep features were extracted and feature fusion was performed by applying a GRU model. A comprehensive evaluation on multiple standard datasets shows that MFPINC not only identifies gene sequences more comprehensively and accurately, but also significantly improves the expressive power and generalization performance of the model, and that MFPINC significantly outperforms existing competing methods in ncRNA identification. The tool is available on GitHub ( https://github.com/Zhenj-Nie/MFPINC ), where the data and source code can also be downloaded for free.
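The abstract above mentions refining sequence features with variance thresholding and an F-test. The following is a rough sketch of that general two-step filter in plain Python, not the MFPINC implementation: the function names, the toy feature matrix, the threshold of 0.0, and the `top_k` cutoff are all assumptions made for illustration.

```python
from statistics import mean, pvariance


def variance_filter(X, threshold=0.0):
    """Indices of columns whose population variance exceeds the threshold."""
    n_cols = len(X[0])
    return [j for j in range(n_cols)
            if pvariance([row[j] for row in X]) > threshold]


def f_statistic(values, labels):
    """One-way ANOVA F statistic of a single feature across class labels.

    Assumes at least two classes and more samples than classes.
    """
    groups = {}
    for v, y in zip(values, labels):
        groups.setdefault(y, []).append(v)
    grand = mean(values)
    k, n = len(groups), len(values)
    ss_between = sum(len(g) * (mean(g) - grand) ** 2 for g in groups.values())
    ss_within = sum((v - mean(g)) ** 2 for g in groups.values() for v in g)
    if ss_within == 0:
        return float("inf") if ss_between > 0 else 0.0
    return (ss_between / (k - 1)) / (ss_within / (n - k))


def select_features(X, y, var_threshold=0.0, top_k=2):
    """Drop low-variance columns, then keep the top_k columns by F statistic."""
    kept = variance_filter(X, var_threshold)
    ranked = sorted(kept,
                    key=lambda j: f_statistic([row[j] for row in X], y),
                    reverse=True)
    return sorted(ranked[:top_k])


# Toy data: column 0 is constant, column 1 separates the classes well.
X = [[1.0, 0.10, 5.0, 0.3],
     [1.0, 0.20, 5.1, 0.9],
     [1.0, 0.90, 4.9, 0.4],
     [1.0, 0.80, 5.0, 0.8]]
y = [0, 0, 1, 1]
print(select_features(X, y, top_k=2))
```

Running the variance filter first removes constant columns, for which the F statistic is undefined or uninformative, before the more expensive ranking step.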
Affiliation(s)
- Zhenjun Nie
- School of Information and Artificial Intelligence, Anhui Agricultural University, Hefei, 230036, China
- Mengqing Gao
- School of Information and Artificial Intelligence, Anhui Agricultural University, Hefei, 230036, China
- Xiu Jin
- School of Information and Artificial Intelligence, Anhui Agricultural University, Hefei, 230036, China
- Key Laboratory of Agricultural Sensors, Ministry of Agriculture and Rural Affairs, Hefei, 230036, China
- Yuan Rao
- School of Information and Artificial Intelligence, Anhui Agricultural University, Hefei, 230036, China
- Key Laboratory of Agricultural Sensors, Ministry of Agriculture and Rural Affairs, Hefei, 230036, China
- Xiaodan Zhang
- School of Information and Artificial Intelligence, Anhui Agricultural University, Hefei, 230036, China.
- Key Laboratory of Agricultural Sensors, Ministry of Agriculture and Rural Affairs, Hefei, 230036, China.
6
Lu H, Zhang J, Cao Y, Wu S, Wei Y, Yin R. Advances in applications of artificial intelligence algorithms for cancer-related miRNA research. Zhejiang Da Xue Xue Bao Yi Xue Ban 2024; 53:231-243. [PMID: 38650448 PMCID: PMC11057993 DOI: 10.3724/zdxbyxb-2023-0511] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 10/30/2023] [Accepted: 01/30/2024] [Indexed: 04/25/2024]
Abstract
MiRNAs are a class of small non-coding RNAs, which regulate gene expression post-transcriptionally by partial complementary base pairing. Aberrant miRNA expressions have been reported in tumor tissues and peripheral blood of cancer patients. In recent years, artificial intelligence algorithms such as machine learning and deep learning have been widely used in bioinformatic research. Compared to traditional bioinformatic tools, miRNA target prediction tools based on artificial intelligence algorithms have higher accuracy, and can successfully predict subcellular localization and redistribution of miRNAs to deepen our understanding. Additionally, the construction of clinical models based on artificial intelligence algorithms could significantly improve the mining efficiency of miRNA used as biomarkers. In this article, we summarize recent development of bioinformatic miRNA tools based on artificial intelligence algorithms, focusing on the potential of machine learning and deep learning in cancer-related miRNA research.
Affiliation(s)
- Hongyu Lu
- School of Pharmacy, Jiangsu University, Zhenjiang 212013, Jiangsu Province, China.
- Jia Zhang
- School of Pharmacy, Jiangsu University, Zhenjiang 212013, Jiangsu Province, China
- Yixin Cao
- Department of Medical Oncology, Affiliated Hospital of Jiangsu University, Zhenjiang 212013, Jiangsu Province, China
- Shuming Wu
- School of Pharmacy, Jiangsu University, Zhenjiang 212013, Jiangsu Province, China
- Yuan Wei
- School of Pharmacy, Jiangsu University, Zhenjiang 212013, Jiangsu Province, China.
- Runting Yin
- School of Pharmacy, Jiangsu University, Zhenjiang 212013, Jiangsu Province, China.
7
Zhang W, Mou M, Hu W, Lu M, Zhang H, Zhang H, Luo Y, Xu H, Tao L, Dai H, Gao J, Zhu F. MOINER: A Novel Multiomics Early Integration Framework for Biomedical Classification and Biomarker Discovery. J Chem Inf Model 2024; 64:2720-2732. [PMID: 38373720 DOI: 10.1021/acs.jcim.4c00013] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 02/21/2024]
Abstract
In the context of precision medicine, multiomics data integration provides a comprehensive understanding of underlying biological processes and is critical for disease diagnosis and biomarker discovery. One commonly used integration method is early integration through concatenation of multiple dimensionally reduced omics matrices, owing to its simplicity and ease of implementation. However, this approach is seriously limited by information loss and a lack of latent feature interaction. Herein, a novel multiomics early integration framework (MOINER) based on information enhancement and image representation learning is presented to address these challenges. MOINER employs the self-attention mechanism to capture the intrinsic correlations of omics features, which makes it significantly outperform the existing state-of-the-art methods for multiomics data integration. Moreover, visualizing the attention embedding and identifying potential biomarkers offer interpretable insights into the prediction results. All source code and the model for MOINER are freely available at https://github.com/idrblab/MOINER.
Affiliation(s)
- Wei Zhang
- College of Pharmaceutical Sciences, The Second Affiliated Hospital, Zhejiang University School of Medicine, Zhejiang University, Hangzhou 310058, China
- Innovation Institute for Artificial Intelligence in Medicine of Zhejiang University, Alibaba-Zhejiang University Joint Research Center of Future Digital Healthcare, Hangzhou 330110, China
- Minjie Mou
- College of Pharmaceutical Sciences, The Second Affiliated Hospital, Zhejiang University School of Medicine, Zhejiang University, Hangzhou 310058, China
- Wei Hu
- College of Pharmaceutical Sciences, The Second Affiliated Hospital, Zhejiang University School of Medicine, Zhejiang University, Hangzhou 310058, China
- Mingkun Lu
- College of Pharmaceutical Sciences, The Second Affiliated Hospital, Zhejiang University School of Medicine, Zhejiang University, Hangzhou 310058, China
- Hanyu Zhang
- College of Pharmaceutical Sciences, The Second Affiliated Hospital, Zhejiang University School of Medicine, Zhejiang University, Hangzhou 310058, China
- Hongning Zhang
- College of Pharmaceutical Sciences, The Second Affiliated Hospital, Zhejiang University School of Medicine, Zhejiang University, Hangzhou 310058, China
- Yongchao Luo
- College of Pharmaceutical Sciences, The Second Affiliated Hospital, Zhejiang University School of Medicine, Zhejiang University, Hangzhou 310058, China
- Hongquan Xu
- Key Laboratory of Elemene Class Anti-Cancer Chinese Medicines, School of Pharmacy, Hangzhou Normal University, Hangzhou 311121, China
- Lin Tao
- Key Laboratory of Elemene Class Anti-Cancer Chinese Medicines, School of Pharmacy, Hangzhou Normal University, Hangzhou 311121, China
- Haibin Dai
- College of Pharmaceutical Sciences, The Second Affiliated Hospital, Zhejiang University School of Medicine, Zhejiang University, Hangzhou 310058, China
- Jianqing Gao
- College of Pharmaceutical Sciences, The Second Affiliated Hospital, Zhejiang University School of Medicine, Zhejiang University, Hangzhou 310058, China
- Feng Zhu
- College of Pharmaceutical Sciences, The Second Affiliated Hospital, Zhejiang University School of Medicine, Zhejiang University, Hangzhou 310058, China
- Innovation Institute for Artificial Intelligence in Medicine of Zhejiang University, Alibaba-Zhejiang University Joint Research Center of Future Digital Healthcare, Hangzhou 330110, China
8
Chen B, Pan Z, Mou M, Zhou Y, Fu W. Is fragment-based graph a better graph-based molecular representation for drug design? A comparison study of graph-based models. Comput Biol Med 2024; 169:107811. [PMID: 38168647 DOI: 10.1016/j.compbiomed.2023.107811] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 09/27/2023] [Revised: 11/23/2023] [Accepted: 12/03/2023] [Indexed: 01/05/2024]
Abstract
Graph Neural Networks (GNNs) have gained significant traction in various sectors of AI-driven drug design. Over recent years, the integration of fragmentation concepts into GNNs has emerged as a potent strategy to augment the efficacy of molecular generative models. Nonetheless, challenges such as symmetry breaking and potential misrepresentation of intricate cycles and undefined functional groups raise questions about the superiority of fragment-based graph representation over traditional methods. In our research, we undertook a rigorous evaluation, contrasting the predictive prowess of eight models, developed using deep learning algorithms, across 12 benchmark datasets that span a range of properties. These models encompass established methods like GCN, AttentiveFP, and D-MPNN, as well as innovative fragment-based representation techniques. Our results indicate that fragment-based methodologies, notably PharmHGT, significantly improve model performance and interpretability, particularly in scenarios characterized by limited data availability. However, in situations with extensive training, fragment-based molecular graph representations may not necessarily eclipse traditional methods. In summation, we posit that the integration of fragmentation, as an avant-garde technique in drug design, harbors considerable promise for the future of AI-enhanced drug design.
Affiliation(s)
- Baiyu Chen
- Department of Medicinal Chemistry, School of Pharmacy, Fudan University, 202103, China
- Ziqi Pan
- College of Pharmaceutical Sciences, The Second Affiliated Hospital, Zhejiang University School of Medicine, Zhejiang University, Hangzhou, 310058, China
- Minjie Mou
- College of Pharmaceutical Sciences, The Second Affiliated Hospital, Zhejiang University School of Medicine, Zhejiang University, Hangzhou, 310058, China
- Yuan Zhou
- College of Pharmaceutical Sciences, The Second Affiliated Hospital, Zhejiang University School of Medicine, Zhejiang University, Hangzhou, 310058, China
- Wei Fu
- Department of Medicinal Chemistry, School of Pharmacy, Fudan University, 202103, China.
9
Amahong K, Zhang W, Liu Y, Li T, Huang S, Han L, Tao L, Zhu F. RVvictor: Virus RNA-directed molecular interactions for RNA virus infection. Comput Biol Med 2024; 169:107886. [PMID: 38157777 DOI: 10.1016/j.compbiomed.2023.107886] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 08/06/2023] [Revised: 12/14/2023] [Accepted: 12/18/2023] [Indexed: 01/03/2024]
Abstract
RNA viruses are major human pathogens that cause seasonal epidemics and occasional pandemic outbreaks. Due to the nature of their RNA genomes, viral RNA is expected to interact with host proteins (INTPRO), messenger RNAs (INTmRNA), and non-coding RNAs (INTncRNA) to perform particular functions during transcription and replication. Data on virus RNA-directed molecular interactions (especially INTPROs) are therefore urgently needed and are expected to attract broad research interest in the fields of RNA virus translation and replication. In this study, a new database, RVvictor, was constructed to describe the virus RNA-directed interactions (INTPRO, INTmRNA, INTncRNA) of RNA viruses. This database is unique in a) unambiguously characterizing the interactions between viral RNAs and host proteins, b) providing, for the first time, the most systematic RNA-directed interaction data resources, which offer clues to the molecular mechanisms of RNA virus translation and replication, and c) conducting a comprehensive enrichment analysis for each virus RNA based on its associated target genes/proteins, with the enrichment results explicitly illustrated using various graphs. We found significant enrichment of a suite of pathways related to infection, translation, and replication, e.g., HIV infection, coronavirus disease, and regulation of viral genome replication. Given the devastating and persistent threat posed by RNA viruses, RVvictor constructed, for the first time, a possible network of cross-talk in RNA-directed interactions, which may ultimately help explain the pathogenicity of RNA virus infection. The knowledge base might help develop new anti-viral therapeutic targets in the future. It is now free and publicly accessible at: https://idrblab.org/rvvictor/.
Affiliation(s)
- Kuerbannisha Amahong
- College of Pharmaceutical Sciences, The Second Affiliated Hospital, Zhejiang University School of Medicine, Zhejiang University, Hangzhou, 310058, China; Innovation Institute for Artificial Intelligence in Medicine of Zhejiang University, Alibaba-Zhejiang University Joint Research Center of Future Digital Healthcare, Hangzhou, 330110, China
- Wei Zhang
- College of Pharmaceutical Sciences, The Second Affiliated Hospital, Zhejiang University School of Medicine, Zhejiang University, Hangzhou, 310058, China; Innovation Institute for Artificial Intelligence in Medicine of Zhejiang University, Alibaba-Zhejiang University Joint Research Center of Future Digital Healthcare, Hangzhou, 330110, China
- Yuhong Liu
- Key Laboratory of Elemene Class Anti-cancer Chinese Medicines, School of Pharmacy, Hangzhou Normal University, Hangzhou, 311121, China
- Teng Li
- College of Pharmaceutical Sciences, The Second Affiliated Hospital, Zhejiang University School of Medicine, Zhejiang University, Hangzhou, 310058, China
- Shijie Huang
- College of Pharmaceutical Sciences, The Second Affiliated Hospital, Zhejiang University School of Medicine, Zhejiang University, Hangzhou, 310058, China; Innovation Institute for Artificial Intelligence in Medicine of Zhejiang University, Alibaba-Zhejiang University Joint Research Center of Future Digital Healthcare, Hangzhou, 330110, China
- Lianyi Han
- Greater Bay Area Institute of Precision Medicine (Guangzhou), School of Life Sciences, Fudan University, Shanghai, 315211, China.
- Lin Tao
- Key Laboratory of Elemene Class Anti-cancer Chinese Medicines, School of Pharmacy, Hangzhou Normal University, Hangzhou, 311121, China.
- Feng Zhu
- College of Pharmaceutical Sciences, The Second Affiliated Hospital, Zhejiang University School of Medicine, Zhejiang University, Hangzhou, 310058, China; Innovation Institute for Artificial Intelligence in Medicine of Zhejiang University, Alibaba-Zhejiang University Joint Research Center of Future Digital Healthcare, Hangzhou, 330110, China.
10
Cordoba-Caballero J, Perkins JR, García-Criado F, Gallego D, Navarro-Sánchez A, Moreno-Estellés M, Garcés C, Bonet F, Romá-Mateo C, Toro R, Perez B, Sanz P, Kohl M, Rojano E, Seoane P, Ranea JAG. Exploring miRNA-target gene pair detection in disease with coRmiT. Brief Bioinform 2024; 25:bbae060. [PMID: 38436559 PMCID: PMC10939301 DOI: 10.1093/bib/bbae060] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 09/21/2023] [Revised: 12/14/2023] [Accepted: 01/10/2024] [Indexed: 03/05/2024] Open
Abstract
A wide range of approaches can be used to detect micro RNA (miRNA)-target gene pairs (mTPs) from expression data, differing in the ways the gene and miRNA expression profiles are calculated, combined and correlated. However, there is no clear consensus on which is the best approach across all datasets. Here, we have implemented multiple strategies and applied them to three distinct rare disease datasets that comprise smallRNA-Seq and RNA-Seq data obtained from the same samples, obtaining mTPs related to the disease pathology. All datasets were preprocessed using a standardized, freely available computational workflow, DEG_workflow. This workflow includes coRmiT, a method to compare multiple strategies for mTP detection. We used it to investigate the overlap of the detected mTPs with predicted and validated mTPs from 11 different databases. Results show that there is no clear best strategy for mTP detection applicable to all situations. We therefore propose integrating the results of the different strategies by selecting, for each miRNA, the strategy with the highest odds ratio. We applied this selection-integration method to the datasets and showed it to be robust to changes in the predicted and validated mTP databases. Our findings have important implications for miRNA analysis. coRmiT is implemented as part of the ExpHunterSuite Bioconductor package available from https://bioconductor.org/packages/ExpHunterSuite.
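The selection-integration idea described in the abstract above (for each miRNA, keep the strategy whose detected targets have the highest odds ratio against reference databases) can be sketched as follows. This is an illustrative reading only, not the coRmiT or ExpHunterSuite code: the strategy names, gene and miRNA identifiers, and the Haldane-Anscombe 0.5 correction used to avoid division by zero are all assumptions.

```python
def odds_ratio(detected, reference, universe):
    """Odds ratio of the 2x2 table (detected or not) x (in reference or not).

    Adds 0.5 to every cell (Haldane-Anscombe correction, an assumption of
    this sketch) so that empty cells do not cause division by zero.
    """
    a = len(detected & reference)               # detected, in reference
    b = len(detected - reference)               # detected, not in reference
    c = len((universe - detected) & reference)  # missed, in reference
    d = len(universe - detected - reference)    # neither
    return ((a + 0.5) * (d + 0.5)) / ((b + 0.5) * (c + 0.5))


def best_strategy_per_mirna(detected, reference, genes):
    """detected: {strategy: {mirna: set of genes}}
    reference: {mirna: set of genes}; genes: set of all candidate genes."""
    mirnas = {m for per_mirna in detected.values() for m in per_mirna}
    best = {}
    for mirna in sorted(mirnas):
        scores = {
            strategy: odds_ratio(per_mirna.get(mirna, set()),
                                 reference.get(mirna, set()),
                                 genes)
            for strategy, per_mirna in detected.items()
        }
        best[mirna] = max(scores, key=scores.get)
    return best


# Hypothetical inputs: two correlation strategies, two miRNAs, six genes.
genes = {"g1", "g2", "g3", "g4", "g5", "g6"}
reference = {"miR-a": {"g1", "g2"}, "miR-b": {"g3"}}
detected = {
    "pearson": {"miR-a": {"g1", "g2"}, "miR-b": {"g4", "g5"}},
    "spearman": {"miR-a": {"g1", "g5"}, "miR-b": {"g3"}},
}
print(best_strategy_per_mirna(detected, reference, genes))
```

Choosing per miRNA rather than globally reflects the abstract's observation that no single strategy is best in all situations.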
Affiliation(s)
- Jose Cordoba-Caballero
  - Departamento de Biología Molecular y Bioquímica, Facultad de Ciencias, Universidad de Málaga, Bulevar Louis Pasteur, 31, Málaga, 29010, Spain
  - Research Unit, Biomedical Research and Innovation Institute of Cádiz (INiBICA), Puerta del Mar University Hospital, Cádiz, Spain
- James R Perkins
  - Departamento de Biología Molecular y Bioquímica, Facultad de Ciencias, Universidad de Málaga, Bulevar Louis Pasteur, 31, Málaga, 29010, Spain
  - Instituto de Investigación Biomédica de Málaga y Plataforma en Nanomedicina (IBIMA-Plataforma BIONAND), C/ Severo Ochoa, 35, Parque Tecnológico de Andalucía (PTA), Campanillas, Málaga, 29590, Spain
- Federico García-Criado
  - Departamento de Biología Molecular y Bioquímica, Facultad de Ciencias, Universidad de Málaga, Bulevar Louis Pasteur, 31, Málaga, 29010, Spain
- Diana Gallego
  - CIBER de Enfermedades Raras (CIBERER), Avda. Monforte de Lemos, 3-5, Pabellón 11, Planta 0, Madrid, 28029, Spain
  - Centro de Diagnóstico de Enfermedades Moleculares, Centro de Biología Molecular-SO UAM-CSIC, Universidad Autónoma de Madrid, Campus de Cantoblanco, Madrid, Spain
  - Instituto de Investigación Sanitaria IdiPaZ, Madrid, Spain
- Alicia Navarro-Sánchez
  - CIBER de Enfermedades Raras (CIBERER), Avda. Monforte de Lemos, 3-5, Pabellón 11, Planta 0, Madrid, 28029, Spain
  - Departament de Fisiologia, Facultat de Medicina i Odontologia, Universitat de València, Av. Blasco Ibáñez 15, 46010, València, Spain
- Mireia Moreno-Estellés
  - CIBER de Enfermedades Raras (CIBERER), Avda. Monforte de Lemos, 3-5, Pabellón 11, Planta 0, Madrid, 28029, Spain
  - Consejo Superior de Investigaciones Científicas, Instituto de Biomedicina de Valencia, Jaime Roig 11, 46010, Valencia, Spain
- Concepción Garcés
  - CIBER de Enfermedades Raras (CIBERER), Avda. Monforte de Lemos, 3-5, Pabellón 11, Planta 0, Madrid, 28029, Spain
  - Departament de Fisiologia, Facultat de Medicina i Odontologia, Universitat de València, Av. Blasco Ibáñez 15, 46010, València, Spain
- Fernando Bonet
  - Research Unit, Biomedical Research and Innovation Institute of Cádiz (INiBICA), Puerta del Mar University Hospital, Cádiz, Spain
  - Medicine Department, School of Medicine, University of Cádiz, Cádiz, Spain
- Carlos Romá-Mateo
  - CIBER de Enfermedades Raras (CIBERER), Avda. Monforte de Lemos, 3-5, Pabellón 11, Planta 0, Madrid, 28029, Spain
  - Departament de Fisiologia, Facultat de Medicina i Odontologia, Universitat de València, Av. Blasco Ibáñez 15, 46010, València, Spain
  - Incliva Biomedical Research Institute, 46010, València, Spain
- Rocio Toro
  - Research Unit, Biomedical Research and Innovation Institute of Cádiz (INiBICA), Puerta del Mar University Hospital, Cádiz, Spain
  - Medicine Department, School of Medicine, University of Cádiz, Cádiz, Spain
- Belén Perez
  - CIBER de Enfermedades Raras (CIBERER), Avda. Monforte de Lemos, 3-5, Pabellón 11, Planta 0, Madrid, 28029, Spain
  - Centro de Diagnóstico de Enfermedades Moleculares, Centro de Biología Molecular-SO UAM-CSIC, Universidad Autónoma de Madrid, Campus de Cantoblanco, Madrid, Spain
  - Instituto de Investigación Sanitaria IdiPaZ, Madrid, Spain
- Pascual Sanz
  - CIBER de Enfermedades Raras (CIBERER), Avda. Monforte de Lemos, 3-5, Pabellón 11, Planta 0, Madrid, 28029, Spain
  - Consejo Superior de Investigaciones Científicas, Instituto de Biomedicina de Valencia, Jaime Roig 11, 46010, Valencia, Spain
- Matthias Kohl
  - Faculty of Medical and Life Sciences, Furtwangen University, Germany
- Elena Rojano
  - Departamento de Biología Molecular y Bioquímica, Facultad de Ciencias, Universidad de Málaga, Bulevar Louis Pasteur, 31, Málaga, 29010, Spain
  - Instituto de Investigación Biomédica de Málaga y Plataforma en Nanomedicina (IBIMA-Plataforma BIONAND), C/ Severo Ochoa, 35, Parque Tecnológico de Andalucía (PTA), Campanillas, Málaga, 29590, Spain
- Pedro Seoane
  - Departamento de Biología Molecular y Bioquímica, Facultad de Ciencias, Universidad de Málaga, Bulevar Louis Pasteur, 31, Málaga, 29010, Spain
  - Instituto de Investigación Biomédica de Málaga y Plataforma en Nanomedicina (IBIMA-Plataforma BIONAND), C/ Severo Ochoa, 35, Parque Tecnológico de Andalucía (PTA), Campanillas, Málaga, 29590, Spain
  - CIBER de Enfermedades Raras (CIBERER), Avda. Monforte de Lemos, 3-5, Pabellón 11, Planta 0, Madrid, 28029, Spain
- Juan A G Ranea
  - Departamento de Biología Molecular y Bioquímica, Facultad de Ciencias, Universidad de Málaga, Bulevar Louis Pasteur, 31, Málaga, 29010, Spain
  - Instituto de Investigación Biomédica de Málaga y Plataforma en Nanomedicina (IBIMA-Plataforma BIONAND), C/ Severo Ochoa, 35, Parque Tecnológico de Andalucía (PTA), Campanillas, Málaga, 29590, Spain
  - CIBER de Enfermedades Raras (CIBERER), Avda. Monforte de Lemos, 3-5, Pabellón 11, Planta 0, Madrid, 28029, Spain
  - Instituto Nacional de Bioinformática (INB/ELIXIR-ES), Instituto de Salud Carlos III (ISCIII), C/ Sinesio Delgado, 4, Madrid, 28029, Spain
|
11
|
Ji Y, Sun J, Xie J, Wu W, Shuai SC, Zhao Q, Chen W. m5UMCB: Prediction of RNA 5-methyluridine sites using multi-scale convolutional neural network with BiLSTM. Comput Biol Med 2024; 168:107793. [PMID: 38048661 DOI: 10.1016/j.compbiomed.2023.107793] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 10/19/2023] [Revised: 11/20/2023] [Accepted: 11/28/2023] [Indexed: 12/06/2023]
Abstract
As a prevalent RNA modification, 5-methyluridine (m5U) plays a critical role in diverse biological processes and disease pathogenesis. High-throughput identification of m5U typically relies on labor-intensive biochemical experiments using various sequencing-based techniques, which are not only time-consuming but also expensive. Consequently, there is a pressing need for more efficient and cost-effective computational methods to complement these high-throughput techniques. In this study, we present m5UMCB, a novel approach that harnesses a multi-scale convolutional neural network (CNN) in tandem with bidirectional long short-term memory (BiLSTM) to recognize m5U sites. Our method involves segmenting RNA sequences into smaller fragments based on a 3-mer length and subsequently mapping each fragment to a lower-dimensional vector representation using the global vectors for word representation (GloVe) technique. Through a series of multi-scale convolution and pooling operations, local features are extracted from RNA sequences and transformed into abstract, high-level features. The feature matrix is then inputted into a BiLSTM network, enabling the capture of contextual information and long-term dependencies within the sequence. Ultimately, a fully connected layer is employed to classify m5U sites. The validation results from 5-fold cross-validation (5-fold CV) test indicate that m5UMCB outperforms existing state-of-the-art predictive methods, demonstrating a 1.98% increase in the area under ROC curve (AUC) and significant improvements in relevant evaluation metrics. We are confident that m5UMCB will serve as a valuable tool for m5U prediction.
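The 3-mer segmentation step mentioned in this abstract can be sketched in a few lines of Python. This is an illustration under stated assumptions, not the m5UMCB code: the abstract does not say whether the 3-mers overlap, so the stride is left as a parameter (defaulting to an overlapping window), and the function names and the simple integer vocabulary are hypothetical stand-ins for the GloVe embedding lookup.

```python
# Illustrative sketch only: stride, names and the vocabulary scheme are assumptions.

def kmer_tokens(seq, k=3, stride=1):
    """Split an RNA sequence into k-mers using a sliding window.

    DNA-style 'T' is converted to 'U' so that inputs are treated as RNA.
    """
    seq = seq.upper().replace("T", "U")
    return [seq[i:i + k] for i in range(0, len(seq) - k + 1, stride)]


def build_vocab(token_lists):
    """Assign an integer id to each distinct k-mer, in order of first appearance."""
    vocab = {}
    for tokens in token_lists:
        for tok in tokens:
            vocab.setdefault(tok, len(vocab))
    return vocab


overlapping = kmer_tokens("AUGGCU")            # stride 1
disjoint = kmer_tokens("AUGGCU", stride=3)     # stride 3
vocab = build_vocab([overlapping])
ids = [vocab[tok] for tok in overlapping]
```

The integer ids would then index an embedding table; in the paper that table is learned with GloVe, which is not reproduced here.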
Affiliation(s)
- Yingshan Ji
  - School of Computer Science and Software Engineering, University of Science and Technology Liaoning, Anshan, 114051, China
- Jianqiang Sun
  - School of Information Science and Engineering, Linyi University, Linyi, 276000, China
- Jingxuan Xie
  - School of Computer Science and Software Engineering, University of Science and Technology Liaoning, Anshan, 114051, China
- Wei Wu
  - School of Computer Science and Software Engineering, University of Science and Technology Liaoning, Anshan, 114051, China
- Stella C Shuai
  - Biological Science, Northwestern University, Evanston, IL, 60208, USA
- Qi Zhao
  - School of Computer Science and Software Engineering, University of Science and Technology Liaoning, Anshan, 114051, China.
- Wei Chen
  - Innovative Institute of Chinese Medicine and Pharmacy, Chengdu University of Traditional Chinese Medicine, Chengdu, 611137, China.
|
12
|
Wang Y, Yu X, Gu Y, Li W, Zhu K, Chen L, Tang Y, Liu G. XGraphCDS: An explainable deep learning model for predicting drug sensitivity from gene pathways and chemical structures. Comput Biol Med 2024; 168:107746. [PMID: 38039896 DOI: 10.1016/j.compbiomed.2023.107746] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 08/12/2023] [Revised: 10/29/2023] [Accepted: 11/20/2023] [Indexed: 12/03/2023]
Abstract
Cancer is a highly complex disease characterized by genetic and phenotypic heterogeneity among individuals. In the era of precision medicine, understanding the genetic basis of these individual differences is crucial for developing new drugs and achieving personalized treatment. Despite the increasing abundance of cancer genomics data, predicting the relationship between cancer samples and drug sensitivity remains challenging. In this study, we developed an explainable graph neural network framework for predicting cancer drug sensitivity (XGraphCDS) based on comparative learning by integrating cancer gene expression information and drug chemical structure knowledge. Specifically, XGraphCDS consists of a unified heterogeneous network and multiple sub-networks, with molecular graphs representing drugs and gene enrichment scores representing cell lines. Experimental results showed that XGraphCDS consistently outperformed most state-of-the-art baselines (R² = 0.863, AUC = 0.858). We also constructed a separate in vivo prediction model by using transfer learning strategies with in vitro experimental data and achieved good predictive power (AUC = 0.808). Simultaneously, our framework is interpretable, providing insights into resistance mechanisms alongside accurate predictions. The excellent performance of XGraphCDS highlights its immense potential in aiding the development of selective anti-tumor drugs and personalized dosing strategies in the field of precision medicine.
Affiliation(s)
- Yimeng Wang
  - Shanghai Frontiers Science Center of Optogenetic Techniques for Cell Metabolism, Shanghai Key Laboratory of New Drug Design, School of Pharmacy, East China University of Science and Technology, Shanghai, 200237, China
- Xinxin Yu
  - Shanghai Frontiers Science Center of Optogenetic Techniques for Cell Metabolism, Shanghai Key Laboratory of New Drug Design, School of Pharmacy, East China University of Science and Technology, Shanghai, 200237, China
- Yaxin Gu
  - Shanghai Frontiers Science Center of Optogenetic Techniques for Cell Metabolism, Shanghai Key Laboratory of New Drug Design, School of Pharmacy, East China University of Science and Technology, Shanghai, 200237, China
- Weihua Li
  - Shanghai Frontiers Science Center of Optogenetic Techniques for Cell Metabolism, Shanghai Key Laboratory of New Drug Design, School of Pharmacy, East China University of Science and Technology, Shanghai, 200237, China
- Keyun Zhu
  - Shanghai Frontiers Science Center of Optogenetic Techniques for Cell Metabolism, Shanghai Key Laboratory of New Drug Design, School of Pharmacy, East China University of Science and Technology, Shanghai, 200237, China
- Long Chen
  - Shanghai Frontiers Science Center of Optogenetic Techniques for Cell Metabolism, Shanghai Key Laboratory of New Drug Design, School of Pharmacy, East China University of Science and Technology, Shanghai, 200237, China
- Yun Tang
  - Shanghai Frontiers Science Center of Optogenetic Techniques for Cell Metabolism, Shanghai Key Laboratory of New Drug Design, School of Pharmacy, East China University of Science and Technology, Shanghai, 200237, China.
- Guixia Liu
  - Shanghai Frontiers Science Center of Optogenetic Techniques for Cell Metabolism, Shanghai Key Laboratory of New Drug Design, School of Pharmacy, East China University of Science and Technology, Shanghai, 200237, China.
|
13
|
Li X, Qin X, Huang C, Lu Y, Cheng J, Wang L, Liu O, Shuai J, Yuan CA. SUnet: A multi-organ segmentation network based on multiple attention. Comput Biol Med 2023; 167:107596. [PMID: 37890423 DOI: 10.1016/j.compbiomed.2023.107596] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 07/20/2023] [Revised: 09/13/2023] [Accepted: 10/17/2023] [Indexed: 10/29/2023]
Abstract
Organ segmentation in abdominal or thoracic computed tomography (CT) images plays a crucial role in medical diagnosis, as it enables doctors to locate and evaluate organ abnormalities quickly, thereby guiding surgical planning and aiding treatment decision-making. This paper proposes a novel and efficient medical image segmentation method called SUnet for multi-organ segmentation in the abdomen and thorax. SUnet is a fully attention-based neural network. Firstly, an efficient spatial reduction attention (ESRA) module is introduced not only to extract image features better, but also to reduce overall model parameters and alleviate overfitting. Secondly, SUnet's multiple attention-based feature fusion module enables effective cross-scale feature integration. Additionally, an enhanced attention gate (EAG) module is built from grouped convolution and residual connections, providing richer semantic features. We evaluate the performance of the proposed model on the Synapse multi-organ segmentation dataset and the Automated Cardiac Diagnosis Challenge (ACDC) dataset. SUnet achieves an average Dice of 84.29% and 92.25% on these two datasets, respectively, outperforming other models of similar complexity and size, and achieving state-of-the-art results.
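For readers unfamiliar with the "average Dice" figure reported in this abstract, the metric can be sketched as follows. This is a generic illustration of the Dice coefficient, 2|A∩B| / (|A| + |B|), averaged over organ labels; it is not the evaluation code used for SUnet, and the function names, the flat-list representation of label maps, and the toy labels are assumptions.

```python
# Illustrative sketch only: flat label lists stand in for segmentation volumes.

def dice(pred, target, label):
    """Dice coefficient for one label: 2 * |A ∩ B| / (|A| + |B|)."""
    p = [x == label for x in pred]
    t = [x == label for x in target]
    intersection = sum(a and b for a, b in zip(p, t))
    denom = sum(p) + sum(t)
    # Convention assumed here: a label absent from both maps scores 1.0.
    return 1.0 if denom == 0 else 2.0 * intersection / denom


def mean_dice(pred, target, labels):
    """Average the per-label Dice over the given foreground labels."""
    return sum(dice(pred, target, lab) for lab in labels) / len(labels)


# Toy label maps with background 0 and two hypothetical organs, 1 and 2.
pred_map = [0, 1, 1, 2, 2, 2]
true_map = [0, 1, 2, 2, 2, 1]
score = mean_dice(pred_map, true_map, labels=[1, 2])
```

Here organ 1 scores 2·1/(2+2) = 0.5 and organ 2 scores 2·2/(3+3) ≈ 0.667, so the average is 7/12.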
Affiliation(s)
- Xiaosen Li
  - School of Artificial Intelligence, Guangxi Minzu University, Nanning, 530006, China; Wenzhou Institute, University of Chinese Academy of Sciences, Wenzhou, 325105, China
- Xiao Qin
  - Guangxi Key Lab of Human-machine Interaction and Intelligent Decision, Nanning Normal University, Nanning, 530023, China
- Chengliang Huang
  - Academy of Artificial Intelligence, Zhejiang Dongfang Polytechnic, Wenzhou, 325025, China
- Yuer Lu
  - Wenzhou Institute, University of Chinese Academy of Sciences, Wenzhou, 325105, China
- Jinyan Cheng
  - Wenzhou Institute, University of Chinese Academy of Sciences, Wenzhou, 325105, China
- Liansheng Wang
  - Department of Computer Science, Xiamen University, Xiamen, 361005, China
- Ou Liu
  - Wenzhou Institute, University of Chinese Academy of Sciences, Wenzhou, 325105, China
- Jianwei Shuai
  - Wenzhou Institute, University of Chinese Academy of Sciences, Wenzhou, 325105, China.
- Chang-An Yuan
  - Guangxi Key Lab of Human-machine Interaction and Intelligent Decision, Nanning Normal University, Nanning, 530023, China; Guangxi Academy of Science, Nanning, 530007, China.
|
14
|
Mou M, Pan Z, Zhou Z, Zheng L, Zhang H, Shi S, Li F, Sun X, Zhu F. A Transformer-Based Ensemble Framework for the Prediction of Protein-Protein Interaction Sites. RESEARCH (WASHINGTON, D.C.) 2023; 6:0240. [PMID: 37771850 PMCID: PMC10528219 DOI: 10.34133/research.0240] [Citation(s) in RCA: 26] [Impact Index Per Article: 26.0] [Received: 08/02/2023] [Accepted: 09/08/2023] [Indexed: 09/30/2023]
Abstract
The identification of protein-protein interaction (PPI) sites is essential in the research of protein function and the discovery of new drugs. So far, a variety of computational tools based on machine learning have been developed to accelerate the identification of PPI sites. However, existing methods suffer from low predictive accuracy or a limited scope of application. Specifically, some methods learned only global or local sequential features, leading to low predictive accuracy, while others achieved improved performance by extracting residue interactions from structures but were limited in their application scope by their heavy dependence on precise structure information. There is an urgent need to develop a method that integrates comprehensive information to realize proteome-wide accurate profiling of PPI sites. Herein, a novel ensemble framework for PPI site prediction, EnsemPPIS, is proposed based on transformer and gated convolutional networks. EnsemPPIS can effectively capture not only global and local patterns but also residue interactions. Specifically, EnsemPPIS is unique in (a) extracting residue interactions from protein sequences with a transformer and (b) further integrating global and local sequential features with an ensemble learning strategy. Compared with various existing methods, EnsemPPIS exhibited either superior performance or broader applicability on multiple PPI site prediction tasks. Moreover, pattern analysis based on the interpretability of EnsemPPIS demonstrated that EnsemPPIS was fully capable of learning residue interactions within the local structure of PPI sites using only sequence information. The web server of EnsemPPIS is freely available at http://idrblab.org/ensemppis.
Affiliation(s)
- Minjie Mou
  - College of Pharmaceutical Sciences, The Second Affiliated Hospital, Zhejiang University School of Medicine, National Key Laboratory of Advanced Drug Delivery and Release Systems, Zhejiang University, Hangzhou 310058, China
- Ziqi Pan
  - College of Pharmaceutical Sciences, The Second Affiliated Hospital, Zhejiang University School of Medicine, National Key Laboratory of Advanced Drug Delivery and Release Systems, Zhejiang University, Hangzhou 310058, China
- Zhimeng Zhou
  - College of Pharmaceutical Sciences, The Second Affiliated Hospital, Zhejiang University School of Medicine, National Key Laboratory of Advanced Drug Delivery and Release Systems, Zhejiang University, Hangzhou 310058, China
- Lingyan Zheng
  - College of Pharmaceutical Sciences, The Second Affiliated Hospital, Zhejiang University School of Medicine, National Key Laboratory of Advanced Drug Delivery and Release Systems, Zhejiang University, Hangzhou 310058, China
- Hanyu Zhang
  - College of Pharmaceutical Sciences, The Second Affiliated Hospital, Zhejiang University School of Medicine, National Key Laboratory of Advanced Drug Delivery and Release Systems, Zhejiang University, Hangzhou 310058, China
- Shuiyang Shi
  - College of Pharmaceutical Sciences, The Second Affiliated Hospital, Zhejiang University School of Medicine, National Key Laboratory of Advanced Drug Delivery and Release Systems, Zhejiang University, Hangzhou 310058, China
- Fengcheng Li
  - College of Pharmaceutical Sciences, The Second Affiliated Hospital, Zhejiang University School of Medicine, National Key Laboratory of Advanced Drug Delivery and Release Systems, Zhejiang University, Hangzhou 310058, China
- Xiuna Sun
  - College of Pharmaceutical Sciences, The Second Affiliated Hospital, Zhejiang University School of Medicine, National Key Laboratory of Advanced Drug Delivery and Release Systems, Zhejiang University, Hangzhou 310058, China
- Feng Zhu
  - College of Pharmaceutical Sciences, The Second Affiliated Hospital, Zhejiang University School of Medicine, National Key Laboratory of Advanced Drug Delivery and Release Systems, Zhejiang University, Hangzhou 310058, China
  - Innovation Institute for Artificial Intelligence in Medicine of Zhejiang University, Alibaba-Zhejiang University Joint Research Center of Future Digital Healthcare, Hangzhou 330110, China
|
15
|
Liang S, Zhao Y, Jin J, Qiao J, Wang D, Wang Y, Wei L. Rm-LR: A long-range-based deep learning model for predicting multiple types of RNA modifications. Comput Biol Med 2023; 164:107238. [PMID: 37515874 DOI: 10.1016/j.compbiomed.2023.107238] [Citation(s) in RCA: 8] [Impact Index Per Article: 8.0] [Received: 05/24/2023] [Revised: 06/16/2023] [Accepted: 07/07/2023] [Indexed: 07/31/2023]
Abstract
Recent research has highlighted the pivotal role of RNA post-transcriptional modifications in the regulation of RNA expression and function. Accurate identification of RNA modification sites is important for understanding RNA function. In this study, we propose a novel RNA modification prediction method, namely Rm-LR, which leverages a long-range-based deep learning approach to accurately predict multiple types of RNA modifications using RNA sequences only. Rm-LR incorporates two large-scale pre-trained RNA language models to capture discriminative sequential information and learn important local features, which are subsequently integrated through a bilinear attention network. Rm-LR supports a total of ten RNA modification types (m6A, m1A, m5C, m5U, m6Am, Ψ, Am, Cm, Gm, and Um) and significantly outperforms the state-of-the-art methods in terms of predictive capability on benchmark datasets. Experimental results show the effectiveness and superiority of Rm-LR in the prediction of various RNA modifications, demonstrating the strong adaptability and robustness of our proposed model. We demonstrate that pre-trained RNA language models can learn dense biological sequential representations from a large-scale long-range RNA corpus while also enhancing the interpretability of the models. This work contributes to the development of accurate and reliable computational models for RNA modification prediction, providing insights into the complex landscape of RNA modifications.
Affiliation(s)
- Sirui Liang
  - School of Software, Shandong University, Jinan, 250101, China; Joint SDU-NTU Centre for Artificial Intelligence Research (C-FAIR), Shandong University, Jinan, 250101, China
- Yanxi Zhao
  - School of Software, Shandong University, Jinan, 250101, China; Joint SDU-NTU Centre for Artificial Intelligence Research (C-FAIR), Shandong University, Jinan, 250101, China
- Junru Jin
  - School of Software, Shandong University, Jinan, 250101, China; Joint SDU-NTU Centre for Artificial Intelligence Research (C-FAIR), Shandong University, Jinan, 250101, China
- Jianbo Qiao
  - School of Software, Shandong University, Jinan, 250101, China; Joint SDU-NTU Centre for Artificial Intelligence Research (C-FAIR), Shandong University, Jinan, 250101, China
- Ding Wang
  - School of Software, Shandong University, Jinan, 250101, China; Joint SDU-NTU Centre for Artificial Intelligence Research (C-FAIR), Shandong University, Jinan, 250101, China
- Yu Wang
  - School of Software, Shandong University, Jinan, 250101, China; Joint SDU-NTU Centre for Artificial Intelligence Research (C-FAIR), Shandong University, Jinan, 250101, China
- Leyi Wei
  - School of Software, Shandong University, Jinan, 250101, China; Joint SDU-NTU Centre for Artificial Intelligence Research (C-FAIR), Shandong University, Jinan, 250101, China.
|
16
|
Kim Y, Lee M. Deep Learning Approaches for lncRNA-Mediated Mechanisms: A Comprehensive Review of Recent Developments. Int J Mol Sci 2023; 24:10299. [PMID: 37373445 DOI: 10.3390/ijms241210299] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 05/24/2023] [Revised: 06/16/2023] [Accepted: 06/17/2023] [Indexed: 06/29/2023] Open
Abstract
This review paper provides an extensive analysis of the rapidly evolving convergence of deep learning and long non-coding RNAs (lncRNAs). Considering the recent advancements in deep learning and the increasing recognition of lncRNAs as crucial components in various biological processes, this review aims to offer a comprehensive examination of these intertwined research areas. The remarkable progress in deep learning necessitates thoroughly exploring its latest applications in the study of lncRNAs. Therefore, this review provides insights into the growing significance of incorporating deep learning methodologies to unravel the intricate roles of lncRNAs. By scrutinizing the most recent research spanning from 2021 to 2023, this paper provides a comprehensive understanding of how deep learning techniques are employed in investigating lncRNAs, thereby contributing valuable insights to this rapidly evolving field. The review is aimed at researchers and practitioners looking to integrate deep learning advancements into their lncRNA studies.
Affiliation(s)
- Yoojoong Kim
  - School of Computer Science and Information Engineering, The Catholic University of Korea, Bucheon 14662, Republic of Korea
- Minhyeok Lee
  - School of Electrical and Electronics Engineering, Chung-Ang University, Seoul 06974, Republic of Korea
|
17
|
Xin X, Jia-Yin Y, Jun-Yang H, Rui W, Xiong-Ri K, Long-Rui D, Liu J, Jue-Yu Z. Comprehensive analysis of lncRNA-mRNA co-expression networks in HPV-driven cervical cancer reveals the pivotal function of LINC00511-PGK1 in tumorigenesis. Comput Biol Med 2023; 159:106943. [PMID: 37099974 DOI: 10.1016/j.compbiomed.2023.106943] [Citation(s) in RCA: 4] [Impact Index Per Article: 4.0] [Received: 10/16/2022] [Revised: 04/06/2023] [Accepted: 04/14/2023] [Indexed: 04/28/2023]
Abstract
BACKGROUND: Mounting evidence suggests that long noncoding RNAs (lncRNAs) are involved in various human cancers. However, the role of these lncRNAs in HPV-driven cervical cancer (CC) has not been extensively studied. Considering that HR-HPV infections contribute to cervical carcinogenesis by regulating the expression of lncRNAs, miRNAs and mRNAs, we aimed to systematically analyze lncRNA and mRNA expression profiles to identify novel lncRNA-mRNA co-expression networks and explore their potential impact on tumorigenesis in HPV-driven CC. METHODS: LncRNA/mRNA microarray technology was utilized to identify the differentially expressed lncRNAs (DElncRNAs) and mRNAs (DEmRNAs) in HPV-16 and HPV-18 cervical carcinogenesis compared to normal cervical tissues. Venn diagrams and weighted gene co-expression network analysis (WGCNA) were used to identify the hub DElncRNAs/DEmRNAs that were significantly correlated with both HPV-16 and HPV-18 CC patients. LncRNA-mRNA correlation analysis and functional enrichment pathway analysis were performed on these key DElncRNAs/DEmRNAs in HPV-16 and HPV-18 CC patients to explore their shared mechanism in HPV-driven CC. A lncRNA-mRNA co-expression score (CES) model was established and validated using the Cox regression method. Afterward, the clinicopathological characteristics of the CES-high and CES-low groups were compared. In vitro functional experiments were performed to evaluate the role of LINC00511 and PGK1 in cell proliferation, migration and invasion in CC cells. To determine whether LINC00511 plays an oncogenic role partially by modulating the expression of PGK1, rescue assays were used. RESULTS: We identified 81 lncRNAs and 211 mRNAs that were commonly differentially expressed in HPV-16 and HPV-18 CC tissues compared to normal tissues. The results of lncRNA-mRNA correlation analysis and functional enrichment pathway analysis showed that the LINC00511-PGK1 co-expression network may make an important contribution to HPV-mediated tumorigenesis and be closely associated with metabolism-related mechanisms. Combined with clinical survival data, the prognostic CES model based on LINC00511 and PGK1 could precisely predict patients' overall survival (OS). CES-high patients had a worse prognosis than CES-low patients, and the enriched pathways and potential targets of applicable drugs were explored in CES-high patients. In vitro experiments confirmed the oncogenic functions of LINC00511 and PGK1 in the progression of CC, and revealed that LINC00511 plays an oncogenic role in CC cells partially by modulating the expression of PGK1. CONCLUSIONS: Together, these data identify co-expression modules that provide valuable information for understanding the pathogenesis of HPV-mediated tumorigenesis and highlight the pivotal function of the LINC00511-PGK1 co-expression network in cervical carcinogenesis. Furthermore, our CES model has reliable predictive ability and can stratify CC patients into low- and high-risk groups for poor survival. This study provides a bioinformatics method for screening prognostic biomarkers, leading to lncRNA-mRNA co-expression network identification and construction for survival prediction and potential drug applications in other cancers.
Affiliation(s)
- Xu Xin
  - Department of Biochemistry and Molecular Biology, School of Basic Medical Sciences, Southern Medical University, Guangzhou, 510515, Guangdong, China
- Yu Jia-Yin
  - Department of Radiation Medicine, Guangdong Provincial Key Laboratory for Safety Evaluation of Cosmetics, School of Public Health, Southern Medical University, Guangzhou, 510515, China
- Huang Jun-Yang
  - Department of Biochemistry and Molecular Biology, School of Basic Medical Sciences, Southern Medical University, Guangzhou, 510515, Guangdong, China; School of Basic Medical Sciences, Southern Medical University, Guangzhou, 510515, Guangdong, China
- Wang Rui
  - Department of Biochemistry and Molecular Biology, School of Basic Medical Sciences, Southern Medical University, Guangzhou, 510515, Guangdong, China
- Kuang Xiong-Ri
  - Department of Biochemistry and Molecular Biology, School of Basic Medical Sciences, Southern Medical University, Guangzhou, 510515, Guangdong, China
- Dang Long-Rui
  - Department of Biochemistry and Molecular Biology, School of Basic Medical Sciences, Southern Medical University, Guangzhou, 510515, Guangdong, China
- Jie Liu
  - Department of Gynaecology and Obstetrics, Nanfang Hospital, Southern Medical University, Guangzhou, 510515, Guangdong, China
- Zhou Jue-Yu
  - Department of Biochemistry and Molecular Biology, School of Basic Medical Sciences, Southern Medical University, Guangzhou, 510515, Guangdong, China.
|
18
|
He H, Duo H, Hao Y, Zhang X, Zhou X, Zeng Y, Li Y, Li B. Computational drug repurposing by exploiting large-scale gene expression data: Strategy, methods and applications. Comput Biol Med 2023; 155:106671. [PMID: 36805225 DOI: 10.1016/j.compbiomed.2023.106671] [Citation(s) in RCA: 4] [Impact Index Per Article: 4.0] [Received: 11/25/2022] [Revised: 02/05/2023] [Accepted: 02/10/2023] [Indexed: 02/18/2023]
Abstract
De novo drug development is an extremely complex, time-consuming and costly task. The urgent need for therapies for various diseases has greatly accelerated the search for more effective drug development methods. Drug repurposing provides a new and effective perspective on disease treatment. Rapidly growing large-scale transcriptome data paint a detailed picture of gene expression during disease onset and have therefore received wide attention in the field of computational drug repurposing. However, how to efficiently mine transcriptome data and identify new indications for old drugs remains a critical challenge. This review discusses the irreplaceable role of transcriptome data in computational drug repurposing and summarizes some representative databases, tools and strategies. More importantly, it proposes a practical guideline by establishing the correspondence between three gene expression data types and five strategies, which should help researchers adopt appropriate strategies to mine large-scale transcriptome data in depth and discover more effective therapies.
Affiliation(s)
- Hao He
- College of Life Sciences, Chongqing Normal University, Chongqing, 400044, PR China; State Key Laboratory of Medical Neurobiology and MOE Frontiers Center for Brain Science, Institutes of Brain Science, Fudan University, Shanghai, 200032, PR China
- Hongrui Duo
- College of Life Sciences, Chongqing Normal University, Chongqing, 400044, PR China
- Youjin Hao
- College of Life Sciences, Chongqing Normal University, Chongqing, 400044, PR China
- Xiaoxi Zhang
- College of Life Sciences, Chongqing Normal University, Chongqing, 400044, PR China
- Xinyi Zhou
- College of Life Sciences, Chongqing Normal University, Chongqing, 400044, PR China
- Yujie Zeng
- College of Life Sciences, Chongqing Normal University, Chongqing, 400044, PR China
- Yinghong Li
- The Key Laboratory on Big Data for Bio Intelligence, Chongqing University of Posts and Telecommunications, Chongqing, 400065, PR China
- Bo Li
- College of Life Sciences, Chongqing Normal University, Chongqing, 400044, PR China.
19
Wang T, Sun J, Zhao Q. Investigating cardiotoxicity related with hERG channel blockers using molecular fingerprints and graph attention mechanism. Comput Biol Med 2023; 153:106464. [PMID: 36584603 DOI: 10.1016/j.compbiomed.2022.106464] [Citation(s) in RCA: 118] [Impact Index Per Article: 118.0] [Received: 11/23/2022] [Revised: 12/12/2022] [Accepted: 12/19/2022] [Indexed: 12/24/2022]
Abstract
Human ether-a-go-go-related gene (hERG) channel blockade by small molecules is a big concern during drug development in the pharmaceutical industry. Failure or inhibition of hERG channel activity caused by drug molecules can lead to prolonging QT interval, which will result in serious cardiotoxicity. Thus, evaluating the hERG blocking activity of all these small molecular compounds is technically challenging, and the relevant procedures are expensive and time-consuming. In this study, we develop a novel deep learning predictive model named DMFGAM for predicting hERG blockers. In order to characterize the molecule more comprehensively, we first consider the fusion of multiple molecular fingerprint features to characterize its final molecular fingerprint features. Then, we use the multi-head attention mechanism to extract the molecular graph features. Both molecular fingerprint features and molecular graph features are fused as the final features of the compounds to make the feature expression of compounds more comprehensive. Finally, the molecules are classified into hERG blockers or hERG non-blockers through the fully connected neural network. We conduct 5-fold cross-validation experiment to evaluate the performance of DMFGAM, and verify the robustness of DMFGAM on external validation datasets. We believe DMFGAM can serve as a powerful tool to predict hERG channel blockers in the early stages of drug discovery and development.
Affiliation(s)
- Tianyi Wang
- School of Computer Science and Software Engineering, University of Science and Technology Liaoning, Anshan, 114051, China
- Jianqiang Sun
- School of Automation and Electrical Engineering, Linyi University, Linyi, 276000, China
- Qi Zhao
- School of Computer Science and Software Engineering, University of Science and Technology Liaoning, Anshan, 114051, China.
20
Yue ZX, Yan TC, Xu HQ, Liu YH, Hong YF, Chen GX, Xie T, Tao L. A systematic review on the state-of-the-art strategies for protein representation. Comput Biol Med 2023; 152:106440. [PMID: 36543002 DOI: 10.1016/j.compbiomed.2022.106440] [Citation(s) in RCA: 1] [Impact Index Per Article: 1.0] [Received: 11/15/2022] [Revised: 12/08/2022] [Accepted: 12/15/2022] [Indexed: 12/23/2022]
Abstract
The study of drug-target protein interaction is a key step in drug research. In recent years, machine learning techniques have become attractive for research, including drug research, due to their automated nature, predictive power, and expected efficiency. Protein representation is a key step in the study of drug-target protein interaction by machine learning, which plays a fundamental role in the ultimate accomplishment of accurate research. With the progress of machine learning, protein representation methods have gradually attracted attention and have consequently developed rapidly. Therefore, in this review, we systematically classify current protein representation methods, comprehensively review them, and discuss the latest advances of interest. According to the information extraction methods and information sources, these representation methods are generally divided into structure-based and sequence-based representation methods. Each primary class can be further divided into specific subcategories. The particular representation methods covered include both traditional and the latest approaches. This review contains a comprehensive assessment of the various methods, which researchers can use as a reference for their specific protein-related research requirements, including drug research.
Affiliation(s)
- Zi-Xuan Yue
- Key Laboratory of Elemene Class Anti-cancer Chinese Medicines, School of Pharmacy, Hangzhou Normal University, Hangzhou, 311121, China
- Tian-Ci Yan
- Key Laboratory of Elemene Class Anti-cancer Chinese Medicines, School of Pharmacy, Hangzhou Normal University, Hangzhou, 311121, China
- Hong-Quan Xu
- Key Laboratory of Elemene Class Anti-cancer Chinese Medicines, School of Pharmacy, Hangzhou Normal University, Hangzhou, 311121, China
- Yu-Hong Liu
- Key Laboratory of Elemene Class Anti-cancer Chinese Medicines, School of Pharmacy, Hangzhou Normal University, Hangzhou, 311121, China
- Yan-Feng Hong
- Key Laboratory of Elemene Class Anti-cancer Chinese Medicines, School of Pharmacy, Hangzhou Normal University, Hangzhou, 311121, China
- Gong-Xing Chen
- Key Laboratory of Elemene Class Anti-cancer Chinese Medicines, School of Pharmacy, Hangzhou Normal University, Hangzhou, 311121, China
- Tian Xie
- Key Laboratory of Elemene Class Anti-cancer Chinese Medicines, School of Pharmacy, Hangzhou Normal University, Hangzhou, 311121, China.
- Lin Tao
- Key Laboratory of Elemene Class Anti-cancer Chinese Medicines, School of Pharmacy, Hangzhou Normal University, Hangzhou, 311121, China.
21
Zhou B, Ding M, Feng J, Ji B, Huang P, Zhang J, Yu X, Cao Z, Yang Y, Zhou Y, Wang J. EVlncRNA-Dpred: improved prediction of experimentally validated lncRNAs by deep learning. Brief Bioinform 2022; 24:6961472. [PMID: 36573492 PMCID: PMC9851331 DOI: 10.1093/bib/bbac583] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 09/14/2022] [Revised: 11/02/2022] [Accepted: 11/29/2022] [Indexed: 12/28/2022]
Abstract
Long non-coding RNAs (lncRNAs) play essential roles in nearly every biological process and disease. Many algorithms were developed to distinguish lncRNAs from mRNAs in transcriptomic data and facilitated discoveries of more than 600 000 lncRNAs. However, only a tiny fraction (<1%) of lncRNA transcripts (~4000) were further validated by low-throughput experiments (EVlncRNAs). Given the cost and labor-intensive nature of experimental validations, it is necessary to develop computational tools to prioritize those potentially functional lncRNAs, because many lncRNAs from high-throughput sequencing (HTlncRNAs) could result from transcriptional noise. Here, we employed deep learning algorithms to separate EVlncRNAs from HTlncRNAs and mRNAs. To overcome the challenge of small datasets, we employed a three-layer deep-learning neural network (DNN) with a K-mer feature as the input and a small convolutional neural network (CNN) with one-hot encoding as the input. Three separate models were trained for human (h), mouse (m) and plant (p), respectively. The final concatenated models (EVlncRNA-Dpred (h), EVlncRNA-Dpred (m) and EVlncRNA-Dpred (p)) provided substantial improvement over a previous model based on support-vector-machines (EVlncRNA-pred). For example, EVlncRNA-Dpred (h) achieved 0.896 for the area under receiver-operating characteristic curve, compared with 0.582 given by the sequence-based EVlncRNA-pred model. The models developed here should be useful for screening lncRNA transcripts for experimental validations. EVlncRNA-Dpred is available as a web server at https://www.sdklab-biophysics-dzu.net/EVlncRNA-Dpred/index.html, and the data and source code are freely available along with the web server.
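The two input encodings named in this abstract (a K-mer feature for the DNN and one-hot encoding for the CNN) can be illustrated with a minimal sketch. This is not the EVlncRNA-Dpred code: the function names `kmer_frequencies` and `one_hot`, the choice of k = 3, the ACGU alphabet, the T-to-U mapping, and the normalization to relative frequencies are all assumptions made for illustration.

```python
from itertools import product

ALPHABET = "ACGU"  # assumed RNA alphabet; T is mapped to U below


def kmer_frequencies(seq, k=3):
    """Return relative frequencies of all 4**k k-mers, in lexicographic order.

    Windows containing characters outside ALPHABET (e.g. N) are skipped.
    """
    seq = seq.upper().replace("T", "U")
    kmers = ["".join(p) for p in product(ALPHABET, repeat=k)]
    counts = dict.fromkeys(kmers, 0)
    total = 0
    for i in range(len(seq) - k + 1):
        window = seq[i:i + k]
        if window in counts:
            counts[window] += 1
            total += 1
    # Avoid division by zero for sequences shorter than k.
    return [counts[m] / total if total else 0.0 for m in kmers]


def one_hot(seq):
    """Encode each base as a 4-element 0/1 row; unknown bases become all zeros."""
    seq = seq.upper().replace("T", "U")
    return [[1 if base == letter else 0 for letter in ALPHABET] for base in seq]


if __name__ == "__main__":
    example = "ACGUACGU"  # toy sequence, not real data
    print(len(kmer_frequencies(example, k=3)))  # length of the k-mer vector
    print(one_hot(example[:2]))
```

The K-mer vector has a fixed length regardless of transcript length, which suits a fully connected network, while the one-hot matrix preserves base order, which is what a convolutional network needs.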
Affiliation(s)
- Bailing Zhou
- Shandong Provincial Key Laboratory of Biophysics, Institute of Biophysics, Dezhou University, Dezhou 253023, China
- Maolin Ding
- School of Computer Science and Engineering, Sun Yat-sen University, Guangzhou 510000, China
- Jing Feng
- Shandong Provincial Key Laboratory of Biophysics, Institute of Biophysics, Dezhou University, Dezhou 253023, China
- Baohua Ji
- Shandong Provincial Key Laboratory of Biophysics, Institute of Biophysics, Dezhou University, Dezhou 253023, China
- Pingping Huang
- Shandong Provincial Key Laboratory of Biophysics, Institute of Biophysics, Dezhou University, Dezhou 253023, China
- Junye Zhang
- Shandong Provincial Key Laboratory of Biophysics, Institute of Biophysics, Dezhou University, Dezhou 253023, China
- Xue Yu
- Shandong Provincial Key Laboratory of Biophysics, Institute of Biophysics, Dezhou University, Dezhou 253023, China
- Zanxia Cao
- Shandong Provincial Key Laboratory of Biophysics, Institute of Biophysics, Dezhou University, Dezhou 253023, China
- Yuedong Yang
- School of Computer Science and Engineering, Sun Yat-sen University, Guangzhou 510000, China
- Yaoqi Zhou
- Corresponding author. Institute for Systems and Physical Biology, Shenzhen Bay Laboratory, Shenzhen 518055, China. Tel.: +86 (755) 6275 2684
- Jihua Wang
- Corresponding author. Shandong Provincial Key Laboratory of Biophysics, Institute of Biophysics, Dezhou University, Dezhou 253023, China. Tel.: +86 (534) 898 5933