1
Zheng H, Xu L, Xie H, Xie J, Ma Y, Hu Y, Wu L, Chen J, Wang M, Yi Y, Huang Y, Wang D. RIscoper 2.0: A deep learning tool to extract RNA biomedical relation sentences from literature. Comput Struct Biotechnol J 2024; 23:1469-1476. [PMID: 38623560 PMCID: PMC11016866 DOI: 10.1016/j.csbj.2024.03.017] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 01/24/2024] [Revised: 03/15/2024] [Accepted: 03/21/2024] [Indexed: 04/17/2024] Open
Abstract
RNA plays an extensive role in a multi-dimensional regulatory system, and its biomedical relationships are scattered across numerous biological studies. However, text-mining work dedicated to extracting RNA biomedical relations remains limited. In this study, we established a comprehensive and reliable corpus of RNA biomedical relations, comprising over 30,000 sentences manually curated from more than 15,000 biomedical publications. We also updated RIscoper to version 2.0, a BERT-based deep learning tool to extract RNA biomedical relation sentences from literature. Benefiting from approximately 100,000 annotated named entities, we integrated the text classification and named entity recognition tasks in this tool. RIscoper 2.0 outperformed the original tool in both tasks and can discover new RNA biomedical relations. Additionally, we provided a user-friendly online search tool that enables rapid scanning of RNA biomedical relationships using local and online resources. Both the online tools and data resources of RIscoper 2.0 are available at http://www.rnainter.org/riscoper.
Affiliation(s)
- Hailong Zheng
  - Department of Bioinformatics, School of Basic Medical Sciences, Southern Medical University, 510515 Guangzhou, China
- Linfu Xu
  - Department of Bioinformatics, School of Basic Medical Sciences, Southern Medical University, 510515 Guangzhou, China
- Hailong Xie
  - Department of Bioinformatics, School of Basic Medical Sciences, Southern Medical University, 510515 Guangzhou, China
- Jiajing Xie
  - National Institute for Data Science in Health and Medicine, Xiamen University, 361102 Xiamen, China
- Yapeng Ma
  - Department of Bioinformatics, School of Basic Medical Sciences, Southern Medical University, 510515 Guangzhou, China
- Yongfei Hu
  - Department of Bioinformatics, School of Basic Medical Sciences, Southern Medical University, 510515 Guangzhou, China
- Le Wu
  - Department of Bioinformatics, School of Basic Medical Sciences, Southern Medical University, 510515 Guangzhou, China
- Jia Chen
  - Department of Bioinformatics, School of Basic Medical Sciences, Southern Medical University, 510515 Guangzhou, China
- Meiyi Wang
  - Department of Bioinformatics, School of Basic Medical Sciences, Southern Medical University, 510515 Guangzhou, China
- Ying Yi
  - Department of Bioinformatics, School of Basic Medical Sciences, Southern Medical University, 510515 Guangzhou, China
- Yan Huang
  - Department of Bioinformatics, School of Basic Medical Sciences, Southern Medical University, 510515 Guangzhou, China
- Dong Wang
  - Department of Bioinformatics, School of Basic Medical Sciences, Southern Medical University, 510515 Guangzhou, China
  - Guangdong Province Key Laboratory of Molecular Tumor Pathology, 510515, Guangzhou, China
2
He Y, Bao X, Chen T, Jiang Q, Zhang L, He LN, Zheng J, Zhao A, Ren J, Zuo Z. RPS 2.0: an updated database of RNAs involved in liquid-liquid phase separation. Nucleic Acids Res 2024:gkae951. [PMID: 39460625 DOI: 10.1093/nar/gkae951] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 09/13/2024] [Revised: 10/05/2024] [Accepted: 10/14/2024] [Indexed: 10/28/2024] Open
Abstract
Liquid-liquid phase separation (LLPS) is a crucial process for the formation of biomolecular condensates such as coacervate droplets, P-bodies and stress granules, which play critical roles in many physiological and pathological processes. A growing number of studies have shown that not only proteins but also RNAs play a critical role in LLPS. To host LLPS-associated RNAs, we previously developed a database named 'RPS' in 2021. In this study, we present an updated version, RPS 2.0 (https://rps.renlab.cn/), to incorporate newly generated data and to host new LLPS-associated RNAs driven by post-transcriptional regulatory mechanisms. Currently, RPS 2.0 hosts 171 301 entries of LLPS-associated RNAs in 24 different biomolecular condensates with four evidence types, including 'Reviewed', 'High-throughput (LLPS enrichment)', 'High-throughput (LLPS perturbation)' and 'Predicted', and five event types, including 'Expression', 'APA', 'AS', 'A-to-I' and 'Modification'. Additionally, extensive annotations of LLPS-associated RNAs are provided in RPS 2.0, including RNA sequence and structure features, RNA-protein/RNA-RNA interactions, RNA modifications, as well as disease-related annotations. We expect that RPS 2.0 will further promote research on LLPS-associated RNAs and deepen our understanding of the biological functions and regulatory mechanisms of LLPS.
Affiliation(s)
- Yongxin He
  - School of Life Sciences, State Key Laboratory of Oncology in South China, Cancer Center, Collaborative Innovation Center for Cancer Medicine, Sun Yat-sen University, Guangzhou 510060, China
- Xiaoqiong Bao
  - School of Life Sciences, State Key Laboratory of Oncology in South China, Cancer Center, Collaborative Innovation Center for Cancer Medicine, Sun Yat-sen University, Guangzhou 510060, China
- Tianjian Chen
  - School of Life Sciences, State Key Laboratory of Oncology in South China, Cancer Center, Collaborative Innovation Center for Cancer Medicine, Sun Yat-sen University, Guangzhou 510060, China
- Qi Jiang
  - School of Life Sciences, State Key Laboratory of Oncology in South China, Cancer Center, Collaborative Innovation Center for Cancer Medicine, Sun Yat-sen University, Guangzhou 510060, China
- Luowanyue Zhang
  - School of Life Sciences, State Key Laboratory of Oncology in South China, Cancer Center, Collaborative Innovation Center for Cancer Medicine, Sun Yat-sen University, Guangzhou 510060, China
- Li-Na He
  - School of Life Sciences, State Key Laboratory of Oncology in South China, Cancer Center, Collaborative Innovation Center for Cancer Medicine, Sun Yat-sen University, Guangzhou 510060, China
- Jian Zheng
  - School of Life Sciences, State Key Laboratory of Oncology in South China, Cancer Center, Collaborative Innovation Center for Cancer Medicine, Sun Yat-sen University, Guangzhou 510060, China
- An Zhao
  - Zhejiang Cancer Institute, Zhejiang Cancer Hospital, Hangzhou Institute of Medicine, Chinese Academy of Sciences, Hangzhou 310000, China
- Jian Ren
  - School of Life Sciences, State Key Laboratory of Oncology in South China, Cancer Center, Collaborative Innovation Center for Cancer Medicine, Sun Yat-sen University, Guangzhou 510060, China
- Zhixiang Zuo
  - School of Life Sciences, State Key Laboratory of Oncology in South China, Cancer Center, Collaborative Innovation Center for Cancer Medicine, Sun Yat-sen University, Guangzhou 510060, China
3
Zhou X, Qin Y, Li J, Fan L, Zhang S, Zhang B, Wu L, Gao A, Yang Y, Lv X, Guo B, Sun L. LncPepAtlas: a comprehensive resource for exploring the translational landscape of long non-coding RNAs. Nucleic Acids Res 2024:gkae905. [PMID: 39435995 DOI: 10.1093/nar/gkae905] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 08/14/2024] [Revised: 09/20/2024] [Accepted: 10/07/2024] [Indexed: 10/23/2024] Open
Abstract
Long non-coding RNAs were commonly viewed as non-coding elements. However, they are increasingly recognized for their ability to be translated into proteins, thereby playing a significant role in various cellular processes and diseases. With developments in biotechnology and computational algorithms, a range of novel approaches are being applied to investigate the translation of long non-coding RNAs (lncRNAs). Herein, we developed the LncPepAtlas database (http://www.cnitbiotool.net/LncPepAtlas/), which aims to compile multiple lines of evidence for the translation of lncRNAs and annotations for the upstream regulation of lncRNAs across various species. LncPepAtlas integrates compelling evidence from nine distinct sources for the translation of lncRNAs. These include a dataset of 2631 publicly available Ribo-seq samples from nine species, which has been collected and analysed. LncPepAtlas offers extensive annotation for lncRNA upstream regulation and expression profiles across various cancers, tissues or cell lines at transcriptional and translational levels. Importantly, it enables novel antigen predictions for lncRNA-encoded peptides. By identifying numerous peptide candidates that could potentially bind to major histocompatibility complex class I and II molecules, this work may provide new insights into cancer immunotherapy. The functions of peptides were inferred by aligning them with experimentally detected proteins. LncPepAtlas aims to become a convenient resource for exploring translatable lncRNAs.
Affiliation(s)
- Xinyuan Zhou
  - Binzhou People's Hospital Affiliated to Shandong First Medical University/College of Medical Information and Artificial Intelligence, Shandong First Medical University and Shandong Academy of Medical Sciences, Jinan, Shandong 250117, China
  - Institute of Brain Science and Brain-inspired Research, Shandong First Medical University and Shandong Academy of Medical Sciences, Jinan, Shandong 250117, China
- Yanxia Qin
  - Binzhou People's Hospital Affiliated to Shandong First Medical University/College of Medical Information and Artificial Intelligence, Shandong First Medical University and Shandong Academy of Medical Sciences, Jinan, Shandong 250117, China
- Jiangxue Li
  - Binzhou People's Hospital Affiliated to Shandong First Medical University/College of Medical Information and Artificial Intelligence, Shandong First Medical University and Shandong Academy of Medical Sciences, Jinan, Shandong 250117, China
- Linyuan Fan
  - Department of Thoracic Surgery, Qilu Hospital of Shandong University, Jinan, Shandong 250000, China
- Shun Zhang
  - School of Information Science and Engineering, University of Jinan, Jinan, Shandong 250022, China
- Bing Zhang
  - School of Mathematical Sciences, Harbin Normal University, Harbin, Heilongjiang 150025, China
- Luoxuan Wu
  - College of Ophthalmology, Shandong First Medical University and Shandong Academy of Medical Sciences, Jinan, Shandong 250117, China
- Anwei Gao
  - Binzhou People's Hospital Affiliated to Shandong First Medical University/College of Medical Information and Artificial Intelligence, Shandong First Medical University and Shandong Academy of Medical Sciences, Jinan, Shandong 250117, China
- Yongsan Yang
  - West China Biomedical Big Data Center, West China Hospital, Sichuan University, Chengdu, Sichuan 610041, China
- Xueqin Lv
  - School of Mathematical Sciences, Harbin Normal University, Harbin, Heilongjiang 150025, China
  - College of Basic Science, Tianjin Sino-German University of Applied Sciences, Tianjin 300350, China
- Bingzhou Guo
  - Binzhou People's Hospital Affiliated to Shandong First Medical University/College of Medical Information and Artificial Intelligence, Shandong First Medical University and Shandong Academy of Medical Sciences, Jinan, Shandong 250117, China
- Liang Sun
  - Binzhou People's Hospital Affiliated to Shandong First Medical University/College of Medical Information and Artificial Intelligence, Shandong First Medical University and Shandong Academy of Medical Sciences, Jinan, Shandong 250117, China
4
Zhang J, Liu L, Wei X, Zhao C, Luo Y, Li J, Le TD. Scanning sample-specific miRNA regulation from bulk and single-cell RNA-sequencing data. BMC Biol 2024; 22:218. [PMID: 39334271 PMCID: PMC11438147 DOI: 10.1186/s12915-024-02020-x] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 01/15/2024] [Accepted: 09/24/2024] [Indexed: 09/30/2024] Open
Abstract
BACKGROUND: RNA-sequencing technology provides an effective tool for understanding miRNA regulation in complex human diseases, including cancers. A large number of computational methods have been developed to make use of bulk and single-cell RNA-sequencing data to identify miRNA regulation at the resolution of multiple samples (i.e. groups of cells or tissues). However, due to the heterogeneity of individual samples, there is a strong need to infer miRNA regulation specific to individual samples, that is, at single-sample resolution. RESULTS: Here, we develop a framework, Scan, for scanning sample-specific miRNA regulation. Since a single network inference method or strategy cannot perform well for all types of new data, Scan incorporates 27 network inference methods and two strategies to infer tissue-specific or cell-specific miRNA regulation from bulk or single-cell RNA-sequencing data. Results on bulk and single-cell RNA-sequencing data demonstrate the effectiveness of Scan in inferring sample-specific miRNA regulation. Moreover, we have found that incorporating prior information on miRNA targets can generally improve the accuracy of miRNA target prediction. In addition, Scan can help construct cell/tissue correlation networks and recover aggregate miRNA regulatory networks. Finally, the comparison results have shown that the performance of network inference methods is likely to be data-specific, and selecting optimal network inference methods is required for more accurate prediction of miRNA targets. CONCLUSIONS: Scan provides a useful method to help infer sample-specific miRNA regulation for new data, benchmark new network inference methods and deepen the understanding of miRNA regulation at the resolution of individual samples.
Affiliation(s)
- Junpeng Zhang
  - School of Engineering, Dali University, Dali, 671003, Yunnan, China.
- Lin Liu
  - UniSA STEM, University of South Australia, Mawson Lakes, SA, 5095, Australia
- Xuemei Wei
  - School of Engineering, Dali University, Dali, 671003, Yunnan, China
- Chunwen Zhao
  - School of Engineering, Dali University, Dali, 671003, Yunnan, China
- Yanbi Luo
  - School of Engineering, Dali University, Dali, 671003, Yunnan, China
- Jiuyong Li
  - UniSA STEM, University of South Australia, Mawson Lakes, SA, 5095, Australia
- Thuc Duy Le
  - UniSA STEM, University of South Australia, Mawson Lakes, SA, 5095, Australia.
5
Fuller RN, Morcos A, Bustillos JG, Molina DC, Wall NR. Small non-coding RNAs and pancreatic ductal adenocarcinoma: Linking diagnosis, pathogenesis, drug resistance, and therapeutic potential. Biochim Biophys Acta Rev Cancer 2024; 1879:189153. [PMID: 38986720 DOI: 10.1016/j.bbcan.2024.189153] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 01/22/2024] [Revised: 07/03/2024] [Accepted: 07/05/2024] [Indexed: 07/12/2024]
Abstract
This review comprehensively investigates the intricate interplay between small non-coding RNAs (sncRNAs) and pancreatic ductal adenocarcinoma (PDAC), a devastating malignancy with limited therapeutic options. Our analysis reveals the pivotal roles of sncRNAs in various facets of PDAC biology, spanning diagnosis, pathogenesis, drug resistance, and therapeutic strategies. sncRNAs have emerged as promising biomarkers for PDAC, demonstrating distinct expression profiles in diseased tissues. Differential sncRNA expression patterns, often detectable in bodily fluids, hold potential for early and minimally invasive diagnostic approaches. Furthermore, sncRNAs exhibit intricate involvement in PDAC pathogenesis, regulating critical cellular processes such as proliferation, apoptosis, and metastasis. Additionally, mechanistic insights into sncRNA-mediated pathogenic pathways illuminate novel therapeutic targets and interventions. A significant focus of this review is dedicated to unraveling sncRNA mechanisms underlying drug resistance in PDAC. Understanding these mechanisms at the molecular level is imperative for devising strategies to overcome drug resistance. Exploring the therapeutic landscape, we discuss the potential of sncRNAs as therapeutic agents themselves, since their ability to modulate gene expression with high specificity renders them attractive candidates for targeted therapy. In summary, this review integrates current knowledge on sncRNAs in PDAC, offering a holistic perspective on their diagnostic, pathogenic, and therapeutic relevance. By elucidating the roles of sncRNAs in PDAC biology, this review provides valuable insights for the development of novel diagnostic tools and targeted therapeutic approaches, crucial for improving the prognosis of PDAC patients.
Affiliation(s)
- Ryan N Fuller
  - Department of Basic Science, Division of Biochemistry, Center for Health Disparity and Mol. Med., Loma Linda University, Loma Linda, CA 92350, USA; Department of Radiation Medicine, James M. Slater, MD Proton Treatment and Research Center, Loma Linda University, Loma Linda, CA 92350, USA
- Ann Morcos
  - Department of Basic Science, Division of Biochemistry, Center for Health Disparity and Mol. Med., Loma Linda University, Loma Linda, CA 92350, USA; Department of Radiation Medicine, James M. Slater, MD Proton Treatment and Research Center, Loma Linda University, Loma Linda, CA 92350, USA
- Joab Galvan Bustillos
  - Department of Basic Science, Division of Biochemistry, Center for Health Disparity and Mol. Med., Loma Linda University, Loma Linda, CA 92350, USA; Division of Surgical Oncology, Department of Surgery, Loma Linda University, Loma Linda, CA 92350, USA
- David Caba Molina
  - Division of Surgical Oncology, Department of Surgery, Loma Linda University, Loma Linda, CA 92350, USA
- Nathan R Wall
  - Department of Basic Science, Division of Biochemistry, Center for Health Disparity and Mol. Med., Loma Linda University, Loma Linda, CA 92350, USA; Department of Radiation Medicine, James M. Slater, MD Proton Treatment and Research Center, Loma Linda University, Loma Linda, CA 92350, USA.
6
Cavalleri E, Cabri A, Soto-Gomez M, Bonfitto S, Perlasca P, Gliozzo J, Callahan TJ, Reese J, Robinson PN, Casiraghi E, Valentini G, Mesiti M. An ontology-based knowledge graph for representing interactions involving RNA molecules. Sci Data 2024; 11:906. [PMID: 39174566 PMCID: PMC11341713 DOI: 10.1038/s41597-024-03673-7] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 12/22/2023] [Accepted: 07/23/2024] [Indexed: 08/24/2024] Open
Abstract
The "RNA world" represents a novel frontier for the study of fundamental biological processes and human diseases and is paving the way for the development of new drugs tailored to each patient's biomolecular characteristics. Although scientific data about coding and non-coding RNA molecules are constantly produced and available from public repositories, they are scattered across different databases and a centralized, uniform, and semantically consistent representation of the "RNA world" is still lacking. We propose RNA-KG, a knowledge graph (KG) encompassing biological knowledge about RNAs gathered from more than 60 public databases, integrating functional relationships with genes, proteins, and chemicals and ontologically grounded biomedical concepts. To develop RNA-KG, we first identified, pre-processed, and characterized each data source; next, we built a meta-graph that provides an ontological description of the KG by representing all the bio-molecular entities and medical concepts of interest in this domain, as well as the types of interactions connecting them. Finally, we leveraged an instance-based semantically abstracted knowledge model to specify the ontological alignment according to which RNA-KG was generated. RNA-KG can be downloaded in different formats and also queried by a SPARQL endpoint. A thorough topological analysis of the resulting heterogeneous graph provides further insights into the characteristics of the "RNA world". RNA-KG can be both directly explored and visualized, and/or analyzed by applying computational methods to infer bio-medical knowledge from its heterogeneous nodes and edges. The resource can be easily updated with new experimental data, and specific views of the overall KG can be extracted according to the bio-medical problem to be studied.
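The idea of extracting a problem-specific view from a heterogeneous knowledge graph can be illustrated with a minimal sketch. It assumes the graph is held as plain (subject, predicate, object) triples in memory; the node identifiers, predicate names, and the helper functions `extract_view` and `node_degrees` are hypothetical placeholders and are not taken from RNA-KG or its SPARQL endpoint.

```python
from collections import Counter

# Toy triples (subject, predicate, object). Every identifier here is a
# hypothetical placeholder, not an actual RNA-KG node or relation.
TRIPLES = [
    ("miRNA:A", "regulates", "gene:X"),
    ("miRNA:A", "associated_with", "disease:D1"),
    ("lncRNA:B", "interacts_with", "protein:P"),
    ("gene:X", "associated_with", "disease:D1"),
]


def extract_view(triples, node_prefixes):
    """Return the edges whose subject and object both start with an allowed prefix."""
    prefixes = tuple(node_prefixes)
    return [
        (s, p, o)
        for s, p, o in triples
        if s.startswith(prefixes) and o.startswith(prefixes)
    ]


def node_degrees(triples):
    """Count how many edges touch each node (a very simple topological summary)."""
    degrees = Counter()
    for s, _, o in triples:
        degrees[s] += 1
        degrees[o] += 1
    return degrees


# A view restricted to miRNA and disease nodes.
mirna_disease_view = extract_view(TRIPLES, ["miRNA:", "disease:"])
degrees = node_degrees(TRIPLES)
```

In this toy setting the view keeps only the miRNA-disease edge, and the degree count gives a first impression of which nodes are hubs; a real analysis would run against the published graph files or the endpoint instead.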
Affiliation(s)
- Emanuele Cavalleri
  - AnacletoLab, Computer Science Department, University of Milan, Milan, 20133, Italy
- Alberto Cabri
  - AnacletoLab, Computer Science Department, University of Milan, Milan, 20133, Italy
- Mauricio Soto-Gomez
  - AnacletoLab, Computer Science Department, University of Milan, Milan, 20133, Italy
- Sara Bonfitto
  - AnacletoLab, Computer Science Department, University of Milan, Milan, 20133, Italy
- Paolo Perlasca
  - AnacletoLab, Computer Science Department, University of Milan, Milan, 20133, Italy
- Jessica Gliozzo
  - AnacletoLab, Computer Science Department, University of Milan, Milan, 20133, Italy
- Tiffany J Callahan
  - Department of Biomedical Informatics, Columbia University Irving Medical Center, New York, NY, 10032, USA
- Justin Reese
  - Environmental Genomics and Systems Biology Division, Lawrence Berkeley National Laboratory, Berkeley, CA, 94720, USA
- Peter N Robinson
  - Berlin Institute of Health - Charité, Universitätsmedizin, Berlin, 13353, Germany
  - ELLIS, European Laboratory for Learning and Intelligent Systems, Munich, Germany
- Elena Casiraghi
  - AnacletoLab, Computer Science Department, University of Milan, Milan, 20133, Italy
  - Environmental Genomics and Systems Biology Division, Lawrence Berkeley National Laboratory, Berkeley, CA, 94720, USA
  - ELLIS, European Laboratory for Learning and Intelligent Systems, Munich, Germany
- Giorgio Valentini
  - AnacletoLab, Computer Science Department, University of Milan, Milan, 20133, Italy
  - ELLIS, European Laboratory for Learning and Intelligent Systems, Munich, Germany
- Marco Mesiti
  - AnacletoLab, Computer Science Department, University of Milan, Milan, 20133, Italy.
  - Environmental Genomics and Systems Biology Division, Lawrence Berkeley National Laboratory, Berkeley, CA, 94720, USA.
7
Suzuki T, Bono H. A systematic exploration of unexploited genes for oxidative stress in Parkinson's disease. NPJ Parkinsons Dis 2024; 10:160. [PMID: 39154038 PMCID: PMC11330442 DOI: 10.1038/s41531-024-00776-1] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 03/12/2024] [Accepted: 08/05/2024] [Indexed: 08/19/2024] Open
Abstract
Human disease-associated gene data are accessible through databases, including the Open Targets Platform, DisGeNET, miRTex, RNADisease, and PubChem. However, missing data entries in such databases are anticipated because of curation errors, biases, and text-mining failures. Additionally, the sheer volume of research on human diseases makes it challenging to register comprehensive data. The lack of essential data in databases hinders knowledge sharing and should be addressed. Therefore, we propose an analysis pipeline to explore missing entries of unexploited genes in the human disease-associated gene databases. Using this pipeline for genes in Parkinson's disease with oxidative stress revealed two unexploited genes: nuclear protein 1 (NUPR1) and ubiquitin-like with PHD and ring finger domains 2 (UHRF2). This methodology enhances the identification of underrepresented disease-associated genes, facilitating easier access to potential human disease-related functional genes. This study aims to identify unexploited genes for further research and does not include independent experimental validation.
Affiliation(s)
- Takayuki Suzuki
  - Graduate School of Integrated Sciences for Life, Hiroshima University, 3-10-23 Kagamiyama, Higashi-Hiroshima, Hiroshima, 739-0046, Japan
- Hidemasa Bono
  - Graduate School of Integrated Sciences for Life, Hiroshima University, 3-10-23 Kagamiyama, Higashi-Hiroshima, Hiroshima, 739-0046, Japan.
  - Genome Editing Innovation Center, Hiroshima University, 3-10-23 Kagamiyama, Higashi-Hiroshima, Hiroshima, 739-0046, Japan.
  - Database Center for Life Science (DBCLS), Joint Support-Center for Data Science Research, Research Organization of Information and Systems (ROIS), 178-4-4 Wakashiba, Kashiwa, Chiba, 277-0871, Japan.
8
Chaudhary U, Banerjee S. Decoding the Non-coding: Tools and Databases Unveiling the Hidden World of "Junk" RNAs for Innovative Therapeutic Exploration. ACS Pharmacol Transl Sci 2024; 7:1901-1915. [PMID: 39022352 PMCID: PMC11249652 DOI: 10.1021/acsptsci.3c00388] [Citation(s) in RCA: 1] [Impact Index Per Article: 1.0] [Received: 12/25/2023] [Revised: 05/15/2024] [Accepted: 05/27/2024] [Indexed: 07/20/2024]
Abstract
Non-coding RNAs are pivotal regulators of gene and protein expression, exerting crucial influences on diverse biological processes. Their dysregulation is frequently implicated in the onset and progression of diseases, notably cancer. A profound comprehension of the intricate mechanisms governing ncRNAs is imperative for devising innovative therapeutic interventions against these debilitating conditions. Significantly, nearly 80% of our genome comprises ncRNAs, underscoring their centrality in cellular processes. The elucidation of ncRNA functions is pivotal for grasping the complexities of gene regulation and its implications for human health. Modern genome sequencing techniques yield vast datasets, stored in specialized databases. To harness this wealth of information and to understand the crosstalk of non-coding RNAs, knowledge of available databases is required, and many new sophisticated computational tools have emerged. These tools play a pivotal role in the identification, prediction, and annotation of ncRNAs, thereby facilitating their experimental validation. This Review succinctly outlines the current understanding of ncRNAs, emphasizing their involvement in disease development. It also highlights the databases and tools instrumental in classifying, annotating, and evaluating ncRNAs. By extracting meaningful biological insights from seemingly "junk" data, these tools empower scientists to unravel the intricate roles of ncRNAs in shaping human health.
Affiliation(s)
- Uma Chaudhary
  - Department of Biotechnology, School of Biosciences and Technology, Vellore Institute of Technology (VIT), Vellore, Tamil Nadu 632014, India
- Satarupa Banerjee
  - Department of Biotechnology, School of Biosciences and Technology, Vellore Institute of Technology (VIT), Vellore, Tamil Nadu 632014, India
9
Zhao Y, Xiang J, Shi X, Jia P, Zhang Y, Li M. MDDOmics: multi-omics resource of major depressive disorder. Database (Oxford) 2024; 2024:baae042. [PMID: 38917209 PMCID: PMC11197964 DOI: 10.1093/database/baae042] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 12/22/2023] [Revised: 03/02/2024] [Accepted: 05/29/2024] [Indexed: 06/27/2024]
Abstract
Major depressive disorder (MDD) is a pressing global health issue. Its pathogenesis remains elusive, but numerous studies have revealed its intricate associations with various biological factors. Consequently, there is an urgent need for a comprehensive multi-omics resource to help researchers conduct multi-omics data analysis for MDD. To address this issue, we constructed the MDDOmics database (Major Depressive Disorder Omics, https://www.csuligroup.com/MDDOmics/), which integrates an extensive collection of published multi-omics data related to MDD. The database contains 41 222 entries of MDD research results and several original datasets, including Single Nucleotide Polymorphisms, genes, non-coding RNAs, DNA methylations, metabolites and proteins, and offers various interfaces for searching and visualization. We also provide extensive downstream analyses of the collected MDD data, including differential analysis, enrichment analysis and disease-gene prediction. The database also incorporates multi-omics data for bipolar disorder, schizophrenia and anxiety disorder, because MDD is difficult to differentiate from similar psychiatric disorders. In conclusion, by leveraging the rich content and online interfaces of MDDOmics, researchers can conduct more comprehensive analyses of MDD and similar disorders from various perspectives, thereby gaining a deeper understanding of potential MDD biomarkers and intricate disease pathogenesis. Database URL: https://www.csuligroup.com/MDDOmics/.
Affiliation(s)
- Yichao Zhao
  - School of Computer Science and Engineering, Central South University, No.932 South Lushan Road, Changsha 410083, China
- Ju Xiang
  - School of Computer and Communication Engineering, Changsha University of Science and Technology, No.45 Chiling Road, Changsha 410114, China
- Xingyuan Shi
  - School of Computer Science and Engineering, Central South University, No.932 South Lushan Road, Changsha 410083, China
- Pengzhen Jia
  - School of Computer Science and Engineering, Central South University, No.932 South Lushan Road, Changsha 410083, China
- Yan Zhang
  - Department of Psychiatry, and National Clinical Research Center for Mental Disorders, The Second Xiangya Hospital of Central South University, No.139 Renmin Road Central, Changsha 410011, China
- Min Li
  - School of Computer Science and Engineering, Central South University, No.932 South Lushan Road, Changsha 410083, China
10
Fuller RN, Vallejos PA, Kabagwira J, Liu T, Wang C, Wall NR. miRNA signatures underlie chemoresistance in the gemcitabine-resistant pancreatic ductal adenocarcinoma cell line MIA PaCa-2 GR. Front Genet 2024; 15:1393353. [PMID: 38919953 PMCID: PMC11196613 DOI: 10.3389/fgene.2024.1393353] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 02/28/2024] [Accepted: 05/03/2024] [Indexed: 06/27/2024] Open
Abstract
Introduction: Chemotherapy resistance remains a significant challenge in the treatment of pancreatic adenocarcinoma (PDAC), particularly in relation to gemcitabine (Gem), a commonly used chemotherapeutic agent. MicroRNAs (miRNAs) are known to influence cancer progression and chemoresistance. This study investigates the association between miRNA expression profiles and gemcitabine resistance in PDAC. Methods: The miRNA expression profiles of a gemcitabine-sensitive (GS) PDAC cell line, MIA PaCa-2, and its gemcitabine-resistant (GR) progeny, MIA PaCa-2 GR, were analyzed. miRNA sequencing (miRNA-seq) was employed to identify miRNAs expressed in these cell lines. Differential expression analysis was performed, and Ingenuity Pathway Analysis (IPA) was utilized to elucidate the biological functions of the differentially expressed miRNAs. Results: A total of 1867 miRNAs were detected across both cell lines. Among these, 97 (5.2%) miRNAs showed significant differential expression between the GR and GS cell lines, with 65 (3.5%) miRNAs upregulated and 32 (1.7%) miRNAs downregulated in the GR line. The most notably altered miRNAs were implicated in key biological processes such as cell proliferation, migration, invasion, chemosensitization, alternative splicing, apoptosis, and angiogenesis. A subset of these miRNAs was further analyzed in patient samples to identify potential markers for recurrent tumors. Discussion: The differential miRNA expression profiles identified in this study highlight the complex regulatory roles of miRNAs in gemcitabine resistance in PDAC. These findings suggest potential targets for improving prognosis and tailoring treatment strategies in PDAC patients, particularly those showing resistance to gemcitabine. Future research should focus on validating these miRNAs as biomarkers for resistance and exploring their therapeutic potential in overcoming chemoresistance.
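The percentages reported in the Results can be checked with a few lines of arithmetic. This is only a consistency check on the counts quoted in the abstract (1867 detected, 97 differentially expressed, 65 up, 32 down); the variable names and the helper `pct` are illustrative.

```python
# Counts quoted in the abstract above.
TOTAL_DETECTED = 1867
DIFFERENTIAL = 97
UPREGULATED = 65
DOWNREGULATED = 32


def pct(part, whole):
    """Percentage of `whole` represented by `part`, rounded to one decimal place."""
    return round(100 * part / whole, 1)


shares = {
    "differential": pct(DIFFERENTIAL, TOTAL_DETECTED),
    "upregulated": pct(UPREGULATED, TOTAL_DETECTED),
    "downregulated": pct(DOWNREGULATED, TOTAL_DETECTED),
}

# The up- and downregulated counts should add up to the differential total.
counts_consistent = UPREGULATED + DOWNREGULATED == DIFFERENTIAL
```

The computed shares agree with the 5.2%, 3.5%, and 1.7% stated in the abstract, and the two directional counts sum to the 97 differentially expressed miRNAs.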
Affiliation(s)
- Ryan N. Fuller
- Division of Biochemistry, Department of Basic Science, Center for Health Disparities and Molecular Medicine, Loma Linda, CA, United States
| | - Paul A. Vallejos
- Division of Biochemistry, Department of Basic Science, Center for Health Disparities and Molecular Medicine, Loma Linda, CA, United States
| | - Janviere Kabagwira
- Division of Biochemistry, Department of Basic Science, Center for Health Disparities and Molecular Medicine, Loma Linda, CA, United States
| | - Tiantian Liu
- Center for Genomics, Loma Linda University School of Medicine, Loma Linda, CA, United States
| | - Charles Wang
- Center for Genomics, Loma Linda University School of Medicine, Loma Linda, CA, United States
- Division of Microbiology, Department of Basic Science, Loma Linda University School of Medicine, Loma Linda, CA, United States
| | - Nathan R. Wall
- Division of Biochemistry, Department of Basic Science, Center for Health Disparities and Molecular Medicine, Loma Linda, CA, United States
- Department of Radiation Medicine, James M. Slater, MD Proton Treatment and Research Center, Loma Linda University School of Medicine, Loma Linda, CA, United States
| |
|
11
|
Peng L, Ren M, Huang L, Chen M. GEnDDn: An lncRNA-Disease Association Identification Framework Based on Dual-Net Neural Architecture and Deep Neural Network. Interdiscip Sci 2024; 16:418-438. [PMID: 38733474 DOI: 10.1007/s12539-024-00619-w] [Received: 11/18/2023] [Revised: 02/02/2024] [Accepted: 02/03/2024] [Indexed: 05/13/2024]
Abstract
Accumulating studies have demonstrated close relationships between long non-coding RNAs (lncRNAs) and diseases. Identifying new lncRNA-disease associations (LDAs) helps us better understand disease mechanisms and provides promising insights into targeted cancer therapy and anti-cancer drug design. Here, we present a deep learning-based LDA prediction framework called GEnDDn. GEnDDn comprises two main steps. First, features of both lncRNAs and diseases are extracted by combining similarity computation, non-negative matrix factorization, and a graph attention auto-encoder, and each lncRNA-disease pair (LDP) is represented as a vector by concatenating the extracted features. Subsequently, unknown LDPs are classified by aggregating a dual-net neural architecture and a deep neural network. Using six different evaluation metrics, we found that GEnDDn surpassed four competing LDA identification methods (SDLDA, LDNFSGB, IPCARF, LDASR) on the lncRNADisease and MNDR databases under fivefold cross-validation experiments on lncRNAs, diseases, LDPs, and independent lncRNAs and independent diseases. Ablation experiments further validated the LDA prediction performance of GEnDDn. Furthermore, we used GEnDDn to find potential lncRNAs associated with lung cancer and breast cancer. The results suggested that there may be dense linkages between IFNG-AS1 and lung cancer as well as between HIF1A-AS1 and breast cancer. These results require further biomedical experimental verification. GEnDDn is publicly available at https://github.com/plhhnu/GEnDDn.
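The abstract distinguishes cross-validation "on lncRNAs" and "on diseases" from cross-validation on pairs. One common reading, assumed here rather than taken from the paper, is that whole lncRNAs (or whole diseases) are held out so that none of their pairs appear in training. The sketch below illustrates that reading on toy data; all identifiers and helper names are hypothetical.

```python
import random


def five_fold_groups(items, n_folds=5, seed=0):
    """Shuffle items and deal them into n_folds roughly equal groups."""
    rng = random.Random(seed)
    shuffled = list(items)
    rng.shuffle(shuffled)
    return [shuffled[i::n_folds] for i in range(n_folds)]


def split_pairs(pairs, held_out, position):
    """Send a pair to the test set when its entity at `position` is held out.

    position 0 selects the lncRNA of the pair, position 1 the disease.
    """
    held = set(held_out)
    train = [p for p in pairs if p[position] not in held]
    test = [p for p in pairs if p[position] in held]
    return train, test


# Toy data with hypothetical identifiers.
lncrnas = [f"lnc{i}" for i in range(10)]
diseases = [f"dis{j}" for j in range(5)]
pairs = [(l, d) for l in lncrnas for d in diseases]

# Cross-validation on lncRNAs: hold out one fifth of the lncRNAs per fold.
folds = five_fold_groups(lncrnas)
train, test = split_pairs(pairs, folds[0], position=0)
print(len(train), len(test))
```

Passing `position=1` with folds built from `diseases` gives the disease-wise variant; splitting `pairs` directly gives the pair-wise variant.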
Affiliation(s)
- Lihong Peng
- College of Life Science and Chemistry, Hunan University of Technology, Zhuzhou, 412007, China
| | - Mengnan Ren
- College of Life Science and Chemistry, Hunan University of Technology, Zhuzhou, 412007, China
| | - Liangliang Huang
- College of Life Science and Chemistry, Hunan University of Technology, Zhuzhou, 412007, China
| | - Min Chen
- School of Computer Science, Hunan Institute of Technology, Hengyang, 421002, China.
| |
|
12
|
Sun SL, Zhou BW, Liu SZ, Xiu YH, Bilal A, Long HX. Prediction of miRNAs and diseases association based on sparse autoencoder and MLP. Front Genet 2024; 15:1369811. [PMID: 38873111 PMCID: PMC11169787 DOI: 10.3389/fgene.2024.1369811] [Received: 01/13/2024] [Accepted: 05/07/2024] [Indexed: 06/15/2024] Open
Abstract
Introduction: MicroRNAs (miRNAs) are small non-coding RNA molecules with multiple important regulatory roles within cells. A growing body of research shows that abnormal miRNA expression is closely related to various diseases. Understanding the relationship between miRNAs and diseases is crucial for uncovering disease pathogenesis and exploring new treatments. Methods: We propose a new method based on a sparse autoencoder and a multi-layer perceptron (MLP), called SPALP, to predict associations between miRNAs and diseases. First, the SPALP model uses a sparse autoencoder to perform feature learning on the initial features of miRNAs and diseases separately, obtaining their latent features. Then, the latent features are combined with miRNA functional similarity data and disease semantic similarity data to construct comprehensive miRNA-disease datasets. Subsequently, the MLP model predicts unknown associations between miRNAs and diseases. Results: To verify the performance of our model, we set up several comparative experiments. The experimental results show that, compared with traditional methods and other deep learning prediction methods, our method has significantly improved the accuracy of predicting miRNA-disease associations, reaching 94.61% accuracy and an AUC of 0.9859. Finally, we conducted a case study of the SPALP model. We predicted the top 30 miRNAs that might be related to each of five diseases of the elderly (lupus erythematosus, acute myeloid leukemia, cardiovascular disease, stroke, and diabetes mellitus) and validated that 27, 29, 29, 30, and 30 of the top 30, respectively, are indeed associated. Discussion: The SPALP approach introduced in this study is adept at forecasting links between miRNAs and diseases, addressing the complexities of analyzing extensive bioinformatics datasets and enriching the understanding of how miRNAs contribute to disease progression.
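The abstract reports accuracy and AUC for the predicted associations. As a reminder of what those two metrics measure, here is a small self-contained sketch on made-up labels and scores; it is not the authors' evaluation code, and the 0.5 decision threshold is an assumption.

```python
def roc_auc(labels, scores):
    """AUC as the probability that a random positive outscores a random negative.

    Ties between a positive and a negative count as half a win.
    """
    pos = [s for y, s in zip(labels, scores) if y == 1]
    neg = [s for y, s in zip(labels, scores) if y == 0]
    if not pos or not neg:
        raise ValueError("need at least one positive and one negative example")
    wins = 0.0
    for p in pos:
        for n in neg:
            if p > n:
                wins += 1.0
            elif p == n:
                wins += 0.5
    return wins / (len(pos) * len(neg))


def accuracy(labels, scores, threshold=0.5):
    """Fraction of examples whose thresholded score matches the label."""
    predicted = [1 if s >= threshold else 0 for s in scores]
    correct = sum(1 for y, p in zip(labels, predicted) if y == p)
    return correct / len(labels)


# Made-up example: 1 = known association, 0 = no known association.
labels = [1, 1, 1, 0, 0, 0]
scores = [0.9, 0.8, 0.4, 0.6, 0.3, 0.1]

print(round(roc_auc(labels, scores), 4))
print(round(accuracy(labels, scores), 4))
```

AUC depends only on how the scores rank positives against negatives, whereas accuracy depends on the chosen threshold, which is why the two can diverge on the same predictions.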
Affiliation(s)
- Si-Lin Sun
- Department of Information Science Technology, Hainan Normal University, Haikou, Hainan, China
| | - Bing-Wei Zhou
- Department of Information Science Technology, Hainan Normal University, Haikou, Hainan, China
| | - Sheng-Zheng Liu
- Department of Information Science Technology, Hainan Normal University, Haikou, Hainan, China
| | - Yu-Han Xiu
- Department of Information Science Technology, Hainan Normal University, Haikou, Hainan, China
| | - Anas Bilal
- Department of Information Science Technology, Hainan Normal University, Haikou, Hainan, China
- Key Laboratory of Data Science and Smart Education, Ministry of Education, Hainan Normal University, Haikou, China
| | - Hai-Xia Long
- Department of Information Science Technology, Hainan Normal University, Haikou, Hainan, China
- Key Laboratory of Data Science and Smart Education, Ministry of Education, Hainan Normal University, Haikou, China
| |
|
13
|
Wei H, Gao L, Wu S, Jiang Y, Liu B. DiSMVC: a multi-view graph collaborative learning framework for measuring disease similarity. Bioinformatics 2024; 40:btae306. [PMID: 38715444 PMCID: PMC11256965 DOI: 10.1093/bioinformatics/btae306] [Received: 02/28/2024] [Revised: 04/19/2024] [Accepted: 05/05/2024] [Indexed: 05/30/2024] Open
Abstract
MOTIVATION Exploring potential associations between diseases can help in understanding pathological mechanisms of diseases and facilitating the discovery of candidate biomarkers and drug targets, thereby promoting disease diagnosis and treatment. Some computational methods have been proposed for measuring disease similarity. However, these methods describe diseases without considering their latent multi-molecule regulation and valuable supervision signal, resulting in limited biological interpretability and efficiency to capture association patterns. RESULTS In this study, we propose a new computational method named DiSMVC. Different from existing predictors, DiSMVC designs a supervised graph collaborative framework to measure disease similarity. Multiple bio-entity associations related to genes and miRNAs are integrated via cross-view graph contrastive learning to extract informative disease representation, and then association pattern joint learning is implemented to compute disease similarity by incorporating phenotype-annotated disease associations. The experimental results show that DiSMVC can draw discriminative characteristics for disease pairs, and outperform other state-of-the-art methods. As a result, DiSMVC is a promising method for predicting disease associations with molecular interpretability. AVAILABILITY AND IMPLEMENTATION Datasets and source codes are available at https://github.com/Biohang/DiSMVC.
Affiliation(s)
- Hang Wei
- School of Computer Science and Technology, Xidian University, Xi’an, Shaanxi 710126, China
| | - Lin Gao
- School of Computer Science and Technology, Xidian University, Xi’an, Shaanxi 710126, China
| | - Shuai Wu
- School of Computer Science and Technology, Xidian University, Xi’an, Shaanxi 710126, China
| | - Yina Jiang
- Department of Basic Medicine, Shaanxi University of Chinese Medicine, Xianyang, Shaanxi 712046, China
| | - Bin Liu
- Faculty of Engineering, Shenzhen MSU-BIT University, Shenzhen, Guangdong 518172, China
- School of Computer Science and Technology, Beijing Institute of Technology, Beijing, 100081, China
| |
|
14
|
Sheng N, Xie X, Wang Y, Huang L, Zhang S, Gao L, Wang H. A Survey of Deep Learning for Detecting miRNA-Disease Associations: Databases, Computational Methods, Challenges, and Future Directions. IEEE/ACM TRANSACTIONS ON COMPUTATIONAL BIOLOGY AND BIOINFORMATICS 2024; 21:328-347. [PMID: 38194377 DOI: 10.1109/tcbb.2024.3351752] [Indexed: 01/11/2024]
Abstract
MicroRNAs (miRNAs) are an important class of non-coding RNAs that play an essential role in the occurrence and development of various diseases. Identifying the potential miRNA-disease associations (MDAs) can be beneficial in understanding disease pathogenesis. Traditional laboratory experiments are expensive and time-consuming. Computational models have enabled systematic large-scale prediction of potential MDAs, greatly improving the research efficiency. With recent advances in deep learning, it has become an attractive and powerful technique for uncovering novel MDAs. Consequently, numerous MDA prediction methods based on deep learning have emerged. In this review, we first summarize publicly available databases related to miRNAs and diseases for MDA prediction. Next, we outline commonly used miRNA and disease similarity calculation and integration methods. Then, we comprehensively review the 48 existing deep learning-based MDA computation methods, categorizing them into classical deep learning and graph neural network-based techniques. Subsequently, we investigate the evaluation methods and metrics that are frequently used to assess MDA prediction performance. Finally, we discuss the performance trends of different computational methods, point out some problems in current research, and propose 9 potential future research directions. Data resources and recent advances in MDA prediction methods are summarized in the GitHub repository https://github.com/sheng-n/DL-miRNA-disease-association-methods.
|
15
|
Krochtová K, Janovec L, Bogárová V, Halečková A, Kožurková M. Interaction of 3,9-disubstituted acridine with single stranded poly(rA), double stranded poly(rAU) and triple stranded poly(rUAU): molecular docking - A spectroscopic tandem study. Chem Biol Interact 2024; 394:110965. [PMID: 38552767 DOI: 10.1016/j.cbi.2024.110965] [Received: 01/26/2024] [Revised: 03/12/2024] [Accepted: 03/16/2024] [Indexed: 04/10/2024]
Abstract
RNA plays an important role in many biological processes which are crucial for cell survival, and it has been suggested that it may be possible to inhibit individual processes involved in many diseases by targeting specific sequences of RNA. The aim of this work is to determine the affinity of the novel 3,9-disubstituted acridine derivative 1 for three different RNA molecules, namely single stranded poly(rA), double stranded homopolymer poly(rAU) and triple stranded poly(rUAU). The results of the absorption titration assays show that the binding constant of the novel derivative to the RNA molecules was in the range of 1.7-6.2 × 10⁴ mol dm⁻³. The fluorescence and circular dichroism titration assays revealed considerable changes. The most significant results in terms of interpreting the nature of the interactions were the melting temperatures of the RNA samples in complexes with derivative 1. In the case of poly(rA), denaturation resulted in a self-structure formation; increased stabilization was observed for poly(rAU), while the melting points of the ligand-poly(rUAU) complex showed significant destabilization as a result of the interaction. The principles of molecular mechanics were applied to propose the non-bonded interactions within the binding complex, with pentariboadenylic acid and the acridine ligand as the study model. Initial molecular docking provided the input structure for advanced simulation techniques. Molecular dynamics simulation and cluster analysis revealed π-π stacking and hydrogen bond formation as the main forces that can stabilize the binding complex. Subsequent MM-GBSA calculations showed that a negative binding enthalpy accompanied complex formation and proposed the most preferred conformation of the interaction complex.
Affiliation(s)
- Kristína Krochtová
- Department of Biochemistry, Institute of Chemistry, Faculty of Science, Pavol Jozef Šafárik University in Košice, Šrobárova 2, 041 54, Košice, Slovak Republic
| | - Ladislav Janovec
- Department of Organic Chemistry, Institute of Chemistry, Faculty of Science, Pavol Jozef Šafárik University in Košice, Šrobárova 2, 041 54, Košice, Slovak Republic
| | - Viktória Bogárová
- Department of Biochemistry, Institute of Chemistry, Faculty of Science, Pavol Jozef Šafárik University in Košice, Šrobárova 2, 041 54, Košice, Slovak Republic
| | - Annamária Halečková
- Department of Organic Chemistry, Institute of Chemistry, Faculty of Science, Pavol Jozef Šafárik University in Košice, Šrobárova 2, 041 54, Košice, Slovak Republic
| | - Mária Kožurková
- Department of Biochemistry, Institute of Chemistry, Faculty of Science, Pavol Jozef Šafárik University in Košice, Šrobárova 2, 041 54, Košice, Slovak Republic.
| |
|
16
|
Morishita EC, Nakamura S. Recent applications of artificial intelligence in RNA-targeted small molecule drug discovery. Expert Opin Drug Discov 2024; 19:415-431. [PMID: 38321848 DOI: 10.1080/17460441.2024.2313455] [Received: 10/31/2023] [Accepted: 01/30/2024] [Indexed: 02/08/2024]
Abstract
INTRODUCTION Targeting RNAs with small molecules offers an alternative to conventional protein-targeted drug discovery and can potentially address unmet and emerging medical needs. The recent rise of interest in the strategy has already resulted in large amounts of data on disease-associated RNAs, as well as on small molecules that bind to such RNAs. Artificial intelligence (AI) approaches, including machine learning and deep learning, present an opportunity to speed up the discovery of RNA-targeted small molecules by improving decision-making efficiency and quality. AREAS COVERED The topics described in this review include the recent applications of AI in the identification of RNA targets, RNA structure determination, screening of chemical compound libraries, and hit-to-lead optimization. The impact and limitations of the recent AI applications are discussed, along with an outlook on the possible applications of next-generation AI tools for the discovery of novel RNA-targeted small molecule drugs. EXPERT OPINION Key areas for improvement include developing AI tools for understanding RNA dynamics and RNA-small molecule interactions. High-quality and comprehensive data still need to be generated, especially on the biological activity of small molecules that target RNAs.
|
17
|
Liu Y, Zhang R, Dong X, Yang H, Li J, Cao H, Tian J, Zhang Y. DAE-CFR: detecting microRNA-disease associations using deep autoencoder and combined feature representation. BMC Bioinformatics 2024; 25:139. [PMID: 38553698 PMCID: PMC10981315 DOI: 10.1186/s12859-024-05757-y] [Received: 01/09/2024] [Accepted: 03/20/2024] [Indexed: 04/01/2024] Open
Abstract
BACKGROUND MicroRNA (miRNA) has been shown to play a key role in the occurrence and progression of diseases, making uncovering miRNA-disease associations vital for disease prevention and therapy. However, traditional laboratory methods for detecting these associations are slow, strenuous, expensive, and uncertain. Although numerous advanced algorithms have emerged, it is still a challenge to develop more effective methods to explore underlying miRNA-disease associations. RESULTS In this study, we designed a novel approach based on a deep autoencoder and combined feature representation (DAE-CFR) to predict possible miRNA-disease associations. We began by creating integrated similarity matrices of miRNAs and diseases, performing a logistic function transformation, balancing positive and negative samples with k-means clustering, and constructing training samples. Then, a deep autoencoder was used to extract low-dimensional features from two kinds of feature representations for miRNAs and diseases, namely, original association information-based and similarity information-based. Next, we combined the resulting features for each miRNA-disease pair and used a logistic regression (LR) classifier to infer all unknown miRNA-disease interactions. Under five- and tenfold cross-validation (CV) frameworks, DAE-CFR not only outperformed six popular algorithms and nine classifiers, but also demonstrated superior performance on an additional dataset. Furthermore, case studies on three diseases (myocardial infarction, hypertension and stroke) confirmed the validity of DAE-CFR in practice. CONCLUSIONS DAE-CFR achieved outstanding performance in predicting miRNA-disease associations and can provide evidence to inform biological experiments and clinical therapy.
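The "logistic function transformation" of similarity values mentioned above can be sketched as follows. The abstract does not give the functional form or its parameters; the form L(x) = 1 / (1 + exp(c·x + d)) with c = -15 and d = ln(9999) is a convention seen in related similarity-based prediction work and is purely an assumption here.

```python
import math

# Assumed parameters (not stated in the abstract): with d = ln(9999),
# a similarity of 0 maps to about 0.0001.
C = -15.0
D = math.log(9999.0)


def logistic_transform(similarity, c=C, d=D):
    """Map a similarity in [0, 1] to (0, 1), pushing values towards 0 or 1."""
    return 1.0 / (1.0 + math.exp(c * similarity + d))


def transform_matrix(matrix):
    """Apply the transformation element-wise to a list-of-lists matrix."""
    return [[logistic_transform(value) for value in row] for row in matrix]


# Toy similarity matrix; the values are made up.
toy = [
    [1.0, 0.7, 0.2],
    [0.7, 1.0, 0.5],
    [0.2, 0.5, 1.0],
]

for row in transform_matrix(toy):
    print([round(value, 4) for value in row])
```

With these assumed parameters, the transformation suppresses low similarities sharply and saturates high ones, which is the usual motivation for applying it before building training features.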
Affiliation(s)
- Yanling Liu
- Department of Health Statistics, School of Public Health, Shanxi Medical University, Taiyuan, China
- Department of Mathematics, Changzhi Medical College, Changzhi, China
| | - Ruiyan Zhang
- Department of Health Statistics, School of Public Health, Shanxi Medical University, Taiyuan, China
| | - Xiaojing Dong
- Department of Health Statistics, School of Public Health, Shanxi Medical University, Taiyuan, China
| | - Hong Yang
- Department of Health Statistics, School of Public Health, Shanxi Medical University, Taiyuan, China
| | - Jing Li
- Department of Health Statistics, School of Public Health, Shanxi Medical University, Taiyuan, China
| | - Hongyan Cao
- Department of Health Statistics, School of Public Health, Shanxi Medical University, Taiyuan, China
| | - Jing Tian
- Department of Cardiology, First Hospital of Shanxi Medical University, Taiyuan, China.
| | - Yanbo Zhang
- Department of Health Statistics, School of Public Health, Shanxi Medical University, Taiyuan, China.
- Shanxi Provincial Key Laboratory of Major Diseases Risk Assessment, Taiyuan, China.
- School of Health and Service Management, Shanxi University of Chinese Medicine, Jinzhong, China.
| |
|
18
|
Chen Q, Zhang L, Liu Y, Qin Z, Zhao T. PUTransGCN: identification of piRNA-disease associations based on attention encoding graph convolutional network and positive unlabelled learning. Brief Bioinform 2024; 25:bbae144. [PMID: 38581419 PMCID: PMC10998538 DOI: 10.1093/bib/bbae144] [Received: 01/25/2024] [Revised: 02/25/2024] [Accepted: 03/15/2024] [Indexed: 04/08/2024] Open
Abstract
Piwi-interacting RNAs (piRNAs) play a crucial role in various biological processes and are implicated in disease. Consequently, there is a growing demand for computational tools to predict piRNA-disease interactions. Although computational methods have been proposed for detecting piRNA-disease associations, imbalanced and sparse datasets make it challenging to capture the complex relationships between piRNAs and diseases. To address this, we have developed a novel computational architecture, PUTransGCN, which uses heterogeneous graph convolutional networks to uncover potential piRNA-disease associations. Additionally, an attention mechanism was used to automatically adjust the weight parameters for aggregating heterogeneous node features. To tackle the imbalanced dataset problem, a combined positive unlabelled learning (PUL) method comprising PU bagging, two-step and spy techniques was applied to select reliable negative associations. The features of piRNAs and diseases were derived from three distinct biological sources: piRNA sequences, semantic terms related to diseases and the existing network of piRNA-disease associations. In 5-fold cross-validation, PUTransGCN achieved AUCs of 0.93 and 0.95 on two datasets, respectively, outperforming six other state-of-the-art models. We compared three different PUL methods, and the results of the ablation experiment indicate that the combined PUL method yields the best results. PUTransGCN could serve as a valuable piRNA-disease prediction tool for upcoming studies in the biomedical field. The code for PUTransGCN is available at https://github.com/chenqiuhao/PUTransGCN.
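Of the three positive unlabelled learning components named above, the spy technique is the easiest to illustrate in isolation. In its generic form, a few known positives ("spies") are hidden among the unlabelled pairs, a classifier is trained to separate positives from unlabelled, and unlabelled pairs that score below nearly all spies are kept as reliable negatives. The sketch below covers only that final thresholding step; the scores, pair names and the 5% tolerance are hypothetical, and this is not the PUTransGCN implementation.

```python
def spy_threshold(spy_scores, spy_loss_percent=5):
    """Return the score below which at most spy_loss_percent of spies fall."""
    ordered = sorted(spy_scores)
    cut = len(ordered) * spy_loss_percent // 100  # integer index, no rounding issues
    return ordered[cut]


def reliable_negatives(unlabelled_scores, threshold):
    """Unlabelled items scoring strictly below the spy threshold."""
    return [item for item, score in unlabelled_scores.items() if score < threshold]


# Hypothetical classifier scores for 20 spies (known positives hidden in the
# unlabelled set); one spy was scored unusually low.
spy_scores = [0.30] + [0.55 + 0.02 * i for i in range(19)]

# Hypothetical scores for unlabelled piRNA-disease pairs.
unlabelled_scores = {
    "pair_a": 0.10,
    "pair_b": 0.54,
    "pair_c": 0.55,
    "pair_d": 0.90,
}

threshold = spy_threshold(spy_scores)
negatives = reliable_negatives(unlabelled_scores, threshold)
print(threshold, negatives)
```

Tolerating a small share of low-scoring spies keeps a single noisy positive from dragging the threshold down and shrinking the pool of reliable negatives.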
Affiliation(s)
- Qiuhao Chen
- Institute of Bioinformatics, Harbin Institute of Technology, 150000, Harbin, Heilongjiang, China
| | - Liyuan Zhang
- School of Computer Science and Technology, Harbin Institute of Technology, 150000, Harbin, Heilongjiang, China
| | - Yaojia Liu
- School of Computer Science and Technology, Harbin Institute of Technology, 150000, Harbin, Heilongjiang, China
| | - Zhonghao Qin
- State Key Laboratory of Robotics and System, Harbin Institute of Technology, 150000, Harbin, Heilongjiang, China
| | - Tianyi Zhao
- School of Computer Science and Technology, Harbin Institute of Technology, 150000, Harbin, Heilongjiang, China
| |
|
19
|
Hu X, Zhang P, Liu D, Zhang J, Zhang Y, Dong Y, Fan Y, Deng L. IGCNSDA: unraveling disease-associated snoRNAs with an interpretable graph convolutional network. Brief Bioinform 2024; 25:bbae179. [PMID: 38647155 PMCID: PMC11033953 DOI: 10.1093/bib/bbae179] [Received: 11/01/2023] [Revised: 12/15/2023] [Accepted: 03/27/2024] [Indexed: 04/25/2024] Open
Abstract
Accurately delineating the connection between small nucleolar RNA (snoRNA) and disease is crucial for advancing disease detection and treatment. While traditional biological experimental methods are effective, they are labor-intensive, costly and lack scalability. With the ongoing progress in computer technology, an increasing number of deep learning techniques are being employed to predict snoRNA-disease associations. Nevertheless, the majority of these methods are black-box models, lacking interpretability and the capability to elucidate the snoRNA-disease association mechanism. In this study, we introduce IGCNSDA, an innovative and interpretable graph convolutional network (GCN) approach tailored for the efficient inference of snoRNA-disease associations. IGCNSDA leverages the GCN framework to extract node feature representations of snoRNAs and diseases from the bipartite snoRNA-disease graph. SnoRNAs with high similarity are more likely to be linked to analogous diseases, and vice versa. To facilitate this process, we introduce a subgraph generation algorithm that effectively groups similar snoRNAs and their associated diseases into cohesive subgraphs. Subsequently, we aggregate information from neighboring nodes within these subgraphs, iteratively updating the embeddings of snoRNAs and diseases. The experimental results demonstrate that IGCNSDA outperforms the most recent, highly relevant methods. Additionally, our interpretability analysis provides compelling evidence that IGCNSDA adeptly captures the underlying similarity between snoRNAs and diseases, thus affording researchers enhanced insights into the snoRNA-disease association mechanism. Furthermore, we present illustrative case studies that demonstrate the utility of IGCNSDA as a valuable tool for efficiently predicting potential snoRNA-disease associations. The dataset and source code for IGCNSDA are openly accessible at: https://github.com/altriavin/IGCNSDA.
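The neighbour aggregation step described above can be illustrated on a tiny bipartite graph. The sketch below uses plain mean aggregation over neighbours, which is an assumption made for simplicity; the abstract does not specify the normalisation, the subgraph construction or any learned weights, and all node names and embeddings here are made up.

```python
def aggregate(embeddings, neighbours):
    """One propagation step: replace each node's embedding by the mean of its
    neighbours' embeddings. Nodes without neighbours keep their embedding."""
    updated = {}
    for node, vector in embeddings.items():
        linked = neighbours.get(node, [])
        if not linked:
            updated[node] = list(vector)
            continue
        updated[node] = [
            sum(embeddings[other][k] for other in linked) / len(linked)
            for k in range(len(vector))
        ]
    return updated


# Toy bipartite graph: snoRNA s1 is linked to diseases d1 and d2, s2 to d2.
neighbours = {
    "s1": ["d1", "d2"],
    "s2": ["d2"],
    "d1": ["s1"],
    "d2": ["s1", "s2"],
}

# Made-up two-dimensional starting embeddings.
embeddings = {
    "s1": [1.0, 0.0],
    "s2": [0.0, 1.0],
    "d1": [1.0, 1.0],
    "d2": [3.0, 1.0],
}

step1 = aggregate(embeddings, neighbours)
print(step1)
```

Calling `aggregate` repeatedly on its own output mimics the iterative updating mentioned in the abstract; restricting `neighbours` to nodes of one subgraph would mimic aggregation within a subgraph.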
Affiliation(s)
- Xiaowen Hu
- School of Computer Science and Engineering, Central South University, 410075, Changsha, China
| | - Pan Zhang
- Hunan Provincial Key Laboratory of Clinical Epidemiology, Xiangya School of Public Health, Central South University, 410078, Changsha, China
| | - Dayun Liu
- School of Computer Science and Engineering, Central South University, 410075, Changsha, China
| | - Jiaxuan Zhang
- Department of Electrical and Computer Engineering, University of California, San Diego, 92093, CA, United States
| | - Yuanpeng Zhang
- School of Software, Xinjiang University, 830046, Urumqi, China
| | - Yihan Dong
- School of Computer Science and Engineering, Central South University, 410075, Changsha, China
| | - Yanhao Fan
- School of Computer Science and Engineering, Central South University, 410075, Changsha, China
| | - Lei Deng
- School of Computer Science and Engineering, Central South University, 410075, Changsha, China
| |
|
20
|
Giulietti M, Piva F, Cecati M, Maggio S, Guescini M, Saladino T, Scortichini L, Crocetti S, Caramanti M, Battelli N, Romagnoli E. Effects of Eribulin on the RNA Content of Extracellular Vesicles Released by Metastatic Breast Cancer Cells. Cells 2024; 13:479. [PMID: 38534323 DOI: 10.3390/cells13060479] [Received: 09/22/2023] [Revised: 02/23/2024] [Accepted: 03/04/2024] [Indexed: 03/28/2024] Open
Abstract
Extracellular vesicles (EVs) are small lipid particles secreted by almost all human cells into the extracellular space. They perform the essential function of cell-to-cell communication, and their role in promoting breast cancer progression has been well demonstrated. It is known that EVs released by triple-negative and highly aggressive MDA-MB-231 breast cancer cells treated with paclitaxel, a microtubule-targeting agent (MTA), promoted chemoresistance in EV-recipient cells. Here, we studied the RNA content of EVs produced by the same MDA-MB-231 breast cancer cells treated with another MTA, eribulin mesylate. In particular, we analyzed the expression of different RNA species, including mRNAs, lncRNAs, miRNAs, snoRNAs, piRNAs and tRNA fragments by RNA-seq. Then, we performed differential expression analysis, weighted gene co-expression network analysis (WGCNA), functional enrichment analysis, and miRNA-target identification. Our findings demonstrate the possible involvement of EVs from eribulin-treated cells in the spread of chemoresistance, prompting the design of strategies that selectively target tumor EVs.
Affiliation(s)
- Matteo Giulietti
- Department of Specialistic Clinical and Odontostomatological Sciences, Polytechnic University of Marche, 60131 Ancona, Italy
| | - Francesco Piva
- Department of Specialistic Clinical and Odontostomatological Sciences, Polytechnic University of Marche, 60131 Ancona, Italy
| | - Monia Cecati
- Department of Specialistic Clinical and Odontostomatological Sciences, Polytechnic University of Marche, 60131 Ancona, Italy
| | - Serena Maggio
- Department of Biomolecular Sciences, University of Urbino Carlo Bo, 61029 Urbino, Italy
| | - Michele Guescini
- Department of Biomolecular Sciences, University of Urbino Carlo Bo, 61029 Urbino, Italy
| | - Tiziana Saladino
- Oncology Unit AST3, Macerata Hospital, Via Santa Lucia 2, 62100 Macerata, Italy
| | - Laura Scortichini
- Oncology Unit AST3, Macerata Hospital, Via Santa Lucia 2, 62100 Macerata, Italy
| | - Sonia Crocetti
- Oncology Unit AST3, Macerata Hospital, Via Santa Lucia 2, 62100 Macerata, Italy
| | - Miriam Caramanti
- Oncology Unit AST3, Macerata Hospital, Via Santa Lucia 2, 62100 Macerata, Italy
| | - Nicola Battelli
- Oncology Unit AST3, Macerata Hospital, Via Santa Lucia 2, 62100 Macerata, Italy
| | - Emanuela Romagnoli
- Oncology Unit AST3, Macerata Hospital, Via Santa Lucia 2, 62100 Macerata, Italy
| |
|
21
|
Zhou L, Peng X, Zeng L, Peng L. Finding potential lncRNA-disease associations using a boosting-based ensemble learning model. Front Genet 2024; 15:1356205. [PMID: 38495672 PMCID: PMC10940470 DOI: 10.3389/fgene.2024.1356205] [Received: 12/18/2023] [Accepted: 02/01/2024] [Indexed: 03/19/2024] Open
Abstract
Introduction: Long non-coding RNAs (lncRNAs) have been used clinically as potential prognostic biomarkers for various types of cancer. Identifying associations between lncRNAs and diseases helps capture potential biomarkers and design efficient therapeutic options for diseases. Wet experiments for identifying these associations are costly and laborious. Methods: We developed LDA-SABC, a novel boosting-based framework for lncRNA-disease association (LDA) prediction. LDA-SABC extracts LDA features based on singular value decomposition (SVD) and classifies lncRNA-disease pairs (LDPs) by incorporating LightGBM and AdaBoost into a convolutional neural network. Results: The performance of LDA-SABC was evaluated under five-fold cross validations (CVs) on lncRNAs, diseases, and LDPs. It clearly outperformed four other classical LDA inference methods (SDLDA, LDNFSGB, LDASR, and IPCARF) in terms of precision, recall, accuracy, F1 score, AUC, and AUPR. Based on the accurate LDA prediction performance of LDA-SABC, we used it to find potential lncRNA biomarkers for lung cancer. The results suggested that 7SK and HULC could have a relationship with non-small-cell lung cancer (NSCLC) and lung adenocarcinoma (LUAD), respectively. Conclusion: We hope that our proposed LDA-SABC method can help improve LDA identification.
Affiliation(s)
- Liqian Zhou
- School of Computer Science, Hunan University of Technology, Zhuzhou, Hunan, China
| | - Xinhuai Peng
- School of Computer Science, Hunan University of Technology, Zhuzhou, Hunan, China
| | - Lijun Zeng
- School of Computer Science, Hunan Institute of Technology, Hengyang, China
| | - Lihong Peng
- School of Computer Science, Hunan University of Technology, Zhuzhou, Hunan, China
| |
|
22
|
Fu Y, Zhang YL, Liu RQ, Xu MM, Xie JL, Zhang XL, Xie GM, Han YT, Zhang XM, Zhang WT, Zhang J, Zhang J. Exosome lncRNA IFNG-AS1 derived from mesenchymal stem cells of human adipose ameliorates neurogenesis and ASD-like behavior in BTBR mice. J Nanobiotechnology 2024; 22:66. [PMID: 38368393 PMCID: PMC10874555 DOI: 10.1186/s12951-024-02338-2] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 08/15/2023] [Accepted: 02/09/2024] [Indexed: 02/19/2024] Open
Abstract
BACKGROUND The transplantation of exosomes derived from human adipose-derived mesenchymal stem cells (hADSCs) has emerged as a prospective cell-free therapeutic intervention for the treatment of neurodevelopmental disorders (NDDs), as well as autism spectrum disorder (ASD). Nevertheless, the efficacy of hADSC exosome transplantation for ASD treatment remains to be verified, and the underlying mechanism of action remains unclear. RESULTS The exosomal long non-coding RNAs (lncRNAs) from hADSCs and human umbilical cord mesenchymal stem cells (hUCMSCs) were sequenced, and 13,915 and 729 lncRNAs were obtained, respectively. The lncRNAs present in hADSC-Exos encompass those found in hUCMSC-Exos and are associated with neurogenesis. The biodistribution of hADSC-Exos in mouse brain ventricles and organoids was tracked, and the cellular uptake of hADSC-Exos was evaluated both in vivo and in vitro. hADSC-Exos promote neurogenesis in brain organoids and ameliorate social deficits in the ASD mouse model BTBR T+ tf/J (BTBR). Fluorescence in situ hybridization (FISH) confirmed that lncRNA Ifngas1 was significantly increased in the prefrontal cortex (PFC) of adult mice after intraventricular injection of hADSC-Exos. The lncRNA Ifngas1 can act as a molecular sponge for miR-21a-3p to play a regulatory role and promote neurogenesis through the miR-21a-3p/PI3K/AKT axis. CONCLUSION We demonstrated that hADSC-Exos can confer neuroprotection through functional restoration, attenuation of neuroinflammation, inhibition of neuronal apoptosis, and promotion of neurogenesis both in vitro and in vivo. The hADSC-Exos-derived lncRNA IFNG-AS1 acts as a molecular sponge and facilitates neurogenesis via the miR-21a-3p/PI3K/AKT signaling pathway, thereby exerting a regulatory effect. Our findings suggest a potential therapeutic avenue for individuals with ASD.
Affiliation(s)
- Yu Fu
- Research Center for Translational Medicine at East Hospital, School of Medicine, Tongji University, Shanghai, 200010, China
| | - Yuan-Lin Zhang
- Research Center for Translational Medicine at East Hospital, School of Medicine, Tongji University, Shanghai, 200010, China
- Department of Pathology, Air Force Medical Center, Beijing, 100142, China
| | - Rong-Qi Liu
- Research Center for Translational Medicine at East Hospital, School of Life Science and Technology, Tongji University, Shanghai, 200010, China
| | - Meng-Meng Xu
- Research Center for Translational Medicine at East Hospital, School of Medicine, Tongji University, Shanghai, 200010, China
| | - Jun-Ling Xie
- Research Center for Translational Medicine at East Hospital, School of Medicine, Tongji University, Shanghai, 200010, China
| | - Xing-Liao Zhang
- Research Center for Translational Medicine at East Hospital, School of Medicine, Tongji University, Shanghai, 200010, China
| | - Guang-Ming Xie
- Research Center for Translational Medicine at East Hospital, School of Medicine, Tongji University, Shanghai, 200010, China
| | - Yao-Ting Han
- Research Center for Translational Medicine at East Hospital, School of Life Science and Technology, Tongji University, Shanghai, 200010, China
| | - Xin-Min Zhang
- Research Center for Translational Medicine at East Hospital, School of Life Science and Technology, Tongji University, Shanghai, 200010, China
| | - Wan-Ting Zhang
- Research Center for Translational Medicine at East Hospital, School of Life Science and Technology, Tongji University, Shanghai, 200010, China
| | - Jing Zhang
- Research Center for Translational Medicine at East Hospital, School of Life Science and Technology, Tongji University, Shanghai, 200010, China.
- Shanghai Institute of Stem Cell Research and Clinical Translation, Shanghai, 200092, China.
| | - Jun Zhang
- Research Center for Translational Medicine at East Hospital, School of Medicine, Tongji University, Shanghai, 200010, China.
- Key Laboratory of Spine and Spinal Cord Injury Repair and Regeneration of Ministry of Education, Orthopaedic Department of Tongji Hospital, School of Medicine, Tongji University, Shanghai, 200065, China.
- Shanghai Institute of Stem Cell Research and Clinical Translation, Shanghai, 200092, China.
| |
|
23
|
Armenta-Castro A, Núñez-Soto MT, Rodriguez-Aguillón KO, Aguayo-Acosta A, Oyervides-Muñoz MA, Snyder SA, Barceló D, Saththasivam J, Lawler J, Sosa-Hernández JE, Parra-Saldívar R. Urine biomarkers for Alzheimer's disease: A new opportunity for wastewater-based epidemiology? ENVIRONMENT INTERNATIONAL 2024; 184:108462. [PMID: 38335627 DOI: 10.1016/j.envint.2024.108462] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 10/08/2023] [Revised: 01/16/2024] [Accepted: 01/26/2024] [Indexed: 02/12/2024]
Abstract
While Alzheimer's disease (AD) diagnosis, management, and care have become priorities for healthcare providers and researchers worldwide due to rapid population aging, epidemiologic surveillance efforts are currently limited by costly, invasive diagnostic procedures, particularly in low- and middle-income countries (LMICs). In recent years, wastewater-based epidemiology (WBE) has emerged as a promising tool for public health assessment through detection and quantification of specific biomarkers in wastewater, but applications for non-infectious diseases such as AD remain limited. This early review seeks to summarize AD-related biomarkers in urine and other peripheral biofluids and discuss their potential integration into WBE platforms to guide the first prospective efforts in the field. Promising results have been reported in clinical settings, indicating the potential of amyloid β, tau, neural thread protein, long non-coding RNAs, oxidative stress markers and other dysregulated metabolites for AD diagnosis, but questions regarding their concentration and stability in wastewater and the correlation between clinical levels and sewage circulation must be addressed in future studies before comprehensive WBE systems can be developed.
Affiliation(s)
| | - Mónica T Núñez-Soto
- Tecnologico de Monterrey, School of Engineering and Sciences, Monterrey 64849, Mexico
| | - Kassandra O Rodriguez-Aguillón
- Tecnologico de Monterrey, School of Engineering and Sciences, Monterrey 64849, Mexico; Tecnologico de Monterrey, Institute of Advanced Materials for Sustainable Manufacturing, Monterrey 64849, Mexico
| | - Alberto Aguayo-Acosta
- Tecnologico de Monterrey, School of Engineering and Sciences, Monterrey 64849, Mexico; Tecnologico de Monterrey, Institute of Advanced Materials for Sustainable Manufacturing, Monterrey 64849, Mexico
| | - Mariel Araceli Oyervides-Muñoz
- Tecnologico de Monterrey, School of Engineering and Sciences, Monterrey 64849, Mexico; Tecnologico de Monterrey, Institute of Advanced Materials for Sustainable Manufacturing, Monterrey 64849, Mexico
| | - Shane A Snyder
- Nanyang Environment & Water Research Institute (NEWRI), Nanyang Technological University, Singapore
| | - Damià Barceló
- Department of Environmental Chemistry, Institute of Environmental Assessment and Water Research, IDAEA-CSIC, Jordi Girona, 18-26, 08034 Barcelona, Spain; Sustainability Cluster, School of Engineering at the UPES, Dehradun, Uttarakhand, India
| | - Jayaprakash Saththasivam
- Water Center, Qatar Environment & Energy Research Institute, Hamad Bin Khalifa University, Qatar Foundation, Qatar
| | - Jenny Lawler
- Water Center, Qatar Environment & Energy Research Institute, Hamad Bin Khalifa University, Qatar Foundation, Qatar
| | - Juan Eduardo Sosa-Hernández
- Tecnologico de Monterrey, School of Engineering and Sciences, Monterrey 64849, Mexico; Tecnologico de Monterrey, Institute of Advanced Materials for Sustainable Manufacturing, Monterrey 64849, Mexico.
| | - Roberto Parra-Saldívar
- Tecnologico de Monterrey, School of Engineering and Sciences, Monterrey 64849, Mexico; Tecnologico de Monterrey, Institute of Advanced Materials for Sustainable Manufacturing, Monterrey 64849, Mexico
| |
|
24
|
Feng X, Liu S, Li K, Bu F, Yuan H. NCAD v1.0: a database for non-coding variant annotation and interpretation. J Genet Genomics 2024; 51:230-242. [PMID: 38142743 DOI: 10.1016/j.jgg.2023.12.005] [Citation(s) in RCA: 1] [Impact Index Per Article: 1.0] [Received: 08/30/2023] [Revised: 12/15/2023] [Accepted: 12/18/2023] [Indexed: 12/26/2023]
Abstract
The application of whole genome sequencing is expanding in clinical diagnostics across various genetic disorders, and the significance of non-coding variants in penetrant diseases is increasingly being demonstrated. Therefore, it is urgent to improve the diagnostic yield by exploring the pathogenic mechanisms of variants in non-coding regions. However, the interpretation of non-coding variants remains a significant challenge, due to the complex functional regulatory mechanisms of non-coding regions and the current limitations of available databases and tools. Hence, we develop the non-coding variant annotation database (NCAD, http://www.ncawdb.net/), encompassing comprehensive insights into 665,679,194 variants, regulatory elements, and element interaction details. Integrating data from 96 sources, spanning both GRCh37 and GRCh38 versions, NCAD v1.0 provides vital information to support the genetic diagnosis of non-coding variants, including allele frequencies of 12 diverse populations, with a particular focus on the population frequency information for 230,235,698 variants in 20,964 Chinese individuals. Moreover, it offers prediction scores for variant functionality, five categories of regulatory elements, and four types of non-coding RNAs. With its rich data and comprehensive coverage, NCAD serves as a valuable platform, empowering researchers and clinicians with profound insights into non-coding regulatory mechanisms while facilitating the interpretation of non-coding variants.
Affiliation(s)
- Xiaoshu Feng
- Institute of Rare Diseases, West China Hospital, Sichuan University, Chengdu, Sichuan 610044, China
| | - Sihan Liu
- Institute of Rare Diseases, West China Hospital, Sichuan University, Chengdu, Sichuan 610044, China
| | - Ke Li
- Institute of Rare Diseases, West China Hospital, Sichuan University, Chengdu, Sichuan 610044, China
| | - Fengxiao Bu
- Institute of Rare Diseases, West China Hospital, Sichuan University, Chengdu, Sichuan 610044, China.
| | - Huijun Yuan
- Institute of Rare Diseases, West China Hospital, Sichuan University, Chengdu, Sichuan 610044, China.
| |
|
25
|
Li G, Bai P, Liang C, Luo J. Node-adaptive graph Transformer with structural encoding for accurate and robust lncRNA-disease association prediction. BMC Genomics 2024; 25:73. [PMID: 38233788 PMCID: PMC10795365 DOI: 10.1186/s12864-024-09998-2] [Citation(s) in RCA: 3] [Impact Index Per Article: 3.0] [Received: 07/31/2023] [Accepted: 01/09/2024] [Indexed: 01/19/2024] Open
Abstract
BACKGROUND Long noncoding RNAs (lncRNAs) are integral to a plethora of critical cellular biological processes, including the regulation of gene expression, cell differentiation, and the development of tumors and cancers. Predicting the relationships between lncRNAs and diseases can contribute to a better understanding of the pathogenic mechanisms of disease and provide strong support for the development of advanced treatment methods. RESULTS We present an innovative Node-Adaptive Graph Transformer model for predicting unknown LncRNA-Disease Associations, named NAGTLDA. First, we utilize the node-adaptive feature smoothing (NAFS) method to learn the local feature information of nodes and encode the structural information of the fusion similarity network of diseases and lncRNAs using Structural Deep Network Embedding (SDNE). Next, the Transformer module is used to capture potential association information between the network nodes. Finally, we employ a Transformer module with two multi-headed attention layers for learning global-level embedding fusion. Network structure coding is added as the structural inductive bias of the network to compensate for the missing message-passing mechanism in Transformer. NAGTLDA achieved an average AUC of 0.9531 and an AUPR of 0.9537, significantly higher than state-of-the-art methods in 5-fold cross-validation. We performed case studies on 4 diseases; 55 out of 60 associations between lncRNAs and diseases have been validated in the literature. The results demonstrate the enormous potential of the graph Transformer structure to incorporate graph structural information for uncovering unknown lncRNA-disease correlations. CONCLUSIONS Our proposed NAGTLDA model can serve as a highly efficient computational method for predicting biological information associations.
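AUC and AUPR are the headline metrics in this and several neighbouring abstracts. The following Python sketch shows one common way to compute them from labels and prediction scores. It is not taken from any of the cited papers: `auc` uses the pairwise (Mann-Whitney) formulation, and `average_precision` is one widely used estimator of the area under the precision-recall curve; both function names are hypothetical, and the sketch assumes both classes are present and that scores are distinct for the precision calculation.

```python
def auc(labels, scores):
    """Area under the ROC curve as the fraction of (positive, negative)
    pairs ranked correctly; tied scores count as half a win."""
    pos = [s for y, s in zip(labels, scores) if y == 1]
    neg = [s for y, s in zip(labels, scores) if y == 0]
    wins = sum((p > n) + 0.5 * (p == n) for p in pos for n in neg)
    return wins / (len(pos) * len(neg))


def average_precision(labels, scores):
    """Average precision: mean of the precision values measured at the
    rank of each positive example, with examples sorted by descending score."""
    ranked = sorted(zip(scores, labels), key=lambda t: -t[0])
    hits, total = 0, 0.0
    for rank, (_, y) in enumerate(ranked, start=1):
        if y == 1:
            hits += 1
            total += hits / rank
    return total / hits


# Toy example: two known associations (label 1) and two non-associations.
labels = [1, 0, 1, 0]
scores = [0.9, 0.8, 0.7, 0.1]
print(auc(labels, scores), average_precision(labels, scores))
```

In a 5-fold setting, these functions would be applied to each held-out fold and the results averaged. Papers may compute AUPR by other interpolation schemes, so values from this sketch are not directly comparable with the reported figures.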
Affiliation(s)
- Guanghui Li
- School of Information Engineering, East China Jiaotong University, Nanchang, China.
| | - Peihao Bai
- School of Information Engineering, East China Jiaotong University, Nanchang, China
| | - Cheng Liang
- School of Information Science and Engineering, Shandong Normal University, Jinan, China
| | - Jiawei Luo
- College of Computer Science and Electronic Engineering, Hunan University, Changsha, China.
| |
|
26
|
Yao D, Zhang B, Li X, Zhan X, Zhan X, Zhang B. Applying negative sample denoising and multi-view feature for lncRNA-disease association prediction. Front Genet 2024; 14:1332273. [PMID: 38264213 PMCID: PMC10803626 DOI: 10.3389/fgene.2023.1332273] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 11/02/2023] [Accepted: 12/22/2023] [Indexed: 01/25/2024] Open
Abstract
Increasing evidence indicates that mutations and dysregulation of long non-coding RNA (lncRNA) play a crucial role in the pathogenesis and prognosis of complex human diseases. Computational methods for predicting the association between lncRNAs and diseases have gained increasing attention. However, these methods face two key challenges: obtaining reliable negative samples and incorporating lncRNA-disease association (LDA) information from multiple perspectives. This paper proposes a method called NDMLDA, which combines multi-view feature extraction, unsupervised negative sample denoising, and a stacking ensemble classifier. Firstly, an unsupervised method (K-means) is used to design a negative sample denoising module to alleviate the imbalance of samples and the impact of potential noise in the negative samples on model performance. Secondly, graph attention networks are employed to extract multi-view features of both lncRNAs and diseases, thereby enhancing the learning of association information between them. Finally, lncRNA-disease association prediction is implemented through a stacking ensemble classifier. Existing research datasets are integrated to evaluate performance, and 5-fold cross-validation is conducted on the integrated dataset. Experimental results demonstrate that NDMLDA achieves an AUC of 0.9907 and an AUPR of 0.9927, with a 5-fold cross-validation variance of less than 0.1%. These results outperform the baseline methods. Additionally, case studies further illustrate the model's potential in cancer diagnosis and precision medicine implementation.
Affiliation(s)
- Dengju Yao
- School of Computer Science and Technology, Harbin University of Science and Technology, Harbin, China
| | - Bo Zhang
- School of Computer Science and Technology, Harbin University of Science and Technology, Harbin, China
| | - Xiangkui Li
- School of Computer Science and Technology, Harbin University of Science and Technology, Harbin, China
| | - Xiaojuan Zhan
- College of Computer Science and Technology, Heilongjiang Institute of Technology, Harbin, China
| | - Xiaorong Zhan
- Department of Endocrinology and Metabolism, Hospital of South University of Science and Technology, Shenzhen, China
| | - Binbin Zhang
- School of Computer Science and Technology, Harbin University of Science and Technology, Harbin, China
| |
|
27
|
Zhou B, Ji B, Shen C, Zhang X, Yu X, Huang P, Yu R, Zhang H, Dou X, Chen Q, Zeng Q, Wang X, Cao Z, Hu G, Xu S, Zhao H, Yang Y, Zhou Y, Wang J. EVLncRNAs 3.0: an updated comprehensive database for manually curated functional long non-coding RNAs validated by low-throughput experiments. Nucleic Acids Res 2024; 52:D98-D106. [PMID: 37953349 PMCID: PMC10767905 DOI: 10.1093/nar/gkad1057] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 09/13/2023] [Revised: 10/23/2023] [Accepted: 11/01/2023] [Indexed: 11/14/2023] Open
Abstract
Long noncoding RNAs (lncRNAs) have emerged as crucial regulators across diverse biological processes and diseases. While high-throughput sequencing has enabled lncRNA discovery, functional characterization remains limited. The EVLncRNAs database is the first and exclusive repository for all experimentally validated functional lncRNAs from various species. After previous releases in 2018 and 2021, this update marks a major expansion through exhaustive manual curation of nearly 25 000 publications from 15 May 2020 to 15 May 2023. It incorporates substantial growth across all categories: a 154% increase in functional lncRNAs, 160% in associated diseases, 186% in lncRNA-disease associations, 235% in interactions, 138% in structures, 234% in circular RNAs, 235% in resistant lncRNAs and 4724% in exosomal lncRNAs. More importantly, it incorporates additional information, including functional classifications, detailed interaction pathways, homologous lncRNAs, lncRNA locations, and COVID-19-, phase-separation- and organoid-related lncRNAs. The web interface was substantially improved for browsing, visualization, and searching. ChatGPT was tested for information extraction and functional overview, with its limitations noted. EVLncRNAs 3.0 represents the most extensive curated resource of experimentally validated functional lncRNAs and will serve as an indispensable platform for unravelling emerging lncRNA functions. The updated database is freely available at https://www.sdklab-biophysics-dzu.net/EVLncRNAs3/.
Affiliation(s)
- Bailing Zhou
- Shandong Provincial Key Laboratory of Biophysics, Institute of Biophysics, Dezhou University, Dezhou 253023, China
| | - Baohua Ji
- Shandong Provincial Key Laboratory of Biophysics, Institute of Biophysics, Dezhou University, Dezhou 253023, China
- College of Physics and Electronic Information, Dezhou University, Dezhou 253023, China
| | - Congcong Shen
- Shandong Provincial Key Laboratory of Biophysics, Institute of Biophysics, Dezhou University, Dezhou 253023, China
| | - Xia Zhang
- Shandong Provincial Key Laboratory of Biophysics, Institute of Biophysics, Dezhou University, Dezhou 253023, China
| | - Xue Yu
- Shandong Provincial Key Laboratory of Biophysics, Institute of Biophysics, Dezhou University, Dezhou 253023, China
| | - Pingping Huang
- Shandong Provincial Key Laboratory of Biophysics, Institute of Biophysics, Dezhou University, Dezhou 253023, China
| | - Ru Yu
- Shandong Provincial Key Laboratory of Biophysics, Institute of Biophysics, Dezhou University, Dezhou 253023, China
| | - Hongmei Zhang
- Shandong Provincial Key Laboratory of Biophysics, Institute of Biophysics, Dezhou University, Dezhou 253023, China
- College of Life Science, Dezhou University, Dezhou 253023, China
| | - Xianghua Dou
- Shandong Provincial Key Laboratory of Biophysics, Institute of Biophysics, Dezhou University, Dezhou 253023, China
| | - Qingshuai Chen
- Shandong Provincial Key Laboratory of Biophysics, Institute of Biophysics, Dezhou University, Dezhou 253023, China
| | - Qiangcheng Zeng
- Shandong Provincial Key Laboratory of Biophysics, Institute of Biophysics, Dezhou University, Dezhou 253023, China
- College of Life Science, Dezhou University, Dezhou 253023, China
| | - Xiaoxin Wang
- Shandong Provincial Key Laboratory of Biophysics, Institute of Biophysics, Dezhou University, Dezhou 253023, China
- College of Physics and Electronic Information, Dezhou University, Dezhou 253023, China
| | - Zanxia Cao
- Shandong Provincial Key Laboratory of Biophysics, Institute of Biophysics, Dezhou University, Dezhou 253023, China
| | - Guodong Hu
- Shandong Provincial Key Laboratory of Biophysics, Institute of Biophysics, Dezhou University, Dezhou 253023, China
| | - Shicai Xu
- Shandong Provincial Key Laboratory of Biophysics, Institute of Biophysics, Dezhou University, Dezhou 253023, China
| | - Huiying Zhao
- Department of Medical Research Center, Sun Yat-Sen Memorial Hospital, Sun Yat-Sen University, Guangzhou 510120, China
| | - Yuedong Yang
- Shandong Provincial Key Laboratory of Biophysics, Institute of Biophysics, Dezhou University, Dezhou 253023, China
- School of Computer Science and Engineering, Sun Yat-Sen University, Guangzhou 510006, China
| | - Yaoqi Zhou
- Shandong Provincial Key Laboratory of Biophysics, Institute of Biophysics, Dezhou University, Dezhou 253023, China
- Institute of Systems and Physical Biology, Shenzhen Bay Laboratory, Shenzhen 518038, China
| | - Jihua Wang
- Shandong Provincial Key Laboratory of Biophysics, Institute of Biophysics, Dezhou University, Dezhou 253023, China
| |
|
28
|
Momanyi BM, Zhou YW, Grace-Mercure BK, Temesgen SA, Basharat A, Ning L, Tang L, Gao H, Lin H, Tang H. SAGESDA: Multi-GraphSAGE networks for predicting SnoRNA-disease associations. Curr Res Struct Biol 2023; 7:100122. [PMID: 38188542 PMCID: PMC10771890 DOI: 10.1016/j.crstbi.2023.100122] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 10/04/2023] [Revised: 11/30/2023] [Accepted: 12/24/2023] [Indexed: 01/09/2024] Open
Abstract
Over the years, extensive research has highlighted the functional roles of small nucleolar RNAs (snoRNAs) in various biological processes associated with the development of complex human diseases. Therefore, understanding the existing relationships between different snoRNAs and diseases is crucial for advancing disease diagnosis and treatment. However, classical biological experiments for identifying snoRNA-disease associations are expensive and time-consuming. Consequently, there is an urgent need for cost-effective computational techniques that can enhance the efficiency and accuracy of prediction. While several computational models have already been proposed, many suffer from limitations and suboptimal performance. In this study, we introduced a novel Graph Neural Network (GNN)-based classification model, called SAGESDA, which is implemented through the GraphSAGE architecture with attention for the prediction of snoRNA-disease associations. The classifier leverages local neighbouring nodes in a heterogeneous network to generate new node embeddings through message passing. The mini-batch gradient descent technique was applied to divide the graph into smaller sub-graphs, which enhances the model's accuracy, speed and scalability. With these advancements, SAGESDA attained an area under the receiver operating characteristic (ROC) curve (AUC) of 0.92 using the standard dot product classifier, surpassing previous related studies. This notable performance demonstrates that SAGESDA is a promising model for predicting unknown snoRNA-disease associations with high accuracy. The SAGESDA implementation details can be obtained from https://github.com/momanyibiffon/SAGESDA.git.
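The message-passing and dot-product steps described in the abstract can be illustrated with a small, dependency-free Python sketch. This is not the SAGESDA implementation (which is available from the repository linked above): it shows a generic GraphSAGE-style mean aggregator without the attention mechanism, and the function names, node names, feature vectors, and identity weight matrices are all hypothetical.

```python
import math


def matvec(w, x):
    """Multiply matrix w (list of rows) by vector x."""
    return [sum(wij * xj for wij, xj in zip(row, x)) for row in w]


def mean_aggregate(node, features, neighbors):
    """Element-wise mean of the feature vectors of a node's neighbours."""
    nbrs = neighbors[node]
    dim = len(features[node])
    if not nbrs:
        return [0.0] * dim
    return [sum(features[n][d] for n in nbrs) / len(nbrs) for d in range(dim)]


def sage_layer(features, neighbors, w_self, w_neigh):
    """One layer: ReLU(W_self * h_v + W_neigh * mean of neighbour features)."""
    out = {}
    for v, h in features.items():
        agg = mean_aggregate(v, features, neighbors)
        z = [a + b for a, b in zip(matvec(w_self, h), matvec(w_neigh, agg))]
        out[v] = [max(0.0, zi) for zi in z]
    return out


def link_score(h_u, h_v):
    """Dot-product classifier: sigmoid of the inner product of two embeddings."""
    dot = sum(a * b for a, b in zip(h_u, h_v))
    return 1.0 / (1.0 + math.exp(-dot))


# Hypothetical toy graph: one snoRNA linked to one disease, one isolated disease.
features = {"snoRNA_A": [1.0, 0.0], "disease_X": [0.0, 1.0], "disease_Y": [1.0, 1.0]}
neighbors = {"snoRNA_A": ["disease_X"], "disease_X": ["snoRNA_A"], "disease_Y": []}
identity = [[1.0, 0.0], [0.0, 1.0]]  # untrained placeholder weights

embeddings = sage_layer(features, neighbors, identity, identity)
print(link_score(embeddings["snoRNA_A"], embeddings["disease_Y"]))
```

In a trained model the weight matrices would be learned by mini-batch gradient descent over sampled sub-graphs, and the link score would be thresholded or ranked to propose candidate snoRNA-disease associations.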
Affiliation(s)
- Biffon Manyura Momanyi
- School of Computer Science and Engineering, Center for Informational Biology, University of Electronic Science and Technology of China, Chengdu, China
| | - Yu-Wei Zhou
- School of Health Care Technology, Chengdu Neusoft University, Chengdu, China
| | - Bakanina Kissanga Grace-Mercure
- School of Life Science and Technology, Center for Informational Biology, University of Electronic Science and Technology of China, Chengdu, 610054, China
| | - Sebu Aboma Temesgen
- School of Life Science and Technology, Center for Informational Biology, University of Electronic Science and Technology of China, Chengdu, 610054, China
| | - Ahmad Basharat
- School of Life Science and Technology, Center for Informational Biology, University of Electronic Science and Technology of China, Chengdu, 610054, China
| | - Lin Ning
- School of Health Care Technology, Chengdu Neusoft University, Chengdu, China
- School of Life Science and Technology, Center for Informational Biology, University of Electronic Science and Technology of China, Chengdu, 610054, China
| | - Lixia Tang
- School of Life Science and Technology, Center for Informational Biology, University of Electronic Science and Technology of China, Chengdu, 610054, China
| | - Hui Gao
- School of Computer Science and Engineering, Center for Informational Biology, University of Electronic Science and Technology of China, Chengdu, China
| | - Hao Lin
- School of Life Science and Technology, Center for Informational Biology, University of Electronic Science and Technology of China, Chengdu, 610054, China
| | - Hua Tang
- School of Basic Medical Sciences, Southwest Medical University, Luzhou, 646000, China
- Basic Medicine Research Innovation Center for Cardiometabolic Diseases, Ministry of Education, Luzhou, 646000, China
- Central Nervous System Drug Key Laboratory of Sichuan Province, Luzhou, 646000, China
| |
|
29
|
Zulian V, Fiscon G, Paci P, Garbuglia AR. Hepatitis B Virus and microRNAs: A Bioinformatics Approach. Int J Mol Sci 2023; 24:17224. [PMID: 38139051 PMCID: PMC10743825 DOI: 10.3390/ijms242417224] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 10/12/2023] [Revised: 11/20/2023] [Accepted: 12/05/2023] [Indexed: 12/24/2023] Open
Abstract
In recent decades, microRNAs (miRNAs) have emerged as key regulators of gene expression, and the identification of viral miRNAs (v-miRNAs) within some viruses, including hepatitis B virus (HBV), has attracted significant attention. HBV infections often progress to chronic states (CHB) and may induce fibrosis/cirrhosis and hepatocellular carcinoma (HCC). The presence of HBV can dysregulate host miRNA expression, influencing several biological pathways, such as apoptosis, innate and immune response, viral replication, and pathogenesis. Consequently, miRNAs are considered promising biomarkers for diagnosis, prognosis, and treatment response. The dynamics of miRNAs during HBV infection are multifaceted, influenced by host variability and miRNA interactions. Given the ability of miRNAs to target multiple messenger RNAs (mRNAs), understanding the viral-host (human) interplay is complex but essential to develop novel clinical applications. Therefore, bioinformatics can help to analyze, identify, and interpret a vast amount of miRNA data. This review explores the bioinformatics tools available for viral and host miRNA research. Moreover, we provide a brief overview focusing on the role of miRNAs during HBV infection. In this way, this review aims to help in selecting the most appropriate bioinformatics tools based on requirements and research goals.
Affiliation(s)
- Verdiana Zulian
- Virology Laboratory, National Institute for Infectious Diseases “Lazzaro Spallanzani” IRCCS, 00149 Rome, Italy
| | - Giulia Fiscon
- Department of Computer, Control and Management Engineering, Sapienza University of Rome, 00185 Rome, Italy
- Institute for Systems Analysis and Computer Science “Antonio Ruberti”, National Research Council, 00185 Rome, Italy
| | - Paola Paci
- Department of Computer, Control and Management Engineering, Sapienza University of Rome, 00185 Rome, Italy
- Institute for Systems Analysis and Computer Science “Antonio Ruberti”, National Research Council, 00185 Rome, Italy
| | - Anna Rosa Garbuglia
- Virology Laboratory, National Institute for Infectious Diseases “Lazzaro Spallanzani” IRCCS, 00149 Rome, Italy
| |
|
30
|
Böğürcü-Seidel N, Ritschel N, Acker T, Németh A. Beyond ribosome biogenesis: noncoding nucleolar RNAs in physiology and tumor biology. Nucleus 2023; 14:2274655. [PMID: 37906621 PMCID: PMC10730139 DOI: 10.1080/19491034.2023.2274655] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 07/31/2023] [Accepted: 10/19/2023] [Indexed: 11/02/2023] Open
Abstract
The nucleolus, the largest subcompartment of the nucleus, stands out from the nucleoplasm due to its exceptionally high local RNA and low DNA concentrations. Within this central hub of nuclear RNA metabolism, ribosome biogenesis is the most prominent ribonucleoprotein (RNP) biogenesis process, critically determining the structure and function of the nucleolus. However, recent studies have shed light on other roles of the nucleolus, exploring the interplay with various noncoding RNAs that are not directly involved in ribosome synthesis. This review focuses on this intriguing topic and summarizes the techniques to study and the latest findings on nucleolar long noncoding RNAs (lncRNAs) as well as microRNAs (miRNAs) in the context of nucleolus biology beyond ribosome biogenesis. We particularly focus on the multifaceted roles of the nucleolus and noncoding RNAs in physiology and tumor biology.
Affiliation(s)
| | - Nadja Ritschel
- Institute of Neuropathology, Justus Liebig University Giessen, Giessen, Germany
| | - Till Acker
- Institute of Neuropathology, Justus Liebig University Giessen, Giessen, Germany
| | - Attila Németh
- Institute of Neuropathology, Justus Liebig University Giessen, Giessen, Germany
| |
|
31
|
Peng L, Huang L, Su Q, Tian G, Chen M, Han G. LDA-VGHB: identifying potential lncRNA-disease associations with singular value decomposition, variational graph auto-encoder and heterogeneous Newton boosting machine. Brief Bioinform 2023; 25:bbad466. [PMID: 38127089 PMCID: PMC10734633 DOI: 10.1093/bib/bbad466] [Citation(s) in RCA: 6] [Impact Index Per Article: 6.0] [Received: 08/05/2023] [Revised: 10/05/2023] [Accepted: 11/25/2023] [Indexed: 12/23/2023] Open
Abstract
Long noncoding RNAs (lncRNAs) participate in various biological processes and have close linkages with diseases. In vivo and in vitro experiments have validated many associations between lncRNAs and diseases. However, biological experiments are time-consuming and expensive. Here, we introduce LDA-VGHB, an lncRNA-disease association (LDA) identification framework that combines feature extraction based on singular value decomposition and a variational graph autoencoder with LDA classification based on a heterogeneous Newton boosting machine. LDA-VGHB was compared with four classical LDA prediction methods (i.e. SDLDA, LDNFSGB, IPCARF and LDASR) and four popular boosting models (XGBoost, AdaBoost, CatBoost and LightGBM) under 5-fold cross-validations on lncRNAs, diseases, lncRNA-disease pairs, and independent lncRNAs and independent diseases, respectively. It greatly outperformed the other methods under the four different cross-validations on the lncRNADisease and MNDR databases. We further investigated potential lncRNAs for lung cancer, breast cancer, colorectal cancer and kidney neoplasms and inferred the top 20 lncRNAs associated with each of them among all their unobserved lncRNAs. The results showed that most of the predicted top 20 lncRNAs have been verified by biomedical experiments provided by the Lnc2Cancer 3.0, lncRNADisease v2.0 and RNADisease databases as well as publications. We found that HAR1A, KCNQ1DN, ZFAT-AS1 and HAR1B could associate with lung cancer, breast cancer, colorectal cancer and kidney neoplasms, respectively. These results need further biological experimental validation. We anticipate that LDA-VGHB can identify possible lncRNAs for complex diseases. LDA-VGHB is publicly available at https://github.com/plhhnu/LDA-VGHB.
Affiliation(s)
- Lihong Peng
  - School of Computer Science, Hunan University of Technology, 412007, Hunan, China
  - College of Life Sciences and Chemistry, Hunan University of Technology, 412007, Hunan, China
- Liangliang Huang
  - School of Computer Science, Hunan University of Technology, 412007, Hunan, China
- Qiongli Su
  - Department of Pharmacy, the Affiliated Zhuzhou Hospital Xiangya Medical College CSU, 412007, Hunan, China
- Geng Tian
  - Geneis (Beijing) Co. Ltd, China, 100102, Beijing, China
- Min Chen
  - School of Computer Science, Hunan Institute of Technology, 421002, No. 18 Henghua Road, Zhuhui District, Hengyang, Hunan, China
- Guosheng Han
  - School of Mathematics and Computational Science, Xiangtan University, 411105, Yuhu District, Xiangtan, Hunan, China
  - Hunan Key Laboratory for Computation and Simulation in Science and Engineering, Xiangtan University, 411105, Yuhu District, Xiangtan, Hunan, China
|
32
|
Bai T, Liu B. ncRNALocate-EL: a multi-label ncRNA subcellular locality prediction model based on ensemble learning. Brief Funct Genomics 2023; 22:442-452. [PMID: 37122147 DOI: 10.1093/bfgp/elad007] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 09/27/2022] [Revised: 12/31/2022] [Accepted: 01/31/2023] [Indexed: 05/02/2023] Open
Abstract
Subcellular localizations of ncRNAs are associated with specific functions. Currently, an increasing number of biological researchers are focusing on computational approaches to identify subcellular localizations of ncRNAs. However, the performance of the existing computational methods is low and needs to be further studied. First, most prediction models are trained with outdated databases. Second, only a few predictors can identify multiple subcellular localizations simultaneously. In this work, we establish three human ncRNA subcellular datasets based on the latest RNALocate, covering lncRNA, miRNA and snoRNA, and then we propose a novel multi-label classification model based on ensemble learning, called ncRNALocate-EL, to identify multi-label subcellular localizations of the three ncRNA types. The results show that ncRNALocate-EL outperforms previous methods. Our method achieved an average precision of 0.709, 0.977 and 0.730 on the three human ncRNA datasets. The web server of ncRNALocate-EL has been established and can be accessed at https://bliulab.net/ncRNALocate-EL.
|
33
|
Zhang L, Chen M, Hu X, Deng L. Graph Convolutional Network and Contrastive Learning Small Nucleolar RNA (snoRNA) Disease Associations (GCLSDA): Predicting snoRNA-Disease Associations via Graph Convolutional Network and Contrastive Learning. Int J Mol Sci 2023; 24:14429. [PMID: 37833876 PMCID: PMC10572952 DOI: 10.3390/ijms241914429] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 08/15/2023] [Revised: 09/17/2023] [Accepted: 09/18/2023] [Indexed: 10/15/2023] Open
Abstract
Small nucleolar RNAs (snoRNAs) constitute a prevalent class of noncoding RNAs localized within the nucleoli of eukaryotic cells. Their involvement in diverse diseases underscores the significance of forecasting associations between snoRNAs and diseases. However, conventional experimental techniques for identifying such associations suffer from limited scalability, protracted timelines, and suboptimal success rates. Consequently, efficient computational methodologies are imperative for accurate prediction of snoRNA-disease associations. Herein, we introduce GCLSDA, which uses a graph convolutional network and contrastive learning to predict snoRNA-disease associations. GCLSDA is a framework that combines graph convolution networks and self-supervised learning for snoRNA-disease association prediction. Leveraging the MNDR v4.0 and ncRPheno databases, we construct a snoRNA-disease association dataset, which serves as the foundation for creating bipartite graphs. The light graph convolutional network (LightGCN) is used to learn embedded representations of both snoRNAs and diseases. GCLSDA incorporates contrastive learning to address the sparsity and over-smoothing of the correlation matrices. This combination improves both the precision of predictions and the model's robustness. Moreover, we introduce random-noise augmentation to refine the embedded snoRNA representations, further enhancing the precision of predictions. Within the contrastive learning component, we unite the contrastive and recommendation tasks. This streamlines the cross-layer contrast process, simplifying information propagation and reducing computational complexity. Across extensive experiments, GCLSDA consistently shows promising predictive capacity for snoRNA-disease associations. This success not only contributes valuable insights into the functional roles of snoRNAs in disease etiology, but also plays an instrumental role in identifying potential drug targets and catalyzing innovative treatment modalities.
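The embedding step mentioned in this abstract can be illustrated with a minimal sketch of LightGCN-style propagation on a snoRNA-disease bipartite graph. It follows the general LightGCN recipe (neighbor aggregation with symmetric degree normalization, no feature transform or nonlinearity, final embedding taken as the mean over layers) and is not the GCLSDA authors' code. The function name `lightgcn_embeddings`, its parameters, and the toy inputs are hypothetical, and the contrastive loss and random-noise augmentation described in the abstract are deliberately left out.

```python
import math

def lightgcn_embeddings(edges, n_sno, n_dis, init, n_layers=2):
    """LightGCN-style propagation on a bipartite graph (illustrative sketch).

    edges: list of (snoRNA_index, disease_index) known associations.
    init:  initial embeddings, snoRNA rows first, then disease rows.
    Returns the mean of the embeddings over layers 0..n_layers.
    """
    n = n_sno + n_dis
    neighbors = [[] for _ in range(n)]
    for s, d in edges:
        u, v = s, n_sno + d          # disease nodes are offset by n_sno
        neighbors[u].append(v)
        neighbors[v].append(u)
    deg = [len(nb) for nb in neighbors]
    dim = len(init[0])

    layer = [list(row) for row in init]   # embeddings at the current layer
    total = [list(row) for row in init]   # running sum over layers
    for _ in range(n_layers):
        nxt = [[0.0] * dim for _ in range(n)]
        for u in range(n):
            for v in neighbors[u]:
                # symmetric normalization 1 / sqrt(deg(u) * deg(v))
                w = 1.0 / math.sqrt(deg[u] * deg[v])
                for j in range(dim):
                    nxt[u][j] += w * layer[v][j]
        layer = nxt
        for u in range(n):
            for j in range(dim):
                total[u][j] += layer[u][j]
    # uniform layer weights 1 / (n_layers + 1)
    return [[x / (n_layers + 1) for x in row] for row in total]
```

An association score for a snoRNA-disease pair would then typically be the inner product of the two final embeddings; that scoring rule is likewise an assumption here rather than a detail stated in the abstract.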
Affiliation(s)
- Lei Deng
  - School of Computer Science and Engineering, Central South University, Changsha 410083, China; (L.Z.); (M.C.); (X.H.)
|
34
|
Wu J, Ning Z, Ding Y, Wang Y, Peng Q, Fu L. KGETCDA: an efficient representation learning framework based on knowledge graph encoder from transformer for predicting circRNA-disease associations. Brief Bioinform 2023; 24:bbad292. [PMID: 37587836 DOI: 10.1093/bib/bbad292] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 03/31/2023] [Revised: 07/27/2023] [Accepted: 07/27/2023] [Indexed: 08/18/2023] Open
Abstract
Recent studies have demonstrated the significant role that circRNA plays in the progression of human diseases. Identifying circRNA-disease associations (CDA) in an efficient manner can offer crucial insights into disease diagnosis. While traditional biological experiments can be time-consuming and labor-intensive, computational methods have emerged as a viable alternative in recent years. However, these methods are often limited by data sparsity and their inability to explore high-order information. In this paper, we introduce a novel method named Knowledge Graph Encoder from Transformer for predicting CDA (KGETCDA). Specifically, KGETCDA first integrates more than 10 databases to construct a large heterogeneous non-coding RNA dataset, which contains multiple relationships between circRNA, miRNA, lncRNA and disease. Then, a biological knowledge graph is created based on this dataset, and Transformer-based knowledge representation learning and attentive propagation layers are applied to obtain high-quality embeddings that accurately capture high-order interaction information. Finally, a multilayer perceptron is used to predict the matching scores of CDA based on their embeddings. Our empirical results demonstrate that KGETCDA significantly outperforms other state-of-the-art models. To enhance user experience, we have developed an interactive web-based platform named HNRBase that allows users to visualize, download data and make predictions using KGETCDA with ease. The code and datasets are publicly available at https://github.com/jinyangwu/KGETCDA.
Affiliation(s)
- Jinyang Wu
  - School of Automation Science and Engineering, Xi'an Jiaotong University, 710049, Shaanxi, China
- Zhiwei Ning
  - School of Automation Science and Engineering, Xi'an Jiaotong University, 710049, Shaanxi, China
- Yidong Ding
  - School of Automation Science and Engineering, Xi'an Jiaotong University, 710049, Shaanxi, China
- Ying Wang
  - School of Automation Science and Engineering, Xi'an Jiaotong University, 710049, Shaanxi, China
- Qinke Peng
  - School of Automation Science and Engineering, Xi'an Jiaotong University, 710049, Shaanxi, China
- Laiyi Fu
  - School of Automation Science and Engineering, Xi'an Jiaotong University, 710049, Shaanxi, China
  - Research Institute of Xi'an Jiaotong University, 311200, Zhejiang, China
  - Sichuan Digital Economy Industry Development Research Institute, 610036, Sichuan, China
|
35
|
Rouka E, Zarogiannis SG, Hatzoglou C, Gourgoulianis KI, Malli F. Identification of Genes and miRNAs Associated with TAFI-Related Thrombosis: An in Silico Study. Biomolecules 2023; 13:1318. [PMID: 37759718 PMCID: PMC10526758 DOI: 10.3390/biom13091318] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 07/18/2023] [Revised: 08/23/2023] [Accepted: 08/24/2023] [Indexed: 09/29/2023] Open
Abstract
Thrombin-Activatable Fibrinolysis Inhibitor (TAFI) is a carboxypeptidase B-like proenzyme encoded by the CPB2 gene. After thrombin activation, TAFI downregulates fibrinolysis, thus linking the latter with coagulation. TAFI has been shown to play a role in venous and arterial thrombotic diseases, yet data regarding the molecular mechanisms underlying its function have been conflicting. In this study, we focused on the prediction and functional enrichment analysis (FEA) of the TAFI interaction network and the microRNAs (miRNAs) targeting the members of this network, in an attempt to identify novel components and pathways of TAFI-related thrombosis. To this end, we used nine bioinformatics software tools. We found that the TAFI interactome consists of 28 unique genes mainly involved in hemostasis. Twenty-four miRNAs were predicted to target these genes. Co-annotation analysis of the predicted interactors with respect to Kyoto Encyclopedia of Genes and Genomes (KEGG) pathways and transcription factors (TFs) pointed to the complement and coagulation cascades as well as neutrophil extracellular trap formation. Cancer, stroke, and intracranial aneurysm were among the top 20 significant diseases related to the identified miRNAs. We reason that the predicted biomolecules should be further studied in the context of TAFI-related thrombosis.
Affiliation(s)
- Erasmia Rouka
  - Department of Nursing, School of Health Sciences, University of Thessaly, GAIOPOLIS, 41500 Larissa, Greece
  - Department of Physiology, Faculty of Medicine, School of Health Sciences, University of Thessaly, BIOPOLIS, 41500 Larissa, Greece; (S.G.Z.); (C.H.)
- Sotirios G. Zarogiannis
  - Department of Physiology, Faculty of Medicine, School of Health Sciences, University of Thessaly, BIOPOLIS, 41500 Larissa, Greece; (S.G.Z.); (C.H.)
- Chrissi Hatzoglou
  - Department of Physiology, Faculty of Medicine, School of Health Sciences, University of Thessaly, BIOPOLIS, 41500 Larissa, Greece; (S.G.Z.); (C.H.)
- Konstantinos I. Gourgoulianis
  - Department of Respiratory Medicine, Faculty of Medicine, School of Health Sciences, University of Thessaly, BIOPOLIS, 41500 Larissa, Greece
- Foteini Malli
  - Department of Nursing, School of Health Sciences, University of Thessaly, GAIOPOLIS, 41500 Larissa, Greece
  - Department of Respiratory Medicine, Faculty of Medicine, School of Health Sciences, University of Thessaly, BIOPOLIS, 41500 Larissa, Greece
|
36
|
Biyu H, GuangWen T, Ming Z, Lixin G, Mengshan L. A lncRNA-disease association prediction model based on the two-step PU learning and fully connected neural networks. Heliyon 2023; 9:e17726. [PMID: 37539215 PMCID: PMC10395133 DOI: 10.1016/j.heliyon.2023.e17726] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 03/09/2023] [Revised: 06/13/2023] [Accepted: 06/26/2023] [Indexed: 08/05/2023] Open
Abstract
Long non-coding RNAs (lncRNAs) have been shown to play a regulatory role in various processes of human diseases. However, lncRNA experiments are inefficient, time-consuming and highly subjective, so the number of experimentally verified associations between lncRNAs and diseases is limited. In the era of big data, numerous machine learning methods have been proposed to predict potential associations between lncRNAs and diseases, but the characteristics of the association data have seldom been explored. In these methods, negative samples are randomly selected for model training, so the model is prone to learning potential positive associations as negatives, which degrades prediction accuracy. In this paper, we propose a cyclic optimization model for predicting lncRNA-disease associations (COPTLDA for short). In COPTLDA, a two-step training strategy is adopted: the unlabeled samples most likely to be negative are identified and treated as negative samples, and these are combined with the known positive samples to train the model. The searching and training steps are repeated until the best model is obtained as the final prediction model. To evaluate the performance of the model, 30% of the known positive samples are used to calculate the model accuracy and 10% of the positive samples are used to calculate the recall rate of the model. The sampling strategy used in this paper can improve the accuracy, and the AUC value reaches 0.9348. The results of case studies showed that the model could predict potential associations between lncRNAs and malignant tumors such as colorectal cancer, gastric cancer, and breast cancer. The predicted top 20 associated lncRNAs included 10 colorectal cancer lncRNAs, 2 gastric cancer lncRNAs, and 8 breast cancer lncRNAs.
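The two-step search-and-train loop described in this abstract can be sketched roughly as follows. This is a toy illustration of two-step positive-unlabeled (PU) learning in general, not the COPTLDA implementation: a simple nearest-centroid scorer stands in for the paper's fully connected neural network, and the names `two_step_pu` and `positive_score`, the `neg_fraction` parameter, and the stopping rule are all assumptions made for the example.

```python
def centroid(rows):
    """Component-wise mean of a list of feature vectors."""
    n = len(rows)
    return [sum(col) / n for col in zip(*rows)]

def sq_dist(a, b):
    """Squared Euclidean distance between two vectors."""
    return sum((x - y) ** 2 for x, y in zip(a, b))

def positive_score(x, pos_c, neg_c):
    """Higher when x is closer to the positive centroid than to the negative one."""
    return sq_dist(x, neg_c) - sq_dist(x, pos_c)

def two_step_pu(positives, unlabeled, n_rounds=5, neg_fraction=0.3):
    """Alternate between (1) picking likely negatives and (2) refitting the scorer.

    Stand-in model: two class centroids (the abstract uses a neural network).
    Returns (positive centroid, negative centroid, selected negatives).
    """
    pos_c = centroid(positives)
    # Initial round: treat every unlabeled sample as a provisional negative.
    negatives = list(unlabeled)
    k = max(1, int(neg_fraction * len(unlabeled)))
    for _ in range(n_rounds):
        neg_c = centroid(negatives)                 # training step
        ranked = sorted(unlabeled,
                        key=lambda x: positive_score(x, pos_c, neg_c))
        new_negatives = ranked[:k]                  # search step: lowest scores
        if new_negatives == negatives:              # selection has stabilized
            break
        negatives = new_negatives
    return pos_c, centroid(negatives), negatives
```

In this sketch, unlabeled samples that resemble the known positives are never promoted to the negative set, which is the point of the strategy: hidden positives are kept out of the negative training data instead of being sampled at random.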
Affiliation(s)
- Li Mengshan
  - Corresponding author. Gannan Normal University, China.
|
37
|
Recent advances in predicting lncRNA-disease associations based on computational methods. Drug Discov Today 2023; 28:103432. [PMID: 36370992 DOI: 10.1016/j.drudis.2022.103432] [Citation(s) in RCA: 4] [Impact Index Per Article: 4.0] [Received: 08/30/2022] [Revised: 10/19/2022] [Accepted: 11/03/2022] [Indexed: 11/11/2022]
Abstract
Mutations in and dysregulation of long non-coding RNAs (lncRNAs) are closely associated with the development of various human complex diseases, but only a few lncRNAs have been experimentally confirmed to be associated with human diseases. Predicting new potential lncRNA-disease associations (LDAs) will help us to understand the pathogenesis of human diseases and to detect disease markers, as well as in disease diagnosis, prevention and treatment. Computational methods can effectively narrow down the screening scope of biological experiments, thereby reducing the duration and cost of such experiments. In this review, we outline recent advances in computational methods for predicting LDAs, focusing on LDA databases, lncRNA/disease similarity calculations, and advanced computational models. In addition, we analyze the limitations of various computational models and discuss future challenges and directions for development.
|