1
Dong H, Huang D, Zhang J, Xu D, Jiao X, Wang W. Exploring the innate immune system of Urechis unicinctus: Insights from full-length transcriptome analysis. Gene 2024; 928:148784. [PMID: 39047957 DOI: 10.1016/j.gene.2024.148784] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 04/07/2024] [Revised: 07/16/2024] [Accepted: 07/18/2024] [Indexed: 07/27/2024]
Abstract
The Echiura worm Urechis unicinctus is a common benthic invertebrate found in the intertidal zone of Huanghai and Bohai Bay. U. unicinctus is known to contain various physiologically active substances, making it highly valuable in terms of its edibility, medicinal properties, and economic potential. Nonetheless, the limited research on the immune system of U. unicinctus poses difficulties for its aquaculture and artificial reproduction. Marine invertebrates, including shellfish and U. unicinctus, are thought to depend primarily on their innate immune system for disease protection, owing to the several innate immune molecules they possess. Herein, we employed PacBio single-molecule real-time (SMRT) sequencing technology to perform a full-length transcriptome analysis of U. unicinctus individuals under five different conditions (room temperature (RT), low temperature (LT), high temperature (HT), without water (DRY), ultraviolet irradiation (UV)). Consequently, we identified 59,371 unigenes with an average length of 2,779 bp, 2,613 long non-coding RNAs (lncRNAs), 59,190 coding sequences (CDSs), 35,166 simple sequence repeats (SSRs), and 1,733 transcription factors (TFs), successfully annotating 90.58 % (53,778) of the unigenes. Subsequently, key factors associated with immune-related processes, such as non-self-recognition, cellular immune defenses, and humoral immune defenses, were searched. Our study also identified pattern recognition receptors (PRRs) that included 17 peptidoglycan recognition proteins (PGRPs), 13 Gram-negative binding proteins (GNBPs), 18 scavenger receptors (SRs), 74 toll-like receptors (TLRs), and 89 C-type lectins (CLTs). Altogether, the high-quality transcriptome data obtained here will offer valuable insights for further investigations into the U. unicinctus innate immune response, laying the foundation for subsequent molecular biology studies and aquaculture.
Affiliation(s)
- Haomiao Dong
  - Key Laboratory of Coastal Biology and Biological Resource Utilization, Yantai Institute of Coastal Zone Research, Chinese Academy of Sciences, Yantai 264003, China; University of Chinese Academy of Sciences, Beijing 100049, China
- Dong Huang
  - Key Laboratory of Coastal Biology and Biological Resource Utilization, Yantai Institute of Coastal Zone Research, Chinese Academy of Sciences, Yantai 264003, China
- Jian Zhang
  - School of Ocean, Yantai University, Yantai 264005, China
- Dong Xu
  - Shandong Blue Ocean Technology Co., Ltd, Yantai 261400, China
- Xudong Jiao
  - Key Laboratory of Coastal Biology and Biological Resource Utilization, Yantai Institute of Coastal Zone Research, Chinese Academy of Sciences, Yantai 264003, China; University of Chinese Academy of Sciences, Beijing 100049, China
- Weizhong Wang
  - Shandong Blue Ocean Technology Co., Ltd, Yantai 261400, China

2
Wen S, Liu Y, Yang G, Chen W, Wu H, Zhu X, Wang Y. A method for miRNA diffusion association prediction using machine learning decoding of multi-level heterogeneous graph Transformer encoded representations. Sci Rep 2024; 14:20490. [PMID: 39227405 PMCID: PMC11371806 DOI: 10.1038/s41598-024-68897-4] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 03/13/2024] [Accepted: 07/29/2024] [Indexed: 09/05/2024] Open
Abstract
MicroRNAs (miRNAs) are a key class of endogenous non-coding RNAs that play a pivotal role in regulating diseases. Accurately predicting the intricate relationships between miRNAs and diseases carries profound implications for disease diagnosis, treatment, and prevention. However, these prediction tasks are highly challenging due to the complexity of the underlying relationships. While numerous effective prediction models exist for validating these associations, they often encounter information distortion due to limitations in efficiently retaining information during the encoding-decoding process. Inspired by the multi-layer heterogeneous graph Transformer and the XGBoost machine learning classifier, this study introduces a novel computational approach based on a multi-layer heterogeneous encoder-machine learning decoder structure for miRNA-disease association prediction (MHXGMDA). First, we employ multi-view similarity matrices as the input coding for MHXGMDA. Subsequently, we utilize the multi-layer heterogeneous encoder to capture the embeddings of miRNAs and diseases, aiming to retain the maximum amount of relevant features. Finally, the information from all layers is concatenated to serve as input to the machine learning classifier, ensuring maximal preservation of encoding details. We conducted a comprehensive comparison of seven different classifier models and ultimately selected the XGBoost algorithm as the decoder. This algorithm leverages miRNA embedding features and disease embedding features to decode and predict the association scores between miRNAs and diseases. We applied MHXGMDA to predict human miRNA-disease associations on two benchmark datasets. Experimental findings demonstrate that our approach surpasses several leading methods in terms of both the area under the receiver operating characteristic curve and the area under the precision-recall curve.
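The concatenation step described in this abstract can be pictured with a minimal sketch. It assumes that each encoder layer yields one embedding vector per miRNA and per disease; the function name `pair_features`, the toy vectors, and the layer count are hypothetical illustrations and are not taken from the MHXGMDA code. The resulting vector is the kind of input a classifier such as XGBoost would receive.

```python
from typing import List, Sequence

Vector = List[float]


def pair_features(
    mirna_layers: Sequence[Sequence[Vector]],
    disease_layers: Sequence[Sequence[Vector]],
    mirna_idx: int,
    disease_idx: int,
) -> Vector:
    """Concatenate the embeddings of one miRNA and one disease across all layers.

    mirna_layers[k][i] is the (hypothetical) layer-k embedding of miRNA i;
    disease_layers[k][j] is the layer-k embedding of disease j.
    """
    features: Vector = []
    for layer in mirna_layers:
        features.extend(layer[mirna_idx])
    for layer in disease_layers:
        features.extend(layer[disease_idx])
    return features


# Toy example (assumed data): 2 layers, 2 miRNAs, 1 disease, embedding size 2.
mirna_layers = [
    [[0.1, 0.2], [0.3, 0.4]],  # layer 0
    [[0.5, 0.6], [0.7, 0.8]],  # layer 1
]
disease_layers = [
    [[1.0, 1.1]],  # layer 0
    [[1.2, 1.3]],  # layer 1
]

x = pair_features(mirna_layers, disease_layers, mirna_idx=1, disease_idx=0)
print(x)
```

Keeping every layer's output, rather than only the last layer's, is what the abstract refers to as preserving encoding details; the cost is a feature vector whose length grows with the number of layers.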
Affiliation(s)
- SiJian Wen
  - School of Information and Artificial Intelligence, Anhui Agricultural University, Hefei, 230036, China
- YinBo Liu
  - School of Information and Artificial Intelligence, Anhui Agricultural University, Hefei, 230036, China
- Guang Yang
  - School of Information and Artificial Intelligence, Anhui Agricultural University, Hefei, 230036, China
- WenXi Chen
  - School of Information and Artificial Intelligence, Anhui Agricultural University, Hefei, 230036, China
- HaiTao Wu
  - School of Information and Artificial Intelligence, Anhui Agricultural University, Hefei, 230036, China
- XiaoLei Zhu
  - School of Information and Artificial Intelligence, Anhui Agricultural University, Hefei, 230036, China
- YongMei Wang
  - School of Information and Artificial Intelligence, Anhui Agricultural University, Hefei, 230036, China
  - Anhui Provincial Engineering Laboratory for Beidou Precision Agriculture Information, Hefei, 230036, China

3
Xuan P, Wang W, Cui H, Wang S, Nakaguchi T, Zhang T. Mask-Guided Target Node Feature Learning and Dynamic Detailed Feature Enhancement for lncRNA-Disease Association Prediction. J Chem Inf Model 2024; 64:6662-6675. [PMID: 39112431 DOI: 10.1021/acs.jcim.4c00652] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 08/27/2024]
Abstract
Identifying new relevant long noncoding RNAs (lncRNAs) for various human diseases can facilitate the exploration of the causes and progression of these diseases. Recently, several graph inference methods have been proposed to predict disease-related lncRNAs by exploiting the topological structure and node attributes within graphs. However, these methods did not prioritize the target lncRNA and disease nodes over auxiliary nodes like miRNA nodes, potentially limiting their ability to fully utilize the features of the target nodes. We propose a new method, mask-guided target node feature learning and dynamic detailed feature enhancement for lncRNA-disease association prediction (MDLD), to enhance node feature learning for improved lncRNA-disease association prediction. First, we designed a heterogeneous graph masked transformer autoencoder to guide feature learning, focusing more on the features of target lncRNA (disease) nodes. The target nodes were increasingly masked as training progressed, which helps develop a more robust prediction model. Second, we developed a graph convolutional network with dynamic residuals (GCNDR) to learn and integrate the heterogeneous topology and features of all lncRNA, disease, and miRNA nodes. GCNDR employs an interlayer residual strategy and a residual evolution strategy to mitigate oversmoothing caused by multilayer graph convolution. The interlayer residual strategy estimates the importance of node features learned in the previous GCN encoding layer for nodes in the current encoding layer. Additionally, since there are dependencies in the importance of features of individual lncRNA (disease, miRNA) nodes across multiple encoding layers, a gated recurrent unit-based strategy is proposed to encode these dependencies. Finally, we designed a perspective-level attention mechanism to obtain more informative features of lncRNA and disease node pairs from the perspectives of mask-enhanced and dynamic-enhanced node features. 
Cross-validation experimental results demonstrated that MDLD outperformed 10 other state-of-the-art prediction methods. Ablation experiments and case studies on candidate lncRNAs for three diseases further proved the technical contributions of MDLD and its capability to discover disease-related lncRNAs.
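For orientation, the graph convolution that residual strategies of this kind build on can be written as follows. The first line is the widely used GCN propagation rule, with adjacency matrix $A$, degree matrix $\tilde{D}$ of $\tilde{A}$, layer features $H^{(l)}$, weights $W^{(l)}$, and nonlinearity $\sigma$. The second line, with the scalar weight $\alpha^{(l)}$, is only an assumed generic form of an interlayer residual (it requires both layers to share the same feature dimension) and is not MDLD's actual formulation.

```latex
\begin{aligned}
H^{(l+1)} &= \sigma\!\left(\tilde{D}^{-1/2}\,\tilde{A}\,\tilde{D}^{-1/2}\,H^{(l)}\,W^{(l)}\right),
\qquad \tilde{A} = A + I \\
H^{(l+1)}_{\mathrm{res}} &= H^{(l+1)} + \alpha^{(l)}\,H^{(l)}
\end{aligned}
```

Repeated multiplication by the normalized adjacency matrix tends to pull node features toward one another, which is the oversmoothing effect the abstract mentions; re-injecting $H^{(l)}$ keeps part of each node's earlier representation.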
Affiliation(s)
- Ping Xuan
  - Department of Computer Science and Technology, Shantou University, Shantou 515063, China
  - School of Mathematical Science, Heilongjiang University, Harbin 150080, China
- Wei Wang
  - Department of Computer Science and Technology, Shantou University, Shantou 515063, China
- Hui Cui
  - Department of Computer Science and Information Technology, La Trobe University, Melbourne 3083, Australia
- Shuai Wang
  - School of Information Science and Engineering, Yanshan University, Qinhuangdao 066004, China
- Toshiya Nakaguchi
  - Center for Frontier Medical Engineering, Chiba University, Chiba 2638522, Japan
- Tiangang Zhang
  - School of Mathematical Science, Heilongjiang University, Harbin 150080, China

4
Yao D, Zhang B, Zhan X, Zhang B, Li XK. Predicting lncRNA-Disease Associations Based on a Dual-Path Feature Extraction Network with Multiple Sources of Information Integration. ACS Omega 2024; 9:35100-35112. [PMID: 39157140 PMCID: PMC11325412 DOI: 10.1021/acsomega.4c05365] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 06/08/2024] [Revised: 07/04/2024] [Accepted: 07/22/2024] [Indexed: 08/20/2024]
Abstract
Identifying the associations between long noncoding RNAs (lncRNAs) and diseases is critical for disease prevention, diagnosis, and treatment. However, conducting wet experiments to discover these associations is time-consuming and costly. Therefore, computational modeling for predicting lncRNA-disease associations (LDAs) has become an important alternative. To enhance the accuracy of LDA prediction and alleviate the issue of node feature oversmoothing when exploring the potential features of nodes using graph neural networks, we introduce DPFELDA, a dual-path feature extraction network that leverages the integration of information from multiple sources to predict LDAs. Initially, we establish a dual-view structure of lncRNAs and diseases and a heterogeneous network of lncRNA-disease-microRNA (miRNA) interactions. Subsequently, features are extracted using a dual-path feature extraction network. In particular, we employ a combination of a graph convolutional network, a convolutional block attention module, and a node aggregation layer to perform multilayer topology feature extraction for the dual-view structure of lncRNAs and diseases. Additionally, we utilize a Transformer model to construct the node topology feature residual network for obtaining node-specific features in heterogeneous networks. Finally, XGBoost is employed for LDA prediction. The experimental results demonstrate that DPFELDA outperforms the benchmark model on various benchmark data sets. In the course of model exploration, it becomes evident that DPFELDA successfully alleviates the issue of node feature oversmoothing induced by graph-based learning. Ablation experiments confirm the effectiveness of the innovative module, and a case study substantiates the accuracy of the DPFELDA model in predicting novel LDAs for characteristic diseases.
Affiliation(s)
- Dengju Yao
  - School of Computer Science and Technology, Harbin University of Science and Technology, Harbin 150080, China
- Binbin Zhang
  - School of Computer Science and Technology, Harbin University of Science and Technology, Harbin 150080, China
- Xiaojuan Zhan
  - School of Computer Science and Technology, Harbin University of Science and Technology, Harbin 150080, China
  - College of Computer Science and Technology, Heilongjiang Institute of Technology, Harbin 150050, China
- Bo Zhang
  - School of Computer Science and Technology, Harbin University of Science and Technology, Harbin 150080, China
- Xiang Kui Li
  - School of Computer Science and Technology, Harbin University of Science and Technology, Harbin 150080, China

5
Chen X, Yang L, Aslam MF, Tao J, Zhang X, Ren P, Wang Y, Chao P. Functional analysis, virtual screening, and molecular dynamics revealed potential novel drug targets and their inhibitors against cardiovascular disease in human. J Biomol Struct Dyn 2024; 42:6982-6996. [PMID: 37608602 DOI: 10.1080/07391102.2023.2239926] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 03/03/2023] [Accepted: 07/11/2023] [Indexed: 08/24/2023]
Abstract
Cardiovascular disease (CVD) is a group of diseases affecting the human heart and accounting for 30% of deaths worldwide. Major CVDs include heart failure, hypertension, stroke, etc. Various therapeutics are available against CVD; still, there is a dire need to identify potential protein drug targets to reduce the economic burden and mortality rate. The goal of the current study was to utilize sequential computational techniques to find the best cardiovascular drug targets and their inhibitors. Human cardiovascular targets common to both databases (GeneCards and UniProt) were subjected to bioinformatics analyses. The purpose was to validate putative therapeutic targets employing structure-based bioinformatics methods to determine their physicochemical properties and biological processes. Three stable proteins that have 0 transmembrane helices and possess biological processes were screened as potential protein-based therapeutic targets: Hemoglobin subunit beta (HBB), Gamma-enolase (ENO2), and Cholesteryl ester transfer protein (CETP). Tertiary structures of the target proteins were retrieved from PDB, and molecular docking was utilized to evaluate a library of 5000 phytochemicals against the interacting residues of the target proteins, as well as their respective standard drugs, using the MOE and PyRx software. The top five phytochemicals (d-Sesamin, 1,3-benzodioxole, Sativanone, Thiamine, and Cajanol) were identified based on their RMSD and docking scores as compared to their standard drugs. The docking studies were also validated by MM-GBSA binding free energy and molecular dynamics simulations. According to the study's findings, these phytochemicals may eventually be used as drugs to treat CVD. Further in vitro testing is required to confirm their efficacy and drug potency. Communicated by Ramaswamy H. Sarma.
Affiliation(s)
- Xiaoyang Chen
  - Department of Cardiology, People's Hospital of Xinjiang Uygur Autonomous Region, Xinjiang, China
- Lijuan Yang
  - Department of Neurology, People's Hospital of Xinjiang Uygur Autonomous Region, Xinjiang, China
- Jing Tao
  - Department of Rehabilitation, People's Hospital of Xinjiang Uygur Autonomous Region, Xinjiang, China
- Xueqin Zhang
  - Department of Nephrology, People's Hospital of Xinjiang Uygur Autonomous Region, Xinjiang, China
- Peng Ren
  - Department of Cardiology, People's Hospital of Xinjiang Uygur Autonomous Region, Xinjiang, China
- Yong Wang
  - Department of Cardiology, People's Hospital of Xinjiang Uygur Autonomous Region, Xinjiang, China
- Peng Chao
  - Department of Cardiology, People's Hospital of Xinjiang Uygur Autonomous Region, Xinjiang, China

6
Xie G, Li D, Lin Z, Gu G, Li W, Chen R, Liu Z. HPTRMF: Collaborative Matrix Factorization-Based Prediction Method for LncRNA-Disease Associations Using High-Order Perturbation and Flexible Trifactor Regularization. J Chem Inf Model 2024. [PMID: 39058598 DOI: 10.1021/acs.jcim.4c01070] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 07/28/2024]
Abstract
Existing matrix factorization methods face challenges, including the cold start problem and global nonlinear data loss during similarity learning, particularly in predicting associations between long noncoding RNAs (LncRNAs) and diseases. To overcome these issues, we introduce HPTRMF, a matrix factorization approach incorporating high-order perturbation and flexible trifactor regularization. HPTRMF constructs a high-order correlation matrix utilizing the known association matrix, leveraging high-order perturbation to effectively address the cold start problem caused by data sparsity. Additionally, HPTRMF incorporates a flexible trifactor regularization term to capture similarity information on LncRNAs and diseases, enabling the effective handling of global nonlinear data loss by capturing such data in the similarity matrix. Experimental results demonstrate the superiority of HPTRMF over nine state-of-the-art algorithms in Leave-One-Out Cross-Validation (LOOCV) and Five-Fold Cross-Validation (5-Fold CV) on three data sets. HPTRMF and the data sets are available at https://github.com/Llvvvv/HPTRMF.
Affiliation(s)
- Guobo Xie
  - School of Computer Science, Guangdong University of Technology, Guangzhou 510006, China
- Dayin Li
  - School of Computer Science, Guangdong University of Technology, Guangzhou 510006, China
- Zhiyi Lin
  - School of Computer Science, Guangdong University of Technology, Guangzhou 510006, China
- Guosheng Gu
  - School of Computer Science, Guangdong University of Technology, Guangzhou 510006, China
- Weijun Li
  - School of Computer Science, Guangdong University of Technology, Guangzhou 510006, China
- Ruibin Chen
  - School of Computer Science, Guangdong University of Technology, Guangzhou 510006, China
- Zhenguo Liu
  - 2MD Department of Thoracic Surgery, The First Affiliated Hospital of Sun Yat-sen University, Guangzhou 510080, China

7
Calazans MAA, Ferreira FABS, Santos FAN, Madeiro F, Lima JB. Machine Learning and Graph Signal Processing Applied to Healthcare: A Review. Bioengineering (Basel) 2024; 11:671. [PMID: 39061753 PMCID: PMC11273494 DOI: 10.3390/bioengineering11070671] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 05/27/2024] [Revised: 06/20/2024] [Accepted: 06/26/2024] [Indexed: 07/28/2024] Open
Abstract
Signal processing is a very useful field of study for interpreting signals in many everyday applications. In applications with time-varying signals, one possibility is to model the signals as graphs, which brings in graph theory and extends classical methods to the non-Euclidean domain. In addition, machine learning techniques have been widely used for pattern recognition in a wide variety of tasks, including health sciences. The objective of this work is to identify and analyze the papers in the literature that address the use of machine learning applied to graph signal processing in health sciences. A search was performed in four databases (Science Direct, IEEE Xplore, ACM, and MDPI), using search strings to identify papers within the scope of this review. Finally, 45 papers were included in the analysis, the first being published in 2015, which indicates an emerging area. Among the gaps found, we can mention the need for better clinical interpretability of the results obtained in the papers; that is, results and conclusions should not be restricted simply to performance metrics. In addition, a possible research direction is the use of new transforms. It is also important to make new public datasets available that can be used to train the models.
Affiliation(s)
- Felipe A. B. S. Ferreira
  - Unidade Acadêmica do Cabo de Santo Agostinho, Universidade Federal Rural de Pernambuco, Cabo de Santo Agostinho 54518-430, Brazil
- Fernando A. N. Santos
  - Institute for Advanced Studies, Universiteit van Amsterdam, 1012 WP Amsterdam, The Netherlands
- Francisco Madeiro
  - Escola Politécnica de Pernambuco, Universidade de Pernambuco, Recife 50720-001, Brazil
- Juliano B. Lima
  - Centro de Tecnologia e Geociências, Universidade Federal de Pernambuco, Recife 50670-901, Brazil

8
Chini A, Guha P, Rishi A, Obaid M, Udden SN, Mandal SS. Discovery and functional characterization of LncRNAs associated with inflammation and macrophage activation. Methods 2024; 227:1-16. [PMID: 38703879 DOI: 10.1016/j.ymeth.2024.05.001] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 01/23/2024] [Revised: 04/24/2024] [Accepted: 05/01/2024] [Indexed: 05/06/2024] Open
Abstract
Long noncoding RNAs (lncRNAs) are emerging players in the regulation of gene expression and cell signaling, and their dysregulation has been implicated in a multitude of human diseases. Recent studies from our laboratory revealed that lncRNAs play critical roles in cytokine regulation, inflammation, and metabolism. We demonstrated that lncRNA HOTAIR, which is a well-known regulator of gene silencing, plays critical roles in the modulation of cytokines, proinflammatory genes, and glucose metabolism in macrophages during inflammation. In addition, we recently discovered a series of novel lncRNAs that are closely associated with inflammation and macrophage activation. We termed these long-noncoding inflammation associated RNAs (LinfRNAs). We are currently engaged in the functional characterization of these hLinfRNAs (human LinfRNAs) with a focus on their roles in inflammation, and we are investigating their potential implications in chronic inflammatory human diseases. Here, we have summarized experimental methods that have been utilized for the discovery and functional characterization of lncRNAs in inflammation and macrophage activation.
Affiliation(s)
- Avisankar Chini
  - Gene Regulation and Epigenetics Research Laboratory, Department of Chemistry and Biochemistry, The University of Texas at Arlington, Arlington, TX 76019, USA
- Prarthana Guha
  - Gene Regulation and Epigenetics Research Laboratory, Department of Chemistry and Biochemistry, The University of Texas at Arlington, Arlington, TX 76019, USA
- Ashcharya Rishi
  - Gene Regulation and Epigenetics Research Laboratory, Department of Chemistry and Biochemistry, The University of Texas at Arlington, Arlington, TX 76019, USA
- Monira Obaid
  - Gene Regulation and Epigenetics Research Laboratory, Department of Chemistry and Biochemistry, The University of Texas at Arlington, Arlington, TX 76019, USA
- Sm Nashir Udden
  - Department of Radiation Oncology, The University of Texas Southwestern Medical Center, Dallas, TX, 75390, USA
- Subhrangsu S Mandal
  - Gene Regulation and Epigenetics Research Laboratory, Department of Chemistry and Biochemistry, The University of Texas at Arlington, Arlington, TX 76019, USA

9
Peng L, Ren M, Huang L, Chen M. GEnDDn: An lncRNA-Disease Association Identification Framework Based on Dual-Net Neural Architecture and Deep Neural Network. Interdiscip Sci 2024; 16:418-438. [PMID: 38733474 DOI: 10.1007/s12539-024-00619-w] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 11/18/2023] [Revised: 02/02/2024] [Accepted: 02/03/2024] [Indexed: 05/13/2024]
Abstract
Accumulating studies have demonstrated close relationships between long non-coding RNAs (lncRNAs) and diseases. Identification of new lncRNA-disease associations (LDAs) enables us to better understand disease mechanisms and further provides promising insights into cancer targeted therapy and anti-cancer drug design. Here, we present an LDA prediction framework called GEnDDn based on deep learning. GEnDDn mainly comprises two steps. First, features of both lncRNAs and diseases are extracted by combining similarity computation, non-negative matrix factorization, and a graph attention auto-encoder, and each lncRNA-disease pair (LDP) is represented as a vector by concatenating the extracted features. Subsequently, unknown LDPs are classified by aggregating a dual-net neural architecture and a deep neural network. Using six different evaluation metrics, we found that GEnDDn surpassed four competing LDA identification methods (SDLDA, LDNFSGB, IPCARF, LDASR) on the lncRNADisease and MNDR databases under fivefold cross-validation experiments on lncRNAs, diseases, LDPs, and independent lncRNAs and independent diseases. Ablation experiments further validated the powerful LDA prediction performance of GEnDDn. Furthermore, we utilized GEnDDn to find underlying lncRNAs for lung cancer and breast cancer. The results suggested that there may be close links between IFNG-AS1 and lung cancer as well as between HIF1A-AS1 and breast cancer. The results require further biomedical experimental verification. GEnDDn is publicly available at https://github.com/plhhnu/GEnDDn.
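The different cross-validation settings named in this abstract (on lncRNAs, on diseases, on pairs) differ in what is held out together. The following is a minimal sketch under assumed toy data; the helper names `k_folds`, `split_by_pair`, and `split_by_lncrna` are hypothetical and do not come from the GEnDDn repository.

```python
import random
from typing import List, Tuple

Pair = Tuple[int, int]  # (lncRNA index, disease index)


def k_folds(items: List, k: int, seed: int = 0) -> List[List]:
    """Shuffle items and deal them into k roughly equal folds."""
    shuffled = items[:]
    random.Random(seed).shuffle(shuffled)
    return [shuffled[i::k] for i in range(k)]


def split_by_pair(pairs: List[Pair], k: int) -> List[List[Pair]]:
    """CV on pairs: each fold holds out a random subset of pairs."""
    return k_folds(pairs, k)


def split_by_lncrna(pairs: List[Pair], k: int) -> List[List[Pair]]:
    """CV on lncRNAs: each fold holds out every pair of the selected lncRNAs."""
    lncrnas = sorted({lnc for lnc, _ in pairs})
    folds = []
    for held_out in k_folds(lncrnas, k):
        held = set(held_out)
        folds.append([p for p in pairs if p[0] in held])
    return folds


# Toy data (assumed): 10 lncRNAs x 4 diseases, all pairs enumerated.
all_pairs = [(lnc, dis) for lnc in range(10) for dis in range(4)]
pair_folds = split_by_pair(all_pairs, k=5)
lnc_folds = split_by_lncrna(all_pairs, k=5)

for i, fold in enumerate(lnc_folds):
    print(i, sorted({lnc for lnc, _ in fold}), len(fold))
```

Splitting by lncRNA means a test lncRNA has no pairs at all in the training folds, which is a stricter setting than splitting by pair; cross-validation on diseases would follow the same pattern with `p[1]` in place of `p[0]`.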
Affiliation(s)
- Lihong Peng
  - College of Life Science and Chemistry, Hunan University of Technology, Zhuzhou, 412007, China
- Mengnan Ren
  - College of Life Science and Chemistry, Hunan University of Technology, Zhuzhou, 412007, China
- Liangliang Huang
  - College of Life Science and Chemistry, Hunan University of Technology, Zhuzhou, 412007, China
- Min Chen
  - School of Computer Science, Hunan Institute of Technology, Hengyang, 421002, China

10
Nie Z, Gao M, Jin X, Rao Y, Zhang X. MFPINC: prediction of plant ncRNAs based on multi-source feature fusion. BMC Genomics 2024; 25:531. [PMID: 38816689 PMCID: PMC11137975 DOI: 10.1186/s12864-024-10439-3] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 11/15/2023] [Accepted: 05/21/2024] [Indexed: 06/01/2024] Open
Abstract
Non-coding RNAs (ncRNAs) are recognized as pivotal players in the regulation of essential physiological processes such as nutrient homeostasis, development, and stress responses in plants. Common methods for predicting ncRNAs are strongly affected by experimental conditions and computational methods, and therefore require a significant investment of time and resources. We therefore constructed an ncRNA predictor (MFPINC) to predict potential ncRNAs in plants, based on the PINC tool proposed in our previous study. Specifically, sequence features were carefully refined using variance thresholding and F-test methods, while deep features were extracted and feature fusion was performed by applying the GRU model. The comprehensive evaluation on multiple standard datasets shows that MFPINC not only achieves more comprehensive and accurate identification of gene sequences, but also significantly improves the expressive and generalization performance of the model, and significantly outperforms existing competing methods in ncRNA identification. In addition, our tool is available on GitHub (https://github.com/Zhenj-Nie/MFPINC), where the data and source code can also be downloaded for free.
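The two feature-refinement steps named in this abstract can be illustrated with a generic, standard-library sketch. It is not the MFPINC implementation: the function names, the threshold, and the toy matrix are assumptions chosen for illustration, and the F statistic shown is the ordinary one-way ANOVA F computed per feature.

```python
from statistics import mean, pvariance
from typing import Dict, List, Sequence


def variance_filter(X: Sequence[Sequence[float]], threshold: float) -> List[int]:
    """Return indices of feature columns whose population variance exceeds threshold."""
    n_features = len(X[0])
    return [
        j for j in range(n_features)
        if pvariance([row[j] for row in X]) > threshold
    ]


def anova_f(values: Sequence[float], labels: Sequence[int]) -> float:
    """One-way ANOVA F statistic of a single feature across class labels.

    Assumes at least two classes and more samples than classes.
    """
    groups: Dict[int, List[float]] = {}
    for v, y in zip(values, labels):
        groups.setdefault(y, []).append(v)
    grand = mean(values)
    n, k = len(values), len(groups)
    between = sum(len(g) * (mean(g) - grand) ** 2 for g in groups.values())
    within = sum((v - mean(g)) ** 2 for g in groups.values() for v in g)
    if within == 0:
        # No within-class spread; constant features are removed beforehand.
        return float("inf")
    return (between / (k - 1)) / (within / (n - k))


# Toy data (assumed): 4 sequences, 3 features, binary labels.
X = [
    [1.0, 5.0, 0.2],
    [1.0, 6.0, 0.9],
    [1.0, 9.0, 0.1],
    [1.0, 10.0, 0.8],
]
y = [0, 0, 1, 1]

kept = variance_filter(X, threshold=0.0)          # drops the constant column 0
f_scores = {j: anova_f([row[j] for row in X], y) for j in kept}
ranked = sorted(f_scores, key=f_scores.get, reverse=True)
print(kept, ranked)
```

The variance filter discards features that barely change across samples without looking at the labels, while the F statistic ranks the remaining features by how well they separate the classes; the two steps are complementary, which is presumably why they are used together.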
Affiliation(s)
- Zhenjun Nie
  - School of Information and Artificial Intelligence, Anhui Agricultural University, Hefei, 230036, China
- Mengqing Gao
  - School of Information and Artificial Intelligence, Anhui Agricultural University, Hefei, 230036, China
- Xiu Jin
  - School of Information and Artificial Intelligence, Anhui Agricultural University, Hefei, 230036, China
  - Key Laboratory of Agricultural Sensors, Ministry of Agriculture and Rural Affairs, Hefei, 230036, China
- Yuan Rao
  - School of Information and Artificial Intelligence, Anhui Agricultural University, Hefei, 230036, China
  - Key Laboratory of Agricultural Sensors, Ministry of Agriculture and Rural Affairs, Hefei, 230036, China
- Xiaodan Zhang
  - School of Information and Artificial Intelligence, Anhui Agricultural University, Hefei, 230036, China
  - Key Laboratory of Agricultural Sensors, Ministry of Agriculture and Rural Affairs, Hefei, 230036, China

11
Bonomo M, Rombo SE. Neighborhood based computational approaches for the prediction of lncRNA-disease associations. BMC Bioinformatics 2024; 25:187. [PMID: 38741200 DOI: 10.1186/s12859-024-05777-8] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 12/13/2023] [Accepted: 04/11/2024] [Indexed: 05/16/2024] Open
Abstract
MOTIVATION: Long non-coding RNAs (lncRNAs) are a class of molecules involved in important biological processes. Extensive efforts have been made to gain a deeper understanding of disease mechanisms at the lncRNA level, guiding the detection of biomarkers for disease diagnosis, treatment, prognosis and prevention. Unfortunately, due to costs and time complexity, the number of possible disease-related lncRNAs verified by traditional biological experiments is very limited. Computational approaches for the prediction of disease-lncRNA associations make it possible to identify the most promising candidates to be verified in the laboratory, reducing costs and time. RESULTS: We propose novel approaches for the prediction of lncRNA-disease associations, all sharing the idea of exploring associations among lncRNAs, other intermediate molecules (e.g., miRNAs) and diseases, suitably represented by tripartite graphs. Indeed, while only a few lncRNA-disease associations are known so far, plenty of interactions between lncRNAs and other molecules, as well as associations of the latter with diseases, are available. A first approach presented here, NGH, relies on neighborhood analysis performed on a tripartite graph built upon lncRNAs, miRNAs and diseases. A second approach (CF) relies on collaborative filtering; a third approach (NGH-CF) is obtained by boosting NGH with collaborative filtering. The proposed approaches have been validated on both synthetic and real data, and compared against other methods from the literature. The results show that neighborhood analysis outperforms competitors, and that combining it with collaborative filtering further improves the prediction accuracy, reaching an AUC of 0.966. AVAILABILITY: Source code and sample datasets are available at: https://github.com/marybonomo/LDAsPredictionApproaches.git.
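The idea of scoring lncRNA-disease pairs through shared intermediate molecules on a tripartite graph can be sketched as follows. This is only an illustration under stated assumptions: the Jaccard overlap used as the score, the toy identifiers, and the function names are hypothetical and are not the scoring rule of NGH, CF, or NGH-CF.

```python
from typing import Dict, List, Set, Tuple

# Hypothetical toy data: which miRNAs each lncRNA interacts with,
# and which miRNAs are associated with each disease.
lncrna_mirnas: Dict[str, Set[str]] = {
    "lncA": {"miR-1", "miR-2", "miR-3"},
    "lncB": {"miR-3"},
    "lncC": {"miR-4"},
}
disease_mirnas: Dict[str, Set[str]] = {
    "disease1": {"miR-1", "miR-2"},
    "disease2": {"miR-4", "miR-5"},
}


def neighborhood_score(lncrna: str, disease: str) -> float:
    """Jaccard overlap between the miRNA neighbors of a lncRNA and of a disease."""
    a = lncrna_mirnas.get(lncrna, set())
    b = disease_mirnas.get(disease, set())
    union = a | b
    return len(a & b) / len(union) if union else 0.0


def rank_candidates(disease: str) -> List[Tuple[str, float]]:
    """Rank all lncRNAs for one disease by decreasing neighborhood score."""
    scores = {lnc: neighborhood_score(lnc, disease) for lnc in lncrna_mirnas}
    # Ties are broken alphabetically so the ranking is deterministic.
    return sorted(scores.items(), key=lambda kv: (-kv[1], kv[0]))


ranking = rank_candidates("disease1")
print(ranking)
```

A score of this kind needs no known lncRNA-disease associations at all, only the two more abundant relation types, which matches the motivation given in the abstract.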
Affiliation(s)
- Simona E Rombo
- Kazaam Lab s.r.l., Palermo, Italy
- Department of Mathematics and Computer Science, University of Palermo, Palermo, Italy
|
12
|
Xuan P, Lu S, Cui H, Wang S, Nakaguchi T, Zhang T. Learning Association Characteristics by Dynamic Hypergraph and Gated Convolution Enhanced Pairwise Attributes for Prediction of Disease-Related lncRNAs. J Chem Inf Model 2024; 64:3569-3578. [PMID: 38523267 DOI: 10.1021/acs.jcim.4c00245] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 03/26/2024]
Abstract
Because long non-coding RNAs (lncRNAs) play important roles in the occurrence and development of various human diseases, identifying disease-related lncRNAs can contribute to clarifying the pathogenesis of diseases. Most recent lncRNA-disease association prediction methods utilize multi-source data about lncRNAs and diseases. A single lncRNA may participate in multiple disease processes, and multiple lncRNAs are usually involved in the same disease process synergistically. However, previous methods did not fully exploit these biological characteristics to construct informative prediction models. We construct a prediction model based on adaptive hypergraph and gated convolution for lncRNA-disease association prediction (AGLDA), to embed and encode the biological characteristics of lncRNA-disease associations, the topological features from the perspective of the entire heterogeneous graph, and the gated enhanced pairwise features. First, the strategy for constructing hyperedges is designed to reflect the biological characteristic that multiple lncRNAs are involved in multiple disease processes. Furthermore, each hyperedge has its own biological perspective, and multiple hyperedges are beneficial for revealing the diverse relationships among multiple lncRNAs and diseases. Second, we encode the biological features of each lncRNA (disease) node using a strategy based on dynamic hypergraph convolutional networks. The strategy may adaptively learn the features of the hyperedges and formulate the dynamically evolved hypergraph topological structure. Third, a group convolutional network is established to integrate the entire heterogeneous topological structure and multiple types of node attributes within an lncRNA-disease-miRNA graph. Finally, a gated convolutional strategy is proposed to enhance the informative features of the lncRNA-disease node pairs. The comparison experiments indicate that AGLDA outperforms seven advanced prediction methods. The ablation studies confirm the effectiveness of the major innovations, and the case studies validate AGLDA's ability to discover potential disease-related lncRNA candidates.
Affiliation(s)
- Ping Xuan
- School of Computer Science and Technology, Heilongjiang University, Harbin 150080, China
- Department of Computer Science, Shantou University, Shantou 515063, China
- Siyuan Lu
- School of Computer Science and Technology, Heilongjiang University, Harbin 150080, China
- Hui Cui
- Department of Computer Science and Information Technology, La Trobe University, Melbourne 3083, Australia
- Shuai Wang
- School of Information Science and Engineering, Yanshan University, Qinhuangdao 066004, China
- Toshiya Nakaguchi
- Center for Frontier Medical Engineering, Chiba University, Chiba 2638522, Japan
- Tiangang Zhang
- School of Computer Science and Technology, Heilongjiang University, Harbin 150080, China
- School of Mathematical Science, Heilongjiang University, Harbin 150080, China
|
13
|
Wasson MCD, Venkatesh J, Cahill HF, McLean ME, Dean CA, Marcato P. LncRNAs exhibit subtype-specific expression, survival associations, and cancer-promoting effects in breast cancer. Gene 2024; 901:148165. [PMID: 38219875 DOI: 10.1016/j.gene.2024.148165] [Citation(s) in RCA: 1] [Impact Index Per Article: 1.0] [Received: 09/17/2023] [Revised: 12/25/2023] [Accepted: 01/11/2024] [Indexed: 01/16/2024]
Abstract
Long non-coding RNAs (lncRNAs) play important roles in cancer progression, influencing processes such as invasion, metastasis, and drug resistance. Their reported cell type-dependent expression patterns suggest the potential for specialized functions in specific contexts. In breast cancer, lncRNA expression has been associated with different subtypes, highlighting their relevance in disease heterogeneity. However, our understanding of lncRNA function within breast cancer subtypes remains limited, warranting further investigation. We conducted a comprehensive analysis using the TANRIC dataset derived from the TCGA-BRCA cohort, profiling the expression, patient survival associations and immune cell type correlations of 12,727 lncRNAs across subtypes. Our findings revealed subtype-specific associations of lncRNAs with patient survival, tumor infiltrating lymphocytes and other immune cells. Targeting of lncRNAs exhibiting subtype-specific survival associations and expression in a panel of breast cancer cells demonstrated a selective reduction in cell proliferation within their associated subtype, supporting subtype-specific functions of certain lncRNAs. Characterization of HER2+-specific lncRNA LINC01269 and TNBC-specific lncRNA AL078604.2 showed nuclear localization and altered expression of hundreds of genes enriched in cancer-promoting processes, including apoptosis, cell proliferation and immune cell regulation. This work emphasizes the importance of considering the heterogeneity of breast cancer subtypes and the need for subtype-specific analyses to fully uncover the relevance and potential impact of lncRNAs. Collectively, these findings demonstrate the contribution of lncRNAs to the distinct molecular, prognostic, and cellular composition of breast cancer subtypes.
Affiliation(s)
- Hannah F Cahill
- Department of Pathology, Dalhousie University, Halifax, NS B3H4R2, Canada
- Meghan E McLean
- Department of Pathology, Dalhousie University, Halifax, NS B3H4R2, Canada
- Cheryl A Dean
- Department of Pathology, Dalhousie University, Halifax, NS B3H4R2, Canada
- Paola Marcato
- Department of Pathology, Dalhousie University, Halifax, NS B3H4R2, Canada; Department of Microbiology & Immunology, Dalhousie University, Halifax, NS B3H4R2, Canada; Nova Scotia Health Authority, Halifax, NS B3H1V8, Canada
|
14
|
Liu Y, Zhang R, Dong X, Yang H, Li J, Cao H, Tian J, Zhang Y. DAE-CFR: detecting microRNA-disease associations using deep autoencoder and combined feature representation. BMC Bioinformatics 2024; 25:139. [PMID: 38553698 PMCID: PMC10981315 DOI: 10.1186/s12859-024-05757-y] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 01/09/2024] [Accepted: 03/20/2024] [Indexed: 04/01/2024] Open
Abstract
BACKGROUND MicroRNA (miRNA) has been shown to play a key role in the occurrence and progression of diseases, making the discovery of miRNA-disease associations vital for disease prevention and therapy. However, traditional laboratory methods for detecting these associations are slow, strenuous, expensive, and uncertain. Although numerous advanced algorithms have emerged, it is still a challenge to develop more effective methods to explore underlying miRNA-disease associations. RESULTS In this study, we designed a novel approach based on a deep autoencoder and combined feature representation (DAE-CFR) to predict possible miRNA-disease associations. We began by creating integrated similarity matrices of miRNAs and diseases, performing a logistic function transformation, balancing positive and negative samples with k-means clustering, and constructing training samples. Then, a deep autoencoder was used to extract low-dimensional features from two kinds of feature representations for miRNAs and diseases, namely, original association information-based and similarity information-based. Next, we combined the resulting features for each miRNA-disease pair and used a logistic regression (LR) classifier to infer all unknown miRNA-disease interactions. Under five- and tenfold cross-validation (CV) frameworks, DAE-CFR not only outperformed six popular algorithms and nine classifiers, but also demonstrated superior performance on an additional dataset. Furthermore, case studies on three diseases (myocardial infarction, hypertension and stroke) confirmed the validity of DAE-CFR in practice. CONCLUSIONS DAE-CFR achieved outstanding performance in predicting miRNA-disease associations and can provide evidence to inform biological experiments and clinical therapy.
Affiliation(s)
- Yanling Liu
- Department of Health Statistics, School of Public Health, Shanxi Medical University, Taiyuan, China
- Department of Mathematics, Changzhi Medical College, Changzhi, China
- Ruiyan Zhang
- Department of Health Statistics, School of Public Health, Shanxi Medical University, Taiyuan, China
- Xiaojing Dong
- Department of Health Statistics, School of Public Health, Shanxi Medical University, Taiyuan, China
- Hong Yang
- Department of Health Statistics, School of Public Health, Shanxi Medical University, Taiyuan, China
- Jing Li
- Department of Health Statistics, School of Public Health, Shanxi Medical University, Taiyuan, China
- Hongyan Cao
- Department of Health Statistics, School of Public Health, Shanxi Medical University, Taiyuan, China
- Jing Tian
- Department of Cardiology, First Hospital of Shanxi Medical University, Taiyuan, China
- Yanbo Zhang
- Department of Health Statistics, School of Public Health, Shanxi Medical University, Taiyuan, China
- Shanxi Provincial Key Laboratory of Major Diseases Risk Assessment, Taiyuan, China
- School of Health and Service Management, Shanxi University of Chinese Medicine, Jinzhong, China
|
15
|
Zhou L, Peng X, Zeng L, Peng L. Finding potential lncRNA-disease associations using a boosting-based ensemble learning model. Front Genet 2024; 15:1356205. [PMID: 38495672 PMCID: PMC10940470 DOI: 10.3389/fgene.2024.1356205] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 12/18/2023] [Accepted: 02/01/2024] [Indexed: 03/19/2024] Open
Abstract
Introduction: Long non-coding RNAs (lncRNAs) have been in clinical use as potential prognostic biomarkers of various types of cancer. Identifying associations between lncRNAs and diseases helps capture potential biomarkers and design efficient therapeutic options for diseases. Wet-lab experiments for identifying these associations are costly and laborious. Methods: We developed LDA-SABC, a novel boosting-based framework for lncRNA-disease association (LDA) prediction. LDA-SABC extracts LDA features based on singular value decomposition (SVD) and classifies lncRNA-disease pairs (LDPs) by incorporating LightGBM and AdaBoost into the convolutional neural network. Results: The performance of LDA-SABC was evaluated under five-fold cross validations (CVs) on lncRNAs, diseases, and LDPs. It clearly outperformed four other classical LDA inference methods (SDLDA, LDNFSGB, LDASR, and IPCAF) in terms of precision, recall, accuracy, F1 score, AUC, and AUPR. Given the accurate LDA prediction performance of LDA-SABC, we used it to find potential lncRNA biomarkers for lung cancer. The results suggested that 7SK and HULC could have a relationship with non-small-cell lung cancer (NSCLC) and lung adenocarcinoma (LUAD), respectively. Conclusion: We hope that our proposed LDA-SABC method can help improve LDA identification.
Affiliation(s)
- Liqian Zhou
- School of Computer Science, Hunan University of Technology, Zhuzhou, Hunan, China
- Xinhuai Peng
- School of Computer Science, Hunan University of Technology, Zhuzhou, Hunan, China
- Lijun Zeng
- School of Computer Science, Hunan Institute of Technology, Hengyang, China
- Lihong Peng
- School of Computer Science, Hunan University of Technology, Zhuzhou, Hunan, China
|
16
|
Yao HB, Hou ZJ, Zhang WG, Li H, Chen Y. Prediction of MicroRNA-Disease Potential Association Based on Sparse Learning and Multilayer Random Walks. J Comput Biol 2024; 31:241-256. [PMID: 38377572 DOI: 10.1089/cmb.2023.0266] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 02/22/2024] Open
Abstract
More and more studies have shown that microRNAs (miRNAs) play an indispensable role in the study of complex diseases in humans. Traditional biological experiments to detect miRNA-disease associations are expensive and time-consuming. Therefore, it is necessary to propose efficient and meaningful computational models to predict miRNA-disease associations. In this study, we aim to propose a miRNA-disease association prediction model based on sparse learning and multilayer random walks (SLMRWMDA). The miRNA-disease association matrix is decomposed and reconstructed by the sparse learning method to obtain richer association information, and at the same time, the initial probability matrix for the random walk with restart algorithm is obtained. The disease similarity network, miRNA similarity network, and miRNA-disease association network are used to construct heterogeneous networks, and the stable probability is obtained based on the topological structure features of diseases and miRNAs through a multilayer random walk algorithm to predict miRNA-disease potential association. The experimental results show that the prediction accuracy of this model is significantly improved compared with the previous related models. We evaluated the model using global leave-one-out cross-validation (global LOOCV) and fivefold cross-validation (5-fold CV). The area under the curve (AUC) value for the LOOCV is 0.9368. The mean AUC value for 5-fold CV is 0.9335 and the variance is 0.0004. In the case study, the results show that SLMRWMDA is effective in inferring the potential association of miRNA-disease.
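The random walk with restart (RWR) step mentioned above can be sketched in a few lines of standard-library Python. This is a generic single-layer RWR on a toy graph, not the SLMRWMDA multilayer algorithm: the one-hot seed vector stands in for the initial probability matrix that the authors obtain from sparse learning, and the function names, the restart value of 0.5 and the toy adjacency matrix are assumptions made for illustration.

```python
def column_normalize(adj):
    """Return a column-stochastic copy of a square adjacency matrix (list of lists)."""
    n = len(adj)
    col_sums = [sum(adj[i][j] for i in range(n)) for j in range(n)]
    return [
        [adj[i][j] / col_sums[j] if col_sums[j] else 0.0 for j in range(n)]
        for i in range(n)
    ]


def random_walk_with_restart(adj, seed, restart=0.5, tol=1e-12, max_iter=1000):
    """Iterate p <- (1 - r) * W p + r * p0 until the L1 change drops below tol."""
    n = len(adj)
    w = column_normalize(adj)
    p0 = [1.0 if i == seed else 0.0 for i in range(n)]  # assumed one-hot start
    p = p0[:]
    for _ in range(max_iter):
        nxt = [
            (1.0 - restart) * sum(w[i][j] * p[j] for j in range(n)) + restart * p0[i]
            for i in range(n)
        ]
        if sum(abs(a - b) for a, b in zip(nxt, p)) < tol:
            return nxt
        p = nxt
    return p


# Toy path graph 0 - 1 - 2; the walker restarts at node 0.
adjacency = [[0, 1, 0], [1, 0, 1], [0, 1, 0]]
stable = random_walk_with_restart(adjacency, seed=0)
```

The stable probabilities decrease with distance from the seed node, which is what allows them to be used as a ranking of candidate associations.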
Affiliation(s)
- Hai-Bin Yao
- Computer Science and Artificial Intelligence and Aliyun School of Big Data, Changzhou University, Changzhou, China
- Zhen-Jie Hou
- Computer Science and Artificial Intelligence and Aliyun School of Big Data, Changzhou University, Changzhou, China
- Wen-Guang Zhang
- Life Sciences, Inner Mongolia Agricultural University, Hohhot, China
- Han Li
- Computer Science and Artificial Intelligence and Aliyun School of Big Data, Changzhou University, Changzhou, China
- Yan Chen
- Computer Science and Artificial Intelligence and Aliyun School of Big Data, Changzhou University, Changzhou, China
|
17
|
Peng L, Yang Y, Yang C, Li Z, Cheong N. HRGCNLDA: Forecasting of lncRNA-disease association based on hierarchical refinement graph convolutional neural network. Mathematical Biosciences and Engineering: MBE 2024; 21:4814-4834. [PMID: 38872515 DOI: 10.3934/mbe.2024212] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 06/15/2024]
Abstract
Long non-coding RNA (lncRNA) is considered to be a crucial regulator involved in various human biological processes, including the regulation of tumor immune checkpoint proteins. It has great potential as both a cancer biomarker and a therapeutic target. Nevertheless, conventional biological experimental techniques are both resource-intensive and laborious, making it essential to develop an accurate and efficient computational method to facilitate the discovery of potential links between lncRNAs and diseases. In this study, we proposed HRGCNLDA, a computational approach utilizing hierarchical refinement of graph convolutional neural networks for forecasting potential lncRNA-disease associations. This approach effectively addresses the over-smoothing problem that arises from stacking multiple layers of graph convolutional neural networks. Specifically, HRGCNLDA enhances the layer representation during message propagation and node updates, thereby amplifying the contribution of hidden layers that resemble the ego layer while reducing discrepancies. The results of the experiments showed that HRGCNLDA achieved the highest AUC-ROC (area under the receiver operating characteristic curve, AUC for short) and AUC-PR (area under the precision versus recall curve, AUPR for short) values compared to other methods. Finally, to further demonstrate the reliability and efficacy of our approach, we performed case studies on three prevalent human diseases, namely, breast cancer, lung cancer and gastric cancer.
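Since AUC-ROC is the headline metric in this and several neighboring abstracts, a small worked computation may help. The sketch below uses the standard pairwise-ranking interpretation of AUC (the probability that a randomly chosen positive pair is scored above a randomly chosen negative pair, with ties counted as one half); the function name and the toy labels and scores are hypothetical and are not taken from any of the cited papers.

```python
def roc_auc(labels, scores):
    """Area under the ROC curve via pairwise comparison of positives and negatives."""
    positives = [s for y, s in zip(labels, scores) if y == 1]
    negatives = [s for y, s in zip(labels, scores) if y == 0]
    if not positives or not negatives:
        raise ValueError("AUC needs at least one positive and one negative sample")
    wins = 0.0
    for p in positives:
        for n in negatives:
            if p > n:
                wins += 1.0   # positive ranked above negative
            elif p == n:
                wins += 0.5   # ties count as half a win
    return wins / (len(positives) * len(negatives))


# Toy example: 1 = known association, 0 = unlabeled pair (made-up scores).
toy_labels = [1, 0, 1, 0]
toy_scores = [0.9, 0.8, 0.4, 0.1]
auc_value = roc_auc(toy_labels, toy_scores)
```

Here three of the four positive-negative pairs are ordered correctly, so the AUC is 0.75. The quadratic loop is fine for illustration; a sort-based implementation would be preferable on real association matrices.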
Affiliation(s)
- Li Peng
- College of Computer Science and Engineering, Hunan University of Science and Technology, Xiangtan 411201, China
- Hunan Key Laboratory for Service Computing and Novel Software Technology, Hunan University of Science and Technology, Xiangtan 411201, China
- Yujie Yang
- College of Computer Science and Engineering, Hunan University of Science and Technology, Xiangtan 411201, China
- Cheng Yang
- College of Computer Science and Engineering, Hunan University of Science and Technology, Xiangtan 411201, China
- Zejun Li
- School of Computer Science and Engineering, Hunan Institute of Technology, Hengyang 421002, China
- Ngai Cheong
- Faculty of Applied Sciences, Macao Polytechnic University, Macau 999078, China
|
18
|
Rinaldi S, Moroni E, Rozza R, Magistrato A. Frontiers and Challenges of Computing ncRNAs Biogenesis, Function and Modulation. J Chem Theory Comput 2024; 20:993-1018. [PMID: 38287883 DOI: 10.1021/acs.jctc.3c01239] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 01/31/2024]
Abstract
Non-coding RNAs (ncRNAs), generated from nonprotein coding DNA sequences, constitute 98-99% of the human genome. Non-coding RNAs encompass diverse functional classes, including microRNAs, small interfering RNAs, PIWI-interacting RNAs, small nuclear RNAs, small nucleolar RNAs, and long non-coding RNAs. With critical involvement in gene expression and regulation across various biological and physiopathological contexts, such as neuronal disorders, immune responses, cardiovascular diseases, and cancer, non-coding RNAs are emerging as disease biomarkers and therapeutic targets. In this review, after providing an overview of non-coding RNAs' role in cell homeostasis, we illustrate the potential and the challenges of state-of-the-art computational methods exploited to study non-coding RNAs biogenesis, function, and modulation. This can be done by directly targeting them with small molecules or by altering their expression by targeting the cellular engines underlying their biosynthesis. Drawing from applications, also taken from our work, we showcase the significance and role of computer simulations in uncovering fundamental facets of ncRNA mechanisms and modulation. This information may set the basis to advance gene modulation tools and therapeutic strategies to address unmet medical needs.
Affiliation(s)
- Silvia Rinaldi
- National Research Council of Italy (CNR) - Institute of Chemistry of OrganoMetallic Compounds (ICCOM), c/o Area di Ricerca CNR di Firenze Via Madonna del Piano 10, 50019 Sesto Fiorentino, Florence, Italy
- Elisabetta Moroni
- National Research Council of Italy (CNR) - Institute of Chemical Sciences and Technologies (SCITEC), via Mario Bianco 9, 20131 Milano, Italy
- Riccardo Rozza
- National Research Council of Italy (CNR) - Institute of Material Foundry (IOM) c/o International School for Advanced Studies (SISSA), Via Bonomea, 265, 34136 Trieste, Italy
- Alessandra Magistrato
- National Research Council of Italy (CNR) - Institute of Material Foundry (IOM) c/o International School for Advanced Studies (SISSA), Via Bonomea, 265, 34136 Trieste, Italy
|
19
|
Ahvaz S, Amini M, Yari A, Baradaran B, Jebelli A, Mokhtarzadeh A. Downregulation of long noncoding RNA B4GALT1-AS1 is associated with breast cancer development. Sci Rep 2024; 14:3114. [PMID: 38326326 PMCID: PMC10850139 DOI: 10.1038/s41598-023-51124-x] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 07/13/2023] [Accepted: 12/31/2023] [Indexed: 02/09/2024] Open
Abstract
The misregulation of long non-coding RNAs (lncRNAs) is related to the progression of various human cancers, such as breast cancer (BC). The role of lncRNA B4GALT1-AS1 has been investigated in some human cancers. This study therefore aimed, for the first time, to examine B4GALT1-AS1 expression in tumor and marginal tissues of BC. The Cancer Genome Atlas (TCGA) database was utilized to evaluate the relative expression of B4GALT1-AS1 in BC and other cancers. RNA was extracted from twenty-eight paired BC and marginal tissues, and cDNA was synthesized. The expression level of B4GALT1-AS1 was quantified using real-time PCR. Bioinformatics analyses were performed to identify co-expressed genes and related pathways. B4GALT1-AS1 was significantly downregulated in BC specimens compared to tumor marginal samples. The TCGA data analysis confirmed the downregulation of B4GALT1-AS1 in BC. The bioinformatics analysis revealed a correlation between 700 genes and B4GALT1-AS1 and identified GNAI1 as the high-degree gene that was positively correlated with B4GALT1-AS1 expression. In other cancers, B4GALT1-AS1 appears to exert its function, at least partly, in association with one of the Hippo pathway components, YAP. This protein has the opposite role in BC, and its loss of function can result in poor survival in BC. Further research is needed to investigate the interaction between B4GALT1-AS1 and YAP in various subtypes of BC.
Affiliation(s)
- Samaneh Ahvaz
- Immunology Research Center, Tabriz University of Medical Sciences, Tabriz, Iran
- Mohammad Amini
- Immunology Research Center, Tabriz University of Medical Sciences, Tabriz, Iran
- Amirhossein Yari
- Immunology Research Center, Tabriz University of Medical Sciences, Tabriz, Iran
- Department of Biology, Tabriz Branch, Islamic Azad University, Tabriz, Iran
- Behzad Baradaran
- Immunology Research Center, Tabriz University of Medical Sciences, Tabriz, Iran
- Asiyeh Jebelli
- Department of Biological Sciences, Faculty of Basic Sciences, Higher Education Institute of Rab-Rashid, Tabriz, Iran
- Clinical Research Development Unit of Tabriz Valiasr Hospital, Tabriz University of Medical Sciences, Tabriz, Iran
- Ahad Mokhtarzadeh
- Immunology Research Center, Tabriz University of Medical Sciences, Tabriz, Iran
|
20
|
Chen Z, Zhang L, Li J, Fu M. MLFLHMDA: predicting human microbe-disease association based on multi-view latent feature learning. Front Microbiol 2024; 15:1353278. [PMID: 38371933 PMCID: PMC10869561 DOI: 10.3389/fmicb.2024.1353278] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 12/10/2023] [Accepted: 01/17/2024] [Indexed: 02/20/2024] Open
Abstract
Introduction A growing body of research indicates that microorganisms play a crucial role in human health. Imbalances in microbial communities are closely linked to human diseases, and identifying potential relationships between microbes and diseases can help elucidate the pathogenesis of diseases. However, traditional methods based on biological or clinical experiments are costly, so the use of computational models to predict potential microbe-disease associations is of great importance. Methods In this paper, we present a novel computational model called MLFLHMDA, which is based on a Multi-View Latent Feature Learning approach to predict Human potential Microbe-Disease Associations. Specifically, we compute Gaussian interaction profile kernel similarity between diseases and microbes based on the known microbe-disease associations from the Human Microbe-Disease Association Database and perform a preprocessing step on the resulting microbe-disease association matrix, namely, weighting K nearest known neighbors (WKNKN) to reduce the sparsity of the microbe-disease association matrix. To obtain unobserved associations in the microbe and disease views, we extract different latent features based on the geometrical structure of microbes and diseases, and project multi-modal latent features into a common subspace. Next, we introduce graph regularization to preserve the local manifold structure of Gaussian interaction profile kernel similarity and add $L_{p,q}$-norms to the projection matrix to ensure the interpretability and sparsity of the model. Results The AUC values for global leave-one-out cross-validation and 5-fold cross validation implemented by MLFLHMDA are 0.9165 and 0.8942 ± 0.0041, respectively, which perform better than other existing methods. In addition, case studies of different diseases have demonstrated the superiority of the predictive power of MLFLHMDA. The source code of our model and the data are available on https://github.com/LiangzheZhang/MLFLHMDA_master.
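The Gaussian interaction profile (GIP) kernel similarity named in the Methods has a compact closed form: for interaction profiles a_i and a_j (rows of the binary association matrix), K(i, j) = exp(-gamma * ||a_i - a_j||^2), with gamma set to gamma' divided by the mean squared norm of all profiles. The sketch below follows that commonly used definition with gamma' = 1; it is not taken from the MLFLHMDA code, and the function name and toy matrix are hypothetical.

```python
import math


def gip_kernel(profiles, gamma_prime=1.0):
    """Gaussian interaction profile kernel over the rows of a binary association matrix."""
    n = len(profiles)
    squared_norms = [sum(x * x for x in row) for row in profiles]
    mean_squared_norm = sum(squared_norms) / n
    if mean_squared_norm == 0:
        raise ValueError("all interaction profiles are empty")
    # Bandwidth normalized by the average profile norm (gamma' = 1 is an assumption).
    gamma = gamma_prime / mean_squared_norm
    kernel = [[0.0] * n for _ in range(n)]
    for i in range(n):
        for j in range(n):
            sq_dist = sum((a - b) ** 2 for a, b in zip(profiles[i], profiles[j]))
            kernel[i][j] = math.exp(-gamma * sq_dist)
    return kernel


# Toy association matrix: rows are microbes, columns are diseases (made-up data).
toy_associations = [
    [1, 0, 1],
    [1, 0, 1],
    [0, 1, 0],
]
microbe_similarity = gip_kernel(toy_associations)
```

Passing the transposed matrix yields the disease-side similarity in the same way. Identical profiles get similarity 1, and similarity decays as profiles diverge.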
|
21
|
Jiao CN, Zhou F, Liu BM, Zheng CH, Liu JX, Gao YL. Multi-Kernel Graph Attention Deep Autoencoder for MiRNA-Disease Association Prediction. IEEE J Biomed Health Inform 2024; 28:1110-1121. [PMID: 38055359 DOI: 10.1109/jbhi.2023.3336247] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 12/08/2023]
Abstract
Accumulating evidence indicates that microRNAs (miRNAs) can control and coordinate various biological processes. Consequently, abnormal expressions of miRNAs have been linked to various complex diseases. Recognizable proof of miRNA-disease associations (MDAs) will contribute to the diagnosis and treatment of human diseases. Nevertheless, traditional experimental verification of MDAs is laborious and limited to small-scale. Therefore, it is necessary to develop reliable and effective computational methods to predict novel MDAs. In this work, a multi-kernel graph attention deep autoencoder (MGADAE) method is proposed to predict potential MDAs. In detail, MGADAE first employs the multiple kernel learning (MKL) algorithm to construct an integrated miRNA similarity and disease similarity, providing more biological information for further feature learning. Second, MGADAE combines the known MDAs, disease similarity, and miRNA similarity into a heterogeneous network, then learns the representations of miRNAs and diseases through graph convolution operation. After that, an attention mechanism is introduced into MGADAE to integrate the representations from multiple graph convolutional network (GCN) layers. Lastly, the integrated representations of miRNAs and diseases are input into the bilinear decoder to obtain the final predicted association scores. Corresponding experiments prove that the proposed method outperforms existing advanced approaches in MDA prediction. Furthermore, case studies related to two human cancers provide further confirmation of the reliability of MGADAE in practice.
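The final step described above, a bilinear decoder that turns miRNA and disease representations into association scores, can be illustrated generically as score(m, d) = sigmoid(z_m^T W z_d). The sketch below is an assumption-laden illustration rather than the MGADAE implementation: the sigmoid output, the function names, the identity weight matrix and the two-dimensional toy embeddings are all hypothetical, and in a trained model W and the embeddings would be learned.

```python
import math


def sigmoid(x):
    return 1.0 / (1.0 + math.exp(-x))


def bilinear_scores(mirna_emb, weight, disease_emb):
    """Return a matrix of sigmoid(z_m^T W z_d) for every miRNA-disease pair."""
    scores = []
    for z_m in mirna_emb:
        # Project the miRNA embedding through W: the row vector z_m^T W.
        projected = [
            sum(z_m[k] * weight[k][col] for k in range(len(z_m)))
            for col in range(len(weight[0]))
        ]
        scores.append(
            [sigmoid(sum(p * z for p, z in zip(projected, z_d))) for z_d in disease_emb]
        )
    return scores


# Toy 2-dimensional embeddings (made up) and an identity weight matrix.
mirna_embeddings = [[1.0, 0.0], [0.0, 1.0]]
disease_embeddings = [[1.0, 0.0], [-1.0, 0.0]]
identity_weight = [[1.0, 0.0], [0.0, 1.0]]
association_scores = bilinear_scores(mirna_embeddings, identity_weight, disease_embeddings)
```

With an identity weight matrix the decoder reduces to a sigmoid of the dot product, so aligned embeddings score above 0.5 and opposed embeddings below it.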
|
22
|
Yao D, Deng Y, Zhan X, Zhan X. Predicting lncRNA-disease associations using multiple metapaths in hierarchical graph attention networks. BMC Bioinformatics 2024; 25:46. [PMID: 38287236 PMCID: PMC11271052 DOI: 10.1186/s12859-024-05672-2] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 11/09/2023] [Accepted: 01/23/2024] [Indexed: 01/31/2024] Open
Abstract
BACKGROUND Many biological studies have shown that lncRNAs regulate the expression of epigenetically related genes. The study of lncRNAs has helped to deepen our understanding of the pathogenesis of complex diseases at the molecular level. Due to the large number of lncRNAs and the complex and time-consuming nature of biological experiments, applying computer techniques to predict potential lncRNA-disease associations is very effective. To explore information between complex network structures, existing methods rely mainly on lncRNA and disease information. Metapaths have been applied to network models as an effective method for exploring information in heterogeneous graphs. However, existing methods are dominated by lncRNAs or disease nodes and tend to ignore the paths provided by intermediate nodes. METHODS We propose a deep learning model based on hierarchical graph attention networks to predict unknown lncRNA-disease associations using multiple types of metapaths to extract features. We have named this model the MMHGAN. First, the model constructs a lncRNA-disease-miRNA heterogeneous graph based on known associations and two homogeneous graphs of lncRNAs and diseases. Second, for homogeneous graphs, the features of neighboring nodes are aggregated using a multihead attention mechanism. Third, for the heterogeneous graph, metapaths of different intermediate nodes are selected to construct subgraphs, and the importance of different types of metapaths is calculated and aggregated to obtain the final embedded features. Finally, the features are reconstructed using a fully connected layer to obtain the prediction results. RESULTS We used a fivefold cross-validation method and obtained an average AUC value of 96.07% and an average AUPR value of 93.23%. Additionally, ablation experiments demonstrated the role of homogeneous graphs and different intermediate node path weights. In addition, we studied lung cancer, esophageal carcinoma, and breast cancer. Among the 15 lncRNAs associated with these diseases, 15, 12, and 14 lncRNAs were validated by the lncRNA Disease Database and the Lnc2Cancer Database, respectively. CONCLUSION We compared the MMHGAN model with six existing models with better performance, and the case study demonstrated that the model was effective in predicting the correlation between potential lncRNAs and diseases.
Affiliation(s)
- Dengju Yao
- School of Computer Science and Technology, Harbin University of Science and Technology, Harbin, 150080, China
- Yuexiao Deng
- School of Computer Science and Technology, Harbin University of Science and Technology, Harbin, 150080, China
- Xiaojuan Zhan
- School of Computer Science and Technology, Harbin University of Science and Technology, Harbin, 150080, China
- College of Computer Science and Technology, Heilongjiang Institute of Technology, Harbin, 150050, China
- Xiaorong Zhan
- Department of Endocrinology and Metabolism, Hospital of South, University of Science and Technology, Shenzhen, 518055, China
|
23
|
Zhang Y, Chu Y, Lin S, Xiong Y, Wei DQ. ReHoGCNES-MDA: prediction of miRNA-disease associations using homogenous graph convolutional networks based on regular graph with random edge sampler. Brief Bioinform 2024; 25:bbae103. [PMID: 38517693 PMCID: PMC10959163 DOI: 10.1093/bib/bbae103] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 11/07/2023] [Revised: 02/04/2024] [Accepted: 02/23/2024] [Indexed: 03/24/2024] Open
Abstract
A growing number of studies indicate the significance of microRNAs (miRNAs) in human diseases. Hence, uncovering associations between miRNAs and diseases can contribute to precise diagnosis and effective treatment. Detecting miRNA-disease associations (MDAs) with computational techniques that use biological information has emerged as a cost-effective and highly efficient approach. Here, we introduce a computational framework named ReHoGCNES, designed for prospective miRNA-disease association prediction (ReHoGCNES-MDA). The method constructs a homogenous graph convolutional network with a regular graph structure (ReHoGCN) encompassing a disease similarity network, a miRNA similarity network and the known MDA network, and was tested on four experimental tasks. A random edge sampler strategy was used to speed up training and reduce its complexity. Experimental results demonstrate that ReHoGCNES-MDA outperforms both a homogenous graph convolutional network and a heterogeneous graph convolutional network with non-regular graph structure in all four tasks, which suggests that a steady degree distribution of the graph plays an important role in improving model performance. In addition, ReHoGCNES-MDA is superior to several machine learning algorithms and state-of-the-art methods on MDA prediction. Furthermore, three case studies were conducted to further demonstrate the predictive ability of ReHoGCNES. In these, 93.3% (breast neoplasms), 90% (prostate neoplasms) and 93.3% (prostate neoplasms) of the top 30 predicted miRNAs were validated by public databases. Hence, ReHoGCNES-MDA might serve as a dependable and useful model for predicting possible MDAs.
Affiliation(s)
- Yufang Zhang
- School of Mathematical Sciences and SJTU-Yale Joint Center for Biostatistics and Data Science, Shanghai Jiao Tong University, Shanghai 200240, China
- Peng Cheng Laboratory, Shenzhen, Guangdong 518055, China
- Zhongjing Research and Industrialization Institute of Chinese Medicine, Zhongguancun Scientific Park, Meixi, Nanyang, Henan, 473006, China
| | - Yanyi Chu
- Department of Pathology, Stanford University School of Medicine, Stanford, CA, 94305, USA
| | - Shenggeng Lin
- State Key Laboratory of Microbial Metabolism, School of Life Sciences and Biotechnology, and Joint Laboratory of International Cooperation in Metabolic and Developmental Sciences, Ministry of Education, Shanghai Jiao Tong University, Shanghai 200240, China
| | - Yi Xiong
- State Key Laboratory of Microbial Metabolism, School of Life Sciences and Biotechnology, and Joint Laboratory of International Cooperation in Metabolic and Developmental Sciences, Ministry of Education, Shanghai Jiao Tong University, Shanghai 200240, China
- Shanghai Artificial Intelligence Laboratory, Shanghai, 200232, China
| | - Dong-Qing Wei
- Peng Cheng Laboratory, Shenzhen, Guangdong 518055, China
- Zhongjing Research and Industrialization Institute of Chinese Medicine, Zhongguancun Scientific Park, Meixi, Nanyang, Henan, 473006, China
- State Key Laboratory of Microbial Metabolism, School of Life Sciences and Biotechnology, and Joint Laboratory of International Cooperation in Metabolic and Developmental Sciences, Ministry of Education, Shanghai Jiao Tong University, Shanghai 200240, China
|
24
|
Zhang Y, Cai G, Li X, Chen M. GCN-Based Heterogeneous Complex Feature Learning to Enhance Predictability for LncRNA-Disease Associations. ACS OMEGA 2024; 9:1472-1484. [PMID: 38222651 PMCID: PMC10785310 DOI: 10.1021/acsomega.3c07923] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Reference Citation Analysis] [Abstract] [Grants] [Track Full Text] [Download PDF] [Figures] [Subscribe] [Scholar Register] [Received: 10/10/2023] [Revised: 11/20/2023] [Accepted: 11/28/2023] [Indexed: 01/16/2024]
Abstract
Using computational models to predict potential lncRNA-disease associations (LDAs) has emerged as an effective supplement to bioexperiments for exploring the pathogenesis of diseases. However, current computational models still face limitations in their ability to learn the complex features of bionetworks. In this study, we propose HGCNLDA, a model that combines graph convolutional network (GCN)-based aggregation, heterogeneous information fusion, and a bilinear decoder to infer LDAs. Recognizing the need to extract essential features during data processing, HGCNLDA uses four key steps to uncover interaction patterns within the bionetwork: (1) A novel type of tripartite heterogeneous network, the lncRNA-disease-miRNA network (LDMN), is constructed using computed similarities and known associations. (2) Homogeneous and heterogeneous features of nodes are extracted from domains within the LDMN by a GCN-based encoder. (3) Feature fusions, including bipolymerization operations and an attention mechanism, are employed to capture a more accurate and comprehensive representation of nodes. (4) A bilinear decoder is used to rebuild the edge type (or rating type) for a specific node pair, resulting in the predicted association score. In 5-fold cross-validation on two data sets, namely, data set1 and data set2, HGCNLDA consistently demonstrated superior performance compared to five related models. It achieved nearly the highest AUROC and AUPR values on both data sets, especially on data set2, where the results obtained were more challenging and objective. Case studies involving three real cancer scenarios further validated the practicality of HGCNLDA in identifying potential LDAs in real-world contexts. The source code and data for this study are available at https://github.com/zywait/HGCNLDA.
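Step (4) of the abstract above can be illustrated with a minimal sketch of a bilinear decoder. This is not the HGCNLDA implementation: the function name `bilinear_score`, the two-dimensional embeddings, and the weight matrices are all hypothetical, and the sigmoid output is an assumption about how a raw bilinear score might be mapped to an association score.

```python
import math


def bilinear_score(z_lnc, weight, z_dis):
    """Return sigmoid(z_lnc^T * weight * z_dis) for one lncRNA-disease pair.

    All three arguments are illustrative: z_lnc and z_dis stand in for node
    embeddings from some encoder, and weight for a learned square matrix.
    """
    # Project the disease embedding through the weight matrix (weight @ z_dis).
    projected = [sum(w * d for w, d in zip(row, z_dis)) for row in weight]
    # The inner product with the lncRNA embedding gives the raw bilinear score.
    raw = sum(l * p for l, p in zip(z_lnc, projected))
    # Squash the raw score into (0, 1).
    return 1.0 / (1.0 + math.exp(-raw))


# Toy embeddings and two toy weight matrices (hypothetical values).
z_lnc = [1.0, 0.0]
z_dis = [0.0, 1.0]
identity = [[1.0, 0.0], [0.0, 1.0]]
swap = [[0.0, 1.0], [1.0, 0.0]]

score_identity = bilinear_score(z_lnc, identity, z_dis)
score_swap = bilinear_score(z_lnc, swap, z_dis)
```

With the identity matrix the two orthogonal embeddings give a raw score of zero, so the output sits at the midpoint of the sigmoid; the `swap` matrix aligns them and pushes the score higher, which is the sense in which the weight matrix decides which embedding dimensions interact.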
Affiliation(s)
- Yi Zhang
- Guilin University of Technology, Guilin 541004, China
- Guangxi Key Laboratory of Embedded Technology and Intelligent System, Guilin University of Technology, Guilin 541004, China
| | - Gangsheng Cai
- Guilin University of Technology, Guilin 541004, China
- Guangxi Key Laboratory of Embedded Technology and Intelligent System, Guilin University of Technology, Guilin 541004, China
| | - Xin Li
- Guilin University of Technology, Guilin 541004, China
- Guangxi Key Laboratory of Embedded Technology and Intelligent System, Guilin University of Technology, Guilin 541004, China
| | - Min Chen
- School of Computer Science and Technology, Hunan Institute of Technology, Hengyang 421010, China
|
25
|
Yao D, Zhang B, Li X, Zhan X, Zhan X, Zhang B. Applying negative sample denoising and multi-view feature for lncRNA-disease association prediction. Front Genet 2024; 14:1332273. [PMID: 38264213 PMCID: PMC10803626 DOI: 10.3389/fgene.2023.1332273] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Reference Citation Analysis] [Abstract] [Key Words] [Track Full Text] [Figures] [Journal Information] [Subscribe] [Scholar Register] [Received: 11/02/2023] [Accepted: 12/22/2023] [Indexed: 01/25/2024] Open
Abstract
Increasing evidence indicates that mutations and dysregulation of long non-coding RNA (lncRNA) play a crucial role in the pathogenesis and prognosis of complex human diseases. Computational methods for predicting the association between lncRNAs and diseases have gained increasing attention. However, these methods face two key challenges: obtaining reliable negative samples and incorporating lncRNA-disease association (LDA) information from multiple perspectives. This paper proposes a method called NDMLDA, which combines multi-view feature extraction, unsupervised negative sample denoising, and a stacking ensemble classifier. Firstly, an unsupervised method (K-means) is used to design a negative sample denoising module to alleviate the imbalance of samples and the impact of potential noise in the negative samples on model performance. Secondly, graph attention networks are employed to extract multi-view features of both lncRNAs and diseases, thereby enhancing the learning of association information between them. Finally, lncRNA-disease association prediction is implemented through a stacking ensemble classifier. Existing research datasets are integrated to evaluate performance, and 5-fold cross-validation is conducted on the integrated dataset. Experimental results demonstrate that NDMLDA achieves an AUC of 0.9907 and an AUPR of 0.9927, with a 5-fold cross-validation variance of less than 0.1%. These results outperform the baseline methods. Additionally, case studies further illustrate the model's potential in cancer diagnosis and precision medicine implementation.
Affiliation(s)
- Dengju Yao
- School of Computer Science and Technology, Harbin University of Science and Technology, Harbin, China
| | - Bo Zhang
- School of Computer Science and Technology, Harbin University of Science and Technology, Harbin, China
| | - Xiangkui Li
- School of Computer Science and Technology, Harbin University of Science and Technology, Harbin, China
| | - Xiaojuan Zhan
- College of Computer Science and Technology, Heilongjiang Institute of Technology, Harbin, China
| | - Xiaorong Zhan
- Department of Endocrinology and Metabolism, Hospital of South University of Science and Technology, Shenzhen, China
| | - Binbin Zhang
- School of Computer Science and Technology, Harbin University of Science and Technology, Harbin, China
|
26
|
Yao D, Li B, Zhan X, Zhan X, Yu L. GCNFORMER: graph convolutional network and transformer for predicting lncRNA-disease associations. BMC Bioinformatics 2024; 25:5. [PMID: 38166659 PMCID: PMC10763317 DOI: 10.1186/s12859-023-05625-1] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Reference Citation Analysis] [Abstract] [Key Words] [MESH Headings] [Grants] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Received: 10/22/2023] [Accepted: 12/18/2023] [Indexed: 01/05/2024] Open
Abstract
BACKGROUND A growing body of research indicates that the disrupted expression of long non-coding RNA (lncRNA) is linked to a range of human disorders. Therefore, the effective prediction of lncRNA-disease associations (LDAs) can not only suggest ways to diagnose a condition but also save significant time and labor costs. METHOD In this work, we proposed a novel LDA prediction algorithm based on a graph convolutional network and a transformer, named GCNFORMER. Firstly, we integrated the intraclass similarity and interclass connections between miRNAs, lncRNAs and diseases, and built a graph adjacency matrix. Secondly, to fully capture the features between various nodes, we employed a graph convolutional network for feature extraction. Finally, to obtain the global dependencies between inputs and outputs, we used a transformer encoder with a multiheaded attention mechanism to forecast lncRNA-disease associations. RESULTS The results of a fivefold cross-validation experiment on the public dataset revealed that the AUC and AUPR of GCNFORMER reached 0.9739 and 0.9812, respectively. We compared GCNFORMER with six advanced LDA prediction models, and the results indicated its superiority over the other six models. Furthermore, GCNFORMER's effectiveness in predicting potential LDAs is underscored by case studies on breast cancer, colon cancer and lung cancer. CONCLUSIONS The combination of a graph convolutional network and a transformer can effectively improve the performance of LDA prediction models and promote the in-depth development of this research field.
Affiliation(s)
- Dengju Yao
- School of Computer Science and Technology, Harbin University of Science and Technology, Harbin, 150080, China.
| | - Bailin Li
- School of Computer Science and Technology, Harbin University of Science and Technology, Harbin, 150080, China
| | - Xiaojuan Zhan
- School of Computer Science and Technology, Harbin University of Science and Technology, Harbin, 150080, China
- College of Computer Science and Technology, Heilongjiang Institute of Technology, Harbin, 150050, China
| | - Xiaorong Zhan
- Department of Endocrinology and Metabolism, Hospital of South, University of Science and Technology, Shenzhen, 518055, China
| | - Liyang Yu
- School of Computer Science and Technology, Harbin University of Science and Technology, Harbin, 150080, China
|
27
|
Cai J, Wang R, Chen Y, Zhang C, Fu L, Fan C. LncRNA FIRRE regulated endometrial cancer radiotherapy sensitivity via the miR-199b-5p/SIRT1/BECN1 axis-mediated autophagy. Genomics 2024; 116:110750. [PMID: 38052260 DOI: 10.1016/j.ygeno.2023.110750] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Reference Citation Analysis] [Abstract] [Key Words] [MESH Headings] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Received: 05/10/2023] [Revised: 11/13/2023] [Accepted: 11/27/2023] [Indexed: 12/07/2023]
Abstract
BACKGROUND Endometrial cancer (EC) poses a serious threat to women's health. Radiotherapy has been widely used for EC treatment. However, the mechanism of FIRRE in EC development and radioresistance remains unknown. METHODS MTT and colony formation assays were used to determine cell proliferation. The degree of autophagy was assessed by measuring autophagy-related genes and by immunofluorescence staining of LC3. Molecular interactions were demonstrated via luciferase reporter assay, RIP, and Co-IP. The role of FIRRE was analyzed in an in vivo xenograft tumor model. RESULTS FIRRE and SIRT1 were upregulated in EC tumor tissues, whereas miR-199b-5p was reduced. FIRRE knockdown increased EC cell radiotherapy sensitivity by sponging miR-199b-5p and inhibiting autophagy. SIRT1 was targeted and negatively regulated by miR-199b-5p. SIRT1 could deacetylate BECN1 protein and participate in FIRRE-mediated autophagy. Silencing FIRRE increased EC radiotherapy sensitivity in vivo. CONCLUSION FIRRE reduced EC cell radiotherapy sensitivity by stimulating autophagy via the miR-199b-5p/SIRT1/BECN1 axis.
Affiliation(s)
- Junhong Cai
- Medical Laboratory Center, Hainan Affiliated Hospital of Hainan Medical University/Hainan General Hospital, Haikou 570311, Hainan Province, PR China.
| | - Ru Wang
- Medical Laboratory Center, Hainan Affiliated Hospital of Hainan Medical University/Hainan General Hospital, Haikou 570311, Hainan Province, PR China
| | - Yaxiong Chen
- Department of Radiotherapy Center, Hainan Affiliated Hospital of Hainan Medical University/Hainan General Hospital, Haikou 570311, Hainan Province, PR China
| | - Chen Zhang
- Medical Laboratory Center, Hainan Affiliated Hospital of Hainan Medical University/Hainan General Hospital, Haikou 570311, Hainan Province, PR China
| | - Lanyan Fu
- Department of Gynecology, Hainan Affiliated Hospital of Hainan Medical University/Hainan General Hospital, Haikou 570311, Hainan Province, PR China
| | - Cunfu Fan
- Department of Pathology, Hainan Affiliated Hospital of Hainan Medical University/Hainan General Hospital, Haikou 570311, Hainan Province, PR China
|
28
|
Qu J, Ni J, Ni TG, Bian ZK, Liang JZ. Prediction of Human Microbe-Drug Association based on Layer Attention Graph Convolutional Network. Curr Med Chem 2024; 31:5097-5109. [PMID: 39225188 DOI: 10.2174/0109298673249941231108091326] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Reference Citation Analysis] [Abstract] [Key Words] [MESH Headings] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Received: 03/04/2023] [Revised: 08/20/2023] [Accepted: 10/19/2023] [Indexed: 09/04/2024]
Abstract
Human microbes are closely associated with a variety of complex diseases and have emerged as drug targets. Identification of microbe-related drugs is becoming a key issue in drug development and precision medicine. It can also provide guidance for solving the increasingly serious problem of rising drug resistance in viruses. METHODS In this paper, we have proposed a novel layer attention graph convolutional network model for microbe-drug association prediction. First, multiple biological data have been integrated into a heterogeneous network. Then, the heterogeneous network has been fed into a graph convolutional network to learn the embeddings of microbes and drugs. Finally, the microbe-drug association scores have been obtained by decoding the embeddings of microbes and drugs based on the layer attention mechanism. RESULTS To evaluate the performance of our proposed model, leave-one-out cross-validation (LOOCV) and 5-fold cross-validation have been implemented on the two datasets of aBiofilm and MDAD. As a result, based on the aBiofilm dataset, our proposed model has attained areas under the curve (AUC) of 0.9178 and 0.9022 on global LOOCV and local LOOCV, respectively. Based on aBiofilm dataset, the proposed model has attained an AUC value of 0.9018 and 0.8902 on global LOOCV and local LOOCV, respectively. In addition, the average AUC and standard deviation of the proposed model for 5-fold cross-validation on the aBiofilm and MDAD datasets were 0.9141±6.8556e-04 and 0.8982±7.5868e-04, respectively. Also, two kinds of case studies have been further conducted to evaluate the proposed model. CONCLUSION Traditional methods for microbe-drug association prediction are time-consuming and laborious. Therefore, the proposed computational model was used to predict new microbe-drug associations. Several evaluations have shown that the proposed model achieves satisfactory results and can play a role in drug development and precision medicine.
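The AUC values reported in abstracts such as the one above summarize how well held-out known associations are ranked above unknown pairs. A minimal sketch of that computation follows, using the rank-based (Mann-Whitney) formulation of AUC. The function name `roc_auc` and the toy labels and scores are hypothetical and are not taken from any of the cited papers.

```python
def roc_auc(labels, scores):
    """Return the ROC AUC as the fraction of (positive, negative) pairs
    in which the positive sample receives the higher score.

    Ties count as half a win. labels are 1 (known association) or 0.
    """
    positives = [s for y, s in zip(labels, scores) if y == 1]
    negatives = [s for y, s in zip(labels, scores) if y == 0]
    if not positives or not negatives:
        raise ValueError("AUC needs at least one positive and one negative sample")
    wins = 0.0
    for p in positives:
        for n in negatives:
            if p > n:
                wins += 1.0
            elif p == n:
                wins += 0.5
    return wins / (len(positives) * len(negatives))


# Toy held-out fold: two known associations and three unknown pairs.
labels = [1, 1, 0, 0, 0]
scores = [0.9, 0.4, 0.5, 0.3, 0.1]
auc = roc_auc(labels, scores)
```

In a k-fold or leave-one-out setting, a function like this would typically be applied to the scores of each held-out split and the results averaged; the quadratic pair loop is chosen here for readability rather than speed.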
Affiliation(s)
- Jia Qu
- School of Computer Science and Artificial Intelligence & Aliyun School of Big Data, Changzhou University, Changzhou, 213164, China
| | - Jie Ni
- School of Computer Science and Artificial Intelligence & Aliyun School of Big Data, Changzhou University, Changzhou, 213164, China
| | - Tong-Guang Ni
- School of Computer Science and Artificial Intelligence & Aliyun School of Big Data, Changzhou University, Changzhou, 213164, China
| | - Ze-Kang Bian
- School of AI & Computer Science, Jiangnan University, Wuxi, 214122, China
| | - Jiu-Zhen Liang
- School of Computer Science and Artificial Intelligence & Aliyun School of Big Data, Changzhou University, Changzhou, 213164, China
|
29
|
Zhu H, Hao H, Yu L. Identifying disease-related microbes based on multi-scale variational graph autoencoder embedding Wasserstein distance. BMC Biol 2023; 21:294. [PMID: 38115088 PMCID: PMC10731776 DOI: 10.1186/s12915-023-01796-8] [Citation(s) in RCA: 11] [Impact Index Per Article: 11.0] [Reference Citation Analysis] [Abstract] [Key Words] [MESH Headings] [Grants] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Received: 10/07/2023] [Accepted: 12/05/2023] [Indexed: 12/21/2023] Open
Abstract
BACKGROUND Extensive clinical and biomedical research has demonstrated that microbes are crucial to human health. Identifying associations between microbes and diseases can not only reveal potential disease mechanisms, but also facilitate early diagnosis and promote precision medicine. Due to data perturbation and unsatisfactory latent representations, there is significant room for improvement. RESULTS In this work, we proposed a novel framework, Multi-scale Variational Graph AutoEncoder embedding Wasserstein distance (MVGAEW), to predict disease-related microbes. The framework can resist data perturbation and effectively generate latent representations for both microbes and diseases from the perspective of distribution. First, we calculated multiple similarities and integrated them through similarity network fusion. Subsequently, we obtained node latent representations with an improved variational graph autoencoder. Finally, an XGBoost classifier was employed to predict potential disease-related microbes. We also introduced multi-order node embedding reconstruction to enhance the representation capacity, and performed ablation studies to evaluate the contribution of each section of our model. Moreover, we conducted experiments on common drugs and case studies, including Alzheimer's disease, Crohn's disease, and colorectal neoplasms, to validate the effectiveness of our framework. CONCLUSIONS Our model exceeded other current state-of-the-art methods, exhibiting a great improvement on the HMDAD database.
Affiliation(s)
- Huan Zhu
- School of Computer Science and Technology, Xidian University, Xi'an, China
| | - Hongxia Hao
- School of Computer Science and Technology, Xidian University, Xi'an, China.
| | - Liang Yu
- School of Computer Science and Technology, Xidian University, Xi'an, China.
|
30
|
Yu J, Yang G, Li S, Li M, Ji C, Liu G, Wang Y, Chen N, Lei C, Dang R. Identification of Dezhou donkey muscle development-related genes and long non-coding RNA based on differential expression analysis. Anim Biotechnol 2023; 34:2313-2323. [PMID: 35736796 DOI: 10.1080/10495398.2022.2088549] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Reference Citation Analysis] [Abstract] [Key Words] [MESH Headings] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Indexed: 11/01/2022]
Abstract
Long non-coding RNAs (lncRNAs) play a critical role in the development of muscles. However, the role of lncRNAs in regulating skeletal muscle development has not been studied systematically in the donkey. In this study, we performed RNA sequencing on donkey muscle at different stages and investigated the expression profiles, which showed that 3215 mRNAs (p-adjust <0.05) and 471 lncRNAs (p-value <0.05) were significantly differentially expressed (DE), as verified by RT-qPCR. GO and KEGG enrichment analysis indicated that DE genes and target genes of DE lncRNAs were associated with muscle development in the donkey. We also found four target genes (DCN, ITM2A, MUSTN1, ARRDC2) involved in skeletal muscle growth and development. Transcriptome data, network analysis, and RT-qPCR results together revealed four co-expression networks, DCN and lnc-008278, ITM2A and lnc_017247, MUSTN1 and lnc_030153, and ARRDC2 and lnc_033914, which may play an important role in the formation and development of muscle in the donkey.
Affiliation(s)
- Jie Yu
- College of Animal Science and Technology, Northwest A&F University, Xianyang, China
- National Engineering Research Center for Gelatin-based Traditional Chinese Medicine, Shandong, China
| | - Ge Yang
- College of Animal Science and Technology, Northwest A&F University, Xianyang, China
| | - Shipeng Li
- College of Animal Science and Technology, Northwest A&F University, Xianyang, China
| | - Mei Li
- College of Animal Science and Technology, Northwest A&F University, Xianyang, China
| | - Chuanliang Ji
- National Engineering Research Center for Gelatin-based Traditional Chinese Medicine, Shandong, China
| | - Guiqin Liu
- Technology Collaborative Innovation Center, Liaocheng University, Liaocheng, China
| | - Yantao Wang
- National Engineering Research Center for Gelatin-based Traditional Chinese Medicine, Shandong, China
| | - Ningbo Chen
- College of Animal Science and Technology, Northwest A&F University, Xianyang, China
| | - Chuzhao Lei
- College of Animal Science and Technology, Northwest A&F University, Xianyang, China
| | - Ruihua Dang
- College of Animal Science and Technology, Northwest A&F University, Xianyang, China
|
31
|
Wu X, Cao S, Zou Y, Wu F. Traditional Chinese Medicine studies for Alzheimer's disease via network pharmacology based on entropy and random walk. PLoS One 2023; 18:e0294772. [PMID: 38019798 PMCID: PMC10686466 DOI: 10.1371/journal.pone.0294772] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Reference Citation Analysis] [Abstract] [MESH Headings] [Grants] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Received: 04/29/2023] [Accepted: 11/08/2023] [Indexed: 12/01/2023] Open
Abstract
Alzheimer's disease (AD) is a common neurodegenerative disease with complex pathogenesis, and approved drugs can only alleviate symptoms of AD for a period of time. Traditional Chinese medicine (TCM) contains multiple active ingredients that can act on multiple targets simultaneously. In this paper, a novel algorithm based on entropy and random walk with restart on a heterogeneous network (RWRHE) is proposed for predicting active ingredients for AD and screening out the effective TCMs for AD. First, six TCM compounds containing 20 herbs are collected from the AD drug reviews in the CNKI (China National Knowledge Infrastructure), and their active ingredients and targets are retrieved from different databases. Then, comprehensive similarity networks of active ingredients and of targets are constructed based on different aspects and entropy weight, respectively. A comprehensive heterogeneous network is constructed by integrating the known active ingredient-target association information and the two comprehensive similarity networks. Subsequently, bi-random walks are applied on the heterogeneous network to predict active ingredient-target associations. AD-related targets are selected as the seed nodes, a random walk is carried out on the target similarity network to predict the AD-target associations, and the associations of AD-active ingredients are inferred and scored. The effective herbs and compounds for AD are screened out based on the scores of their active ingredients. The results measured by machine learning and bioinformatics show that the RWRHE algorithm achieves better prediction accuracy and that the top 15 active ingredients may act as multi-target agents in the prevention and treatment of AD. Danshen, Gouteng and Chaihu are recommended as effective TCMs for AD, and Yiqitongyutang is recommended as an effective compound for AD.
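The seed-based random walk described in the abstract above can be sketched as a generic random walk with restart (RWR) on a small similarity network. This is an illustration of the general technique only, not the RWRHE algorithm: the function name, the restart probability of 0.5, the convergence tolerance, and the four-node path graph are all assumptions chosen for the example.

```python
def random_walk_with_restart(adjacency, seeds, restart=0.5, tol=1e-10, max_iter=1000):
    """Iterate p <- (1 - restart) * T p + restart * p0 until convergence.

    adjacency is a square list of lists of non-negative weights, seeds is a
    set of node indices, and T is the column-normalized adjacency matrix.
    """
    n = len(adjacency)
    # Column-normalize so that each non-empty column sums to 1.
    col_sums = [sum(adjacency[i][j] for i in range(n)) for j in range(n)]
    transition = [
        [adjacency[i][j] / col_sums[j] if col_sums[j] else 0.0 for j in range(n)]
        for i in range(n)
    ]
    # The walker starts (and restarts) uniformly on the seed nodes.
    p0 = [1.0 / len(seeds) if i in seeds else 0.0 for i in range(n)]
    p = p0[:]
    for _ in range(max_iter):
        nxt = [
            (1.0 - restart) * sum(transition[i][j] * p[j] for j in range(n))
            + restart * p0[i]
            for i in range(n)
        ]
        # Stop once the L1 change between iterations is negligible.
        if sum(abs(a - b) for a, b in zip(nxt, p)) < tol:
            return nxt
        p = nxt
    return p


# Toy network: a path 0 - 1 - 2 - 3 with node 0 as the only seed.
path_graph = [
    [0, 1, 0, 0],
    [1, 0, 1, 0],
    [0, 1, 0, 1],
    [0, 0, 1, 0],
]
rwr_scores = random_walk_with_restart(path_graph, seeds={0})
```

The stationary scores fall off with distance from the seed, so ranking nodes by score gives a proximity-based ordering of candidate nodes relative to the seed set.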
Affiliation(s)
- Xiaolu Wu
- School of Mathematical Sciences, Tiangong University, Tianjin, China
| | - Shujuan Cao
- School of Mathematical Sciences, Tiangong University, Tianjin, China
| | - Yongming Zou
- Department of Neurology, Tianjin Huanhu Hospital, Tianjin, China
| | - Fangxiang Wu
- Division of Biomedical Engineering, Department of Mechanical Engineering and Department of Computer Science, University of Saskatchewan, Saskatoon, Saskatchewan, Canada
|
32
|
Peng L, Huang L, Su Q, Tian G, Chen M, Han G. LDA-VGHB: identifying potential lncRNA-disease associations with singular value decomposition, variational graph auto-encoder and heterogeneous Newton boosting machine. Brief Bioinform 2023; 25:bbad466. [PMID: 38127089 PMCID: PMC10734633 DOI: 10.1093/bib/bbad466] [Citation(s) in RCA: 6] [Impact Index Per Article: 6.0] [Reference Citation Analysis] [Abstract] [Key Words] [MESH Headings] [Grants] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Received: 08/05/2023] [Revised: 10/05/2023] [Accepted: 11/25/2023] [Indexed: 12/23/2023] Open
Abstract
Long noncoding RNAs (lncRNAs) participate in various biological processes and have close linkages with diseases. In vivo and in vitro experiments have validated many associations between lncRNAs and diseases. However, biological experiments are time-consuming and expensive. Here, we introduce LDA-VGHB, an lncRNA-disease association (LDA) identification framework that combines feature extraction based on singular value decomposition and a variational graph autoencoder with LDA classification based on a heterogeneous Newton boosting machine. LDA-VGHB was compared with four classical LDA prediction methods (i.e. SDLDA, LDNFSGB, IPCARF and LDASR) and four popular boosting models (XGBoost, AdaBoost, CatBoost and LightGBM) under 5-fold cross-validations on lncRNAs, diseases, lncRNA-disease pairs and independent lncRNAs and independent diseases, respectively. It greatly outperformed the other methods under the four different cross-validations on the lncRNADisease and MNDR databases. We further investigated potential lncRNAs for lung cancer, breast cancer, colorectal cancer and kidney neoplasms and inferred the top 20 lncRNAs associated with them among all their unobserved lncRNAs. The results showed that most of the predicted top 20 lncRNAs have been verified by biomedical experiments provided by the Lnc2Cancer 3.0, lncRNADisease v2.0 and RNADisease databases as well as publications. We found that HAR1A, KCNQ1DN, ZFAT-AS1 and HAR1B may be associated with lung cancer, breast cancer, colorectal cancer and kidney neoplasms, respectively. These results need further biological experimental validation. We anticipate that LDA-VGHB is capable of identifying possible lncRNAs for complex diseases. LDA-VGHB is publicly available at https://github.com/plhhnu/LDA-VGHB.
Affiliation(s)
- Lihong Peng
- School of Computer Science, Hunan University of Technology, 412007, Hunan, China
- College of Life Sciences and Chemistry, Hunan University of Technology, 412007, Hunan, China
| | - Liangliang Huang
- School of Computer Science, Hunan University of Technology, 412007, Hunan, China
| | - Qiongli Su
- Department of Pharmacy, the Affiliated Zhuzhou Hospital Xiangya Medical College CSU, 412007, Hunan, China
| | - Geng Tian
- Geneis (Beijing) Co. Ltd, China, 100102, Beijing, China
| | - Min Chen
- School of Computer Science, Hunan Institute of Technology, 421002, No. 18 Henghua Road, Zhuhui District, Hengyang, Hunan, China
| | - Guosheng Han
- School of Mathematics and Computational Science, Xiangtan University, 411105, Yuhu District, Xiangtan, Hunan, China
- Hunan Key Laboratory for Computation and Simulation in Science and Engineering, Xiangtan University, 411105, Yuhu District, Xiangtan, Hunan, China
|
33
|
Wang S, Hui C, Zhang T, Wu P, Nakaguchi T, Xuan P. Graph Reasoning Method Based on Affinity Identification and Representation Decoupling for Predicting lncRNA-Disease Associations. J Chem Inf Model 2023; 63:6947-6958. [PMID: 37906529 DOI: 10.1021/acs.jcim.3c01214] [Citation(s) in RCA: 2] [Impact Index Per Article: 2.0] [Reference Citation Analysis] [Abstract] [MESH Headings] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Indexed: 11/02/2023]
Abstract
An increasing number of studies have shown that dysregulation of lncRNAs is related to the occurrence of various diseases. Most previous methods, however, are designed around the homogeneity assumption that the representation of a target lncRNA (or disease) node should be updated by aggregating the attributes of its neighbor nodes. This assumption ignores the affinity nodes that are far from the target node. We present a novel prediction method, GAIRD, to fully leverage the heterogeneous information in the network and the decoupled node features. The first major innovation is a random walk strategy based on breadth-first search and depth-first search. Unlike previous methods that focus only on homogeneous information, our new strategy learns both the homogeneous information within local neighborhoods and the heterogeneous information within higher-order neighborhoods. The second innovation is a representation decoupling module that extracts purer attributes and purer topologies. Third, a module based on group convolution and depthwise separable convolution is developed to promote pairwise intrachannel and interchannel feature learning. The experimental results show that GAIRD outperforms the compared state-of-the-art methods, and the ablation studies confirm the contributions of the major innovations. We also performed case studies on 3 diseases to further demonstrate the effectiveness of the GAIRD model in applications.
Affiliation(s)
- Shuai Wang
- School of Information Science and Engineering, Yanshan University, Qinhuangdao 066004, China
| | - Cui Hui
- Department of Computer Science and Information Technology, La Trobe University, Melbourne 3083, Australia
| | - Tiangang Zhang
- School of Mathematical Science, Heilongjiang University, Harbin 150080, China
| | - Peiliang Wu
- School of Information Science and Engineering, Yanshan University, Qinhuangdao 066004, China
- Key Laboratory for Computer Virtual Technology and System Integration of Hebei Province, Qinhuangdao 066004, China
| | - Toshiya Nakaguchi
- Center for Frontier Medical Engineering, Chiba University, Chiba 2638522, Japan
| | - Ping Xuan
- Department of Computer Science, School of Engineering, Shantou University, Shantou 515063, China
| |
|
34
|
Ning Z, Wu J, Ding Y, Wang Y, Peng Q, Fu L. BertNDA: A Model Based on Graph-Bert and Multi-Scale Information Fusion for ncRNA-Disease Association Prediction. IEEE J Biomed Health Inform 2023; 27:5655-5664. [PMID: 37669210 DOI: 10.1109/jbhi.2023.3311808] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 09/07/2023]
Abstract
Non-coding RNAs (ncRNAs) are a class of RNA molecules that lack the ability to encode proteins in human cells but play crucial roles in various biological processes. Understanding the interactions between different ncRNAs and their impact on diseases can significantly contribute to the diagnosis, prevention, and treatment of diseases. However, predicting tertiary interactions between ncRNAs and diseases based on structural information at multiple scales remains a challenging task. To address this challenge, we propose a method called BertNDA, which aims to predict potential relationships between miRNAs, lncRNAs, and diseases. The framework identifies local information through a connectionless subgraph, which aggregates the features of neighbor nodes. Global information is extracted by leveraging the Laplace transform of graph structures and WL (Weisfeiler-Lehman) absolute role coding. Additionally, an EMLP (Element-wise MLP) structure is designed to fuse pairwise global information. A transformer encoder is employed as the backbone of our approach, followed by a prediction layer that outputs the final correlation score. Extensive experiments demonstrate that BertNDA outperforms state-of-the-art methods on the prediction task and exhibits significant potential for various biological applications. Moreover, we develop an online prediction platform that incorporates the prediction model, providing users with an intuitive and interactive experience. Overall, our model offers an efficient, accurate, and comprehensive tool for predicting tertiary associations between ncRNAs and diseases.
|
35
|
Rahni Z, Hosseini SM, Shahrokh S, Saeedi Niasar M, Shoraka S, Mirjalali H, Nazemalhosseini-Mojarad E, Rostami-Nejad M, Malekpour H, Zali MR, Mohebbi SR. Long non-coding RNAs ANRIL, THRIL, and NEAT1 as potential circulating biomarkers of SARS-CoV-2 infection and disease severity. Virus Res 2023; 336:199214. [PMID: 37657511 PMCID: PMC10502354 DOI: 10.1016/j.virusres.2023.199214] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 06/04/2023] [Revised: 08/23/2023] [Accepted: 08/29/2023] [Indexed: 09/03/2023]
Abstract
The current outbreak of coronavirus disease 2019 (COVID-19) is a global emergency, as its rapid spread and high mortality rate pose a significant threat to public health. Innate immunity plays a crucial role in the primary defense against infections, and recent studies have highlighted the pivotal regulatory function of long non-coding RNAs (lncRNAs) in innate immune responses. This study aims to assess the circulating levels of the lncRNAs ANRIL, THRIL, NEAT1, and MALAT1 in the blood of patients with moderate and severe SARS-CoV-2 infection, in comparison to healthy individuals. Additionally, it aims to explore the potential of these lncRNAs as biomarkers for determining the severity of the disease. Blood samples were collected from a total of 38 moderate and 25 severe COVID-19 patients, along with 30 healthy controls. Total RNA was extracted and qPCR was performed to evaluate the blood levels of the lncRNAs. The results indicate significantly higher expression levels of lncRNAs ANRIL and THRIL in severe patients when compared to moderate patients (P value = 0.0307, P value = 0.0059, respectively). Moreover, the expression levels of lncRNAs ANRIL and THRIL were significantly up-regulated in both moderate and severe patients in comparison to the control group (P value < 0.001, P value < 0.001, P value = 0.001, P value < 0.001, respectively). The expression levels of lncRNA NEAT1 were found to be significantly higher in both moderate and severe COVID-19 patients compared to the healthy group (P value < 0.001, P value < 0.001, respectively), and there was no significant difference in the expression levels of NEAT1 between moderate and severe patients (P value = 0.6979). The expression levels of MALAT1 in moderate and severe patients did not exhibit a significant difference compared to the control group (P value = 0.677, P value = 0.764, respectively). Furthermore, the discriminative power of ANRIL and THRIL was significantly higher in the severe patient group than in the moderate group (area under the curve (AUC) = 0.6879; P value = 0.0122, AUC = 0.6947; P value = 0.0093, respectively). In conclusion, the expression levels of the lncRNAs ANRIL and THRIL are correlated with the severity of COVID-19 and can be regarded as circulating biomarkers for disease progression.
Affiliation(s)
- Zeynab Rahni
- Basic and Molecular Epidemiology of Gastrointestinal Disorders Research Center, Research Institute for Gastroenterology and Liver Diseases, Shahid Beheshti University of Medical Sciences, Tehran, Iran; Department of Microbiology and Microbial Biotechnology, Faculty of Life Sciences and Biotechnology, Shahid Beheshti University, Tehran, Iran
| | - Seyed Masoud Hosseini
- Department of Microbiology and Microbial Biotechnology, Faculty of Life Sciences and Biotechnology, Shahid Beheshti University, Tehran, Iran
| | - Shabnam Shahrokh
- Gastroenterology and Liver Diseases Research Center, Research Institute for Gastroenterology and Liver Diseases, Shahid Beheshti University of Medical Sciences, Tehran, Iran
| | - Mahsa Saeedi Niasar
- Gastroenterology and Liver Diseases Research Center, Research Institute for Gastroenterology and Liver Diseases, Shahid Beheshti University of Medical Sciences, Tehran, Iran
| | - Shahrzad Shoraka
- Gastroenterology and Liver Diseases Research Center, Research Institute for Gastroenterology and Liver Diseases, Shahid Beheshti University of Medical Sciences, Tehran, Iran
| | - Hamed Mirjalali
- Foodborne and Waterborne Diseases Research Center, Research Institute for Gastroenterology and Liver Diseases, Shahid Beheshti University of Medical Sciences, Tehran, Iran
| | - Ehsan Nazemalhosseini-Mojarad
- Gastroenterology and Liver Diseases Research Center, Research Institute for Gastroenterology and Liver Diseases, Shahid Beheshti University of Medical Sciences, Tehran, Iran
| | - Mohammad Rostami-Nejad
- Gastroenterology and Liver Diseases Research Center, Research Institute for Gastroenterology and Liver Diseases, Shahid Beheshti University of Medical Sciences, Tehran, Iran
| | - Habib Malekpour
- Research and Development Center, Imam Hossein Hospital, Shahid Beheshti University of Medical Sciences, Tehran, Iran
| | - Mohammad Reza Zali
- Gastroenterology and Liver Diseases Research Center, Research Institute for Gastroenterology and Liver Diseases, Shahid Beheshti University of Medical Sciences, Tehran, Iran
| | - Seyed Reza Mohebbi
- Gastroenterology and Liver Diseases Research Center, Research Institute for Gastroenterology and Liver Diseases, Shahid Beheshti University of Medical Sciences, Tehran, Iran.
| |
|
36
|
Xie GB, Liu SG, Gu GS, Lin ZY, Yu JR, Chen RB, Xie WJ, Xu HJ. LUNCRW: Prediction of potential lncRNA-disease associations based on unbalanced neighborhood constraint random walk. Anal Biochem 2023; 679:115297. [PMID: 37619903 DOI: 10.1016/j.ab.2023.115297] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 05/10/2023] [Revised: 08/14/2023] [Accepted: 08/18/2023] [Indexed: 08/26/2023]
Abstract
Accumulating evidence suggests that long non-coding RNAs (lncRNAs) are associated with various complex human diseases. They can serve as disease biomarkers and hold considerable promise for the prevention and treatment of various diseases. The traditional random walk algorithms generally exclude the effect of non-neighboring nodes on random walking. In order to overcome the issue, the neighborhood constraint (NC) approach is proposed in this study for regulating the direction of the random walk by computing the effects of both neighboring nodes and non-neighboring nodes. Then the association matrix is updated by matrix multiplication for minimizing the effect of the false negative data. The heterogeneous lncRNA-disease network is finally analyzed using an unbalanced random walk method for predicting the potential lncRNA-disease associations. The LUNCRW model is therefore developed for predicting potential lncRNA-disease associations. The area under the curve (AUC) values of the LUNCRW model in leave-one-out cross-validation and five-fold cross-validation were 0.951 and 0.9486 ± 0.0011, respectively. Data from published case studies on three diseases, including squamous cell carcinoma, hepatocellular carcinoma, and renal cell carcinoma, confirmed the predictive potential of the LUNCRW model. Altogether, the findings indicated that the performance of the LUNCRW method is superior to that of existing methods in predicting potential lncRNA-disease associations.
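For orientation, the traditional random walk that this abstract contrasts itself against can be sketched as a random walk with restart (RWR) on a small graph. This is a minimal illustration of the baseline only, not the LUNCRW neighborhood constraint or its unbalanced variant, whose details the abstract does not give. The function names, the toy path graph, and the restart probability of 0.3 are all assumptions made for the example.

```python
def column_normalize(adj):
    """Turn an adjacency matrix into a column-stochastic transition matrix."""
    n = len(adj)
    col_sums = [sum(adj[i][j] for i in range(n)) for j in range(n)]
    return [[adj[i][j] / col_sums[j] if col_sums[j] else 0.0
             for j in range(n)] for i in range(n)]


def random_walk_with_restart(adj, seed, restart=0.3, tol=1e-10, max_iter=1000):
    """Iterate p <- (1 - r) * W p + r * p0 until the L1 change drops below tol."""
    n = len(adj)
    w = column_normalize(adj)
    p0 = [1.0 if i == seed else 0.0 for i in range(n)]
    p = p0[:]
    for _ in range(max_iter):
        nxt = [(1 - restart) * sum(w[i][j] * p[j] for j in range(n))
               + restart * p0[i] for i in range(n)]
        if sum(abs(a - b) for a, b in zip(nxt, p)) < tol:
            return nxt
        p = nxt
    return p


# Hypothetical toy graph: a path 0 - 1 - 2 - 3, walker restarted at node 0.
path_graph = [
    [0, 1, 0, 0],
    [1, 0, 1, 0],
    [0, 1, 0, 1],
    [0, 0, 1, 0],
]
scores = random_walk_with_restart(path_graph, seed=0)
```

In this baseline, probability mass reaches a node only through its direct neighbors, so the score falls off with distance from the seed; that is the limitation the neighborhood constraint approach is described as addressing.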
Affiliation(s)
- Guo-Bo Xie
- School of Computer Science, Guangdong University of Technology, Guangzhou, 510000, China.
| | - Shi-Gang Liu
- School of Computer Science, Guangdong University of Technology, Guangzhou, 510000, China.
| | - Guo-Sheng Gu
- School of Computer Science, Guangdong University of Technology, Guangzhou, 510000, China.
| | - Zhi-Yi Lin
- School of Computer Science, Guangdong University of Technology, Guangzhou, 510000, China.
| | - Jun-Rui Yu
- School of Computer Science, Guangdong University of Technology, Guangzhou, 510000, China.
| | - Rui-Bin Chen
- School of Computer Science, Guangdong University of Technology, Guangzhou, 510000, China.
| | - Wei-Jie Xie
- School of Computer Science, Guangdong University of Technology, Guangzhou, 510000, China.
| | - Hao-Jie Xu
- School of Computer Science, Guangdong University of Technology, Guangzhou, 510000, China.
| |
|
37
|
Khan R, Riaz A, Abbasi SA, Sadaf T, Baig RM, Mansoor Q. Identification of transcriptional level variations in microRNA-221 and microRNA-222 as alternate players in the thyroid cancer tumor microenvironment. Sci Rep 2023; 13:15800. [PMID: 37737255 PMCID: PMC10516937 DOI: 10.1038/s41598-023-42941-1] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 11/23/2022] [Accepted: 09/16/2023] [Indexed: 09/23/2023] Open
Abstract
Thyroid cancer (TC) is caused by genetic factors and/or their cross talk with lifestyle and environment. An important role for miRNAs has been identified in different human diseases, including cancer. The growing body of miRNA discoveries points to miRNA-221 and miRNA-222 as key elements of the ready arsenal in cancer micro-niches. The aim of the present study was to identify variations in miRNA-221 and miRNA-222 expression in TC tissues and their likely association with TC. miRNA-221 and miRNA-222 were investigated for expression alterations in TC tissue samples and healthy thyroid tissue. Expression of miRNA-221 and -222 was analyzed through real-time PCR. The relative expression of both miRNAs was quantified and statistically evaluated. miRNA-221 and miRNA-222 were found to be highly overexpressed when compared with samples of multinodular goiter (MNG) and normal controls. Interestingly, it was also noted that miRNA-221 and miRNA-222 expression works as a cluster in thyroid cancer patients. It can therefore be concluded that expression alterations of miRNA-221 and -222 play a potential role in the development of thyroid cancer.
Affiliation(s)
- Rashida Khan
- Department of Zoology, PMAS-Arid Agriculture University, Rawalpindi, Pakistan
- Institute of Biomedical and Genetic Engineering (IBGE), Islamabad, Pakistan
| | - Aayesha Riaz
- Department of Zoology, PMAS-Arid Agriculture University, Rawalpindi, Pakistan
| | | | - Tanzeela Sadaf
- Department of Zoology, PMAS-Arid Agriculture University, Rawalpindi, Pakistan
| | - Ruqia Mehmood Baig
- Department of Zoology, PMAS-Arid Agriculture University, Rawalpindi, Pakistan.
| | - Qaisar Mansoor
- Institute of Biomedical and Genetic Engineering (IBGE), Islamabad, Pakistan.
| |
|
38
|
Sheng N, Wang Y, Huang L, Gao L, Cao Y, Xie X, Fu Y. Multi-task prediction-based graph contrastive learning for inferring the relationship among lncRNAs, miRNAs and diseases. Brief Bioinform 2023; 24:bbad276. [PMID: 37529914 DOI: 10.1093/bib/bbad276] [Citation(s) in RCA: 1] [Impact Index Per Article: 1.0] [Received: 04/04/2023] [Revised: 07/09/2023] [Accepted: 07/11/2023] [Indexed: 08/03/2023] Open
Abstract
MOTIVATION Identifying the relationships among long non-coding RNAs (lncRNAs), microRNAs (miRNAs) and diseases is highly valuable for diagnosing, preventing, treating and prognosing diseases. The development of effective computational prediction methods can reduce experimental costs. While numerous methods have been proposed, they often treat the prediction of lncRNA-disease associations (LDAs), miRNA-disease associations (MDAs) and lncRNA-miRNA interactions (LMIs) as separate tasks. Models capable of predicting all three relationships simultaneously remain relatively scarce. Our aim is to perform multi-task predictions, which not only construct a unified framework, but also facilitate mutual complementarity of information among lncRNAs, miRNAs and diseases. RESULTS In this work, we propose a novel unsupervised embedding method called graph contrastive learning for multi-task prediction (GCLMTP). Our approach aims to predict LDAs, MDAs and LMIs by simultaneously extracting embedding representations of lncRNAs, miRNAs and diseases. To achieve this, we first construct a triple-layer lncRNA-miRNA-disease heterogeneous graph (LMDHG) that integrates the complex relationships between these entities based on their similarities and correlations. Next, we employ an unsupervised embedding model based on graph contrastive learning to extract potential topological features of lncRNAs, miRNAs and diseases from the LMDHG. The graph contrastive learning leverages graph convolutional network architectures to maximize the mutual information between patch representations and corresponding high-level summaries of the LMDHG. Subsequently, for the three prediction tasks, multiple classifiers are explored to predict LDA, MDA and LMI scores. Comprehensive experiments are conducted on two datasets (from older and newer versions of the database, respectively). The results show that GCLMTP outperforms other state-of-the-art methods for the disease-related lncRNA and miRNA prediction tasks. Additionally, case studies on two datasets further demonstrate the ability of GCLMTP to accurately discover new associations. To ensure reproducibility of this work, we have made the datasets and source code publicly available at https://github.com/sheng-n/GCLMTP.
Affiliation(s)
- Nan Sheng
- Key laboratory of Symbol Computation and Knowledge Engineering of Ministry of Education, College of Computer Science and Technology, Jilin University, 130012 Changchun, China
| | - Yan Wang
- Key laboratory of Symbol Computation and Knowledge Engineering of Ministry of Education, College of Computer Science and Technology, Jilin University, 130012 Changchun, China
- School of Artificial Intelligence, Jilin University, 130012 Changchun, China
| | - Lan Huang
- Key laboratory of Symbol Computation and Knowledge Engineering of Ministry of Education, College of Computer Science and Technology, Jilin University, 130012 Changchun, China
| | - Ling Gao
- Key laboratory of Symbol Computation and Knowledge Engineering of Ministry of Education, College of Computer Science and Technology, Jilin University, 130012 Changchun, China
| | - Yangkun Cao
- School of Artificial Intelligence, Jilin University, 130012 Changchun, China
| | - Xuping Xie
- Key laboratory of Symbol Computation and Knowledge Engineering of Ministry of Education, College of Computer Science and Technology, Jilin University, 130012 Changchun, China
| | - Yuan Fu
- Institute of Biological, Environmental and Rural Sciences, Aberystwyth University, Aberystwyth, Ceredigion, UK
| |
|
39
|
Xuan P, Bai H, Cui H, Zhang X, Nakaguchi T, Zhang T. Specific topology and topological connection sensitivity enhanced graph learning for lncRNA-disease association prediction. Comput Biol Med 2023; 164:107265. [PMID: 37531860 DOI: 10.1016/j.compbiomed.2023.107265] [Citation(s) in RCA: 1] [Impact Index Per Article: 1.0] [Received: 04/12/2023] [Revised: 06/26/2023] [Accepted: 07/16/2023] [Indexed: 08/04/2023]
Abstract
Predicting disease-related candidate long noncoding RNAs (lncRNAs) is beneficial for exploring disease pathogenesis because of the close relations between lncRNAs and the occurrence and development of human diseases. It is a long-term and challenging task to adequately extract specific and local topologies in an individual lncRNA network and an individual disease network, and to integrate the information of the connection relationships. We propose a new graph learning-based prediction method to encode specific and local topologies from each individual network, neighbor topologies with different connection relationships, and pairwise attributes. We first construct a lncRNA network composed of all the lncRNA nodes and their similarities, and a single disease network that contains all the disease nodes and disease similarities. Then, a network-aware graph convolutional autoencoder is constructed to encode the specific and local topologies of each network. Second, a heterogeneous network is established to embed all lncRNA, disease, and miRNA nodes and their various connections. Afterwards, a connection-sensitive graph neural network is designed to deeply integrate the neighbor node attributes and connection characteristics in the heterogeneous network and learn neighbor topological representations. We also construct both connection-level and topology representation-level attention mechanisms to extract informative connections and topological representations. Finally, we build a multi-layer convolutional neural network with weighted residuals to adaptively complement the detailed features of the pairwise attribute encoding. Comprehensive experiments and comparison results demonstrated that NCPred outperforms seven state-of-the-art prediction methods. The ablation studies demonstrated the importance of local topology learning, neighbor topology learning, and pairwise attribute encoding. Case studies on prostate, lung, and breast cancers further revealed NCPred's capacity to screen potential candidate disease-related lncRNAs.
Affiliation(s)
- Ping Xuan
- Department of Computer Science, School of Engineering, Shantou University, Shantou, China
| | - Honglei Bai
- School of Computer Science and Technology, Heilongjiang University, Harbin, China
| | - Hui Cui
- Department of Computer Science and Information Technology, La Trobe University, Melbourne, Australia
| | - Xiaowen Zhang
- School of Computer Science and Technology, Heilongjiang University, Harbin, China
| | - Toshiya Nakaguchi
- Center for Frontier Medical Engineering, Chiba University, Chiba, Japan
| | - Tiangang Zhang
- School of Computer Science and Technology, Heilongjiang University, Harbin, China; School of Mathematical Science, Heilongjiang University, Harbin, China.
| |
|
40
|
Li Y, Zhang M, Shang J, Li F, Ren Q, Liu JX. iLncDA-RSN: identification of lncRNA-disease associations based on reliable similarity networks. Front Genet 2023; 14:1249171. [PMID: 37614816 PMCID: PMC10442839 DOI: 10.3389/fgene.2023.1249171] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 06/28/2023] [Accepted: 07/27/2023] [Indexed: 08/25/2023] Open
Abstract
Identification of disease-associated long non-coding RNAs (lncRNAs) is crucial for unveiling the underlying genetic mechanisms of complex diseases. Multiple types of similarity networks of lncRNAs (or diseases) can characterize their similarities in a complementary and comprehensive way. Hence, in this study, we present a computational model, iLncDA-RSN, based on reliable similarity networks for identifying potential lncRNA-disease associations (LDAs). Specifically, to construct reliable similarity networks of lncRNAs and diseases, miRNA heuristic information on lncRNAs and diseases is first introduced to construct their respective Jaccard similarity networks; then Gaussian interaction profile (GIP) kernel similarity networks and Jaccard similarity networks of lncRNAs and diseases are built from the lncRNA-disease association network; finally, a random walk with restart strategy is applied to the Jaccard similarity networks, the GIP kernel similarity networks, the lncRNA functional similarity network and the disease semantic similarity network to construct reliable similarity networks. Based on the lncRNA-disease association network and the reliable similarity networks, feature vectors of lncRNA-disease pairs are integrated from the lncRNA and disease perspectives respectively, and then reduced in dimensionality by the elastic net. Finally, two random forests are used together on different lncRNA-disease association feature sets to identify potential LDAs. iLncDA-RSN is evaluated by five-fold cross-validation to analyse its prediction performance, and the results show that it outperforms the compared models. Furthermore, case studies of different complex diseases demonstrate the effectiveness of iLncDA-RSN in identifying potential LDAs.
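The two similarity measures named in this abstract can be illustrated on a toy association matrix. The sketch below uses the commonly used definitions: Jaccard similarity as intersection over union of the associated items, and the GIP kernel as exp(-gamma * ||a_i - a_j||^2) with gamma set to gamma' divided by the mean squared norm of the interaction profiles. The matrix, the function names, and the choice gamma' = 1 are assumptions for illustration; the paper's exact construction, including its use of miRNA information, may differ.

```python
import math


def jaccard_similarity(profiles):
    """Pairwise Jaccard similarity between rows of a binary matrix."""
    sets = [{k for k, v in enumerate(row) if v} for row in profiles]
    n = len(sets)
    sim = [[0.0] * n for _ in range(n)]
    for i in range(n):
        for j in range(n):
            union = sets[i] | sets[j]
            sim[i][j] = len(sets[i] & sets[j]) / len(union) if union else 0.0
    return sim


def gip_kernel_similarity(profiles, gamma_prime=1.0):
    """Gaussian interaction profile kernel between rows of a matrix."""
    n = len(profiles)
    mean_sq_norm = sum(sum(v * v for v in row) for row in profiles) / n
    gamma = gamma_prime / mean_sq_norm if mean_sq_norm else 0.0
    sim = [[0.0] * n for _ in range(n)]
    for i in range(n):
        for j in range(n):
            sq_dist = sum((a - b) ** 2 for a, b in zip(profiles[i], profiles[j]))
            sim[i][j] = math.exp(-gamma * sq_dist)
    return sim


# Hypothetical association matrix: 3 lncRNAs (rows) x 4 diseases (columns).
associations = [
    [1, 1, 0, 0],
    [1, 0, 0, 0],
    [0, 0, 1, 1],
]
jaccard = jaccard_similarity(associations)
gip = gip_kernel_similarity(associations)
```

Applying the same functions to the transposed matrix would give the disease-side similarities under the same assumptions.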
Affiliation(s)
| | | | - Junliang Shang
- School of Computer Science, Qufu Normal University, Rizhao, China
| | | | | | | |
|
41
|
Biyu H, GuangWen T, Ming Z, Lixin G, Mengshan L. A lncRNA-disease association prediction model based on the two-step PU learning and fully connected neural networks. Heliyon 2023; 9:e17726. [PMID: 37539215 PMCID: PMC10395133 DOI: 10.1016/j.heliyon.2023.e17726] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 03/09/2023] [Revised: 06/13/2023] [Accepted: 06/26/2023] [Indexed: 08/05/2023] Open
Abstract
Long non-coding RNAs (lncRNAs) have been shown to play a regulatory role in various processes of human diseases. However, lncRNA experiments are inefficient, time-consuming and highly subjective, so the number of experimentally verified associations between lncRNAs and diseases is limited. In the era of big data, numerous machine learning methods have been proposed to predict potential associations between lncRNAs and diseases, but the characteristics of the association data have seldom been explored. In these methods, negative samples are randomly selected for model training, so the model is prone to learning potential positive associations as errors, which affects the prediction accuracy. In this paper, we propose a cyclic optimization model for predicting lncRNA-disease associations (COPTLDA for short). In COPTLDA, a two-step training strategy is adopted: samples with the greatest probability of being negative are searched for among the unlabeled samples, and the selected samples are treated as negative samples and combined with the known positive samples to train the model. The searching and training steps are repeated until the best model is obtained as the final prediction model. To evaluate the performance of the model, 30% of the known positive samples are used to calculate the model accuracy and 10% of the positive samples are used to calculate the recall rate of the model. The sampling strategy used in this paper can improve the accuracy, and the AUC value reaches 0.9348. The results of case studies showed that the model could predict the potential associations between lncRNAs and malignant tumors such as colorectal cancer, gastric cancer, and breast cancer. The predicted top 20 associated lncRNAs included 10 colorectal cancer lncRNAs, 2 gastric cancer lncRNAs, and 8 breast cancer lncRNAs.
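The search-and-train loop described in this abstract can be sketched as a generic two-step positive-unlabeled (PU) procedure. This is an illustration of the idea only: a nearest-centroid scorer stands in for the paper's fully connected neural network, and the data points, function names, and the fixed number of selected negatives are all hypothetical.

```python
def centroid(points):
    """Component-wise mean of a list of equal-length tuples."""
    dim = len(points[0])
    return tuple(sum(p[d] for p in points) / len(points) for d in range(dim))


def sq_dist(a, b):
    return sum((x - y) ** 2 for x, y in zip(a, b))


def positive_score(x, pos_c, neg_c):
    """Higher means closer to the positive centroid than to the negative one."""
    return sq_dist(x, neg_c) - sq_dist(x, pos_c)


def two_step_pu(positives, unlabeled, n_neg, rounds=5):
    """Alternate between selecting likely negatives and refitting the scorer."""
    negatives = list(unlabeled)  # round 0: treat every unlabeled sample as negative
    for _ in range(rounds):
        pos_c, neg_c = centroid(positives), centroid(negatives)
        # Step 1 (search): keep the unlabeled samples that look most negative.
        ranked = sorted(unlabeled, key=lambda x: positive_score(x, pos_c, neg_c))
        selected = ranked[:n_neg]
        if set(selected) == set(negatives):
            break  # the selection has stabilised
        negatives = selected  # Step 2 (train): refit on positives + selection
    return negatives, centroid(positives), centroid(negatives)


# Hypothetical 2-D features; (1.0, 0.9) is an unlabeled sample that is really positive.
known_positives = [(1.0, 1.0), (0.9, 1.1), (1.1, 0.9)]
unlabeled_pool = [(1.0, 0.9), (0.0, 0.0), (0.1, -0.1), (-0.1, 0.1), (0.05, 0.05)]
reliable_negatives, pos_c, neg_c = two_step_pu(known_positives, unlabeled_pool, n_neg=3)
is_positive = positive_score((0.95, 1.0), pos_c, neg_c) > 0
```

Compared with drawing negatives at random from the unlabeled pool, this selection keeps the hidden positive out of the negative training set in the toy example, which is the failure mode the abstract attributes to random negative sampling.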
Affiliation(s)
| | | | | | | | - Li Mengshan
- Corresponding author. Gannan Normal University, China.
| |
|
42
|
Hu X, Yin Z, Zeng Z, Peng Y. Prediction of miRNA-Disease Associations by Cascade Forest Model Based on Stacked Autoencoder. Molecules 2023; 28:5013. [PMID: 37446675 DOI: 10.3390/molecules28135013] [Citation(s) in RCA: 2] [Impact Index Per Article: 2.0] [Received: 05/30/2023] [Revised: 06/23/2023] [Accepted: 06/24/2023] [Indexed: 07/15/2023] Open
Abstract
Numerous pieces of evidence have indicated that microRNA (miRNA) plays a crucial role in a series of significant biological processes and is closely related to complex disease. However, the traditional biological experimental methods used to verify disease-related miRNAs are inefficient and expensive. Thus, more efficient computational approaches are needed. In this work, a novel method (CFSAEMDA) is proposed for the prediction of unknown miRNA-disease associations (MDAs). Specifically, we first capture the interactive features of miRNA and disease by integrating multi-source information. Then, a stacked autoencoder is applied to obtain the underlying feature representation. Finally, a modified cascade forest model is employed to complete the final prediction. The experimental results show that the AUC value obtained by our method is 97.67%. The performance of CFSAEMDA is superior to several of the latest methods. In addition, case studies conducted on lung neoplasms, breast neoplasms and hepatocellular carcinoma further show that the CFSAEMDA method may be regarded as a useful approach to infer unknown disease-miRNA relationships.
Affiliation(s)
- Xiang Hu
- Center of Intelligent Computing and Applied Statistics, School of Mathematics, Physics and Statistics, Shanghai University of Engineering Science, Shanghai 201620, China
| | - Zhixiang Yin
- Center of Intelligent Computing and Applied Statistics, School of Mathematics, Physics and Statistics, Shanghai University of Engineering Science, Shanghai 201620, China
| | - Zhiliang Zeng
- Center of Intelligent Computing and Applied Statistics, School of Mathematics, Physics and Statistics, Shanghai University of Engineering Science, Shanghai 201620, China
| | - Yu Peng
- Center of Intelligent Computing and Applied Statistics, School of Mathematics, Physics and Statistics, Shanghai University of Engineering Science, Shanghai 201620, China
| |
|
43
|
Lu C, Xie M. LDAEXC: LncRNA-Disease Associations Prediction with Deep Autoencoder and XGBoost Classifier. Interdiscip Sci 2023:10.1007/s12539-023-00573-z. [PMID: 37308797 DOI: 10.1007/s12539-023-00573-z] [Citation(s) in RCA: 2] [Impact Index Per Article: 2.0] [Received: 08/06/2022] [Revised: 05/14/2023] [Accepted: 05/15/2023] [Indexed: 06/14/2023]
Abstract
Considerable scientific evidence has revealed that long non-coding RNAs (lncRNAs) are involved in the progression of complex human diseases and in biological life activities. Therefore, identifying novel and potential disease-related lncRNAs is helpful for the diagnosis, prognosis and therapy of many complex human diseases. Since traditional laboratory experiments are costly and time-consuming, a large number of computational algorithms have been proposed for predicting the relationships between lncRNAs and diseases. However, there is still much room for improvement. In this paper, we introduce an accurate framework named LDAEXC to infer LncRNA-Disease Associations with a deep autoencoder and XGBoost Classifier. LDAEXC utilizes different similarity views of lncRNAs and human diseases to construct features for each data source. Then, the reduced features are obtained by feeding the constructed feature vectors into a deep autoencoder, and finally an XGBoost classifier is used to calculate the latent lncRNA-disease association scores from the reduced features. The fivefold cross-validation experiments on four datasets showed that LDAEXC reached AUC scores of 0.9676 ± 0.0043, 0.9449 ± 0.022, 0.9375 ± 0.0331 and 0.9556 ± 0.0134, respectively, significantly higher than other advanced comparable computational methods. Extensive experimental results and case studies of two complex diseases (colon and breast cancers) further indicated the practicability and excellent prediction performance of LDAEXC in inferring unknown lncRNA-disease associations. LDAEXC utilizes disease semantic similarity, lncRNA expression similarity, and Gaussian interaction profile kernel similarity of lncRNAs and diseases for feature construction. The constructed features are fed to a deep autoencoder to extract reduced features, and an XGBoost classifier is used to predict the lncRNA-disease associations based on the reduced features. The fivefold and tenfold cross-validation experiments on a benchmark dataset showed that LDAEXC could achieve AUC scores of 0.9676 and 0.9682, respectively, significantly higher than other state-of-the-art similar methods.
Affiliation(s)
- Cuihong Lu
- College of Information Science and Engineering, Hunan Normal University, Changsha, China
| | - Minzhu Xie
- College of Information Science and Engineering, Hunan Normal University, Changsha, China.
| |
|
44
|
Zhong H, Luo J, Tang L, Liao S, Lu Z, Lin G, Murphy RW, Liu L. Association filtering and generative adversarial networks for predicting lncRNA-associated disease. BMC Bioinformatics 2023; 24:234. [PMID: 37277721 DOI: 10.1186/s12859-023-05368-z] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 02/06/2023] [Accepted: 05/29/2023] [Indexed: 06/07/2023] Open
Abstract
BACKGROUND Long non-coding RNA (lncRNA) is closely associated with numerous biological processes and with many diseases. Therefore, lncRNA-disease association prediction helps obtain relevant biological information and understand pathogenesis, and thus better diagnose preventable diseases. RESULTS Herein, we offer the LDAF_GAN method for predicting lncRNA-associated disease based on association filtering and generative adversarial networks. Experiments used two types of data: lncRNA-disease association data without lncRNA sequence features, and association data fused with lncRNA sequence features. LDAF_GAN uses a generator and a discriminator, and differs from the original GAN by the addition of a filtering operation and negative sampling. Filtering removes unassociated diseases from the generator output before it is fed into the discriminator. Thus, the results generated by the model focus only on lncRNAs associated with disease. Negative sampling takes a portion of the disease terms with a value of 0 in the association matrix as negative samples, which are assumed to be unassociated with the lncRNA. A regularization term is added to the loss function to avoid producing a vector with all values of 1, which can fool the discriminator. Thus, the model requires that generated positive samples are close to 1 and negative samples are close to 0. The model achieved a superior fitting effect; LDAF_GAN had superior performance in fivefold cross-validation on the two datasets, with AUC values of 0.9265 and 0.9278, respectively. In the case study, LDAF_GAN predicted disease associations for six lncRNAs (H19, MALAT1, XIST, ZFAS1, UCA1, and ZEB1-AS1); of the top ten predictions for each, 100%, 80%, 90%, 90%, 100%, and 90%, respectively, were reported by previous studies. CONCLUSION LDAF_GAN efficiently predicts the potential association of existing lncRNAs and the potential association of new lncRNAs with diseases. The results of fivefold cross-validation, tenfold cross-validation, and case studies suggest that the model has great predictive potential for lncRNA-disease association prediction.
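The filtering and negative-sampling steps described in this abstract can be illustrated with a minimal sketch. It is not the authors' implementation: the function names, the elementwise mask used for filtering, and the mean-squared penalty on sampled negatives are all assumptions chosen to match the description (unassociated diseases are removed from the generator output, and generated values at sampled zero entries are pushed toward 0).

```python
import random

def sample_negatives(assoc_row, ratio, rng):
    """Pick a fraction of the zero entries of one lncRNA's association row.

    The sampled indices are treated as assumed non-associations.
    """
    zeros = [j for j, v in enumerate(assoc_row) if v == 0]
    k = int(len(zeros) * ratio)
    return set(rng.sample(zeros, k))

def filter_output(generated, assoc_row):
    """Mask the generator output so only known associations remain.

    Assumption: filtering is an elementwise product with the 0/1 row.
    """
    return [g * a for g, a in zip(generated, assoc_row)]

def negative_penalty(generated, negatives):
    """Hypothetical regularization term: mean squared score at sampled negatives.

    It grows when the generator outputs values near 1 everywhere.
    """
    if not negatives:
        return 0.0
    return sum(generated[j] ** 2 for j in negatives) / len(negatives)

if __name__ == "__main__":
    rng = random.Random(0)
    row = [1, 0, 0, 1, 0]                  # known associations for one lncRNA
    fake = [0.9, 0.8, 0.1, 0.7, 0.2]       # generator scores per disease
    negs = sample_negatives(row, 1.0, rng)
    print(filter_output(fake, row))        # scores passed to the discriminator
    print(negative_penalty(fake, negs))    # added to the generator loss
```

In this sketch the discriminator only ever sees scores at known associations, while the penalty term is what discourages the trivial all-ones output mentioned in the abstract.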
Affiliation(s)
- Hua Zhong
- School of Information Science, Yunnan Normal University, Kunming, China
| | - Jing Luo
- State Key Laboratory for Conservation and Utilization of Bio-resource, School of Ecology and Environment and School of Life Sciences, Yunnan University, Kunming, China
| | - Lin Tang
- Key Laboratory of Educational Information for Nationalities Ministry of Education, Yunnan University, Kunming, China
| | - Shicheng Liao
- School of Information Science, Yunnan Normal University, Kunming, China
| | - Zhonghao Lu
- School of Information Science, Yunnan Normal University, Kunming, China
| | - Guoliang Lin
- School of Medicine, Yunnan University, Kunming, China
| | - Robert W Murphy
- Reptilia Zoo and Education Centre, 2501 Rutherford Rd., Vaughan, ON, L4K 2N6, Canada
| | - Lin Liu
- School of Information Science, Yunnan Normal University, Kunming, China.
| |
|
45
|
Kumar R, Yadav G, Kuddus M, Ashraf GM, Singh R. Unlocking the microbial studies through computational approaches: how far have we reached? Environmental Science and Pollution Research International 2023; 30:48929-48947. [PMID: 36920617 PMCID: PMC10016191 DOI: 10.1007/s11356-023-26220-0] [Citation(s) in RCA: 4] [Impact Index Per Article: 4.0] [Received: 03/23/2022] [Accepted: 02/24/2023] [Indexed: 04/16/2023]
Abstract
The metagenomics approach accelerated the study of genetic information from uncultured microbes and complex microbial communities. In silico research also facilitated an understanding of protein-DNA interactions, protein-protein interactions, docking between proteins and phyto/biochemicals for drug design, and modeling of the 3D structure of proteins. These in silico approaches provided insight into analyzing pathogenic and nonpathogenic strains, which helped in identifying probable genes for vaccines and antimicrobial agents and in comparing whole-genome sequences to study microbial evolution. Artificial intelligence, more precisely machine learning (ML) and deep learning (DL), has proven to be a promising approach in the field of microbiology to handle, analyze, and utilize the large data sets that are generated through nucleic acid sequencing and proteomics. This enabled the understanding of the functional and taxonomic diversity of microorganisms. ML and DL have been used in the prediction and forecasting of diseases and applied to trace environmental contaminants and environmental quality. This review presents an in-depth analysis of the recent application of in silico approaches in microbial genomics, proteomics, functional diversity, vaccine development, and drug design.
Affiliation(s)
- Rajnish Kumar
- Amity Institute of Biotechnology, Amity University Uttar Pradesh Lucknow Campus, Lucknow, Uttar Pradesh, India
- Department of Veterinary Medicine and Surgery, College of Veterinary Medicine, University of Missouri, Columbia, MO, USA
| | - Garima Yadav
- Amity Institute of Biotechnology, Amity University Uttar Pradesh Lucknow Campus, Lucknow, Uttar Pradesh, India
| | - Mohammed Kuddus
- Department of Biochemistry, College of Medicine, University of Hail, Hail, Saudi Arabia
| | - Ghulam Md Ashraf
- Department of Medical Laboratory Sciences, College of Health Sciences, and Sharjah Institute for Medical Research, University of Sharjah, Sharjah, 27272, United Arab Emirates
| | - Rachana Singh
- Amity Institute of Biotechnology, Amity University Uttar Pradesh Lucknow Campus, Lucknow, Uttar Pradesh, India.
| |
|
46
|
Zhang GZ, Gao YL. BRWMC: Predicting lncRNA-disease associations based on bi-random walk and matrix completion on disease and lncRNA networks. Comput Biol Chem 2023; 103:107833. [PMID: 36812824 DOI: 10.1016/j.compbiolchem.2023.107833] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 08/21/2022] [Revised: 12/29/2022] [Accepted: 02/15/2023] [Indexed: 02/19/2023]
Abstract
Many experiments have shown that long non-coding RNAs (lncRNAs) in humans are implicated in disease development. The prediction of lncRNA-disease associations is essential in promoting disease treatment and drug development. It is time-consuming and laborious to explore the relationship between lncRNAs and diseases in the laboratory. The computation-based approach has clear advantages and has become a promising research direction. This paper proposes a new lncRNA-disease association prediction algorithm, BRWMC. First, BRWMC constructs several lncRNA (disease) similarity networks based on different measurement angles and fuses them into an integrated similarity network by similarity network fusion (SNF). Next, the random walk method is used to preprocess the known lncRNA-disease association matrix and calculate the estimated scores of potential lncRNA-disease associations. Finally, the matrix completion method predicts the potential lncRNA-disease associations. Under the frameworks of leave-one-out cross-validation and 5-fold cross-validation, the AUC values obtained by BRWMC are 0.9610 and 0.9739, respectively. In addition, case studies of three common diseases show that BRWMC is a reliable method for prediction.
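The random-walk preprocessing step named in this abstract can be sketched as follows. This is a generic bi-random walk with restart, not BRWMC's exact formulation: the update rule (walk on the lncRNA network from the left, walk on the disease network from the right, then blend with the known associations), the row normalization, and the parameter names alpha and steps are assumptions for illustration.

```python
def matmul(a, b):
    """Multiply two matrices given as lists of rows."""
    return [[sum(a[i][k] * b[k][j] for k in range(len(b)))
             for j in range(len(b[0]))] for i in range(len(a))]

def row_normalize(m):
    """Scale each row to sum to 1 (all-zero rows are left as zeros)."""
    out = []
    for row in m:
        s = sum(row)
        out.append([v / s if s else 0.0 for v in row])
    return out

def bi_random_walk(assoc, sim_l, sim_d, alpha=0.5, steps=3):
    """Propagate known associations over both similarity networks.

    assoc: lncRNA x disease 0/1 matrix; sim_l, sim_d: similarity matrices.
    Each step computes alpha * (W_l . R . W_d) + (1 - alpha) * R0.
    """
    total = sum(sum(r) for r in assoc)
    start = [[v / total for v in r] for r in assoc]   # restart distribution R0
    w_l, w_d = row_normalize(sim_l), row_normalize(sim_d)
    scores = start
    for _ in range(steps):
        walked = matmul(matmul(w_l, scores), w_d)
        scores = [[alpha * w + (1 - alpha) * s for w, s in zip(wr, sr)]
                  for wr, sr in zip(walked, start)]
    return scores

if __name__ == "__main__":
    assoc = [[1, 0], [0, 0]]               # one known lncRNA-disease pair
    sim = [[1.0, 0.5], [0.5, 1.0]]         # toy similarity for both networks
    for row in bi_random_walk(assoc, sim, sim, steps=1):
        print(row)
```

After one step in the toy example, pairs with no known association receive small positive scores because they are similar to the known pair; a matrix completion stage, as the abstract describes, would then operate on these estimated scores.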
Affiliation(s)
- Guo-Zheng Zhang
- School of Computer Science, Qufu Normal University, Rizhao, China
| | - Ying-Lian Gao
- Qufu Normal University Library, Qufu Normal University, Rizhao, China.
| |
|
47
|
Feng JL, Zheng WJ, Xu L, Zhou QY, Chen J. Identification of potential LncRNAs as papillary thyroid carcinoma biomarkers based on integrated bioinformatics analysis using TCGA and RNA sequencing data. Sci Rep 2023; 13:4350. [PMID: 36928327 PMCID: PMC10020161 DOI: 10.1038/s41598-023-30086-0] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 01/03/2023] [Accepted: 02/15/2023] [Indexed: 03/18/2023] Open
Abstract
The roles and mechanisms of long non-coding RNAs (lncRNAs) in papillary thyroid cancer (PTC) remain elusive. We obtained RNA sequencing (RNA-seq) data of surgical PTC specimens from patients with thyroid cancer (THCA; n = 20) and identified differentially expressed genes (DEGs) between cancer and cancer-adjacent tissue samples. We identified 2309 DEGs (1372 significantly upregulated and 937 significantly downregulated). We performed Gene Ontology, Kyoto Encyclopedia of Genes and Genomes, gene set enrichment, and protein-protein interaction network analyses and screened for hub lncRNAs. Using the same methods, we analyzed the RNA-seq data from THCA dataset in The Cancer Genome Atlas (TCGA) database to identify differentially expressed lncRNAs. We identified 15 key differentially expressed lncRNAs and pathways that were closely related to PTC. Subsequently, by intersecting the differentially expressed lncRNAs with hub lncRNAs, we identified LINC02407 as the key lncRNA. Assessment of the associated clinical characteristics and prognostic correlations revealed a close correlation between LINC02407 expression and N stage of patients. Furthermore, receiver operating characteristic curve analysis showed that LINC02407 could better distinguish between cancerous and cancer-adjacent tissues in THCA patients. In conclusion, our findings suggest that LINC02407 is a potential biomarker for PTC diagnosis and the prediction of lymph node metastasis.
Affiliation(s)
- Jia-Lin Feng
- Department of Head and Neck Surgery, Ren Ji Hospital, Shanghai Jiao Tong University School of Medicine, Shanghai, China
| | - Wen-Jie Zheng
- Department of Head and Neck Surgery, Ren Ji Hospital, Shanghai Jiao Tong University School of Medicine, Shanghai, China
| | - Le Xu
- Department of Head and Neck Surgery, Ren Ji Hospital, Shanghai Jiao Tong University School of Medicine, Shanghai, China
| | - Qin-Yi Zhou
- Department of Head and Neck Surgery, Ren Ji Hospital, Shanghai Jiao Tong University School of Medicine, Shanghai, China.
| | - Jun Chen
- Department of Head and Neck Surgery, Ren Ji Hospital, Shanghai Jiao Tong University School of Medicine, Shanghai, China.
| |
|
48
|
Yalimaimaiti S, Liang X, Zhao H, Dou H, Liu W, Yang Y, Ning L. Establishment of a prognostic signature for lung adenocarcinoma using cuproptosis-related lncRNAs. BMC Bioinformatics 2023; 24:81. [PMID: 36879187 PMCID: PMC9990240 DOI: 10.1186/s12859-023-05192-5] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 11/18/2022] [Accepted: 02/20/2023] [Indexed: 03/08/2023] Open
Abstract
OBJECTIVE To establish a prognostic signature for lung adenocarcinoma (LUAD) based on cuproptosis-related long non-coding RNAs (lncRNAs), and to study the immune-related functions of LUAD. METHODS First, transcriptome data and clinical data related to LUAD were downloaded from The Cancer Genome Atlas (TCGA), and cuproptosis-related genes were analyzed to identify cuproptosis-related lncRNAs. Univariate Cox analysis, least absolute shrinkage and selection operator (LASSO) analysis, and multivariate Cox analysis were performed to analyze the cuproptosis-related lncRNAs, and a prognostic signature was established. Second, univariate Cox analysis and multivariate Cox analysis were performed for independent prognostic analyses. Receiver operating characteristic (ROC) curves, C index, survival curve, nomogram, and principal component analysis (PCA) were used to evaluate the results of the independent prognostic analyses. Finally, gene enrichment analyses and immune-related function analyses were also carried out. RESULTS (1) A total of 1,297 cuproptosis-related lncRNAs were screened. (2) A LUAD prognostic signature containing 13 cuproptosis-related lncRNAs was constructed (NIFK-AS1, AC026355.2, SEPSECS-AS1, AL360270.1, AC010999.2, ABCA9-AS1, AC032011.1, AL162632.3, LINC02518, LINC0059, AL031600.2, AP000346.1, AC012409.4). (3) The areas under the multi-indicator ROC curves at 1, 3, and 5 years were AUC1 = 0.742, AUC2 = 0.708, and AUC3 = 0.762, respectively. The risk score of the prognostic signature could be used as a prognostic factor independent of other clinical indicators. (4) The results of gene enrichment analyses showed that the 13 biomarkers were primarily related to amoebiasis, the Wnt signaling pathway, and hematopoietic cell lineage. The ssGSEA volcano map showed significant differences between high- and low-risk groups in immune-related functions, such as human leukocyte antigen (HLA), Type_II_IFN_Reponse, MHC_class_I, and Parainflammation (P < 0.001). CONCLUSIONS Thirteen cuproptosis-related lncRNAs may be clinical molecular biomarkers for the prognosis of LUAD.
Affiliation(s)
- Saiyidan Yalimaimaiti
- School of Public Health, Xinjiang Medical University, Urumqi, 830011, Xinjiang, China
| | - Xiaoqiao Liang
- School of Public Health, Xinjiang Medical University, Urumqi, 830011, Xinjiang, China
| | - Haili Zhao
- School of Public Health, Xinjiang Medical University, Urumqi, 830011, Xinjiang, China
| | - Hong Dou
- Xinjiang Uygur Autonomous Region Occupational Disease Hospital, Urumqi, 830011, Xinjiang, China
| | - Wei Liu
- Xinjiang Uygur Autonomous Region Occupational Disease Hospital, Urumqi, 830011, Xinjiang, China
| | - Ying Yang
- School of Public Health, Xinjiang Medical University, Urumqi, 830011, Xinjiang, China
| | - Li Ning
- School of Public Health, Xinjiang Medical University, Urumqi, 830011, Xinjiang, China.
| |
|
49
|
Akbarzadeh S, Tayefeh-Gholami S, Najari P, Rajabi A, Ghasemzadeh T, Hosseinpour Feizi M, Safaralizadeh R. The expression profile of HAR1A and HAR1B in the peripheral blood cells of multiple sclerosis patients. Mol Biol Rep 2023; 50:2391-2398. [PMID: 36583781 DOI: 10.1007/s11033-022-08182-7] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 10/24/2022] [Accepted: 12/06/2022] [Indexed: 12/31/2022]
Abstract
BACKGROUND Multiple sclerosis (MS) is a progressive neurodegenerative disease of the central nervous system (CNS) with varying degrees of axonal and neuronal damage. The onset and progression of the disease are influenced by several environmental and genetic variables. Long non-coding RNAs (lncRNAs) have a crucial role in the pathophysiology of MS. Our study aimed to assess the levels of HAR1A and HAR1B lncRNA expression in the blood samples of MS patients and investigate the relationship between these lncRNAs and disease activity. METHODS AND RESULTS The blood samples of 100 MS patients, including 82 relapsing-remitting (RR), 8 primary progressive (PP), and 10 secondary progressive (SP) MS cases, and 100 healthy controls were collected. Quantitative real-time PCR was used for the evaluation of gene expression. ROC curve analysis was performed to evaluate the diagnostic potential of lncRNA levels. A significant decrease was detected in HAR1A expressions (P < 0.0001), and a moderate increase was also shown in HAR1B of SPMS patients (P value = 0.0189). HAR1A showed different expression levels in patients over forty (P value = 0.034). The expression levels of HAR1A and HAR1B were positively correlated in MS patients (r = 0.2003, P value = 0.0457). In addition, ROC curve results suggested that HAR1A can be introduced as a novel biomarker for MS diagnosis (AUC = 0.776). CONCLUSION The low serum level of HAR1A may be a potential molecular biomarker for MS diagnosis; however, no discernible difference was detected in the expression level of HAR1B in the blood samples of MS patients.
Affiliation(s)
- Sama Akbarzadeh
- Department of Animal Biology, Faculty of Natural Science, University of Tabriz, Tabriz, Iran
| | - Samaneh Tayefeh-Gholami
- Department of Animal Biology, Faculty of Natural Science, University of Tabriz, Tabriz, Iran
| | - Parisa Najari
- Department of Animal Biology, Faculty of Natural Science, University of Tabriz, Tabriz, Iran
| | - Ali Rajabi
- Department of Animal Biology, Faculty of Natural Science, University of Tabriz, Tabriz, Iran
| | - Tooraj Ghasemzadeh
- Department of Animal Biology, Faculty of Natural Science, University of Tabriz, Tabriz, Iran
| | | | - Reza Safaralizadeh
- Department of Animal Biology, Faculty of Natural Science, University of Tabriz, Tabriz, Iran.
| |
|
50
|
Ha J, Park S. NCMD: Node2vec-Based Neural Collaborative Filtering for Predicting MiRNA-Disease Association. IEEE/ACM Transactions on Computational Biology and Bioinformatics 2023; 20:1257-1268. [PMID: 35849666 DOI: 10.1109/tcbb.2022.3191972] [Citation(s) in RCA: 12] [Impact Index Per Article: 12.0] [Indexed: 05/04/2023]
Abstract
Numerous studies have reported that micro RNAs (miRNAs) play pivotal roles in disease pathogenesis based on the deregulation of the expressions of target messenger RNAs. Therefore, the identification of disease-related miRNAs is of great significance in understanding human complex diseases, which can also provide insight into the design of novel prognostic markers and disease therapies. Considering the time and cost involved in wet experiments, most recent works have focused on the effective and feasible modeling of computational frameworks to uncover miRNA-disease associations. In this study, we propose a novel framework called node2vec-based neural collaborative filtering for predicting miRNA-disease association (NCMD) based on deep neural networks. Initially, NCMD exploits Node2vec to learn low-dimensional vector representations of miRNAs and diseases. Next, it utilizes a deep learning framework that combines the linear ability of generalized matrix factorization and nonlinear ability of a multilayer perceptron. Experimental results clearly demonstrate the comparable performance of NCMD relative to the state-of-the-art methods according to statistical measures. In addition, case studies on breast cancer, lung cancer and pancreatic cancer validate the effectiveness of NCMD. Extensive experiments demonstrate the benefits of modeling a neural collaborative-filtering-based approach for discovering novel miRNA-disease associations.
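The combination of generalized matrix factorization (linear) and a multilayer perceptron (nonlinear) mentioned in this abstract can be illustrated with a small forward-pass sketch. It is not the NCMD implementation: the embeddings stand in for Node2vec output, and the single hidden layer, the weight values, and the function name ncf_score are hypothetical.

```python
import math

def dense(vec, weights, bias):
    """Fully connected layer: one output per weight row."""
    return [sum(w * x for w, x in zip(row, vec)) + b
            for row, b in zip(weights, bias)]

def relu(vec):
    return [max(0.0, x) for x in vec]

def ncf_score(mirna_vec, disease_vec, mlp_w, mlp_b, out_w, out_b):
    """Score one miRNA-disease pair from two embedding vectors.

    GMF branch: elementwise product of the embeddings (linear interaction).
    MLP branch: concatenated embeddings through one ReLU layer (nonlinear).
    Both branches are concatenated and mapped to a probability.
    """
    gmf = [m * d for m, d in zip(mirna_vec, disease_vec)]
    hidden = relu(dense(mirna_vec + disease_vec, mlp_w, mlp_b))
    logit = sum(w * x for w, x in zip(out_w, gmf + hidden)) + out_b
    return 1.0 / (1.0 + math.exp(-logit))

if __name__ == "__main__":
    mirna = [1.0, 0.0]                      # placeholder for a Node2vec embedding
    disease = [1.0, 1.0]
    mlp_w = [[1.0, 0.0, 0.0, 0.0], [0.0, 0.0, 0.0, -1.0]]   # hypothetical weights
    mlp_b = [0.0, 0.0]
    out_w = [1.0, 1.0, 1.0, 1.0]
    print(ncf_score(mirna, disease, mlp_w, mlp_b, out_w, -2.0))
```

In a trained model the weights would be learned from known associations; here they are fixed by hand only to show how the two branches are fused into a single association score.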
|