1
Diao B, Luo J, Guo Y. A comprehensive survey on deep learning-based identification and predicting the interaction mechanism of long non-coding RNAs. Brief Funct Genomics 2024; 23:314-324. [PMID: 38576205 DOI: 10.1093/bfgp/elae010] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 12/06/2023] [Revised: 02/25/2024] [Accepted: 03/14/2024] [Indexed: 04/06/2024]
Abstract
Long noncoding RNAs (lncRNAs) have been discovered to be extensively involved in eukaryotic epigenetic, transcriptional, and post-transcriptional regulatory processes with the advancements in sequencing technology and genomics research. Therefore, they play crucial roles in the body's normal physiology and various disease outcomes. Presently, numerous unknown lncRNA sequencing data require exploration. Establishing deep learning-based prediction models for lncRNAs provides valuable insights for researchers, substantially reducing time and costs associated with trial and error and facilitating the disease-relevant lncRNA identification for prognosis analysis and targeted drug development as the era of artificial intelligence progresses. However, most lncRNA-related researchers lack awareness of the latest advancements in deep learning models and model selection and application in functional research on lncRNAs. Thus, we elucidate the concept of deep learning models, explore several prevalent deep learning algorithms and their data preferences, conduct a comprehensive review of recent literature studies with exemplary predictive performance over the past 5 years in conjunction with diverse prediction functions, critically analyze and discuss the merits and limitations of current deep learning models and solutions, while also proposing prospects based on cutting-edge advancements in lncRNA research.
Affiliation(s)
- Biyu Diao
  - Department of Breast Surgery, The First Affiliated Hospital of Ningbo University, No. 59, Liuting Street, Haishu District, Ningbo 315000, China
- Jin Luo
  - Department of Breast Surgery, The First Affiliated Hospital of Ningbo University, No. 59, Liuting Street, Haishu District, Ningbo 315000, China
- Yu Guo
  - Department of Breast Surgery, The First Affiliated Hospital of Ningbo University, No. 59, Liuting Street, Haishu District, Ningbo 315000, China
2
Guo M, Zeng J, Li J, Jiang L, Wu X, Ren Z, Hu Z. Pharmacological Components and Mechanism Research on the Treatment of Myelosuppression after Chemotherapy with Danggui Jixueteng Decoction Based on Spectrum-Effect Relationships and Transcriptome Sequencing. ACS Omega 2024; 9:28926-28936. [PMID: 38973888 PMCID: PMC11223127 DOI: 10.1021/acsomega.4c03641] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 04/15/2024] [Revised: 06/07/2024] [Accepted: 06/10/2024] [Indexed: 07/09/2024]
Abstract
Danggui Jixueteng decoction (DJD) has been used to treat anemia for many years and has been shown to be effective. However, the mechanism of action and effective components are yet unknown. We want to search for pharmacodynamic components in DJD with therapeutic effects on myelosuppression after chemotherapy (MAC), utilizing a spectrum-effect connection study based on gray relational analysis and partial least-squares regression analysis. Transcriptome sequencing (RNA-Seq) was used to investigate the mechanism by which DJD treats MAC. In this study, fingerprints of different batches of DJD (S1-S10) were established by ultraperformance liquid chromatography-mass spectrometry (UPLC-MS), after which the resulting shared peaks were screened and identified. A total of 21 common peaks were screened through the fingerprints of different batches of DJD, and the similarity of each profile was greater than 0.92. The 21 shared peaks were identified by comparison with the standard sample and searching on a MassLynx 4.1 workstation. The rat model of MAC was established by intraperitoneal injection of cyclophosphamide, and DJD treatment was carried out in parallel with the establishment of the model. White blood cell count, red blood cell count, platelet count, interleukin-3, hemoglobin concentration, granulocyte-macrophage colony-stimulating factor, and nucleated cell count were used as efficacy indicators. Pharmacodynamic results indicated that DJD could effectively improve the pharmacodynamic indices of MAC rats. The results of gray relational analysis demonstrated eight peaks with high correlation with efficacy, which were 2, 7, 10, 14, 15, 16, 18, and 21, and the partial least-squares regression analysis showed four peaks with variable importance in projection values greater than 1, which were 10, 12, 13, and 19. 
RNA-Seq was used to identify differentially expressed genes (DEGs) in rat bone marrow cells, and Gene Ontology functional enrichment and Kyoto Encyclopedia of Genes and Genomes pathway enrichment analyses of the DEGs were performed. The genes related to the effects of DJD on MAC were mainly involved in the phosphatidylinositol 3-kinase/serine-threonine kinase (PI3K-Akt) signaling pathway, the mitogen-activated protein kinase signaling pathway, actin cytoskeleton regulation, focal adhesion, and Rap1 signaling pathways. The results of the RNA-Seq study were confirmed by a qPCR experiment. The effective compounds of DJD against MAC include albiflorin, paeoniflorin, gallopaeoniflorin, salvianolic acid H/I, albiflorin R1, salvianolic acid B, salvianolic acid E, benzoylpaeoniflorin, and C12H18N5O4. The mechanism by which DJD prevents and treats MAC might involve the control of the PI3K-Akt signaling pathway.
Affiliation(s)
- Mingxin Guo
  - The Affiliated Yixing Hospital of Jiangsu University, Yixing 214200, China
- Jiaqi Zeng
  - The Affiliated Yixing Hospital of Jiangsu University, Yixing 214200, China
- Jing Li
  - Zibo Central Hospital, Zibo 255000, China
- Luyao Jiang
  - The Affiliated Yixing Hospital of Jiangsu University, Yixing 214200, China
- Xia Wu
  - Guangdong Pharmaceutical University, Guangzhou 516006, China
- Zhanyun Ren
  - The Affiliated Yixing Hospital of Jiangsu University, Yixing 214200, China
- Zhiqiang Hu
  - The Affiliated Yixing Hospital of Jiangsu University, Yixing 214200, China
3
Saranya KR, Vimina ER, Pinto FR. TransNeT-CGP: A cluster-based comorbid gene prioritization by integrating transcriptomics and network-topological features. Comput Biol Chem 2024; 110:108038. [PMID: 38461796 DOI: 10.1016/j.compbiolchem.2024.108038] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 10/06/2023] [Revised: 01/11/2024] [Accepted: 02/25/2024] [Indexed: 03/12/2024]
Abstract
The local disruptions caused by the genes of one disease can influence the pathways associated with the other diseases resulting in comorbidity. For gene therapies, it is necessary to prioritize the key genes that regulate common biological mechanisms to tackle the issues caused by overlapping diseases. This work proposes a clustering-based computational approach for prioritising the comorbid genes within the overlapping disease modules by analyzing Protein-Protein Interaction networks. For this, a sub-network with gene interactions of the disease pair was extracted from the interactome. The edge weights are assigned by combining the pairwise gene expression correlation and betweenness centrality scores. Further, a weighted graph clustering algorithm is applied and dominant nodes of high-density clusters are ranked based on clustering coefficients and neighborhood connectivity. Case studies based on neurodegenerative diseases such as Amyotrophic Lateral Sclerosis- Spinal Muscular Atrophy (ALS-SMA) pair and cancers such as Ovarian Carcinoma-Invasive Ductal Breast Carcinoma (OC-IDBC) pair were conducted to examine the efficacy of the proposed method. To identify the mechanistic role of top-ranked genes, we used Functional and Pathway enrichment analysis, connectivity analysis with leave-one-out (LOO) method, analysis of associated disease-related protein complexes, and prioritization tools such as TOPPGENE and Heml2.0. From pathway analysis, it was observed that the top 10 genes obtained using the proposed method were associated with 10 pathways in ALS-SMA comorbidity and 15 in the case of OC-IDBC, while that in similar methods like SAPDSB and S2B were 4, 6 respectively for ALS-SMA and 9, 10 respectively for OC-IDBC. In both case studies, 70 % of the disease-specific benchmark protein complexes were linked to top-ranked genes of the proposed method while that of SAPDSB and S2B were 55 % and 60 % respectively. 
Additionally, it was found that the removal of the top 10 genes disconnects the network into 14 distinct components in the case of ALS-SMA and 9 in the case of OC-IDBC. The experimental results show that the proposed method can be effectively used for identifying key genes in comorbidity and can offer insights into the intricate molecular relationships driving comorbid diseases.
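The connectivity analysis mentioned in this abstract (removing top-ranked genes and counting the resulting network components) can be illustrated with a minimal sketch. This is not the authors' implementation: the function name `count_components` and the toy network with gene labels `G1` to `G7` are hypothetical, and the graph is assumed to be undirected and stored as an adjacency dictionary.

```python
from collections import deque


def count_components(adjacency, removed=frozenset()):
    """Count connected components after deleting the nodes in `removed`."""
    seen = set(removed)  # removed nodes are treated as already visited
    components = 0
    for start in adjacency:
        if start in seen:
            continue
        components += 1
        seen.add(start)
        queue = deque([start])
        while queue:  # breadth-first search over one component
            node = queue.popleft()
            for neighbor in adjacency[node]:
                if neighbor not in seen:
                    seen.add(neighbor)
                    queue.append(neighbor)
    return components


# Hypothetical toy network: hub "G1" links three otherwise separate pairs.
toy_network = {
    "G1": {"G2", "G4", "G6"},
    "G2": {"G1", "G3"},
    "G3": {"G2"},
    "G4": {"G1", "G5"},
    "G5": {"G4"},
    "G6": {"G1", "G7"},
    "G7": {"G6"},
}

before = count_components(toy_network)
after = count_components(toy_network, removed={"G1"})
print(before, after)
```

In this toy case the network is a single component until the hub is removed, after which it splits into three; a larger jump in component count suggests that the removed nodes were holding the module together.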
Affiliation(s)
- K R Saranya
  - Department of Computer Science & IT, School of Computing, Amrita Vishwa Vidyapeetham, Kochi Campus, India.
- E R Vimina
  - Department of Computer Science & IT, School of Computing, Amrita Vishwa Vidyapeetham, Kochi Campus, India.
- F R Pinto
  - Chemistry and Biochemistry Department, Faculty of Sciences, University of Lisbon, Portugal.
4
Jung S, Wang S, Lee D. CancerGATE: Prediction of cancer-driver genes using graph attention autoencoders. Comput Biol Med 2024; 176:108568. [PMID: 38744009 DOI: 10.1016/j.compbiomed.2024.108568] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 11/07/2023] [Revised: 04/13/2024] [Accepted: 05/05/2024] [Indexed: 05/16/2024]
Abstract
Discovery of cancer type-specific driver genes is important for understanding the molecular mechanisms of each cancer type and for providing proper treatment. Recently, graph deep learning methods have become widely used for finding cancer-driver genes. However, previous methods had limited performance in individual cancer types due to the small number of cancer-driver genes used in training and biases toward those genes. Here, we introduce a novel pipeline, CancerGATE, that predicts cancer-driver genes using a graph attention autoencoder (GATE) that learns in a self-supervised manner and can be applied to each cancer type. CancerGATE utilizes biological network topology and multi-omics data from 20,079 samples across 15 cancer types in The Cancer Genome Atlas (TCGA). Attention coefficients calculated in the model are used to prioritize cancer-driver genes by comparing coefficients of cancer and normal contexts. CancerGATE shows a higher AUPRC, with a difference ranging from 1.5 % to 36.5 %, compared to previous graph deep learning models in each cancer type. We also show that CancerGATE is free from the bias toward cancer-driver genes used in training, revealing mechanisms of the cancer-driver genes in specific cancer types. Finally, we propose novel cancer-driver gene candidates that could be therapeutic targets for specific cancer types.
Affiliation(s)
- Seunghwan Jung
  - Department of Bio and Brain Engineering, KAIST, Daejeon 34141, Republic of Korea.
- Seunghyun Wang
  - Department of Bio and Brain Engineering, KAIST, Daejeon 34141, Republic of Korea.
- Doheon Lee
  - Department of Bio and Brain Engineering, KAIST, Daejeon 34141, Republic of Korea.
5
Zhang Y, Deng Z, Xu X, Feng Y, Junliang S. Application of Artificial Intelligence in Drug-Drug Interactions Prediction: A Review. J Chem Inf Model 2024; 64:2158-2173. [PMID: 37458400 DOI: 10.1021/acs.jcim.3c00582] [Citation(s) in RCA: 3] [Impact Index Per Article: 3.0] [Indexed: 04/09/2024]
Abstract
Drug-drug interactions (DDIs) are a critical aspect of drug research that can have adverse effects on patients and can lead to serious consequences. Predicting these events accurately can significantly improve clinicians' ability to make better decisions and establish optimal treatment regimens. However, manually detecting these interactions is time-consuming and labor-intensive. Utilizing the advancements in Artificial Intelligence (AI) is essential for achieving accurate forecasts of DDIs. In this review, DDI prediction tasks are classified into three types according to the type of DDI prediction: undirected DDI prediction, DDI event prediction, and asymmetric DDI prediction. The paper then reviews the progress of AI for each of these three prediction tasks and provides a summary of the data sets used as well as the representative methods used in these three prediction directions. In this review, we aim to provide a comprehensive overview of drug interaction prediction. The first section introduces commonly used databases and presents an overview of current research advancements and techniques across three domains of DDI. Additionally, we introduce classical machine learning techniques for predicting undirected drug interactions and provide a timeline for the progression of the predicted drug interaction events. Finally, we discuss the difficulties and prospects of AI approaches in predicting DDIs, emphasizing their potential for improving clinical decision-making and patient outcomes.
Affiliation(s)
- Yuanyuan Zhang
  - School of Information and Control Engineering, Qingdao University of Technology, Qingdao, 266000, China
- Zengqian Deng
  - School of Information and Control Engineering, Qingdao University of Technology, Qingdao, 266000, China
- Xiaoyu Xu
  - School of Information and Control Engineering, Qingdao University of Technology, Qingdao, 266000, China
- Yinfei Feng
  - School of Information and Control Engineering, Qingdao University of Technology, Qingdao, 266000, China
- Shang Junliang
  - School of Information Science and Engineering, Qufu Normal University, Rizhao, 276800, China
6
Li C, Ye G, Jiang Y, Wang Z, Yu H, Yang M. Artificial Intelligence in battling infectious diseases: A transformative role. J Med Virol 2024; 96:e29355. [PMID: 38179882 DOI: 10.1002/jmv.29355] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 10/16/2023] [Revised: 12/01/2023] [Accepted: 12/17/2023] [Indexed: 01/06/2024]
Abstract
It is widely acknowledged that infectious diseases have wrought immense havoc on human society and are regarded as adversaries that humanity cannot elude. In recent years, the advancement of Artificial Intelligence (AI) technology has ushered in a revolutionary era in the realm of infectious disease prevention and control. This evolution encompasses early warning of outbreaks, contact tracing, infection diagnosis, drug discovery, and the facilitation of drug design, alongside other facets of epidemic management. This article presents an overview of the utilization of AI systems in the field of infectious diseases, with a specific focus on their role during the COVID-19 pandemic. The article also highlights the contemporary challenges that AI confronts within this domain and proposes strategies for their mitigation. There is a pressing need to further harness the potential applications of AI across multiple domains to augment its capacity to address future disease outbreaks effectively.
Affiliation(s)
- Chunhui Li
  - School of Life Science, Advanced Research Institute of Multidisciplinary Science, Key Laboratory of Molecular Medicine and Biotherapy, Beijing Institute of Technology, Beijing, People's Republic of China
- Guoguo Ye
  - Shenzhen Key Laboratory of Pathogen and Immunity, National Clinical Research Center for Infectious Disease, The Third People's Hospital of Shenzhen, Second Hospital Affiliated to Southern University of Science and Technology, Shenzhen, China
- Yinghan Jiang
  - School of Life Science, Advanced Research Institute of Multidisciplinary Science, Key Laboratory of Molecular Medicine and Biotherapy, Beijing Institute of Technology, Beijing, People's Republic of China
- Zhiming Wang
  - School of Life Science, Advanced Research Institute of Multidisciplinary Science, Key Laboratory of Molecular Medicine and Biotherapy, Beijing Institute of Technology, Beijing, People's Republic of China
- Haiyang Yu
  - Hangzhou Yalla Information Technology Service Co., Ltd., Hangzhou, People's Republic of China
- Minghui Yang
  - School of Life Science, Advanced Research Institute of Multidisciplinary Science, Key Laboratory of Molecular Medicine and Biotherapy, Beijing Institute of Technology, Beijing, People's Republic of China
7
Xie W, Chen X, Zheng Z, Wang F, Zhu X, Lin Q, Sun Y, Wong KC. LncRNA-Top: Controlled deep learning approaches for lncRNA gene regulatory relationship annotations across different platforms. iScience 2023; 26:108197. [PMID: 37965148 PMCID: PMC10641498 DOI: 10.1016/j.isci.2023.108197] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 04/24/2023] [Revised: 08/10/2023] [Accepted: 10/10/2023] [Indexed: 11/16/2023]
Abstract
By soaking microRNAs (miRNAs), long non-coding RNAs (lncRNAs) have the potential to regulate gene expression. Few methods have been created based on this mechanism to anticipate the lncRNA-gene relationship prediction. Hence, we present lncRNA-Top to forecast potential lncRNA-gene regulation relationships. Specifically, we constructed controlled deep-learning methods using 12417 lncRNAs and 16127 genes. We have provided retrospective and innovative views among negative sampling, random seeds, cross-validation, metrics, and independent datasets. The AUC, AUPR, and our defined precision@k were leveraged to evaluate performance. In-depth case studies demonstrate that 47 out of 100 projected top unknown pairings were recorded in publications, supporting the predictive power. Our additional software can annotate the scores with target candidates. The lncRNA-Top will be a helpful tool to uncover prospective lncRNA targets and better comprehend the regulatory processes of lncRNAs.
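The abstract evaluates ranked predictions with a precision@k metric whose exact definition is given only in the paper. As a rough illustration, the sketch below uses the conventional definition (the fraction of true positives among the k highest-scoring candidates), which is an assumption and may differ from the authors' variant. The function name `precision_at_k` and the example scores and labels are hypothetical.

```python
def precision_at_k(scores, labels, k):
    """Fraction of positives among the k highest-scoring candidates.

    Conventional definition, assumed here; the paper defines its own variant.
    """
    if len(scores) != len(labels):
        raise ValueError("scores and labels must have the same length")
    if not 0 < k <= len(scores):
        raise ValueError("k must be between 1 and the number of candidates")
    # Rank candidate indices by descending predicted score.
    ranked = sorted(range(len(scores)), key=lambda i: scores[i], reverse=True)
    top_k = ranked[:k]
    return sum(labels[i] for i in top_k) / k


# Hypothetical scores for five lncRNA-gene pairs; 1 marks a known relationship.
example_scores = [0.9, 0.2, 0.8, 0.4, 0.7]
example_labels = [1, 0, 0, 1, 1]
p_at_3 = precision_at_k(example_scores, example_labels, k=3)
print(round(p_at_3, 3))
```

Unlike AUC or AUPR, this kind of metric looks only at the top of the ranking, which matches how a shortlist of candidate pairs would be taken forward for literature or experimental checks.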
Affiliation(s)
- Weidun Xie
  - Department of Computer Science, City University of Hong Kong, Kowloon Tong, Hong Kong SAR
- Xingjian Chen
  - Department of Computer Science, City University of Hong Kong, Kowloon Tong, Hong Kong SAR
- Zetian Zheng
  - Department of Computer Science, City University of Hong Kong, Kowloon Tong, Hong Kong SAR
- Fuzhou Wang
  - Department of Computer Science, City University of Hong Kong, Kowloon Tong, Hong Kong SAR
- Xiaowei Zhu
  - Department of Neuroscience, Jockey Club College of Veterinary Medicine and Life Sciences, City University of Hong Kong, Kowloon Tong, Hong Kong SAR
- Qiuzhen Lin
  - College of Computer Science and Software Engineering, Shenzhen University, Shenzhen, China
- Yanni Sun
  - Department of Electrical Engineering, City University of Hong Kong, Kowloon Tong, Hong Kong SAR
- Ka-Chun Wong
  - Department of Computer Science, City University of Hong Kong, Kowloon Tong, Hong Kong SAR
  - Shenzhen Research Institute, City University of Hong Kong, Shenzhen, China
  - Hong Kong Institute for Data Science, City University of Hong Kong, Kowloon Tong, Hong Kong SAR
8
Han L, Wang Z, Li C, Fan M, Wang Y, Sun G, Dai G. Functional identification and prediction of lncRNAs in esophageal cancer. Comput Biol Med 2023; 165:107205. [PMID: 37611425 DOI: 10.1016/j.compbiomed.2023.107205] [Citation(s) in RCA: 1] [Impact Index Per Article: 1.0] [Received: 04/26/2023] [Revised: 05/29/2023] [Accepted: 06/25/2023] [Indexed: 08/25/2023]
Abstract
Esophageal cancer is a highly lethal malignancy with poor prognosis, and the identification of molecular biomarkers is crucial for improving diagnosis and treatment. Long non-coding RNAs (lncRNAs) have been shown to play important roles in the development and progression of esophageal cancer. However, due to the time cost of biological experiments, only a small number of lncRNAs related to esophageal cancer have been discovered. Currently, computational methods have emerged as powerful tools for identifying and characterizing lncRNAs, as well as predicting their potential functions. Therefore, this article proposes a transformer-based method for identifying esophageal cancer-related lncRNAs. Experimental results show that the AUC and AUPR of this method are superior to other comparison methods, with an AUC of 0.87 and an AUPR of 0.83, and the identified lncRNA targets are closely associated with esophageal cancer. We focus on the role of esophageal cancer-related lncRNAs in the immune microenvironment, and fully explore the functions of the target genes regulated by lncRNAs. Enrichment analysis shows that the predicted target genes are related to multiple pathways involved in the occurrence, development, and prognosis of esophageal cancer. This not only demonstrates the effectiveness of the method but also indicates the accuracy of the prediction results.
Affiliation(s)
- Lu Han
  - Department of Oncology, The Fifth Medical Center of Chinese PLA General Hospital, Beijing, China; Medical School of Chinese PLA, Beijing, China
- Zhikuan Wang
  - Department of Oncology, The Fifth Medical Center of Chinese PLA General Hospital, Beijing, China
- Congyong Li
  - Medical School of Chinese PLA, Beijing, China; Sixth Health Care Department, The Second Medical Center of PLA General Hospital, Beijing, China
- Mengjiao Fan
  - Department of Oncology, The Fifth Medical Center of Chinese PLA General Hospital, Beijing, China; Medical School of Chinese PLA, Beijing, China
- Yanrong Wang
  - Department of Oncology, The Fifth Medical Center of Chinese PLA General Hospital, Beijing, China; Medical School of Chinese PLA, Beijing, China
- Gang Sun
  - Department of Gastroenterology and Hepatology, The First Medical Center of Chinese PLA General Hospital, Beijing, China.
- Guanghai Dai
  - Department of Oncology, The Fifth Medical Center of Chinese PLA General Hospital, Beijing, China.
9
Sheng QJ, Tan Y, Zhang L, Wu ZP, Wang B, He XY. Heterogeneous graph framework for predicting the association between lncRNA and disease and case on uterine fibroid. Comput Biol Med 2023; 165:107331. [PMID: 37619322 DOI: 10.1016/j.compbiomed.2023.107331] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 07/13/2023] [Revised: 07/24/2023] [Accepted: 08/07/2023] [Indexed: 08/26/2023]
Abstract
Long non-coding RNAs (lncRNAs) play crucial regulatory roles in various cellular processes, including gene expression, chromatin remodeling, and protein localization. Dysregulation of lncRNAs has been linked to several diseases, making it essential to understand their functions in disease mechanisms and therapeutic strategies. However, traditional experimental methods for studying lncRNA function are time-consuming, expensive, and offer limited insights. In recent years, computational methods have emerged as valuable tools for predicting lncRNA functions and their associations with diseases. However, many existing methods focus on constructing separate networks for lncRNA and disease similarity, resulting in information loss and insufficient processing capacity for isolated nodes. To address this, we developed 'RGLD' by combining Random Walk with restarting (RWR), Graph Neural Network (GNN), and Graph Attention Networks (GAT) to predict lncRNA-disease associations in a heterogeneous network. RGLD achieved an impressive AUC of 0.88, outperforming other methods. It can also predict novel associations between lncRNAs and diseases. RGLD identified HOTAIR, MEG3, and PVT1 as lncRNAs associated with uterine fibroids. Biological experiments directly or indirectly verified the involvement of these three lncRNAs in uterine fibroids, validating the accuracy of RGLD's predictions. Furthermore, we extensively discussed the functions of the target genes regulated by these lncRNAs in uterine fibroids, providing evidence for their role in the development and progression of the disease.
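One building block named in this abstract, random walk with restart (RWR), can be sketched in isolation. This is a generic textbook formulation, not the RGLD implementation: it iterates p_next = (1 - r) * W * p + r * p0 on an unweighted, undirected graph until the scores stop changing. The function name, the restart probability of 0.5, and the four-node path graph are all assumptions chosen for illustration.

```python
def random_walk_with_restart(adjacency, seeds, restart=0.5, tol=1e-10, max_iter=1000):
    """Return steady-state visiting probabilities of a walker that jumps
    back to the seed nodes with probability `restart` at every step."""
    nodes = list(adjacency)
    # Restart distribution: uniform over the seed nodes.
    p0 = {n: (1.0 / len(seeds) if n in seeds else 0.0) for n in nodes}
    p = dict(p0)
    for _ in range(max_iter):
        nxt = {n: restart * p0[n] for n in nodes}
        for n in nodes:
            neighbors = adjacency[n]
            if not neighbors:
                continue  # an isolated node passes no probability on
            share = (1.0 - restart) * p[n] / len(neighbors)
            for m in neighbors:
                nxt[m] += share
        change = sum(abs(nxt[n] - p[n]) for n in nodes)
        p = nxt
        if change < tol:
            break
    return p


# Hypothetical path graph A - B - C - D with the walk seeded at A.
path_graph = {"A": ["B"], "B": ["A", "C"], "C": ["B", "D"], "D": ["C"]}
rwr_scores = random_walk_with_restart(path_graph, seeds={"A"})
print({node: round(score, 4) for node, score in rwr_scores.items()})
```

Scores fall off with distance from the seed, which is why RWR is commonly used to propagate known associations to nearby nodes. The `continue` branch also shows the weakness for isolated nodes that the abstract alludes to: a node without edges receives nothing beyond its own restart mass.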
Affiliation(s)
- Qing-Jing Sheng
  - Department of Gynecology, Shanghai First Maternity and Infant Hospital, School of Medicine, Tongji University, Shanghai, China; Shanghai Key Laboratory of Maternal and Fetal Medicine, Shanghai First Maternity and Infant Hospital, Shanghai, China
- Yuan Tan
  - Department of Integrated Traditional Chinese Medicine (TCM) & Western Medicine, Shanghai First Maternity and Infant Hospital, School of Medicine, Tongji University, Shanghai, China; Shanghai Key Laboratory of Maternal and Fetal Medicine, Shanghai First Maternity and Infant Hospital, Shanghai, China
- Liyuan Zhang
  - School of Computer Science and Technology, Harbin Institute of Technology, Harbin, China
- Zhi-Ping Wu
  - Department of Gynecology, Shanghai First Maternity and Infant Hospital, School of Medicine, Tongji University, Shanghai, China; Shanghai Key Laboratory of Maternal and Fetal Medicine, Shanghai First Maternity and Infant Hospital, Shanghai, China
- Beiying Wang
  - Department of Gynecology, Shanghai First Maternity and Infant Hospital, School of Medicine, Tongji University, Shanghai, China; Shanghai Key Laboratory of Maternal and Fetal Medicine, Shanghai First Maternity and Infant Hospital, Shanghai, China
- Xiao-Ying He
  - Department of Gynecology, Shanghai First Maternity and Infant Hospital, School of Medicine, Tongji University, Shanghai, China; Shanghai Key Laboratory of Maternal and Fetal Medicine, Shanghai First Maternity and Infant Hospital, Shanghai, China.
10
Zhang X, Guo H, Zhang F, Wang X, Wu K, Qiu S, Liu B, Wang Y, Hu Y, Li J. HNetGO: protein function prediction via heterogeneous network transformer. Brief Bioinform 2023; 24:bbab556. [PMID: 37861172 PMCID: PMC10588005 DOI: 10.1093/bib/bbab556] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 08/05/2021] [Revised: 11/18/2021] [Accepted: 12/04/2021] [Indexed: 10/21/2023]
Abstract
Protein function annotation is one of the most important research topics for revealing the essence of life at molecular level in the post-genome era. Current research shows that integrating multisource data can effectively improve the performance of protein function prediction models. However, the heavy reliance on complex feature engineering and model integration methods limits the development of existing methods. Besides, models based on deep learning only use labeled data in a certain dataset to extract sequence features, thus ignoring a large amount of existing unlabeled sequence data. Here, we propose an end-to-end protein function annotation model named HNetGO, which innovatively uses heterogeneous network to integrate protein sequence similarity and protein-protein interaction network information and combines the pretraining model to extract the semantic features of the protein sequence. In addition, we design an attention-based graph neural network model, which can effectively extract node-level features from heterogeneous networks and predict protein function by measuring the similarity between protein nodes and gene ontology term nodes. Comparative experiments on the human dataset show that HNetGO achieves state-of-the-art performance on cellular component and molecular function branches.
Affiliation(s)
- Xiaoshuai Zhang
  - School of Computer Science and Technology, Harbin Institute of Technology (Shenzhen), Shenzhen, Guangdong 518055, China
- Huannan Guo
  - General Hospital of Heilongjiang Province Land Reclamation Bureau, Harbin 150086, China
- Fan Zhang
  - Center NHC Key Laboratory of Cell Transplantation, The First Affiliated Hospital of Harbin Medical University, Harbin 150086, China
- Xuan Wang
  - School of Computer Science and Technology, Harbin Institute of Technology (Shenzhen), Shenzhen, Guangdong 518055, China
- Kaitao Wu
  - School of Computer Science and Technology, Harbin Institute of Technology (Shenzhen), Shenzhen, Guangdong 518055, China
- Shizheng Qiu
  - School of Computer Science and Technology, Harbin Institute of Technology, Harbin 150001, China
- Bo Liu
  - School of Computer Science and Technology, Harbin Institute of Technology, Harbin 150001, China
- Yadong Wang
  - School of Computer Science and Technology, Harbin Institute of Technology (Shenzhen), Shenzhen, Guangdong 518055, China
  - School of Computer Science and Technology, Harbin Institute of Technology, Harbin 150001, China
- Yang Hu
  - School of Computer Science and Technology, Harbin Institute of Technology, Harbin 150001, China
- Junyi Li
  - School of Computer Science and Technology, Harbin Institute of Technology (Shenzhen), Shenzhen, Guangdong 518055, China
11
Rehman S, Ahmad Z, Ramakrishnan M, Kalendar R, Zhuge Q. Regulation of plant epigenetic memory in response to cold and heat stress: towards climate resilient agriculture. Funct Integr Genomics 2023; 23:298. [PMID: 37700098 DOI: 10.1007/s10142-023-01219-5] [Citation(s) in RCA: 1] [Impact Index Per Article: 1.0] [Received: 06/03/2023] [Revised: 08/18/2023] [Accepted: 08/23/2023] [Indexed: 09/14/2023]
Abstract
Plants have evolved to adapt and grow in hot and cold climatic conditions. Some also adapt to daily and seasonal temperature changes. Epigenetic modifications play an important role in regulating plant tolerance under such conditions. DNA methylation and post-translational modifications of histone proteins influence gene expression during plant developmental stages and under stress conditions, including cold and heat stress. While short-term modifications are common, some modifications may persist and result in stress memory that can be inherited by subsequent generations. Understanding the mechanisms of epigenomes responding to stress and the factors that trigger stress memory is crucial for developing climate-resilient agriculture, but such an integrated view is currently limited. This review focuses on the plant epigenetic stress memory during cold and heat stress. It also discusses the potential of machine learning to modify stress memory through epigenetics to develop climate-resilient crops.
Collapse
Affiliation(s)
- Shamsur Rehman
- Co-Innovation Center for Sustainable Forestry in Southern China, Key Laboratory of Forest Genetics and Biotechnology, College of Biology and the Environment, Nanjing Forestry University, Ministry of Education, Nanjing, China
| | - Zishan Ahmad
- Co-Innovation Center for Sustainable Forestry in Southern China, Nanjing Forestry University, Nanjing, 210037, China
- Bamboo Research Institute, Nanjing Forestry University, Nanjing, 210037, China
| | - Muthusamy Ramakrishnan
- Co-Innovation Center for Sustainable Forestry in Southern China, Nanjing Forestry University, Nanjing, 210037, China
- Bamboo Research Institute, Nanjing Forestry University, Nanjing, 210037, China
| | - Ruslan Kalendar
- Helsinki Institute of Life Science HiLIFE, Biocenter 3, Viikinkaari 1, FI-00014 University of Helsinki, Helsinki, Finland.
- Center for Life Sciences, National Laboratory Astana, Nazarbayev University, Astana, Kazakhstan.
| | - Qiang Zhuge
- Co-Innovation Center for Sustainable Forestry in Southern China, Key Laboratory of Forest Genetics and Biotechnology, College of Biology and the Environment, Nanjing Forestry University, Ministry of Education, Nanjing, China.
|
12
|
Gao M, Shang X. Identification of associations between lncRNA and drug resistance based on deep learning and attention mechanism. Front Microbiol 2023; 14:1147778. [PMID: 37180267 PMCID: PMC10169643 DOI: 10.3389/fmicb.2023.1147778] [Citation(s) in RCA: 5] [Impact Index Per Article: 5.0] [Received: 01/26/2023] [Accepted: 04/04/2023] [Indexed: 05/16/2023] Open
Abstract
Introduction: Abnormal lncRNA expression can make tumor cells resistant to anticancer drugs, which is a crucial factor in high cancer mortality. Studying the relationship between lncRNAs and drug resistance is therefore necessary. Recently, deep learning has achieved promising results in predicting biomolecular associations. However, to our knowledge, deep learning-based prediction of lncRNA-drug resistance associations has yet to be studied. Methods: Here, we proposed a new computational model, DeepLDA, which used deep neural networks and graph attention mechanisms to learn lncRNA and drug embeddings for predicting potential relationships between lncRNAs and drug resistance. DeepLDA first constructed similarity networks for lncRNAs and drugs using known association information. Subsequently, deep graph neural networks were used to automatically extract features from multiple attributes of lncRNAs and drugs. These features were fed into graph attention networks to learn lncRNA and drug embeddings. Finally, the embeddings were used to predict potential associations between lncRNAs and drug resistance. Results: Experimental results on the given datasets show that DeepLDA outperforms other machine learning-related prediction methods, and that the deep neural network and attention mechanism improve model performance. Discussion: In summary, this study proposes a powerful deep learning model that can effectively predict lncRNA-drug resistance associations and facilitate the development of lncRNA-targeted drugs. DeepLDA is available at https://github.com/meihonggao/DeepLDA.
Affiliation(s)
| | - Xuequn Shang
- School of Computer Science, Northwestern Polytechnical University, Xi'an, China
|
13
|
Wang C, Zou Q, Ju Y, Shi H. Enhancer-FRL: Improved and Robust Identification of Enhancers and Their Activities Using Feature Representation Learning. IEEE/ACM Trans Comput Biol Bioinform 2023; 20:967-975. [PMID: 36063523 DOI: 10.1109/tcbb.2022.3204365] [Citation(s) in RCA: 1] [Impact Index Per Article: 1.0] [Indexed: 05/04/2023]
Abstract
Enhancers are crucial for the precise regulation of gene expression, but enhancer identification and strength prediction are challenging because of their free distribution and the tremendous number of similar fractions in the genome. Although several bioinformatics tools have been developed, shortfalls in these models remain, and their performance needs further improvement. In the present study, a two-layer predictor called Enhancer-FRL was proposed for identifying enhancers (enhancers or nonenhancers) and their activities (strong or weak). More specifically, to build an efficient model, a feature representation learning scheme was applied to generate a 50-dimensional probabilistic vector based on 10 feature encodings and five machine learning algorithms. Subsequently, the multiview probabilistic features were integrated to construct the final prediction model. Compared with single feature-based models, Enhancer-FRL showed significantly improved performance and robustness. Performance assessment on the independent test dataset indicated that the proposed model outperformed state-of-the-art available toolkits. The Enhancer-FRL web server is freely accessible at http://lab.malab.cn/~wangchao/softwares/Enhancer-FRL/. The code and datasets can be downloaded from the web server page or from GitHub at https://github.com/wangchao-malab/Enhancer-FRL/.
|
14
|
Han K, Wang J, Wang Y, Zhang L, Yu M, Xie F, Zheng D, Xu Y, Ding Y, Wan J. A review of methods for predicting DNA N6-methyladenine sites. Brief Bioinform 2023; 24:6887111. [PMID: 36502371 DOI: 10.1093/bib/bbac514] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 08/18/2022] [Revised: 10/07/2022] [Accepted: 10/27/2022] [Indexed: 12/14/2022] Open
Abstract
Deoxyribonucleic acid (DNA) N6-methyladenine (6mA) plays a vital role in various biological processes, and the accurate identification of 6mA sites can provide a more comprehensive understanding of its biological effects. There are several methods for 6mA site prediction. With the continuous development of technology, traditional techniques with high costs and low efficiency are gradually being replaced by computational methods. Widely used computational methods can be divided into two categories: traditional machine learning methods and deep learning methods. We first list some existing experimental methods for identifying 6mA sites, then analyze the general process from sequence input to results in computational methods and review existing model architectures. Finally, we summarize and compare the results to help subsequent researchers choose the most suitable method for their work.
Affiliation(s)
- Ke Han
- School of Computer and Information Engineering, Heilongjiang Provincial Key Laboratory of Electronic Commerce and Information Processing, Harbin University of Commerce, Harbin, 150028, China.,College of Pharmacy, Harbin University of Commerce, Harbin, 150076, China
| | - Jianchun Wang
- School of Computer and Information Engineering, Heilongjiang Provincial Key Laboratory of Electronic Commerce and Information Processing, Harbin University of Commerce, Harbin, 150028, China
| | - Yu Wang
- School of Computer and Information Engineering, Heilongjiang Provincial Key Laboratory of Electronic Commerce and Information Processing, Harbin University of Commerce, Harbin, 150028, China
| | - Lei Zhang
- School of Computer and Information Engineering, Heilongjiang Provincial Key Laboratory of Electronic Commerce and Information Processing, Harbin University of Commerce, Harbin, 150028, China
| | - Mengyao Yu
- School of Computer and Information Engineering, Heilongjiang Provincial Key Laboratory of Electronic Commerce and Information Processing, Harbin University of Commerce, Harbin, 150028, China
| | - Fang Xie
- School of Computer and Information Engineering, Heilongjiang Provincial Key Laboratory of Electronic Commerce and Information Processing, Harbin University of Commerce, Harbin, 150028, China
| | - Dequan Zheng
- School of Computer and Information Engineering, Heilongjiang Provincial Key Laboratory of Electronic Commerce and Information Processing, Harbin University of Commerce, Harbin, 150028, China
| | - Yaoqun Xu
- School of Computer and Information Engineering, Heilongjiang Provincial Key Laboratory of Electronic Commerce and Information Processing, Harbin University of Commerce, Harbin, 150028, China
| | - Yijie Ding
- Yangtze Delta Region Institute (Quzhou), University of Electronic Science and Technology of China, Quzhou, 324000, China
| | - Jie Wan
- Laboratory for Space Environment and Physical Sciences, Harbin Institute of Technology, Harbin, 150001, China
|
15
|
Identification of adaptor proteins using the ANOVA feature selection technique. Methods 2022; 208:42-47. [DOI: 10.1016/j.ymeth.2022.10.008] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 08/28/2022] [Revised: 10/01/2022] [Accepted: 10/24/2022] [Indexed: 11/06/2022] Open
|
16
|
Tang H, Sun L, Huang J, Yang Z, Li C, Zhou X. The mechanism and biomarker function of Cavin-2 in lung ischemia-reperfusion injury. Comput Biol Med 2022; 151:106234. [PMID: 36335812 DOI: 10.1016/j.compbiomed.2022.106234] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 08/31/2022] [Revised: 10/01/2022] [Accepted: 10/22/2022] [Indexed: 12/27/2022]
Abstract
BACKGROUND Lung ischemia-reperfusion injury (LIRI) is one of the most predominant complications of ischemic lung disease. Cavin-2 has emerged as a regulator of a variety of cellular processes, including endocytosis, lipid homeostasis, signal transduction and tumorigenesis, but its function in LIRI is unknown. The purpose of this study was to determine the predictive potential of Cavin-2 in protecting against lung ischemia-reperfusion injury and the corresponding mechanisms. METHODS We found a strong relationship between Cavin-2 and multiple immune-related genes using a deep learning method. To reveal the mechanism of Cavin-2 in LIRI, an LIRI SD rat model was constructed to detect the expression of Cavin-2 in rat lung tissue after LIRI, and the expression of Cavin-2 in lung cell lines was also detected. The expression of IL-6, IL-10 and MDA in cells after Cavin-2 over-expression or knockdown was examined under hypoxic conditions. The expression levels of p-AKT, p-STAT3 and p-ERK1/2 were measured in Cavin-2 over-expressing cells under hypoxic-ischemic conditions, and the corresponding blockers of AKT, STAT3 and ERK1/2 were then given to verify whether these pathways play a protective role in LIRI. RESULTS After hypoxia, the expression of Cavin-2 in rat lung tissues was significantly increased. Cellular activity and IL-10 in Cavin-2 over-expressing cells were significantly higher than in the control group, while IL-6 and MDA were significantly lower; these results were reversed in Cavin-2 knockdown cells. Meanwhile, the phosphorylation levels of AKT, STAT3 and ERK1/2 were significantly increased in Cavin-2 over-expressing cells after hypoxia. When AKT-, STAT3- and ERK1/2-specific blockers were given, the protective effect against LIRI was lost.
CONCLUSIONS Cavin-2 shows biomarker potential in protecting the lung from ischemia-reperfusion injury through the survivor activating factor enhancement (SAFE) and reperfusion injury salvage kinase (RISK) pathways.
Affiliation(s)
- Hexiao Tang
- Department of Thoracic Surgery, Zhongnan Hospital of Wuhan University, Wuhan, Hubei, China
| | - Linao Sun
- Tianjin Medical University, Tianjin, China
| | - Jingyu Huang
- Department of Thoracic Surgery, Zhongnan Hospital of Wuhan University, Wuhan, Hubei, China
| | - Zetian Yang
- Department of Thoracic Surgery, Zhongnan Hospital of Wuhan University, Wuhan, Hubei, China
| | - Changsheng Li
- Department of Thoracic Surgery, Zhongnan Hospital of Wuhan University, Wuhan, Hubei, China.
| | - Xuefeng Zhou
- Department of Thoracic Surgery, Zhongnan Hospital of Wuhan University, Wuhan, Hubei, China.
|
17
|
Yao Y, Lv Y, Tong L, Liang Y, Xi S, Ji B, Zhang G, Li L, Tian G, Tang M, Hu X, Li S, Yang J. ICSDA: a multi-modal deep learning model to predict breast cancer recurrence and metastasis risk by integrating pathological, clinical and gene expression data. Brief Bioinform 2022; 23:6761046. [PMID: 36242564 DOI: 10.1093/bib/bbac448] [Citation(s) in RCA: 9] [Impact Index Per Article: 4.5] [Received: 03/23/2022] [Revised: 07/18/2022] [Accepted: 07/18/2022] [Indexed: 12/14/2022] Open
Abstract
Breast cancer patients often experience recurrence and metastasis after surgery. Predicting the risk of recurrence and metastasis for a breast cancer patient is essential for the development of precision treatment. In this study, we proposed a novel multi-modal deep learning prediction model that integrates hematoxylin & eosin (H&E)-stained histopathological images, clinical information and gene expression data. Specifically, we segmented tumor regions in H&E images into image blocks (256 × 256 pixels) and encoded each image block into a 1D feature vector using a deep neural network. Then, the attention module scored each area of the H&E-stained images and combined image features with clinical and gene expression data to predict the risk of recurrence and metastasis for each patient. To test the model, we downloaded all 196 breast cancer samples from The Cancer Genome Atlas for which clinical, gene expression and H&E information were simultaneously available. The samples were then divided into training and testing sets at a ratio of 7:3, with hierarchical sampling used to keep the sample distributions consistent between the two sets. The multi-modal model achieved an area-under-the-curve value of 0.75 on the testing set, better than models based solely on H&E images, sequencing data or clinical data. This study might have clinical significance in identifying high-risk breast cancer patients, who may benefit from postoperative adjuvant treatment.
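The 7:3 split described in this abstract can be illustrated with a short sketch. It assumes that the "hierarchical sampling" mentioned there corresponds to what is usually called stratified sampling (splitting each class separately so that class proportions match in both sets). The function name `stratified_split` and the 56/140 class breakdown are hypothetical; only the total of 196 samples and the 7:3 ratio come from the abstract.

```python
import random
from collections import defaultdict


def stratified_split(labels, train_fraction=0.7, seed=0):
    """Return (train_idx, test_idx), splitting each class separately."""
    rng = random.Random(seed)
    by_class = defaultdict(list)
    for idx, label in enumerate(labels):
        by_class[label].append(idx)

    train_idx, test_idx = [], []
    for label in sorted(by_class):
        members = by_class[label]
        rng.shuffle(members)                       # random order within the class
        cut = round(len(members) * train_fraction) # per-class training size
        train_idx.extend(members[:cut])
        test_idx.extend(members[cut:])
    return sorted(train_idx), sorted(test_idx)


# Hypothetical labels: 196 samples, 1 = recurrence/metastasis, 0 = none.
# The 56/140 breakdown is made up for illustration.
labels = [1] * 56 + [0] * 140
train_idx, test_idx = stratified_split(labels)
print(len(train_idx), len(test_idx))
```

Splitting per class rather than over the pooled samples is what keeps the positive rate nearly identical in the training and testing sets, which matters with a cohort this small.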
Affiliation(s)
- Yuhua Yao
- School of Mathematics and Statistics, Hainan Normal University, Haikou 570100, China.,Key Laboratory of Data Science and Intelligence Education, Ministry of Education, Hainan Normal University, Haikou, China.,Key Laboratory of Computational Science and Application of Hainan Province, Hainan Normal University, Haikou, China
| | - Yaping Lv
- School of Mathematics and Statistics, Hainan Normal University, Haikou 570100, China.,Geneis Beijing Co., Ltd., Beijing 100102, China
| | - Ling Tong
- Chifeng Municipal Hospital, Chifeng, Inner Mongolia 024000, China
| | - Yuebin Liang
- Geneis Beijing Co., Ltd., Beijing 100102, China.,Qingdao Geneis Institute of Big Data Mining and Precision Medicine, Qingdao 266000, China
| | - Shuxue Xi
- Geneis Beijing Co., Ltd., Beijing 100102, China.,Qingdao Geneis Institute of Big Data Mining and Precision Medicine, Qingdao 266000, China
| | - Binbin Ji
- Geneis Beijing Co., Ltd., Beijing 100102, China.,Qingdao Geneis Institute of Big Data Mining and Precision Medicine, Qingdao 266000, China
| | - Guanglu Zhang
- School of Mathematics and Statistics, Hainan Normal University, Haikou 570100, China
| | - Ling Li
- Basic Courses Department, Zhejiang Shuren University, Hangzhou 310000, China
| | - Geng Tian
- Geneis Beijing Co., Ltd., Beijing 100102, China.,Qingdao Geneis Institute of Big Data Mining and Precision Medicine, Qingdao 266000, China
| | - Min Tang
- School of Life Sciences, Jiangsu University, Zhenjiang, 212013, China
| | - Xiyue Hu
- Dept. of Colorectal Surgery, National Cancer Center/ Cancer Hospital, Chinese Academy of Medical Science, 17 Panjiayuan Nanli, Chaoyang District, Beijing, China, 100021
| | - Shijun Li
- Chifeng Municipal Hospital, Chifeng, Inner Mongolia 024000, China
| | - Jialiang Yang
- Geneis Beijing Co., Ltd., Beijing 100102, China.,Chifeng Municipal Hospital, Chifeng, Inner Mongolia 024000, China.,Qingdao Geneis Institute of Big Data Mining and Precision Medicine, Qingdao 266000, China
|
18
|
Zhang H, Wang Y, Pan Z, Sun X, Mou M, Zhang B, Li Z, Li H, Zhu F. ncRNAInter: a novel strategy based on graph neural network to discover interactions between lncRNA and miRNA. Brief Bioinform 2022; 23:6747810. [PMID: 36198065 DOI: 10.1093/bib/bbac411] [Citation(s) in RCA: 13] [Impact Index Per Article: 6.5] [Received: 06/07/2022] [Revised: 08/04/2022] [Accepted: 08/23/2022] [Indexed: 12/14/2022] Open
Abstract
In recent years, many studies have illustrated the significant role that non-coding RNA (ncRNA) plays in biological activities; lncRNAs, miRNAs and especially their interactions have been shown to affect many biological processes. Some in silico methods have been proposed and applied to identify novel lncRNA-miRNA interactions (LMIs), but their RNA representation and information extraction approaches remain imperfect, which implies that there is still room to improve their performance. Meanwhile, only a few of them are currently accessible, which limits their practical application. The construction of a new tool for LMI prediction is thus imperative for a better understanding of the relevant biological mechanisms. This study proposed a novel method, ncRNAInter, for LMI prediction. A comprehensive strategy for RNA representation and an optimized graph neural network deep learning algorithm were used. ncRNAInter was robust and achieved a Matthews correlation coefficient 26.7% higher than that of existing reputable methods for human LMI prediction. In addition, ncRNAInter proved universally applicable to LMIs from various species and successfully identified novel LMIs associated with various diseases, which further verified its effectiveness and usability. All source code and datasets are freely available at https://github.com/idrblab/ncRNAInter.
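For readers unfamiliar with the metric reported here, the Matthews correlation coefficient (MCC) of a binary classifier can be computed from its confusion matrix as sketched below. The counts are hypothetical and are not taken from the ncRNAInter study; note also that the abstract does not state whether the 26.7% improvement is relative or absolute.

```python
import math


def matthews_corrcoef(tp, tn, fp, fn):
    """MCC from confusion-matrix counts; ranges from -1 to 1."""
    numerator = tp * tn - fp * fn
    denominator = math.sqrt((tp + fp) * (tp + fn) * (tn + fp) * (tn + fn))
    # By convention, return 0 when any marginal total is zero.
    return numerator / denominator if denominator else 0.0


# Hypothetical counts for an interaction / no-interaction classifier.
mcc = matthews_corrcoef(tp=90, tn=80, fp=20, fn=10)
print(round(mcc, 4))
```

Unlike accuracy, MCC uses all four cells of the confusion matrix, so it stays informative when interacting and non-interacting pairs are imbalanced.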
Affiliation(s)
- Hanyu Zhang
- College of Pharmaceutical Sciences, Zhejiang University, Hangzhou 310058, China.,Innovation Institute for Artificial Intelligence in Medicine of Zhejiang University, Alibaba-Zhejiang University Joint Research Center of Future Digital Healthcare, Hangzhou 330110, China
| | - Yunxia Wang
- College of Pharmaceutical Sciences, Zhejiang University, Hangzhou 310058, China
| | - Ziqi Pan
- College of Pharmaceutical Sciences, Zhejiang University, Hangzhou 310058, China
| | - Xiuna Sun
- College of Pharmaceutical Sciences, Zhejiang University, Hangzhou 310058, China
| | - Minjie Mou
- College of Pharmaceutical Sciences, Zhejiang University, Hangzhou 310058, China
| | - Bing Zhang
- Innovation Institute for Artificial Intelligence in Medicine of Zhejiang University, Alibaba-Zhejiang University Joint Research Center of Future Digital Healthcare, Hangzhou 330110, China
| | - Zhaorong Li
- Innovation Institute for Artificial Intelligence in Medicine of Zhejiang University, Alibaba-Zhejiang University Joint Research Center of Future Digital Healthcare, Hangzhou 330110, China
| | - Honglin Li
- School of Computer Science and Technology, East China Normal University, Shanghai 200062, China.,Shanghai Key Laboratory of New Drug Design, East China University of Science and Technology, Shanghai 200237, China
| | - Feng Zhu
- College of Pharmaceutical Sciences, Zhejiang University, Hangzhou 310058, China.,Innovation Institute for Artificial Intelligence in Medicine of Zhejiang University, Alibaba-Zhejiang University Joint Research Center of Future Digital Healthcare, Hangzhou 330110, China
|
19
|
Gao M, Liu S, Qi Y, Guo X, Shang X. GAE-LGA: integration of multi-omics data with graph autoencoders to identify lncRNA-PCG associations. Brief Bioinform 2022; 23:6775590. [PMID: 36305456 DOI: 10.1093/bib/bbac452] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 07/14/2022] [Revised: 09/20/2022] [Accepted: 09/22/2022] [Indexed: 12/14/2022] Open
Abstract
Long non-coding RNAs (lncRNAs) can disrupt the biological functions of protein-coding genes (PCGs) to cause cancer. However, the relationship between lncRNAs and PCGs remains unclear and difficult to predict. Machine learning has achieved a satisfactory performance in association prediction, but to our knowledge, it is currently less used in lncRNA-PCG association prediction. Therefore, we introduce GAE-LGA, a powerful deep learning model with graph autoencoders as components, to recognize potential lncRNA-PCG associations. GAE-LGA jointly explored lncRNA-PCG learning and cross-omics correlation learning for effective lncRNA-PCG association identification. The functional similarity and multi-omics similarity of lncRNAs and PCGs were accumulated and encoded by graph autoencoders to extract feature representations of lncRNAs and PCGs, which were subsequently used for decoding to obtain candidate lncRNA-PCG pairs. Comprehensive evaluation demonstrated that GAE-LGA can successfully capture lncRNA-PCG associations with strong robustness and outperformed other machine learning-based identification methods. Furthermore, multi-omics features were shown to improve the performance of lncRNA-PCG association identification. In conclusion, GAE-LGA can act as an efficient application for lncRNA-PCG association prediction with the following advantages: It fuses multi-omics information into the similarity network, making the feature representation more accurate; it can predict lncRNA-PCG associations for new lncRNAs and identify potential lncRNA-PCG associations with high accuracy.
Affiliation(s)
- Meihong Gao
- School of Computer Science and Engineering, Northwestern Polytechnical University, Xi'an 710072, China
| | - Shuhui Liu
- School of Computer Science and Engineering, Northwestern Polytechnical University, Xi'an 710072, China
| | - Yang Qi
- School of Computer Science and Engineering, Northwestern Polytechnical University, Xi'an 710072, China
| | - Xinpeng Guo
- School of Computer Science and Engineering, Northwestern Polytechnical University, Xi'an 710072, China
| | - Xuequn Shang
- School of Computer Science and Engineering, Northwestern Polytechnical University, Xi'an 710072, China
|
20
|
Lin W, Hu S, Wu Z, Xu Z, Zhong Y, Lv Z, Qiu W, Xiao X. iCancer-Pred: A tool for identifying cancer and its type using DNA methylation. Genomics 2022; 114:110486. [PMID: 36126833 DOI: 10.1016/j.ygeno.2022.110486] [Citation(s) in RCA: 3] [Impact Index Per Article: 1.5] [Received: 03/30/2022] [Revised: 09/11/2022] [Accepted: 09/16/2022] [Indexed: 01/14/2023]
Abstract
DNA methylation is an important epigenetic modification that occurs in the early stages of tumor formation, so finding the relationship between DNA methylation and cancer is of great significance. This paper proposes a novel model, iCancer-Pred, to identify cancer and further classify its type. Datasets of DNA methylation information for 7 cancer types were collected from The Cancer Genome Atlas (TCGA). The coefficient of variation is first used to reduce the number of features, and the elastic net is then applied to select important features. Finally, a fully connected neural network is constructed with the selected features. In predicting seven types of cancer, iCancer-Pred achieved an overall accuracy of over 97% with 5-fold cross-validation. For convenience, a user-friendly web server is available at http://bioinfo.jcu.edu.cn/cancer or http://121.36.221.79/cancer/. The source code is freely available for download at https://github.com/Huerhu/iCancer-Pred.
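The first feature-reduction step named in this abstract, filtering by coefficient of variation (CV, the standard deviation divided by the mean), might look like the sketch below. It assumes that low-variation sites are the ones discarded, which the abstract does not state; the site names, values, threshold and function names are all hypothetical.

```python
import statistics


def coefficient_of_variation(values):
    """Population standard deviation divided by the absolute mean."""
    mean = statistics.fmean(values)
    if mean == 0:
        return 0.0
    return statistics.pstdev(values) / abs(mean)


def filter_by_cv(columns, threshold):
    """Keep features whose CV across samples is at least `threshold`."""
    return [name for name, values in columns.items()
            if coefficient_of_variation(values) >= threshold]


# Hypothetical methylation levels for three sites across four samples.
methylation = {
    "cg_a": [0.10, 0.90, 0.20, 0.80],  # varies strongly between samples
    "cg_b": [0.50, 0.51, 0.49, 0.50],  # nearly constant
    "cg_c": [0.30, 0.60, 0.35, 0.70],
}
kept = filter_by_cv(methylation, threshold=0.2)
print(kept)
```

A cheap unsupervised filter of this kind is typically applied before a supervised selector such as the elastic net, so that the more expensive step runs on far fewer features.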
Affiliation(s)
- Weizhong Lin
- School of Information Engineering, Jingdezhen Ceramic University, Jingdezhen 333000, China.
| | - Siqin Hu
- School of Information Engineering, Jingdezhen Ceramic University, Jingdezhen 333000, China
| | - Zhicheng Wu
- Wuhan Ammunition Life Science & Technology Co., Ltd., Wuhan 430000, China
| | - Zhaochun Xu
- School of Information Engineering, Jingdezhen Ceramic University, Jingdezhen 333000, China
| | - Yu Zhong
- School of Information Engineering, Jingdezhen Ceramic University, Jingdezhen 333000, China
| | - Zhe Lv
- School of Information Engineering, Jingdezhen Ceramic University, Jingdezhen 333000, China
| | - Wangren Qiu
- School of Information Engineering, Jingdezhen Ceramic University, Jingdezhen 333000, China
| | - Xuan Xiao
- School of Information Engineering, Jingdezhen Ceramic University, Jingdezhen 333000, China
|
21
|
Zhang T, Lin Y, He W, Yuan F, Zeng Y, Zhang S. GCN-GENE: A novel method for prediction of coronary heart disease-related genes. Comput Biol Med 2022; 150:105918. [PMID: 36215847 DOI: 10.1016/j.compbiomed.2022.105918] [Citation(s) in RCA: 4] [Impact Index Per Article: 2.0] [Received: 06/13/2022] [Revised: 07/19/2022] [Accepted: 07/30/2022] [Indexed: 11/22/2022]
Abstract
Coronary heart disease is the most common heart disease and can induce myocardial infarction; its causes are closely tied to lifestyle and eating habits. A large number of epidemiological studies at home and abroad show that the incidence of coronary heart disease has an obvious familial tendency. However, little is known about the genetic factors of coronary heart disease. Although genome-wide association analysis and gene knockout experiments have found some genes related to coronary heart disease, a large number of potentially related genes remain undiscovered. Confirming them by biological experiments alone costs too much time and money. Therefore, it is urgent to identify genes related to coronary heart disease on a large scale by computational means, so that biological experimental verification can be targeted. This paper proposes a deep learning method based on biological networks for the identification of coronary heart disease-related genes. We constructed gene interaction networks and extracted gene expression levels in different tissues as features. Using the association information between genes and their expression characteristics, we constructed a model of coronary heart disease-related genes. Through cross-validation, we found that the proposed GCN-GENE achieves an AUC of 0.75 and an AUPR of 0.78, which is more accurate than other methods, making it a reliable method for predicting coronary heart disease-related genes.
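The AUC figure quoted in this abstract has a simple probabilistic reading: it is the chance that a randomly chosen positive example is scored above a randomly chosen negative one. The sketch below computes it directly from that definition; the labels and scores are invented for illustration and have no connection to the GCN-GENE data.

```python
def roc_auc(labels, scores):
    """AUC as the fraction of positive/negative pairs ranked correctly.

    Ties between a positive and a negative score count as half a win.
    """
    positives = [s for y, s in zip(labels, scores) if y == 1]
    negatives = [s for y, s in zip(labels, scores) if y == 0]
    if not positives or not negatives:
        raise ValueError("need at least one positive and one negative example")

    wins = 0.0
    for p in positives:
        for n in negatives:
            if p > n:
                wins += 1.0
            elif p == n:
                wins += 0.5
    return wins / (len(positives) * len(negatives))


# Hypothetical gene labels (1 = disease-related) and model scores.
labels = [1, 1, 1, 0, 0, 0, 0]
scores = [0.9, 0.6, 0.4, 0.7, 0.3, 0.2, 0.1]
auc = roc_auc(labels, scores)
print(round(auc, 4))
```

The pairwise loop is quadratic in the number of examples, which is fine for illustration; rank-based formulas give the same value more efficiently on large gene sets.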
Affiliation(s)
- Tong Zhang
- Department of Cardiology, The Sixth Affiliated Hospital, School of Medicine, South China University of Technology, Guangdong, China.
| | - Yixuan Lin
- Department of Cardiology, The Sixth Affiliated Hospital, School of Medicine, South China University of Technology, Guangdong, China.
| | - Weimin He
- Department of Cardiology, The Sixth Affiliated Hospital, School of Medicine, South China University of Technology, Guangdong, China.
| | - FengXin Yuan
- Department of Cardiology, The Sixth Affiliated Hospital, School of Medicine, South China University of Technology, Guangdong, China.
| | - Yu Zeng
- Department of Cardiology, The Sixth Affiliated Hospital, School of Medicine, South China University of Technology, Guangdong, China.
| | - Shihua Zhang
- College of Life Science and Health, Wuhan University of Science and Technology, Wuhan, China.
|
22
|
Li L, Qiu W, Lin L, Liu J, Shi X, Shi Y. Predicting recurrence and metastasis risk of endometrial carcinoma via prognostic signatures identified from multi-omics data. Front Oncol 2022; 12:982452. [PMID: 36059678 PMCID: PMC9438970 DOI: 10.3389/fonc.2022.982452] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 06/30/2022] [Accepted: 08/03/2022] [Indexed: 11/13/2022] Open
Abstract
Objectives: Endometrial carcinoma (EC) is one of the three major gynecological malignancies, and 15%-20% of patients experience recurrence and metastasis. Although there are many studies on the prognosis of this cancer, the performance of existing models for evaluating the risk of its recurrence and metastasis has yet to be improved. In addition, a comprehensive multi-omics analysis of the prognostic signatures of EC is needed. In this study, we aimed to construct a relatively stable and reliable model for predicting recurrence and metastasis of EC. This will help determine the risk level of patients and choose appropriate adjuvant therapy, thereby avoiding improper treatment and improving the prognosis of patients. Methods: The mRNA, microRNA (miRNA), long non-coding RNA (lncRNA) and copy number variation (CNV) data and clinical information of patients with EC were downloaded from The Cancer Genome Atlas (TCGA). Differential expression analyses were performed between the recurrence or metastasis group and the non-recurrence/metastasis group. Then, we screened potential prognostic markers from each of the four kinds of omics data and established prediction models using three classifiers. Results: We obtained differentially expressed mRNAs, lncRNAs, miRNAs and CNVs between the two groups. According to feature selection scores from the random forest algorithm, 275 CNV features, 50 lncRNA features, 150 miRNA features and 150 mRNA features were selected. The prediction model built from lncRNA features using the random forest method showed the best performance, with an area under the curve of 0.763 and an accuracy of 0.819 under 10-fold cross-validation. Conclusion: We developed a computational model using omics information that is able to predict the recurrence and metastasis risk of EC accurately.
Affiliation(s)
- Ling Li
- Department of Gynecological Oncology Surgery, Fujian Cancer Hospital, Fujian Medical University Cancer Hospital, Fuzhou, China
| | - Wenjing Qiu
- Science System Department, Geneis Beijing Co., Ltd., Beijing, China
| | - Liang Lin
- Department of Gynecological Oncology Surgery, Fujian Cancer Hospital, Fujian Medical University Cancer Hospital, Fuzhou, China
| | - Jinyang Liu
- Science System Department, Geneis Beijing Co., Ltd., Beijing, China
- Qingdao Geneis Institute of Big Data Mining and Precision Medicine, Qingdao, China
| | - Xiaoli Shi
- Science System Department, Geneis Beijing Co., Ltd., Beijing, China
- Qingdao Geneis Institute of Big Data Mining and Precision Medicine, Qingdao, China
- *Correspondence: Yi Shi; Xiaoli Shi
| | - Yi Shi
- Department of Molecular Pathology, Fujian Cancer Hospital, Fujian Medical University Cancer Hospital, Fuzhou, China
- *Correspondence: Yi Shi; Xiaoli Shi
|
23
|
Yang J, Shi X, Wang B, Qiu W, Tian G, Wang X, Wang P, Yang J. Ultrasound Image Classification of Thyroid Nodules Based on Deep Learning. Front Oncol 2022; 12:905955. [PMID: 35912199 PMCID: PMC9335944 DOI: 10.3389/fonc.2022.905955] [Citation(s) in RCA: 4] [Impact Index Per Article: 2.0] [Received: 03/28/2022] [Accepted: 06/22/2022] [Indexed: 11/25/2022] Open
Abstract
A thyroid nodule, which is defined as abnormal growth of thyroid cells, indicates excessive iodine intake, thyroid degeneration, inflammation, and other diseases. Although thyroid nodules are always non-malignant, the malignancy likelihood of a thyroid nodule grows steadily every year. To reduce the burden on doctors and avoid unnecessary fine needle aspiration (FNA) and surgical resection, various studies have sought to diagnose thyroid nodules through deep-learning-based image recognition analysis. In this study, a novel deep learning framework is proposed to accurately distinguish benign from malignant thyroid nodules. A total of 508 ultrasound images were collected from the Third Hospital of Hebei Medical University in China for model training and validation. First, a ResNet18 model pretrained on ImageNet was trained on the ultrasound image dataset, and random sampling of the training dataset was repeated 10 times to avoid accidental errors. The results show that our model performs well: over the 10 runs, the average area under the curve (AUC) is 0.997, the average accuracy is 0.984, the average recall is 0.978, the average precision is 0.939, and the average F1 score is 0.957. Second, Gradient-weighted Class Activation Mapping (Grad-CAM) was used to highlight sensitive regions in an ultrasound image during the learning process. Grad-CAM is able to extract the sensitive regions and analyze their shape features. Based on the results, there are obvious differences between benign and malignant thyroid nodules; therefore, shape features of the sensitive regions are helpful in diagnosis to a great extent. Overall, the proposed model demonstrated the feasibility of employing deep learning and ultrasound images to distinguish benign from malignant thyroid nodules.
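As a quick check on how the averaged metrics in this abstract relate to each other, the F1 score is the harmonic mean of precision and recall. The sketch below plugs in the reported averages (precision 0.939, recall 0.978); the function name is illustrative, and the per-run values are not available in the abstract.

```python
def f1_score(precision, recall):
    """Harmonic mean of precision and recall."""
    if precision + recall == 0:
        return 0.0
    return 2 * precision * recall / (precision + recall)


# Averages reported in the abstract above.
f1_of_means = f1_score(precision=0.939, recall=0.978)
print(round(f1_of_means, 4))
```

The result, about 0.958, is close to but not identical to the reported average F1 of 0.957. That is expected if F1 was computed per run and then averaged, because the harmonic mean of averaged values need not equal the average of per-run harmonic means.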
Affiliation(s)
- Jingya Yang
- School of Electrical & Information Engineering, Anhui University of Technology, Ma’anshan, China
- Scientific System, Geneis Beijing Co., Ltd., Beijing, China
| | - Xiaoli Shi
- Scientific System, Geneis Beijing Co., Ltd., Beijing, China
- Qingdao Geneis Institute of Big Data Mining and Precision Medicine, Qingdao, China
| | - Bing Wang
- School of Electrical & Information Engineering, Anhui University of Technology, Ma’anshan, China
| | - Wenjing Qiu
- School of Electrical & Information Engineering, Anhui University of Technology, Ma’anshan, China
- Scientific System, Geneis Beijing Co., Ltd., Beijing, China
| | - Geng Tian
- Scientific System, Geneis Beijing Co., Ltd., Beijing, China
- Qingdao Geneis Institute of Big Data Mining and Precision Medicine, Qingdao, China
| | - Xudong Wang
- School of Electrical & Information Engineering, Anhui University of Technology, Ma’anshan, China
| | - Peizhen Wang
- School of Electrical & Information Engineering, Anhui University of Technology, Ma’anshan, China
- *Correspondence: Peizhen Wang; Jiasheng Yang
| | - Jiasheng Yang
- School of Electrical & Information Engineering, Anhui University of Technology, Ma’anshan, China
|
24
|
Chen Y, Sun X, Yang J. Prediction of Gastric Cancer-Related Genes Based on the Graph Transformer Network. Front Oncol 2022; 12:902616. [PMID: 35847949 PMCID: PMC9281472 DOI: 10.3389/fonc.2022.902616] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 03/23/2022] [Accepted: 04/26/2022] [Indexed: 02/01/2023] Open
Abstract
Gastric cancer is a complex multifactorial and multistage process that involves structural changes and abnormal expression in a large number of tumor-related genes. Identifying the genes related to gastric cancer can therefore deepen the understanding of its pathogenesis and guide the development of targeted drugs. Traditional methods of discovering gastric cancer-related genes through biological experiments are time-consuming and expensive. In recent years, a large number of computational methods have been developed to identify gastric cancer-related genes. In addition, many experiments show that building a biological network to identify disease-related genes yields higher accuracy than ordinary methods. However, most current computational methods focus on homogeneous networks and cannot encode heterogeneous networks. In this paper, we built a heterogeneous network from a disease similarity network and a gene interaction network. We implemented the graph transformer network (GTN) to encode this heterogeneous network, and applied the deep belief network (DBN) to reduce the dimension of the features. We call this method “DBN-GTN”; it performed best in a comparison with four traditional methods and five similar methods.
|
25
|
Recent Deep Learning Methodology Development for RNA–RNA Interaction Prediction. Symmetry (Basel) 2022. [DOI: 10.3390/sym14071302] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 02/04/2023] Open
Abstract
Genetic regulation of organisms involves complicated RNA–RNA interactions (RRIs) among messenger RNA (mRNA), microRNA (miRNA), and long non-coding RNA (lncRNA). Detecting RRIs is beneficial for discovering biological mechanisms as well as designing new drugs. In recent years, as more and more experimentally verified RNA–RNA interactions have been deposited into databases, statistical machine learning, especially recent deep-learning-based automatic algorithms, has been widely applied to RRI prediction with remarkable success. This paper first gives a brief introduction to the traditional machine learning methods applied to RRI prediction and the benchmark databases used for training the models, and then provides an overview of recent deep learning models for predicting miRNA–mRNA interactions and lncRNA–miRNA interactions.
|
26
|
Xu H, Hu X, Yan X, Zhong W, Yin D, Gai Y. Exploring noncoding RNAs in thyroid cancer using a graph convolutional network approach. Comput Biol Med 2022; 145:105447. [DOI: 10.1016/j.compbiomed.2022.105447] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 01/13/2022] [Revised: 03/20/2022] [Accepted: 03/21/2022] [Indexed: 12/01/2022]
|
27
|
Liu Y, Huang K, Yang Y, Wu Y, Gao W. Prediction of Tumor Mutation Load in Colorectal Cancer Histopathological Images Based on Deep Learning. Front Oncol 2022; 12:906888. [PMID: 35686098 PMCID: PMC9171017 DOI: 10.3389/fonc.2022.906888] [Citation(s) in RCA: 1] [Impact Index Per Article: 0.5] [Received: 03/29/2022] [Accepted: 04/18/2022] [Indexed: 02/05/2023] Open
Abstract
Colorectal cancer (CRC) is one of the most prevalent malignancies, and immunotherapy can be applied to CRC patients of all ages, while its efficacy is uncertain. Tumor mutational burden (TMB) is important for predicting the effect of immunotherapy. Currently, whole-exome sequencing (WES) is a standard method to measure TMB, but it is costly and inefficient. Therefore, it is urgent to explore a method to assess TMB without WES to improve immunotherapy outcomes. In this study, we propose a deep learning method, DeepHE, based on the Residual Network (ResNet) model. On images of tissue, DeepHE can efficiently identify and analyze characteristics of tumor cells in CRC to predict the TMB. In our study, we used ×40 magnification images and grouped them by patients followed by thresholding at the 10th and 20th quantiles, which significantly improves the performance. Also, our model is superior compared with multiple models. In summary, deep learning methods can explore the association between histopathological images and genetic mutations, which will contribute to the precise treatment of CRC patients.
Affiliation(s)
- Yongguang Liu
- Department of Anorectal Surgery, Weifang People’s Hospital, Weifang, China
| | - Kaimei Huang
- Geneis (Beijing) Co., Ltd., Beijing, China
- Qingdao Geneis Institute of Big Data Mining and Precision Medicine, Qingdao, China
| | - Yachao Yang
- School of Mathematics and Statistics, Hainan Normal University, Haikou, China
| | - Yan Wu
- Geneis (Beijing) Co., Ltd., Beijing, China
- Qingdao Geneis Institute of Big Data Mining and Precision Medicine, Qingdao, China
| | - Wei Gao
- Department of Internal Medicine-Oncology, Fujian Cancer Hospital and Fujian Medical University Cancer Hospital, Fuzhou, China
- *Correspondence: Wei Gao
|
28
|
Liang Y, Zhang ZQ, Liu NN, Wu YN, Gu CL, Wang YL. MAGCNSE: predicting lncRNA-disease associations using multi-view attention graph convolutional network and stacking ensemble model. BMC Bioinformatics 2022; 23:189. [PMID: 35590258 PMCID: PMC9118755 DOI: 10.1186/s12859-022-04715-w] [Citation(s) in RCA: 17] [Impact Index Per Article: 8.5] [Received: 03/15/2022] [Accepted: 05/05/2022] [Indexed: 01/02/2023] Open
Abstract
Background Many long non-coding RNAs (lncRNAs) have key roles in different human biologic processes and are closely linked to numerous human diseases, according to cumulative evidence. Predicting potential lncRNA-disease associations can help to detect disease biomarkers and perform disease analysis and prevention. Establishing effective computational methods for lncRNA-disease association prediction is critical.
Results In this paper, we propose a novel model named MAGCNSE to predict underlying lncRNA-disease associations. We first obtain multiple feature matrices from the multi-view similarity graphs of lncRNAs and diseases using a graph convolutional network. Then, weights are adaptively assigned to the different feature matrices of lncRNAs and diseases using the attention mechanism. Next, the final representations of lncRNAs and diseases are acquired by further extracting features from the multi-channel feature matrices using a convolutional neural network. Finally, we employ a stacking ensemble classifier, consisting of multiple traditional machine learning classifiers, to make the final prediction. The results of ablation studies on both the representation learning methods and the classification methods demonstrate the validity of each module. Furthermore, we compare the overall performance of MAGCNSE with that of six other state-of-the-art models; the results show that it outperforms the other methods. Moreover, we verify the effectiveness of using multi-view data of lncRNAs and diseases. Case studies further reveal the outstanding ability of MAGCNSE to identify potential lncRNA-disease associations.
Conclusions The experimental results indicate that MAGCNSE is a useful approach for predicting potential lncRNA-disease associations. Supplementary Information The online version contains supplementary material available at 10.1186/s12859-022-04715-w.
Affiliation(s)
- Ying Liang
- College of Computer and Information Engineering, Jiangxi Agricultural University, Nanchang, China
| | - Ze-Qun Zhang
- College of Computer and Information Engineering, Jiangxi Agricultural University, Nanchang, China
| | - Nian-Nian Liu
- College of Computer and Information Engineering, Jiangxi Agricultural University, Nanchang, China
| | - Ya-Nan Wu
- College of Computer and Information Engineering, Jiangxi Agricultural University, Nanchang, China
| | - Chang-Long Gu
- College of Information Science and Engineering, Hunan University, Changsha, China
| | - Ying-Long Wang
- College of Computer and Information Engineering, Jiangxi Agricultural University, Nanchang, China.
|
29
|
Zhang H, Zou Q, Ju Y, Song C, Chen D. Distance-based support vector machine to predict DNA N6-methyladenine modification. Curr Bioinform 2022. [DOI: 10.2174/1574893617666220404145517] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 11/22/2022]
Abstract
Background:
DNA N6-methyladenine plays an important role in the restriction-modification system to isolate invasion from adventive DNA. The shortcomings of the high time-consumption and high costs of experimental methods have been exposed, and some computational methods have emerged. The support vector machine theory has received extensive attention in the bioinformatics field due to its solid theoretical foundation and many good characteristics.
Objective:
General machine learning methods include an important feature extraction step. This research omits that step and replaces it with an easy-to-obtain sequence distance matrix to obtain better results.
Method:
First sequence alignment technology was used to achieve the similarity matrix. Then a novel transformation turned the similarity matrix into a distance matrix. Next, the similarity-distance matrix is made positive semi-definite so that it can be used in the kernel matrix. Finally, the LIBSVM software was applied to solve the support vector machine.
Results:
Five-fold cross-validation of this model on rice and mouse data achieved excellent accuracy rates of 92.04% and 96.51%, respectively. This shows that the DB-SVM method has obvious advantages over traditional machine learning methods. Meanwhile, this model achieved accuracies of 0.943, 0.982, and 0.818, Matthews correlation coefficients of 0.944, 0.982, and 0.838, and F1 scores of 0.942, 0.982, and 0.840 on the rice, M. musculus, and cross-species genome datasets, respectively.
Conclusion:
These outcomes show that this model outperforms iIM-CNN and csDMA, which are the latest research on DNA 6mA, in the prediction of DNA 6mA modification.
Affiliation(s)
- Haoyu Zhang
- Institute of Fundamental and Frontier Sciences, University of Electronic Science and Technology of China, Chengdu 610051, China
| | - Quan Zou
- Institute of Fundamental and Frontier Sciences, University of Electronic Science and Technology of China, Chengdu 610051, China
| | - Ying Ju
- School of Informatics, Xiamen University, Xiamen 361005, China
| | - Chenggang Song
- Beidahuang Industry Group General Hospital, Harbin 150001, China
| | - Dong Chen
- College of Electrical and Information Engineering, Quzhou University, Quzhou 324000, China
|
30
|
Li C, Su F, Liang Z, Zhang L, Liu F, Fan W, Li Z. Macrophage M1 regulatory diabetic nephropathy is mediated by m6A methylation modification of lncRNA expression. Mol Immunol 2022; 144:16-25. [DOI: 10.1016/j.molimm.2022.02.008] [Citation(s) in RCA: 3] [Impact Index Per Article: 1.5] [Received: 10/26/2021] [Revised: 12/07/2021] [Accepted: 02/07/2022] [Indexed: 12/24/2022]
|
31
|
Online Diagnosis and Classification of CT Images Collected by Internet of Things Using Deep Learning. Computational and Mathematical Methods in Medicine 2022; 2022:5373624. [PMID: 35345522 PMCID: PMC8957435 DOI: 10.1155/2022/5373624] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 01/08/2022] [Revised: 02/08/2022] [Accepted: 02/09/2022] [Indexed: 11/17/2022]
Abstract
Deep learning technology has recently played an important role in image processing, language processing, and feature extraction. In past disease diagnosis, most medical staff fixed the images together for observation and then judged them based on their own work experience. The diagnosis results were subjective, and the process was time-consuming and inefficient. To improve the efficiency of diagnosis, this paper applies the deep learning algorithm to the online diagnosis and classification of CT images. Based on a brief analysis of the current situation of CT image classification, this paper proposes using Internet of things technology to collect CT image information and establishes an Internet of things model for collecting CT images. For image classification and diagnosis, the convolution neural network algorithm is proposed to diagnose and classify CT images, and several factors affecting classification accuracy are examined, including the convolution number and the network layer number. Using hospital brain CT images for simulation analysis, the simulation results confirm the effectiveness of the deep learning algorithm. With the increase of convolution and network layer and the decrease of compensation, the accuracy of image classification will decline. Using the maximum pool method and reducing the step size can improve the classification effect. Using the relu function as the activation function can improve the classification accuracy. When processing large data sets, appropriately adding a network layer can improve classification accuracy. In the diagnosis and analysis of brain CT images, the overall classification accuracy is close to 70%, and in the diagnosis of tumor diseases, the accuracy is higher, up to 80%.
|
32
|
Gogleva A, Polychronopoulos D, Pfeifer M, Poroshin V, Ughetto M, Martin MJ, Thorpe H, Bornot A, Smith PD, Sidders B, Dry JR, Ahdesmäki M, McDermott U, Papa E, Bulusu KC. Knowledge graph-based recommendation framework identifies drivers of resistance in EGFR mutant non-small cell lung cancer. Nat Commun 2022; 13:1667. [PMID: 35351890 PMCID: PMC8964738 DOI: 10.1038/s41467-022-29292-7] [Citation(s) in RCA: 22] [Impact Index Per Article: 11.0] [Received: 08/25/2021] [Accepted: 03/09/2022] [Indexed: 12/25/2022] Open
Abstract
Resistance to EGFR inhibitors (EGFRi) presents a major obstacle in treating non-small cell lung cancer (NSCLC). One of the most exciting new ways to find potential resistance markers involves running functional genetic screens, such as CRISPR, followed by manual triage of significantly enriched genes. This triage process to identify ‘high value’ hits resulting from the CRISPR screen involves manual curation that requires specialized knowledge and can take even experts several months to complete comprehensively. To find key drivers of resistance faster, we build a recommendation system on top of a heterogeneous biomedical knowledge graph integrating pre-clinical, clinical, and literature evidence. The recommender system ranks genes based on trade-offs between diverse types of evidence linking them to potential mechanisms of EGFRi resistance. This unbiased approach identifies 57 resistance markers from >3,000 genes, reducing hit identification time from months to minutes. In addition to reproducing known resistance markers, our method identifies previously unexplored resistance mechanisms that we prospectively validate.
|
33
|
HKAM-MKM: A hybrid kernel alignment maximization-based multiple kernel model for identifying DNA-binding proteins. Comput Biol Med 2022; 145:105395. [PMID: 35334314 DOI: 10.1016/j.compbiomed.2022.105395] [Citation(s) in RCA: 3] [Impact Index Per Article: 1.5] [Received: 01/07/2022] [Revised: 03/08/2022] [Accepted: 03/08/2022] [Indexed: 12/24/2022]
Abstract
The identification of DNA-binding proteins (DBPs) has always been a hot issue in the field of sequence classification. Because experimental identification is very resource-intensive, constructing a computational prediction model is worthwhile. This study developed and evaluated a hybrid kernel alignment maximization-based multiple kernel model (HKAM-MKM) for predicting DBPs. First, we collected two datasets, performed feature extraction on the sequences to obtain six feature groups, and then constructed the corresponding kernels. To ensure the effective utilisation of the base kernels and avoid ignoring the difference between a sample and its neighbours, we proposed local kernel alignment, which calculates the kernel between each sample and its neighbours with that sample as the centre. We combined the global and local kernel alignments into a hybrid kernel alignment model and balanced the relationship between the two through parameters. By maximising the hybrid kernel alignment value, we obtained the weight of each kernel and then linearly combined the kernels using these weights. The fused kernel was then input into a support vector machine for training and prediction. Finally, on the independent test sets PDB186 and PDB2272, we obtained the highest Matthews correlation coefficient (MCC) (0.768 and 0.5962, respectively) and the highest accuracy (87.1% and 78.43%, respectively), which were superior to those of the other predictors. Therefore, HKAM-MKM is an efficient prediction tool for DBPs.
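As an illustration of the kernel-weighting idea in this abstract, the following minimal sketch computes the standard global kernel alignment between a base kernel and the ideal label kernel, then combines kernels linearly with given weights. It is an assumption-laden sketch: the function names are hypothetical, the local alignment term and the optimisation that produces the weights in HKAM-MKM are not reproduced, and the toy kernels are invented for the example.

```python
import math

def frobenius_inner(a, b):
    """Frobenius inner product <A, B>_F of two equally sized square matrices."""
    return sum(x * y for row_a, row_b in zip(a, b) for x, y in zip(row_a, row_b))

def kernel_alignment(k1, k2):
    """Global alignment: <K1, K2>_F / (||K1||_F * ||K2||_F)."""
    return frobenius_inner(k1, k2) / math.sqrt(
        frobenius_inner(k1, k1) * frobenius_inner(k2, k2)
    )

def ideal_kernel(labels):
    """Ideal target kernel y y^T for labels in {-1, +1}."""
    return [[yi * yj for yj in labels] for yi in labels]

def combine_kernels(kernels, weights):
    """Weighted linear combination of base kernels (weights are assumed given)."""
    n = len(kernels[0])
    return [
        [sum(w * k[i][j] for w, k in zip(weights, kernels)) for j in range(n)]
        for i in range(n)
    ]

# Toy example: three samples, two hypothetical base kernels.
labels = [1, 1, -1]
target = ideal_kernel(labels)
identity = [[1.0, 0.0, 0.0], [0.0, 1.0, 0.0], [0.0, 0.0, 1.0]]
fused = combine_kernels([identity, target], [0.5, 0.5])
```

A kernel that matches the label structure has alignment close to 1, so alignment can serve as a score for how much weight a base kernel deserves before the fused kernel is passed to a support vector machine.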
|
34
|
Zhang C, Lu Y, Zang T. CNN-DDI: a learning-based method for predicting drug-drug interactions using convolution neural networks. BMC Bioinformatics 2022; 23:88. [PMID: 35255808 PMCID: PMC8902704 DOI: 10.1186/s12859-022-04612-2] [Citation(s) in RCA: 8] [Impact Index Per Article: 4.0] [Received: 02/10/2022] [Accepted: 02/14/2022] [Indexed: 01/07/2023] Open
Abstract
Background Drug–drug interactions (DDIs) are the reactions between drugs. They fall into three types: synergistic, antagonistic, and no reaction. Predicting DDI-associated events is a rapidly developing technology that is receiving more and more attention and application in drug development and disease diagnosis. In this work, we study not only whether two drugs interact but also the specific interaction type, and we propose a learning-based method that uses convolution neural networks to learn feature representations and predict DDIs.
Results In this paper, we propose a novel algorithm using a CNN architecture, named CNN-DDI, to predict drug–drug interactions. First, we extract feature interactions from drug categories, targets, pathways, and enzymes as feature vectors and employ the Jaccard similarity as the measure of drug similarity. Then, based on this feature representation, we build a new convolution neural network as the DDI predictor.
Conclusion The experimental results indicate that drug categories are effective as a new feature type in the CNN-DDI method, and that using multiple features is more informative and more effective than using a single feature. It can be concluded that CNN-DDI is superior to other existing algorithms on the task of predicting DDIs.
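The Jaccard similarity named in this abstract can be sketched as follows. This is a minimal illustration, not the CNN-DDI implementation: the drug names, target identifiers, and the `similarity_profile` helper are hypothetical, and only one feature type (targets) is shown.

```python
def jaccard_similarity(a, b):
    """Return |a ∩ b| / |a ∪ b| for two sets; two empty sets give 0.0."""
    union = a | b
    if not union:
        return 0.0
    return len(a & b) / len(union)

# Hypothetical drugs, each described by the set of targets it acts on.
drug_targets = {
    "drug_A": {"T1", "T2", "T3"},
    "drug_B": {"T2", "T3", "T4"},
    "drug_C": {"T5"},
}

def similarity_profile(drug, features):
    """Similarity of one drug to every drug (sorted by name), usable as a feature vector."""
    return [
        jaccard_similarity(features[drug], features[other])
        for other in sorted(features)
    ]

profile_a = similarity_profile("drug_A", drug_targets)
```

Here drug_A and drug_B share two of four distinct targets, so their similarity is 0.5; repeating this per feature type would give one similarity vector per drug and feature type as input to a downstream predictor.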
Affiliation(s)
- Chengcheng Zhang
- Department of Computer Science and Technology, Harbin Institute of Technology, Harbin, China
| | - Yao Lu
- General Hospital of Heilongjiang Province Land Reclamation Bureau, Harbin, China
| | - Tianyi Zang
- Department of Computer Science and Technology, Harbin Institute of Technology, Harbin, China.
|
35
|
Peng L, Tan J, Tian X, Zhou L. EnANNDeep: An Ensemble-based lncRNA-protein Interaction Prediction Framework with Adaptive k-Nearest Neighbor Classifier and Deep Models. Interdiscip Sci 2022; 14:209-232. [PMID: 35006529 DOI: 10.1007/s12539-021-00483-y] [Citation(s) in RCA: 12] [Impact Index Per Article: 6.0] [Received: 07/06/2021] [Revised: 09/14/2021] [Accepted: 09/15/2021] [Indexed: 01/08/2023]
Abstract
lncRNA-protein interaction (LPI) prediction can deepen the understanding of many important biological processes. Artificial intelligence methods have reported many possible LPIs. However, most computational techniques were evaluated mainly on one dataset, which may produce prediction bias. More importantly, they were validated only under cross validation on lncRNA-protein pairs and did not consider performance under cross validations on lncRNAs and on proteins, and thus fail to find related proteins/lncRNAs for a new lncRNA/protein. Using an ensemble learning framework (EnANNDeep) composed of an adaptive k-nearest neighbor classifier and deep models, this study focuses on systematically finding underlying linkages between lncRNAs and proteins. First, five LPI-related datasets are arranged. Second, multiple source features are integrated to depict an lncRNA-protein pair. Third, an adaptive k-nearest neighbor classifier, a deep neural network, and a deep forest are each designed to score unknown lncRNA-protein pairs. Finally, interaction probabilities from the three predictors are integrated using a soft voting technique. Compared with five classical LPI identification models (SFPEL, PMDKN, CatBoost, PLIPCOM, and LPI-SKF) under fivefold cross validations on lncRNAs, proteins, and LPIs, EnANNDeep achieves the best average AUCs of 0.8660, 0.8775, and 0.9166, respectively, and the best average AUPRs of 0.8545, 0.8595, and 0.9054, respectively, indicating its superior LPI prediction ability. Case study analyses indicate that SNHG10 may have dense linkage with Q15717. In the ensemble framework, the adaptive k-nearest neighbor classifier can pick the most appropriate k separately for each query lncRNA-protein pair. More importantly, deep models including the deep neural network and deep forest can effectively learn representative features of lncRNAs and proteins.
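The soft voting step mentioned in this abstract can be illustrated with a minimal sketch. It assumes equal weights by default, which the abstract does not specify; the function name, the three scores, and the 0.5 decision threshold are hypothetical values chosen for the example.

```python
def soft_vote(probabilities, weights=None):
    """Weighted average of interaction probabilities from several predictors."""
    if weights is None:
        # Assumption: every predictor counts equally.
        weights = [1.0] * len(probabilities)
    if len(weights) != len(probabilities):
        raise ValueError("one weight is required per predictor")
    return sum(w * p for w, p in zip(weights, probabilities)) / sum(weights)

# Hypothetical scores for one lncRNA-protein pair from three predictors
# (adaptive kNN, deep neural network, deep forest).
scores = [0.8, 0.6, 0.7]
ensemble_score = soft_vote(scores)
predicted_interaction = ensemble_score >= 0.5  # illustrative threshold
```

Averaging probabilities rather than hard class votes lets a confident predictor outweigh two marginal ones, which is the usual motivation for soft over hard voting.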
Affiliation(s)
- Lihong Peng
- School of Computer Science, Hunan University of Technology, Zhuzhou, China
- College of Life Sciences and Chemistry, Hunan University of Technology, Zhuzhou, China
| | - Jingwei Tan
- School of Computer Science, Hunan University of Technology, Zhuzhou, China
| | - Xiongfei Tian
- School of Computer Science, Hunan University of Technology, Zhuzhou, China
| | - Liqian Zhou
- School of Computer Science, Hunan University of Technology, Zhuzhou, China.
|
36
|
Xia Y, Li X, Chen X, Lu C, Yu X. Inferring Retinal Degeneration-Related Genes Based on Xgboost. Front Mol Biosci 2022; 9:843150. [PMID: 35223997 PMCID: PMC8880610 DOI: 10.3389/fmolb.2022.843150] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 12/24/2021] [Accepted: 01/17/2022] [Indexed: 11/13/2022] Open
Abstract
Retinal Degeneration (RD) is an inherited retinal disease characterized by degeneration of rod and cone photoreceptor cells and of retinal pigment epithelial cells. The age of onset and disease progression of RD are related to genes and environment. At present, research has discovered five genes closely related to RD: RHO, PDE6B, MERTK, RLBP1, and RPGR, and researchers have developed corresponding gene therapy methods. Gene therapy uses vectors to transfer therapeutic genes, genetically modify target cells, and correct or replace disease-causing RD genes. Therefore, identifying the pathogenic genes of RD will play an important role in the development of treatment methods for the disease. However, the traditional methods of identifying RD-related genes are mostly based on animal experiments, and currently only a small number of RD-related genes have been identified. With the increase of biological data, Xgboost is proposed in this article to identify RP-related genes. Because Xgboost adds a regularization term to control the complexity of the model, it is suitable for finding true RD-related genes among a large and complex set of genes, and the problem of overfitting can be avoided to some extent. To verify the power of Xgboost to identify RD-related genes, we performed 10-fold cross-validation and compared it with three traditional methods: Random Forest, Back Propagation network, and Support Vector Machine. The accuracy of Xgboost is 99.13%, and its AUC is much higher than those of the other three methods. Therefore, this article can provide technical support for efficient identification of RD-related genes and help researchers gain a deeper understanding of the genetic characteristics of RD.
Affiliation(s)
- Yujie Xia
- Department of Ophthalmology, The First Affiliated Hospital of Guangzhou University of Chinese Medicine, Guangzhou, China
| | - Xiaojie Li
- Department of Ophthalmology, The First Affiliated Hospital of Guangzhou University of Chinese Medicine, Guangzhou, China
| | - Xinlin Chen
- Guangzhou University of Chinese Medicine, Guangzhou, China
| | - Changjin Lu
- Department of Ophthalmology, The First Affiliated Hospital of Guangzhou University of Chinese Medicine, Guangzhou, China
| | - Xiaoyi Yu
- Department of Ophthalmology, The First Affiliated Hospital of Guangzhou University of Chinese Medicine, Guangzhou, China
|
37
|
Li C, Huang J, Tang H, Liu B, Zhou X. Revealing Cavin-2 Gene Function in Lung Based on Multi-Omics Data Analysis Method. Front Cell Dev Biol 2022; 9:827108. [PMID: 35174175 PMCID: PMC8841408 DOI: 10.3389/fcell.2021.827108] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 12/01/2021] [Accepted: 12/15/2021] [Indexed: 11/23/2022] Open
Abstract
Research points out that it is particularly important to comprehensively evaluate immune microenvironmental indicators and gene mutation characteristics to select the best treatment plan. Therefore, exploring the genes relevant to pulmonary injury is an important basis for improving survival. In recent years, with the massive production of omics data, a large number of computational methods have been applied in the field of biomedicine. Most of these computational methods were developed for a certain type of disease or for diseases in general. Algorithms that specifically identify genes associated with pulmonary injury have not yet been developed. To fill this gap, we developed a novel method, named AdaRVM, to identify pulmonary injury-related genes on a large scale. AdaRVM fuses Adaboost and the Relevance Vector Machine (RVM) to achieve fast and high-precision pattern recognition of the genetic mechanism of pulmonary injury. AdaRVM found that the Cavin-2 gene has strong potential to be related to pulmonary injury. As is known, the formation and function of caveolae are mediated by two protein families: Caveolin and Cavin. Many studies have explored the role of Caveolin proteins, but little is still known about Cavin family members. To verify our method and reveal the functions of cavin-2, we integrated data from six genome-wide association studies (GWAS) related to lung function traits, four expression Quantitative Trait Loci (eQTL) datasets, and one methylation Quantitative Trait Loci (mQTL) dataset by Summary data level Mendelian Randomization (SMR). We found a strong relationship between cavin-2 and the canonical signaling pathways ERK1/2, AKT, and STAT3, which are all known to be related to lung injury.
Affiliation(s)
- Changsheng Li
- Department of Thoracic Surgery, Zhongnan Hospital of Wuhan University, Wuhan, China
| | - Jingyu Huang
- Department of Thoracic Surgery, Zhongnan Hospital of Wuhan University, Wuhan, China
| | - Hexiao Tang
- Department of Thoracic Surgery, Zhongnan Hospital of Wuhan University, Wuhan, China
| | - Bing Liu
- Department of Pulmonary and Critical Care Medicine, Zhongnan Hospital of Wuhan University, Wuhan, China
- Wuhan Research Center for Infectious Diseases and Cancer, Chinese Academy of Medical Sciences, Wuhan, China
| | - Xuefeng Zhou
- Department of Thoracic Surgery, Zhongnan Hospital of Wuhan University, Wuhan, China
- *Correspondence: Xuefeng Zhou
|
38
|
Han K, Cao P, Wang Y, Xie F, Ma J, Yu M, Wang J, Xu Y, Zhang Y, Wan J. A Review of Approaches for Predicting Drug–Drug Interactions Based on Machine Learning. Front Pharmacol 2022; 12:814858. [PMID: 35153767 PMCID: PMC8835726 DOI: 10.3389/fphar.2021.814858] [Citation(s) in RCA: 30] [Impact Index Per Article: 15.0] [Received: 11/14/2021] [Accepted: 12/20/2021] [Indexed: 01/01/2023] Open
Abstract
Drug–drug interactions play a vital role in drug research. However, they may also cause adverse reactions in patients, with serious consequences. Manual detection of drug–drug interactions is time-consuming and expensive, so it is urgent to use computer methods to solve the problem. There are two ways for computers to identify drug interactions: one is to identify known drug interactions, and the other is to predict unknown drug interactions. In this paper, we review the research progress of machine learning in predicting unknown drug interactions. Among these methods, the literature-based method is special because it combines the extraction method of DDI and the prediction method of DDI. We first introduce the common databases, then briefly describe each method, and summarize the advantages and disadvantages of some prediction models. Finally, we discuss the challenges and prospects of machine learning methods in predicting drug interactions. This review aims to provide useful guidance for interested researchers to further promote bioinformatics algorithms to predict DDI.
Affiliation(s)
- Ke Han
- Heilongjiang Provincial Key Laboratory of Electronic Commerce and Information Processing, School of Computer and Information Engineering, Harbin University of Commerce, Harbin, China
- College of Pharmacy, Harbin University of Commerce, Harbin, China
- *Correspondence: Ke Han; Jie Wan
- Peigang Cao
- Beidahuang Industry Group General Hospital, Harbin, China
- Yu Wang
- Heilongjiang Provincial Key Laboratory of Electronic Commerce and Information Processing, School of Computer and Information Engineering, Harbin University of Commerce, Harbin, China
- Fang Xie
- Heilongjiang Provincial Key Laboratory of Electronic Commerce and Information Processing, School of Computer and Information Engineering, Harbin University of Commerce, Harbin, China
- Jiaqi Ma
- Heilongjiang Provincial Key Laboratory of Electronic Commerce and Information Processing, School of Computer and Information Engineering, Harbin University of Commerce, Harbin, China
- Mengyao Yu
- Heilongjiang Provincial Key Laboratory of Electronic Commerce and Information Processing, School of Computer and Information Engineering, Harbin University of Commerce, Harbin, China
- Jianchun Wang
- Heilongjiang Provincial Key Laboratory of Electronic Commerce and Information Processing, School of Computer and Information Engineering, Harbin University of Commerce, Harbin, China
- Yaoqun Xu
- Heilongjiang Provincial Key Laboratory of Electronic Commerce and Information Processing, School of Computer and Information Engineering, Harbin University of Commerce, Harbin, China
- Yu Zhang
- Heilongjiang Provincial Key Laboratory of Electronic Commerce and Information Processing, School of Computer and Information Engineering, Harbin University of Commerce, Harbin, China
- Jie Wan
- Laboratory for Space Environment and Physical Sciences, Harbin Institute of Technology, Harbin, China
- *Correspondence: Ke Han; Jie Wan
|
39
|
Ma D, Chen Z, He Z, Huang X. A SNARE Protein Identification Method Based on iLearnPlus to Efficiently Solve the Data Imbalance Problem. Front Genet 2022; 12:818841. [PMID: 35154261 PMCID: PMC8832978 DOI: 10.3389/fgene.2021.818841] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 11/20/2021] [Accepted: 12/14/2021] [Indexed: 11/13/2022]
Abstract
Machine learning has been widely used to solve complex problems in engineering applications and scientific fields, and many machine learning-based methods have achieved good results in different fields. SNAREs are key elements of membrane fusion and required for the fusion process of stable intermediates. They are also associated with the formation of some psychiatric disorders. This study processes the original sequence data with the synthetic minority oversampling technique (SMOTE) to solve the problem of data imbalance and produces the most suitable machine learning model with the iLearnPlus platform for the identification of SNARE proteins. Ultimately, a sensitivity of 66.67%, specificity of 93.63%, accuracy of 91.33%, and MCC of 0.528 were obtained in the cross-validation dataset, and a sensitivity of 66.67%, specificity of 93.63%, accuracy of 91.33%, and MCC of 0.528 were obtained in the independent dataset (the adaptive skip dipeptide composition descriptor was used for feature extraction, and LightGBM with proper parameters was used as the classifier). These results demonstrate that this combination can perform well in the classification of SNARE proteins and is superior to other methods.
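The four figures quoted in this abstract (sensitivity, specificity, accuracy, MCC) all derive from one 2x2 confusion matrix. The sketch below shows the standard definitions; the counts are hypothetical and are not taken from the study.

```python
import math

def binary_metrics(tp, fn, tn, fp):
    """Return (sensitivity, specificity, accuracy, MCC) for a 2x2 confusion matrix."""
    sensitivity = tp / (tp + fn)          # recall on the positive (minority) class
    specificity = tn / (tn + fp)          # recall on the negative class
    accuracy = (tp + tn) / (tp + fn + tn + fp)
    denom = math.sqrt((tp + fp) * (tp + fn) * (tn + fp) * (tn + fn))
    # MCC is conventionally reported as 0 when any marginal total is 0.
    mcc = (tp * tn - fp * fn) / denom if denom else 0.0
    return sensitivity, specificity, accuracy, mcc

# Hypothetical counts for an imbalanced test set (illustrative only).
sens, spec, acc, mcc = binary_metrics(tp=20, fn=10, tn=270, fp=30)
print(f"Sn={sens:.4f} Sp={spec:.4f} Acc={acc:.4f} MCC={mcc:.4f}")
```

On imbalanced data, accuracy is dominated by the majority class, which is why MCC and sensitivity are reported alongside it.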
|
40
|
Zhao Z, Yang W, Zhai Y, Liang Y, Zhao Y. Identify DNA-Binding Proteins Through the Extreme Gradient Boosting Algorithm. Front Genet 2022; 12:821996. [PMID: 35154264 PMCID: PMC8837382 DOI: 10.3389/fgene.2021.821996] [Citation(s) in RCA: 6] [Impact Index Per Article: 3.0] [Received: 11/25/2021] [Accepted: 12/07/2021] [Indexed: 12/13/2022]
Abstract
The exploration of DNA-binding proteins (DBPs) is an important aspect of studying biological life activities. Research on life activities requires the support of scientific research results on DBPs. The decline in many life activities is closely related to DBPs. Generally, the detection method for identifying DBPs is achieved through biochemical experiments. This method is inefficient and requires considerable manpower, material resources and time. At present, several computational approaches have been developed to detect DBPs, among which machine learning (ML) algorithm-based computational techniques have shown excellent performance. In our experiments, our method uses fewer features and simpler recognition methods than other methods and simultaneously obtains satisfactory results. First, we use six feature extraction methods to extract sequence features from the same group of DBPs. Then, this feature information is spliced together, and the data are standardized. Finally, the extreme gradient boosting (XGBoost) model is used to construct an effective predictive model. Compared with other excellent methods, our proposed method has achieved better results. The accuracy achieved by our method is 78.26% for PDB2272 and 85.48% for PDB186. The accuracy of the experimental results achieved by our strategy is similar to that of previous detection methods.
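The "splice, then standardize" step described in this abstract can be sketched as follows. This is a minimal illustration using z-score standardization, which is an assumption (the abstract does not say which standardization was used); the function names and toy feature values are hypothetical.

```python
import statistics

def concatenate_features(*feature_sets):
    """Join per-sample feature vectors from several extractors end to end.

    Each argument is a list of rows, one row per protein, in the same order.
    """
    return [sum((list(row) for row in rows), []) for rows in zip(*feature_sets)]

def standardize(matrix):
    """Z-score each column: subtract the column mean, divide by the column std."""
    columns = list(zip(*matrix))
    means = [statistics.fmean(c) for c in columns]
    # Constant columns have std 0; dividing by 1.0 leaves them at 0 after centering.
    stds = [statistics.pstdev(c) or 1.0 for c in columns]
    return [[(v - m) / s for v, m, s in zip(row, means, stds)] for row in matrix]

# Two hypothetical extractors applied to the same two sequences.
extractor_a = [[1.0, 2.0], [3.0, 4.0]]
extractor_b = [[10.0], [20.0]]
features = standardize(concatenate_features(extractor_a, extractor_b))
print(features)
```

The standardized matrix would then be passed to a gradient-boosting classifier; that step is omitted here because it depends on an external library.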
Affiliation(s)
- Ziye Zhao
- College of Information and Computer Engineering, Northeast Forestry University, Harbin, China
- Wen Yang
- International Medical Center, Shenzhen University General Hospital, Shenzhen, China
- Yixiao Zhai
- College of Information and Computer Engineering, Northeast Forestry University, Harbin, China
- Yingjian Liang
- Department of Obstetrics and Gynecology, The First Affiliated Hospital of Harbin Medical University, Harbin, China
- *Correspondence: Yingjian Liang; Yuming Zhao
- Yuming Zhao
- College of Information and Computer Engineering, Northeast Forestry University, Harbin, China
- *Correspondence: Yingjian Liang; Yuming Zhao
|
41
|
Li J, Yang Z, Wang D, Li Z. WAFNRLTG: A Novel Model for Predicting LncRNA Target Genes Based on Weighted Average Fusion Network Representation Learning Method. Front Cell Dev Biol 2022; 9:820342. [PMID: 35127729 PMCID: PMC8807548 DOI: 10.3389/fcell.2021.820342] [Citation(s) in RCA: 1] [Impact Index Per Article: 0.5] [Received: 11/22/2021] [Accepted: 12/14/2021] [Indexed: 11/29/2022]
Abstract
Long non-coding RNAs (lncRNAs) do not encode proteins, yet they have been well established to be involved in complex regulatory functions, and lncRNA regulatory dysfunction can lead to a variety of human complex diseases. LncRNAs mostly exert their functions by regulating the expressions of target genes, and accurate prediction of potential lncRNA target genes would be helpful to further understanding the functional annotations of lncRNAs. Considering the limitations in traditional computational methods for predicting lncRNA target genes, a novel model which was named Weighted Average Fusion Network Representation learning for predicting LncRNA Target Genes (WAFNRLTG) was proposed. First, a novel heterogeneous network was constructed by integrating lncRNA sequence similarity network, mRNA sequence similarity network, lncRNA-mRNA interaction network, lncRNA-miRNA interaction network and mRNA-miRNA interaction network. Next, four popular network representation learning methods were utilized to gain the representation vectors of lncRNA and mRNA nodes. Then, the representations of lncRNAs and target genes in the heterogeneous network were obtained with the weighted average fusion network representation learning method. Finally, we merged the representations of lncRNAs and related target genes to form lncRNA-gene pairs, trained the XGBoost classifier and predicted potential lncRNA target genes. In five-cross validations on the training and independent datasets, the experimental results demonstrated that WAFNRLTG obtained better AUC scores (0.9410, 0.9350) and AUPR scores (0.9391, 0.9350). Moreover, case studies of three common lncRNAs were performed for predicting their potential lncRNA target genes and the results confirmed the effectiveness of WAFNRLTG. The source codes and all data of WAFNRLTG can be freely downloaded at https://github.com/HGDYZW/WAFNRLTG.
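The fusion and pairing steps in this abstract can be illustrated with a small sketch. It assumes that each representation method yields a vector of the same length for a node and that the fusion weights are supplied by the user; the abstract does not state how WAFNRLTG chooses its weights, so the values below are hypothetical.

```python
def weighted_average_fusion(embeddings, weights):
    """Fuse several embeddings of one node into a single vector.

    embeddings: equal-length vectors, one per representation learning method.
    weights: one non-negative weight per method; normalized here to sum to 1.
    """
    if len(embeddings) != len(weights):
        raise ValueError("need exactly one weight per embedding")
    total = sum(weights)
    dim = len(embeddings[0])
    return [
        sum(w * e[i] for w, e in zip(weights, embeddings)) / total
        for i in range(dim)
    ]

def pair_features(lncrna_vec, gene_vec):
    """Concatenate the two fused vectors to describe one lncRNA-gene pair."""
    return list(lncrna_vec) + list(gene_vec)

# Hypothetical 2-dimensional embeddings from two methods, weighted 1:3.
lnc = weighted_average_fusion([[1.0, 0.0], [3.0, 2.0]], weights=[1.0, 3.0])
gene = weighted_average_fusion([[0.0, 4.0], [4.0, 0.0]], weights=[1.0, 1.0])
print(pair_features(lnc, gene))
```

Each concatenated pair vector, labelled as interacting or not, would form one training row for the classifier.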
Affiliation(s)
- Jianwei Li
- School of Artificial Intelligence, Institute of Computational Medicine, Hebei University of Technology, Tianjin, China
- Hebei Province Key Laboratory of Big Data Calculation, Hebei University of Technology, Tianjin, China
- *Correspondence: Jianwei Li
- Zhenwu Yang
- School of Artificial Intelligence, Institute of Computational Medicine, Hebei University of Technology, Tianjin, China
- Duanyang Wang
- School of Artificial Intelligence, Institute of Computational Medicine, Hebei University of Technology, Tianjin, China
- Zhiguang Li
- School of Artificial Intelligence, Institute of Computational Medicine, Hebei University of Technology, Tianjin, China
|
42
|
Chen Q, Zhang J, Bao B, Zhang F, Zhou J. Large-Scale Gastric Cancer Susceptibility Gene Identification Based on Gradient Boosting Decision Tree. Front Mol Biosci 2022; 8:815243. [PMID: 35096975 PMCID: PMC8793069 DOI: 10.3389/fmolb.2021.815243] [Citation(s) in RCA: 1] [Impact Index Per Article: 0.5] [Received: 11/15/2021] [Accepted: 12/06/2021] [Indexed: 01/21/2023]
Abstract
The early clinical symptoms of gastric cancer are not obvious, and metastasis may have occurred at the time of treatment. Poor prognosis is one of the important reasons for the high mortality of gastric cancer. Therefore, the identification of gastric cancer-related genes can be used as relevant markers for diagnosis and treatment to improve diagnosis precision and guide personalized treatment. In order to further reveal the pathogenesis of gastric cancer at the gene level, we proposed a method based on Gradient Boosting Decision Tree (GBDT) to identify the susceptible genes of gastric cancer through gene interaction network. Based on the known genes related to gastric cancer, we collected more genes which can interact with them and constructed a gene interaction network. Random Walk was used to extract network association of each gene and we used GBDT to identify the gastric cancer-related genes. To verify the AUC and AUPR of our algorithm, we implemented 10-fold cross-validation. GBDT achieved AUC as 0.89 and AUPR as 0.81. We selected four other methods to compare with GBDT and found GBDT performed best.
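One common way to turn a gene interaction network into per-gene association scores is random walk with restart from the known disease genes. The abstract says only "Random Walk", so the restart variant, the restart probability, and the toy network below are all assumptions made for illustration.

```python
def random_walk_with_restart(adjacency, seeds, restart=0.3, tol=1e-10, max_iter=1000):
    """Return the steady-state visiting probability of every node.

    adjacency: dict mapping node -> list of neighbours (undirected, no isolated nodes).
    seeds: known disease genes where the walker restarts.
    """
    nodes = sorted(adjacency)
    start = {n: (1.0 / len(seeds) if n in seeds else 0.0) for n in nodes}
    prob = dict(start)
    for _ in range(max_iter):
        # With probability `restart`, jump back to a seed gene.
        nxt = {n: restart * start[n] for n in nodes}
        # Otherwise, step to a uniformly chosen neighbour.
        for n in nodes:
            share = (1.0 - restart) * prob[n] / len(adjacency[n])
            for m in adjacency[n]:
                nxt[m] += share
        if max(abs(nxt[n] - prob[n]) for n in nodes) < tol:
            return nxt
        prob = nxt
    return prob

# Hypothetical path network A - B - C - D with A as the only known disease gene.
network = {"A": ["B"], "B": ["A", "C"], "C": ["B", "D"], "D": ["C"]}
scores = random_walk_with_restart(network, seeds={"A"})
print({gene: round(score, 4) for gene, score in scores.items()})
```

Genes closer to the seed set receive higher scores; such scores could serve as the network features fed to a GBDT classifier.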
Affiliation(s)
- Qing Chen
- Department of Hepatobiliary Surgery, Union Hospital, Tongji Medical College, Huazhong University of Science and Technology, Wuhan, China
- Ji Zhang
- Department of Hepatobiliary Surgery, Union Hospital, Tongji Medical College, Huazhong University of Science and Technology, Wuhan, China
- Banghe Bao
- Department of Pathology, Union Hospital, Tongji Medical College, Huazhong University of Science and Technology, Wuhan, China
- Fan Zhang
- Wuhan Asia General Hospital, Wuhan, China
- Jie Zhou
- Department of Biochemistry and Molecular Biology, Tongji Medical College, Huazhong University of Science and Technology, Wuhan, China
- *Correspondence: Jie Zhou
|
43
|
Zhang Z, Gong Y, Gao B, Li H, Gao W, Zhao Y, Dong B. SNAREs-SAP: SNARE Proteins Identification With PSSM Profiles. Front Genet 2022; 12:809001. [PMID: 34987554 PMCID: PMC8721734 DOI: 10.3389/fgene.2021.809001] [Citation(s) in RCA: 1] [Impact Index Per Article: 0.5] [Received: 11/04/2021] [Accepted: 11/15/2021] [Indexed: 12/20/2022]
Abstract
Soluble N-ethylmaleimide sensitive factor activating protein receptor (SNARE) proteins are a large family of transmembrane proteins located in organelles and vesicles. The important roles of SNARE proteins include initiating the vesicle fusion process and activating and fusing proteins as they undergo exocytosis activity, and SNARE proteins are also vital for the transport regulation of membrane proteins and non-regulatory vesicles. Therefore, there is great significance in establishing a method to efficiently identify SNARE proteins. However, the identification accuracy of existing methods such as SNARE CNN is not satisfactory. In our study, we developed a method based on a support vector machine (SVM) that can effectively recognize SNARE proteins. We used the position-specific scoring matrix (PSSM) method to extract features of SNARE protein sequences, used the support vector machine recursive feature elimination with correlation bias reduction (SVM-RFE-CBR) algorithm to rank the importance of features, and then screened out the optimal feature subset based on the ranking. We trained the model on the selected features with 10-fold cross-validation and tested its performance on an independent dataset. In independent tests, our method identified SNARE proteins with a sensitivity of 68%, specificity of 94%, accuracy of 92%, area under the curve (AUC) of 84%, and Matthews correlation coefficient (MCC) of 0.48. The experimental results show that our method scores well on the common evaluation indicators, indicating that it performs better than other existing classification methods in identifying SNARE proteins.
Affiliation(s)
- Zixiao Zhang
- College of Information and Computer Engineering, Northeast Forestry University, Harbin, China
- Yue Gong
- College of Information and Computer Engineering, Northeast Forestry University, Harbin, China
- Bo Gao
- Department of Radiology, The Second Affiliated Hospital, Harbin Medical University, Harbin, China
- Hongfei Li
- College of Information and Computer Engineering, Northeast Forestry University, Harbin, China
- Wentao Gao
- College of Information and Computer Engineering, Northeast Forestry University, Harbin, China
- Yuming Zhao
- College of Information and Computer Engineering, Northeast Forestry University, Harbin, China
- Benzhi Dong
- College of Information and Computer Engineering, Northeast Forestry University, Harbin, China
|
44
|
Wang L, Zhong C. gGATLDA: lncRNA-disease association prediction based on graph-level graph attention network. BMC Bioinformatics 2022; 23:11. [PMID: 34983363 PMCID: PMC8729153 DOI: 10.1186/s12859-021-04548-z] [Citation(s) in RCA: 13] [Impact Index Per Article: 6.5] [Received: 09/18/2021] [Accepted: 12/21/2021] [Indexed: 01/20/2023]
Abstract
Background: Long non-coding RNAs (lncRNAs) are related to human diseases by regulating gene expression. Identifying lncRNA-disease associations (LDAs) will contribute to the diagnosis, treatment, and prognosis of diseases. However, the identification of LDAs by biological experiments is time-consuming, costly and inefficient. Therefore, the development of efficient and high-accuracy computational methods for predicting LDAs is of great significance. Results: In this paper, we propose a novel computational method (gGATLDA) to predict LDAs based on a graph-level graph attention network. Firstly, we extract the enclosing subgraph of each lncRNA-disease pair. Secondly, we construct the feature vectors by integrating lncRNA similarity and disease similarity as node attributes in subgraphs. Finally, we train a graph neural network (GNN) model by feeding the subgraphs and feature vectors to it, and use the trained GNN model to predict lncRNA-disease potential association scores. The experimental results show that our method can achieve higher area under the receiver operating characteristic curve (AUC), area under the precision-recall curve (AUPR), accuracy and F1-score than the state-of-the-art methods in five-fold cross-validation. Case studies show that our method can effectively identify lncRNAs associated with breast cancer, gastric cancer, prostate cancer, and renal cancer. Conclusion: The experimental results indicate that our method is a useful approach for predicting potential LDAs.
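The first step of the method, extracting the enclosing subgraph of a lncRNA-disease pair, can be sketched with a breadth-first search. The hop count `k`, the node names, and the toy association graph are hypothetical; the sketch also does not remove the target edge from the subgraph, which link-prediction pipelines often do before training.

```python
from collections import deque

def k_hop_neighbors(graph, start, k):
    """Return every node within k hops of `start`, including `start` itself."""
    seen = {start}
    frontier = deque([(start, 0)])
    while frontier:
        node, depth = frontier.popleft()
        if depth == k:
            continue  # do not expand beyond k hops
        for nbr in graph.get(node, ()):
            if nbr not in seen:
                seen.add(nbr)
                frontier.append((nbr, depth + 1))
    return seen

def enclosing_subgraph(graph, lncrna, disease, k=1):
    """Nodes and edges induced by the union of both k-hop neighbourhoods."""
    nodes = k_hop_neighbors(graph, lncrna, k) | k_hop_neighbors(graph, disease, k)
    edges = {
        tuple(sorted((u, v)))
        for u in nodes
        for v in graph.get(u, ())
        if v in nodes
    }
    return nodes, edges

# Hypothetical known associations between lncRNAs (L*) and diseases (D*).
graph = {
    "L1": {"D1", "D2"}, "L2": {"D2"}, "L3": {"D3"},
    "D1": {"L1"}, "D2": {"L1", "L2"}, "D3": {"L3"},
}
nodes, edges = enclosing_subgraph(graph, "L2", "D1", k=1)
print(sorted(nodes), sorted(edges))
```

Each extracted subgraph, with similarity-based node attributes attached, would be one input sample for the graph-level classifier.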
Affiliation(s)
- Li Wang
- School of Computer Science and Engineering, South China University of Technology, Guangzhou, China
- School of Computer, Electronics and Information, Guangxi University, Nanning, China
- Cheng Zhong
- School of Computer, Electronics and Information, Guangxi University, Nanning, China
- Key Laboratory of Parallel and Distributed Computing in Guangxi Colleges and Universities, Guangxi University, Nanning, China
|
45
|
Han S, Wang N, Guo Y, Tang F, Xu L, Ju Y, Shi L. Application of Sparse Representation in Bioinformatics. Front Genet 2021; 12:810875. [PMID: 34976030 PMCID: PMC8715914 DOI: 10.3389/fgene.2021.810875] [Citation(s) in RCA: 2] [Impact Index Per Article: 0.7] [Received: 11/08/2021] [Accepted: 12/01/2021] [Indexed: 11/15/2022]
Abstract
Inspired by L1-norm minimization methods, such as basis pursuit, compressed sensing, and Lasso feature selection, in recent years, sparse representation shows up as a novel and potent data processing method and displays powerful superiority. Researchers have not only extended the sparse representation of a signal to image presentation, but also applied the sparsity of vectors to that of matrices. Moreover, sparse representation has been applied to pattern recognition with good results. Because of its multiple advantages, such as insensitivity to noise, strong robustness, less sensitivity to selected features, and no “overfitting” phenomenon, the application of sparse representation in bioinformatics should be studied further. This article reviews the development of sparse representation, and explains its applications in bioinformatics, namely the use of low-rank representation matrices to identify and study cancer molecules, low-rank sparse representations to analyze and process gene expression profiles, and an introduction to related cancers and gene expression profile database.
Affiliation(s)
- Shuguang Han
- Yangtze Delta Region Institute (Quzhou), University of Electronic Science and Technology of China, Quzhou, China
- Ning Wang
- Beidahuang Industry Group General Hospital, Harbin, China
- Yuxin Guo
- Yangtze Delta Region Institute (Quzhou), University of Electronic Science and Technology of China, Quzhou, China
- School of Mathematics and Statistics, Hainan Normal University, Haikou, China
- Furong Tang
- Yangtze Delta Region Institute (Quzhou), University of Electronic Science and Technology of China, Quzhou, China
- School of Electronic and Communication Engineering, Shenzhen Polytechnic, Shenzhen, China
- Lei Xu
- School of Electronic and Communication Engineering, Shenzhen Polytechnic, Shenzhen, China
- Ying Ju
- School of Informatics, Xiamen University, Xiamen, China
- *Correspondence: Ying Ju; Lei Shi
- Lei Shi
- Department of Spine Surgery, Changzheng Hospital, Naval Medical University, Shanghai, China
- *Correspondence: Ying Ju; Lei Shi
|
46
|
Cheng N, Cui X, Chen C, Li C, Huang J. Exploration of Lung Cancer-Related Genetic Factors via Mendelian Randomization Method Based on Genomic and Transcriptomic Summarized Data. Front Cell Dev Biol 2021; 9:800756. [PMID: 34938740 PMCID: PMC8686495 DOI: 10.3389/fcell.2021.800756] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 10/24/2021] [Accepted: 11/22/2021] [Indexed: 12/24/2022]
Abstract
Lung carcinoma is one of the deadliest malignant tumors in humans. With the rising incidence of lung cancer, the search for highly effective cures is becoming increasingly imperative. There is sufficient research evidence that living habits and conditions such as smoking and air pollution are associated with an increased risk of lung cancer. At the same time, the influence of individual genetic susceptibility on lung carcinoma morbidity has been confirmed, and a growing body of evidence has accumulated on the relationship between various risk factors and the risk of different pathological types of lung cancer. Additionally, analyses of many large-scale cancer registries have shown a degree of familial aggregation of lung cancer. To explore lung cancer-related genetic factors, Genome-Wide Association Studies (GWAS) have been used to identify several lung cancer susceptibility sites, which have been widely validated. However, the biological mechanism behind the impact of these site mutations on lung cancer remains unclear. Therefore, this study applied the Summary data-based Mendelian Randomization (SMR) model, integrating two GWAS datasets and four expression Quantitative Trait Loci (eQTL) datasets, to identify susceptibility genes. Using this strategy, we found ten Single Nucleotide Polymorphism (SNP) sites that affect the occurrence and development of lung tumors by regulating the expression of seven genes. Further analysis of the signaling pathways of these genes not only provides important clues to explain the pathogenesis of lung cancer but also has critical significance for its diagnosis and treatment.
Affiliation(s)
- Nitao Cheng
- Department of Thoracic Surgery, Zhongnan Hospital, Wuhan University, Wuhan, China
- Xinran Cui
- School of Computer Science and Technology, Harbin Institute of Technology, Harbin, China
- Chen Chen
- Department of Biological Repositories, Zhongnan Hospital, Wuhan University, Wuhan, China
- Changsheng Li
- Department of Thoracic Surgery, Zhongnan Hospital, Wuhan University, Wuhan, China
- Jingyu Huang
- Department of Thoracic Surgery, Zhongnan Hospital, Wuhan University, Wuhan, China
|
47
|
Shang J, Sun Y. Predicting the hosts of prokaryotic viruses using GCN-based semi-supervised learning. BMC Biol 2021; 19:250. [PMID: 34819064 PMCID: PMC8611875 DOI: 10.1186/s12915-021-01180-4] [Citation(s) in RCA: 20] [Impact Index Per Article: 6.7] [Received: 06/07/2021] [Accepted: 10/29/2021] [Indexed: 11/23/2022]
Abstract
Background Prokaryotic viruses, which infect bacteria and archaea, are the most abundant and diverse biological entities in the biosphere. To understand their regulatory roles in various ecosystems and to harness the potential of bacteriophages for use in therapy, more knowledge of viral-host relationships is required. High-throughput sequencing and its application to the microbiome have offered new opportunities for computational approaches for predicting which hosts particular viruses can infect. However, there are two main challenges for computational host prediction. First, the empirically known virus-host relationships are very limited. Second, although sequence similarity between viruses and their prokaryote hosts have been used as a major feature for host prediction, the alignment is either missing or ambiguous in many cases. Thus, there is still a need to improve the accuracy of host prediction. Results In this work, we present a semi-supervised learning model, named HostG, to conduct host prediction for novel viruses. We construct a knowledge graph by utilizing both virus-virus protein similarity and virus-host DNA sequence similarity. Then graph convolutional network (GCN) is adopted to exploit viruses with or without known hosts in training to enhance the learning ability. During the GCN training, we minimize the expected calibrated error (ECE) to ensure the confidence of the predictions. We tested HostG on both simulated and real sequencing data and compared its performance with other state-of-the-art methods specifically designed for virus host classification (VHM-net, WIsH, PHP, HoPhage, RaFAH, vHULK, and VPF-Class). Conclusion HostG outperforms other popular methods, demonstrating the efficacy of using a GCN-based semi-supervised learning approach. A particular advantage of HostG is its ability to predict hosts from new taxa. Supplementary Information The online version contains supplementary material available at (10.1186/s12915-021-01180-4).
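The calibration measure mentioned in this abstract is usually computed by binning predictions by confidence and comparing each bin's accuracy with its mean confidence. The sketch below evaluates that quantity on a finished set of predictions using equal-width bins; it does not reproduce how HostG minimizes the error during GCN training, and the bin count and sample values are assumptions.

```python
def expected_calibration_error(confidences, correct, n_bins=10):
    """Weighted mean of |accuracy - confidence| over equal-width confidence bins.

    confidences: predicted probability of the chosen class, each in [0, 1].
    correct: booleans, True where the prediction matched the true label.
    """
    n = len(confidences)
    bins = [[] for _ in range(n_bins)]
    for conf, ok in zip(confidences, correct):
        # A confidence of exactly 1.0 falls into the last bin.
        idx = min(int(conf * n_bins), n_bins - 1)
        bins[idx].append((conf, ok))
    ece = 0.0
    for members in bins:
        if not members:
            continue
        avg_conf = sum(c for c, _ in members) / len(members)
        accuracy = sum(1 for _, ok in members if ok) / len(members)
        ece += len(members) / n * abs(accuracy - avg_conf)
    return ece

# Hypothetical predictions: 90% confident, but only 3 of 4 are right.
print(expected_calibration_error([0.9, 0.9, 0.9, 0.9], [True, True, True, False]))
```

A well-calibrated model has a value near zero, meaning its stated confidence can be read as the probability that a host prediction is correct.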
Affiliation(s)
- Jiayu Shang
- Electrical Engineering, City University of Hong Kong, Hong Kong, China
- Yanni Sun
- Electrical Engineering, City University of Hong Kong, Hong Kong, China
|
48
|
Zhang H, Xu R, Ding M, Zhang Y. Prediction of Gastric Cancer-Related Proteins Based on Graph Fusion Method. Front Cell Dev Biol 2021; 9:739715. [PMID: 34790662 PMCID: PMC8591485 DOI: 10.3389/fcell.2021.739715] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 07/11/2021] [Accepted: 08/02/2021] [Indexed: 12/09/2022]
Abstract
Gastric cancer is a common malignant tumor of the digestive system with no specific symptoms. Because knowledge of its pathogenesis is limited, patients are usually diagnosed at an advanced stage, when no effective treatment is available. The proteome has unique tissue and time specificity and can reflect the influence of external factors, which has made it a potential biomarker for early diagnosis. Therefore, discovering gastric cancer-related proteins could greatly help researchers design drugs and develop an early diagnosis kit. However, identifying gastric cancer-related proteins by biological experiments is time-consuming and costly. With the rapid growth of data, mining knowledge from proteomics data on a large scale through computational methods has become a hot issue. Based on the hypothesis that the stronger the association between two proteins, the more likely they are to be associated with the same disease, in this paper we constructed both a disease similarity network and a protein interaction network. Then, a graph convolutional network (GCN) was applied to extract topological features of these networks. Finally, XGBoost was used to identify the relationship between proteins and gastric cancer. Results of 10-fold cross-validation experiments show a high area under the curve (AUC) (0.85) and area under the precision-recall curve (AUPR) (0.76) for our method, which supports its effectiveness.
Affiliation(s)
- Hao Zhang
- Endoscopy Center, China-Japan Union Hospital of Jilin University, Changchun, China
- Ruisi Xu
- Endoscopy Center, China-Japan Union Hospital of Jilin University, Changchun, China
- Meng Ding
- Endoscopy Center, China-Japan Union Hospital of Jilin University, Changchun, China
- Ying Zhang
- Endoscopy Center, China-Japan Union Hospital of Jilin University, Changchun, China
|
49
|
ReRF-Pred: predicting amyloidogenic regions of proteins based on their pseudo amino acid composition and tripeptide composition. BMC Bioinformatics 2021; 22:545. [PMID: 34753427 PMCID: PMC8579573 DOI: 10.1186/s12859-021-04446-4] [Citation(s) in RCA: 2] [Impact Index Per Article: 0.7] [Received: 04/30/2021] [Accepted: 10/13/2021] [Indexed: 02/08/2023]
Abstract
BACKGROUND: Amyloids are insoluble fibrillar aggregates that are highly associated with complex human diseases, such as Alzheimer's disease, Parkinson's disease, and type II diabetes. Recently, many studies have reported that some specific regions of amino acid sequences may be responsible for the amyloidosis of proteins. Identifying these amyloidogenic regions has therefore become very important for elucidating the mechanism of amyloids. Accordingly, several computational methods have been put forward to discover amyloidogenic regions. The majority of these methods predict amyloidogenic regions based on the physicochemical properties of amino acids. In fact, the position, order, and correlation of amino acids may also influence the amyloidosis of proteins and should also be considered when detecting amyloidogenic regions. RESULTS: To address this problem, we proposed a novel machine-learning approach for predicting amyloidogenic regions, called ReRF-Pred. Firstly, the pseudo amino acid composition (PseAAC) was exploited to characterize the physicochemical properties and correlation of amino acids. Secondly, tripeptide composition (TPC) was employed to represent the order and position of amino acids. To improve the distinguishability of TPC, all possible tripeptides were analyzed by the binomial distribution method, and only those with significantly different distributions between positive and negative samples were retained. Finally, all samples were characterized by the PseAAC and TPC of their amino acid sequences, and a random forest-based amyloidogenic region predictor was trained on these samples. Validation experiments showed that the feature set consisting of PseAAC and TPC is the most distinguishable one for detecting amyloidosis. Meanwhile, random forest is superior to the other classifiers considered on almost all metrics. To validate the effectiveness of our model, ReRF-Pred was compared with a series of gold-standard methods on two datasets: Pep-251 and Reg33. The results suggested that our method has the best overall performance and makes significant improvements in discovering amyloidogenic regions. CONCLUSIONS: The advantages of our method are mainly attributed to the ability of PseAAC and TPC to describe the differences between amyloids and other proteins. The ReRF-Pred server can be accessed at http://106.12.83.135:8080/ReRF-Pred/.
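The tripeptide filtering step can be illustrated with a small sketch. It counts overlapping tripeptides and computes a binomial upper-tail probability, which is one common way to apply "the binomial distribution method" to composition features; the exact statistic and threshold used by ReRF-Pred are not given in the abstract, so the formulation and the example numbers are assumptions.

```python
from collections import Counter
from math import comb

def tripeptide_counts(sequence):
    """Count overlapping tripeptides in an amino acid sequence."""
    return Counter(sequence[i:i + 3] for i in range(len(sequence) - 2))

def binomial_tail(k, n, q):
    """P(X >= k) for X ~ Binomial(n, q).

    Assumed reading: n is how often a tripeptide occurs in all samples,
    k is how often it occurs in the positive samples, and q is the share
    of all tripeptide occurrences that come from positive samples.
    """
    return sum(comb(n, i) * q**i * (1 - q) ** (n - i) for i in range(k, n + 1))

print(tripeptide_counts("AAAAG"))
# Hypothetical tripeptide seen 10 times overall, 9 of them in positives,
# when positives contribute half of all tripeptide occurrences.
print(binomial_tail(9, 10, 0.5))
```

A small tail probability means the tripeptide is over-represented in positive samples beyond what chance would explain, so it would be kept as a feature.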
|
50
|
Lv H, Shi L, Berkenpas JW, Dao FY, Zulfiqar H, Ding H, Zhang Y, Yang L, Cao R. Application of artificial intelligence and machine learning for COVID-19 drug discovery and vaccine design. Brief Bioinform 2021; 22:bbab320. [PMID: 34410360 PMCID: PMC8511807 DOI: 10.1093/bib/bbab320] [Citation(s) in RCA: 32] [Impact Index Per Article: 10.7] [Received: 06/24/2021] [Revised: 07/15/2021] [Accepted: 07/22/2021] [Indexed: 12/13/2022]
Abstract
The global pandemic of coronavirus disease 2019 (COVID-19), caused by severe acute respiratory syndrome coronavirus 2, has led to a dramatic loss of human life worldwide. Despite many efforts, the development of effective drugs and vaccines for this novel virus will take considerable time. Artificial intelligence (AI) and machine learning (ML) offer promising solutions that could accelerate the discovery and optimization of new antivirals. Motivated by this, in this paper, we present an extensive survey on the application of AI and ML for combating COVID-19 based on the rapidly emerging literature. Particularly, we point out the challenges and future directions associated with state-of-the-art solutions to effectively control the COVID-19 pandemic. We hope that this review provides researchers with new insights into the ways AI and ML fight and have fought the COVID-19 outbreak.
Affiliation(s)
- Hao Lv
- Center for Informational Biology, University of Electronic Science and Technology of China, Chengdu 610054, China
- Lei Shi
- Department of Spine Surgery, Changzheng Hospital, Naval Medical University, Shanghai 200433, China
- Fu-Ying Dao
- Center for Informational Biology, University of Electronic Science and Technology of China, Chengdu 610054, China
- Hasan Zulfiqar
- Center for Informational Biology, University of Electronic Science and Technology of China, Chengdu 610054, China
- Hui Ding
- Center for Informational Biology, University of Electronic Science and Technology of China, Chengdu 610054, China
- Yang Zhang
- Innovative Institute of Chinese Medicine and Pharmacy, Chengdu University of Traditional Chinese Medicine, Chengdu 611137, China
- Liming Yang
- Department of Pathophysiology, Harbin Medical University-Daqing, Daqing, 163319, China
- Renzhi Cao
- Department of Computer Science, Pacific Lutheran University, Tacoma 98447, USA
|