1
Zhu L, Yang Q, Yang S. DeepAIP: Deep learning for anti-inflammatory peptide prediction using pre-trained protein language model features based on contextual self-attention network. Int J Biol Macromol 2024; 280:136172. [PMID: 39357724 DOI: 10.1016/j.ijbiomac.2024.136172] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 08/03/2024] [Revised: 09/20/2024] [Accepted: 09/29/2024] [Indexed: 10/04/2024]
Abstract
Non-steroidal anti-inflammatory drugs (NSAIDs), glucocorticoids, and other immunosuppressants are commonly used medications for treating inflammation. However, these drugs often come with numerous side effects. Therefore, finding more effective methods for inflammation treatment has become increasingly necessary. The study of anti-inflammatory peptides can help address these issues. In this work, we propose a contextual self-attention deep learning model, coupled with features extracted from a pre-trained protein language model, to predict anti-inflammatory peptides (AIPs). The contextual self-attention module can effectively enhance and learn the features extracted from the pre-trained protein language model, resulting in high accuracy in predicting AIPs. Additionally, we compared the performance of features extracted from popular, currently available pre-trained protein language models. Prot-T5 features demonstrated the best overall performance as the input for our deep learning model, named DeepAIP. On the benchmark test dataset, DeepAIP achieves a Matthews Correlation Coefficient and an accuracy that are higher than those of the second-best method by 16.35 % and 6.91 %, respectively. A performance comparison was also conducted using a dataset of 17 novel anti-inflammatory peptide sequences. DeepAIP correctly identified all 17 peptides as AIPs, with predicted values closer to the true ones. Data and code are available at https://github.com/YangQingGuoCCZU/DeepAIP.
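The abstract above reports its results as accuracy and Matthews Correlation Coefficient (MCC). As a reading aid, here is a minimal sketch of how those two metrics are usually computed for a binary AIP / non-AIP classifier. The function names and the toy labels are illustrative assumptions and are not taken from the DeepAIP code.

```python
import math

def confusion_counts(y_true, y_pred):
    """Count TP, TN, FP, FN for binary labels (1 = AIP, 0 = non-AIP)."""
    tp = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 1)
    tn = sum(1 for t, p in zip(y_true, y_pred) if t == 0 and p == 0)
    fp = sum(1 for t, p in zip(y_true, y_pred) if t == 0 and p == 1)
    fn = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 0)
    return tp, tn, fp, fn

def accuracy(tp, tn, fp, fn):
    """Fraction of all predictions that are correct."""
    return (tp + tn) / (tp + tn + fp + fn)

def mcc(tp, tn, fp, fn):
    """Matthews Correlation Coefficient; returns 0.0 when undefined."""
    denom = math.sqrt((tp + fp) * (tp + fn) * (tn + fp) * (tn + fn))
    if denom == 0:
        return 0.0
    return (tp * tn - fp * fn) / denom

# Toy labels, purely illustrative (not data from the paper).
y_true = [1, 1, 1, 0, 0, 0]
y_pred = [1, 1, 0, 0, 0, 1]
counts = confusion_counts(y_true, y_pred)
acc_value = accuracy(*counts)
mcc_value = mcc(*counts)
```

Unlike accuracy, MCC uses all four confusion-matrix cells, which is why it is often preferred when comparing predictors on imbalanced peptide datasets.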
Affiliation(s)
- Lun Zhu
  - School of Computer Science and Artificial Intelligence Aliyun School of Big Data School of Software, Changzhou University, Changzhou 213164, China; The Affiliated Changzhou No.2 People's Hospital of Nanjing Medical University, Changzhou 213164, China
- Qingguo Yang
  - School of Computer Science and Artificial Intelligence Aliyun School of Big Data School of Software, Changzhou University, Changzhou 213164, China
- Sen Yang
  - School of Computer Science and Artificial Intelligence Aliyun School of Big Data School of Software, Changzhou University, Changzhou 213164, China; The Affiliated Changzhou No.2 People's Hospital of Nanjing Medical University, Changzhou 213164, China.
2
Zhang W, Ding Y, Wei L, Guo X, Ni F. Therapeutic peptides identification via kernel risk sensitive loss-based k-nearest neighbor model and multi-Laplacian regularization. Brief Bioinform 2024; 25:bbae534. [PMID: 39438076 PMCID: PMC11495874 DOI: 10.1093/bib/bbae534] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 07/14/2024] [Revised: 08/30/2024] [Accepted: 10/08/2024] [Indexed: 10/25/2024]
Abstract
Therapeutic peptides are therapeutic agents synthesized from natural amino acids, which can be used as carriers for precisely transporting drugs and can activate the immune system for preventing and treating various diseases. However, screening therapeutic peptides using biochemical assays is expensive, time-consuming, and limited by experimental conditions and biological samples, and there may be ethical considerations in the clinical stage. In contrast, screening therapeutic peptides using machine learning and computational methods is efficient, automated, and can accurately predict potential therapeutic peptides. In this study, a k-nearest neighbor model based on multi-Laplacian and kernel risk sensitive loss was proposed, which introduces a kernel risk loss function derived from the K-local hyperplane distance nearest neighbor model as well as combining the Laplacian regularization method to predict therapeutic peptides. The findings indicated that the suggested approach achieved satisfactory results and could effectively predict therapeutic peptide sequences.
Affiliation(s)
- Wenyu Zhang
  - Institute of Fundamental and Frontier Sciences, University of Electronic Science and Technology of China, No. 2006 Xiyuan Avenue, High tech Zone, Chengdu 610054, China
  - Yangtze Delta Region Institute (Quzhou), University of Electronic Science and Technology of China, No.1 Chengdian Road, Kecheng District, Quzhou 324000, China
- Yijie Ding
  - Yangtze Delta Region Institute (Quzhou), University of Electronic Science and Technology of China, No.1 Chengdian Road, Kecheng District, Quzhou 324000, China
- Leyi Wei
  - Macao Polytechnic University, Gomes Street, Macau Peninsula, Macau 999078, China
- Xiaoyi Guo
  - Yangtze Delta Region Institute (Quzhou), University of Electronic Science and Technology of China, No.1 Chengdian Road, Kecheng District, Quzhou 324000, China
- Fengming Ni
  - Department of Gastroenterology, The First Hospital of Jilin University, No. 71 Xinmin Street, Chaoyang District, Changchun 130021, China
3
Isaac KS, Combe M, Potter G, Sokolenko S. Machine learning tools for peptide bioactivity evaluation - Implications for cell culture media optimization and the broader cultivated meat industry. Curr Res Food Sci 2024; 9:100842. [PMID: 39435450 PMCID: PMC11491887 DOI: 10.1016/j.crfs.2024.100842] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 05/31/2024] [Accepted: 09/07/2024] [Indexed: 10/23/2024]
Abstract
Although bioactive peptides have traditionally been studied for their health-promoting qualities in the context of nutrition and medicine, the past twenty years have seen a steady increase in their application to cell culture media optimization. Complex natural sources of bioactive peptides, such as hydrolysates, offer a sustainable and cost-effective means of promoting cellular growth, making them an essential component of scaling-up cultivated meat production. However, the sheer diversity of hydrolysates makes product selection difficult, highlighting the need for functional characterization. Traditional wet-lab techniques for isolating and estimating peptide bioactivity cannot keep pace with peptide identification using high-throughput tools such as mass spectrometry, requiring the development and use of machine learning-based classifiers. This review provides a comprehensive list of available software tools to evaluate peptide bioactivity, classified and compared based on the algorithm, training set, functionality, and limitations of the underlying models. We curated independent test sets to compare the predictive performance of different models based on specific bioactivity classification relevant to promoting cell culture growth: antioxidant and anti-inflammatory. A comprehensive screening of all bioactivity classifiers revealed that while there are approximately fifty tools to elucidate antimicrobial activity and sixteen that predict anti-inflammatory activity, fewer tools are available for other functionalities related to cell growth - five that predict antioxidant activity and two for growth factor and/or cell signaling prediction. A thorough evaluation of the available tools revealed significant issues with sensitivity, specificity, and overall accuracy. 
Despite the overall interest in estimating peptide bioactivity, our work highlights key gaps in the broader adoption of existing software for the specific application of cell culture media optimization in the context of cultivated meat and beyond.
Affiliation(s)
- Kathy Sharon Isaac
  - Process Engineering and Applied Science, Dalhousie University, 5273 DaCosta Row, PO Box 15000, Halifax, B3H 4R2, NS, Canada
- Michelle Combe
  - Process Engineering and Applied Science, Dalhousie University, 5273 DaCosta Row, PO Box 15000, Halifax, B3H 4R2, NS, Canada
- Stanislav Sokolenko
  - Process Engineering and Applied Science, Dalhousie University, 5273 DaCosta Row, PO Box 15000, Halifax, B3H 4R2, NS, Canada
4
Xu Y, Zhang S, Zhu F, Liang Y. A deep learning model for anti-inflammatory peptides identification based on deep variational autoencoder and contrastive learning. Sci Rep 2024; 14:18451. [PMID: 39117712 PMCID: PMC11310449 DOI: 10.1038/s41598-024-69419-y] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 05/24/2024] [Accepted: 08/05/2024] [Indexed: 08/10/2024]
Abstract
As a class of biologically active molecules with significant immunomodulatory and anti-inflammatory effects, anti-inflammatory peptides have important application value in the medical and biotechnology fields due to their unique biological functions. Research on the identification of anti-inflammatory peptides provides important theoretical foundations and practical value for a deeper understanding of the biological mechanisms of inflammation and immune regulation, as well as for the development of new drugs and biotechnological applications. Therefore, it is necessary to develop more advanced computational models for identifying anti-inflammatory peptides. In this study, we propose a deep learning model named DAC-AIPs based on variational autoencoder and contrastive learning for accurate identification of anti-inflammatory peptides. In the sequence encoding part, the incorporation of multi-hot encoding helps capture richer sequence information. The autoencoder, composed of convolutional layers and linear layers, can learn latent features and reconstruct features, with variational inference enhancing the representation capability of latent features. Additionally, the introduction of contrastive learning aims to improve the model's classification ability. Through cross-validation and independent dataset testing experiments, DAC-AIPs achieves superior performance compared to existing state-of-the-art models. In cross-validation, the classification accuracy of DAC-AIPs reached around 88%, which is 7% higher than previous models. Furthermore, various ablation experiments and interpretability experiments validate the effectiveness of DAC-AIPs. Finally, a user-friendly online predictor is designed to enhance the practicality of the model, and the server is freely accessible at http://dac-aips.online .
Affiliation(s)
- Yujie Xu
  - School of Mathematics and Statistics, Xidian University, Xi'an, 710071, People's Republic of China
- Shengli Zhang
  - School of Mathematics and Statistics, Xidian University, Xi'an, 710071, People's Republic of China.
- Feng Zhu
  - Center for Translational Medicine, The First Affiliated Hospital of Xi'an Jiaotong University, Xi'an, 710061, People's Republic of China
- Yunyun Liang
  - School of Science, Xi'an Polytechnic University, Xi'an, 710048, People's Republic of China
5
Cai Y, Gao Y, Lv Y, Chen Z, Zhong L, Chen J, Fan Y. Multicomponent comprehensive confirms that erythroferrone is a molecular biomarker of pan-cancer. Heliyon 2024; 10:e26990. [PMID: 38444475 PMCID: PMC10912481 DOI: 10.1016/j.heliyon.2024.e26990] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 08/04/2023] [Revised: 02/01/2024] [Accepted: 02/22/2024] [Indexed: 03/07/2024]
Abstract
All vertebrate organisms produce erythroferrone, a secretory hormone with structure-related functions during iron homeostasis. However, limited knowledge exists regarding the effect of this hormone on the occurrence and progression of cancer. To systematically and comprehensively identify the diverse implications of erythroferrone (ERFE) in various malignant tumors, we conducted an in-depth analysis of multiple datasets, including the expression levels of oncogenes and target proteins, biological functions, and molecular characteristics. This analysis aimed to assess the diagnostic and prognostic value of ERFE in pan-cancer. Our findings revealed a significant elevation in ERFE expression across 20 distinct cancer types, with notable increases in gastrointestinal cancers. Utilizing the Cytoscape and STRING databases, we identified 35 ERFE-targeted binding proteins. Survival prognosis studies, particularly in gastrointestinal cancers represented by colon adenocarcinoma (COAD), demonstrated a poor prognosis in patients with high ERFE expression (p < 0.001), consistently observed across various clinical subgroups. Furthermore, the ROC curve underscored the high predictive ability of ERFE for gastrointestinal cancer (AUC >0.9). Understanding the roles and interactions of ERFE in biological processes can also be aided by examining the genes co-expressed with ERFE in the coat and ranking the top 50 positive and negative genes. In the correlation analysis between the ERFE gene and different immune cells in COAD, we discovered that the expression of ERFE was positively correlated with Th1 cells, cytotoxic cells, and activated DC (aDC) abundance, and negatively correlated with Tcm (T central memory) abundance (P < 0.001). In summary, ERFE emerges as strongly associated with various malignant cancers, positioning it as a prospective biological target for cancer treatment.
It stands out as a key molecular biomarker for diagnosing and prognosticating pancreatic cancer, and also serves as an independent prognostic risk factor for COAD.
Affiliation(s)
- Ying Cai
  - Department of Gastroenterology, Zhongshan Hospital of Xiamen University, School of Medicine, Xiamen University, Xiamen, Fujian, PR China
  - Xiamen Key Laboratory of intestinal microbiome and human health, Zhongshan Hospital affiliated to Xiamen University, Xiamen, Fujian, PR China
- Yaling Gao
  - Department of Xia He, Zhongshan Hospital of Xiamen University, School of Medicine, Xiamen University, Xiamen, Fujian, PR China
- Yinyin Lv
  - Department of Gastroenterology, Zhongshan Hospital of Xiamen University, School of Medicine, Xiamen University, Xiamen, Fujian, PR China
  - Xiamen Key Laboratory of intestinal microbiome and human health, Zhongshan Hospital affiliated to Xiamen University, Xiamen, Fujian, PR China
- Zhiyuan Chen
  - Department of Gastroenterology, Zhongshan Hospital of Xiamen University, School of Medicine, Xiamen University, Xiamen, Fujian, PR China
  - Xiamen Key Laboratory of intestinal microbiome and human health, Zhongshan Hospital affiliated to Xiamen University, Xiamen, Fujian, PR China
- Lingfeng Zhong
  - Department of Gastroenterology, Zhongshan Hospital of Xiamen University, School of Medicine, Xiamen University, Xiamen, Fujian, PR China
  - Xiamen Key Laboratory of intestinal microbiome and human health, Zhongshan Hospital affiliated to Xiamen University, Xiamen, Fujian, PR China
- Junjie Chen
  - Department of Gastroenterology, Zhongshan Hospital of Xiamen University, School of Medicine, Xiamen University, Xiamen, Fujian, PR China
  - Xiamen Key Laboratory of intestinal microbiome and human health, Zhongshan Hospital affiliated to Xiamen University, Xiamen, Fujian, PR China
- Yanyun Fan
  - Department of Gastroenterology, Zhongshan Hospital of Xiamen University, School of Medicine, Xiamen University, Xiamen, Fujian, PR China
  - Xiamen Key Laboratory of intestinal microbiome and human health, Zhongshan Hospital affiliated to Xiamen University, Xiamen, Fujian, PR China
6
Ji S, An F, Zhang T, Lou M, Guo J, Liu K, Zhu Y, Wu J, Wu R. Antimicrobial peptides: An alternative to traditional antibiotics. Eur J Med Chem 2024; 265:116072. [PMID: 38147812 DOI: 10.1016/j.ejmech.2023.116072] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 10/16/2023] [Revised: 12/04/2023] [Accepted: 12/17/2023] [Indexed: 12/28/2023]
Abstract
As antibiotic-resistant bacteria and genes continue to emerge, the identification of effective alternatives to traditional antibiotics has become a pressing issue. Antimicrobial peptides are favored for their safety, low residue, and low resistance properties, and their unique antimicrobial mechanisms show significant potential in combating antibiotic resistance. However, the high production cost and weak activity of antimicrobial peptides limit their application. Moreover, traditional laboratory methods for identifying and designing new antimicrobial peptides are time-consuming and labor-intensive, hindering their development. Currently, novel technologies, such as artificial intelligence (AI), are being employed to develop and design new antimicrobial peptide resources, offering new opportunities for the advancement of antimicrobial peptides. This article summarizes the basic characteristics and antimicrobial mechanisms of antimicrobial peptides, as well as their advantages and limitations, and explores the application of AI in antimicrobial peptide prediction and design. This highlights the crucial role of AI in enhancing the efficiency of antimicrobial peptide research and provides a reference for antimicrobial drug development.
Affiliation(s)
- Shuaiqi Ji
  - College of Food Science, Shenyang Agricultural University, Shenyang, 110866, PR China; Shenyang Key Laboratory of Microbial Fermentation Technology Innovation, Shenyang, 110866, PR China
- Feiyu An
  - College of Food Science, Shenyang Agricultural University, Shenyang, 110866, PR China; Liaoning Engineering Research Center of Food Fermentation Technology, Shenyang, 110866, PR China
- Taowei Zhang
  - College of Food Science, Shenyang Agricultural University, Shenyang, 110866, PR China; Shenyang Key Laboratory of Microbial Fermentation Technology Innovation, Shenyang, 110866, PR China
- Mengxue Lou
  - College of Food Science, Shenyang Agricultural University, Shenyang, 110866, PR China; Liaoning Engineering Research Center of Food Fermentation Technology, Shenyang, 110866, PR China
- Jiawei Guo
  - College of Food Science, Shenyang Agricultural University, Shenyang, 110866, PR China; Shenyang Key Laboratory of Microbial Fermentation Technology Innovation, Shenyang, 110866, PR China
- Kexin Liu
  - College of Food Science, Shenyang Agricultural University, Shenyang, 110866, PR China; Shenyang Key Laboratory of Microbial Fermentation Technology Innovation, Shenyang, 110866, PR China
- Yi Zhu
  - College of Food Science, Shenyang Agricultural University, Shenyang, 110866, PR China; Liaoning Engineering Research Center of Food Fermentation Technology, Shenyang, 110866, PR China
- Junrui Wu
  - College of Food Science, Shenyang Agricultural University, Shenyang, 110866, PR China; Liaoning Engineering Research Center of Food Fermentation Technology, Shenyang, 110866, PR China; Shenyang Key Laboratory of Microbial Fermentation Technology Innovation, Shenyang, 110866, PR China.
- Rina Wu
  - College of Food Science, Shenyang Agricultural University, Shenyang, 110866, PR China; Liaoning Engineering Research Center of Food Fermentation Technology, Shenyang, 110866, PR China; Shenyang Key Laboratory of Microbial Fermentation Technology Innovation, Shenyang, 110866, PR China.
7
Gaffar S, Hassan MT, Tayara H, Chong KT. IF-AIP: A machine learning method for the identification of anti-inflammatory peptides using multi-feature fusion strategy. Comput Biol Med 2024; 168:107724. [PMID: 37989075 DOI: 10.1016/j.compbiomed.2023.107724] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 08/30/2023] [Revised: 10/16/2023] [Accepted: 11/15/2023] [Indexed: 11/23/2023]
Abstract
BACKGROUND The most commonly used therapy currently for inflammatory and autoimmune diseases is nonspecific anti-inflammatory drugs, which have various hazardous side effects. Recently, some anti-inflammatory peptides (AIPs) have been found to be a substitute therapy for inflammatory diseases like rheumatoid arthritis and Alzheimer's. Therefore, the identification of these AIPs is an emerging topic that is equally important. METHODS In this work, we have proposed an identification model for AIPs using a voting classifier. We used eight different feature descriptors and five conventional machine-learning classifiers. The eight feature encodings were concatenated to get a hybrid feature set. The five baseline models trained on the hybrid feature set were integrated via a voting classifier. Finally, a feature selection algorithm was used to select the optimal feature set for the construction of our final model, named IF-AIP. RESULTS We tested the proposed model on two independent datasets. On independent data 1, the IF-AIP model shows an improvement of 3%-5.6% in terms of accuracies and 6.7%-10.8% in terms of MCC compared to the existing methods. On the independent dataset 2, our model IF-AIP shows an overall improvement of 2.9%-5.7% in terms of accuracy and 8.3%-8.6% in terms of MCC score compared to the existing methods. A comparative performance analysis was conducted between the proposed model and existing methods using a set of 24 novel peptide sequences. Notably, the IF-AIP method exhibited exceptional accuracy, correctly identifying all 24 peptides as AIPs. The source code, pre-trained models, and all datasets are made available at https://github.com/Mir-Saima/IF-AIP.
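The METHODS part of the abstract above describes two steps: concatenating several feature encodings into one hybrid vector, and combining several baseline classifiers through voting. The sketch below illustrates both ideas in plain Python. The abstract does not say whether hard (majority) or soft (probability-averaging) voting is used, so both are shown as assumptions, and all names are hypothetical rather than taken from the IF-AIP repository.

```python
from collections import Counter

def concatenate_features(*feature_vectors):
    """Join several per-peptide feature encodings into one hybrid vector."""
    hybrid = []
    for vector in feature_vectors:
        hybrid.extend(vector)
    return hybrid

def hard_vote(predicted_labels):
    """Majority vote over the labels predicted by the base classifiers.

    Ties are broken in favour of the smallest label so the result is
    deterministic (an arbitrary choice made for this sketch).
    """
    counts = Counter(predicted_labels)
    best = max(counts.values())
    return min(label for label, count in counts.items() if count == best)

def soft_vote(positive_probabilities, threshold=0.5):
    """Average the positive-class probabilities and apply a threshold."""
    mean_probability = sum(positive_probabilities) / len(positive_probabilities)
    return 1 if mean_probability >= threshold else 0

# Illustrative values only: two tiny encodings and five base-model outputs.
hybrid_vector = concatenate_features([0.1, 0.2], [0.3])
majority_label = hard_vote([1, 0, 1, 1, 0])
averaged_label = soft_vote([0.9, 0.4, 0.6, 0.2, 0.7])
```

With five base models, as in the abstract, a binary hard vote can never tie; the tie-breaking rule only matters for an even number of voters.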
Affiliation(s)
- Saima Gaffar
  - Department of Electronics and Information Engineering, Jeonbuk National University, Jeonju, 54896, South Korea
- Mir Tanveerul Hassan
  - Department of Electronics and Information Engineering, Jeonbuk National University, Jeonju, 54896, South Korea
- Hilal Tayara
  - School of International Engineering and Science, Jeonbuk National University, Jeonju, 54896, South Korea.
- Kil To Chong
  - Department of Electronics and Information Engineering, Jeonbuk National University, Jeonju, 54896, South Korea; Advances Electronics and Information Research Centre, Jeonbuk National University, Jeonju, 54896, South Korea.
8
Guan J, Yao L, Chung CR, Xie P, Zhang Y, Deng J, Chiang YC, Lee TY. Predicting Anti-inflammatory Peptides by Ensemble Machine Learning and Deep Learning. J Chem Inf Model 2023; 63:7886-7898. [PMID: 38054927 DOI: 10.1021/acs.jcim.3c01602] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 12/07/2023]
Abstract
Inflammation is a biological response to harmful stimuli, aiding in the maintenance of tissue homeostasis. However, excessive or persistent inflammation can precipitate a myriad of pathological conditions. Although current treatments such as NSAIDs, corticosteroids, and immunosuppressants are effective, they can have side effects and resistance issues. In this backdrop, anti-inflammatory peptides (AIPs) have emerged as a promising therapeutic approach against inflammation. Leveraging machine learning methods, we have the opportunity to accelerate the discovery and investigation of these AIPs more effectively. In this study, we proposed an advanced framework by ensemble machine learning and deep learning for AIP prediction. Initially, we constructed three individual models with extremely randomized trees (ET), gated recurrent unit (GRU), and convolutional neural networks (CNNs) with attention mechanism and then used stacking architecture to build the final predictor. By utilizing various sequence encodings and combining the strengths of different algorithms, our predictor demonstrated exemplary performance. On our independent test set, our model achieved an accuracy, MCC, and F1-score of 0.757, 0.500, and 0.707, respectively, clearly outperforming other contemporary AIP prediction methods. Additionally, our model offers profound insights into the feature interpretation of AIPs, establishing a valuable knowledge foundation for the design and development of future anti-inflammatory strategies.
Affiliation(s)
- Jiahui Guan
  - School of Medicine, The Chinese University of Hong Kong, Shenzhen 518172, China
- Lantian Yao
  - Kobilka Institute of Innovative Drug Discovery, School of Medicine, The Chinese University of Hong Kong, Shenzhen 518172, China
  - School of Science and Engineering, The Chinese University of Hong Kong, Shenzhen 518172, China
- Chia-Ru Chung
  - Department of Computer Science and Information Engineering, National Central University, Taoyuan 320317, Taiwan
- Peilin Xie
  - Kobilka Institute of Innovative Drug Discovery, School of Medicine, The Chinese University of Hong Kong, Shenzhen 518172, China
- Yilun Zhang
  - School of Medicine, The Chinese University of Hong Kong, Shenzhen 518172, China
- Junyang Deng
  - School of Medicine, The Chinese University of Hong Kong, Shenzhen 518172, China
- Ying-Chih Chiang
  - School of Medicine, The Chinese University of Hong Kong, Shenzhen 518172, China
  - Kobilka Institute of Innovative Drug Discovery, School of Medicine, The Chinese University of Hong Kong, Shenzhen 518172, China
- Tzong-Yi Lee
  - Institute of Bioinformatics and Systems Biology, National Yang Ming Chiao Tung University, Hsinchu 300093, Taiwan
  - Center for Intelligent Drug Systems and Smart Bio-devices (IDS2B), National Yang Ming Chiao Tung University, Hsinchu 300093, Taiwan
9
Raza A, Uddin J, Almuhaimeed A, Akbar S, Zou Q, Ahmad A. AIPs-SnTCN: Predicting Anti-Inflammatory Peptides Using fastText and Transformer Encoder-Based Hybrid Word Embedding with Self-Normalized Temporal Convolutional Networks. J Chem Inf Model 2023; 63:6537-6554. [PMID: 37905969 DOI: 10.1021/acs.jcim.3c01563] [Citation(s) in RCA: 22] [Impact Index Per Article: 22.0] [Indexed: 11/02/2023]
Abstract
Inflammation is a biologically resistant response to harmful stimuli, such as infection, damaged cells, toxic chemicals, or tissue injuries. Its purpose is to eradicate pathogenic micro-organisms or irritants and facilitate tissue repair. Prolonged inflammation can result in chronic inflammatory diseases. However, wet-laboratory-based treatments are costly and time-consuming and may have adverse side effects on normal cells. In the past decade, peptide therapeutics have gained significant attention due to their high specificity in targeting affected cells without affecting healthy cells. Motivated by the significance of peptide-based therapies, we developed a highly discriminative prediction model called AIPs-SnTCN to predict anti-inflammatory peptides accurately. The peptide samples are encoded using word embedding techniques such as skip-gram and attention-based bidirectional encoder representation using a transformer (BERT). The conjoint triad feature (CTF) also collects structure-based cluster profile features. The fused vector of word embedding and sequential features is formed to compensate for the limitations of single encoding methods. Support vector machine-based recursive feature elimination (SVM-RFE) is applied to choose the ranking-based optimal space. The optimized feature space is trained by using an improved self-normalized temporal convolutional network (SnTCN). The AIPs-SnTCN model achieved a predictive accuracy of 95.86% and an AUC of 0.97 by using training samples. In the case of the alternate training data set, our model obtained an accuracy of 92.04% and an AUC of 0.96. The proposed AIPs-SnTCN model outperformed existing models with an ∼19% higher accuracy and an ∼14% higher AUC value. The reliability and efficacy of our AIPs-SnTCN model make it a valuable tool for scientists and may play a beneficial role in pharmaceutical design and research academia.
Affiliation(s)
- Ali Raza
  - Department of Physical and Numerical Sciences, Qurtuba University of Science and Information Technology, Peshawar, Khyber Pakhtunkhwa 25124, Pakistan
  - Department of Computer Science, MY University, Islamabad 45750, Pakistan
- Jamal Uddin
  - Department of Physical and Numerical Sciences, Qurtuba University of Science and Information Technology, Peshawar, Khyber Pakhtunkhwa 25124, Pakistan
- Abdullah Almuhaimeed
  - Digital Health Institute, King Abdulaziz City for Science and Technology, Riyadh 11442, Saudi Arabia
- Shahid Akbar
  - Institute of Fundamental and Frontier Sciences, University of Electronic Science and Technology of China, Chengdu 610054, China
  - Department of Computer Science, Abdul Wali Khan University Mardan, Mardan, Khyber Pakhtunkhwa 23200, Pakistan
- Quan Zou
  - Institute of Fundamental and Frontier Sciences, University of Electronic Science and Technology of China, Chengdu 610054, China
  - Yangtze Delta Region Institute (Quzhou), University of Electronic Science and Technology of China, Quzhou 324000, PR China
- Ashfaq Ahmad
  - Department of Computer Science, MY University, Islamabad 45750, Pakistan
10
Ali Z, Alturise F, Alkhalifah T, Khan YD. IGPred-HDnet: Prediction of Immunoglobulin Proteins Using Graphical Features and the Hierarchal Deep Learning-Based Approach. Comput Intell Neurosci 2023; 2023:2465414. [PMID: 36744119 PMCID: PMC9891831 DOI: 10.1155/2023/2465414] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 07/08/2022] [Revised: 09/16/2022] [Accepted: 10/12/2022] [Indexed: 01/26/2023]
Abstract
Motivation. Immunoglobulin proteins (IGP) (also called antibodies) are glycoproteins that act as B-cell receptors against external or internal antigens like viruses and bacteria. IGPs play a significant role in diverse cellular processes ranging from adhesion to cell recognition. IGP identifications via the in-silico approach are faster and more cost-effective than wet-lab technological methods. Methods. In this study, we developed an intelligent theoretical deep learning framework, "IGPred-HDnet" for the discrimination of IGPs and non-IGPs. Three types of promising descriptors are feature extraction based on graphical and statistical features (FEGS), amphiphilic pseudo-amino acid composition (Amp-PseAAC), and dipeptide composition (DPC) to extract the graphical, physicochemical, and sequential features. Next, the extracted attributes are evaluated through machine learning, i.e., decision tree (DT), support vector machine (SVM), k-nearest neighbour (KNN), and hierarchical deep network (HDnet) classifiers. The proposed predictor IGPred-HDnet was trained and tested using a 10-fold cross-validation and independent test. Results and Conclusion. The success rates in terms of accuracy (ACC) and Matthew's correlation coefficient (MCC) of IGPred-HDnet on training and independent dataset (Dtrain Dtest) are ACC = 98.00%, 99.10%, and MCC = 0.958, and 0.980 points, respectively. The empirical outcomes demonstrate that the IGPred-HDnet model efficacy on both datasets using the novel FEGS feature and HDnet algorithm achieved superior predictions to other existing computational models. We hope this research will provide great insights into the large-scale identification of IGPs and pharmaceutical companies in new drug design.
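Of the three descriptors named in the abstract above, dipeptide composition (DPC) is the simplest to illustrate: it is the normalized frequency of each of the 400 ordered pairs of the 20 standard amino acids in a sequence. The sketch below shows the standard definition under stated assumptions (non-standard residues are skipped, and the function name and toy sequence are hypothetical); it is not the IGPred-HDnet implementation.

```python
from itertools import product

AMINO_ACIDS = "ACDEFGHIKLMNPQRSTVWY"  # the 20 standard residues
DIPEPTIDES = ["".join(pair) for pair in product(AMINO_ACIDS, repeat=2)]

def dipeptide_composition(sequence):
    """Return a 400-dimensional DPC vector for a protein sequence.

    Each entry is the count of one ordered residue pair divided by the
    number of valid adjacent pairs. Pairs containing a non-standard
    residue are skipped (an assumption made for this sketch).
    """
    sequence = sequence.upper()
    counts = dict.fromkeys(DIPEPTIDES, 0)
    total = 0
    for i in range(len(sequence) - 1):
        pair = sequence[i:i + 2]
        if pair in counts:
            counts[pair] += 1
            total += 1
    if total == 0:
        return [0.0] * len(DIPEPTIDES)
    return [counts[d] / total for d in DIPEPTIDES]

# Toy sequence, purely illustrative: its adjacent pairs are AC, CA, AC.
dpc_vector = dipeptide_composition("ACAC")
```

Because the vector length is fixed at 400 regardless of sequence length, DPC can be fed directly to classifiers such as the decision tree, SVM, and KNN models listed in the abstract.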
Affiliation(s)
- Zakir Ali: Department of Computer Science, School of Science and Technology, University of Management and Technology, Lahore, Pakistan
- Fahad Alturise: Department of Computer, College of Science and Arts in Ar Rass, Qassim University, Ar Rass, Qassim, Saudi Arabia
- Tamim Alkhalifah: Department of Computer, College of Science and Arts in Ar Rass, Qassim University, Ar Rass, Qassim, Saudi Arabia
- Yaser Daanial Khan: Department of Computer Science, School of Science and Technology, University of Management and Technology, Lahore, Pakistan
|
11
|
Tharmakulasingam M, Gardner B, La Ragione R, Fernando A. Rectified Classifier Chains for Prediction of Antibiotic Resistance From Multi-Labelled Data With Missing Labels. IEEE/ACM Transactions on Computational Biology and Bioinformatics 2023; 20:625-636. [PMID: 35130168 DOI: 10.1109/tcbb.2022.3148577] [Citation(s) in RCA: 1] [Impact Index Per Article: 1.0] [Indexed: 05/04/2023]
Abstract
Predicting antimicrobial resistance (AMR) from genomic data has important implications for human and animal healthcare, especially given its potential for more rapid diagnostics and better-informed treatment choices. With recent advances in sequencing technologies, applying machine learning techniques to AMR prediction has shown promising results. Despite this, the literature lacks methodologies suitable for multi-drug AMR prediction, especially where samples with missing labels exist. To address this shortcoming, we introduce a Rectified Classifier Chain (RCC) method for predicting multi-drug resistance. The RCC method was tested using annotated features of genomic sequences and compared with similar multi-label classification methodologies. We found that our RCC model with an eXtreme Gradient Boosting (XGBoost) base model outperformed the second-best model, an XGBoost-based binary relevance model, by 3.3% in Hamming accuracy and 7.8% in F1-score. Additionally, we note that machine learning models applied to AMR prediction in the literature are typically unsuitable for identifying biomarkers informative of their decisions; in this study, we show that biomarkers contributing to AMR prediction can also be identified using the proposed RCC method. We expect this to facilitate genome annotation and pave the way towards identifying new biomarkers indicative of AMR.
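The abstract above reports Hamming accuracy on multi-label data with missing labels. A minimal sketch of how such a score could be computed is given below, assuming that missing labels are encoded as None and simply skipped; the function name, the encoding, and the skipping rule are illustrative assumptions, not the authors' implementation.

```python
from typing import Optional, Sequence

# Assumed encoding: 1 = resistant, 0 = susceptible, None = label missing.
Label = Optional[int]


def hamming_accuracy(y_true: Sequence[Sequence[Label]],
                     y_pred: Sequence[Sequence[int]]) -> float:
    """Fraction of observed (sample, drug) labels that are predicted correctly.

    Entries whose true label is None are ignored (an assumption of this sketch).
    """
    correct = 0
    observed = 0
    for true_row, pred_row in zip(y_true, y_pred):
        for t, p in zip(true_row, pred_row):
            if t is None:
                continue  # skip labels that were never measured
            observed += 1
            correct += int(t == p)
    if observed == 0:
        raise ValueError("no observed labels to score")
    return correct / observed
```

Scoring only the observed entries keeps unmeasured drug/sample pairs from being counted as either hits or misses.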
|
12
|
Kumar M, Bajaj K, Sharma B, Narang S. A Comparative Performance Assessment of Optimized Multilevel Ensemble Learning Model with Existing Classifier Models. Big Data 2022; 10:371-387. [PMID: 34881989 DOI: 10.1089/big.2021.0257] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 06/13/2023]
Abstract
Predictive models are used to predict the class label in a classification problem, and usually a single model is built; current research instead considers multiple predictive models. Ensemble modeling here means that, instead of a single predictive model, a multilevel predictive model is built by combining different classification algorithms, so that it generalizes across all class labels with adequate accuracy, that is, from 70% to 90%. In this article, a multilevel approach for selecting base classifiers for building an ensemble classification model is proposed. The basic idea behind this approach is to drop poorly performing features and collinearity from the selected data set before ensemble modeling. To evaluate the proposed multilevel predictive model, different data sets from the University of California, Irvine, repository were used and comparisons with modern classifier models were conducted. The implementation analyses demonstrate the effectiveness of the novel approach when compared with other modern classification models (three-layered artificial neural network, Radial Variant Function Neural Network/Fish Swarm Algorithm). The classification accuracy achieved with the selected algorithms lies in the range of 70%-88.3%. Among all the selected classification algorithms, the lowest accuracy, close to 71.9%, is achieved by the naive Bayes algorithm. However, the proposed algorithm (NB-RF-LR-SEMod), which is a combination of different classifiers, achieved a maximum accuracy of 88.3% on the Pima Indians Diabetes data set, which is better than any single classifier. Hence, this proposed work can help health care officials detect diabetes at an early stage and prevent future complications for the affected person.
Affiliation(s)
- Mukesh Kumar: Department of Computer Science & Engineering, Chitkara University School of Engineering and Technology, Chitkara University, Baddi, Himachal Pradesh, India
- Karan Bajaj: Department of Computer Science & Engineering, Chitkara University School of Engineering and Technology, Chitkara University, Baddi, Himachal Pradesh, India
- Bhisham Sharma: Department of Computer Science & Engineering, Chitkara University School of Engineering and Technology, Chitkara University, Baddi, Himachal Pradesh, India
- Sushil Narang: Department of Computer Science & Engineering, Chitkara University School of Engineering and Technology, Chitkara University, Baddi, Himachal Pradesh, India
|
13
|
Yan W, Tang W, Wang L, Bin Y, Xia J. PrMFTP: Multi-functional therapeutic peptides prediction based on multi-head self-attention mechanism and class weight optimization. PLoS Comput Biol 2022; 18:e1010511. [PMID: 36094961 PMCID: PMC9499272 DOI: 10.1371/journal.pcbi.1010511] [Citation(s) in RCA: 11] [Impact Index Per Article: 5.5] [Received: 06/03/2022] [Revised: 09/22/2022] [Accepted: 08/24/2022] [Indexed: 11/18/2022] [Open Access]
Abstract
Prediction of therapeutic peptides is a significant step in the discovery of promising therapeutic drugs. Most existing studies have focused on mono-functional therapeutic peptide prediction. However, the number of multi-functional therapeutic peptides (MFTPs) is growing rapidly, which calls for new computational schemes to facilitate MFTP discovery. In this study, based on a multi-head self-attention mechanism and a class weight optimization algorithm, we propose a novel model called PrMFTP for MFTP prediction. PrMFTP exploits a multi-scale convolutional neural network, bi-directional long short-term memory, and multi-head self-attention mechanisms to fully extract and learn informative features of peptide sequences to predict MFTPs. In addition, we design a class weight optimization scheme to address the problem of label-imbalanced data. Comprehensive evaluation demonstrates that PrMFTP is superior to other state-of-the-art computational methods for predicting MFTPs. We provide a user-friendly web server for PrMFTP, which is available at http://bioinfo.ahu.edu.cn/PrMFTP.
Affiliation(s)
- Wenhui Yan: Information Materials and Intelligent Sensing Laboratory of Anhui Province and Key Laboratory of Intelligent Computing and Signal Processing of Ministry of Education, Institutes of Physical Science and Information Technology, Anhui University, Hefei, Anhui, China
- Wending Tang: Information Materials and Intelligent Sensing Laboratory of Anhui Province and Key Laboratory of Intelligent Computing and Signal Processing of Ministry of Education, Institutes of Physical Science and Information Technology, Anhui University, Hefei, Anhui, China
- Lihua Wang: Information Materials and Intelligent Sensing Laboratory of Anhui Province and Key Laboratory of Intelligent Computing and Signal Processing of Ministry of Education, Institutes of Physical Science and Information Technology, Anhui University, Hefei, Anhui, China
- Yannan Bin (corresponding author): Information Materials and Intelligent Sensing Laboratory of Anhui Province and Key Laboratory of Intelligent Computing and Signal Processing of Ministry of Education, Institutes of Physical Science and Information Technology, Anhui University, Hefei, Anhui, China
- Junfeng Xia (corresponding author): Information Materials and Intelligent Sensing Laboratory of Anhui Province and Key Laboratory of Intelligent Computing and Signal Processing of Ministry of Education, Institutes of Physical Science and Information Technology, Anhui University, Hefei, Anhui, China
|
14
|
Prediction of anti-inflammatory peptides by a sequence-based stacking ensemble model named AIPStack. iScience 2022; 25:104967. [PMID: 36093066 PMCID: PMC9449674 DOI: 10.1016/j.isci.2022.104967] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 06/21/2022] [Revised: 08/09/2022] [Accepted: 08/12/2022] [Indexed: 11/23/2022] [Open Access]
Abstract
Accurate and efficient identification of anti-inflammatory peptides (AIPs) is crucial for the treatment of inflammation. Here, we propose a two-layer stacking ensemble model, AIPStack, to effectively predict AIPs. First, we constructed a new dataset for model building and validation. Then, peptide sequences were represented by hybrid features fused from two amino acid composition descriptors. Next, the stacking ensemble model was constructed with random forest and extremely randomized tree as the base classifiers and logistic regression as the meta-classifier, which receives the outputs of the base classifiers. AIPStack achieved an AUC of 0.819, accuracy of 0.755, and MCC of 0.510 on the independent set 3, which were higher than those of other AIP predictors. Furthermore, the essential sequence features were highlighted by the Shapley Additive exPlanation (SHAP) method. It is anticipated that AIPStack could be used for AIP prediction in a high-throughput manner and facilitate hypothesis-driven experimental design. Highlights: The AIPStack model was developed for the prediction of anti-inflammatory peptides. Hybrid features were used to describe the peptide sequences. The proposed model AIPStack outperformed existing ones. SHAP was used to highlight the essential features required for AIP prediction.
|
15
|
Fan R, Suo B, Ding Y. Identification of Vesicle Transport Proteins via Hypergraph Regularized K-Local Hyperplane Distance Nearest Neighbour Model. Front Genet 2022; 13:960388. [PMID: 35910197 PMCID: PMC9326258 DOI: 10.3389/fgene.2022.960388] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 06/02/2022] [Accepted: 06/22/2022] [Indexed: 12/04/2022] [Open Access]
Abstract
The prediction of protein function is a common topic in the field of bioinformatics. In recent years, advances in machine learning have inspired a growing number of algorithms for predicting protein function. A large number of parameters and fairly complex neural networks are often used to improve the prediction performance, an approach that is time-consuming and costly. In this study, we leveraged traditional features and machine learning classifiers to boost the performance of vesicle transport protein identification and make the prediction process faster. We adopt the pseudo position-specific scoring matrix (PsePSSM) feature and our proposed new classifier hypergraph regularized k-local hyperplane distance nearest neighbour (HG-HKNN) to classify vesicular transport proteins. We address dataset imbalances with random undersampling. The results show that our strategy has an area under the receiver operating characteristic curve (AUC) of 0.870 and a Matthews correlation coefficient (MCC) of 0.53 on the benchmark dataset, outperforming all state-of-the-art methods on the same dataset, and other metrics of our model are also comparable to existing methods.
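The abstract above mentions handling dataset imbalance with random undersampling. A minimal sketch of that general technique follows, assuming every class is reduced to the size of the smallest class; the function name, the seed parameter, and the balancing target are illustrative assumptions and are not taken from the paper.

```python
import random
from typing import Hashable, List, Sequence, Tuple


def random_undersample(samples: Sequence, labels: Sequence[Hashable],
                       seed: int = 0) -> Tuple[List, List]:
    """Randomly drop samples so that every class has as many items as the rarest class."""
    rng = random.Random(seed)  # fixed seed keeps the subset reproducible
    by_class = {}
    for sample, label in zip(samples, labels):
        by_class.setdefault(label, []).append(sample)
    target = min(len(group) for group in by_class.values())
    balanced = []
    for label, group in by_class.items():
        # rng.sample draws without replacement, so no sample is duplicated
        for sample in rng.sample(group, target):
            balanced.append((sample, label))
    rng.shuffle(balanced)  # avoid leaving the classes in contiguous blocks
    return [s for s, _ in balanced], [l for _, l in balanced]
```

Undersampling discards majority-class data, so it is normally applied to the training split only, leaving the evaluation data at its natural class ratio.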
Affiliation(s)
- Rui Fan: Institute of Fundamental and Frontier Sciences, University of Electronic Science and Technology of China, Chengdu, China; Yangtze Delta Region Institute (Quzhou), University of Electronic Science and Technology of China, Quzhou, China
- Bing Suo: Beidahuang Industry Group General Hospital, Harbin, China
- Yijie Ding: Yangtze Delta Region Institute (Quzhou), University of Electronic Science and Technology of China, Quzhou, China
|
16
|
Li Y, Li X, Liu Y, Yao Y, Huang G. MPMABP: A CNN and Bi-LSTM-Based Method for Predicting Multi-Activities of Bioactive Peptides. Pharmaceuticals (Basel) 2022; 15:707. [PMID: 35745625 PMCID: PMC9231127 DOI: 10.3390/ph15060707] [Citation(s) in RCA: 5] [Impact Index Per Article: 2.5] [Received: 04/22/2022] [Revised: 05/23/2022] [Accepted: 05/30/2022] [Indexed: 12/30/2022] [Open Access]
Abstract
Bioactive peptides are typically small functional peptides with 2-20 amino acid residues and play versatile roles in metabolic and biological processes. Bioactive peptides are multi-functional, so it is very challenging to accurately detect all their functions simultaneously. We proposed a convolutional neural network (CNN) and bi-directional long short-term memory (Bi-LSTM)-based deep learning method (called MPMABP) for recognizing multi-activities of bioactive peptides. MPMABP stacked five CNNs at different scales and used a residual network to prevent information loss. The empirical results showed that MPMABP is superior to the state-of-the-art methods. Analysis of the distribution of amino acids indicated that lysine tended to appear in anti-cancer peptides, leucine in anti-diabetic peptides, and proline in anti-hypertensive peptides. The method and analysis are beneficial for recognizing multi-activities of bioactive peptides.
Affiliation(s)
- You Li: School of Electrical Engineering, Shaoyang University, Shaoyang 422000, China
- Xueyong Li: School of Electrical Engineering, Shaoyang University, Shaoyang 422000, China
- Yuewu Liu: College of Information and Intelligence, Hunan Agricultural University, Changsha 410128, China
- Yuhua Yao: School of Mathematics and Statistics, Hainan Normal University, Haikou 571158, China
- Guohua Huang: School of Electrical Engineering, Shaoyang University, Shaoyang 422000, China
|
17
|
Liu P, Ding Y, Rong Y, Chen D. Prediction of cell penetrating peptides and their uptake efficiency using random forest-based feature selections. AIChE J 2022. [DOI: 10.1002/aic.17781] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 11/07/2022]
Affiliation(s)
- Peng Liu: Institute of Fundamental and Frontier Sciences, University of Electronic Science and Technology of China, Chengdu, China; Institute of Yangtze Delta Region (Quzhou), University of Electronic Science and Technology of China, Quzhou, China
- Yijie Ding: Institute of Yangtze Delta Region (Quzhou), University of Electronic Science and Technology of China, Quzhou, China
- Ying Rong: Beidahuang Industry Group General Hospital, Harbin, China
- Dong Chen: College of Electrical and Information Engineering, Quzhou University, Quzhou, China
|
18
|
Jiao S, Chen Z, Zhang L, Zhou X, Shi L. ATGPred-FL: sequence-based prediction of autophagy proteins with feature representation learning. Amino Acids 2022; 54:799-809. [PMID: 35286461 DOI: 10.1007/s00726-022-03145-5] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 09/26/2021] [Accepted: 01/28/2022] [Indexed: 11/26/2022]
Abstract
Autophagy plays an important role in biological evolution and is regulated by many autophagy proteins. Accurate identification of autophagy proteins is crucial for revealing their biological functions. Because experimental methods are expensive and labor-intensive, there is an urgent need for automated, accurate, and reliable sequence-based computational tools that can identify novel autophagy proteins among numerous proteins and peptides. For this purpose, a new predictor named ATGPred-FL was proposed for the efficient identification of autophagy proteins. We investigated various sequence-based feature descriptors and adopted the feature learning method to generate corresponding, more informative probability features. Then, a two-step feature selection strategy based on accuracy was used to remove irrelevant and redundant features, leading to the most discriminative 14-dimensional feature set. The final predictor was built using a support vector machine classifier, which performed favorably on both the training and testing sets with accuracy values of 94.40% and 90.50%, respectively. ATGPred-FL is the first ATG machine learning predictor based on protein primary sequences. We envision that ATGPred-FL will be an effective and useful tool for autophagy protein identification. It is available for free at http://lab.malab.cn/~acy/ATGPred-FL, and the source code and datasets are accessible at https://github.com/jiaoshihu/ATGPred.
Affiliation(s)
- Shihu Jiao: Yangtze Delta Region Institute (Quzhou), University of Electronic Science and Technology of China, Quzhou, China
- Zheng Chen: School of Applied Chemistry and Biological Technology, Shenzhen Polytechnic, 7098 Liuxian Street, Shenzhen, 518055, China; Institute of Fundamental and Frontier Sciences, University of Electronic Science and Technology of China, No.4 Block 2 North Jianshe Road, Chengdu, 61005, China
- Lichao Zhang: School of Intelligent Manufacturing and Equipment, Shenzhen Institute of Information Technology, Shenzhen, 518172, China
- Xun Zhou: Beidahuang Industry Group General Hospital, Harbin, 150001, China
- Lei Shi: Department of Spine Surgery, Changzheng Hospital, Naval Medical University, No 415, Fengyang Road, Huangpu District, Shanghai, 210000, China
|
19
|
Zhao Z, Yang W, Zhai Y, Liang Y, Zhao Y. Identify DNA-Binding Proteins Through the Extreme Gradient Boosting Algorithm. Front Genet 2022; 12:821996. [PMID: 35154264 PMCID: PMC8837382 DOI: 10.3389/fgene.2021.821996] [Citation(s) in RCA: 11] [Impact Index Per Article: 5.5] [Received: 11/25/2021] [Accepted: 12/07/2021] [Indexed: 12/13/2022] [Open Access]
Abstract
The exploration of DNA-binding proteins (DBPs) is an important aspect of studying biological life activities. Research on life activities requires the support of scientific research results on DBPs. The decline in many life activities is closely related to DBPs. Generally, the detection method for identifying DBPs is achieved through biochemical experiments. This method is inefficient and requires considerable manpower, material resources and time. At present, several computational approaches have been developed to detect DBPs, among which machine learning (ML) algorithm-based computational techniques have shown excellent performance. In our experiments, our method uses fewer features and simpler recognition methods than other methods and simultaneously obtains satisfactory results. First, we use six feature extraction methods to extract sequence features from the same group of DBPs. Then, this feature information is spliced together, and the data are standardized. Finally, the extreme gradient boosting (XGBoost) model is used to construct an effective predictive model. Compared with other excellent methods, our proposed method has achieved better results. The accuracy achieved by our method is 78.26% for PDB2272 and 85.48% for PDB186. The accuracy of the experimental results achieved by our strategy is similar to that of previous detection methods.
Affiliation(s)
- Ziye Zhao: College of Information and Computer Engineering, Northeast Forestry University, Harbin, China
- Wen Yang: International Medical Center, Shenzhen University General Hospital, Shenzhen, China
- Yixiao Zhai: College of Information and Computer Engineering, Northeast Forestry University, Harbin, China
- Yingjian Liang (corresponding author): Department of Obstetrics and Gynecology, The First Affiliated Hospital of Harbin Medical University, Harbin, China
- Yuming Zhao (corresponding author): College of Information and Computer Engineering, Northeast Forestry University, Harbin, China
|
20
|
Wan H, Zhang J, Ding Y, Wang H, Tian G. Immunoglobulin Classification Based on FC* and GC* Features. Front Genet 2022; 12:827161. [PMID: 35140745 PMCID: PMC8819591 DOI: 10.3389/fgene.2021.827161] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 12/01/2021] [Accepted: 12/22/2021] [Indexed: 11/13/2022] [Open Access]
Abstract
Immunoglobulins have a pivotal role in disease regulation. Therefore, it is vital to accurately identify immunoglobulins to develop new drugs and research related diseases. Compared with utilizing high-dimension features to identify immunoglobulins, this research aimed to examine a method to classify immunoglobulins and non-immunoglobulins using two features, FC* and GC*. Classification of 228 samples (109 immunoglobulin samples and 119 non-immunoglobulin samples) revealed that the overall accuracy was 80.7% in 10-fold cross-validation using the J48 classifier implemented in Weka software. The FC* feature identified in this study was found in the immunoglobulin subtype domain, which demonstrated that this extracted feature could represent functional and structural properties of immunoglobulins for forecasting.
Affiliation(s)
- Hao Wan: Institute of Advanced Cross-field Science, College of Life Science, Qingdao University, Qingdao, China
- Jina Zhang: Geneis (Beijing) Co., Ltd., Beijing, China
- Yijie Ding: Yangtze Delta Region Institute (Quzhou), University of Electronic Science and Technology of China, Quzhou, China
- Hetian Wang (corresponding author): Beidahuang Industry Group General Hospital, Harbin, China
- Geng Tian (corresponding author): Geneis (Beijing) Co., Ltd., Beijing, China
|
21
|
Zhai Y, Zhang J, Zhang T, Gong Y, Zhang Z, Zhang D, Zhao Y. AOPM: Application of Antioxidant Protein Classification Model in Predicting the Composition of Antioxidant Drugs. Front Pharmacol 2022; 12:818115. [PMID: 35115948 PMCID: PMC8803896 DOI: 10.3389/fphar.2021.818115] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 11/19/2021] [Accepted: 12/20/2021] [Indexed: 11/18/2022] [Open Access]
Abstract
Antioxidant proteins not only balance oxidative stress in the body but are also an important component of antioxidant drugs. Accurate identification of antioxidant proteins is essential to help humans fight diseases and develop new drugs. In this paper, we developed a user-friendly method, AOPM, to identify antioxidant proteins. 188D and the Composition of k-spaced Amino Acid Pairs were adopted as the feature extraction methods. In addition, the Max-Relevance-Max-Distance algorithm (MRMD) and random forest were used for feature selection and classification, respectively. We used 5-fold cross-validation and an independent test dataset to evaluate our model. On the test dataset, AOPM showed higher performance than the state-of-the-art methods. The sensitivity, specificity, accuracy, Matthews correlation coefficient, and area under the curve reached 87.3%, 94.2%, 92.0%, 0.815, and 0.972, respectively. In addition, AOPM also performs well in predicting the catalytic enzymes of antioxidant drugs. This work demonstrated the feasibility of virtual drug screening based on sequence information and provided new ideas and solutions for drug development.
Affiliation(s)
- Yixiao Zhai: College of Information and Computer Engineering, Northeast Forestry University, Harbin, China
- Jingyu Zhang: Department of Neurology, the Fourth Affiliated Hospital of Harbin Medical University, Harbin, China
- Tianjiao Zhang: College of Information and Computer Engineering, Northeast Forestry University, Harbin, China
- Yue Gong: College of Information and Computer Engineering, Northeast Forestry University, Harbin, China
- Zixiao Zhang: College of Information and Computer Engineering, Northeast Forestry University, Harbin, China
- Dandan Zhang (corresponding author): Department of Obstetrics and Gynecology, the First Affiliated Hospital of Harbin Medical University, Harbin, China
- Yuming Zhao (corresponding author): College of Information and Computer Engineering, Northeast Forestry University, Harbin, China
|
22
|
Zhang Z, Gong Y, Gao B, Li H, Gao W, Zhao Y, Dong B. SNAREs-SAP: SNARE Proteins Identification With PSSM Profiles. Front Genet 2022; 12:809001. [PMID: 34987554 PMCID: PMC8721734 DOI: 10.3389/fgene.2021.809001] [Citation(s) in RCA: 1] [Impact Index Per Article: 0.5] [Received: 11/04/2021] [Accepted: 11/15/2021] [Indexed: 12/20/2022] [Open Access]
Abstract
Soluble N-ethylmaleimide sensitive factor activating protein receptor (SNARE) proteins are a large family of transmembrane proteins located in organelles and vesicles. The important roles of SNARE proteins include initiating the vesicle fusion process and activating and fusing proteins as they undergo exocytosis activity, and SNARE proteins are also vital for the transport regulation of membrane proteins and non-regulatory vesicles. Therefore, establishing a method to efficiently identify SNARE proteins is of great significance. However, the identification accuracy of existing methods such as SNARE CNN is not satisfactory. In our study, we developed a method based on a support vector machine (SVM) that can effectively recognize SNARE proteins. We used the position-specific scoring matrix (PSSM) method to extract features of SNARE protein sequences, used the support vector machine recursive elimination correlation bias reduction (SVM-RFE-CBR) algorithm to rank the importance of features, and then selected the optimal feature subset based on the ranking. We fed the feature data into the model, used 10-fold cross-validation for training, and tested model performance on an independent dataset. In independent tests, our method identified SNARE proteins with a sensitivity of 68%, specificity of 94%, accuracy of 92%, area under the curve (AUC) of 84%, and Matthews correlation coefficient (MCC) of 0.48. The experimental results show that our method scores well on the common evaluation indicators, indicating that it performs better than other existing classification methods in identifying SNARE proteins.
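The abstract above reports sensitivity, specificity, accuracy, and MCC side by side. As a reminder of how these standard metrics relate, the sketch below computes them from binary confusion-matrix counts; the function name and the example counts in the usage note are hypothetical and are not taken from the paper.

```python
import math
from typing import Dict


def binary_metrics(tp: int, fp: int, tn: int, fn: int) -> Dict[str, float]:
    """Standard binary classification metrics from confusion-matrix counts."""

    def ratio(num: float, den: float) -> float:
        # Return 0.0 when the denominator is empty (a convention chosen for this sketch).
        return num / den if den else 0.0

    mcc_den = math.sqrt((tp + fp) * (tp + fn) * (tn + fp) * (tn + fn))
    return {
        "sensitivity": ratio(tp, tp + fn),  # true positive rate
        "specificity": ratio(tn, tn + fp),  # true negative rate
        "accuracy": ratio(tp + tn, tp + fp + tn + fn),
        "mcc": ratio(tp * tn - fp * fn, mcc_den),
    }
```

For example, `binary_metrics(tp=68, fp=60, tn=940, fn=32)` describes a hypothetical imbalanced test set; with few positives, accuracy is dominated by the negatives while MCC stays sensitive to errors in both classes, which is one reason the two can diverge.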
Affiliation(s)
- Zixiao Zhang: College of Information and Computer Engineering, Northeast Forestry University, Harbin, China
- Yue Gong: College of Information and Computer Engineering, Northeast Forestry University, Harbin, China
- Bo Gao: Department of Radiology, The Second Affiliated Hospital, Harbin Medical University, Harbin, China
- Hongfei Li: College of Information and Computer Engineering, Northeast Forestry University, Harbin, China
- Wentao Gao: College of Information and Computer Engineering, Northeast Forestry University, Harbin, China
- Yuming Zhao: College of Information and Computer Engineering, Northeast Forestry University, Harbin, China
- Benzhi Dong: College of Information and Computer Engineering, Northeast Forestry University, Harbin, China
|
23
|
Zhou H, Wang H, Ding Y, Tang J. Multivariate Information Fusion for Identifying Antifungal Peptides with Hilbert-Schmidt Independence Criterion. Curr Bioinform 2022. [DOI: 10.2174/1574893616666210727161003] [Citation(s) in RCA: 3] [Impact Index Per Article: 1.5] [Indexed: 11/22/2022]
Abstract
Background: Antifungal peptides (AFPs) have been found to be effective against many fungal infections.
Objective: However, AFPs are difficult to identify. Therefore, it is of great practical significance to identify AFPs from sequence information via machine learning methods.
Method: In this study, a Multi-Kernel Support Vector Machine (MKSVM) with the Hilbert-Schmidt Independence Criterion (HSIC) is proposed. Proteins are encoded with five types of features (188-bit, AAC, ASDC, CKSAAP, DPC), and kernels are then constructed using the Gaussian kernel function. HSIC is used to combine the kernels, and a multi-kernel SVM model is built.
Results: Our model performed well on three AFP datasets, and its performance is better than or comparable to other state-of-the-art predictive models.
Conclusion: Our method will be a useful tool for identifying antifungal peptides.
Affiliation(s)
- Haohao Zhou: School of Computer Science and Technology, College of Intelligence and Computing, Tianjin University, Tianjin, 300354, China
- Hao Wang: School of Computer Science and Technology, College of Intelligence and Computing, Tianjin University, Tianjin, 300354, China
- Yijie Ding: School of Electronic and Information Engineering, Suzhou University of Science and Technology, Suzhou, 215009, China; Yangtze Delta Region Institute (Quzhou), University of Electronic Science and Technology of China, Quzhou, 324000, China
- Jijun Tang: Department of Computer Science and Engineering, University of South Carolina, Columbia, SC 29208, USA; Shenzhen Institute of Advanced Technology, Chinese Academy of Sciences, Shenzhen, 518055, China
|
24
|
Gu X, Guo L, Liao B, Jiang Q. Pseudo-188D: Phage Protein Prediction Based on a Model of Pseudo-188D. Front Genet 2021; 12:796327. [PMID: 34925468 PMCID: PMC8672092 DOI: 10.3389/fgene.2021.796327] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 10/21/2021] [Accepted: 11/15/2021] [Indexed: 11/13/2022] [Open Access]
Abstract
Phages have seriously affected the biochemical systems of the world, and not only are phages related to our health, but medical treatments for many cancers and skin infections are related to phages; therefore, this paper sought to identify phage proteins. In this paper, a Pseudo-188D model was established. The digital features of the phage were extracted by PseudoKNC, an appropriate vector was selected by the AdaBoost tool, and features were extracted by 188D. Then, the extracted digital features were combined together, and finally, the viral proteins of the phage were predicted by a stochastic gradient descent algorithm. Our model effect reached 93.4853%. To verify the stability of our model, we randomly selected 80% of the downloaded data to train the model and used the remaining 20% of the data to verify the robustness of our model.
Affiliation(s)
- Xiaomei Gu
- Key Laboratory of Computational Science and Application of Hainan Province, Haikou, China.,Institute of Yangtze River Delta, University of Electronic Science and Technology of China, Haikou, China.,Key Laboratory of Data Science and Intelligence Education, Hainan Normal University, Ministry of Education, Haikou, China.,School of Mathematics and Statistics, Hainan Normal University, Haikou, China
| | - Lina Guo
- Beidahuang Industry Group General Hospital, Harbin, China
| | - Bo Liao
- Key Laboratory of Computational Science and Application of Hainan Province, Haikou, China.,Key Laboratory of Data Science and Intelligence Education, Hainan Normal University, Ministry of Education, Haikou, China.,School of Mathematics and Statistics, Hainan Normal University, Haikou, China
| | - Qinghua Jiang
- Key Laboratory of Computational Science and Application of Hainan Province, Haikou, China.,Key Laboratory of Data Science and Intelligence Education, Hainan Normal University, Ministry of Education, Haikou, China.,School of Mathematics and Statistics, Hainan Normal University, Haikou, China
| |
|
25
|
Zhao D, Teng Z, Li Y, Chen D. iAIPs: Identifying Anti-Inflammatory Peptides Using Random Forest. Front Genet 2021; 12:773202. [PMID: 34917130 PMCID: PMC8669811 DOI: 10.3389/fgene.2021.773202] [Citation(s) in RCA: 5] [Impact Index Per Article: 1.7] [Received: 09/09/2021] [Accepted: 10/08/2021] [Indexed: 12/25/2022] Open
Abstract
Recently, several anti-inflammatory peptides (AIPs) have been found in the process of the inflammatory response, and these peptides have been used to treat some inflammatory and autoimmune diseases. Therefore, identifying AIPs accurately from given amino acid sequences is critical for the discovery of novel and efficient anti-inflammatory peptide-based therapeutics and the acceleration of their application in therapy. In this paper, a random forest-based model called iAIPs for identifying AIPs is proposed. First, the original samples were encoded with three feature extraction methods: g-gap dipeptide composition (GDC), dipeptide deviation from the expected mean (DDE), and amino acid composition (AAC). Second, the optimal feature subset was generated by a two-step feature selection method, in which the features were ranked by analysis of variance (ANOVA) and the optimal subset was then chosen by an incremental feature selection strategy. Finally, the optimal feature subset was fed into the random forest classifier to construct the identification model. Experimental results showed that iAIPs achieved an AUC value of 0.822 on an independent test dataset, which indicated that our proposed model performs better than the existing methods. Furthermore, the extraction of features for peptide sequences provides the basis for evolutionary analysis. The study of peptide identification is helpful for understanding the diversity of species and analyzing their evolutionary history.
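Two of the three encodings named in this abstract, AAC and GDC, can be illustrated with a short sketch. It assumes the commonly used definitions (AAC as the frequency of each of the 20 standard residues; GDC as the frequency of residue pairs separated by `g` positions, with `g = 0` meaning adjacent residues). The function names are hypothetical, and DDE, the ANOVA ranking and the random forest are not reproduced here.

```python
from itertools import product

AMINO_ACIDS = "ACDEFGHIKLMNPQRSTVWY"  # the 20 standard residues


def aac(seq):
    """Amino acid composition: fraction of each standard residue in `seq`."""
    counts = {a: 0 for a in AMINO_ACIDS}
    for residue in seq.upper():
        if residue in counts:              # skip non-standard symbols
            counts[residue] += 1
    total = sum(counts.values())
    return {a: (c / total if total else 0.0) for a, c in counts.items()}


def gdc(seq, g=0):
    """g-gap dipeptide composition: fraction of pairs (seq[i], seq[i+g+1])."""
    seq = seq.upper()
    counts = {a + b: 0 for a, b in product(AMINO_ACIDS, repeat=2)}
    for i in range(len(seq) - g - 1):
        pair = seq[i] + seq[i + g + 1]
        if pair in counts:                 # skip pairs with non-standard symbols
            counts[pair] += 1
    total = sum(counts.values())
    return {p: (c / total if total else 0.0) for p, c in counts.items()}


features = list(aac("ACAC").values()) + list(gdc("ACAC", g=1).values())
print(len(features))  # 20 AAC values followed by 400 GDC values
```

Concatenating the two dictionaries' values in a fixed order gives one numeric vector per peptide, which is the kind of input a feature-ranking step and a classifier would then consume.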
Affiliation(s)
- Dongxu Zhao
- College of Information and Computer Engineering, Northeast Forestry University, Harbin, China
| | - Zhixia Teng
- College of Information and Computer Engineering, Northeast Forestry University, Harbin, China
| | - Yanjuan Li
- College of Electrical and Information Engineering, Quzhou University, Quzhou, China
| | - Dong Chen
- College of Electrical and Information Engineering, Quzhou University, Quzhou, China
| |
|
26
|
Jia Y, Huang S, Zhang T. KK-DBP: A Multi-Feature Fusion Method for DNA-Binding Protein Identification Based on Random Forest. Front Genet 2021; 12:811158. [PMID: 34912382 PMCID: PMC8667860 DOI: 10.3389/fgene.2021.811158] [Citation(s) in RCA: 1] [Impact Index Per Article: 0.3] [Received: 11/08/2021] [Accepted: 11/15/2021] [Indexed: 02/04/2023] Open
Abstract
A DNA-binding protein (DBP) is a protein with a special DNA-binding domain that is associated with many important molecular biological mechanisms. Rapid development of computational methods has made it possible to predict DBPs on a large scale; however, existing methods do not fully integrate DBP-related features, resulting in imprecise predictions. In this article, we develop a DNA-binding protein identification method called KK-DBP. To improve prediction accuracy, we propose a feature extraction method that fuses multiple PSSM features. The experimental results show a prediction accuracy of 81.22% on the independent test dataset PDB186, which is the highest of all existing methods.
Affiliation(s)
- Yuran Jia
- College of Information and Computer Engineering, Northeast Forestry University, Harbin, China
| | - Shan Huang
- Department of Neurology, The Second Affiliated Hospital of Harbin Medical University, Harbin, China
| | - Tianjiao Zhang
- College of Information and Computer Engineering, Northeast Forestry University, Harbin, China
| |
|
27
|
Lin X. Genomic Variation Prediction: A Summary From Different Views. Front Cell Dev Biol 2021; 9:795883. [PMID: 34901036 PMCID: PMC8656232 DOI: 10.3389/fcell.2021.795883] [Citation(s) in RCA: 3] [Impact Index Per Article: 1.0] [Received: 10/15/2021] [Accepted: 11/11/2021] [Indexed: 12/02/2022] Open
Abstract
Structural variations in the genome are closely related to human health and the occurrence and development of various diseases. To understand the mechanisms of diseases, find pathogenic targets, and carry out personalized precision medicine, it is critical to detect such variations. The rapid development of high-throughput sequencing technologies has accelerated the accumulation of large amounts of genomic mutation data, including synonymous mutations. Identifying pathogenic synonymous mutations that play important roles in the occurrence and development of diseases from all the available mutation data is of great importance. In this paper, machine learning theories and methods are reviewed, efficient and accurate pathogenic synonymous mutation prediction methods are developed, and a standardized three-level variant analysis framework is constructed. In addition, multiple variation tolerance prediction models are studied and integrated, and new ideas for structural variation detection based on deep information mining are explored.
Affiliation(s)
- Xiuchun Lin
- College of Information and Electrical Engineering, China Agricultural University, Beijing, China
| |
|
28
|
Zhang C, Guo C, Li Y, Liu K, Zhao Q, Ouyang L. Identification of Claudin-6 as a Molecular Biomarker in Pan-Cancer Through Multiple Omics Integrative Analysis. Front Cell Dev Biol 2021; 9:726656. [PMID: 34409042 PMCID: PMC8365468 DOI: 10.3389/fcell.2021.726656] [Citation(s) in RCA: 24] [Impact Index Per Article: 8.0] [Received: 06/17/2021] [Accepted: 07/08/2021] [Indexed: 12/13/2022] Open
Abstract
Claudin-6 (CLDN6) is one of the 27 members of the claudin family and is mainly involved in the tight junctions and cell-to-cell adhesion of epithelial cell sheets, playing a significant role in cancer initiation and progression. To characterize the significance of CLDN6 across a variety of malignant tumors more systematically and comprehensively, we explored CLDN6 through integrative analysis of multiple omics data, including gene expression level in pan-cancer, comparison of CLDN6 expression across molecular and immune subtypes of pan-cancer, targeted protein, biological functions, molecular signatures, diagnostic value, and prognostic value in pan-cancer. Furthermore, we focused on uterine corpus endometrial carcinoma (UCEC) and investigated CLDN6 in terms of its correlations with clinical characteristics, prognosis in different clinical subgroups, co-expression genes, and differentially expressed genes (DEGs), building on the validation of its established monoclonal antibody by immunohistochemical staining and semi-quantification reported in a previous study. As a result, CLDN6 expression differs significantly not only in most cancers but also across molecular and immune subtypes of cancers. In addition, its high accuracy in predicting cancers and notable correlations with prognosis in certain cancers suggest that CLDN6 might be a potential diagnostic and prognostic biomarker of cancers. CLDN6 is also significantly correlated with age, stage, weight, histological type, histologic grade, and menopause status in UCEC. Moreover, high CLDN6 expression can lead to worse overall survival (OS), disease-specific survival (DSS), and progression-free interval (PFI) in UCEC, especially in different clinical subgroups of UCEC.
Taken together, CLDN6 may be a remarkable molecular biomarker for diagnosis and prognosis in pan-cancer and an independent prognostic risk factor of UCEC, making it a promising molecular target for cancer therapy.
Affiliation(s)
- Chiyuan Zhang
- Department of Obstetrics and Gynecology, Shengjing Hospital of China Medical University, Shenyang, China
| | - Cuishan Guo
- Department of Obstetrics and Gynecology, Shengjing Hospital of China Medical University, Shenyang, China
| | - Yan Li
- Department of Obstetrics and Gynecology, Shengjing Hospital of China Medical University, Shenyang, China
| | - Kuiran Liu
- Department of Obstetrics and Gynecology, Shengjing Hospital of China Medical University, Shenyang, China
| | - Qi Zhao
- School of Computer Science and Software Engineering, University of Science and Technology Liaoning, Anshan, China
| | - Ling Ouyang
- Department of Obstetrics and Gynecology, Shengjing Hospital of China Medical University, Shenyang, China
| |
|
29
|
Yang H, Ding Y, Tang J, Guo F. Identifying potential association on gene-disease network via dual hypergraph regularized least squares. BMC Genomics 2021; 22:605. [PMID: 34372777 PMCID: PMC8351363 DOI: 10.1186/s12864-021-07864-z] [Citation(s) in RCA: 3] [Impact Index Per Article: 1.0] [Received: 04/09/2021] [Accepted: 06/29/2021] [Indexed: 12/27/2022] Open
Abstract
BACKGROUND Identifying potential associations between genes and diseases via biomedical experiments is time-consuming and expensive. Computational technologies based on machine learning models have been widely used to explore genetic information related to complex diseases. Importantly, gene-disease association detection can be defined as a link prediction problem in a bipartite network. However, many existing methods do not utilize multiple sources of biological information; additionally, they do not extract higher-order relationships among genes and diseases. RESULTS In this study, we propose a novel method called Dual Hypergraph Regularized Least Squares (DHRLS) with Centered Kernel Alignment-based Multiple Kernel Learning (CKA-MKL) to detect all potential gene-disease associations. First, we construct multiple kernels based on various biological data sources in the gene and disease spaces, respectively. After that, we use CKA-MKL to obtain the optimal kernels in the two spaces. Specifically, hypergraphs are employed to establish higher-order relationships. Finally, our DHRLS model is solved by the Alternating Least Squares algorithm (ALSA) to predict gene-disease associations. CONCLUSION Compared with many outstanding prediction tools, DHRLS achieves the best performance on the gene-disease association network under two types of cross-validation. In robustness tests, our proposed approach also shows excellent prediction performance on six real-world networks. Our research work can effectively discover potential disease-associated genes and provide guidance for the follow-up verification methods of complex diseases.
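The quantity behind CKA-MKL, centered kernel alignment between two kernel matrices, can be sketched as below. The sketch assumes the standard definition (the Frobenius inner product of the two centered kernels divided by the product of their Frobenius norms). The function names are hypothetical, and the kernel-weight optimization, the hypergraph regularization and the alternating least squares solver from the abstract are not reproduced.

```python
import math


def center(K):
    """Center a square kernel matrix: K_ij - row_mean_i - col_mean_j + grand_mean."""
    n = len(K)
    row = [sum(r) / n for r in K]
    col = [sum(K[i][j] for i in range(n)) / n for j in range(n)]
    grand = sum(row) / n
    return [[K[i][j] - row[i] - col[j] + grand for j in range(n)]
            for i in range(n)]


def frobenius(A, B):
    """Frobenius inner product: sum of elementwise products of A and B."""
    return sum(a * b for ra, rb in zip(A, B) for a, b in zip(ra, rb))


def centered_alignment(K1, K2):
    """Centered kernel alignment in [-1, 1]; 0.0 if either centered kernel is zero."""
    C1, C2 = center(K1), center(K2)
    denom = math.sqrt(frobenius(C1, C1) * frobenius(C2, C2))
    return frobenius(C1, C2) / denom if denom else 0.0


identity = [[1.0, 0.0], [0.0, 1.0]]
swapped = [[0.0, 1.0], [1.0, 0.0]]
print(centered_alignment(identity, identity))
print(centered_alignment(identity, swapped))
```

A kernel aligns perfectly with itself or any positive rescaling of itself, which is why alignment with an ideal target kernel is a natural score for weighting several candidate kernels.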
Affiliation(s)
- Hongpeng Yang
- School of Computer Science and Technology, College of Intelligence and Computing, Tianjin University, Tianjin, China
| | - Yijie Ding
- Yangtze Delta Region Institute, University of Electronic Science and Technology of China, Quzhou, China.
| | - Jijun Tang
- Shenzhen Institute of Advanced Technology, Chinese Academy of Sciences, Shenzhen, China
| | - Fei Guo
- School of Computer Science and Engineering, Central South University, Changsha, China.
| |
|
30
|
Zhang C, Guo C, Li Y, Ouyang L, Zhao Q, Liu K. The role of YTH domain containing 2 in epigenetic modification and immune infiltration of pan-cancer. J Cell Mol Med 2021; 25:8615-8627. [PMID: 34312987 PMCID: PMC8435423 DOI: 10.1111/jcmm.16818] [Citation(s) in RCA: 18] [Impact Index Per Article: 6.0] [Received: 05/22/2021] [Revised: 07/11/2021] [Accepted: 07/15/2021] [Indexed: 12/17/2022] Open
Abstract
YTH domain containing 2 (YTHDC2) is the largest N6‐methyladenosine (m6A) binding protein of the YTH protein family and the only member with ATP‐dependent RNA helicase activity. To further analyse its biological role in epigenetic modification, we comprehensively explored YTHDC2 in terms of gene expression, genetic alteration, protein‐protein interaction (PPI) network, immune infiltration, diagnostic value and prognostic value in pan‐cancer, using a series of databases and bioinformatic tools. We found that YTHDC2 with missense mutation could cause a different prognosis in uterine corpus endometrial carcinoma (UCEC), and that different methylation levels could lead to a markedly different prognosis in adrenocortical carcinoma (ACC), cervical squamous cell carcinoma and endocervical adenocarcinoma (CESC), lung squamous cell carcinoma (LUSC) and UCEC. The main molecular mechanisms of YTHDC2 centred on catalytic activity, helicase activity, snRNA binding, spliceosome and mRNA surveillance. Additionally, YTHDC2 was notably correlated with tumour immune infiltration. Moreover, YTHDC2 had a high diagnostic value for seven cancer types and a prognostic value for brain lower grade glioma (LGG), rectum adenocarcinoma (READ) and skin cutaneous melanoma (SKCM). Collectively, YTHDC2 plays a significant role in epigenetic modification and immune infiltration and may be a potential biomarker for diagnosis and prognosis in certain cancers.
Affiliation(s)
- Chiyuan Zhang
- Department of Obstetrics and Gynecology, Shengjing Hospital of China Medical University, Shenyang, China
| | - Cuishan Guo
- Department of Obstetrics and Gynecology, Shengjing Hospital of China Medical University, Shenyang, China
| | - Yan Li
- Department of Obstetrics and Gynecology, Shengjing Hospital of China Medical University, Shenyang, China
| | - Ling Ouyang
- Department of Obstetrics and Gynecology, Shengjing Hospital of China Medical University, Shenyang, China
| | - Qi Zhao
- School of Computer Science and Software Engineering, University of Science and Technology Liaoning, Anshan, China
| | - Kuiran Liu
- Department of Obstetrics and Gynecology, Shengjing Hospital of China Medical University, Shenyang, China
| |
|
31
|
Yang C, Ding Y, Meng Q, Tang J, Guo F. Granular multiple kernel learning for identifying RNA-binding protein residues via integrating sequence and structure information. Neural Comput Appl 2021. [DOI: 10.1007/s00521-020-05573-4] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 10/22/2022]
|