1
|
Wang X, Wang S. ACP-PDAFF: Pretrained model and dual-channel attentional feature fusion for anticancer peptides prediction. Comput Biol Chem 2024; 112:108141. [PMID: 38996756 DOI: 10.1016/j.compbiolchem.2024.108141] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Reference Citation Analysis] [Abstract] [Key Words] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Received: 01/01/2024] [Revised: 05/26/2024] [Accepted: 06/28/2024] [Indexed: 07/14/2024]
Abstract
Anticancer peptides(ACPs) have attracted significant interest as a novel method of treating cancer due to their ability to selectively kill cancer cells without damaging normal cells. Many artificial intelligence-based methods have demonstrated impressive performance in predicting ACPs. Nevertheless, the limitations of existing methods in feature engineering include handcrafted features driven by prior knowledge, insufficient feature extraction, and inefficient feature fusion. In this study, we propose a model based on a pretrained model, and dual-channel attentional feature fusion(DAFF), called ACP-PDAFF. Firstly, to reduce the heavy dependence on expert knowledge-based handcrafted features, binary profile features (BPF) and physicochemical properties features(PCPF) are used as inputs to the transformer model. Secondly, aimed at learning more diverse feature informations of ACPs, a pretrained model ProtBert is utilized. Thirdly, for better fusion of different feature channels, DAFF is employed. Finally, to evaluate the performance of the model, we compare it with other methods on five benchmark datasets, including ACP-Mixed-80 dataset, Main and Alternate datasets of AntiCP 2.0, LEE and Independet dataset, and ACPred-Fuse dataset. And the accuracies obtained by ACP-PDAFF are 0.86, 0.80, 0.94, 0.97 and 0.95 on five datasets, respectively, higher than existing methods by 1% to 12%. Therefore, by learning rich feature informations and effectively fusing different feature channels, ACD-PDAFF achieves outstanding performance. Our code and the datasets are available at https://github.com/wongsing/ACP-PDAFF.
Collapse
Affiliation(s)
- Xinyi Wang
- Department of Computer Science and Engineering, School of Information Science and Engineering, Yunnan University, Kunming, 650504, Yunnan, China
| | - Shunfang Wang
- Department of Computer Science and Engineering, School of Information Science and Engineering, Yunnan University, Kunming, 650504, Yunnan, China.
| |
Collapse
|
2
|
Ghafoor H, Asim MN, Ibrahim MA, Ahmed S, Dengel A. CAPTURE: Comprehensive anti-cancer peptide predictor with a unique amino acid sequence encoder. Comput Biol Med 2024; 176:108538. [PMID: 38759585 DOI: 10.1016/j.compbiomed.2024.108538] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Reference Citation Analysis] [Abstract] [Key Words] [MESH Headings] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Received: 01/08/2024] [Revised: 04/26/2024] [Accepted: 04/28/2024] [Indexed: 05/19/2024]
Abstract
Anticancer peptides (ACPs) key properties including bioactivity, high efficacy, low toxicity, and lack of drug resistance make them ideal candidates for cancer therapies. To deeply explore the potential of ACPs and accelerate development of cancer therapies, although 53 Artificial Intelligence supported computational predictors have been developed for ACPs and non ACPs classification but only one predictor has been developed for ACPs functional types annotations. Moreover, these predictors extract amino acids distribution patterns to transform peptides sequences into statistical vectors that are further fed to classifiers for discriminating peptides sequences and annotating peptides functional classes. Overall, these predictors remain fail in extracting diverse types of amino acids distribution patterns from peptide sequences. The paper in hand presents a unique CARE encoder that transforms peptides sequences into statistical vectors by extracting 4 different types of distribution patterns including correlation, distribution, composition, and transition. Across public benchmark dataset, proposed encoder potential is explored under two different evaluation settings namely; intrinsic and extrinsic. Extrinsic evaluation indicates that 12 different machine learning classifiers achieve superior performance with the proposed encoder as compared to 55 existing encoders. Furthermore, an intrinsic evaluation reveals that, unlike existing encoders, the proposed encoder generates more discriminative clusters for ACPs and non-ACPs classes. Across 8 public benchmark ACPs and non-ACPs classification datasets, proposed encoder and Adaboost classifier based CAPTURE predictor outperforms existing predictors with an average accuracy, recall and MCC score of 1%, 4%, and 2% respectively. In generalizeability evaluation case study, across 7 benchmark anti-microbial peptides classification datasets, CAPTURE surpasses existing predictors by an average AU-ROC of 2%. CAPTURE predictive pipeline along with label powerset method outperforms state-of-the-art ACPs functional types predictor by 5%, 5%, 5%, 6%, and 3% in terms of average accuracy, subset accuracy, precision, recall, and F1 respectively. CAPTURE web application is available at https://sds_genetic_analysis.opendfki.de/CAPTURE.
Collapse
Affiliation(s)
- Hina Ghafoor
- Department of Computer Science, Rhineland-Palatinate Technical University of Kaiserslautern-Landau, Kaiserslautern, 67663, Germany; German Research Center for Artificial Intelligence GmbH, Kaiserslautern, 67663, Germany
| | - Muhammad Nabeel Asim
- German Research Center for Artificial Intelligence GmbH, Kaiserslautern, 67663, Germany.
| | - Muhammad Ali Ibrahim
- Department of Computer Science, Rhineland-Palatinate Technical University of Kaiserslautern-Landau, Kaiserslautern, 67663, Germany; German Research Center for Artificial Intelligence GmbH, Kaiserslautern, 67663, Germany
| | - Sheraz Ahmed
- German Research Center for Artificial Intelligence GmbH, Kaiserslautern, 67663, Germany
| | - Andreas Dengel
- Department of Computer Science, Rhineland-Palatinate Technical University of Kaiserslautern-Landau, Kaiserslautern, 67663, Germany; German Research Center for Artificial Intelligence GmbH, Kaiserslautern, 67663, Germany
| |
Collapse
|
3
|
Song H, Lin X, Zhang H, Yin H. ACP-ESM2: The prediction of anticancer peptides based on pre-trained classifier. Comput Biol Chem 2024; 110:108091. [PMID: 38735271 DOI: 10.1016/j.compbiolchem.2024.108091] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Reference Citation Analysis] [Abstract] [Key Words] [MESH Headings] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Received: 02/03/2024] [Revised: 04/07/2024] [Accepted: 04/29/2024] [Indexed: 05/14/2024]
Abstract
Anticancer peptides (ACPs) are a type of protein molecule that has anti-cancer activity and can inhibit cancer cell growth and survival. Traditional classification approaches for ACPs are expensive and time-consuming. This paper proposes a pre-trained classifier model, ESM2-GRU, for ACP prediction to make it easier to predict ACPs, gain a better understanding of the structural and functional differences of anti-cancer peptides, and optimize the design for the development of more effective anti-cancer treatment strategies. The model is made up of the ESM2 pre-trained model, a bidirectional GRU recurrent neural network, and a fully connected layer. ACP sequences are first fed into the ESM2 model, which then expands the dimensions before feeding the findings back into the bidirectional GRU recurrent neural network. Finally, the fully connected layer generates the ultimate output. Experimental validation demonstrates that the ESM2-GRU model greatly improves classification performance on the benchmark dataset ACP606, with AUC, ACC, and MCC values of 0.975, 0.852, and 0.738, respectively. This exceptional prediction potential helps to identify specific types of anti-cancer peptides, improving their targeting and selectivity and, therefore, furthering the development of tailored medicine and treatments.
Collapse
Affiliation(s)
- Huijia Song
- School of Information Engineering, Beijing Institute of Petrochemical Technology, Beijing 102617, China
| | - Xiaozhu Lin
- School of Information Engineering, Beijing Institute of Petrochemical Technology, Beijing 102617, China.
| | - Huainian Zhang
- School of Information Engineering, Beijing Institute of Petrochemical Technology, Beijing 102617, China
| | - Huijuan Yin
- School of Information Engineering, Beijing Institute of Petrochemical Technology, Beijing 102617, China
| |
Collapse
|
4
|
Liao YH, Chen SZ, Bin YN, Zhao JP, Feng XL, Zheng CH. UsIL-6: An unbalanced learning strategy for identifying IL-6 inducing peptides by undersampling technique. COMPUTER METHODS AND PROGRAMS IN BIOMEDICINE 2024; 250:108176. [PMID: 38677081 DOI: 10.1016/j.cmpb.2024.108176] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Reference Citation Analysis] [Abstract] [Key Words] [MESH Headings] [Track Full Text] [Subscribe] [Scholar Register] [Received: 08/18/2022] [Revised: 03/26/2024] [Accepted: 04/11/2024] [Indexed: 04/29/2024]
Abstract
BACKGROUND AND OBJECTIVE Interleukin-6 (IL-6) is the critical factor of early warning, monitoring, and prognosis in the inflammatory storm of COVID-19 cases. IL-6 inducing peptides, which can induce cytokine IL-6 production, are very important for the development of diagnosis and immunotherapy. Although the existing methods have some success in predicting IL-6 inducing peptides, there is still room for improvement in the performance of these models in practical application. METHODS In this study, we proposed UsIL-6, a high-performance bioinformatics tool for identifying IL-6 inducing peptides. First, we extracted five groups of physicochemical properties and sequence structural information from IL-6 inducing peptide sequences, and obtained a 636-dimensional feature vector, we also employed NearMiss3 undersampling method and normalization method StandardScaler to process the data. Then, a 40-dimensional optimal feature vector was obtained by Boruta feature selection method. Finally, we combined this feature vector with extreme randomization tree classifier to build the final model UsIL-6. RESULTS The AUC value of UsIL-6 on the independent test dataset was 0.87, and the BACC value was 0.808, which indicated that UsIL-6 had better performance than the existing methods in IL-6 inducing peptide recognition. CONCLUSIONS The performance comparison on independent test dataset confirmed that UsIL-6 could achieve the highest performance, best robustness, and most excellent generalization ability. We hope that UsIL-6 will become a valuable method to identify, annotate and characterize new IL-6 inducing peptides.
Collapse
Affiliation(s)
- Yan-Hong Liao
- School of Mathematics and System Science, Xinjiang University, Urumqi, Xinjiang 830017, China
| | - Shou-Zhi Chen
- School of Mathematics and System Science, Xinjiang University, Urumqi, Xinjiang 830017, China
| | - Yan-Nan Bin
- School of Computer Science and Technology, Anhui University, Hefei, Anhui 230601, China
| | - Jian-Ping Zhao
- School of Mathematics and System Science, Xinjiang University, Urumqi, Xinjiang 830017, China.
| | - Xin-Long Feng
- School of Mathematics and System Science, Xinjiang University, Urumqi, Xinjiang 830017, China.
| | - Chun-Hou Zheng
- School of Mathematics and System Science, Xinjiang University, Urumqi, Xinjiang 830017, China; School of Computer Science and Technology, Anhui University, Hefei, Anhui 230601, China
| |
Collapse
|
5
|
Chen Z, Wang R, Guo J, Wang X. The role and future prospects of artificial intelligence algorithms in peptide drug development. Biomed Pharmacother 2024; 175:116709. [PMID: 38713945 DOI: 10.1016/j.biopha.2024.116709] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Reference Citation Analysis] [Abstract] [Key Words] [MESH Headings] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Received: 03/10/2024] [Revised: 05/01/2024] [Accepted: 05/02/2024] [Indexed: 05/09/2024] Open
Abstract
Peptide medications have been more well-known in recent years due to their many benefits, including low side effects, high biological activity, specificity, effectiveness, and so on. Over 100 peptide medications have been introduced to the market to treat a variety of illnesses. Most of these peptide medications are developed on the basis of endogenous peptides or natural peptides, which frequently required expensive, time-consuming, and extensive tests to confirm. As artificial intelligence advances quickly, it is now possible to build machine learning or deep learning models that screen a large number of candidate sequences for therapeutic peptides. Therapeutic peptides, such as those with antibacterial or anticancer properties, have been developed by the application of artificial intelligence algorithms.The process of finding and developing peptide drugs is outlined in this review, along with a few related cases that were helped by AI and conventional methods. These resources will open up new avenues for peptide drug development and discovery, helping to meet the pressing needs of clinical patients for disease treatment. Although peptide drugs are a new class of biopharmaceuticals that distinguish them from chemical and small molecule drugs, their clinical purpose and value cannot be ignored. However, the traditional peptide drug research and development has a long development cycle and high investment, and the creation of peptide medications will be substantially hastened by the AI-assisted (AI+) mode, offering a new boost for combating diseases.
Collapse
Affiliation(s)
- Zhiheng Chen
- School of Biological Science and Medical Engineering, Beihang University, Beijing 100083, China.
| | - Ruoxi Wang
- School of Biological Science and Medical Engineering, Beihang University, Beijing 100083, China.
| | - Junqi Guo
- School of Biological Science and Medical Engineering, Beihang University, Beijing 100083, China.
| | - Xiaogang Wang
- Guangdong Provincial Key Laboratory of Bone and Joint Degenerative Diseases, The Third Affiliated Hospital of Southern Medical University, Guangzhou, Guangdong 510630, China.
| |
Collapse
|
6
|
Liang X, Zhao H, Wang J. MA-PEP: A novel anticancer peptide prediction framework with multimodal feature fusion based on attention mechanism. Protein Sci 2024; 33:e4966. [PMID: 38532681 DOI: 10.1002/pro.4966] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Reference Citation Analysis] [Abstract] [Key Words] [Grants] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Received: 09/30/2023] [Revised: 01/30/2024] [Accepted: 03/06/2024] [Indexed: 03/28/2024]
Abstract
AntiCancer Peptides (ACPs) have emerged as promising therapeutic agents for cancer treatment. The time-consuming and costly nature of wet-lab discriminatory methods has spurred the development of various machine learning and deep learning-based ACP classification methods. Nonetheless, current methods encountered challenges in efficiently integrating features from various peptide modalities, thereby limiting a more comprehensive understanding of ACPs and further restricting the improvement of prediction model performance. In this study, we introduce a novel ACP prediction method, MA-PEP, which leverages multiple attention mechanisms for feature enhancement and fusion to improve ACP prediction. By integrating the enhanced molecular-level chemical features and sequence information of peptides, MA-PEP demonstrates superior prediction performance across several benchmark datasets, highlighting its efficacy in ACP prediction. Moreover, the visual analysis and case studies further demonstrate MA-PEP's reliable feature extraction capability and its promise in the realm of ACP exploration. The code and datasets for MA-PEP are available at https://github.com/liangxiaodata/MA-PEP.
Collapse
Affiliation(s)
- Xiao Liang
- School of Computer Science and Engineering, Central South University, Changsha, China
- Hunan Provincial Key Lab on Bioinformatics, Central South University, Changsha, China
| | - Haochen Zhao
- School of Computer Science and Engineering, Central South University, Changsha, China
- Hunan Provincial Key Lab on Bioinformatics, Central South University, Changsha, China
| | - Jianxin Wang
- School of Computer Science and Engineering, Central South University, Changsha, China
- Hunan Provincial Key Lab on Bioinformatics, Central South University, Changsha, China
| |
Collapse
|
7
|
Lee B, Shin D. Contrastive learning for enhancing feature extraction in anticancer peptides. Brief Bioinform 2024; 25:bbae220. [PMID: 38725157 PMCID: PMC11082072 DOI: 10.1093/bib/bbae220] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Reference Citation Analysis] [Abstract] [Key Words] [MESH Headings] [Grants] [Track Full Text] [Figures] [Journal Information] [Subscribe] [Scholar Register] [Received: 01/08/2024] [Revised: 03/28/2024] [Accepted: 04/21/2024] [Indexed: 05/13/2024] Open
Abstract
Cancer, recognized as a primary cause of death worldwide, has profound health implications and incurs a substantial social burden. Numerous efforts have been made to develop cancer treatments, among which anticancer peptides (ACPs) are garnering recognition for their potential applications. While ACP screening is time-consuming and costly, in silico prediction tools provide a way to overcome these challenges. Herein, we present a deep learning model designed to screen ACPs using peptide sequences only. A contrastive learning technique was applied to enhance model performance, yielding better results than a model trained solely on binary classification loss. Furthermore, two independent encoders were employed as a replacement for data augmentation, a technique commonly used in contrastive learning. Our model achieved superior performance on five of six benchmark datasets against previous state-of-the-art models. As prediction tools advance, the potential in peptide-based cancer therapeutics increases, promising a brighter future for oncology research and patient care.
Collapse
Affiliation(s)
- Byungjo Lee
- Research Institute, National Cancer Center, 323, Ilsan-ro, Ilsandong-gu, Goyang, 10408, Republic of Korea
| | - Dongkwan Shin
- Research Institute, National Cancer Center, 323, Ilsan-ro, Ilsandong-gu, Goyang, 10408, Republic of Korea
- Department of Cancer Biomedical Science, National Cancer Center Graduate School of Cancer Science and Policy, 323, Ilsan-ro, Ilsandong-gu, Goyang, 10408, Republic of Korea
| |
Collapse
|
8
|
Zhong G, Deng L. ACPScanner: Prediction of Anticancer Peptides by Integrated Machine Learning Methodologies. J Chem Inf Model 2024; 64:1092-1104. [PMID: 38277774 DOI: 10.1021/acs.jcim.3c01860] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Reference Citation Analysis] [Abstract] [MESH Headings] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Indexed: 01/28/2024]
Abstract
Novel therapeutic alternatives for cancer treatment are increasingly attracting global research attention. Although chemotherapy remains a primary clinical solution, it often results in significant side effects for patients. In recent years, anticancer peptides (ACPs) have emerged as promising candidates for highly specific anticancer drugs, and a number of computational approaches have been developed to identify ACPs. However, existing methods do not recognize specific types of anticancer function. In this article, we propose ACPScanner, an integrated approach to predict ACPs and non-ACPs at first and then predict several specific activity types for potential ACPs. We incorporate sequential, physicochemical properties, secondary structural information, and deep representation learning embeddings which are generated from artificial intelligence methods to build feature space. Customized deep learning and statistical learning methods are combined to form an integral architecture for the comprehensive two-level prediction task. To the best of our knowledge, ACPScanner is the first approach for specific ACP activity prediction. The comparative evaluation illustrates that ACPScanner achieves competitive prediction performance in both prediction phases in independent testings. We establish a web server at http://acpscanner.denglab.org to provide convenient usage of ACPScanner and make the predictive framework, source code, and data sets publicly available.
Collapse
Affiliation(s)
- Guolun Zhong
- School of Computer Science and Engineering, Central South University, Changsha 410000, China
| | - Lei Deng
- School of Computer Science and Engineering, Central South University, Changsha 410000, China
| |
Collapse
|
9
|
Karim T, Shaon MSH, Sultan MF, Hasan MZ, Kafy AA. ANNprob-ACPs: A novel anticancer peptide identifier based on probabilistic feature fusion approach. Comput Biol Med 2024; 169:107915. [PMID: 38171261 DOI: 10.1016/j.compbiomed.2023.107915] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Reference Citation Analysis] [Abstract] [Key Words] [MESH Headings] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Received: 08/20/2023] [Revised: 12/28/2023] [Accepted: 12/29/2023] [Indexed: 01/05/2024]
Abstract
Anticancer Peptides (ACPs) offer significant potential as cancer treatment drugs in this modern era. Quickly identifying active compounds from protein sequences is crucial for healthcare and cancer treatment. In this paper ANNprob-ACPs, a novel and effective model for detecting ACPs has been implemented based on nine feature encoding techniques, including AAC, CC, W2V, DPC, PAAC, QSO, CTDC, CTDT, and CKSAAGP. After analyzing the performance of several machine learning models, the six best models were selected based on their overall performances in every evaluation metric. The probability scores of each model were subsequently aggregated and used as input of our meta- model, called ANNprob-ACPs. Our model outperformed all others and its potential to lead to phenomenal identification of ACPs. The results of this study showed notable improvement in 10-fold cross-validation and independent test, with accuracy of 93.72% and 90.62%, respectively. Our proposed model, ANNprob-ACPs outperformed existing approaches in terms of accuracy and effectiveness in discovering ACPs. By using SHAP, this study obtained the physicochemical properties of QSO, and compositional properties of DPC, AAC, and PAAC are more impactful for our model's performances, which have a major impact on a drug's interactions and future discoveries. Consequently, this model is crucial for the future and has a high probability of detecting ACPs more frequently. We developed a web server of ANNprob-ACPs, which is accessible at ANNprob-ACPs webserver.
Collapse
Affiliation(s)
- Tasmin Karim
- Department of Computer Science & Engineering, Daffodil International University, Daffodil Smart City, Birulia, Dhaka, 1216, Bangladesh; Health Informatics Research Lab, Department of Computer Science and Engineering, Daffodil International University, Daffodil Smart City, Birulia, Dhaka, 1216, Bangladesh.
| | - Md Shazzad Hossain Shaon
- Department of Computer Science & Engineering, Daffodil International University, Daffodil Smart City, Birulia, Dhaka, 1216, Bangladesh; Health Informatics Research Lab, Department of Computer Science and Engineering, Daffodil International University, Daffodil Smart City, Birulia, Dhaka, 1216, Bangladesh.
| | - Md Fahim Sultan
- Department of Computer Science & Engineering, Daffodil International University, Daffodil Smart City, Birulia, Dhaka, 1216, Bangladesh; Health Informatics Research Lab, Department of Computer Science and Engineering, Daffodil International University, Daffodil Smart City, Birulia, Dhaka, 1216, Bangladesh.
| | - Md Zahid Hasan
- Department of Computer Science & Engineering, Daffodil International University, Daffodil Smart City, Birulia, Dhaka, 1216, Bangladesh; Health Informatics Research Lab, Department of Computer Science and Engineering, Daffodil International University, Daffodil Smart City, Birulia, Dhaka, 1216, Bangladesh.
| | - Abdulla-Al Kafy
- Department of Urban & Regional Planning, Rajshahi University of Engineering & Technology (RUET), Rajshahi, 6204, Bangladesh.
| |
Collapse
|
10
|
Wongklaew P, Sriswasdi S, Chuangsuwanich E. MHCSeqNet2-improved peptide-class I MHC binding prediction for alleles with low data. Bioinformatics 2024; 40:btad780. [PMID: 38152987 PMCID: PMC10783953 DOI: 10.1093/bioinformatics/btad780] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Reference Citation Analysis] [Abstract] [MESH Headings] [Grants] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Received: 05/25/2023] [Revised: 12/14/2023] [Accepted: 12/27/2023] [Indexed: 12/29/2023] Open
Abstract
MOTIVATION The binding of a peptide antigen to a Class I major histocompatibility complex (MHC) protein is part of a key process that lets the immune system recognize an infected cell or a cancer cell. This mechanism enabled the development of peptide-based vaccines that can activate the patient's immune response to treat cancers. Hence, the ability of accurately predict peptide-MHC binding is an essential component for prioritizing the best peptides for each patient. However, peptide-MHC binding experimental data for many MHC alleles are still lacking, which limited the accuracy of existing prediction models. RESULTS In this study, we presented an improved version of MHCSeqNet that utilized sub-word-level peptide features, a 3D structure embedding for MHC alleles, and an expanded training dataset to achieve better generalizability on MHC alleles with small amounts of data. Visualization of MHC allele embeddings confirms that the model was able to group alleles with similar binding specificity, including those with no peptide ligand in the training dataset. Furthermore, an external evaluation suggests that MHCSeqNet2 can improve the prioritization of T cell epitopes for MHC alleles with small amount of training data. AVAILABILITY AND IMPLEMENTATION The source code and installation instruction for MHCSeqNet2 are available at https://github.com/cmb-chula/MHCSeqNet2.
Collapse
Affiliation(s)
- Patiphan Wongklaew
- Department of Computer Engineering, Faculty of Engineering, Chulalongkorn University, Bangkok 10330, Thailand
| | - Sira Sriswasdi
- Center of Excellence in Computational Molecular Biology, Division of Research Affairs, Faculty of Medicine, Chulalongkorn University, Bangkok 10330, Thailand
- Center for Artificial Intelligence in Medicine, Division of Research Affairs, Faculty of Medicine, Chulalongkorn University, Bangkok 10330, Thailand
| | - Ekapol Chuangsuwanich
- Department of Computer Engineering, Faculty of Engineering, Chulalongkorn University, Bangkok 10330, Thailand
- Center of Excellence in Computational Molecular Biology, Division of Research Affairs, Faculty of Medicine, Chulalongkorn University, Bangkok 10330, Thailand
| |
Collapse
|
11
|
Lv H, Yan K, Liu B. TPpred-LE: therapeutic peptide function prediction based on label embedding. BMC Biol 2023; 21:238. [PMID: 37904157 PMCID: PMC10617231 DOI: 10.1186/s12915-023-01740-w] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Reference Citation Analysis] [Abstract] [Key Words] [MESH Headings] [Grants] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Received: 02/17/2023] [Accepted: 10/17/2023] [Indexed: 11/01/2023] Open
Abstract
BACKGROUND Therapeutic peptides play an essential role in human physiology, treatment paradigms and bio-pharmacy. Several computational methods have been developed to identify the functions of therapeutic peptides based on binary classification and multi-label classification. However, these methods fail to explicitly exploit the relationship information among different functions, preventing the further improvement of the prediction performance. Besides, with the development of peptide detection technology, peptide functions will be more comprehensively discovered. Therefore, it is necessary to explore computational methods for detecting therapeutic peptide functions with limited labeled data. RESULTS In this study, a novel method called TPpred-LE based on Transformer framework was proposed for predicting therapeutic peptide multiple functions, which can explicitly extract the function correlation information by using label embedding methodology and exploit the specificity information based on function-specific classifiers. Besides, we incorporated the multi-label classifier retraining approach (MCRT) into TPpred-LE to detect the new therapeutic functions with limited labeled data. Experimental results demonstrate that TPpred-LE outperforms the other state-of-the-art methods, and TPpred-LE with MCRT is robust for the limited labeled data. CONCLUSIONS In summary, TPpred-LE is a function-specific classifier for accurate therapeutic peptide function prediction, demonstrating the importance of the relationship information for therapeutic peptide function prediction. MCRT is a simple but effective strategy to detect functions with limited labeled data.
Collapse
Affiliation(s)
- Hongwu Lv
- School of Computer Science and Technology, Beijing Institute of Technology, Beijing, 100081, China
| | - Ke Yan
- School of Computer Science and Technology, Beijing Institute of Technology, Beijing, 100081, China
| | - Bin Liu
- School of Computer Science and Technology, Beijing Institute of Technology, Beijing, 100081, China.
- Advanced Research Institute of Multidisciplinary Science, Beijing Institute of Technology, No. 5, South Zhongguancun Street, Haidian District, Beijing, 100081, China.
| |
Collapse
|
12
|
Jiang Y, Wang R, Feng J, Jin J, Liang S, Li Z, Yu Y, Ma A, Su R, Zou Q, Ma Q, Wei L. Explainable Deep Hypergraph Learning Modeling the Peptide Secondary Structure Prediction. ADVANCED SCIENCE (WEINHEIM, BADEN-WURTTEMBERG, GERMANY) 2023; 10:e2206151. [PMID: 36794291 PMCID: PMC10104664 DOI: 10.1002/advs.202206151] [Citation(s) in RCA: 13] [Impact Index Per Article: 13.0] [Reference Citation Analysis] [Abstract] [Key Words] [MESH Headings] [Grants] [Track Full Text] [Subscribe] [Scholar Register] [Received: 10/24/2022] [Revised: 01/20/2023] [Indexed: 06/18/2023]
Abstract
Accurately predicting peptide secondary structures remains a challenging task due to the lack of discriminative information in short peptides. In this study, PHAT is proposed, a deep hypergraph learning framework for the prediction of peptide secondary structures and the exploration of downstream tasks. The framework includes a novel interpretable deep hypergraph multi-head attention network that uses residue-based reasoning for structure prediction. The algorithm can incorporate sequential semantic information from large-scale biological corpus and structural semantic information from multi-scale structural segmentation, leading to better accuracy and interpretability even with extremely short peptides. The interpretable models are able to highlight the reasoning of structural feature representations and the classification of secondary substructures. The importance of secondary structures in peptide tertiary structure reconstruction and downstream functional analysis is further demonstrated, highlighting the versatility of our models. To facilitate the use of the model, an online server is established which is accessible via http://inner.wei-group.net/PHAT/. The work is expected to assist in the design of functional peptides and contribute to the advancement of structural biology research.
Collapse
Affiliation(s)
- Yi Jiang
- School of SoftwareShandong UniversityJinanShandong250101China
- Joint SDU‐NTU Centre for Artificial Intelligence Research (C‐FAIR)Shandong UniversityJinanShandong250101China
| | - Ruheng Wang
- School of SoftwareShandong UniversityJinanShandong250101China
- Joint SDU‐NTU Centre for Artificial Intelligence Research (C‐FAIR)Shandong UniversityJinanShandong250101China
| | - Jiuxin Feng
- School of SoftwareShandong UniversityJinanShandong250101China
- Joint SDU‐NTU Centre for Artificial Intelligence Research (C‐FAIR)Shandong UniversityJinanShandong250101China
| | - Junru Jin
- School of SoftwareShandong UniversityJinanShandong250101China
- Joint SDU‐NTU Centre for Artificial Intelligence Research (C‐FAIR)Shandong UniversityJinanShandong250101China
| | - Sirui Liang
- School of SoftwareShandong UniversityJinanShandong250101China
- Joint SDU‐NTU Centre for Artificial Intelligence Research (C‐FAIR)Shandong UniversityJinanShandong250101China
| | - Zhongshen Li
- School of SoftwareShandong UniversityJinanShandong250101China
- Joint SDU‐NTU Centre for Artificial Intelligence Research (C‐FAIR)Shandong UniversityJinanShandong250101China
| | - Yingying Yu
- School of SoftwareShandong UniversityJinanShandong250101China
- Joint SDU‐NTU Centre for Artificial Intelligence Research (C‐FAIR)Shandong UniversityJinanShandong250101China
| | - Anjun Ma
- Department of Biomedical InformaticsCollege of MedicineThe Ohio State UniversityColumbusOH43210USA
| | - Ran Su
- College of Intelligence and ComputingTianjin UniversityTianjin300350China
| | - Quan Zou
- Institute of Fundamental and Frontier SciencesUniversity of Electronic Science and Technology of ChinaChengduSichuan610054China
| | - Qin Ma
- Department of Biomedical InformaticsCollege of MedicineThe Ohio State UniversityColumbusOH43210USA
| | - Leyi Wei
- School of SoftwareShandong UniversityJinanShandong250101China
- Joint SDU‐NTU Centre for Artificial Intelligence Research (C‐FAIR)Shandong UniversityJinanShandong250101China
| |
Collapse
|
13
|
Deng H, Ding M, Wang Y, Li W, Liu G, Tang Y. ACP-MLC: A two-level prediction engine for identification of anticancer peptides and multi-label classification of their functional types. Comput Biol Med 2023; 158:106844. [PMID: 37058760 DOI: 10.1016/j.compbiomed.2023.106844] [Citation(s) in RCA: 1] [Impact Index Per Article: 1.0] [Reference Citation Analysis] [Abstract] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Received: 01/07/2023] [Revised: 03/09/2023] [Accepted: 03/30/2023] [Indexed: 04/07/2023]
Abstract
Anticancer peptides (ACPs), a series of short bioactive peptides, are promising candidates in fighting against cancer due to their high activity, low toxicity, and not likely cause drug resistance. The accurate identification of ACPs and classification of their functional types is of great importance for investigating their mechanisms of action and developing peptide-based anticancer therapies. Here, we provided a computational tool, called ACP-MLC, to address binary classification and multi-label classification of ACPs for a given peptide sequence. Briefly, ACP-MLC is a two-level prediction engine, in which the 1st-level model predicts whether a query sequence is an ACP or not by random forest algorithm, and the 2nd-level model predicts which tissue types the sequence might target by the binary relevance algorithm. Development and evaluation by high-quality datasets, our ACP-MLC yielded an area under the receiver operating characteristic curve (AUC) of 0.888 on the independent test set for the 1st-level prediction, and obtained 0.157 hamming loss, 0.577 subset accuracy, 0.802 F1-scoremacro, and 0.826 F1-scoremicro on the independent test set for the 2nd-level prediction. A systematic comparison demonstrated that ACP-MLC outperformed existing binary classifiers and other multi-label learning classifiers for ACP prediction. Finally, we interpreted the important features of ACP-MLC by the SHAP method. User-friendly software and the datasets are available at https://github.com/Nicole-DH/ACP-MLC. We believe that the ACP-MLC would be a powerful tool in ACP discovery.
Collapse
|
14
|
Qin D, Jiao L, Wang R, Zhao Y, Hao Y, Liang G. Prediction of antioxidant peptides using a quantitative structure-activity relationship predictor (AnOxPP) based on bidirectional long short-term memory neural network and interpretable amino acid descriptors. Comput Biol Med 2023; 154:106591. [PMID: 36701965 DOI: 10.1016/j.compbiomed.2023.106591] [Citation(s) in RCA: 6] [Impact Index Per Article: 6.0] [Reference Citation Analysis] [Abstract] [Key Words] [MESH Headings] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Received: 10/04/2022] [Revised: 01/15/2023] [Accepted: 01/22/2023] [Indexed: 01/25/2023]
Abstract
Antioxidant peptides can protect against free radical-mediated diseases, especially food-derived antioxidant peptides are considered as potential competitors among synthetic antioxidants due to their safety, high activity and abundant sources. However, wet experimental methods can not meet the need for effectively screening and clearly elucidating the structure-activity relationship of antioxidant peptides. Therefore, it is particularly important to build a reliable prediction platform for antioxidant peptides. In this work, we developed a platform, AnOxPP, for prediction of antioxidant peptides using the bidirectional long short-term memory (BiLSTM) neural network. The sequence characteristics of peptides were converted into feature codes based on amino acid descriptors (AADs). Our results showed that the feature conversion ability of the combined-AADs optimized by the forward feature selection method was more accurate than that of the single-AADs. Especially, the model trained by the optimal descriptor SDPZ27 significantly outperformed the existing predictor on two independent test sets (Accuracy = 0.967 and 0.819, respectively). The SDPZ27-based AnOxPP learned four key structure-activity features of antioxidant peptides, with the following importance as steric properties > hydrophobic properties > electronic properties > hydrogen bond contributions. AnOxPP is a valuable tool for screening and design of peptide drugs, and the web-server is accessible at http://www.cqudfbp.net/AnOxPP/index.jsp.
Collapse
Affiliation(s)
- Dongya Qin
- Key Laboratory of Biorheological Science and Technology, Ministry of Education, Bioengineering College, Chongqing University, Chongqing, 400030, China
| | - Linna Jiao
- Key Laboratory of Biorheological Science and Technology, Ministry of Education, Bioengineering College, Chongqing University, Chongqing, 400030, China
| | - Ruihong Wang
- Key Laboratory of Biorheological Science and Technology, Ministry of Education, Bioengineering College, Chongqing University, Chongqing, 400030, China
| | - Yi Zhao
- Key Laboratory of Biorheological Science and Technology, Ministry of Education, Bioengineering College, Chongqing University, Chongqing, 400030, China
| | - Youjin Hao
- College of Life Sciences, Chongqing Normal University, Chongqing, 401331, China
| | - Guizhao Liang
- Key Laboratory of Biorheological Science and Technology, Ministry of Education, Bioengineering College, Chongqing University, Chongqing, 400030, China.
| |
Collapse
|
15
|
Ghaly G, Tallima H, Dabbish E, Badr ElDin N, Abd El-Rahman MK, Ibrahim MAA, Shoeib T. Anti-Cancer Peptides: Status and Future Prospects. Molecules 2023; 28:molecules28031148. [PMID: 36770815 PMCID: PMC9920184 DOI: 10.3390/molecules28031148] [Citation(s) in RCA: 12] [Impact Index Per Article: 12.0] [Reference Citation Analysis] [Abstract] [Key Words] [MESH Headings] [Track Full Text] [Figures] [Journal Information] [Subscribe] [Scholar Register] [Received: 11/27/2022] [Revised: 12/26/2022] [Accepted: 01/19/2023] [Indexed: 01/26/2023] Open
Abstract
The dramatic rise in cancer incidence, alongside treatment deficiencies, has elevated cancer to the second-leading cause of death globally. The increasing morbidity and mortality of this disease can be traced back to a number of causes, including treatment-related side effects, drug resistance, inadequate curative treatment and tumor relapse. Recently, anti-cancer bioactive peptides (ACPs) have emerged as a potential therapeutic choice within the pharmaceutical arsenal due to their high penetration, specificity and fewer side effects. In this contribution, we present a general overview of the literature concerning the conformational structures, modes of action and membrane interaction mechanisms of ACPs, as well as provide recent examples of their successful employment as targeting ligands in cancer treatment. The use of ACPs as a diagnostic tool is summarized, and their advantages in these applications are highlighted. This review expounds on the main approaches for peptide synthesis along with their reconstruction and modification needed to enhance their therapeutic effect. Computational approaches that could predict therapeutic efficacy and suggest ACP candidates for experimental studies are discussed. Future research prospects in this rapidly expanding area are also offered.
Collapse
Affiliation(s)
- Gehane Ghaly
- Department of Chemistry, The American University in Cairo, New Cairo 11835, Egypt
| | - Hatem Tallima
- Department of Chemistry, The American University in Cairo, New Cairo 11835, Egypt
| | - Eslam Dabbish
- Department of Chemistry, The American University in Cairo, New Cairo 11835, Egypt
| | - Norhan Badr ElDin
- Analytical Chemistry Department, Faculty of Pharmacy, Cairo University, Kasr-El Aini Street, Cairo 11562, Egypt
| | - Mohamed K. Abd El-Rahman
- Analytical Chemistry Department, Faculty of Pharmacy, Cairo University, Kasr-El Aini Street, Cairo 11562, Egypt
- Department of Chemistry and Chemical Biology, Harvard University, 12 Oxford Street, Cambridge, MA 02138, USA
| | - Mahmoud A. A. Ibrahim
- Computational Chemistry Laboratory, Chemistry Department, Faculty of Science, Minia University, Minia 61519, Egypt
- School of Health Sciences, University of Kwa-Zulu-Natal, Westville, Durban 4000, South Africa
| | - Tamer Shoeib
- Department of Chemistry, The American University in Cairo, New Cairo 11835, Egypt
- Correspondence:
| |
Collapse
|
16
|
ACPred-BMF: bidirectional LSTM with multiple feature representations for explainable anticancer peptide prediction. Sci Rep 2022; 12:21915. [PMID: 36535969 PMCID: PMC9763336 DOI: 10.1038/s41598-022-24404-1] [Citation(s) in RCA: 6] [Impact Index Per Article: 3.0] [Reference Citation Analysis] [Abstract] [Key Words] [Track Full Text] [Figures] [Journal Information] [Subscribe] [Scholar Register] [Received: 04/30/2022] [Accepted: 11/15/2022] [Indexed: 12/24/2022] Open
Abstract
Cancer has become a major factor threatening human life and health. Under the circumstance that traditional treatment methods such as chemotherapy and radiotherapy are not highly specific and often cause severe side effects and toxicity, new treatment methods are urgently needed. Anticancer peptide drugs have low toxicity, stronger efficacy and specificity, and have emerged as a new type of cancer treatment drugs. However, experimental identification of anticancer peptides is time-consuming and expensive, and difficult to perform in a high-throughput manner. Computational identification of anticancer peptides can make up for the shortcomings of experimental identification. In this study, a deep learning-based predictor named ACPred-BMF is proposed for the prediction of anticancer peptides. This method uses the quantitative and qualitative properties of amino acids, binary profile feature to numerical representation for the peptide sequences. The Bidirectional LSTM network architecture is used in the model, and the attention mechanism is also considered. To alleviate the black-box problem of deep learning model prediction, we visualized the automatically extracted features and used the Shapley additive explanations algorithm to determine the importance of features to further understand the anticancer peptide mechanism. The results show that our method is one of the state-of-the-art anticancer peptide predictors. A web server as the implementation of ACPred-BMF that can be accessed via: http://mialab.ruc.edu.cn/ACPredBMFServer/ .
Collapse
|
17
|
Liu J, Li M, Chen X. AntiMF: A deep learning framework for predicting anticancer peptides based on multi-view feature extraction. Methods 2022; 207:38-43. [PMID: 36100141 DOI: 10.1016/j.ymeth.2022.07.017] [Citation(s) in RCA: 2] [Impact Index Per Article: 1.0] [Reference Citation Analysis] [Abstract] [Key Words] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Received: 06/15/2022] [Revised: 07/20/2022] [Accepted: 07/26/2022] [Indexed: 01/10/2023] Open
Abstract
In recent years, anticancer peptides have emerged as a new viable option in cancer therapy, with the ability to overcome the considerable side effects and poor outcomes of standard cancer therapies. Accurate anticancer peptide identification can facilitate its finding and speed up its application in treating cancer. However, many recent approaches are based on machine learning, which not only restricts the representation ability of the models but also requires a complex hand-crafted feature extraction process. Here, we propose AntiMF, a deep learning model that utilizes multi-view mechanism based on different feature extraction models. Comparative results show that our model has a better performance than the state-of-the-art methods in the prediction of anticancer peptides. By using an ensemble learning framework to extract representation, AntiMF can capture the different dimensional information, which can make model representation more complete. Moreover, we visualize what AntiMF learns on one of its ensemble models to intuitively show the effectivity of our model, indicating that AntiMF has the great potential ability to be an effective and useful model to identify anticancer peptides accurately.
Collapse
Affiliation(s)
- Jingjing Liu
- Eye Hospital, The First Affiliated Hospital of Harbin Medical University, Harbin 150001, China
| | - Minghao Li
- Beidahuang Industry Group General Hospital, Harbin 150001, China
| | - Xin Chen
- Eye Hospital, The First Affiliated Hospital of Harbin Medical University, Harbin 150001, China; Department of Neurosurgical Laboratory, The First Affiliated Hospital of Harbin Medical University, Harbin 150001, China.
| |
Collapse
|
18
|
Akbar S, Hayat M, Tahir M, Khan S, Alarfaj FK. cACP-DeepGram: Classification of anticancer peptides via deep neural network and skip-gram-based word embedding model. Artif Intell Med 2022; 131:102349. [DOI: 10.1016/j.artmed.2022.102349] [Citation(s) in RCA: 2] [Impact Index Per Article: 1.0] [Reference Citation Analysis] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Received: 11/10/2021] [Revised: 05/24/2022] [Accepted: 07/04/2022] [Indexed: 12/28/2022]
|
19
|
Feng G, Yao H, Li C, Liu R, Huang R, Fan X, Ge R, Miao Q. ME-ACP: Multi-view neural networks with ensemble model for identification of anticancer peptides. Comput Biol Med 2022; 145:105459. [DOI: 10.1016/j.compbiomed.2022.105459] [Citation(s) in RCA: 1] [Impact Index Per Article: 0.5] [Reference Citation Analysis] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Received: 01/15/2022] [Revised: 03/22/2022] [Accepted: 03/24/2022] [Indexed: 12/26/2022]
|
20
|
Multi-channel CNN based anticancer peptides identification. Anal Biochem 2022; 650:114707. [PMID: 35568159 DOI: 10.1016/j.ab.2022.114707] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Reference Citation Analysis] [Abstract] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Received: 08/24/2021] [Revised: 01/27/2022] [Accepted: 04/27/2022] [Indexed: 11/20/2022]
Abstract
Cancer is one of the most dangerous diseases in the world that often leads to misery and death. Current treatments include different kinds of anticancer therapy which exhibit different types of side effects. Because of certain physicochemical properties, anticancer peptides (ACPs) have opened a new path of treatments for this deadly disease. That is why a well-performed methodology for identifying novel anticancer peptides has great importance in the fight against cancer. In addition to the laboratory techniques, various machine learning and deep learning methodologies have developed in recent years for this task. Although these models have shown reasonable predictive ability, there's still room for improvement in terms of performance and exploring new types of algorithms. In this work, we have proposed a novel multi-channel convolutional neural network (CNN) for identifying anticancer peptides from protein sequences. We have collected data from the existing state-of-the-art methodologies and applied binary encoding for data preprocessing. We have also employed k-fold cross-validation to train our models on benchmark datasets and compared our models' performance on the independent datasets. The comparison has indicated our models' superiority on various evaluation metrics. We think our work can be a valuable asset in finding novel anticancer peptides. We have provided a user-friendly web server for academic purposes and it is publicly available at: \texttt{http://103.99.176.239/iacp-cnn/}.
Collapse
|
21
|
Development of Anticancer Peptides Using Artificial Intelligence and Combinational Therapy for Cancer Therapeutics. Pharmaceutics 2022; 14:pharmaceutics14050997. [PMID: 35631583 PMCID: PMC9147327 DOI: 10.3390/pharmaceutics14050997] [Citation(s) in RCA: 14] [Impact Index Per Article: 7.0] [Reference Citation Analysis] [Abstract] [Track Full Text] [Download PDF] [Figures] [Journal Information] [Subscribe] [Scholar Register] [Received: 03/29/2022] [Revised: 04/28/2022] [Accepted: 05/04/2022] [Indexed: 01/27/2023] Open
Abstract
Cancer is a group of diseases causing abnormal cell growth, altering the genome, and invading or spreading to other parts of the body. Among therapeutic peptide drugs, anticancer peptides (ACPs) have been considered to target and kill cancer cells because cancer cells have unique characteristics such as a high negative charge and abundance of microvilli in the cell membrane when compared to a normal cell. ACPs have several advantages, such as high specificity, cost-effectiveness, low immunogenicity, minimal toxicity, and high tolerance under normal physiological conditions. However, the development and identification of ACPs are time-consuming and expensive in traditional wet-lab-based approaches. Thus, the application of artificial intelligence on the approaches can save time and reduce the cost to identify candidate ACPs. Recently, machine learning (ML), deep learning (DL), and hybrid learning (ML combined DL) have emerged into the development of ACPs without experimental analysis, owing to advances in computer power and big data from the power system. Additionally, we suggest that combination therapy with classical approaches and ACPs might be one of the impactful approaches to increase the efficiency of cancer therapy.
Collapse
|
22
|
Breast and Lung Anticancer Peptides Classification Using N-Grams and Ensemble Learning Techniques. BIG DATA AND COGNITIVE COMPUTING 2022. [DOI: 10.3390/bdcc6020040] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Reference Citation Analysis] [Abstract] [Track Full Text] [Subscribe] [Scholar Register] [Indexed: 02/01/2023]
Abstract
Anticancer peptides (ACPs) are short protein sequences; they perform functions like some hormones and enzymes inside the body. The role of any protein or peptide is related to its structure and the sequence of amino acids that make up it. There are 20 types of amino acids in humans, and each of them has a particular characteristic according to its chemical structure. Current machine and deep learning models have been used to classify ACPs problems. However, these models have neglected Amino Acid Repeats (AARs) that play an essential role in the function and structure of peptides. Therefore, in this paper, ACPs offer a promising route for novel anticancer peptides by extracting AARs based on N-Grams and k-mers using two peptides’ datasets. These datasets pointed to breast and lung cancer cells assembled and curated manually from the Cancer Peptide and Protein Database (CancerPPD). Every dataset consists of a sequence of peptides and their synthesis and anticancer activity on breast and lung cancer cell lines. Five different feature selection methods were used in this paper to improve classification performance and reduce the experimental costs. After that, ACPs were classified using four classifiers, namely AdaBoost, Random Forest Tree (RFT), Multi-class Support Vector Machine (SVM), and Multi-Layer Perceptron (MLP). These classifiers were evaluated by applying five well-known evaluation metrics. Experimental results showed that the breast and lung ACPs classification process provided an accurate performance that reached 89.25% and 92.56%, respectively. In terms of AUC, it reached 95.35% and 96.92% for both breast and lung ACPs, respectively. The proposed classifiers performed competently somewhat equally in AUC, accuracy, precision, F-measures, and recall, except for Multi-class SVM-based feature selection, which showed superior performance. As a result, this paper significantly improved the predictive performance that can effectively distinguish ACPs as virtual inactive, experimental inactive, moderately active, and very active.
Collapse
|
23
|
Lin C, Wang L, Shi L. AAPred-CNN: accurate predictor based on deep convolution neural network for identification of anti-angiogenic peptides. Methods 2022; 204:442-448. [PMID: 35031486 DOI: 10.1016/j.ymeth.2022.01.004] [Citation(s) in RCA: 3] [Impact Index Per Article: 1.5] [Reference Citation Analysis] [Abstract] [Key Words] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Received: 11/24/2021] [Revised: 12/28/2021] [Accepted: 01/09/2022] [Indexed: 12/13/2022] Open
Abstract
Recently, deep learning techniques have been developed for various bioactive peptide prediction tasks. However, there are only conventional machine learning-based methods for the prediction of anti-angiogenic peptides (AAP), which play an important role in cancer treatment. The main reason why no deep learning method has been involved in this field is that there are too few experimentally validated AAPs to support the training of deep models but researchers have believed that deep learning seriously depends on the amounts of labeled data. In this paper, as a tentative work, we try to predict AAP by constructing different classical deep learning models and propose the first deep convolution neural network-based predictor (AAPred-CNN) for AAP. Contrary to intuition, the experimental results show that deep learning models can achieve superior or comparable performance to the state-of-the-art model, although they are given a few labeled sequences to train. We also decipher the influence of hyper-parameters and training samples on the performance of deep learning models to help understand how the model work. Furthermore, we also visualize the learned embeddings by dimension reduction to increase the model interpretability and reveal the residue propensity of AAP through the statistics of convolutional features for different residues. In summary, this work demonstrates the powerful representation ability of AAPred-CNNfor AAP prediction, further improving the prediction accuracy of AAP.
Collapse
Affiliation(s)
- Changhang Lin
- School of Big Data and Artificial Intelligence, Fujian Polytechnic Normal University, Fuzhou, China
| | - Lei Wang
- Beidahuang Industry Group General Hospital, Harbin, China.
| | - Lei Shi
- Department of Spine Surgery, Changzheng Hospital, Naval Medical University, Shanghai, China.
| |
Collapse
|
24
|
Li F, Guo X, Xiang D, Pitt ME, Bainomugisa A, Coin LJ. Computational analysis and prediction of PE_PGRS proteins using machine learning. Comput Struct Biotechnol J 2022; 20:662-674. [PMID: 35140886 PMCID: PMC8804200 DOI: 10.1016/j.csbj.2022.01.019] [Citation(s) in RCA: 4] [Impact Index Per Article: 2.0] [Reference Citation Analysis] [Abstract] [Track Full Text] [Download PDF] [Figures] [Journal Information] [Subscribe] [Scholar Register] [Received: 10/21/2021] [Revised: 01/09/2022] [Accepted: 01/18/2022] [Indexed: 12/18/2022] Open
Abstract
PEPPER is the first machine learning-based predictor for PE_PGRS proteins. PEPPER is based on lightGBM and various sequence and physicochemical features. PEPPER can identify PE_PGRS proteins rapidly and accurately. The webserver of PEPPER and stand-alone tool are publicly available at http://web.unimelb-bioinfortools.cloud.edu.au/PEPPER/.
Mycobacterium tuberculosis genome comprises approximately 10% of two families of poorly characterised genes due to their high GC content and highly repetitive nature. The largest sub-group, the proline-glutamic acid polymorphic guanine-cytosine-rich sequence (PE_PGRS) family, is thought to be involved in host response and disease pathogenicity. Due to their high genetic variability and complexity of analysis, they are typically disregarded for further research in genomic studies. There are currently limited online resources and homology computational tools that can identify and analyse PE_PGRS proteins. In addition, they are computational-intensive and time-consuming, and lack sensitivity. Therefore, computational methods that can rapidly and accurately identify PE_PGRS proteins are valuable to facilitate the functional elucidation of the PE_PGRS family proteins. In this study, we developed the first machine learning-based bioinformatics approach, termed PEPPER, to allow users to identify PE_PGRS proteins rapidly and accurately. PEPPER was built upon a comprehensive evaluation of 13 popular machine learning algorithms with various sequence and physicochemical features. Empirical studies demonstrated that PEPPER achieved significantly better performance than alignment-based approaches, BLASTP and PHMMER, in both prediction accuracy and speed. PEPPER is anticipated to facilitate community-wide efforts to conduct high-throughput identification and analysis of PE_PGRS proteins.
Collapse
|
25
|
He W, Jiang Y, Jin J, Li Z, Zhao J, Manavalan B, Su R, Gao X, Wei L. Accelerating bioactive peptide discovery via mutual information-based meta-learning. Brief Bioinform 2021; 23:6457168. [PMID: 34882225 DOI: 10.1093/bib/bbab499] [Citation(s) in RCA: 26] [Impact Index Per Article: 8.7] [Reference Citation Analysis] [Abstract] [Key Words] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Received: 08/19/2021] [Revised: 10/07/2021] [Accepted: 10/30/2021] [Indexed: 12/28/2022] Open
Abstract
Recently, machine learning methods have been developed to identify various peptide bio-activities. However, due to the lack of experimentally validated peptides, machine learning methods cannot provide a sufficiently trained model, easily resulting in poor generalizability. Furthermore, there is no generic computational framework to predict the bioactivities of different peptides. Thus, a natural question is whether we can use limited samples to build an effective predictive model for different kinds of peptides. To address this question, we propose Mutual Information Maximization Meta-Learning (MIMML), a novel meta-learning-based predictive model for bioactive peptide discovery. Using few samples from various functional peptides, MIMML can sufficiently learn the discriminative information amongst various functions and characterize functional differences. Experimental results show excellent performance of MIMML though using far fewer training samples as compared to the state-of-the-art methods. We also decipher the latent relationships among different kinds of functions to understand what meta-model learned to improve a specific task. In summary, this study is a pioneering work in the field of functional peptide mining and provides the first-of-its-kind solution for few-sample learning problems in biological sequence analysis, accelerating the new functional peptide discovery. The source codes and datasets are available on https://github.com/TearsWaiting/MIMML.
Collapse
Affiliation(s)
- Wenjia He
- School of Software, Shandong University, Jinan, China.,Joint SDU-NTU Centre for Artificial Intelligence Research (C-FAIR), Shandong University, Jinan, China.,BioMap, Beijing, China
| | - Yi Jiang
- School of Software, Shandong University, Jinan, China.,Joint SDU-NTU Centre for Artificial Intelligence Research (C-FAIR), Shandong University, Jinan, China
| | - Junru Jin
- School of Software, Shandong University, Jinan, China.,Joint SDU-NTU Centre for Artificial Intelligence Research (C-FAIR), Shandong University, Jinan, China
| | - Zhongshen Li
- School of Software, Shandong University, Jinan, China.,Joint SDU-NTU Centre for Artificial Intelligence Research (C-FAIR), Shandong University, Jinan, China
| | - Jiaojiao Zhao
- School of Software, Shandong University, Jinan, China.,Joint SDU-NTU Centre for Artificial Intelligence Research (C-FAIR), Shandong University, Jinan, China
| | | | - Ran Su
- College of Intelligence and Computing, Tianjin University, Tianjin, China
| | - Xin Gao
- King Abdullah University of Science and Technology (KAUST), Computational Bioscience Research Center (CBRC), Computer, Electrical, and Mathematical Sciences and Engineering (CEMSE) Division, Thuwal, 23955-6900, Saudi Arabia
| | - Leyi Wei
- School of Software, Shandong University, Jinan, China.,Joint SDU-NTU Centre for Artificial Intelligence Research (C-FAIR), Shandong University, Jinan, China
| |
Collapse
|
26
|
Ahmed S, Muhammod R, Khan ZH, Adilina S, Sharma A, Shatabda S, Dehzangi A. ACP-MHCNN: an accurate multi-headed deep-convolutional neural network to predict anticancer peptides. Sci Rep 2021; 11:23676. [PMID: 34880291 PMCID: PMC8654959 DOI: 10.1038/s41598-021-02703-3] [Citation(s) in RCA: 29] [Impact Index Per Article: 9.7] [Reference Citation Analysis] [Abstract] [Key Words] [MESH Headings] [Track Full Text] [Download PDF] [Figures] [Journal Information] [Subscribe] [Scholar Register] [Received: 01/21/2021] [Accepted: 11/17/2021] [Indexed: 01/10/2023] Open
Abstract
Although advancing the therapeutic alternatives for treating deadly cancers has gained much attention globally, still the primary methods such as chemotherapy have significant downsides and low specificity. Most recently, Anticancer peptides (ACPs) have emerged as a potential alternative to therapeutic alternatives with much fewer negative side-effects. However, the identification of ACPs through wet-lab experiments is expensive and time-consuming. Hence, computational methods have emerged as viable alternatives. During the past few years, several computational ACP identification techniques using hand-engineered features have been proposed to solve this problem. In this study, we propose a new multi headed deep convolutional neural network model called ACP-MHCNN, for extracting and combining discriminative features from different information sources in an interactive way. Our model extracts sequence, physicochemical, and evolutionary based features for ACP identification using different numerical peptide representations while restraining parameter overhead. It is evident through rigorous experiments using cross-validation and independent-dataset that ACP-MHCNN outperforms other models for anticancer peptide identification by a substantial margin on our employed benchmarks. ACP-MHCNN outperforms state-of-the-art model by 6.3%, 8.6%, 3.7%, 4.0%, and 0.20 in terms of accuracy, sensitivity, specificity, precision, and MCC respectively. ACP-MHCNN and its relevant codes and datasets are publicly available at: https://github.com/mrzResearchArena/Anticancer-Peptides-CNN . ACP-MHCNN is also publicly available as an online predictor at: https://anticancer.pythonanywhere.com/ .
Collapse
Affiliation(s)
- Sajid Ahmed
- Department of Computer Science and Engineering, United International University, Dhaka, Bangladesh
| | - Rafsanjani Muhammod
- Department of Computer Science and Engineering, United International University, Dhaka, Bangladesh
| | - Zahid Hossain Khan
- Department of Computer Science and Engineering, United International University, Dhaka, Bangladesh
| | - Sheikh Adilina
- Department of Computer Science and Engineering, United International University, Dhaka, Bangladesh
| | - Alok Sharma
- Laboratory for Medical Science Mathematics, RIKEN Center for Integrative Medical Sciences, Yokohama, 230-0045, Japan
- Institute for Integrated and Intelligent Systems, Griffith University, Brisbane, QLD, 4111, Australia
| | - Swakkhar Shatabda
- Department of Computer Science and Engineering, United International University, Dhaka, Bangladesh.
| | - Abdollah Dehzangi
- Department of Computer Science, Rutgers University, Camden, NJ, 08102, USA.
- Center for Computational and Integrative Biology, Rutgers University, Camden, NJ, 08102, USA.
| |
Collapse
|