1
Zhu L, Chen Z, Yang S. EnDM-CPP: A Multi-view Explainable Framework Based on Deep Learning and Machine Learning for Identifying Cell-Penetrating Peptides with Transformers and Analyzing Sequence Information. Interdiscip Sci 2024:10.1007/s12539-024-00673-4. [PMID: 39714579 DOI: 10.1007/s12539-024-00673-4] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 06/04/2024] [Revised: 10/28/2024] [Accepted: 11/01/2024] [Indexed: 12/24/2024]
Abstract
Cell-penetrating peptides (CPPs) are crucial carriers for drug delivery. Because synthesizing new CPPs in the laboratory is both time- and resource-consuming, computational methods that predict potential CPPs can accelerate the development of CPP-based therapies. In this study, EnDM-CPP is proposed, which combines machine learning algorithms (SVM and CatBoost) with convolutional neural networks (CNN and TextCNN). For dataset construction, three previous CPP benchmark datasets, CPPsite 2.0, MLCPP 2.0, and CPP924, are merged to improve diversity and reduce homology. For feature generation, two language model-based features obtained from Transformer architectures, ProtT5 and ESM-2, are used in CNN and TextCNN. Additionally, sequence features such as CPRS, Hybrid PseAAC, and KSC, among others, are input to SVM and CatBoost. Based on the output of each predictor, a Logistic Regression (LR) model is built to make the final decision. The experimental results indicate that the fused ProtT5 and ESM-2 features contribute significantly to CPP prediction and that combining the employed features and models demonstrates better association. On an independent test dataset, EnDM-CPP achieved an accuracy of 0.9495 and a Matthews correlation coefficient of 0.9008, an improvement of 2.23%-9.48% and 4.32%-19.02%, respectively, over other state-of-the-art methods. Code and data are available at https://github.com/tudou1231/EnDM-CPP.git .
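The final step described in this abstract, a logistic regression fitted on the outputs of the base predictors, can be illustrated with a minimal sketch. It is not the authors' implementation: the toy probabilities, the helper names `fit_meta_lr` and `predict_meta`, and the plain gradient-descent training loop are all assumptions made for illustration.

```python
import math

def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))

def fit_meta_lr(base_probs, labels, lr=0.5, epochs=2000):
    """Fit a logistic-regression meta-learner by batch gradient descent.

    base_probs: one row per peptide, one column per base predictor
    (hypothetically SVM, CatBoost, CNN, TextCNN probabilities).
    """
    n_feat = len(base_probs[0])
    n = len(labels)
    w = [0.0] * n_feat
    b = 0.0
    for _ in range(epochs):
        grad_w = [0.0] * n_feat
        grad_b = 0.0
        for x, y in zip(base_probs, labels):
            # Gradient of the log-loss with respect to the logit is (p - y).
            err = sigmoid(sum(wi * xi for wi, xi in zip(w, x)) + b) - y
            for j in range(n_feat):
                grad_w[j] += err * x[j]
            grad_b += err
        w = [wi - lr * g / n for wi, g in zip(w, grad_w)]
        b -= lr * grad_b / n
    return w, b

def predict_meta(w, b, x):
    """Return the meta-learner's probability that the peptide is a CPP."""
    return sigmoid(sum(wi * xi for wi, xi in zip(w, x)) + b)

# Toy, made-up base-predictor outputs (not data from the paper).
toy_probs = [
    [0.90, 0.80, 0.95, 0.70],
    [0.80, 0.90, 0.60, 0.85],
    [0.20, 0.10, 0.30, 0.15],
    [0.10, 0.30, 0.20, 0.25],
    [0.70, 0.60, 0.90, 0.80],
    [0.30, 0.20, 0.10, 0.40],
]
toy_labels = [1, 1, 0, 0, 1, 0]
w, b = fit_meta_lr(toy_probs, toy_labels)
```

In practice the meta-learner would be trained on out-of-fold predictions of the base models so that it does not see outputs produced on their own training data; that step is omitted here for brevity.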
Affiliation(s)
- Lun Zhu
  - School of Computer Science and Artificial Intelligence, Aliyun School of Big Data, School of Software, Changzhou University, Changzhou, 213164, China
- Zehua Chen
  - School of Computer Science and Artificial Intelligence, Aliyun School of Big Data, School of Software, Changzhou University, Changzhou, 213164, China
- Sen Yang
  - School of Computer Science and Artificial Intelligence, Aliyun School of Big Data, School of Software, Changzhou University, Changzhou, 213164, China
  - The Affiliated Changzhou No. 2 People's Hospital of Nanjing Medical University, Changzhou, 213164, China
2
Wang X, Zhang Z, Liu C. iACP-DFSRA: Identification of Anticancer Peptides Based on a Dual-channel Fusion Strategy of ResCNN and Attention. J Mol Biol 2024; 436:168810. [PMID: 39362624 DOI: 10.1016/j.jmb.2024.168810] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 07/09/2024] [Revised: 09/10/2024] [Accepted: 09/27/2024] [Indexed: 10/05/2024]
Abstract
Anticancer peptides (ACPs) have been widely applied in the treatment of cancer owing to good safety, rational side effects, and high selectivity. However, the number of experimentally validated ACPs is limited because identifying ACPs is extremely expensive. Hence, accurate and cost-effective identification methods for ACPs are urgently needed. In this work, we proposed a deep learning-based model, named iACP-DFSRA, for ACP identification. Specifically, we adopted two kinds of sequence embedding technologies, the ProtBert_BFD pre-trained language model and handcrafted features, to encode protein sequences. Then, LightGBM was used for feature selection, and the selected features were input into ResCNN and an Attention mechanism, respectively, to extract local and global features. Finally, the concatenated features were deeply fused using the Attention mechanism so that the model pays more attention to key features, and predictions were made by a fully connected layer. The results of 10-fold cross-validation demonstrated that the iACP-DFSRA model delivered improved results in most metrics, with Sp of 94.15%, Sn of 95.32%, Acc of 94.74% and MCC of 89.48%, compared to the latest AACFlow model. Indeed, the iACP-DFSRA model is the only model with Acc > 90% and MCC > 80% on this independent test dataset. Furthermore, we demonstrated the superiority of our model on additional datasets. In addition, t-SNE and SHAP interpretation analysis demonstrated that it is crucial to use two channels for feature extraction and to use the Attention mechanism for deep fusion, which helps iACP-DFSRA predict ACPs more effectively.
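The four metrics quoted in this abstract (Sp, Sn, Acc, MCC) all derive from the binary confusion matrix. The sketch below shows the standard definitions; the function name `classification_metrics` and the confusion counts are hypothetical and are not taken from the paper.

```python
import math

def classification_metrics(tp, fn, tn, fp):
    """Compute sensitivity, specificity, accuracy and MCC from confusion counts.

    Assumes both classes are present (tp + fn > 0 and tn + fp > 0).
    """
    sn = tp / (tp + fn)                      # sensitivity (recall on positives)
    sp = tn / (tn + fp)                      # specificity (recall on negatives)
    acc = (tp + tn) / (tp + fn + tn + fp)    # overall accuracy
    denom = math.sqrt((tp + fp) * (tp + fn) * (tn + fp) * (tn + fn))
    # MCC is conventionally set to 0 when the denominator vanishes.
    mcc = (tp * tn - fp * fn) / denom if denom else 0.0
    return {"Sn": sn, "Sp": sp, "Acc": acc, "MCC": mcc}

# Hypothetical counts chosen only to illustrate the arithmetic.
metrics = classification_metrics(tp=90, fn=10, tn=80, fp=20)
```

Unlike accuracy, MCC uses all four cells of the confusion matrix, which is why abstracts in this list tend to report it alongside Acc.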
Affiliation(s)
- Xin Wang
  - School of Science, Dalian Maritime University, Dalian 116026, China
- Zimeng Zhang
  - School of Science, Dalian Maritime University, Dalian 116026, China
- Chang Liu
  - School of Science, Dalian Maritime University, Dalian 116026, China
3
Ullah F, Salam A, Nadeem M, Amin F, AlSalman H, Abrar M, Alfakih T. Extended dipeptide composition framework for accurate identification of anticancer peptides. Sci Rep 2024; 14:17381. [PMID: 39075193 PMCID: PMC11286958 DOI: 10.1038/s41598-024-68475-8] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 06/19/2024] [Accepted: 07/24/2024] [Indexed: 07/31/2024] Open
Abstract
The identification of anticancer peptides (ACPs) is crucial, especially in the development of peptide-based cancer therapy. The classical models such as Split Amino Acid Composition (SAAC) and Pseudo Amino Acid Composition (PseAAC) lack the incorporation of feature representation. These advancements improve the predictive accuracy and efficiency of ACP identification. The aim of this research is therefore to propose and develop an advanced framework based on feature extraction. To achieve this objective, we propose an Extended Dipeptide Composition (EDPC) framework. The proposed EDPC framework extends the dipeptide composition by considering the local sequence environment information and reforming the CD-HIT framework to remove noise and redundancy. To measure the accuracy, we performed several experiments using four well-known machine learning (ML) algorithms, namely Support Vector Machine (SVM), Decision Tree (DT), Random Forest (RF), and K Nearest Neighbor (KNN). For comparisons, we used accuracy, specificity, sensitivity, precision, recall, and F1-score as evaluation criteria. The reliability of the proposed framework is further evaluated using statistical significance tests. As a result, the proposed EDPC framework exhibited better performance than SAAC and PseAAC, where the SVM model delivered the highest accuracy of 96.6% and significant enhancements in specificity, sensitivity, precision, and F1-score over multiple datasets. Owing to the incorporation of enhanced feature representation and of local and global sequence profiles, the proposed EDPC achieves higher classification performance. The proposed framework can handle noise as well as duplicate features. These are accompanied by a wide range of feature representations. Finally, our proposed framework can be used for clinical applications where ACP identification is essential. Future work will include extending to a larger variety of datasets, incorporating tertiary structural information, and using deep learning techniques to improve the proposed EDPC.
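For orientation, the classical dipeptide composition that EDPC is said to extend can be computed as the normalized frequency of each of the 400 ordered amino-acid pairs. The sketch below covers only that baseline descriptor; it does not implement the paper's local-environment extension or its CD-HIT step, and the function name `dipeptide_composition` is an illustrative choice.

```python
from itertools import product

AMINO_ACIDS = "ACDEFGHIKLMNPQRSTVWY"
# All 400 ordered pairs, in a fixed order so every peptide maps to the same layout.
DIPEPTIDES = ["".join(pair) for pair in product(AMINO_ACIDS, repeat=2)]

def dipeptide_composition(seq):
    """Return a 400-dimensional vector of dipeptide frequencies for a peptide.

    Pairs containing non-standard residues are skipped; an all-zero vector is
    returned when no valid pair exists (e.g. sequences shorter than 2).
    """
    seq = seq.upper()
    counts = dict.fromkeys(DIPEPTIDES, 0)
    total = 0
    for i in range(len(seq) - 1):
        pair = seq[i:i + 2]
        if pair in counts:
            counts[pair] += 1
            total += 1
    if total == 0:
        return [0.0] * len(DIPEPTIDES)
    return [counts[d] / total for d in DIPEPTIDES]

# Example on a made-up sequence: the overlapping pairs are AK, KA, AK, KA.
vec = dipeptide_composition("AKAKA")
```

The resulting fixed-length vector is what a classifier such as an SVM or random forest would consume, regardless of peptide length.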
Affiliation(s)
- Faizan Ullah
  - Department of Computer Science, Bacha Khan University, Charsadda, 24420, Pakistan
- Abdu Salam
  - Department of Computer Science, Abdul Wali Khan University, Mardan, 23200, Pakistan
- Muhammad Nadeem
  - Department of Computer Science and Software Engineering, International Islamic University, Islamabad, 44000, Pakistan
- Farhan Amin
  - School of Computer Science and Engineering, Yeungnam University, Gyeongsan, 38541, Korea
- Hussain AlSalman
  - Department of Computer Science, College of Computer and Information Sciences, King Saud University, 11543, Riyadh, Saudi Arabia
- Mohammad Abrar
  - Faculty of Computer Studies, Arab Open University, Muscat, Oman
- Taha Alfakih
  - Department of Information Systems, College of Computer and Information Sciences, King Saud University, 11543, Riyadh, Saudi Arabia
4
Arif M, Musleh S, Fida H, Alam T. PLMACPred prediction of anticancer peptides based on protein language model and wavelet denoising transformation. Sci Rep 2024; 14:16992. [PMID: 39043738 PMCID: PMC11266708 DOI: 10.1038/s41598-024-67433-8] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 03/31/2024] [Accepted: 07/11/2024] [Indexed: 07/25/2024] Open
Abstract
Anticancer peptides (ACPs) play a promising role in the discovery of anti-cancer drugs. Research on ACPs as therapeutic agents is growing owing to their minimal side effects. However, identifying novel ACPs using wet-lab experiments is generally time-consuming, labor-intensive, and expensive. Leveraging computational methods for fast and accurate prediction of ACPs would benefit the drug discovery process. Herein, a machine learning-based predictor, called PLMACPred, is developed for identifying ACPs from the peptide sequence only. PLMACPred adopted a set of encoding schemes representing evolutionary-property, composition-property, and protein language model (PLM), i.e., evolutionary scale modeling (ESM-2)- and ProtT5-based embedding, to encode peptides. Then, two-dimensional (2D) wavelet denoising (WD) was employed to remove noise from the extracted features. Finally, an ensemble-based cascade deep forest (CDF) model was developed to identify ACPs. The PLMACPred model attained superior performance on all three benchmark datasets, namely, ACPmain, ACPAlter, and ACP740, over tenfold cross-validation and an independent dataset. PLMACPred outperformed the existing models and improved the prediction accuracy by 18.53%, 2.4%, and 7.59% on the ACPmain, ACPalter, and ACP740 datasets, respectively. We showed that embeddings from ProtT5 and ESM-2 were capable of capturing better contextual information from the entire sequence than the other encoding schemes for ACP prediction. For the explainability of the proposed model, the SHAP (SHapley Additive exPlanations) method was used to analyze the feature effect on ACP prediction. A list of novel sequence motifs was proposed from the ACP sequences using the MEME suite. We believe PLMACPred will help accelerate the discovery of novel ACPs as well as other activities of microbial peptides.
Affiliation(s)
- Muhammad Arif
  - College of Science and Engineering, Hamad Bin Khalifa University, Doha, Qatar
- Saleh Musleh
  - College of Science and Engineering, Hamad Bin Khalifa University, Doha, Qatar
- Huma Fida
  - Department of Microbiology, Abdul Wali Khan University, Mardan, KPK, Pakistan
- Tanvir Alam
  - College of Science and Engineering, Hamad Bin Khalifa University, Doha, Qatar
5
Bhattarai S, Tayara H, Chong KT. Advancing Peptide-Based Cancer Therapy with AI: In-Depth Analysis of State-of-the-Art AI Models. J Chem Inf Model 2024; 64:4941-4957. [PMID: 38874445 DOI: 10.1021/acs.jcim.4c00295] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 06/15/2024]
Abstract
Anticancer peptides (ACPs) play a vital role in selectively targeting and eliminating cancer cells. Evaluating and comparing predictions from various machine learning (ML) and deep learning (DL) techniques is challenging but crucial for anticancer drug research. We conducted a comprehensive analysis of 15 ML and 10 DL models, including the models released after 2022, and found that support vector machines (SVMs) with feature combination and selection significantly enhance overall performance. DL models, especially convolutional neural networks (CNNs) with light gradient boosting machine (LGBM) based feature selection approaches, demonstrate improved characterization. Assessment using a new test data set (ACP10) identifies ACPred, MLACP 2.0, AI4ACP, mACPred, and AntiCP2.0_AAC as successive optimal predictors, showcasing robust performance. Our review underscores current prediction tool limitations and advocates for an omnidirectional ACP prediction framework to propel ongoing research.
Affiliation(s)
- Sadik Bhattarai
  - Department of Electronics and Information Engineering, Jeonbuk National University, Jeonju-si, 54896 Jeollabuk-do, South Korea
- Hilal Tayara
  - School of International Engineering and Science, Jeonbuk National University, Jeonju-si, 54896 Jeollabuk-do, South Korea
- Kil To Chong
  - Department of Electronics and Information Engineering, Jeonbuk National University, Jeonju-si, 54896 Jeollabuk-do, South Korea
  - Advanced Electronics and Information Research Center, Jeonbuk National University, Jeonju-si, 54896 Jeollabuk-do, South Korea
6
Kao HJ, Weng TH, Chen CH, Chen YC, Chi YH, Huang KY, Weng SL. Integrating In Silico and In Vitro Approaches to Identify Natural Peptides with Selective Cytotoxicity against Cancer Cells. Int J Mol Sci 2024; 25:6848. [PMID: 38999958 PMCID: PMC11240926 DOI: 10.3390/ijms25136848] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 05/15/2024] [Revised: 06/14/2024] [Accepted: 06/18/2024] [Indexed: 07/14/2024] Open
Abstract
Anticancer peptides (ACPs) are bioactive compounds known for their selective cytotoxicity against tumor cells via various mechanisms. Recent studies have demonstrated that in silico machine learning methods are effective in predicting peptides with anticancer activity. In this study, we collected and analyzed over a thousand experimentally verified ACPs, specifically targeting peptides derived from natural sources. We developed a precise prediction model based on their sequence and structural features, and the model's evaluation results suggest its strong predictive ability for anticancer activity. To enhance reliability, we integrated the results of this model with those from other available methods. In total, we identified 176 potential ACPs, some of which were synthesized and further evaluated using the MTT colorimetric assay. All of these putative ACPs exhibited significant anticancer effects and selective cytotoxicity against specific tumor cells. In summary, we present a strategy for identifying and characterizing natural peptides with selective cytotoxicity against cancer cells, which could serve as novel therapeutic agents. Our prediction model can effectively screen new molecules for potential anticancer activity, and the results from in vitro experiments provide compelling evidence of the candidates' anticancer effects and selective cytotoxicity.
Affiliation(s)
- Hui-Ju Kao
  - Department of Medical Research, Hsinchu MacKay Memorial Hospital, Hsinchu City 300, Taiwan
  - Department of Medical Research, Hsinchu Municipal MacKay Children's Hospital, Hsinchu City 300, Taiwan
- Tzu-Han Weng
  - Department of Dermatology, MacKay Memorial Hospital, Taipei City 104, Taiwan
- Chia-Hung Chen
  - Department of Medical Research, Hsinchu MacKay Memorial Hospital, Hsinchu City 300, Taiwan
  - Department of Medical Research, Hsinchu Municipal MacKay Children's Hospital, Hsinchu City 300, Taiwan
- Yu-Chi Chen
  - Department of Medical Research, Hsinchu MacKay Memorial Hospital, Hsinchu City 300, Taiwan
  - Department of Medical Research, Hsinchu Municipal MacKay Children's Hospital, Hsinchu City 300, Taiwan
- Yu-Hsiang Chi
  - National Center for High-Performance Computing, Hsinchu City 300, Taiwan
- Kai-Yao Huang
  - Department of Medical Research, Hsinchu MacKay Memorial Hospital, Hsinchu City 300, Taiwan
  - Department of Medical Research, Hsinchu Municipal MacKay Children's Hospital, Hsinchu City 300, Taiwan
  - Department of Medicine, MacKay Medical College, New Taipei City 252, Taiwan
  - Institute of Biomedical Sciences, MacKay Medical College, New Taipei City 252, Taiwan
- Shun-Long Weng
  - Department of Medicine, MacKay Medical College, New Taipei City 252, Taiwan
  - Department of Obstetrics and Gynecology, Hsinchu MacKay Memorial Hospital, Hsinchu City 300, Taiwan
  - Department of Obstetrics and Gynecology, Hsinchu Municipal MacKay Children's Hospital, Hsinchu City 300, Taiwan
7
Arif R, Kanwal S, Ahmed S, Kabir M. A Computational Predictor for Accurate Identification of Tumor Homing Peptides by Integrating Sequential and Deep BiLSTM Features. Interdiscip Sci 2024; 16:503-518. [PMID: 38733473 DOI: 10.1007/s12539-024-00628-9] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 10/11/2023] [Revised: 03/16/2024] [Accepted: 03/27/2024] [Indexed: 05/13/2024]
Abstract
Cancer remains a severe illness, and current research indicates that tumor homing peptides (THPs) play an important part in cancer therapy. The identification of THPs can provide crucial insights for the drug-discovery and pharmaceutical industries, as they allow for tailored medication delivery towards cancer cells. These peptides have a high affinity for particular receptors present on tumor surfaces, allowing for the creation of precision medications that reduce off-target consequences and enhance cancer patient treatment results. Wet-lab techniques are considered essential tools for studying THPs; however, they are labor-intensive and time-consuming, making the prediction of THPs a challenging task for researchers. Computational techniques, on the other hand, are considered significant tools for identifying THPs from sequence data. Although many strategies have been presented to predict new THPs, there is still a need to develop a robust method with higher rates of success. In this paper, we developed a novel framework, THP-DF, for accurately identifying THPs on a large scale. Firstly, the peptide sequences are encoded through various sequential features. Secondly, each feature is passed to BiLSTM and attention layers to extract simplified deep features. Finally, an ensemble framework is formed by integrating sequential and deep features, which are fed to a support vector machine, with 10-fold cross-validation used to validate its efficiency. The experimental results showed that THP-DF worked better on both [Formula: see text] and [Formula: see text] datasets, achieving an accuracy of > 95%, which is higher than that of existing predictors on both datasets. This indicates that the proposed predictor could be a beneficial tool to precisely and rapidly identify THPs and will contribute to cutting-edge cancer treatment strategies and pharmaceuticals.
Affiliation(s)
- Roha Arif
  - School of Systems and Technology, University of Management and Technology, Lahore, 54782, Pakistan
- Sameera Kanwal
  - School of Systems and Technology, University of Management and Technology, Lahore, 54782, Pakistan
- Saeed Ahmed
  - School of Systems and Technology, University of Management and Technology, Lahore, 54782, Pakistan
- Muhammad Kabir
  - School of Systems and Technology, University of Management and Technology, Lahore, 54782, Pakistan
8
Lee B, Shin D. Contrastive learning for enhancing feature extraction in anticancer peptides. Brief Bioinform 2024; 25:bbae220. [PMID: 38725157 PMCID: PMC11082072 DOI: 10.1093/bib/bbae220] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 01/08/2024] [Revised: 03/28/2024] [Accepted: 04/21/2024] [Indexed: 05/13/2024] Open
Abstract
Cancer, recognized as a primary cause of death worldwide, has profound health implications and incurs a substantial social burden. Numerous efforts have been made to develop cancer treatments, among which anticancer peptides (ACPs) are garnering recognition for their potential applications. While ACP screening is time-consuming and costly, in silico prediction tools provide a way to overcome these challenges. Herein, we present a deep learning model designed to screen ACPs using peptide sequences only. A contrastive learning technique was applied to enhance model performance, yielding better results than a model trained solely on binary classification loss. Furthermore, two independent encoders were employed as a replacement for data augmentation, a technique commonly used in contrastive learning. Our model achieved superior performance on five of six benchmark datasets against previous state-of-the-art models. As prediction tools advance, the potential in peptide-based cancer therapeutics increases, promising a brighter future for oncology research and patient care.
Affiliation(s)
- Byungjo Lee
  - Research Institute, National Cancer Center, 323, Ilsan-ro, Ilsandong-gu, Goyang, 10408, Republic of Korea
- Dongkwan Shin
  - Research Institute, National Cancer Center, 323, Ilsan-ro, Ilsandong-gu, Goyang, 10408, Republic of Korea
  - Department of Cancer Biomedical Science, National Cancer Center Graduate School of Cancer Science and Policy, 323, Ilsan-ro, Ilsandong-gu, Goyang, 10408, Republic of Korea
9
Liao W, Yan S, Cao X, Xia H, Wang S, Sun G, Cai K. A Novel LSTM-Based Machine Learning Model for Predicting the Activity of Food Protein-Derived Antihypertensive Peptides. Molecules 2023; 28:4901. [PMID: 37446561 DOI: 10.3390/molecules28134901] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 05/10/2023] [Revised: 06/14/2023] [Accepted: 06/19/2023] [Indexed: 07/15/2023] Open
Abstract
Food protein-derived antihypertensive peptides are a representative type of bioactive peptides. Several models based on partial least squares regression have been constructed to delineate the relationship between the structure and activity of the peptides. Machine-learning-based models have been applied in broad areas, which also indicates their potential to be incorporated into the field of bioactive peptides. In this study, a long short-term memory (LSTM) algorithm-based deep learning model was constructed, which could predict the IC50 value of the peptide in inhibiting ACE activity. In addition to the test dataset, the model was also validated using randomly synthesized peptides. The LSTM-based model constructed in this study provides an efficient and simplified method for screening antihypertensive peptides from food proteins.
Affiliation(s)
- Wang Liao
  - Key Laboratory of Environmental Medicine and Engineering of Ministry of Education, School of Public Health, Southeast University, Nanjing 210009, China
  - Department of Nutrition and Food Hygiene, School of Public Health, Southeast University, Nanjing 210009, China
- Siyuan Yan
  - Key Laboratory of Environmental Medicine and Engineering of Ministry of Education, School of Public Health, Southeast University, Nanjing 210009, China
  - Department of Nutrition and Food Hygiene, School of Public Health, Southeast University, Nanjing 210009, China
- Xinyi Cao
  - Key Laboratory of Environmental Medicine and Engineering of Ministry of Education, School of Public Health, Southeast University, Nanjing 210009, China
  - Department of Nutrition and Food Hygiene, School of Public Health, Southeast University, Nanjing 210009, China
- Hui Xia
  - Key Laboratory of Environmental Medicine and Engineering of Ministry of Education, School of Public Health, Southeast University, Nanjing 210009, China
  - Department of Nutrition and Food Hygiene, School of Public Health, Southeast University, Nanjing 210009, China
- Shaokang Wang
  - Key Laboratory of Environmental Medicine and Engineering of Ministry of Education, School of Public Health, Southeast University, Nanjing 210009, China
  - Department of Nutrition and Food Hygiene, School of Public Health, Southeast University, Nanjing 210009, China
- Guiju Sun
  - Key Laboratory of Environmental Medicine and Engineering of Ministry of Education, School of Public Health, Southeast University, Nanjing 210009, China
  - Department of Nutrition and Food Hygiene, School of Public Health, Southeast University, Nanjing 210009, China
- Kaida Cai
  - Key Laboratory of Environmental Medicine and Engineering of Ministry of Education, School of Public Health, Southeast University, Nanjing 210009, China
  - Department of Epidemiology & Biostatistics, School of Public Health, Southeast University, Nanjing 210009, China
  - Department of Statistics and Actuarial Sciences, School of Mathematics, Southeast University, Nanjing 210009, China
10
Liang Y, Ma X. iACP-GE: accurate identification of anticancer peptides by using gradient boosting decision tree and extra tree. SAR QSAR Environ Res 2023; 34:1-19. [PMID: 36562289 DOI: 10.1080/1062936x.2022.2160011] [Citation(s) in RCA: 3] [Impact Index Per Article: 1.5] [Received: 10/24/2022] [Accepted: 12/12/2022] [Indexed: 06/17/2023]
Abstract
Cancer is one of the main diseases threatening human life, accounting for millions of deaths around the world each year. Traditional physical and chemical methods for cancer treatment are extremely time-consuming, labor-intensive, expensive, inefficient and difficult to apply in a high-throughput way. Hence, it is an urgent task to develop automated computational methods to enable fast and accurate identification of anticancer peptides (ACPs). In this paper, we develop a novel model named iACP-GE to identify ACPs. Multiple features are extracted by using binary encoding, enhanced grouped amino acid composition and BLOSUM62 encoding based on the N5C5 sequence, as well as detrended forward moving-average auto-cross correlation analysis based on physicochemical properties of the 20 natural amino acids. Thus, 835 features are obtained for each sample. To avoid information redundancy, a gradient boosting decision tree was adopted as the feature selection strategy. Then, the optimal feature subset is input to the extra tree classifier. The accuracies on the ACP740 and ACP240 datasets with 5-fold cross-validation were 90.54% and 91.25%, respectively. Experimental results indicate that iACP-GE significantly outperforms several existing models on the ACP740 and ACP240 datasets and can be used as an effective tool for the identification of ACPs. The datasets and source codes for iACP-GE are available at https://github.com/yunyunliang88/iACP-GE.
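As an illustration of the first encoding mentioned in this abstract, the sketch below builds a binary (one-hot) profile over an N5C5 fragment. It assumes that "N5C5" denotes the first five and last five residues of the peptide joined together, which is the usual reading in the ACP literature but is not spelled out in the abstract; the function names `n5c5` and `binary_encode` are hypothetical and this is not the authors' code.

```python
AMINO_ACIDS = "ACDEFGHIKLMNPQRSTVWY"

def n5c5(seq, k=5):
    """Return the first k and last k residues joined together.

    Assumption: peptides with k <= len(seq) < 2k simply yield overlapping
    termini; shorter peptides are rejected.
    """
    if len(seq) < k:
        raise ValueError("sequence shorter than k residues")
    return seq[:k] + seq[-k:]

def binary_encode(fragment):
    """One-hot encode each residue as a 20-dimensional 0/1 vector."""
    vec = []
    for residue in fragment.upper():
        one_hot = [0] * len(AMINO_ACIDS)
        idx = AMINO_ACIDS.find(residue)
        if idx >= 0:              # non-standard residues stay all-zero
            one_hot[idx] = 1
        vec.extend(one_hot)
    return vec

# Made-up 12-residue sequence: the fragment is "ACDEF" + "IKLMN".
fragment = n5c5("ACDEFGHIKLMN")
features = binary_encode(fragment)   # 10 residues x 20 = 200 values
```

Fixing the fragment length in this way gives every peptide the same number of binary features, which is what allows them to be concatenated with the other descriptors before feature selection.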
Affiliation(s)
- Y Liang
  - School of Science, Xi'an Polytechnic University, Xi'an, P. R. China
- X Ma
  - School of Science, Xi'an Polytechnic University, Xi'an, P. R. China
11
ACP-ADA: A Boosting Method with Data Augmentation for Improved Prediction of Anticancer Peptides. Int J Mol Sci 2022; 23:ijms232012194. [PMID: 36293050 PMCID: PMC9603247 DOI: 10.3390/ijms232012194] [Citation(s) in RCA: 5] [Impact Index Per Article: 1.7] [Received: 08/29/2022] [Revised: 10/08/2022] [Accepted: 10/11/2022] [Indexed: 11/30/2022] Open
Abstract
Cancer is the second-leading cause of death worldwide, and therapeutic peptides that target and destroy cancer cells have received a great deal of interest in recent years. Traditional wet experiments are expensive and inefficient for identifying novel anticancer peptides; therefore, the development of an effective computational approach is essential to recognize ACP candidates before experimental methods are used. In this study, we proposed an AdaBoost algorithm with random forest as the base learner, called ACP-ADA, which integrates binary profile feature, amino acid index, and amino acid composition into a 210-dimensional feature space vector to represent the peptides. Training samples in the feature space were augmented to increase the sample size and further improve the performance of the model in the case of insufficient samples. Furthermore, we used five-fold cross-validation to find model parameters, and the cross-validation results showed that ACP-ADA outperforms existing methods for this feature combination with data augmentation in terms of performance metrics. Specifically, ACP-ADA recorded an average accuracy of 86.4% and a Matthews correlation coefficient of 74.01% for dataset ACP740 and 90.83% and 81.65% for dataset ACP240; consequently, it can be a very useful tool in drug development and biomedical research.
12
Zakharova E, Orsi M, Capecchi A, Reymond J. Machine Learning Guided Discovery of Non-Hemolytic Membrane Disruptive Anticancer Peptides. ChemMedChem 2022; 17:e202200291. [PMID: 35880810 PMCID: PMC9541320 DOI: 10.1002/cmdc.202200291] [Citation(s) in RCA: 10] [Impact Index Per Article: 3.3] [Received: 05/25/2022] [Revised: 06/29/2022] [Indexed: 12/05/2022]
Abstract
Most antimicrobial peptides (AMPs) and anticancer peptides (ACPs) fold into membrane disruptive cationic amphiphilic α-helices, many of which are however also unpredictably hemolytic and toxic. Here we exploited the ability of recurrent neural networks (RNN) to distinguish active from inactive and non-hemolytic from hemolytic AMPs and ACPs to discover new non-hemolytic ACPs. Our discovery pipeline involved: 1) sequence generation using either a generative RNN or a genetic algorithm, 2) RNN classification for activity and hemolysis, 3) selection for sequence novelty, helicity and amphiphilicity, and 4) synthesis and testing. Experimental evaluation of thirty-three peptides resulted in eleven active ACPs, four of which were non-hemolytic, with properties resembling those of the natural ACP lasioglossin III. These experiments show the first example of direct machine learning guided discovery of non-hemolytic ACPs.
Affiliation(s)
- Elena Zakharova
  - Department of Chemistry, Biochemistry and Pharmaceutical Sciences, University of Bern, Freiestrasse 3, 3012 Bern, Switzerland
- Markus Orsi
  - Department of Chemistry, Biochemistry and Pharmaceutical Sciences, University of Bern, Freiestrasse 3, 3012 Bern, Switzerland
- Alice Capecchi
  - Department of Chemistry, Biochemistry and Pharmaceutical Sciences, University of Bern, Freiestrasse 3, 3012 Bern, Switzerland
- Jean-Louis Reymond
  - Department of Chemistry, Biochemistry and Pharmaceutical Sciences, University of Bern, Freiestrasse 3, 3012 Bern, Switzerland
13
To Assist Oncologists: An Efficient Machine Learning-Based Approach for Anti-Cancer Peptides Classification. Sensors 2022; 22:s22114005. [PMID: 35684624 PMCID: PMC9185351 DOI: 10.3390/s22114005] [Citation(s) in RCA: 11] [Impact Index Per Article: 3.7] [Received: 04/04/2022] [Revised: 05/19/2022] [Accepted: 05/20/2022] [Indexed: 12/10/2022]
Abstract
In the modern technological era, anti-cancer peptides (ACPs) have been considered a promising cancer treatment. It is critical to find new ACPs to ensure a better understanding of how they function and to support vaccine development. Timely and efficient identification of ACPs using computational techniques is therefore highly needed, given the enormous number of peptide sequences generated in the post-genomic era. Recently, numerous adaptive statistical algorithms have been developed for separating ACPs from non-ACPs (NACPs). Despite great advancements, existing approaches still have insufficient feature descriptors and learning methods, limiting predictive performance. To address this, a trustworthy framework is developed for the precise identification of ACPs. In particular, the presented approach applies four feature encoding mechanisms, namely amino acid, dipeptide, and tripeptide composition and an improved version of pseudo amino acid composition, to capture the motifs of the target class. Moreover, principal component analysis (PCA) is employed for feature pruning, selecting optimal, deep, and highly variable features. Due to the diverse nature of learning, experiments are performed over numerous algorithms to select the optimum operating method. After investigating the empirical outcomes, the support vector machine with the hybrid feature space shows better performance. The proposed framework achieved an accuracy of 97.09% and 98.25% over the benchmark and independent datasets, respectively. The comparative analysis demonstrates that the proposed model outperforms existing methods and is beneficial in drug development and oncology.
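The composition-based encodings named in this abstract can be illustrated with a minimal Python sketch. It covers only amino acid and dipeptide composition; the tripeptide and pseudo amino acid encodings, the PCA step, and the SVM are not shown. The function names and the example peptide are assumptions made for illustration and are not taken from the paper.

```python
from itertools import product

AMINO_ACIDS = "ACDEFGHIKLMNPQRSTVWY"  # the 20 standard residues


def aa_composition(seq):
    """Fraction of each standard amino acid in the peptide (20 values)."""
    seq = seq.upper()
    return [seq.count(a) / len(seq) for a in AMINO_ACIDS]


def dipeptide_composition(seq):
    """Fraction of each ordered adjacent residue pair (400 values).

    Requires a peptide of at least two residues.
    """
    seq = seq.upper()
    pairs = [seq[i:i + 2] for i in range(len(seq) - 1)]
    return [pairs.count(a + b) / len(pairs)
            for a, b in product(AMINO_ACIDS, repeat=2)]


def hybrid_features(seq):
    """Concatenate both encodings into one 420-dimensional vector."""
    return aa_composition(seq) + dipeptide_composition(seq)


# Hypothetical example peptide, used only to show the vector length.
features = hybrid_features("GLFDIVKKVV")
print(len(features))
```

A vector built this way would typically be reduced (for example with PCA) before being passed to a classifier, which is the role the abstract assigns to feature pruning.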
|
14
|
Chen X, Zhang Q, Li B, Lu C, Yang S, Long J, He B, Chen H, Huang J. BBPpredict: A Web Service for Identifying Blood-Brain Barrier Penetrating Peptides. Front Genet 2022; 13:845747. [PMID: 35656322 PMCID: PMC9152268 DOI: 10.3389/fgene.2022.845747] [Citation(s) in RCA: 7] [Impact Index Per Article: 2.3] [Received: 12/30/2021] [Accepted: 03/30/2022] [Indexed: 12/22/2022]
Abstract
Blood-brain barrier (BBB) is a major barrier to drug delivery into the brain in the treatment of central nervous system (CNS) diseases. Blood-brain barrier penetrating peptides (BBPs), a class of peptides that can cross BBB through various mechanisms without damaging BBB, are effective drug candidates for CNS diseases. However, identification of BBPs by experimental methods is time-consuming and laborious. To discover more BBPs as drugs for CNS disease, it is urgent to develop computational methods that can quickly and accurately identify BBPs and non-BBPs. In the present study, we created a training dataset that consists of 326 BBPs derived from previous databases and published manuscripts and 326 non-BBPs collected from UniProt, to construct a BBP predictor based on sequence information. We also constructed an independent testing dataset with 99 BBPs and 99 non-BBPs. Multiple machine learning methods were compared based on the training dataset via a nested cross-validation. The final BBP predictor was constructed based on the training dataset and the results showed that random forest (RF) method outperformed other classification algorithms on the training and independent testing dataset. Compared with previous BBP prediction tools, the RF-based predictor, named BBPpredict, performs considerably better than state-of-the-art BBP predictors. BBPpredict is expected to contribute to the discovery of novel BBPs, or at least can be a useful complement to the existing methods in this area. BBPpredict is freely available at http://i.uestc.edu.cn/BBPpredict/cgi-bin/BBPpredict.pl.
Affiliation(s)
- Xue Chen
- Medical College, Guizhou University, Guiyang, China
- Bowen Li
- Medical College, Guizhou University, Guiyang, China
- Chunying Lu
- Medical College, Guizhou University, Guiyang, China
- Jinjin Long
- Medical College, Guizhou University, Guiyang, China
- Bifang He
- Medical College, Guizhou University, Guiyang, China
- Heng Chen
- Medical College, Guizhou University, Guiyang, China
- Jian Huang
- School of Life Science and Technology, University of Electronic Science and Technology of China, Chengdu, China
|
15
|
Dee W. LMPred: predicting antimicrobial peptides using pre-trained language models and deep learning. BIOINFORMATICS ADVANCES 2022; 2:vbac021. [PMID: 36699381 PMCID: PMC9710646 DOI: 10.1093/bioadv/vbac021] [Citation(s) in RCA: 11] [Impact Index Per Article: 3.7] [Received: 11/29/2021] [Revised: 03/01/2022] [Accepted: 03/29/2022] [Indexed: 01/28/2023]
Abstract
Motivation: Antimicrobial peptides (AMPs) are increasingly being used in the development of new therapeutic drugs in areas such as cancer therapy and hypertension. Additionally, they are seen as an alternative to antibiotics due to the increasing occurrence of bacterial resistance. Wet-laboratory experimental identification, however, is both time-consuming and costly, so in silico models are now commonly used to screen new AMP candidates. Results: This paper proposes a novel approach for creating model inputs: pre-trained language models are used to produce contextualized embeddings representing the amino acids within each peptide sequence, and a convolutional neural network is then trained as the classifier. The results were validated on two datasets: one previously used in AMP prediction research, and a larger independent dataset created by this paper. Predictive accuracies of 93.33% and 88.26% were achieved, respectively, outperforming previous state-of-the-art classification models. Availability and implementation: All code is available at https://github.com/williamdee1/LMPred_AMP_Prediction. Supplementary information: Supplementary data are available at Bioinformatics Advances online.
Affiliation(s)
- William Dee
- Department of Bioinformatics, School of Biological and Behavioural Sciences, Queen Mary University of London, London E1 4NS, UK. To whom correspondence should be addressed.
|
16
|
Nguyen L, Nguyen Vo TH, Trinh QH, Nguyen BH, Nguyen-Hoang PU, Le L, Nguyen BP. iANP-EC: Identifying Anticancer Natural Products Using Ensemble Learning Incorporated with Evolutionary Computation. J Chem Inf Model 2022; 62:5080-5089. [PMID: 35157472 DOI: 10.1021/acs.jcim.1c00920] [Citation(s) in RCA: 13] [Impact Index Per Article: 4.3] [Indexed: 12/24/2022]
Abstract
Cancer is one of the deadliest diseases, killing millions of people worldwide every year. The search for better and more adaptive anticancer agents with fewer side effects has never ceased. Besides chemically synthesized anticancer compounds, natural products have been scientifically shown to be a highly promising alternative source for anticancer drug discovery. Along with the experimental approaches used to find anticancer drug candidates, computational approaches have been developed to virtually screen for potential anticancer compounds. In this study, we construct an ensemble computational framework, called iANP-EC, using machine learning approaches incorporated with evolutionary computation. Four learning algorithms (k-NN, SVM, RF, and XGB) and four molecular representation schemes are used to build a set of classifiers, among which the four best-performing classifiers are selected to form an ensemble classifier. Particle swarm optimization (PSO) is used to optimise the weights used to combine the four top classifiers. The models are developed on a curated set of 997 compounds collected from the NPACT and CancerHSP databases. The results show that iANP-EC is a stable, robust, and effective framework that achieves an AUC-ROC value of 0.9193 and an AUC-PR value of 0.8366. The comparative analysis of molecular substructures between natural anticarcinogens and nonanticarcinogens partially unveils several key substructures that drive anticancer activity. We also deploy the proposed ensemble model as an online web server with a user-friendly interface to support the research community in identifying natural products with anticancer activities.
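The weighted combination of the top classifiers described here can be sketched as a weighted soft vote. The sketch below assumes each classifier outputs a probability per compound and that the weights have already been found; the PSO search itself is not shown, and the weights and probabilities are hypothetical placeholders rather than values from the paper.

```python
def weighted_ensemble(prob_lists, weights):
    """Weighted average of per-classifier probabilities.

    prob_lists: one list of predicted probabilities per classifier,
                all of the same length (one value per compound).
    weights:    one non-negative weight per classifier.
    """
    if len(prob_lists) != len(weights):
        raise ValueError("need exactly one weight per classifier")
    total = sum(weights)
    n_samples = len(prob_lists[0])
    return [
        sum(w * probs[i] for w, probs in zip(weights, prob_lists)) / total
        for i in range(n_samples)
    ]


# Hypothetical outputs of four classifiers for three compounds.
probs = [
    [0.9, 0.2, 0.6],
    [0.8, 0.3, 0.4],
    [0.7, 0.1, 0.5],
    [0.6, 0.4, 0.7],
]
weights = [0.4, 0.3, 0.2, 0.1]  # placeholder weights, not optimised values
print(weighted_ensemble(probs, weights))
```

In a PSO setting, each particle would be one candidate weight vector and its fitness would be a validation metric (such as AUC-ROC) computed on the combined probabilities.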
Affiliation(s)
- Loc Nguyen
- Computational Biology Center, International University - VNU HCMC, Ho Chi Minh City 700000, Vietnam
- Thanh-Hoang Nguyen Vo
- School of Mathematics and Statistics, Victoria University of Wellington, Wellington 6140, New Zealand
- Quang H Trinh
- Computational Biology Center, International University - VNU HCMC, Ho Chi Minh City 700000, Vietnam; School of Information and Communication Technology, Hanoi University of Science and Technology, Hanoi 100000, Vietnam
- Bach Hoai Nguyen
- School of Engineering and Computer Science, Victoria University of Wellington, Wellington 6140, New Zealand
- Phuong-Uyen Nguyen-Hoang
- Computational Biology Center, International University - VNU HCMC, Ho Chi Minh City 700000, Vietnam
- Ly Le
- Computational Biology Center, International University - VNU HCMC, Ho Chi Minh City 700000, Vietnam; Vingroup Big Data Institute, Ha Noi 100000, Vietnam
- Binh P Nguyen
- School of Mathematics and Statistics, Victoria University of Wellington, Wellington 6140, New Zealand
|
17
|
Maraming P, Daduang J, Kah JCY. Conjugation with gold nanoparticles improves the stability of the KT2 peptide and maintains its anticancer properties. RSC Adv 2021; 12:319-325. [PMID: 35424498 PMCID: PMC8978663 DOI: 10.1039/d1ra05980g] [Citation(s) in RCA: 2] [Impact Index Per Article: 0.5] [Received: 08/07/2021] [Accepted: 12/01/2021] [Indexed: 12/18/2022]
Abstract
One of the major weaknesses of therapeutic peptides is their sensitivity to degradation by proteolytic enzymes in vivo. Gold nanoparticles (GNPs) are a good carrier for therapeutic peptides to improve their stability and cellular uptake in vitro and in vivo. We conjugated the anticancer KT2 peptide as an anticancer peptide model to PEGylated GNPs (GNPs-PEG) and investigated the peptide stability, cellular uptake and ability of the GNPs-KT2-PEG conjugates to induce MDA-MB-231 human breast cancer cell death. We found that 11 nm GNPs protected the conjugated KT2 peptide from trypsin proteolysis, keeping it stable up to 0.128% trypsin, which is higher than the serum trypsin concentration (range 0.0000285 ± 0.0000125%) reported by Lake-Bakaar, G. et al., 1979. GNPs significantly enhanced the cellular uptake of KT2 peptides after conjugation. Free KT2 peptides pretreated with trypsin were not able to kill MDA-MB-231 cells due to proteolysis, while GNPs-KT2-PEG was still able to exert effective cancer cell killing after trypsin treatment at levels comparable to GNPs-KT2-PEG without enzyme pretreatment. The outcome of this study highlights the utility of conjugated anticancer peptides on nanoparticles to improve peptide stability and retain anticancer ability.
Affiliation(s)
- Pornsuda Maraming
- Centre for Research and Development of Medical Diagnostic Laboratories, Faculty of Associated Medical Sciences, Khon Kaen University, Khon Kaen 40002, Thailand
- Jureerut Daduang
- Centre for Research and Development of Medical Diagnostic Laboratories, Faculty of Associated Medical Sciences, Khon Kaen University, Khon Kaen 40002, Thailand
- James Chen Yong Kah
- Department of Biomedical Engineering, National University of Singapore, 4 Engineering Drive 3, Blk E4, #04-08, Singapore 117583
|
18
|
Liang X, Li F, Chen J, Li J, Wu H, Li S, Song J, Liu Q. Large-scale comparative review and assessment of computational methods for anti-cancer peptide identification. Brief Bioinform 2021; 22:bbaa312. [PMID: 33316035 PMCID: PMC8294543 DOI: 10.1093/bib/bbaa312] [Citation(s) in RCA: 48] [Impact Index Per Article: 12.0] [Received: 08/18/2020] [Revised: 09/30/2020] [Accepted: 08/25/2020] [Indexed: 12/13/2022]
Abstract
Anti-cancer peptides (ACPs) are known as potential therapeutics for cancer. Due to their unique ability to target cancer cells without affecting healthy cells directly, they have been extensively studied. Many peptide-based drugs are currently evaluated in the preclinical and clinical trials. Accurate identification of ACPs has received considerable attention in recent years; as such, a number of machine learning-based methods for in silico identification of ACPs have been developed. These methods promote the research on the mechanism of ACPs therapeutics against cancer to some extent. There is a vast difference in these methods in terms of their training/testing datasets, machine learning algorithms, feature encoding schemes, feature selection methods and evaluation strategies used. Therefore, it is desirable to summarize the advantages and disadvantages of the existing methods, provide useful insights and suggestions for the development and improvement of novel computational tools to characterize and identify ACPs. With this in mind, we firstly comprehensively investigate 16 state-of-the-art predictors for ACPs in terms of their core algorithms, feature encoding schemes, performance evaluation metrics and webserver/software usability. Then, comprehensive performance assessment is conducted to evaluate the robustness and scalability of the existing predictors using a well-prepared benchmark dataset. We provide potential strategies for the model performance improvement. Moreover, we propose a novel ensemble learning framework, termed ACPredStackL, for the accurate identification of ACPs. ACPredStackL is developed based on the stacking ensemble strategy combined with SVM, Naïve Bayesian, lightGBM and KNN. Empirical benchmarking experiments against the state-of-the-art methods demonstrate that ACPredStackL achieves a comparative performance for predicting ACPs. 
The webserver and source code of ACPredStackL are freely available at http://bigdata.biocie.cn/ACPredStackL/ and https://github.com/liangxiaoq/ACPredStackL, respectively.
Affiliation(s)
- Xiao Liang
- College of Information Engineering, Northwest A&F University, Yangling, 712100, China
- Shaanxi Key Laboratory of Agricultural Information Perception and Intelligent Service, Yangling, Shaanxi 712100, China
- Fuyi Li
- Monash Biomedicine Discovery Institute and Department of Biochemistry and Molecular Biology, Monash University, Melbourne, VIC 3800, Australia
- Monash Centre for Data Science, Monash University, Melbourne, VIC 3800, Australia
- Department of Microbiology and Immunology, Peter Doherty Institute for Infection and Immunity, University of Melbourne, Melbourne, Victoria, Australia
- Jinxiang Chen
- College of Information Engineering, Northwest A&F University, Yangling, 712100, China
- Junlong Li
- College of Information Engineering, Northwest A&F University, Yangling, 712100, China
- Hao Wu
- College of Information Engineering, Northwest A&F University, Yangling, 712100, China
- Shuqin Li
- College of Information Engineering, Northwest A&F University, Yangling, 712100, China
- Shaanxi Key Laboratory of Agricultural Information Perception and Intelligent Service, Yangling, Shaanxi 712100, China
- Jiangning Song
- Monash Biomedicine Discovery Institute and Department of Biochemistry and Molecular Biology, Monash University, Melbourne, VIC 3800, Australia
- Monash Centre for Data Science, Monash University, Melbourne, VIC 3800, Australia
- ARC Centre of Excellence in Advanced Molecular Imaging, Monash University, Melbourne, VIC 3800, Australia
- Quanzhong Liu
- College of Information Engineering, Northwest A&F University, Yangling, 712100, China
- Shaanxi Key Laboratory of Agricultural Information Perception and Intelligent Service, Yangling, Shaanxi 712100, China
|
19
|
Chen XG, Zhang W, Yang X, Li C, Chen H. ACP-DA: Improving the Prediction of Anticancer Peptides Using Data Augmentation. Front Genet 2021; 12:698477. [PMID: 34276801 PMCID: PMC8279753 DOI: 10.3389/fgene.2021.698477] [Citation(s) in RCA: 23] [Impact Index Per Article: 5.8] [Received: 04/21/2021] [Accepted: 06/07/2021] [Indexed: 12/09/2022]
Abstract
Anticancer peptides (ACPs) have provided a promising perspective for cancer treatment, and the prediction of ACPs is very important for the discovery of new cancer treatment drugs. It is time consuming and expensive to use experimental methods to identify ACPs, so computational methods for ACP identification are urgently needed. There have been many effective computational methods, especially machine learning-based methods, proposed for such predictions. Most of the current machine learning methods try to find suitable features or design effective feature learning techniques to accurately represent ACPs. However, the performance of these methods can be further improved for cases with insufficient numbers of samples. In this article, we propose an ACP prediction model called ACP-DA (Data Augmentation), which uses data augmentation for insufficient samples to improve the prediction performance. In our method, to better exploit the information of peptide sequences, peptide sequences are represented by integrating binary profile features and AAindex features, and then the samples in the training set are augmented in the feature space. After data augmentation, the samples are used to train the machine learning model, which is used to predict ACPs. The performance of ACP-DA exceeds that of existing methods, and ACP-DA achieves better performance in the prediction of ACPs compared with a method without data augmentation. The proposed method is available at http://github.com/chenxgscuec/ACPDA.
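Augmentation in feature space, as opposed to generating new sequences, can be illustrated with a minimal sketch. The version below perturbs existing feature vectors with small uniform noise; this is one simple scheme chosen for illustration, and the paper's exact augmentation procedure may differ. The function name, the noise level, and the toy vectors are all assumptions.

```python
import random


def augment_features(samples, n_new, noise=0.05, seed=0):
    """Create n_new pseudo-samples by perturbing existing feature vectors.

    Each new vector is a randomly chosen training vector with independent
    uniform noise in [-noise, noise] added to every feature.
    """
    rng = random.Random(seed)  # fixed seed so the sketch is reproducible
    augmented = []
    for _ in range(n_new):
        base = rng.choice(samples)
        augmented.append([x + rng.uniform(-noise, noise) for x in base])
    return augmented


# Toy feature vectors standing in for encoded training peptides.
train = [[0.1, 0.0, 1.0], [0.3, 1.0, 0.0]]
extra = augment_features(train, n_new=4)
print(len(train) + len(extra))
```

The pseudo-samples inherit the label of the vector they were derived from and are added to the training set only, never to the test set.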
Affiliation(s)
- Xian-Gan Chen
- School of Biomedical Engineering, South-Central University for Nationalities, Wuhan, China; Hubei Key Laboratory of Medical Information Analysis and Tumor Diagnosis & Treatment, South-Central University for Nationalities, Wuhan, China; Key Laboratory of Cognitive Science (South-Central University for Nationalities), State Ethnic Affairs Commission, Wuhan, China
- Wen Zhang
- College of Informatics, Huazhong Agricultural University, Wuhan, China; Hubei Engineering Technology Research Center of Agricultural Big Data, Wuhan, China
- Xiaofei Yang
- School of Biomedical Engineering, South-Central University for Nationalities, Wuhan, China; Hubei Key Laboratory of Medical Information Analysis and Tumor Diagnosis & Treatment, South-Central University for Nationalities, Wuhan, China; Key Laboratory of Cognitive Science (South-Central University for Nationalities), State Ethnic Affairs Commission, Wuhan, China
- Chenhong Li
- School of Biomedical Engineering, South-Central University for Nationalities, Wuhan, China; Hubei Key Laboratory of Medical Information Analysis and Tumor Diagnosis & Treatment, South-Central University for Nationalities, Wuhan, China; Key Laboratory of Cognitive Science (South-Central University for Nationalities), State Ethnic Affairs Commission, Wuhan, China
- Hengling Chen
- School of Biomedical Engineering, South-Central University for Nationalities, Wuhan, China; Hubei Key Laboratory of Medical Information Analysis and Tumor Diagnosis & Treatment, South-Central University for Nationalities, Wuhan, China; Key Laboratory of Cognitive Science (South-Central University for Nationalities), State Ethnic Affairs Commission, Wuhan, China
|
20
|
Chai TT, Ee KY, Kumar DT, Manan FA, Wong FC. Plant Bioactive Peptides: Current Status and Prospects Towards Use on Human Health. Protein Pept Lett 2021; 28:623-642. [PMID: 33319654 DOI: 10.2174/0929866527999201211195936] [Citation(s) in RCA: 11] [Impact Index Per Article: 2.8] [Received: 09/24/2020] [Revised: 11/02/2020] [Accepted: 11/06/2020] [Indexed: 12/28/2022]
Abstract
Large numbers of bioactive peptides with potential applications in protecting against human diseases have been identified from plant sources. In this review, we summarize recent progress in the research of plant-derived bioactive peptides, encompassing their production, biological effects, and mechanisms. This review focuses on antioxidant, antimicrobial, antidiabetic, and anticancer peptides, giving special attention to evidence derived from cellular and animal models. Studies investigating peptides with known sequences and well-characterized peptidic fractions or protein hydrolysates are discussed. The use of molecular docking tools to elucidate inter-molecular interactions between bioactive peptides and target proteins is highlighted. In conclusion, the accumulating evidence from in silico, in vitro and in vivo studies to date supports the envisioned applications of plant peptides as natural antioxidants as well as health-promoting agents. Nevertheless, much work is still required before these applications can be realized. To this end, future research directions for addressing current gaps are proposed.
Affiliation(s)
- Tsun-Thai Chai
- Department of Chemical Science, Faculty of Science, Universiti Tunku Abdul Rahman, Kampar 31900, Malaysia
- Kah-Yaw Ee
- Center for Biodiversity Research, Universiti Tunku Abdul Rahman, Kampar 31900, Malaysia
- D Thirumal Kumar
- Department of Bioinformatics, Saveetha School of Engineering, Saveetha Institute of Medical and Technical Sciences, Chennai 602 105, India
- Fazilah Abd Manan
- Department of Biosciences, Faculty of Science, Universiti Teknologi Malaysia, Skudai 81310, Johor, Malaysia
- Fai-Chu Wong
- Department of Chemical Science, Faculty of Science, Universiti Tunku Abdul Rahman, Kampar 31900, Malaysia
|
21
|
Recent developments on production, purification and biological activity of marine peptides. Food Res Int 2021; 147:110468. [PMID: 34399466 DOI: 10.1016/j.foodres.2021.110468] [Citation(s) in RCA: 49] [Impact Index Per Article: 12.3] [Received: 02/23/2021] [Revised: 05/18/2021] [Accepted: 05/23/2021] [Indexed: 12/11/2022]
Abstract
Marine peptides are one of the richest sources of structurally diverse bioactive compounds and a considerable attention has been drawn towards their production and bioactivity. However, there is a paucity in consolidation of emerging trends encompassing both production techniques and biological application. Herein, we intend to review the recent advancements on different production, purification and identification technologies used for marine peptides along with presenting their potential health benefits. Bibliometric analysis revealed a growing number of scientific publications on marine peptides (268 documents per year) with both Asia (37.2%) and Europe (33.1%) being the major contributors. Extraction and purification by ultrafiltration and enzymatic hydrolysis, followed by identification by chromatographic techniques coupled with an appropriate detector could yield a high content of peptides with improved bioactivity. Moreover, the multifunctional health benefits exerted by marine peptides including anti-microbial, antioxidant, anti-hypertension, anti-diabetes and anti-cancer along with their structure-activity relationship were presented. The future perspective on marine peptide research should focus on finding improved separation and purification technologies with enhanced selectivity and resolution for obtaining more novel peptides with high yield and low cost. In addition, by employing encapsulation strategies such as nanoemulsion and nanoliposome, oral bioavailability and bioactivity of peptides can be greatly enhanced. Also, the potential health benefits that are demonstrated by in vitro and in vivo models should be validated by conducting human clinical trials for a technology transfer from bench to bedside.
|
22
|
Wan Y, Wang Z, Lee TY. Incorporating support vector machine with sequential minimal optimization to identify anticancer peptides. BMC Bioinformatics 2021; 22:286. [PMID: 34051755 PMCID: PMC8164238 DOI: 10.1186/s12859-021-03965-4] [Citation(s) in RCA: 8] [Impact Index Per Article: 2.0] [Received: 09/20/2020] [Accepted: 01/08/2021] [Indexed: 12/09/2022]
Abstract
BACKGROUND: Cancer is one of the major causes of death worldwide. To treat cancer, the use of anticancer peptides (ACPs) has attracted increased attention in recent years. ACPs are a unique group of small molecules that can target and kill cancer cells quickly and directly. However, identifying ACPs by wet-lab experiments is time-consuming and labor-intensive. Therefore, it is important to develop computational tools for ACP prediction. Though some ACP prediction tools have been developed recently, their performance is not good enough and most of them do not offer a function to distinguish ACPs from antimicrobial peptides (AMPs). Considering that a growing number of studies have shown that some AMPs exhibit anticancer function, this work tries to build a model for distinguishing AMPs from ACPs in addition to a model that predicts ACPs from whole peptides. RESULTS: This study chooses amino acid composition, N5C5, k-space, and position-specific scoring matrix (PSSM) as features, and analyzes them by machine learning methods, including support vector machine (SVM) and sequential minimal optimization (SMO), to build a model (model 2) for distinguishing ACPs from whole peptides. Another model (model 1) that distinguishes ACPs from AMPs is also developed. Compared with previous models, the models developed in this research show better performance (accuracy: 85.5% for model 1 and 95.2% for model 2). CONCLUSIONS: This work utilizes a new feature, PSSM, which contributes to better performance than other features. In addition to SVM, SMO is used in this research for optimizing SVM, and the SMO-optimized models show better performance than non-optimized models. Last but not least, this work provides two different functions: distinguishing ACPs from AMPs and distinguishing ACPs from all peptides. The second SMO-optimized model, which utilizes PSSM as a feature, performs better than all other existing tools.
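As a rough illustration of a spacing-based sequence feature, the sketch below counts residue pairs separated by a fixed number of intervening positions. It assumes that the "k-space" feature in this abstract refers to k-spaced amino acid pairs, which the abstract does not spell out; the function name and the toy sequence are hypothetical.

```python
from collections import Counter


def k_spaced_pairs(seq, k):
    """Count residue pairs separated by exactly k intervening residues.

    k = 0 gives ordinary adjacent dipeptides; k = 1 pairs positions
    i and i + 2, and so on.
    """
    return Counter(seq[i] + seq[i + k + 1] for i in range(len(seq) - k - 1))


# Toy sequence, used only to show the pairing pattern.
print(k_spaced_pairs("ACDA", 0))
print(k_spaced_pairs("ACDA", 1))
```

Dividing each count by the total number of pairs would turn these counts into frequencies suitable as classifier inputs.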
Affiliation(s)
- Yu Wan
- School of Life and Health Sciences, The Chinese University of Hong Kong, Shenzhen, Shenzhen, 518172, Guangdong, People's Republic of China
- Zhuo Wang
- Warshel Institute for Computational Biology, The Chinese University of Hong Kong, Shenzhen, Shenzhen, 518172, Guangdong, People's Republic of China
- Tzong-Yi Lee
- School of Life and Health Sciences, The Chinese University of Hong Kong, Shenzhen, Shenzhen, 518172, Guangdong, People's Republic of China
- Warshel Institute for Computational Biology, The Chinese University of Hong Kong, Shenzhen, Shenzhen, 518172, Guangdong, People's Republic of China
|
23
|
Khongdetch J, Laohakunjit N, Kaprasob R. King Boletus mushroom‐derived bioactive protein hydrolysate: characterisation, antioxidant, ACE inhibitory and cytotoxic activities. Int J Food Sci Technol 2021. [DOI: 10.1111/ijfs.15100] [Citation(s) in RCA: 3] [Impact Index Per Article: 0.8] [Indexed: 12/21/2022]
Affiliation(s)
- Jindaporn Khongdetch
- School of Bioresources and Technology, King Mongkut’s University of Technology Thonburi, 49 Teintalay 25 Road, Thakam, Bangkhuntein, Bangkok 10150, Thailand
- Rajamangala University of Technology Suvarnabhumi, Ayutthaya, Phra Nakhon Si Ayutthaya, Thailand
- Natta Laohakunjit
- School of Bioresources and Technology, King Mongkut’s University of Technology Thonburi, 49 Teintalay 25 Road, Thakam, Bangkhuntein, Bangkok 10150, Thailand
- Ratchadaporn Kaprasob
- School of Bioresources and Technology, King Mongkut’s University of Technology Thonburi, 49 Teintalay 25 Road, Thakam, Bangkhuntein, Bangkok 10150, Thailand
|
24
|
Kardani K, Bolhassani A. Antimicrobial/anticancer peptides: bioactive molecules and therapeutic agents. Immunotherapy 2021; 13:669-684. [PMID: 33878901 DOI: 10.2217/imt-2020-0312] [Citation(s) in RCA: 23] [Impact Index Per Article: 5.8] [Indexed: 01/10/2023]
Abstract
Antimicrobial peptides (AMPs) have been known as host-defense peptides. These cationic and amphipathic peptides are relatively short (∼5-50 L-amino acids) with molecular weight less than 10 kDa. AMPs have various roles including immunomodulatory, angiogenic and antitumor activities. Anticancer peptides (ACPs) are a main subset of AMPs as a novel therapeutic approach against tumor cells. The physicochemical properties of the ACPs influence their cell penetration, stability and efficiency of targeting. Up to now, several databases and web servers for in silico prediction of AMPs/ACPs have been established prior to the lab analysis. The present review focuses on the recent advancement about AMPs/ACPs activities including their in silico prediction by computational tools and their potential applications as therapeutic agents especially in cancer.
Affiliation(s)
- Kimia Kardani
- Department of Hepatitis & AIDS, Pasteur Institute of Iran, Tehran, Iran; Iranian Comprehensive Hemophilia Care Center, Tehran, Iran
- Azam Bolhassani
- Department of Hepatitis & AIDS, Pasteur Institute of Iran, Tehran, Iran
|
25
|
Dong GF, Zheng L, Huang SH, Gao J, Zuo YC. Amino Acid Reduction Can Help to Improve the Identification of Antimicrobial Peptides and Their Functional Activities. Front Genet 2021; 12:669328. [PMID: 33959153 PMCID: PMC8093877 DOI: 10.3389/fgene.2021.669328] [Citation(s) in RCA: 15] [Impact Index Per Article: 3.8] [Received: 02/18/2021] [Accepted: 03/23/2021] [Indexed: 02/03/2023]
Abstract
Antimicrobial peptides (AMPs) are considered potential substitutes for antibiotics in the field of new anti-infective drug design. Several machine learning algorithms and web servers exist for identifying AMPs and their functional activities. However, there is still room for improvement in prediction algorithms and feature extraction methods. The reduced amino acid (RAA) alphabet effectively addresses the problems of simplifying protein complexity and recognizing structurally conserved regions. This article details the evaluation of more than 5,000 reduced amino acid descriptors, generated from 74 types of reduced amino acid alphabets, in the first and second stages to construct a two-stage classifier, Identification of Antimicrobial Peptides by Reduced Amino Acid Cluster (iAMP-RAAC), for identifying AMPs and their functional activities, respectively. The results show that the first-stage AMP classifier achieves accuracies of 97.21% and 97.11% on the training dataset and the independent test dataset, respectively. In the second stage, our classifier still shows good performance. At least three of the four metrics, sensitivity (SN), specificity (SP), accuracy (ACC), and Matthews correlation coefficient (MCC), exceed the results reported in the literature. Further, ANOVA with incremental feature selection (IFS) is used for feature selection, which further improves the prediction performance of each stage. Finally, a user-friendly web server, iAMP-RAAC, is established at http://bioinfor.imu.edu.cn/iampraac.
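The idea of a reduced amino acid alphabet can be shown with a short sketch that maps the 20 residues onto a handful of cluster labels and then computes composition over the smaller alphabet. The six-group clustering below is a hypothetical grouping by broad physicochemical class, chosen only for illustration; it is not one of the 74 alphabets evaluated in the paper.

```python
# Hypothetical clustering for illustration; not taken from the paper.
CLUSTERS = {
    "a": "AVLIMC",  # aliphatic / hydrophobic
    "r": "FWYH",    # aromatic
    "p": "STNQ",    # polar uncharged
    "c": "KR",      # positively charged
    "n": "DE",      # negatively charged
    "s": "GP",      # special conformation
}

# Lookup table from residue to its cluster label.
REDUCE = {aa: label for label, members in CLUSTERS.items() for aa in members}


def reduce_sequence(seq):
    """Rewrite a peptide in the reduced alphabet."""
    return "".join(REDUCE[aa] for aa in seq.upper())


def reduced_composition(seq):
    """Fraction of each cluster label in the reduced sequence."""
    reduced = reduce_sequence(seq)
    return {label: reduced.count(label) / len(reduced) for label in CLUSTERS}


# Hypothetical example peptide.
print(reduce_sequence("GLFDIVKKVV"))
print(reduced_composition("GLFDIVKKVV"))
```

With six labels instead of 20 residues, a dipeptide-style descriptor shrinks from 400 to 36 dimensions, which is the kind of simplification the abstract attributes to reduced alphabets.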
Affiliation(s)
- Gai-Fang Dong
- Inner Mongolia Autonomous Region Key Laboratory of Big Data Research and Application of Agriculture and Animal Husbandry, College of Computer and Information Engineering, Inner Mongolia Agricultural University, Hohhot, China
| | - Lei Zheng
- The State Key Laboratory of Reproductive Regulation and Breeding of Grassland Livestock, College of Life Sciences, Inner Mongolia University, Hohhot, China
| | - Sheng-Hui Huang
- The State Key Laboratory of Reproductive Regulation and Breeding of Grassland Livestock, College of Life Sciences, Inner Mongolia University, Hohhot, China
| | - Jing Gao
- Inner Mongolia Autonomous Region Key Laboratory of Big Data Research and Application of Agriculture and Animal Husbandry, College of Computer and Information Engineering, Inner Mongolia Agricultural University, Hohhot, China
| | - Yong-Chun Zuo
- The State Key Laboratory of Reproductive Regulation and Breeding of Grassland Livestock, College of Life Sciences, Inner Mongolia University, Hohhot, China
|
26
|
Wang L, Niu D, Wang X, Khan J, Shen Q, Xue Y. A Novel Machine Learning Strategy for the Prediction of Antihypertensive Peptides Derived from Food with High Efficiency. Foods 2021; 10:foods10030550. [PMID: 33800877 PMCID: PMC7999667 DOI: 10.3390/foods10030550] [Citation(s) in RCA: 14] [Impact Index Per Article: 3.5] [Received: 12/10/2020] [Revised: 03/01/2021] [Accepted: 03/03/2021] [Indexed: 12/22/2022] Open
Abstract
Strategies to screen antihypertensive peptides rapidly and with high throughput will contribute to the treatment of hypertension. Food-derived antihypertensive peptides can reduce blood pressure without side effects. In the present study, a novel model based on the eXtreme Gradient Boosting (XGBoost) algorithm was developed and compared with the dominant machine learning models. To further assess the reliability of the method in a real situation, the optimized XGBoost model was used to predict the antihypertensive degree of the k-mer peptides cut from six key proteins in bovine milk, and peptide-protein docking was introduced to verify the findings. The results showed that the XGBoost model achieved outstanding performance, with an accuracy of 86.50% and an area under the receiver operating characteristic curve of 94.11%, both better than the other models. The XGBoost predictions of antihypertensive peptides derived from milk protein were consistent with the peptide-protein docking results and were obtained more efficiently. Our results indicate that the XGBoost algorithm is a feasible auxiliary tool for screening antihypertensive peptides derived from food with high throughput and high efficiency.
Affiliation(s)
- Liyang Wang
- College of Food Science and Nutritional Engineering, China Agricultural University, Beijing 100083, China; (L.W.); (X.W.); (J.K.); (Q.S.)
| | - Dantong Niu
- College of Information and Electrical Engineering, China Agricultural University, Beijing 100083, China;
| | - Xiaoya Wang
- College of Food Science and Nutritional Engineering, China Agricultural University, Beijing 100083, China; (L.W.); (X.W.); (J.K.); (Q.S.)
| | - Jabir Khan
- College of Food Science and Nutritional Engineering, China Agricultural University, Beijing 100083, China; (L.W.); (X.W.); (J.K.); (Q.S.)
| | - Qun Shen
- College of Food Science and Nutritional Engineering, China Agricultural University, Beijing 100083, China; (L.W.); (X.W.); (J.K.); (Q.S.)
| | - Yong Xue
- College of Food Science and Nutritional Engineering, China Agricultural University, Beijing 100083, China; (L.W.); (X.W.); (J.K.); (Q.S.)
- Correspondence:
|
27
|
Jing XY, Li FM. Predicting Cell Wall Lytic Enzymes Using Combined Features. Front Bioeng Biotechnol 2021; 8:627335. [PMID: 33585423 PMCID: PMC7874139 DOI: 10.3389/fbioe.2020.627335] [Citation(s) in RCA: 2] [Impact Index Per Article: 0.5] [Received: 11/09/2020] [Accepted: 12/04/2020] [Indexed: 11/13/2022] Open
Abstract
Due to the overuse of antibiotics, people are worried that existing antibiotics will become ineffective against pathogens with the rapid rise of antibiotic-resistant strains. The use of cell wall lytic enzymes to destroy bacteria has become a viable alternative to avoid the crisis of antimicrobial resistance. In this paper, an improved method for cell wall lytic enzymes prediction was proposed and the amino acid composition (AAC), the dipeptide composition (DC), the position-specific score matrix auto-covariance (PSSM-AC), and the auto-covariance average chemical shift (acACS) were selected to predict the cell wall lytic enzymes with support vector machine (SVM). In order to overcome the imbalanced data classification problems and remove redundant or irrelevant features, the synthetic minority over-sampling technique (SMOTE) was used to balance the dataset. The F-score was used to select features. The Sn, Sp, MCC, and Acc were 99.35%, 99.02%, 0.98, and 99.19% with jackknife test using the optimized combination feature AAC+DC+acACS+PSSM-AC. The Sn, Sp, MCC, and Acc of cell wall lytic enzymes in our predictive model were higher than those in existing methods. This improved method may be helpful for protein function prediction.
Affiliation(s)
- Xiao-Yang Jing
- College of Science, Inner Mongolia Agricultural University, Hohhot, China
| | - Feng-Min Li
- College of Science, Inner Mongolia Agricultural University, Hohhot, China
|
28
|
Charoenkwan P, Chiangjong W, Lee VS, Nantasenamat C, Hasan MM, Shoombuatong W. Improved prediction and characterization of anticancer activities of peptides using a novel flexible scoring card method. Sci Rep 2021; 11:3017. [PMID: 33542286 PMCID: PMC7862624 DOI: 10.1038/s41598-021-82513-9] [Citation(s) in RCA: 48] [Impact Index Per Article: 12.0] [Received: 10/12/2020] [Accepted: 01/18/2021] [Indexed: 01/30/2023] Open
Abstract
As anticancer peptides (ACPs) have attracted great interest for cancer treatment, several approaches based on machine learning have been proposed for ACP identification. Although existing methods have afforded high prediction accuracies, such models use a large number of descriptors together with complex ensemble approaches, which leads to low interpretability and thus poses a challenge for biologists and biochemists. Therefore, it is desirable to develop a simple, interpretable and efficient predictor for accurate ACP identification that also provides the means for the rational design of new anticancer peptides with promising potential for clinical application. Herein, we propose a novel flexible scoring card method (FSCM) making use of propensity scores of local and global sequential information for the development of a sequence-based ACP predictor (named iACP-FSCM) for improving the prediction accuracy and model interpretability. To the best of our knowledge, iACP-FSCM represents the first sequence-based ACP predictor to offer an in-depth understanding of the molecular basis for the enhancement of anticancer activities of peptides via the use of FSCM-derived propensity scores. The independent testing results showed that iACP-FSCM provided accuracies of 0.825 and 0.910 as evaluated on the main and alternative datasets, respectively. Results from comparative benchmarking demonstrated that iACP-FSCM could outperform seven other existing ACP predictors with marked improvements of 7% and 17% for accuracy and MCC, respectively, on the main dataset. Furthermore, iACP-FSCM (0.910) achieved results very comparable to those of the state-of-the-art ensemble model AntiCP2.0 (0.920) as evaluated on the alternative dataset. Comparative results demonstrated that iACP-FSCM was the most suitable choice for ACP identification and characterization considering its simplicity, interpretability and generalizability. It is highly anticipated that iACP-FSCM may be a robust tool for the rapid screening and identification of promising ACPs for clinical use.
Affiliation(s)
- Phasit Charoenkwan
- Modern Management and Information Technology, College of Arts, Media and Technology, Chiang Mai University, Chiang Mai, 50200, Thailand
| | - Wararat Chiangjong
- Pediatric Translational Research Unit, Department of Pediatrics, Faculty of Medicine, Ramathibodi Hospital, Mahidol University, Bangkok, 10400, Thailand
| | - Vannajan Sanghiran Lee
- Department of Chemistry, Centre of Theoretical and Computational Physics, Faculty of Science, University of Malaya, 50603, Kuala Lumpur, Malaysia
| | - Chanin Nantasenamat
- Center of Data Mining and Biomedical Informatics, Faculty of Medical Technology, Mahidol University, Bangkok, 10700, Thailand
| | - Md Mehedi Hasan
- Department of Bioscience and Bioinformatics, Kyushu Institute of Technology, 680-4 Kawazu, Iizuka, Fukuoka, 820-8502, Japan
| | - Watshara Shoombuatong
- Center of Data Mining and Biomedical Informatics, Faculty of Medical Technology, Mahidol University, Bangkok, 10700, Thailand.
|
29
|
Yu L, Jing R, Liu F, Luo J, Li Y. DeepACP: A Novel Computational Approach for Accurate Identification of Anticancer Peptides by Deep Learning Algorithm. MOLECULAR THERAPY-NUCLEIC ACIDS 2020; 22:862-870. [PMID: 33230481 PMCID: PMC7658571 DOI: 10.1016/j.omtn.2020.10.005] [Citation(s) in RCA: 51] [Impact Index Per Article: 10.2] [Received: 04/02/2020] [Accepted: 10/06/2020] [Indexed: 12/24/2022]
Abstract
Cancer is one of the most dangerous diseases to human health. The accurate prediction of anticancer peptides (ACPs) would be valuable for the development and design of novel anticancer agents. Current deep neural network models have obtained state-of-the-art prediction accuracy for the ACP classification task. However, based on existing studies, it remains unclear which deep learning architecture achieves the best performance. Thus, in this study, we first present a systematic exploration of three important deep learning architectures: convolutional, recurrent, and convolutional-recurrent networks for distinguishing ACPs from non-ACPs. We find that the recurrent neural network with bidirectional long short-term memory cells is superior to other architectures. By utilizing the proposed model, we implement a sequence-based deep learning tool (DeepACP) to accurately predict the likelihood of a peptide exhibiting anticancer activity. The results indicate that DeepACP outperforms several existing methods and can be used as an effective tool for the prediction of anticancer peptides. Furthermore, we visualize and understand the deep learning model. We hope that our strategy can be extended to identify other types of peptides and may provide more assistance to the development of proteomics and new drugs.
Affiliation(s)
- Lezheng Yu
- School of Chemistry and Materials Science, Guizhou Education University, Guiyang 550018, China
- Corresponding author: Lezheng Yu, School of Chemistry and Materials Science, Guizhou Education University, Guiyang 550018, China.
| | - Runyu Jing
- College of Cybersecurity, Sichuan University, Chengdu 610065, China
| | - Fengjuan Liu
- School of Geography and Resources, Guizhou Education University, Guiyang 550018, China
| | - Jiesi Luo
- Department of Pharmacology, School of Pharmacy, Southwest Medical University, Luzhou 646000, Sichuan, China
- Corresponding author: Jiesi Luo, Department of Pharmacology, School of Pharmacy, Southwest Medical University, Luzhou 646000, Sichuan, China.
| | - Yizhou Li
- College of Cybersecurity, Sichuan University, Chengdu 610065, China
|
30
|
Identifying Heat Shock Protein Families from Imbalanced Data by Using Combined Features. COMPUTATIONAL AND MATHEMATICAL METHODS IN MEDICINE 2020; 2020:8894478. [PMID: 33029195 PMCID: PMC7530508 DOI: 10.1155/2020/8894478] [Citation(s) in RCA: 3] [Impact Index Per Article: 0.6] [Received: 08/28/2020] [Revised: 09/08/2020] [Accepted: 09/14/2020] [Indexed: 11/29/2022]
Abstract
Heat shock proteins (HSPs) are ubiquitous in living organisms. HSPs are an essential component for cell growth and survival; the main function of HSPs is controlling the folding and unfolding process of proteins. According to molecular function and mass, HSPs are categorized into six different families: HSP20 (small HSPs), HSP40 (J-proteins), HSP60, HSP70, HSP90, and HSP100. In this paper, improved methods for HSP prediction are proposed: the split amino acid composition (SAAC), the dipeptide composition (DC), the conjoint triad feature (CTF), and the pseudoaverage chemical shift (PseACS) were selected to predict the HSPs with a support vector machine (SVM). In order to overcome the imbalanced data classification problem, the synthetic minority oversampling technique (SMOTE) was used to balance the dataset. The overall accuracy was 99.72% with a balanced dataset in the jackknife test by using the optimized combination feature SAAC+DC+CTF+PseACS, which was 4.81% higher than with the imbalanced dataset and the same combination feature. The Sn, Sp, Acc, and MCC of HSP families in our predictive model were higher than those in existing methods. This improved method may be helpful for protein function prediction.
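For readers unfamiliar with SMOTE, the sketch below illustrates its core idea: each synthetic minority sample is an interpolation between an existing minority sample and one of its nearest minority neighbours. This is a simplified, standard-library-only illustration under assumed names and toy data; it is not the authors' code and not the implementation found in any particular library.

```python
import random


def smote(minority, n_new, k=2, seed=0):
    """Create n_new synthetic points by interpolating between minority
    samples and their k nearest minority neighbours (simplified SMOTE)."""
    rng = random.Random(seed)
    synthetic = []
    for _ in range(n_new):
        x = rng.choice(minority)
        # Rank the other minority points by squared Euclidean distance to x.
        others = [p for p in minority if p is not x]
        others.sort(key=lambda p: sum((a - b) ** 2 for a, b in zip(x, p)))
        neighbour = rng.choice(others[:k])
        gap = rng.random()  # position along the segment from x to neighbour
        synthetic.append([a + gap * (b - a) for a, b in zip(x, neighbour)])
    return synthetic


# Toy minority class with three 2-D feature vectors (illustrative only).
minority_class = [[0.0, 0.0], [1.0, 0.0], [0.0, 1.0]]
new_points = smote(minority_class, n_new=4)
print(new_points)
```

Because every synthetic point lies on a segment between two real minority samples, oversampling is normally applied to training folds only, so that evaluation data remain genuine observations.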
|
31
|
Liscano Y, Oñate-Garzón J, Delgado JP. Peptides with Dual Antimicrobial-Anticancer Activity: Strategies to Overcome Peptide Limitations and Rational Design of Anticancer Peptides. Molecules 2020; 25:E4245. [PMID: 32947811 PMCID: PMC7570524 DOI: 10.3390/molecules25184245] [Citation(s) in RCA: 52] [Impact Index Per Article: 10.4] [Received: 07/24/2020] [Revised: 09/04/2020] [Accepted: 09/11/2020] [Indexed: 12/31/2022] Open
Abstract
Peptides are naturally produced by all organisms and exhibit a wide range of physiological, immunomodulatory, and wound healing functions. Furthermore, they can provide protection against microorganisms and tumor cells. Their multifaceted performance, high selectivity, and reduced toxicity have positioned them as effective therapeutic agents, representing a positive economic impact for pharmaceutical companies. Currently, efforts are being made to invest in the development of new peptides with antimicrobial and anticancer properties, but the poor stability of these molecules in physiological environments has created a bottleneck. Therefore, tools such as nanotechnology and in silico approaches can be applied as alternatives to try to overcome these obstacles. In silico studies provide a priori knowledge that can lead to the development of new anticancer peptides with enhanced biological activity and improved stability. This review focuses on the current status of research in peptides with dual antimicrobial-anticancer activity, including advances in computational biology using in silico analyses as a powerful tool for the study and rational design of these types of peptides.
Affiliation(s)
- Yamil Liscano
- Research Group of Chemical and Biotechnology, Faculty of Basic Sciences, Universidad Santiago de Cali, 760035 Cali, Colombia;
- Research Group of Genetics, Regeneration and Cancer, Institute of Biology, Universidad de Antioquia, 050010 Medellin, Colombia;
| | - Jose Oñate-Garzón
- Research Group of Chemical and Biotechnology, Faculty of Basic Sciences, Universidad Santiago de Cali, 760035 Cali, Colombia;
| | - Jean Paul Delgado
- Research Group of Genetics, Regeneration and Cancer, Institute of Biology, Universidad de Antioquia, 050010 Medellin, Colombia;
|
32
|
Li FM, Gao XW. Predicting Gram-Positive Bacterial Protein Subcellular Location by Using Combined Features. BIOMED RESEARCH INTERNATIONAL 2020; 2020:9701734. [PMID: 32802888 PMCID: PMC7421015 DOI: 10.1155/2020/9701734] [Citation(s) in RCA: 2] [Impact Index Per Article: 0.4] [Received: 05/23/2020] [Revised: 06/30/2020] [Accepted: 07/13/2020] [Indexed: 12/14/2022]
Abstract
Bacteria are abundant in the environment, and Gram-positive bacteria are the most common ones. Some Gram-positive bacteria are very harmful to the human body, so predicting the subcellular location of Gram-positive bacterial proteins is important, including for the development of effective drugs. In this paper, a new Gram-positive bacterial protein subcellular location dataset was established. The amino acid composition, the gene ontology annotation information, the hydropathy dipeptide composition information, the amino acid dipeptide composition information, and the autocovariance average chemical shift information were selected as characteristic parameters, and these parameters were then combined. The locations of Gram-positive bacterial proteins were predicted by the Support Vector Machine (SVM) algorithm, and the overall accuracy (OA) reached 86.1% under the jackknife test. The OA of our predictive model was higher than that of existing methods. This improved method may be helpful for protein function prediction.
Affiliation(s)
- Feng-Min Li
- College of Science, Inner Mongolia Agricultural University, Hohhot 010018, China
| | - Xiao-Wei Gao
- College of Science, Inner Mongolia Agricultural University, Hohhot 010018, China
|
33
|
Ge R, Feng G, Jing X, Zhang R, Wang P, Wu Q. EnACP: An Ensemble Learning Model for Identification of Anticancer Peptides. Front Genet 2020; 11:760. [PMID: 32903636 PMCID: PMC7438906 DOI: 10.3389/fgene.2020.00760] [Citation(s) in RCA: 21] [Impact Index Per Article: 4.2] [Received: 04/20/2020] [Accepted: 06/26/2020] [Indexed: 12/13/2022] Open
Abstract
As cancer remains one of the main threats to human life, developing efficient cancer treatments is urgent. Anticancer peptides, which could overcome the significant side effects and poor results of traditional cancer treatments, have become a new potential alternative in recent years. However, identifying anticancer peptides by experimental methods is time-consuming and resource-intensive, so it is of great significance to develop effective computational tools that quickly and accurately identify potential anticancer peptides from amino acid sequences. For most current computational methods, feature representation plays a key role in their final success. This study proposes a novel, fast and accurate approach to identify anticancer peptides using diversified feature representations and an ensemble learning method. For the feature representations, the information is encoded from multidimensional feature spaces, including sequence composition, sequence order, physicochemical properties, etc. In order to better model the potential relationships of peptides, multiple ensemble classifiers (LightGBMs) are first applied to the different feature sets. The resulting outputs are then used as inputs to a support vector machine classifier, which effectively identifies anticancer peptides. Experimental results on cross-validation and independent test sets demonstrate that our method achieves better or comparable performance compared with other state-of-the-art methods.
Affiliation(s)
- Ruiquan Ge
- Key Laboratory of Complex Systems Modeling and Simulation, School of Computer Science and Technology, Hangzhou Dianzi University, Hangzhou, China
| | - Guanwen Feng
- Xi'an Key Laboratory of Big Data and Intelligent Vision, School of Computer Science and Technology, Xidian University, Xi'an, China
| | - Xiaoyang Jing
- Toyota Technological Institute at Chicago, Chicago, IL, United States
| | - Renfeng Zhang
- Shandong Provincial Hospital Affiliated to Shandong First Medical University, Jinan, China
| | - Pu Wang
- Computer School, Hubei University of Arts and Science, Xiangyang, China
| | - Qing Wu
- Key Laboratory of Complex Systems Modeling and Simulation, School of Computer Science and Technology, Hangzhou Dianzi University, Hangzhou, China
|
34
|
Zhao T, Hu Y, Valsdottir LR, Zang T, Peng J. Identifying drug-target interactions based on graph convolutional network and deep neural network. Brief Bioinform 2020; 22:2141-2150. [PMID: 32367110 DOI: 10.1093/bib/bbaa044] [Citation(s) in RCA: 141] [Impact Index Per Article: 28.2] [Received: 10/31/2019] [Revised: 03/05/2020] [Accepted: 03/06/2020] [Indexed: 12/21/2022] Open
Abstract
Identification of new drug-target interactions (DTIs) is an important but time-consuming and costly step in drug discovery. In recent years, to mitigate these drawbacks, researchers have sought to identify DTIs using computational approaches. However, most existing methods construct drug networks and target networks separately, and then predict novel DTIs based on known associations between the drugs and targets without accounting for associations between drug-protein pairs (DPPs). To incorporate the associations between DPPs into DTI modeling, we built a DPP network based on multiple drugs and proteins in which DPPs are the nodes and the associations between DPPs are the edges of the network. We then propose a novel learning-based framework, 'graph convolutional network (GCN)-DTI', for DTI identification. The model first uses a graph convolutional network to learn the features for each DPP. Second, using the feature representation as an input, it uses a deep neural network to predict the final label. The results of our analysis show that the proposed framework outperforms some state-of-the-art approaches by a large margin.
Affiliation(s)
- Tianyi Zhao
- Department of Computer Science at Harbin Institute of Technology. He currently works as a bioinformatician in Beth Israel Deaconess Medical Center
| | - Yang Hu
- Department of Life Science at Harbin Institute of Technology. His expertise is bioinformatics
| | - Linda R Valsdottir
- MS in Biology and works as a scientific writer at the Smith Center for Outcomes Research in Cardiology at Beth Israel Deaconess Medical Center in Boston, MA. Her work is focused on helping researchers communicate their findings in an effort to translate novel analytical approaches and clinical expertise into improved outcomes for patients
| | - Tianyi Zang
- School of Computer Science and Technology at Harbin Institute of Technology (HIT), China. Before joining HIT in 2009, he was a research fellow at the Department of Computer Science at University of Oxford, UK. His current research is concerned with biomedical bigdata computing and algorithms, deep-learning algorithms for network data, intelligent recommendation algorithms, and modeling and analysis methods for complex systems
| | - Jiajie Peng
- School of Computer Science at Northwestern Polytechnical University. His expertise is computational biology and machine learning. Availability and implementation: https://github.com/zty2009/GCN-DNN/
|
35
|
Song X, Zhuang Y, Lan Y, Lin Y, Min X. Comprehensive Review and Comparison for Anticancer Peptides Identification Models. Curr Protein Pept Sci 2020; 22:CPPS-EPUB-103745. [PMID: 31957608 DOI: 10.2174/1389203721666200117162958] [Citation(s) in RCA: 5] [Impact Index Per Article: 1.0] [Received: 04/30/2019] [Revised: 05/16/2019] [Accepted: 05/30/2019] [Indexed: 11/22/2022]
Abstract
Anticancer peptides (ACPs) eliminate pathogenic bacteria and kill tumor cells, showing no hemolysis and no damage to normal human cells. This unique ability opens up the possibility of using ACPs for therapeutic delivery and suggests potential applications in clinical therapy. Identifying ACPs is one of the most fundamental and central problems in new antitumor drug research. During the past decades, a number of machine learning-based prediction tools have been developed to solve this important task. However, the predictions produced by various tools are difficult to quantify and compare. Therefore, in this article, we provide a comprehensive review of existing machine learning methods for ACP prediction and a fair comparison of the predictors. To evaluate current prediction tools, we conducted a comparative study and analyzed the existing ACP predictors from 10 published studies. The comparative results suggest that a Support Vector Machine-based model with combined features provided a significant improvement in overall performance compared with prediction models based on other machine learning methods.
|
36
|
Basith S, Manavalan B, Hwan Shin T, Lee G. Machine intelligence in peptide therapeutics: A next-generation tool for rapid disease screening. Med Res Rev 2020; 40:1276-1314. [DOI: 10.1002/med.21658] [Citation(s) in RCA: 139] [Impact Index Per Article: 27.8] [Received: 10/18/2019] [Revised: 11/26/2019] [Accepted: 12/16/2019] [Indexed: 12/12/2022]
Affiliation(s)
- Shaherin Basith
- Department of Physiology, Ajou University School of Medicine, Suwon, Republic of Korea
| | - Tae Hwan Shin
- Department of Physiology, Ajou University School of Medicine, Suwon, Republic of Korea
| | - Gwang Lee
- Department of Physiology, Ajou University School of Medicine, Suwon, Republic of Korea
|
37
|
Wu C, Gao R, Zhang Y, De Marinis Y. PTPD: predicting therapeutic peptides by deep learning and word2vec. BMC Bioinformatics 2019; 20:456. [PMID: 31492094 PMCID: PMC6728961 DOI: 10.1186/s12859-019-3006-z] [Citation(s) in RCA: 58] [Impact Index Per Article: 9.7] [Received: 04/28/2019] [Accepted: 07/25/2019] [Indexed: 01/10/2023] Open
Abstract
Background: In the search for therapeutic peptides for disease treatments, many efforts have been made to identify various functional peptides from large numbers of peptide sequence databases. In this paper, we propose an effective computational model that uses deep learning and word2vec to predict therapeutic peptides (PTPD).
Results: Representation vectors of all k-mers were obtained through word2vec based on k-mer co-existence information. The original peptide sequences were then divided into k-mers using the windowing method. The peptide sequences were mapped to the input layer by the embedding vector obtained by word2vec. Three types of filters in the convolutional layers, as well as dropout and max-pooling operations, were applied to construct feature maps. These feature maps were concatenated into a fully connected dense layer, and rectified linear units (ReLU) and dropout operations were included to avoid over-fitting of PTPD. The classification probabilities were generated by a sigmoid function. PTPD was then validated using two datasets: an independent anticancer peptide dataset and a virulent protein dataset, on which it achieved accuracies of 96% and 94%, respectively.
Conclusions: PTPD identified novel therapeutic peptides efficiently, and it is suitable for application as a useful tool in therapeutic peptide design.
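The windowing step described in this abstract can be pictured with a short sketch: a sliding window of width k splits each peptide into overlapping k-mers, which are then mapped to integer indices that an embedding layer could consume. The choice k = 3, the function names, and the example sequences are assumptions made for illustration; the abstract does not state the window size, and this is not the authors' code.

```python
def kmers(seq, k=3):
    """Split a sequence into overlapping k-mers with a stride-1 window."""
    return [seq[i:i + k] for i in range(len(seq) - k + 1)]


def build_vocab(sequences, k=3):
    """Assign each distinct k-mer an integer index in order of first use."""
    vocab = {}
    for seq in sequences:
        for kmer in kmers(seq, k):
            vocab.setdefault(kmer, len(vocab))
    return vocab


# Illustrative peptide strings, not taken from the paper's datasets.
peptides = ["FLPIIAK", "FLPLLAG"]
tokens = kmers(peptides[0])
vocab = build_vocab(peptides)
encoded = [vocab[t] for t in tokens]
print(tokens)   # overlapping 3-mers of the first peptide
print(encoded)  # integer indices suitable for an embedding lookup
```

In a word2vec-style pipeline, the k-mer lists would play the role of sentences, so that k-mers that co-occur in similar contexts receive similar vectors.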
Affiliation(s)
- Chuanyan Wu
- School of Control Science and Engineering, Shandong University, Jingshi Road, Jinan, 250061, China; Diabetes and Endocrinology, Lund University, Malmo, 20502, Sweden
| | - Rui Gao
- School of Control Science and Engineering, Shandong University, Jingshi Road, Jinan, 250061, China.
| | - Yusen Zhang
- School of Mathematics and Statistics, Shandong University at Weihai, Weihai, 264209, China
| | - Yang De Marinis
- Diabetes and Endocrinology, Lund University, Malmo, 20502, Sweden
|
38
|
Ma Z, Zhang B, Fan Y, Wang M, Kebebe D, Li J, Liu Z. Traditional Chinese medicine combined with hepatic targeted drug delivery systems: A new strategy for the treatment of liver diseases. Biomed Pharmacother 2019; 117:109128. [PMID: 31234023 DOI: 10.1016/j.biopha.2019.109128] [Citation(s) in RCA: 36] [Impact Index Per Article: 6.0] [Received: 04/01/2019] [Revised: 06/12/2019] [Accepted: 06/12/2019] [Indexed: 12/18/2022] Open
Abstract
Liver diseases are clinically common and present a substantial public health issue. Many of the currently available drugs for the treatment of liver diseases suffer from limitations that include low hepatic distribution, lack of target effects, poor in vivo stability and adverse effects on other organs. Consequently, conventional treatment of hepatic diseases is ineffective. Traditional Chinese medicine (TCM) is commonly used in the treatment of liver diseases worldwide, particularly in China, and has advantages over conventional therapy. Hepatic targeted drug delivery systems (HTDDS) can be designed to enhance clinical efficacy in the treatment of liver diseases. We have conducted an extensive review of 335 studies reported since 1964. These included about 166 references involving the treatment of liver diseases with TCM (covering active components of TCM, single TCM and Chinese medicine formulas), 169 reports on HTDDS and background studies on liver-related diseases. Here we review the long history of TCM in the treatment of liver diseases. We have also reviewed the status of studies on active components of TCM using nanotechnology-based targeted delivery systems to provide support for further research and development of TCM-based targeted preparations for the treatment of liver disease.
Affiliation(s)
- Zhe Ma
- Engineering Research Center of Modern Chinese Medicine Discovery and Preparation Technique, Ministry of Education, Tianjin University of Traditional Chinese Medicine, Tianjin, 300193, China; Institute of Traditional Chinese Medicine, Tianjin University of Traditional Chinese Medicine, Tianjin, 300193, China
| | - Bing Zhang
- Engineering Research Center of Modern Chinese Medicine Discovery and Preparation Technique, Ministry of Education, Tianjin University of Traditional Chinese Medicine, Tianjin, 300193, China; Institute of Traditional Chinese Medicine, Tianjin University of Traditional Chinese Medicine, Tianjin, 300193, China
| | - Yuqi Fan
- Institute of Traditional Chinese Medicine, Tianjin University of Traditional Chinese Medicine, Tianjin, 300193, China; School of Integrative Medicine, Tianjin University of Traditional Chinese Medicine, Tianjin, 300193, China
| | - Meng Wang
- Engineering Research Center of Modern Chinese Medicine Discovery and Preparation Technique, Ministry of Education, Tianjin University of Traditional Chinese Medicine, Tianjin, 300193, China; Institute of Traditional Chinese Medicine, Tianjin University of Traditional Chinese Medicine, Tianjin, 300193, China
| | - Dereje Kebebe
- Engineering Research Center of Modern Chinese Medicine Discovery and Preparation Technique, Ministry of Education, Tianjin University of Traditional Chinese Medicine, Tianjin, 300193, China; Institute of Traditional Chinese Medicine, Tianjin University of Traditional Chinese Medicine, Tianjin, 300193, China; School of Pharmacy, Institute of Health Sciences, Jimma University, Jimma, Ethiopia
| | - Jiawei Li
- Engineering Research Center of Modern Chinese Medicine Discovery and Preparation Technique, Ministry of Education, Tianjin University of Traditional Chinese Medicine, Tianjin, 300193, China; Institute of Traditional Chinese Medicine, Tianjin University of Traditional Chinese Medicine, Tianjin, 300193, China; School of Chinese Materia Medica, Tianjin University of Traditional Chinese Medicine, Tianjin, 300193, China.
| | - Zhidong Liu
- Engineering Research Center of Modern Chinese Medicine Discovery and Preparation Technique, Ministry of Education, Tianjin University of Traditional Chinese Medicine, Tianjin, 300193, China; Institute of Traditional Chinese Medicine, Tianjin University of Traditional Chinese Medicine, Tianjin, 300193, China.
| |
|
39
|
Schaduangrat N, Nantasenamat C, Prachayasittikul V, Shoombuatong W. ACPred: A Computational Tool for the Prediction and Analysis of Anticancer Peptides. Molecules 2019; 24:E1973. [PMID: 31121946 PMCID: PMC6571645 DOI: 10.3390/molecules24101973] [Citation(s) in RCA: 138] [Impact Index Per Article: 23.0] [Received: 04/01/2019] [Revised: 05/07/2019] [Accepted: 05/17/2019] [Indexed: 01/01/2023]
Abstract
Anticancer peptides (ACPs) have emerged as a new class of therapeutic agent for cancer treatment due to their lower toxicity as well as greater efficacy, selectivity and specificity when compared to conventional small molecule drugs. However, the experimental identification of ACPs remains a time-consuming and expensive endeavor. Therefore, it is desirable to develop and improve upon existing computational models for predicting and characterizing ACPs. In this study, we present a bioinformatics tool called ACPred, which is an interpretable tool for the prediction and characterization of the anticancer activities of peptides. ACPred was developed by utilizing powerful machine learning models (support vector machine and random forest) and various classes of peptide features. A jackknife cross-validation test showed that ACPred can achieve an overall accuracy of 95.61% in identifying ACPs. In addition, analysis revealed the following distinguishing characteristics that ACPs possess: (i) hydrophobic residues enhance the cationic properties of α-helical ACPs, resulting in better cell penetration; (ii) the amphipathic nature of the α-helical structure plays a crucial role in its mechanism of cytotoxicity; and (iii) the formation of disulfide bridges on β-sheets is vital for structural maintenance, which correlates with its ability to kill cancer cells. Finally, for the convenience of experimental scientists, the ACPred web server was established and made freely available online.
Affiliation(s)
- Nalini Schaduangrat
- Center of Data Mining and Biomedical Informatics, Faculty of Medical Technology, Mahidol University, Bangkok 10700, Thailand.
| | - Chanin Nantasenamat
- Center of Data Mining and Biomedical Informatics, Faculty of Medical Technology, Mahidol University, Bangkok 10700, Thailand.
| | - Virapong Prachayasittikul
- Department of Clinical Microbiology and Applied Technology, Faculty of Medical Technology, Mahidol University, Bangkok 10700, Thailand.
| | - Watshara Shoombuatong
- Center of Data Mining and Biomedical Informatics, Faculty of Medical Technology, Mahidol University, Bangkok 10700, Thailand.
| |
|
40
|
Ma Z, Fan Y, Wu Y, Kebebe D, Zhang B, Lu P, Pi J, Liu Z. Traditional Chinese medicine-combination therapies utilizing nanotechnology-based targeted delivery systems: a new strategy for antitumor treatment. Int J Nanomedicine 2019; 14:2029-2053. [PMID: 30962686 PMCID: PMC6435121 DOI: 10.2147/ijn.s197889] [Citation(s) in RCA: 59] [Impact Index Per Article: 9.8] [Indexed: 12/24/2022]
Abstract
Cancer is a major public health problem, and is now the world’s leading cause of death. Traditional Chinese medicine (TCM)-combination therapy is a new treatment approach and a vital therapeutic strategy for cancer, as it exhibits promising antitumor potential. Nano-targeted drug-delivery systems have remarkable advantages and allow the development of TCM-combination therapies by systematically controlling drug release and delivering drugs to solid tumors. In this review, the anticancer activity of TCM compounds is introduced. The combined use of TCM for antitumor treatment is analyzed and summarized. These combination therapies, using a single nanocarrier system, namely codelivery, are analyzed, issues that require attention are determined, and future perspectives are identified. We carried out a systematic review of >280 studies published in PubMed since 1985 (no patents involved), in order to provide a few basic considerations in terms of the design principles and management of targeted nanotechnology-based TCM-combination therapies.
Affiliation(s)
- Zhe Ma
- Engineering Research Center of Modern Chinese Medicine Discovery and Preparation Technique, Ministry of Education, Tianjin University of Traditional Chinese Medicine, Tianjin 300193, China; Institute of Traditional Chinese Medicine, Tianjin University of Traditional Chinese Medicine, Tianjin 300193, China
| | - Yuqi Fan
- Institute of Traditional Chinese Medicine, Tianjin University of Traditional Chinese Medicine, Tianjin 300193, China; School of Integrative Medicine, Tianjin University of Traditional Chinese Medicine, Tianjin 300193, China
| | - Yumei Wu
- Engineering Research Center of Modern Chinese Medicine Discovery and Preparation Technique, Ministry of Education, Tianjin University of Traditional Chinese Medicine, Tianjin 300193, China; Institute of Traditional Chinese Medicine, Tianjin University of Traditional Chinese Medicine, Tianjin 300193, China
| | - Dereje Kebebe
- Engineering Research Center of Modern Chinese Medicine Discovery and Preparation Technique, Ministry of Education, Tianjin University of Traditional Chinese Medicine, Tianjin 300193, China; Institute of Traditional Chinese Medicine, Tianjin University of Traditional Chinese Medicine, Tianjin 300193, China; School of Pharmacy, Institute of Health Sciences, Jimma University, Jimma, Ethiopia
| | - Bing Zhang
- Engineering Research Center of Modern Chinese Medicine Discovery and Preparation Technique, Ministry of Education, Tianjin University of Traditional Chinese Medicine, Tianjin 300193, China; Institute of Traditional Chinese Medicine, Tianjin University of Traditional Chinese Medicine, Tianjin 300193, China
| | - Peng Lu
- Engineering Research Center of Modern Chinese Medicine Discovery and Preparation Technique, Ministry of Education, Tianjin University of Traditional Chinese Medicine, Tianjin 300193, China; Institute of Traditional Chinese Medicine, Tianjin University of Traditional Chinese Medicine, Tianjin 300193, China
| | - Jiaxin Pi
- Engineering Research Center of Modern Chinese Medicine Discovery and Preparation Technique, Ministry of Education, Tianjin University of Traditional Chinese Medicine, Tianjin 300193, China; Institute of Traditional Chinese Medicine, Tianjin University of Traditional Chinese Medicine, Tianjin 300193, China
| | - Zhidong Liu
- Engineering Research Center of Modern Chinese Medicine Discovery and Preparation Technique, Ministry of Education, Tianjin University of Traditional Chinese Medicine, Tianjin 300193, China; Institute of Traditional Chinese Medicine, Tianjin University of Traditional Chinese Medicine, Tianjin 300193, China
| |
|
41
|
Identifying anticancer peptides by using a generalized chaos game representation. J Math Biol 2018; 78:441-463. [PMID: 30291366 DOI: 10.1007/s00285-018-1279-x] [Citation(s) in RCA: 3] [Impact Index Per Article: 0.4] [Received: 09/25/2017] [Revised: 08/01/2018] [Indexed: 10/28/2022]
Abstract
We generalize chaos game representation (CGR) to higher dimensional spaces while maintaining its bijection, keeping the method sufficiently representative and mathematically rigorous compared with previous attempts. We first state and prove the asymptotic property of CGR and our generalized chaos game representation (GCGR) method. It follows that the dissimilarity of sequences which possess identical subsequences at distinct positions is lowered exponentially with the length of the identical subsequence; this effect had been taking place unbeknownst to researchers. By highlighting it now, we show that the effect fundamentally supports (G)CGR as a similarity measure or feature extraction technique. We develop two feature extraction techniques: GCGR-Centroid and GCGR-Variance. We use GCGR-Centroid to analyze the similarity between protein sequences by using the datasets of 9 ND5, 24 TF and 50 beta-globin proteins. We obtain results consistent with previous studies, which supports the significance of the method. Finally, by utilizing support vector machines, we train an anticancer peptide prediction model using both GCGR-Centroid and GCGR-Variance, and achieve significantly higher prediction performance on 3 well-studied anticancer peptide datasets.
|
42
|
Shoombuatong W, Schaduangrat N, Nantasenamat C. Unraveling the bioactivity of anticancer peptides as deduced from machine learning. EXCLI JOURNAL 2018; 17:734-752. [PMID: 30190664 PMCID: PMC6123611 DOI: 10.17179/excli2018-1447] [Citation(s) in RCA: 51] [Impact Index Per Article: 7.3] [Received: 06/20/2018] [Accepted: 07/10/2018] [Indexed: 12/13/2022]
Abstract
Cancer imposes a global health burden as it represents one of the leading causes of morbidity and mortality while also giving rise to a significant economic burden owing to the associated expenditures for its monitoring and treatment. In spite of advancements in cancer therapy, the low success rate and recurrence of tumors have necessitated the ongoing search for new therapeutic agents. Aside from drugs based on small molecules and protein-based biopharmaceuticals, there has been an intense effort geared towards the development of peptide-based therapeutics owing to their favorable and intrinsic properties of being relatively small, highly selective, potent, safe and low in production costs. In spite of these advantages, there are several inherent weaknesses that are in need of attention in the design and development of therapeutic peptides. An abundance of data on bioactive and therapeutic peptides has accumulated over the years, and the burgeoning area of artificial intelligence has set the stage for the lucrative utilization of machine learning to make sense of these large and high-dimensional data. This review summarizes the current state of the art in the application of machine learning for studying the bioactivity of anticancer peptides, along with the future outlook of the field. Data and R codes used in the analysis herein are available on GitHub at https://github.com/Shoombuatong2527/anticancer-peptides-review.
Affiliation(s)
- Watshara Shoombuatong
- Center of Data Mining and Biomedical Informatics, Faculty of Medical Technology, Mahidol University, Bangkok 10700, Thailand
| | - Nalini Schaduangrat
- Center of Data Mining and Biomedical Informatics, Faculty of Medical Technology, Mahidol University, Bangkok 10700, Thailand
| | - Chanin Nantasenamat
- Center of Data Mining and Biomedical Informatics, Faculty of Medical Technology, Mahidol University, Bangkok 10700, Thailand
| |
|