1
Garai S, Thomas J, Dey P, Das D. LGBM-ACp: an ensemble model for anticancer peptide prediction and in silico screening with potential drug targets. Mol Divers 2024; 28:1965-1981. [PMID: 36637711 DOI: 10.1007/s11030-023-10602-0] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 11/07/2022] [Accepted: 01/06/2023] [Indexed: 01/14/2023]
Abstract
Conventional cancer therapies are highly expensive and have serious complications. An alternative approach now emphasizes the development of small, biologically active peptides without acute toxicity. Experimental screening to find curative anticancer peptides (ACPs) often runs into multiple obstacles and is time-consuming. Consequently, an effective computational technique to identify promising ACP candidates prior to preclinical research is in high demand. This study proposed a machine-learning framework that used the light gradient-boosting machine as a classifier and two compositional and two binary profile features as input. The ensemble model displayed an accuracy, MCC, and AUROC of 97.52%, 0.91, and 0.98, respectively, which outclassed most of the existing sequence-based computational tools. A distinct dataset of non-mutagenic, non-toxic, and non-inhibitory Cytochrome P-450 peptides was used to validate the hybrid model. The most relevant ACP in the alternative dataset was compared with two standard ACPs, beta-defensin 2 and cecropin-A. Molecular docking of the predicted peptide revealed that it has a strong binding affinity with twenty-five anticancer drug targets, most notably phosphoenolpyruvate carboxykinase (-7.2 kcal/mol). Additionally, molecular dynamics simulation and principal component analysis supported the stability of the peptide-receptor complex. Overall, the present findings take a step forward in rational drug design through rapid identification and screening of therapeutic peptides.
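The abstract above does not say which compositional or binary profile features were used. Purely as an illustration, the sketch below computes one common compositional feature (amino acid composition) and one common binary profile (a one-hot encoding of the first few residues). The function names, the example peptide, and the choice of five N-terminal residues are assumptions made for this example, not details taken from the paper.

```python
# Illustrative sketch only: feature choices and names are assumptions,
# not the feature set used by LGBM-ACp.
AMINO_ACIDS = "ACDEFGHIKLMNPQRSTVWY"  # the 20 standard residues


def amino_acid_composition(seq):
    """Return the fraction of each standard residue in the peptide (20 values)."""
    seq = seq.upper()
    if not seq:
        raise ValueError("peptide sequence must not be empty")
    return [seq.count(aa) / len(seq) for aa in AMINO_ACIDS]


def binary_profile(seq, n_residues=5):
    """One-hot encode the first n_residues positions (20 bits per position).

    Positions past the end of a short peptide are left as all zeros.
    """
    seq = seq.upper()
    profile = []
    for i in range(n_residues):
        residue = seq[i] if i < len(seq) else None
        profile.extend(1 if aa == residue else 0 for aa in AMINO_ACIDS)
    return profile


peptide = "GLFDIVKKVV"  # hypothetical peptide, for illustration only
features = amino_acid_composition(peptide) + binary_profile(peptide)
print(len(features))  # 20 composition values + 5 * 20 profile bits
```

A vector built this way could be passed to any tabular classifier, including a gradient-boosting model; the concatenation order is arbitrary as long as it is the same for every peptide.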
Affiliation(s)
- Swarnava Garai
  - Department of Bioengineering, NIT Agartala, Tripura, 799046, India
- Juanit Thomas
  - Department of Bioengineering, NIT Agartala, Tripura, 799046, India
- Palash Dey
  - Civil Engineering Department, The ICFAI University, Tripura, 799210, India
- Deeplina Das
  - Department of Bioengineering, NIT Agartala, Tripura, 799046, India
2
Ghafoor H, Asim MN, Ibrahim MA, Ahmed S, Dengel A. CAPTURE: Comprehensive anti-cancer peptide predictor with a unique amino acid sequence encoder. Comput Biol Med 2024; 176:108538. [PMID: 38759585 DOI: 10.1016/j.compbiomed.2024.108538] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 01/08/2024] [Revised: 04/26/2024] [Accepted: 04/28/2024] [Indexed: 05/19/2024]
Abstract
The key properties of anticancer peptides (ACPs), including bioactivity, high efficacy, low toxicity, and lack of drug resistance, make them ideal candidates for cancer therapies. To explore the potential of ACPs and accelerate the development of cancer therapies, 53 artificial intelligence supported computational predictors have been developed for ACP and non-ACP classification, but only one predictor has been developed for annotating ACP functional types. Moreover, these predictors extract amino acid distribution patterns to transform peptide sequences into statistical vectors that are then fed to classifiers for discriminating peptide sequences and annotating peptide functional classes. Overall, these predictors fail to extract diverse types of amino acid distribution patterns from peptide sequences. This paper presents a unique CARE encoder that transforms peptide sequences into statistical vectors by extracting 4 different types of distribution patterns: correlation, distribution, composition, and transition. On a public benchmark dataset, the potential of the proposed encoder is explored under two evaluation settings, intrinsic and extrinsic. Extrinsic evaluation indicates that 12 different machine learning classifiers achieve superior performance with the proposed encoder as compared to 55 existing encoders. Furthermore, intrinsic evaluation reveals that, unlike existing encoders, the proposed encoder generates more discriminative clusters for the ACP and non-ACP classes. Across 8 public benchmark ACP and non-ACP classification datasets, the CAPTURE predictor, which is based on the proposed encoder and an AdaBoost classifier, outperforms existing predictors by an average accuracy, recall, and MCC of 1%, 4%, and 2%, respectively. In a generalizability case study across 7 benchmark antimicrobial peptide classification datasets, CAPTURE surpasses existing predictors by an average AU-ROC of 2%. The CAPTURE predictive pipeline combined with the label powerset method outperforms the state-of-the-art ACP functional type predictor by 5%, 5%, 5%, 6%, and 3% in terms of average accuracy, subset accuracy, precision, recall, and F1, respectively. The CAPTURE web application is available at https://sds_genetic_analysis.opendfki.de/CAPTURE.
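The label powerset method mentioned above turns a multi-label problem into a single multi-class problem by treating every distinct combination of labels as its own class. The following is a minimal sketch of that transformation, not the CAPTURE implementation; the function names and the example functional-type labels are hypothetical.

```python
# Minimal label powerset sketch; names and labels are hypothetical.
def label_powerset_fit(label_sets):
    """Assign one class id to each distinct label combination, in order of first appearance."""
    classes = {}
    for labels in label_sets:
        key = frozenset(labels)
        if key not in classes:
            classes[key] = len(classes)
    return classes


def label_powerset_transform(label_sets, classes):
    """Encode each label combination as its single class id."""
    return [classes[frozenset(labels)] for labels in label_sets]


def label_powerset_inverse(class_ids, classes):
    """Decode class ids back into sorted label lists."""
    inverse = {cid: sorted(key) for key, cid in classes.items()}
    return [inverse[cid] for cid in class_ids]


# Hypothetical functional-type annotations for four peptides.
y = [["breast", "lung"], ["colon"], ["lung", "breast"], ["colon", "lung"]]
classes = label_powerset_fit(y)
encoded = label_powerset_transform(y, classes)
decoded = label_powerset_inverse(encoded, classes)
print(encoded)  # [0, 1, 0, 2]
```

Any ordinary multi-class classifier can then be trained on the encoded ids. The trade-off is that label combinations never seen during fitting cannot be predicted.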
Affiliation(s)
- Hina Ghafoor
  - Department of Computer Science, Rhineland-Palatinate Technical University of Kaiserslautern-Landau, Kaiserslautern, 67663, Germany
  - German Research Center for Artificial Intelligence GmbH, Kaiserslautern, 67663, Germany
- Muhammad Nabeel Asim
  - German Research Center for Artificial Intelligence GmbH, Kaiserslautern, 67663, Germany
- Muhammad Ali Ibrahim
  - Department of Computer Science, Rhineland-Palatinate Technical University of Kaiserslautern-Landau, Kaiserslautern, 67663, Germany
  - German Research Center for Artificial Intelligence GmbH, Kaiserslautern, 67663, Germany
- Sheraz Ahmed
  - German Research Center for Artificial Intelligence GmbH, Kaiserslautern, 67663, Germany
- Andreas Dengel
  - Department of Computer Science, Rhineland-Palatinate Technical University of Kaiserslautern-Landau, Kaiserslautern, 67663, Germany
  - German Research Center for Artificial Intelligence GmbH, Kaiserslautern, 67663, Germany
3
Xu M, Pang J, Ye Y, Zhang Z. Integrating Traditional Machine Learning and Deep Learning for Precision Screening of Anticancer Peptides: A Novel Approach for Efficient Drug Discovery. ACS Omega 2024; 9:16820-16831. [PMID: 38617603 PMCID: PMC11007766 DOI: 10.1021/acsomega.4c01374] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 02/12/2024] [Revised: 03/03/2024] [Accepted: 03/22/2024] [Indexed: 04/16/2024]
Abstract
The rapid and effective identification of anticancer peptides (ACPs) by computer technology provides a new perspective for cancer treatment. In the identification process of ACPs, accurate sequence encoding and effective classification models are crucial for predicting their biological activity. Traditional machine learning methods have been widely applied in sequence analysis, but deep learning provides a new approach to capture sequence complexity. In this study, a two-stage ACP classification model was proposed. Three novel coding strategies were explored; two mainstream Natural Language Processing (NLP) models and 11 machine learning models were fused to identify ACPs, which significantly improved the prediction accuracy for ACPs. We analyzed the correlation between peptide chain amino acids and evaluated the performance of the model using the ROC curve and the t-SNE dimensionality reduction technique. The results indicated that the fusion of the M3E-base deep learning model with the KNeighborsDist machine learning model, especially when the semantic information of amino acid sequences was considered, achieved the highest average accuracy (AvgAcc) of 0.939, with an AUC value as high as 0.97. In vitro cell experiments were then used to verify that the two ACPs predicted by the model had antitumor efficacy. This study provides a convenient and effective method for screening ACPs. With further optimization and testing, these strategies have the potential to play an important role in drug discovery and design.
Affiliation(s)
- Meiqi Xu
  - Key Laboratory of Novel Targets and Drug Study for Neural Repair of Zhejiang Province, School of Medicine, Hangzhou City University, Hangzhou 310015, Zhejiang, China
- Jiefu Pang
  - School of Computer Science, Hangzhou Dianzi University, Hangzhou 310018, Zhejiang, China
- Yangyang Ye
  - Key Laboratory of Novel Targets and Drug Study for Neural Repair of Zhejiang Province, School of Medicine, Hangzhou City University, Hangzhou 310015, Zhejiang, China
- Ziyi Zhang
  - Key Laboratory of Novel Targets and Drug Study for Neural Repair of Zhejiang Province, School of Medicine, Hangzhou City University, Hangzhou 310015, Zhejiang, China
4
Azad H, Akbar MY, Sarfraz J, Haider W, Riaz MN, Ali GM, Ghazanfar S. G-ACP: a machine learning approach to the prediction of therapeutic peptides for gastric cancer. J Biomol Struct Dyn 2024:1-14. [PMID: 38450672 DOI: 10.1080/07391102.2024.2323141] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 11/14/2023] [Accepted: 02/15/2024] [Indexed: 03/08/2024]
Abstract
Conventional gastrointestinal (GI) cancer treatments are expensive and carry major hazards. A different strategy now places more emphasis on creating small, biologically active peptides that do not cause severe toxicity. Anticancer peptides (ACPs) are found through experimental screening, which is time-consuming and frequently fraught with difficulties. Gastric ACPs are emerging as a promising GI cancer treatment. Identifying novel gastric ACPs is crucial for a better understanding of their mechanisms of action and for the treatment of gastric cancer. Because the post-genomic era has produced peptide sequences on a massive scale, rapid and effective computational identification of ACPs is essential. Several statistical techniques for distinguishing ACPs from non-ACPs have recently been developed. Despite significant progress, there is no specific model for the prediction of gastric ACPs, even though a specific model would predict a particular type of peptide more accurately and quickly. To address this, a reliable framework for the accurate identification of gastric ACPs is developed. The technique uses four feature encodings, Amino Acid Composition, Dipeptide Composition, Tripeptide Composition (TPC), and Pseudo Amino Acid Composition (PAAC), along with one hybrid encoding that combines them. Machine learning algorithms provide high-performance, accurate prediction tools. Moreover, the most informative features are selected using an ANOVA-based F score for feature pruning. Experiments on a range of algorithms are carried out to identify the best-performing approach. In the empirical results, Naïve Bayes with the TPC and hybrid feature spaces outperforms other methods, with an accuracy of 0.99 on the testing dataset. External validation is carried out to assess model generalization; on the external datasets, Extra Trees with PAAC features performs best, with an accuracy of 0.94. The comparison study shows that the suggested model predicts gastric ACPs more accurately and will be useful in drug development for gastric cancer. The predictive model can be freely accessed at https://github.com/humeraazad10/G-ACP.git. Communicated by Ramaswamy H. Sarma.
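The ANOVA-based F score used for feature pruning can be illustrated with a short sketch. For one feature, the one-way ANOVA F statistic is the between-class mean square divided by the within-class mean square; features with larger F separate the classes better. The code below is a plain-Python illustration under that standard definition, with hypothetical function names and toy data, and is not taken from the G-ACP repository.

```python
# One-way ANOVA F score per feature; names and data are hypothetical.
def anova_f_score(values, labels):
    """F = (SS_between / (k - 1)) / (SS_within / (N - k)) for one feature."""
    groups = {}
    for value, label in zip(values, labels):
        groups.setdefault(label, []).append(value)
    n_total, k = len(values), len(groups)
    grand_mean = sum(values) / n_total
    ss_between = sum(
        len(g) * (sum(g) / len(g) - grand_mean) ** 2 for g in groups.values()
    )
    ss_within = sum(
        (v - sum(g) / len(g)) ** 2 for g in groups.values() for v in g
    )
    if ss_within == 0:
        # Degenerate case: no spread inside the classes.
        return 0.0 if ss_between == 0 else float("inf")
    return (ss_between / (k - 1)) / (ss_within / (n_total - k))


def select_top_k(rows, labels, k):
    """Return the indices of the k features with the highest F score."""
    n_features = len(rows[0])
    scores = [
        anova_f_score([row[j] for row in rows], labels) for j in range(n_features)
    ]
    return sorted(range(n_features), key=lambda j: scores[j], reverse=True)[:k]


# Toy data: feature 0 separates the classes, feature 1 does not.
X = [[1, 5.0], [2, 5.1], [3, 4.9], [4, 5.0], [5, 5.1], [6, 4.9]]
y = [0, 0, 0, 1, 1, 1]
f_first = anova_f_score([row[0] for row in X], y)
top = select_top_k(X, y, 1)
print(f_first, top)
```

In practice the cutoff k (or an F threshold) is a tuning choice; the abstract does not state which value was used.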
Affiliation(s)
- Humera Azad
  - Department of Biosciences (Bioinformatics) Islamabad, Comsats University Islamabad, Pakistan
- Muhammad Yasir Akbar
  - National Institute for Genomics and Advanced Biotechnology (NIGAB), National Agricultural Research Center (NARC), Pakistan
- Waseem Haider
  - Department of Biosciences (Bioinformatics) Islamabad, Comsats University Islamabad, Pakistan
- Muhammad Naeem Riaz
  - National Institute for Genomics and Advanced Biotechnology (NIGAB), National Agricultural Research Center (NARC), Pakistan
- Ghulam Muhammad Ali
  - Department of Biosciences (Bioinformatics) Islamabad, Comsats University Islamabad, Pakistan
- Shakira Ghazanfar
  - National Institute for Genomics and Advanced Biotechnology (NIGAB), National Agricultural Research Center (NARC), Pakistan
5
Sun M, Hu H, Pang W, Zhou Y. ACP-BC: A Model for Accurate Identification of Anticancer Peptides Based on Fusion Features of Bidirectional Long Short-Term Memory and Chemically Derived Information. Int J Mol Sci 2023; 24:15447. [PMID: 37895128 PMCID: PMC10607064 DOI: 10.3390/ijms242015447] [Citation(s) in RCA: 1] [Impact Index Per Article: 1.0] [Received: 08/12/2023] [Revised: 09/10/2023] [Accepted: 10/20/2023] [Indexed: 10/29/2023]
Abstract
Anticancer peptides (ACPs) have been proven to possess potent anticancer activities. Although computational methods have emerged for rapid ACP identification, their accuracy still needs improvement. In this study, we propose ACP-BC, a three-channel end-to-end model that utilizes various combinations of data augmentation techniques. In the first channel, features are extracted from the raw sequence using a bidirectional long short-term memory network. In the second channel, the entire sequence is converted into a chemical molecular formula, which is further simplified using Simplified Molecular Input Line Entry System notation to obtain deep abstract features through a bidirectional encoder representation transformer (BERT). In the third channel, we manually selected four effective features: dipeptide composition, binary profile feature, k-mer sparse matrix, and pseudo amino acid composition. Notably, the application of chemical BERT to ACP prediction is novel and was successfully integrated into our model. To validate the performance of our model, we selected two benchmark datasets, ACPs740 and ACPs240. ACP-BC achieved prediction accuracies of 87% and 90% on these two datasets, respectively, representing improvements of 1.3% and 7% over existing state-of-the-art methods on these datasets. These systematic comparative experiments show that ACP-BC can effectively identify anticancer peptides.
Affiliation(s)
- Mingwei Sun
  - Key Laboratory of Symbol Computation and Knowledge Engineering of Ministry of Education, College of Computer Science and Technology, Jilin University, Changchun 130012, China
- Haoyuan Hu
  - Key Laboratory of Symbol Computation and Knowledge Engineering of Ministry of Education, College of Computer Science and Technology, Jilin University, Changchun 130012, China
- Wei Pang
  - School of Mathematical and Computer Sciences, Heriot-Watt University, Edinburgh EH14 4AS, UK
- You Zhou
  - Key Laboratory of Symbol Computation and Knowledge Engineering of Ministry of Education, College of Computer Science and Technology, Jilin University, Changchun 130012, China
  - College of Software, Jilin University, Changchun 130012, China
6
Zhou W, Liu Y, Li Y, Kong S, Wang W, Ding B, Han J, Mou C, Gao X, Liu J. TriNet: A tri-fusion neural network for the prediction of anticancer and antimicrobial peptides. Patterns (N Y) 2023; 4:100702. [PMID: 36960450 PMCID: PMC10028424 DOI: 10.1016/j.patter.2023.100702] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 12/09/2022] [Revised: 12/20/2022] [Accepted: 02/03/2023] [Indexed: 03/04/2023]
Abstract
The accurate identification of anticancer peptides (ACPs) and antimicrobial peptides (AMPs) remains a computational challenge. We propose a tri-fusion neural network termed TriNet for the accurate prediction of both ACPs and AMPs. The framework first defines three kinds of features to capture the peptide information contained in serial fingerprints, sequence evolutions, and physicochemical properties, which are then fed into three parallel modules: a convolutional neural network module enhanced by channel attention, a bidirectional long short-term memory module, and an encoder module for training and final classification. To achieve a better training effect, TriNet is trained via a training approach using iterative interactions between the samples in the training and validation datasets. TriNet is tested on multiple challenging ACP and AMP datasets and exhibits significant improvements over various state-of-the-art methods. The web server and source code of TriNet are respectively available at http://liulab.top/TriNet/server and https://github.com/wanyunzh/TriNet.
Affiliation(s)
- Wanyun Zhou
  - SDU-ANU Joint Science College, Shandong University (Weihai), Weihai 264209, China
- Yufei Liu
  - SDU-ANU Joint Science College, Shandong University (Weihai), Weihai 264209, China
- Yingxin Li
  - School of Mechanical, Electrical & Information Engineering, Shandong University (Weihai), Weihai 264209, China
- Siqi Kong
  - SDU-ANU Joint Science College, Shandong University (Weihai), Weihai 264209, China
- Weilin Wang
  - SDU-ANU Joint Science College, Shandong University (Weihai), Weihai 264209, China
- Boyun Ding
  - SDU-ANU Joint Science College, Shandong University (Weihai), Weihai 264209, China
- Jiyun Han
  - School of Mathematics and Statistics, Shandong University (Weihai), Weihai 264209, China
- Chaozhou Mou
  - School of Mathematics and Statistics, Shandong University (Weihai), Weihai 264209, China
- Xin Gao
  - Computational Bioscience Research Center (CBRC), Computer, Electrical and Mathematical Sciences and Engineering Division, King Abdullah University of Science and Technology (KAUST), Thuwal 23955, Saudi Arabia
- Juntao Liu
  - School of Mathematics and Statistics, Shandong University (Weihai), Weihai 264209, China
7
Ghaly G, Tallima H, Dabbish E, Badr ElDin N, Abd El-Rahman MK, Ibrahim MAA, Shoeib T. Anti-Cancer Peptides: Status and Future Prospects. Molecules 2023; 28:1148. [PMID: 36770815 PMCID: PMC9920184 DOI: 10.3390/molecules28031148] [Citation(s) in RCA: 12] [Impact Index Per Article: 12.0] [Received: 11/27/2022] [Revised: 12/26/2022] [Accepted: 01/19/2023] [Indexed: 01/26/2023]
Abstract
The dramatic rise in cancer incidence, alongside treatment deficiencies, has elevated cancer to the second-leading cause of death globally. The increasing morbidity and mortality of this disease can be traced back to a number of causes, including treatment-related side effects, drug resistance, inadequate curative treatment and tumor relapse. Recently, anti-cancer bioactive peptides (ACPs) have emerged as a potential therapeutic choice within the pharmaceutical arsenal due to their high penetration, specificity and fewer side effects. In this contribution, we present a general overview of the literature concerning the conformational structures, modes of action and membrane interaction mechanisms of ACPs, as well as provide recent examples of their successful employment as targeting ligands in cancer treatment. The use of ACPs as a diagnostic tool is summarized, and their advantages in these applications are highlighted. This review expounds on the main approaches for peptide synthesis along with their reconstruction and modification needed to enhance their therapeutic effect. Computational approaches that could predict therapeutic efficacy and suggest ACP candidates for experimental studies are discussed. Future research prospects in this rapidly expanding area are also offered.
Affiliation(s)
- Gehane Ghaly
  - Department of Chemistry, The American University in Cairo, New Cairo 11835, Egypt
- Hatem Tallima
  - Department of Chemistry, The American University in Cairo, New Cairo 11835, Egypt
- Eslam Dabbish
  - Department of Chemistry, The American University in Cairo, New Cairo 11835, Egypt
- Norhan Badr ElDin
  - Analytical Chemistry Department, Faculty of Pharmacy, Cairo University, Kasr-El Aini Street, Cairo 11562, Egypt
- Mohamed K. Abd El-Rahman
  - Analytical Chemistry Department, Faculty of Pharmacy, Cairo University, Kasr-El Aini Street, Cairo 11562, Egypt
  - Department of Chemistry and Chemical Biology, Harvard University, 12 Oxford Street, Cambridge, MA 02138, USA
- Mahmoud A. A. Ibrahim
  - Computational Chemistry Laboratory, Chemistry Department, Faculty of Science, Minia University, Minia 61519, Egypt
  - School of Health Sciences, University of Kwa-Zulu-Natal, Westville, Durban 4000, South Africa
- Tamer Shoeib
  - Department of Chemistry, The American University in Cairo, New Cairo 11835, Egypt
8
Ayad A, Hallawa A, Peine A, Martin L, Fazlic LB, Dartmann G, Marx G, Schmeink A. Predicting Abnormalities in Laboratory Values of Patients in the Intensive Care Unit Using Different Deep Learning Models: Comparative Study. JMIR Med Inform 2022; 10:e37658. [PMID: 36001363 PMCID: PMC9453586 DOI: 10.2196/37658] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 03/01/2022] [Revised: 06/05/2022] [Accepted: 06/12/2022] [Indexed: 11/13/2022]
Abstract
Background: In recent years, the volume of medical knowledge and health data has increased rapidly. For example, the increased availability of electronic health records (EHRs) provides accurate, up-to-date, and complete information about patients at the point of care and enables medical staff to have quick access to patient records for more coordinated and efficient care. With this increase in knowledge, the complexity of accurate, evidence-based medicine tends to grow all the time. Health care workers must deal with an increasing amount of data and documentation. Meanwhile, relevant patient data are frequently overshadowed by a layer of less relevant data, causing medical staff to often miss important values or abnormal trends and their importance to the progression of the patient’s case.
Objective: The goal of this work is to analyze the current laboratory results for patients in the intensive care unit (ICU) and classify which of these lab values could be abnormal the next time the test is done. Detecting near-future abnormalities can be useful to support clinicians in their decision-making process in the ICU by drawing their attention to the important values and focusing future lab testing, saving them both time and money. Additionally, it will give doctors more time to spend with patients, rather than skimming through a long list of lab values.
Methods: We used Structured Query Language to extract 25 lab values for mechanically ventilated patients in the ICU from the MIMIC-III and eICU data sets. Additionally, we applied time-windowed sampling and holding, and a support vector machine to fill in the missing values in the sparse time series, as well as the Tukey range to detect and delete anomalies. Then, we used the data to train 4 deep learning models for time series classification, as well as a gradient boosting–based algorithm and compared their performance on both data sets.
Results: The models tested in this work (deep neural networks and gradient boosting), combined with the preprocessing pipeline, achieved an accuracy of at least 80% on the multilabel classification task. Moreover, the model based on the multiple convolutional neural network outperformed the other algorithms on both data sets, with the accuracy exceeding 89%.
Conclusions: In this work, we show that using machine learning and deep neural networks to predict near-future abnormalities in lab values can achieve satisfactory results. Our system was trained, validated, and tested on 2 well-known data sets to ensure that our system bridged the reality gap as much as possible. Finally, the model can be used in combination with our preprocessing pipeline on real-life EHRs to improve patients’ diagnosis and treatment.
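Two of the preprocessing steps named in the Methods, the Tukey range for anomaly removal and sample-and-hold filling, can be sketched in a few lines. This is an illustration under stated assumptions rather than the authors' pipeline: the 1.5 × IQR multiplier is the conventional Tukey default, the order of the two steps is a guess, the support vector machine imputation is omitted, and the function names and lab series are hypothetical.

```python
import statistics


def tukey_filter(values, k=1.5):
    """Replace values outside [Q1 - k*IQR, Q3 + k*IQR] with None (missing)."""
    observed = [v for v in values if v is not None]
    q1, _, q3 = statistics.quantiles(observed, n=4, method="inclusive")
    iqr = q3 - q1
    low, high = q1 - k * iqr, q3 + k * iqr
    return [v if v is not None and low <= v <= high else None for v in values]


def sample_and_hold(values):
    """Fill each missing entry with the most recent observed value.

    Leading gaps stay None because there is nothing earlier to hold.
    """
    filled, last = [], None
    for v in values:
        if v is not None:
            last = v
        filled.append(last)
    return filled


# Hypothetical lab series: one gap (None) and one implausible spike (100).
series = [1, 2, None, 3, 4, 5, 100, 6, 7, 8]
cleaned = tukey_filter(series)
filled = sample_and_hold(cleaned)
print(cleaned)  # [1, 2, None, 3, 4, 5, None, 6, 7, 8]
print(filled)   # [1, 2, 2, 3, 4, 5, 5, 6, 7, 8]
```

Removing outliers before filling prevents a spike from being copied forward into later time steps, which is why the sketch applies the steps in that order.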
Affiliation(s)
- Ahmad Ayad
  - Chair of Information Theory and Data Analytics, Rheinisch-Westfälische Technische Hochschule Aachen, Aachen, Germany
- Ahmed Hallawa
  - Department of Intensive Care and Intermediate Care, University Hospital Rheinisch-Westfälische Technische Hochschule Aachen, Aachen, Germany
- Arne Peine
  - Department of Intensive Care and Intermediate Care, University Hospital Rheinisch-Westfälische Technische Hochschule Aachen, Aachen, Germany
- Lukas Martin
  - Department of Intensive Care and Intermediate Care, University Hospital Rheinisch-Westfälische Technische Hochschule Aachen, Aachen, Germany
- Lejla Begic Fazlic
  - Fachbereich Umweltplanung/Umwelttechnik - Fachrichtung Informatik, Trier University of Applied Sciences, Trier, Germany
- Guido Dartmann
  - Fachbereich Umweltplanung/Umwelttechnik - Fachrichtung Informatik, Trier University of Applied Sciences, Trier, Germany
- Gernot Marx
  - Department of Intensive Care and Intermediate Care, University Hospital Rheinisch-Westfälische Technische Hochschule Aachen, Aachen, Germany
- Anke Schmeink
  - Chair of Information Theory and Data Analytics, Rheinisch-Westfälische Technische Hochschule Aachen, Aachen, Germany
9
Zhu L, Ye C, Hu X, Yang S, Zhu C. ACP-check: An anticancer peptide prediction model based on bidirectional long short-term memory and multi-features fusion strategy. Comput Biol Med 2022; 148:105868. [PMID: 35868046 DOI: 10.1016/j.compbiomed.2022.105868] [Citation(s) in RCA: 10] [Impact Index Per Article: 5.0] [Received: 04/07/2022] [Revised: 06/14/2022] [Accepted: 07/09/2022] [Indexed: 11/16/2022]
Abstract
The anticancer peptide is an emerging anticancer drug that has become an effective alternative to chemotherapy and targeted therapy due to fewer side effects and less resistance. The traditional biological experimental method for identifying anticancer peptides is a time-consuming and complicated process that hinders large-scale, rapid, and effective identification. In this paper, we propose a model based on a bidirectional long short-term memory network and multi-feature fusion, called ACP-check, which employs a bidirectional long short-term memory network to extract time-dependent information features from peptide sequences and combines them with amino acid sequence features including binary profile feature, dipeptide composition, the composition of k-spaced amino acid group pairs, amino acid composition, and sequence-order-coupling number. To verify the performance of the model, six benchmark datasets are selected: ACPred-Fuse, ACPred-FL, ACP240, ACP740, and the main and alternate datasets of AntiCP2.0. In terms of the Matthews correlation coefficient, ACP-check obtains 0.37, 0.82, 0.80, 0.75, 0.56, and 0.86 on the six datasets, respectively, an improvement of 2%-86% over existing state-of-the-art anticancer peptide prediction methods. Furthermore, ACP-check achieves prediction accuracies of 0.91, 0.91, 0.90, 0.87, 0.78, and 0.93, respectively, an increase of 1%-49%. Overall, the comparison experiment shows that ACP-check can accurately identify anticancer peptides from sequence-level information. The code and data are available at http://www.cczubio.top/ACP-check/.
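The abstract reports both accuracy and the Matthews correlation coefficient (MCC), and the two can diverge sharply on imbalanced data. As a reference sketch of the standard definitions, the code below computes both from confusion-matrix counts. The function names and the counts are hypothetical and are not results from the paper.

```python
import math


def accuracy(tp, tn, fp, fn):
    """Fraction of all predictions that are correct."""
    return (tp + tn) / (tp + tn + fp + fn)


def mcc(tp, tn, fp, fn):
    """Matthews correlation coefficient; returns 0.0 when the denominator is zero."""
    denom = math.sqrt((tp + fp) * (tp + fn) * (tn + fp) * (tn + fn))
    return 0.0 if denom == 0 else (tp * tn - fp * fn) / denom


# Balanced toy case: 4 positives, 4 negatives, one error in each class.
balanced_acc = accuracy(tp=3, tn=3, fp=1, fn=1)
balanced_mcc = mcc(tp=3, tn=3, fp=1, fn=1)
print(balanced_acc, balanced_mcc)  # 0.75 0.5

# Imbalanced toy case: 10 positives, 90 negatives, most positives missed.
skewed_acc = accuracy(tp=2, tn=89, fp=1, fn=8)
skewed_mcc = mcc(tp=2, tn=89, fp=1, fn=8)
print(skewed_acc, round(skewed_mcc, 2))
```

In the imbalanced case the accuracy stays high because the negatives dominate, while the MCC is much lower because most positives are missed. That is one plausible reason a dataset can show a high accuracy alongside a low MCC.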
Affiliation(s)
- Lun Zhu
  - School of Computer Science and Artificial Intelligence Aliyun School of Big Data School of Software, Changzhou University, Changzhou, 213164, China
- Chenyang Ye
  - School of Computer Science and Artificial Intelligence Aliyun School of Big Data School of Software, Changzhou University, Changzhou, 213164, China
- Xuemei Hu
  - Key Laboratory of Symbol Computation and Knowledge Engineering of Ministry of Education, College of Computer Science and Technology, Jilin University, Changchun, 130012, China
- Sen Yang
  - School of Computer Science and Artificial Intelligence Aliyun School of Big Data School of Software, Changzhou University, Changzhou, 213164, China
  - Changzhou No.2 People's Hospital, the Affiliated Hospital of Nanjing Medical University, Changzhou, 213164, China
- Chenyang Zhu
  - School of Computer Science and Artificial Intelligence Aliyun School of Big Data School of Software, Changzhou University, Changzhou, 213164, China
10
Chen X, Huang J, He B. AntiDMPpred: a web service for identifying anti-diabetic peptides. PeerJ 2022; 10:e13581. [PMID: 35722269 PMCID: PMC9205309 DOI: 10.7717/peerj.13581] [Citation(s) in RCA: 4] [Impact Index Per Article: 2.0] [Received: 01/03/2022] [Accepted: 05/23/2022] [Indexed: 01/17/2023]
Abstract
Diabetes mellitus (DM) is a chronic metabolic disease that has been a major threat to human health globally, causing great economic and social adversities. The oral administration of anti-diabetic peptide drugs has become a novel route for diabetes therapy. Numerous bioactive peptides have demonstrated potential anti-diabetic properties and are promising as alternative treatment measures to prevent and manage diabetes. The computational prediction of anti-diabetic peptides can help promote peptide-based drug discovery in the search for new, effective therapeutic peptide agents for diabetes treatment. Here, we used random forest to develop a computational model, named AntiDMPpred, for predicting anti-diabetic peptides. A benchmark dataset with 236 anti-diabetic and 236 non-anti-diabetic peptides was first constructed. Four types of sequence-derived descriptors were used to represent the peptide sequences. We then combined four machine learning methods and six feature scoring methods to select the non-redundant features, which were fed into diverse machine learning classifiers to train the models. Experimental results show that AntiDMPpred reached an accuracy of 77.12% and an area under the receiver operating characteristic curve (AUCROC) of 0.8193 in nested five-fold cross-validation, yielding a satisfactory performance and surpassing the other classifiers implemented in the study. The web service is freely accessible at http://i.uestc.edu.cn/AntiDMPpred/cgi-bin/AntiDMPpred.pl. We hope AntiDMPpred can improve the discovery of anti-diabetic bioactive peptides.
Affiliation(s)
- Xue Chen
  - Medical College, Guizhou University, Guiyang, China
- Jian Huang
  - School of Life Science and Technology, University of Electronic Science and Technology of China, Chengdu, China
- Bifang He
  - Medical College, Guizhou University, Guiyang, China