51
ACP-ADA: A Boosting Method with Data Augmentation for Improved Prediction of Anticancer Peptides. Int J Mol Sci 2022; 23:ijms232012194. [PMID: 36293050 PMCID: PMC9603247 DOI: 10.3390/ijms232012194] [Citation(s) in RCA: 5] [Impact Index Per Article: 2.5] [Received: 08/29/2022] [Revised: 10/08/2022] [Accepted: 10/11/2022] [Indexed: 11/30/2022] Open
Abstract
Cancer is the second-leading cause of death worldwide, and therapeutic peptides that target and destroy cancer cells have received a great deal of interest in recent years. Traditional wet experiments are expensive and inefficient for identifying novel anticancer peptides (ACPs); therefore, an effective computational approach is essential for recognizing ACP candidates before experimental methods are used. In this study, we propose ACP-ADA, an AdaBoost algorithm with random forest as the base learner, which integrates binary profile feature, amino acid index, and amino acid composition into a 210-dimensional feature vector to represent the peptides. Training samples were augmented in the feature space to increase the sample size and further improve model performance when samples are insufficient. Furthermore, we used five-fold cross-validation to find model parameters, and the cross-validation results showed that ACP-ADA, with this feature combination and data augmentation, outperforms existing methods in terms of performance metrics. Specifically, ACP-ADA recorded an average accuracy of 86.4% and a Matthews correlation coefficient of 74.01% for dataset ACP740, and 90.83% and 81.65% for dataset ACP240; consequently, it can be a very useful tool in drug development and biomedical research.
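To make the feature construction in this abstract concrete, the following is a minimal sketch of two of the named encodings (amino acid composition and a binary profile of the leading residues) followed by a simple feature-space augmentation. It does not reproduce the paper's 210-dimensional vector or its amino acid index features, and the abstract does not say how samples were augmented: the Gaussian jitter, the prefix length `k`, and all function names here are assumptions chosen for illustration.

```python
import random

AMINO_ACIDS = "ACDEFGHIKLMNPQRSTVWY"  # the 20 standard residues


def amino_acid_composition(seq):
    """Return the fraction of each standard residue in seq (20 values)."""
    seq = seq.upper()
    if not seq:
        raise ValueError("sequence must not be empty")
    return [seq.count(a) / len(seq) for a in AMINO_ACIDS]


def binary_profile(seq, k=5):
    """One-hot encode the first k residues (k * 20 values, zero-padded)."""
    seq = seq.upper()
    vec = []
    for i in range(k):
        aa = seq[i] if i < len(seq) else None
        vec.extend(1.0 if aa == a else 0.0 for a in AMINO_ACIDS)
    return vec


def augment(features, n_copies=2, sigma=0.01, seed=0):
    """Hypothetical augmentation: jitter a feature vector with Gaussian noise."""
    rng = random.Random(seed)
    return [[x + rng.gauss(0.0, sigma) for x in features]
            for _ in range(n_copies)]


peptide = "ACDKKLLA"  # arbitrary example sequence, not taken from the paper
x = amino_acid_composition(peptide) + binary_profile(peptide)
augmented = augment(x)  # two noisy copies of the 120-value vector
```

Augmenting in feature space rather than sequence space keeps every synthetic sample close to a real one, which is one plausible reason to prefer it when the training set is small.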
52
Harnkit N, Khongsonthi T, Masuwan N, Prasartkul P, Noikaew T, Chumnanpuen P. Virtual Screening for SARS-CoV-2 Main Protease Inhibitory Peptides from the Putative Hydrolyzed Peptidome of Rice Bran. Antibiotics (Basel) 2022; 11:antibiotics11101318. [PMID: 36289976 PMCID: PMC9598432 DOI: 10.3390/antibiotics11101318] [Citation(s) in RCA: 9] [Impact Index Per Article: 4.5] [Received: 08/31/2022] [Revised: 09/21/2022] [Accepted: 09/26/2022] [Indexed: 11/16/2022] Open
Abstract
The Coronavirus Disease 2019 (COVID-19), caused by the severe acute respiratory syndrome coronavirus 2 (SARS-CoV-2), has led to loss of life and has affected quality of life, the economy, and lifestyles. The SARS-CoV-2 main protease (Mpro), which hydrolyzes the polyprotein, is an interesting antiviral target for inhibiting the spreading mechanism of COVID-19. The putative hydrolyzed peptidome was established by predictive digestion of the four major proteins in rice bran (albumin, glutelin, globulin, and prolamin) with three protease enzymes (pepsin, trypsin, and chymotrypsin) and was used as the input dataset. Then, the prediction of the antiviral peptides (AVPs) was performed by online bioinformatics tools, i.e., AVPpred, Meta-iAVP, AMPfun, and ENNAVIA programs. The amino acid composition and cytotoxicity of candidate AVPs were analyzed by COPid and ToxinPred, respectively. The ten top-ranked antiviral peptides were selected and docked to the SARS-CoV-2 main protease using GalaxyPepDock. Only the candidate with the top docking score (AVP4) was further analyzed by molecular dynamics simulation for one nanosecond. According to the bioinformatic analysis results, the candidate SARS-CoV-2 main protease inhibitory peptides were 7–33 amino acid residues long and formed hydrogen bonds at Thr22–24, Glu154, and Thr178 in domain 2 with short bonding distances. In addition, these top-ten candidate bioactive peptides contain hydrophilic amino acid residues and have a positive net charge. We hope that this study will provide a potential starting point for peptide-based therapeutic agents against COVID-19.
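The predictive digestion step can be illustrated with a minimal sketch for one of the three enzymes. It assumes the commonly used simplified trypsin rule (cleave after K or R unless the next residue is P) and a crude net-charge estimate that counts only K/R as +1 and D/E as -1. The abstract does not name the digestion tool or rules the authors used, so the function names and the example protein below are hypothetical.

```python
def trypsin_digest(protein):
    """Cleave after K or R unless followed by P (simplified trypsin rule)."""
    protein = protein.upper()
    peptides, start = [], 0
    for i, aa in enumerate(protein):
        nxt = protein[i + 1] if i + 1 < len(protein) else ""
        if aa in "KR" and nxt != "P":
            peptides.append(protein[start:i + 1])
            start = i + 1
    if start < len(protein):
        peptides.append(protein[start:])  # C-terminal remainder
    return peptides


def net_charge(peptide):
    """Crude net charge near neutral pH: +1 per K/R, -1 per D/E.

    Histidine and the terminal groups are ignored in this simplification.
    """
    peptide = peptide.upper()
    positive = sum(peptide.count(a) for a in "KR")
    negative = sum(peptide.count(a) for a in "DE")
    return positive - negative


fragments = trypsin_digest("MKRPAAKLLR")  # made-up protein for illustration
cationic = [p for p in fragments if net_charge(p) > 0]
```

Pepsin and chymotrypsin would need their own cleavage rules; the same loop structure applies with a different residue test.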
Affiliation(s)
- Nathaphat Harnkit
  - Medicinal Plant Research Institute, Department of Medical Sciences, Ministry of Public Health, Nonthaburi 11000, Thailand
- Thanakamol Khongsonthi
  - Mahidol Wittayanusorn School, 364 Salaya, Phuttamonthon District, Nakhon Prathom 73170, Thailand
- Noprada Masuwan
  - Mahidol Wittayanusorn School, 364 Salaya, Phuttamonthon District, Nakhon Prathom 73170, Thailand
- Pornpinit Prasartkul
  - Mahidol Wittayanusorn School, 364 Salaya, Phuttamonthon District, Nakhon Prathom 73170, Thailand
- Tipanart Noikaew
  - Department of Biology and Health Science, Mahidol Wittayanusorn School, 364 Salaya, Phuttamonthon District, Nakhon Prathom 73170, Thailand
- Pramote Chumnanpuen
  - Omics Center for Agriculture, Bioresources, Food and Health, Kasetsart University (OmiKU), Bangkok 10900, Thailand
  - Department of Zoology, Faculty of Science, Kasetsart University, Bangkok 10900, Thailand
53
Charoenkwan P, Schaduangrat N, Lio’ P, Moni MA, Shoombuatong W, Manavalan B. Computational prediction and interpretation of druggable proteins using a stacked ensemble-learning framework. iScience 2022; 25:104883. [PMID: 36046193 PMCID: PMC9421381 DOI: 10.1016/j.isci.2022.104883] [Citation(s) in RCA: 13] [Impact Index Per Article: 6.5] [Received: 04/24/2022] [Revised: 07/08/2022] [Accepted: 08/02/2022] [Indexed: 11/22/2022] Open
Abstract
Discovery of potential drugs requires rapid and precise identification of drug targets. Although traditional experimental methodologies can accurately identify drug targets, they are time-consuming and inappropriate for high-throughput screening. Computational approaches based on machine learning (ML) algorithms can expedite the prediction of druggable proteins; however, the performance of the existing computational methods remains unsatisfactory. This study proposes a computational tool, SPIDER, to enhance the accurate prediction of druggable proteins. SPIDER employs various feature descriptors pertaining to several aspects, including physicochemical properties, compositional information, and composition-transition-distribution information, coupled with well-known ML algorithms to facilitate the construction of the final meta-predictor. The experimental results showed that SPIDER enabled more precise and robust prediction of druggable proteins than the baseline models and current existing methods in terms of the independent test dataset. An online web server was established and made freely available online.
Affiliation(s)
- Phasit Charoenkwan
  - Modern Management and Information Technology, College of Arts, Media and Technology, Chiang Mai University, Chiang Mai 50200, Thailand
- Nalini Schaduangrat
  - Center of Data Mining and Biomedical Informatics, Faculty of Medical Technology, Mahidol University, Bangkok 10700, Thailand
- Pietro Lio’
  - Department of Computer Science and Technology, University of Cambridge, Cambridge CB3 0FD, UK
- Mohammad Ali Moni
  - Artificial Intelligence & Digital Health, School of Health and Rehabilitation Sciences, Faculty of Health and Behavioural Sciences, The University of Queensland, St Lucia, QLD 4072, Australia
- Watshara Shoombuatong
  - Center of Data Mining and Biomedical Informatics, Faculty of Medical Technology, Mahidol University, Bangkok 10700, Thailand
- Balachandran Manavalan
  - Computational Biology and Bioinformatics Laboratory, Department of Integrative Biotechnology, College of Biotechnology and Bioengineering, Sungkyunkwan University, Suwon 16419, Gyeonggi-do, Republic of Korea
54
Liu J, Li M, Chen X. AntiMF: A deep learning framework for predicting anticancer peptides based on multi-view feature extraction. Methods 2022; 207:38-43. [PMID: 36100141 DOI: 10.1016/j.ymeth.2022.07.017] [Citation(s) in RCA: 2] [Impact Index Per Article: 1.0] [Received: 06/15/2022] [Revised: 07/20/2022] [Accepted: 07/26/2022] [Indexed: 01/10/2023] Open
Abstract
In recent years, anticancer peptides have emerged as a new viable option in cancer therapy, with the ability to overcome the considerable side effects and poor outcomes of standard cancer therapies. Accurate identification of anticancer peptides can facilitate their discovery and speed up their application in treating cancer. However, many recent approaches are based on machine learning, which not only restricts the representation ability of the models but also requires a complex hand-crafted feature extraction process. Here, we propose AntiMF, a deep learning model that uses a multi-view mechanism based on different feature extraction models. Comparative results show that our model performs better than the state-of-the-art methods in the prediction of anticancer peptides. By using an ensemble learning framework to extract representations, AntiMF can capture information from different dimensions, which makes the model representation more complete. Moreover, we visualize what AntiMF learns in one of its ensemble models to show the effectiveness of our model intuitively, indicating that AntiMF has great potential as an effective and useful model for identifying anticancer peptides accurately.
Affiliation(s)
- Jingjing Liu
  - Eye Hospital, The First Affiliated Hospital of Harbin Medical University, Harbin 150001, China
- Minghao Li
  - Beidahuang Industry Group General Hospital, Harbin 150001, China
- Xin Chen
  - Eye Hospital, The First Affiliated Hospital of Harbin Medical University, Harbin 150001, China; Department of Neurosurgical Laboratory, The First Affiliated Hospital of Harbin Medical University, Harbin 150001, China
55
Zakharova E, Orsi M, Capecchi A, Reymond J. Machine Learning Guided Discovery of Non-Hemolytic Membrane Disruptive Anticancer Peptides. ChemMedChem 2022; 17:e202200291. [PMID: 35880810 PMCID: PMC9541320 DOI: 10.1002/cmdc.202200291] [Citation(s) in RCA: 10] [Impact Index Per Article: 5.0] [Received: 05/25/2022] [Revised: 06/29/2022] [Indexed: 12/05/2022]
Abstract
Most antimicrobial peptides (AMPs) and anticancer peptides (ACPs) fold into membrane disruptive cationic amphiphilic α-helices, many of which are however also unpredictably hemolytic and toxic. Here we exploited the ability of recurrent neural networks (RNN) to distinguish active from inactive and non-hemolytic from hemolytic AMPs and ACPs to discover new non-hemolytic ACPs. Our discovery pipeline involved: 1) sequence generation using either a generative RNN or a genetic algorithm, 2) RNN classification for activity and hemolysis, 3) selection for sequence novelty, helicity and amphiphilicity, and 4) synthesis and testing. Experimental evaluation of thirty-three peptides resulted in eleven active ACPs, four of which were non-hemolytic, with properties resembling those of the natural ACP lasioglossin III. These experiments show the first example of direct machine learning guided discovery of non-hemolytic ACPs.
Affiliation(s)
- Elena Zakharova
  - Department of Chemistry, Biochemistry and Pharmaceutical Sciences, University of Bern, Freiestrasse 3, 3012 Bern, Switzerland
- Markus Orsi
  - Department of Chemistry, Biochemistry and Pharmaceutical Sciences, University of Bern, Freiestrasse 3, 3012 Bern, Switzerland
- Alice Capecchi
  - Department of Chemistry, Biochemistry and Pharmaceutical Sciences, University of Bern, Freiestrasse 3, 3012 Bern, Switzerland
- Jean‐Louis Reymond
  - Department of Chemistry, Biochemistry and Pharmaceutical Sciences, University of Bern, Freiestrasse 3, 3012 Bern, Switzerland
56
Akbar S, Hayat M, Tahir M, Khan S, Alarfaj FK. cACP-DeepGram: Classification of anticancer peptides via deep neural network and skip-gram-based word embedding model. Artif Intell Med 2022; 131:102349. [DOI: 10.1016/j.artmed.2022.102349] [Citation(s) in RCA: 2] [Impact Index Per Article: 1.0] [Received: 11/10/2021] [Revised: 05/24/2022] [Accepted: 07/04/2022] [Indexed: 12/28/2022]
57
Díaz-Gómez JL, López-Castillo LM, Garcia-Lara S, Castorena-Torres F, Winkler R, Wielsch N, Aguilar O. Novel α-zein peptide fractions with in vitro cytotoxic activity against hepatocarcinoma. Food and Bioproducts Processing 2022. [DOI: 10.1016/j.fbp.2022.07.003] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 11/04/2022]
58
Anti-cancer Effect of Recombinant PI-Laterosporulin10 as a Novel Bacteriocin with Selective Cytotoxicity on Triple Negative Breast Cancer Cells. Int J Pept Res Ther 2022. [DOI: 10.1007/s10989-022-10453-7] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 10/15/2022]
59
Medina-Ortiz D, Contreras S, Amado-Hinojosa J, Torres-Almonacid J, Asenjo JA, Navarrete M, Olivera-Nappa Á. Generalized Property-Based Encoders and Digital Signal Processing Facilitate Predictive Tasks in Protein Engineering. Front Mol Biosci 2022; 9:898627. [PMID: 35911960 PMCID: PMC9329607 DOI: 10.3389/fmolb.2022.898627] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 03/17/2022] [Accepted: 06/23/2022] [Indexed: 11/13/2022] Open
Abstract
Computational methods in protein engineering often require encoding amino acid sequences, i.e., converting them into numeric arrays. Physicochemical properties are a typical choice to define encoders, where we replace each amino acid by its value for a given property. However, what property (or group thereof) is best for a given predictive task remains an open problem. In this work, we generalize property-based encoding strategies to maximize the performance of predictive models in protein engineering. First, combining text mining and unsupervised learning, we partitioned the AAIndex database into eight semantically-consistent groups of properties. We then applied a non-linear PCA within each group to define a single encoder to represent it. Then, in several case studies, we assess the performance of predictive models for protein and peptide function, folding, and biological activity, trained using the proposed encoders and classical methods (One Hot Encoder and TAPE embeddings). Models trained on datasets encoded with our encoders and converted to signals through the Fast Fourier Transform (FFT) increased their precision and reduced their overfitting substantially, outperforming classical approaches in most cases. Finally, we propose a preliminary methodology to create de novo sequences with desired properties. All these results offer simple ways to increase the performance of general and complex predictive tasks in protein engineering without increasing their complexity.
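The encode-then-transform idea described above can be sketched as follows. This is only an illustration under stated assumptions: the Kyte-Doolittle hydropathy scale stands in for the paper's AAIndex-derived encoders (which are built with non-linear PCA and are not reproduced here), and a naive discrete Fourier transform replaces an optimized FFT so that the example needs only the standard library. The function names are hypothetical.

```python
import cmath

# Kyte-Doolittle hydropathy scale, used here as a stand-in property encoder.
KYTE_DOOLITTLE = {
    "A": 1.8, "R": -4.5, "N": -3.5, "D": -3.5, "C": 2.5,
    "Q": -3.5, "E": -3.5, "G": -0.4, "H": -3.2, "I": 4.5,
    "L": 3.8, "K": -3.9, "M": 1.9, "F": 2.8, "P": -1.6,
    "S": -0.8, "T": -0.7, "W": -0.9, "Y": -1.3, "V": 4.2,
}


def encode(seq, scale=KYTE_DOOLITTLE):
    """Replace each residue by its value for the chosen property."""
    return [scale[a] for a in seq.upper()]


def dft_magnitudes(signal):
    """Magnitude spectrum via a naive O(n^2) DFT (non-negative frequencies)."""
    n = len(signal)
    spectrum = []
    for k in range(n // 2 + 1):
        coeff = sum(x * cmath.exp(-2j * cmath.pi * k * t / n)
                    for t, x in enumerate(signal))
        spectrum.append(abs(coeff))
    return spectrum


signal = encode("KLAKLAKKLAKLAK")  # arbitrary example sequence
features = dft_magnitudes(signal)  # frequency-domain representation
```

In practice sequences of different lengths would be padded to a common length before the transform so that the resulting feature vectors are comparable across samples.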
Affiliation(s)
- David Medina-Ortiz
  - Centre for Biotechnology and Bioengineering, Universidad de Chile, Santiago, Chile
  - Departamento de Ingeniería en Computación, Universidad de Magallanes, Punta Arenas, Chile
- Sebastian Contreras
  - Max Planck Institute for Dynamics and Self-Organization, Göttingen, Germany
  - *Correspondence: Sebastian Contreras; Álvaro Olivera-Nappa
- Juan Amado-Hinojosa
  - Centre for Biotechnology and Bioengineering, Universidad de Chile, Santiago, Chile
  - Departamento de Ingeniería Química, Biotecnología y Materiales, Facultad de Ciencias Físicas y Matemáticas, Universidad de Chile, Santiago, Chile
- Jorge Torres-Almonacid
  - Departamento de Ingeniería en Computación, Universidad de Magallanes, Punta Arenas, Chile
- Juan A. Asenjo
  - Centre for Biotechnology and Bioengineering, Universidad de Chile, Santiago, Chile
  - Departamento de Ingeniería Química, Biotecnología y Materiales, Facultad de Ciencias Físicas y Matemáticas, Universidad de Chile, Santiago, Chile
- Álvaro Olivera-Nappa
  - Centre for Biotechnology and Bioengineering, Universidad de Chile, Santiago, Chile
  - Departamento de Ingeniería Química, Biotecnología y Materiales, Facultad de Ciencias Físicas y Matemáticas, Universidad de Chile, Santiago, Chile
60
Zhu L, Ye C, Hu X, Yang S, Zhu C. ACP-check: An anticancer peptide prediction model based on bidirectional long short-term memory and multi-features fusion strategy. Comput Biol Med 2022; 148:105868. [PMID: 35868046 DOI: 10.1016/j.compbiomed.2022.105868] [Citation(s) in RCA: 10] [Impact Index Per Article: 5.0] [Received: 04/07/2022] [Revised: 06/14/2022] [Accepted: 07/09/2022] [Indexed: 11/16/2022]
Abstract
The anticancer peptide is an emerging anticancer drug that has become an effective alternative to chemotherapy and targeted therapy owing to fewer side effects and less resistance. The traditional biological experimental method for identifying anticancer peptides is a time-consuming and complicated process that hinders large-scale, rapid, and effective identification. In this paper, we propose a model based on a bidirectional long short-term memory network and multi-features fusion, called ACP-check, which employs a bidirectional long short-term memory network to extract time-dependent information features from peptide sequences, and combines them with amino acid sequence features including binary profile feature, dipeptide composition, the composition of k-spaced amino acid group pairs, amino acid composition, and sequence-order-coupling number. To verify the performance of the model, six benchmark datasets are selected: ACPred-Fuse, ACPred-FL, ACP240, ACP740, and the main and alternate datasets of AntiCP2.0. In terms of Matthews correlation coefficients, ACP-check obtains 0.37, 0.82, 0.80, 0.75, 0.56, and 0.86 on the six datasets respectively, an improvement of 2%-86% over existing state-of-the-art anticancer peptide prediction methods. Furthermore, ACP-check achieves prediction accuracies of 0.91, 0.91, 0.90, 0.87, 0.78, and 0.93 respectively, increases ranging from 1% to 49%. Overall, the comparison experiment shows that ACP-check can accurately identify anticancer peptides from sequence-level information. The code and data are available at http://www.cczubio.top/ACP-check/.
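Since several entries in this list report the Matthews correlation coefficient (MCC) next to accuracy, a short sketch of how both are computed from a binary confusion matrix may help when comparing the numbers. The function names and the example counts are illustrative and are not taken from any of the listed papers.

```python
import math


def accuracy(tp, tn, fp, fn):
    """Fraction of all predictions that are correct."""
    return (tp + tn) / (tp + tn + fp + fn)


def matthews_corrcoef(tp, tn, fp, fn):
    """MCC = (TP*TN - FP*FN) / sqrt((TP+FP)(TP+FN)(TN+FP)(TN+FN)).

    Returns 0.0 when the denominator is zero, a common convention.
    """
    denom = math.sqrt((tp + fp) * (tp + fn) * (tn + fp) * (tn + fn))
    if denom == 0:
        return 0.0
    return (tp * tn - fp * fn) / denom


# Illustrative counts only: 45 true positives, 40 true negatives,
# 10 false positives, 5 false negatives.
acc = accuracy(45, 40, 10, 5)
mcc = matthews_corrcoef(45, 40, 10, 5)
```

MCC ranges from -1 to 1 and accounts for all four cells of the confusion matrix, which is why it is usually lower than accuracy on the same predictions and why it is reported either as a fraction or, in some abstracts, as a percentage.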
Affiliation(s)
- Lun Zhu
  - School of Computer Science and Artificial Intelligence Aliyun School of Big Data School of Software, Changzhou University, Changzhou, 213164, China
- Chenyang Ye
  - School of Computer Science and Artificial Intelligence Aliyun School of Big Data School of Software, Changzhou University, Changzhou, 213164, China
- Xuemei Hu
  - Key Laboratory of Symbol Computation and Knowledge Engineering of Ministry of Education, College of Computer Science and Technology, Jilin University, Changchun, 130012, China
- Sen Yang
  - School of Computer Science and Artificial Intelligence Aliyun School of Big Data School of Software, Changzhou University, Changzhou, 213164, China; Changzhou No.2 People's Hospital, the Affiliated Hospital of Nanjing Medical University, Changzhou, 213164, China
- Chenyang Zhu
  - School of Computer Science and Artificial Intelligence Aliyun School of Big Data School of Software, Changzhou University, Changzhou, 213164, China
61
Marzhoseyni Z, Shayestehpour M, Salimian M, Esmaeili D, Saffari M, Fathizadeh H. Designing a novel fusion protein from Streptococcus agalactiae with apoptosis induction effects on cervical cancer cells. Microb Pathog 2022; 169:105670. [PMID: 35809755 DOI: 10.1016/j.micpath.2022.105670] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 03/24/2022] [Revised: 07/02/2022] [Accepted: 07/04/2022] [Indexed: 11/15/2022]
Abstract
Cervical cancer remains a life-threatening cancer in women around the world. Due to the limitations of conventional treatment approaches, there is an urgent need to develop novel and more efficient strategies against cervical cancer. Therefore, researchers are turning to alternative anti-cancer compounds such as bacterial products. Rib and α are known as surface proteins of Streptococcus agalactiae with immunologic effects. In the present study, we designed a new anti-cancer fusion protein (Rib-α) originating from S. agalactiae with in silico methods, and then the recombinant gene was cloned into the pET-22 (+) expression vector. The recombinant protein was expressed in E. coli BL21 and purified on a Ni-NTA column. The molecular mechanism by which Rib-α is cytotoxic to cancer cells is discussed based on MTT, flow cytometry, and real-time PCR methods. The engineered fusion protein suppressed the proliferation of the cancer cells at 180 μg/ml. Cytotoxic assessment and morphological changes, augmentation of apoptotic-related genes, upregulation of caspase-3 mRNA, and flow cytometric analysis confirmed that apoptosis might be the principal mechanism of cell death. According to our findings, the Rib-α fusion protein induced the intrinsic apoptosis pathway. Therefore, it can be an exciting candidate for discovering a new class of antineoplastic agents.
Affiliation(s)
- Zeynab Marzhoseyni
  - Department of Microbiology and Immunology, Faculty of Medicine, Kashan University of Medical Sciences, Kashan, Iran
- Mohammad Shayestehpour
  - Department of Microbiology and Immunology, Faculty of Medicine, Kashan University of Medical Sciences, Kashan, Iran; Autoimmune Diseases Research Center, Kashan University of Medical Sciences, Kashan, Iran
- Morteza Salimian
  - Anatomical Science Research Center, Kashan University of Medical Sciences, Kashan, Iran
- Davoud Esmaeili
  - Department of Microbiology and Applied Virology Research Center, Baqiyatallah University of Medical Sciences, Tehran, Iran
- Mahmood Saffari
  - Department of Microbiology and Immunology, Faculty of Medicine, Kashan University of Medical Sciences, Kashan, Iran
- Hadis Fathizadeh
  - Student Research Committee, Sirjan School of Medical Sciences, Sirjan, Iran; Department of Laboratory Sciences, Sirjan School of Medical Sciences, Sirjan, Iran
62
Li Y, Li X, Liu Y, Yao Y, Huang G. MPMABP: A CNN and Bi-LSTM-Based Method for Predicting Multi-Activities of Bioactive Peptides. Pharmaceuticals (Basel) 2022; 15:707. [PMID: 35745625 PMCID: PMC9231127 DOI: 10.3390/ph15060707] [Citation(s) in RCA: 5] [Impact Index Per Article: 2.5] [Received: 04/22/2022] [Revised: 05/23/2022] [Accepted: 05/30/2022] [Indexed: 12/30/2022] Open
Abstract
Bioactive peptides are typically small functional peptides with 2-20 amino acid residues and play versatile roles in metabolic and biological processes. Bioactive peptides are multi-functional, so it is very challenging to accurately detect all their functions simultaneously. We propose a convolutional neural network (CNN) and bi-directional long short-term memory (Bi-LSTM)-based deep learning method (called MPMABP) for recognizing multi-activities of bioactive peptides. MPMABP stacks five CNNs at different scales and uses a residual network to prevent information loss. The empirical results showed that MPMABP is superior to the state-of-the-art methods. Analysis of the distribution of amino acids indicated that lysine tends to appear in anti-cancer peptides, leucine in anti-diabetic peptides, and proline in anti-hypertensive peptides. The method and analysis are useful for recognizing multi-activities of bioactive peptides.
Affiliation(s)
- You Li
  - School of Electrical Engineering, Shaoyang University, Shaoyang 422000, China
- Xueyong Li
  - School of Electrical Engineering, Shaoyang University, Shaoyang 422000, China
- Yuewu Liu
  - College of Information and Intelligence, Hunan Agricultural University, Changsha 410128, China
- Yuhua Yao
  - School of Mathematics and Statistics, Hainan Normal University, Haikou 571158, China
- Guohua Huang
  - School of Electrical Engineering, Shaoyang University, Shaoyang 422000, China
63
Charoenkwan P, Schaduangrat N, Hasan MM, Moni MA, Lió P, Shoombuatong W. Empirical comparison and analysis of machine learning-based predictors for predicting and analyzing of thermophilic proteins. EXCLI Journal 2022; 21:554-570. [PMID: 35651661 PMCID: PMC9150013 DOI: 10.17179/excli2022-4723] [Citation(s) in RCA: 2] [Impact Index Per Article: 1.0] [Received: 01/28/2022] [Accepted: 02/21/2022] [Indexed: 12/15/2022]
Abstract
Thermophilic proteins (TPPs) are critical for basic research and in the food industry due to their ability to maintain a thermodynamically stable fold at extremely high temperatures. Thus, the expeditious identification of novel TPPs from protein sequences through computational models is very desirable. Over the last few decades, a number of computational methods, especially machine learning (ML)-based methods, for in silico prediction of TPPs have been developed. Therefore, it is desirable to revisit these methods and summarize their advantages and disadvantages in order to develop new computational approaches that achieve more accurate prediction of TPPs. With this goal in mind, we comprehensively investigate a large collection of fourteen state-of-the-art TPP predictors in terms of their dataset size, feature encoding schemes, feature selection strategies, ML algorithms, evaluation strategies and web server/software usability. To the best of our knowledge, this article represents the first comprehensive review on the development of ML-based methods for in silico prediction of TPPs. These TPP predictors can be classified into two groups according to the interpretability of the ML algorithms employed (i.e., computational black-box methods and computational white-box methods). For the comparative analysis, we evaluated several currently available TPP predictors on two benchmark datasets. Finally, we provide future perspectives for the design and development of new computational models for TPP prediction. We hope that this comprehensive review will help researchers select the TPP predictor best suited to their purposes and provide useful perspectives for the development of more effective and accurate TPP predictors.
Affiliation(s)
- Phasit Charoenkwan
  - Modern Management and Information Technology, College of Arts, Media and Technology, Chiang Mai University, Chiang Mai, Thailand, 50200
- Nalini Schaduangrat
  - Center of Data Mining and Biomedical Informatics, Faculty of Medical Technology, Mahidol University, Bangkok, Thailand, 10700
- Md Mehedi Hasan
  - Tulane Center for Biomedical Informatics and Genomics, Division of Biomedical Informatics and Genomics, John W. Deming Department of Medicine, School of Medicine, Tulane University, New Orleans, LA 70112, USA
- Mohammad Ali Moni
  - School of Health and Rehabilitation Sciences, Faculty of Health and Behavioural Sciences, the University of Queensland, St Lucia, QLD 4072, Australia
- Pietro Lió
  - Department of Computer Science and Technology, University of Cambridge, Cambridge, CB3 0FD, UK
- Watshara Shoombuatong
  - Center of Data Mining and Biomedical Informatics, Faculty of Medical Technology, Mahidol University, Bangkok, Thailand, 10700
64
To Assist Oncologists: An Efficient Machine Learning-Based Approach for Anti-Cancer Peptides Classification. Sensors 2022; 22:s22114005. [PMID: 35684624 PMCID: PMC9185351 DOI: 10.3390/s22114005] [Citation(s) in RCA: 2] [Impact Index Per Article: 1.0] [Received: 04/04/2022] [Revised: 05/19/2022] [Accepted: 05/20/2022] [Indexed: 12/10/2022]
Abstract
In the modern technological era, anti-cancer peptides (ACPs) have been considered a promising cancer treatment. It is critical to find new ACPs to ensure a better knowledge of their functioning processes and vaccine development. Thus, timely and efficient identification of ACPs using computational techniques is highly needed because of the enormous number of peptide sequences generated in the post-genomic era. Recently, numerous adaptive statistical algorithms have been developed for separating ACPs and non-ACPs (NACPs). Despite great advancements, existing approaches still have insufficient feature descriptors and learning methods, limiting predictive performance. To address this, a trustworthy framework is developed for the precise identification of ACPs. In particular, the presented approach incorporates four hypothetical feature encoding mechanisms, namely amino acid, dipeptide, tripeptide, and an improved version of pseudo amino acid composition, to indicate the motif of the target class. Moreover, principal component analysis (PCA) is employed for feature pruning, selecting optimal, deep, and highly varied features. Due to the diverse nature of learning, experiments are performed over numerous algorithms to select the optimum operating method. After investigating the empirical outcomes, the support vector machine with hybrid feature space shows better performance. The proposed framework achieved an accuracy of 97.09% and 98.25% over the benchmark and independent datasets, respectively. The comparative analysis demonstrates that our proposed model outperforms existing methods and is beneficial in drug development and oncology.
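The amino acid, dipeptide, and tripeptide encodings named in this abstract are all instances of k-mer composition with k = 1, 2, and 3. The following minimal sketch shows that shared construction. It does not cover the pseudo amino acid composition or the PCA step, and the function name and example sequence are assumptions made for illustration.

```python
from itertools import product

AMINO_ACIDS = "ACDEFGHIKLMNPQRSTVWY"  # the 20 standard residues


def kmer_composition(seq, k):
    """Normalized counts of all 20**k possible k-mers, in lexicographic order.

    k=1 gives amino acid composition (20 values), k=2 dipeptide
    composition (400 values), k=3 tripeptide composition (8000 values).
    """
    seq = seq.upper()
    total = len(seq) - k + 1  # number of overlapping windows
    if total <= 0:
        raise ValueError("sequence is shorter than k")
    kmers = ["".join(p) for p in product(AMINO_ACIDS, repeat=k)]
    counts = dict.fromkeys(kmers, 0)
    for i in range(total):
        window = seq[i:i + k]
        if window in counts:  # skip windows with non-standard residues
            counts[window] += 1
    return [counts[m] / total for m in kmers]


peptide = "ACDA"  # arbitrary example sequence
hybrid = kmer_composition(peptide, 1) + kmer_composition(peptide, 2)
```

Because the tripeptide vector alone has 8000 mostly zero entries for short peptides, a dimensionality reduction step such as the PCA mentioned in the abstract is a natural follow-up before training a classifier.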
65
Multi-channel CNN based anticancer peptides identification. Anal Biochem 2022; 650:114707. [PMID: 35568159 DOI: 10.1016/j.ab.2022.114707] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 08/24/2021] [Revised: 01/27/2022] [Accepted: 04/27/2022] [Indexed: 11/20/2022]
Abstract
Cancer is one of the most dangerous diseases in the world and often leads to misery and death. Current treatments include different kinds of anticancer therapy, which exhibit different types of side effects. Because of certain physicochemical properties, anticancer peptides (ACPs) have opened a new path of treatment for this deadly disease. That is why a well-performing methodology for identifying novel anticancer peptides is of great importance in the fight against cancer. In addition to laboratory techniques, various machine learning and deep learning methodologies have been developed in recent years for this task. Although these models have shown reasonable predictive ability, there is still room for improvement in terms of performance and in exploring new types of algorithms. In this work, we propose a novel multi-channel convolutional neural network (CNN) for identifying anticancer peptides from protein sequences. We collected data from the existing state-of-the-art methodologies and applied binary encoding for data preprocessing. We also employed k-fold cross-validation to train our models on benchmark datasets and compared our models' performance on the independent datasets. The comparison indicated our models' superiority on various evaluation metrics. We think our work can be a valuable asset in finding novel anticancer peptides. We have provided a user-friendly web server for academic purposes, publicly available at http://103.99.176.239/iacp-cnn/.
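The k-fold cross-validation mentioned in this abstract can be sketched with index bookkeeping alone. This is a generic, unstratified split written for illustration: the fold count, the shuffling seed, and the function name are assumptions, and the paper's actual protocol (for example, whether folds were stratified by class) is not stated in the abstract.

```python
import random


def k_fold_indices(n_samples, k=5, seed=0):
    """Yield (train_indices, test_indices) pairs for k-fold cross-validation.

    Indices are shuffled once, dealt into k folds, and each fold serves
    as the held-out test set exactly once.
    """
    if not 2 <= k <= n_samples:
        raise ValueError("k must be between 2 and n_samples")
    indices = list(range(n_samples))
    random.Random(seed).shuffle(indices)
    folds = [indices[i::k] for i in range(k)]
    for i in range(k):
        test = folds[i]
        train = [j for f, fold in enumerate(folds) if f != i for j in fold]
        yield train, test


splits = list(k_fold_indices(10, k=5))
for train_idx, test_idx in splits:
    # A model would be fitted on train_idx and scored on test_idx here.
    pass
```

Averaging the per-fold scores gives the cross-validated estimate, while a separate independent dataset, as used in the abstract, checks that the tuned model generalizes beyond the folds.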
|
66
|
Fathi F, Ghobeh M, Tabarzad M. Anti-Microbial Peptides: Strategies of Design and Development and Their Promising Wound-Healing Activities. Mol Biol Rep 2022; 49:9001-9012. [DOI: 10.1007/s11033-022-07405-1] [Citation(s) in RCA: 3] [Impact Index Per Article: 1.5] [Received: 01/22/2022] [Revised: 03/13/2022] [Accepted: 03/17/2022] [Indexed: 12/30/2022]
|
67
|
Development of Anticancer Peptides Using Artificial Intelligence and Combinational Therapy for Cancer Therapeutics. Pharmaceutics 2022; 14:pharmaceutics14050997. [PMID: 35631583 PMCID: PMC9147327 DOI: 10.3390/pharmaceutics14050997] [Citation(s) in RCA: 16] [Impact Index Per Article: 8.0] [Received: 03/29/2022] [Revised: 04/28/2022] [Accepted: 05/04/2022] [Indexed: 01/27/2023] Open
Abstract
Cancer is a group of diseases causing abnormal cell growth, altering the genome, and invading or spreading to other parts of the body. Among therapeutic peptide drugs, anticancer peptides (ACPs) have been considered to target and kill cancer cells because cancer cells have unique characteristics, such as a high negative charge and an abundance of microvilli in the cell membrane, when compared to normal cells. ACPs have several advantages, such as high specificity, cost-effectiveness, low immunogenicity, minimal toxicity, and high tolerance under normal physiological conditions. However, the development and identification of ACPs are time-consuming and expensive in traditional wet-lab-based approaches. Thus, applying artificial intelligence to these approaches can save time and reduce the cost of identifying candidate ACPs. Recently, machine learning (ML), deep learning (DL), and hybrid learning (ML combined with DL) have emerged in the development of ACPs without experimental analysis, owing to advances in computer power and big data from the power system. Additionally, we suggest that combination therapy with classical approaches and ACPs might be one of the impactful approaches to increase the efficiency of cancer therapy.
|
68
|
Padhi S, Chourasia R, Kumari M, Singh SP, Rai AK. Production and characterization of bioactive peptides from rice beans using Bacillus subtilis. BIORESOURCE TECHNOLOGY 2022; 351:126932. [PMID: 35248709 DOI: 10.1016/j.biortech.2022.126932] [Citation(s) in RCA: 18] [Impact Index Per Article: 9.0] [Received: 01/10/2022] [Revised: 02/24/2022] [Accepted: 02/26/2022] [Indexed: 06/14/2023]
Abstract
A bioprocess was developed for production of bioactive peptides on microbial fermentation of rice beans using proteolytic Bacillus subtilis strains. The peptides produced were identified by LC-MS/MS analysis, revealing the presence of many unique peptide sequences to individual hydrolysates. On functional properties prediction, antihypertensive peptides (3.90%) were found to be higher in comparison to other bioactive peptides. Among different strains, B. subtilis KN2B fermented hydrolysate exhibited highest angiotensin converting enzyme (ACE)-inhibitory activity (45.73%). Furthermore, 19 selected peptides, including the common and unique peptides were examined for their affinity towards the binding cavity of ACE using molecular docking. The results showed a common peptide PFPIPFPIPIPLP, and another IPFPPIPFLPPI unique to B. subtilis KN2B fermented hydrolysate exhibited promising binding at the ACE binding site with substantial free binding energy. The process developed can be used for the production of bioactive peptides from rice bean for application in nutraceutical industries.
Affiliation(s)
- Srichandan Padhi
  - Institute of Bioresources and Sustainable Development, Regional Centre, Gangtok, India
- Rounak Chourasia
  - Institute of Bioresources and Sustainable Development, Regional Centre, Gangtok, India
- Megha Kumari
  - Institute of Bioresources and Sustainable Development, Regional Centre, Gangtok, India
- Sudhir P Singh
  - Centre of Innovative and Applied Bioprocessing, Mohali, India
- Amit Kumar Rai
  - Institute of Bioresources and Sustainable Development, Regional Centre, Gangtok, India; Institute of Bioresources and Sustainable Development, Mizoram Node, Aizawl, India
|
69
|
Lertampaiporn S, Hongsthong A, Wattanapornprom W, Thammarongtham C. Ensemble-AHTPpred: A Robust Ensemble Machine Learning Model Integrated With a New Composite Feature for Identifying Antihypertensive Peptides. Front Genet 2022; 13:883766. [PMID: 35571042 PMCID: PMC9096110 DOI: 10.3389/fgene.2022.883766] [Citation(s) in RCA: 3] [Impact Index Per Article: 1.5] [Received: 02/25/2022] [Accepted: 04/04/2022] [Indexed: 11/13/2022] Open
Abstract
Hypertension or elevated blood pressure is a serious medical condition that significantly increases the risks of cardiovascular disease, heart disease, diabetes, stroke, kidney disease, and other health problems, that affect people worldwide. Thus, hypertension is one of the major global causes of premature death. Regarding the prevention and treatment of hypertension with no or few side effects, antihypertensive peptides (AHTPs) obtained from natural sources might be useful as nutraceuticals. Therefore, the search for alternative/novel AHTPs in food or natural sources has received much attention, as AHTPs may be functional agents for human health. AHTPs have been observed in diverse organisms, although many of them remain underinvestigated. The identification of peptides with antihypertensive activity in the laboratory is time- and resource-consuming. Alternatively, computational methods based on robust machine learning can identify or screen potential AHTP candidates prior to experimental verification. In this paper, we propose Ensemble-AHTPpred, an ensemble machine learning algorithm composed of a random forest (RF), a support vector machine (SVM), and extreme gradient boosting (XGB), with the aim of integrating diverse heterogeneous algorithms to enhance the robustness of the final predictive model. The selected feature set includes various computed features, such as various physicochemical properties, amino acid compositions (AACs), transitions, n-grams, and secondary structure-related information; these features are able to learn more information in terms of analyzing or explaining the characteristics of the predicted peptide. In addition, the tool is integrated with a newly proposed composite feature (generated based on a logistic regression function) that combines various feature aspects to enable improved AHTP characterization. Our tool, Ensemble-AHTPpred, achieved an overall accuracy above 90% on independent test data. 
Additionally, the approach was applied to novel experimentally validated AHTPs, obtained from recent studies, which did not overlap with the training and test datasets, and the tool could precisely predict these AHTPs.
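The abstract does not say how the outputs of the three base learners are combined. One common option for heterogeneous ensembles is soft voting, where each model's predicted probability for the positive class is averaged and then thresholded. The sketch below illustrates only that generic rule; the function name, the threshold of 0.5, and the probability values are all hypothetical and do not come from the paper.

```python
def soft_vote(prob_lists, threshold=0.5):
    """Average per-sample positive-class probabilities across models.

    prob_lists: one list of probabilities per model, all of equal length.
    Returns (averaged probabilities, binary labels).
    """
    n_models = len(prob_lists)
    n_samples = len(prob_lists[0])
    if any(len(p) != n_samples for p in prob_lists):
        raise ValueError("all models must score the same number of samples")
    averaged = [sum(p[i] for p in prob_lists) / n_models for i in range(n_samples)]
    labels = [1 if a >= threshold else 0 for a in averaged]
    return averaged, labels


# Hypothetical probabilities from three base learners for three peptides
rf_probs = [0.9, 0.2, 0.6]
svm_probs = [0.8, 0.4, 0.3]
xgb_probs = [0.7, 0.1, 0.5]

averaged, labels = soft_vote([rf_probs, svm_probs, xgb_probs])
print(labels)
```

Averaging probabilities lets a confident model outweigh two uncertain ones, which is one reason soft voting is often preferred over a plain majority vote when the base learners are well calibrated.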
Affiliation(s)
- Supatcha Lertampaiporn
  - Biochemical Engineering and Systems Biology Research Group, National Center for Genetic Engineering and Biotechnology, National Science and Technology Development Agency at King Mongkut’s University of Technology Thonburi, Bangkok, Thailand
- Apiradee Hongsthong
  - Biochemical Engineering and Systems Biology Research Group, National Center for Genetic Engineering and Biotechnology, National Science and Technology Development Agency at King Mongkut’s University of Technology Thonburi, Bangkok, Thailand
- Warin Wattanapornprom
  - Applied Computer Science Program, Department of Mathematics, Faculty of Science, King Mongkut’s University of Technology Thonburi, Bangkok, Thailand
- Chinae Thammarongtham
  - Biochemical Engineering and Systems Biology Research Group, National Center for Genetic Engineering and Biotechnology, National Science and Technology Development Agency at King Mongkut’s University of Technology Thonburi, Bangkok, Thailand
  - *Correspondence: Chinae Thammarongtham
|
70
|
Wu X, Zeng W, Lin F, Xu P, Li X. Anticancer Peptide Prediction via Multi-Kernel CNN and Attention Model. Front Genet 2022; 13:887894. [PMID: 35571059 PMCID: PMC9092594 DOI: 10.3389/fgene.2022.887894] [Citation(s) in RCA: 6] [Impact Index Per Article: 3.0] [Received: 03/02/2022] [Accepted: 03/25/2022] [Indexed: 11/13/2022] Open
Abstract
Background: Modern lifestyles mean that people are more likely to suffer from some form of cancer. As anticancer peptides can effectively kill cancer cells and play an important role in fighting cancer, they have been a subject of increasing research interest. Methods: This study presents a useful tool to identify anticancer peptides based on a multi-kernel CNN and attention model, called ACP-MCAM. This model can automatically learn adaptive embeddings and the context sequence features of ACPs. In addition, to obtain better interpretability and integrity, we visualized the model. Results: Benchmarking comparison shows that ACP-MCAM significantly outperforms several state-of-the-art models. Different encoding schemes have different impacts on the performance of the model. We also studied the optimization of the method's parameters. Conclusion: ACP-MCAM integrates a multi-kernel CNN and a self-attention mechanism and outperforms previous models in identifying anticancer peptides. It is expected that the work will provide new research ideas for anticancer peptide prediction in the future. In addition, this work will promote the development of the interdisciplinary field of artificial intelligence and biomedicine.
Affiliation(s)
- Xiujin Wu
  - School of Informatics, Xiamen University, Xiamen, China
- Wenhua Zeng
  - School of Informatics, Xiamen University, Xiamen, China
- Fan Lin
  - School of Informatics, Xiamen University, Xiamen, China
  - Boston Children’s Hospital, Boston, MA, United States
- Peng Xu
  - Chongqing Michong Technology Co., Ltd., Chongqing, China
|
71
|
Breast and Lung Anticancer Peptides Classification Using N-Grams and Ensemble Learning Techniques. BIG DATA AND COGNITIVE COMPUTING 2022. [DOI: 10.3390/bdcc6020040] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 02/01/2023]
Abstract
Anticancer peptides (ACPs) are short protein sequences; they perform functions like some hormones and enzymes inside the body. The role of any protein or peptide is related to its structure and the sequence of amino acids that make it up. There are 20 types of amino acids in humans, and each of them has a particular characteristic according to its chemical structure. Current machine and deep learning models have been used for ACP classification problems. However, these models have neglected Amino Acid Repeats (AARs), which play an essential role in the function and structure of peptides. Therefore, this paper offers a promising route to novel anticancer peptides by extracting AARs based on N-Grams and k-mers using two peptide datasets. These datasets target breast and lung cancer cells and were assembled and curated manually from the Cancer Peptide and Protein Database (CancerPPD). Every dataset consists of peptide sequences together with their synthesis and anticancer activity on breast and lung cancer cell lines. Five different feature selection methods were used in this paper to improve classification performance and reduce the experimental costs. After that, ACPs were classified using four classifiers, namely AdaBoost, Random Forest Tree (RFT), Multi-class Support Vector Machine (SVM), and Multi-Layer Perceptron (MLP). These classifiers were evaluated by applying five well-known evaluation metrics. Experimental results showed that the breast and lung ACP classification process reached accuracies of 89.25% and 92.56%, respectively. In terms of AUC, it reached 95.35% and 96.92% for breast and lung ACPs, respectively. The proposed classifiers performed competently and roughly equally in AUC, accuracy, precision, F-measure, and recall, except for Multi-class SVM-based feature selection, which showed superior performance. 
As a result, this paper significantly improved the predictive performance that can effectively distinguish ACPs as virtual inactive, experimental inactive, moderately active, and very active.
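To make the N-gram (k-mer) idea in this abstract concrete, the sketch below counts every overlapping length-k substring of a peptide and keeps those that occur more than once. Treating a k-mer seen at least twice as an amino acid repeat is an assumption of this sketch; the abstract does not define AARs precisely, and the example sequence is hypothetical.

```python
from collections import Counter


def kmer_counts(seq, k):
    """Count all overlapping substrings of length k in a peptide sequence."""
    seq = seq.upper()
    return Counter(seq[i:i + k] for i in range(len(seq) - k + 1))


def repeated_kmers(seq, k):
    """Keep only k-mers occurring more than once (used here as a stand-in for AARs)."""
    return {kmer: n for kmer, n in kmer_counts(seq, k).items() if n > 1}


# Hypothetical peptide used only to show the output shape
print(kmer_counts("KLAKLAK", 3))
print(repeated_kmers("KLAKLAK", 3))
```

Counts like these can be laid out as one column per k-mer to form a feature matrix, after which feature selection reduces the columns before classification.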
|
72
|
Calatayud DG, Neophytou S, Nicodemou E, Giuffrida SG, Ge H, Pascu SI. Nano-Theranostics for the Sensing, Imaging and Therapy of Prostate Cancers. Front Chem 2022; 10:830133. [PMID: 35494646 PMCID: PMC9039169 DOI: 10.3389/fchem.2022.830133] [Citation(s) in RCA: 1] [Impact Index Per Article: 0.5] [Received: 12/06/2021] [Accepted: 03/16/2022] [Indexed: 01/28/2023] Open
Abstract
We highlight hereby recent developments in the emerging field of theranostics, which encompasses the combination of therapeutics and diagnostics in a single entity aimed at an early-stage diagnosis, image-guided therapy as well as evaluation of therapeutic outcomes of relevance to prostate cancer (PCa). Prostate cancer is one of the most common malignancies in men and a frequent cause of male cancer death. As such, this overview is concerned with recent developments in imaging and sensing of relevance to prostate cancer diagnosis and therapeutic monitoring. A major advantage for the effective treatment of PCa is an early diagnosis that would provide information for an appropriate treatment. Several imaging techniques are being developed to diagnose and monitor different stages of cancer in general, and patient stratification is particularly relevant for PCa. Hybrid imaging techniques applicable for diagnosis combine complementary structural and morphological information to enhance resolution and sensitivity of imaging. The focus of this review is to sum up some of the most recent advances in the nanotechnological approaches to the sensing and treatment of prostate cancer (PCa). Targeted imaging using nanoparticles, radiotracers and biomarkers could result in a more specialised and personalised diagnosis and treatment of PCa. A myriad of reports has been published in the literature proposing methods to detect and treat PCa using nanoparticles, but the number of techniques approved for clinical use is relatively small. Another facet of this report is on reviewing aspects of the role of functional nanoparticles in multimodality imaging therapy considering recent developments in simultaneous PET-MRI (Positron Emission Tomography-Magnetic Resonance Imaging) coupled with optical imaging in vitro and in vivo, whilst highlighting feasible case studies that hold promise for the next generation of dual modality medical imaging of PCa. 
It is envisaged that progress in the field of imaging and sensing domains, taken together, could benefit from the biomedical implementation of new synthetic platforms such as metal complexes and functional materials supported on organic molecular species, which can be conjugated to targeting biomolecules and encompass adaptable and versatile molecular architectures. Furthermore, we include hereby an overview of aspects of biosensing methods aimed to tackle PCa: prostate biomarkers such as Prostate Specific Antigen (PSA) have been incorporated into synthetic platforms and explored in the context of sensing and imaging applications in preclinical investigations for the early detection of PCa. Finally, some of the societal concerns around nanotechnology being used for the detection of PCa are considered and addressed together with the concerns about the toxicity of nanoparticles–these were aspects of recent lively debates that currently hamper the clinical advancements of nano-theranostics. The publications survey conducted for this review includes, to the best of our knowledge, some of the most recent relevant literature examples from the state-of-the-art. Highlighting these advances would be of interest to the biomedical research community aiming to advance the application of theranostics particularly in PCa diagnosis and treatment, but also to those interested in the development of new probes and methodologies for the simultaneous imaging and therapy monitoring employed for PCa targeting.
Affiliation(s)
- David G. Calatayud
  - Department of Chemistry, University of Bath, Bath, United Kingdom
  - Department of Electroceramics, Instituto de Ceramica y Vidrio - CSIC, Madrid, Spain
  - *Correspondence: Sofia I. Pascu; David G. Calatayud
- Sotia Neophytou
  - Department of Chemistry, University of Bath, Bath, United Kingdom
- Eleni Nicodemou
  - Department of Chemistry, University of Bath, Bath, United Kingdom
- Haobo Ge
  - Department of Chemistry, University of Bath, Bath, United Kingdom
- Sofia I. Pascu
  - Department of Chemistry, University of Bath, Bath, United Kingdom
  - Centre of Therapeutic Innovations, University of Bath, Bath, United Kingdom
  - *Correspondence: Sofia I. Pascu; David G. Calatayud
|
73
|
Peptide-Based Drug Predictions for Cancer Therapy Using Deep Learning. Pharmaceuticals (Basel) 2022; 15:ph15040422. [PMID: 35455418 PMCID: PMC9028292 DOI: 10.3390/ph15040422] [Citation(s) in RCA: 3] [Impact Index Per Article: 1.5] [Received: 03/07/2022] [Revised: 03/26/2022] [Accepted: 03/29/2022] [Indexed: 02/01/2023] Open
Abstract
Anticancer peptides (ACPs) are selective and toxic to cancer cells, which makes them candidates for new anticancer drugs. Identifying new ACPs is time-consuming and expensive because the anticancer abilities of all candidates must be evaluated. To reduce the cost of ACP drug development, we collected the most updated ACP data to train a convolutional neural network (CNN) with a peptide sequence encoding method for initial in silico evaluation. Here we introduced PC6, a novel protein-encoding method, to convert a peptide sequence into a computational matrix representing six physicochemical properties of each amino acid. By integrating data, encoding method, and deep learning model, we developed AI4ACP, a user-friendly web-based ACP distinguisher that can predict the anticancer property of query peptides and promote the discovery of peptides with anticancer activity. The experimental results demonstrate that the CNN-based AI4ACP, trained using the new ACP collection, outperforms the existing ACP predictors. The 5-fold cross-validation of AI4ACP with the new collection also showed that the model performs at a stable, high accuracy of around 0.89 without overfitting. Using AI4ACP, users can easily accomplish an early-stage evaluation of unknown peptides and quickly select potential candidates for testing their anticancer activities.
|
74
|
de Oliveira ECL, da Costa KS, Taube PS, Lima AH, Junior CDSDS. Biological Membrane-Penetrating Peptides: Computational Prediction and Applications. Front Cell Infect Microbiol 2022; 12:838259. [PMID: 35402305 PMCID: PMC8992797 DOI: 10.3389/fcimb.2022.838259] [Citation(s) in RCA: 16] [Impact Index Per Article: 8.0] [Received: 12/17/2021] [Accepted: 02/21/2022] [Indexed: 12/14/2022] Open
Abstract
Peptides comprise a versatile class of biomolecules that present a unique chemical space with diverse physicochemical and structural properties. Some classes of peptides are able to naturally cross biological membranes, such as the cell membrane and the blood-brain barrier (BBB). Cell-penetrating peptides (CPPs) and blood-brain barrier-penetrating peptides (B3PPs) have been explored by the biotechnological and pharmaceutical industries to develop new therapeutic molecules and carrier systems. The computational prediction of peptides’ penetration into biological membranes has emerged as an interesting strategy because it enables high-throughput and low-cost screening of large chemical libraries. Structure- and sequence-based information of peptides, as well as atomistic biophysical models, have been explored in computer-assisted discovery strategies to classify and identify new structures with pharmacokinetic properties related to the translocation through biomembranes. Computational strategies to predict the permeability into biomembranes include cheminformatic filters, molecular dynamics simulations, artificial intelligence algorithms, and statistical models, and the choice of the most adequate method depends on the purposes of the computational investigation. Here, we present and discuss some principles and applications of these computational methods widely used to predict the permeability of peptides into biomembranes, and show some of their pharmaceutical and biotechnological applications.
Affiliation(s)
- Ewerton Cristhian Lima de Oliveira
  - Institute of Technology, Federal University of Pará, Belém, Brazil
  - *Correspondence: Kauê Santana da Costa; Ewerton Cristhian Lima de Oliveira
- Kauê Santana da Costa
  - Laboratory of Computational Simulation, Institute of Biodiversity, Federal University of Western Pará, Santarém, Brazil
  - *Correspondence: Kauê Santana da Costa; Ewerton Cristhian Lima de Oliveira
- Paulo Sérgio Taube
  - Laboratory of Computational Simulation, Institute of Biodiversity, Federal University of Western Pará, Santarém, Brazil
- Anderson H. Lima
  - Laboratório de Planejamento e Desenvolvimento de Fármacos, Instituto de Ciências Exatas e Naturais, Universidade Federal do Pará, Belém, Brazil
|
75
|
Delaunay M, Ha-Duong T. Computational Tools and Strategies to Develop Peptide-Based Inhibitors of Protein-Protein Interactions. METHODS IN MOLECULAR BIOLOGY (CLIFTON, N.J.) 2022; 2405:205-230. [PMID: 35298816 DOI: 10.1007/978-1-0716-1855-4_11] [Citation(s) in RCA: 6] [Impact Index Per Article: 3.0] [Indexed: 11/30/2022]
Abstract
Protein-protein interactions play crucial and subtle roles in many biological processes and modifications of their fine mechanisms generally result in severe diseases. Peptide derivatives are very promising therapeutic agents for modulating protein-protein associations with sizes and specificities between those of small compounds and antibodies. For the same reasons, rational design of peptide-based inhibitors naturally borrows and combines computational methods from both protein-ligand and protein-protein research fields. In this chapter, we aim to provide an overview of computational tools and approaches used for identifying and optimizing peptides that target protein-protein interfaces with high affinity and specificity. We hope that this review will help to implement appropriate in silico strategies for peptide-based drug design that builds on available information for the systems of interest.
Affiliation(s)
- Tâp Ha-Duong
  - Université Paris-Saclay, CNRS, BioCIS, Châtenay-Malabry, France
|
76
|
Qi L, Gao X, Pan D, Sun Y, Cai Z, Xiong Y, Dang Y. Research progress in the screening and evaluation of umami peptides. Compr Rev Food Sci Food Saf 2022; 21:1462-1490. [PMID: 35201672 DOI: 10.1111/1541-4337.12916] [Citation(s) in RCA: 30] [Impact Index Per Article: 15.0] [Received: 08/25/2021] [Revised: 12/22/2021] [Accepted: 01/03/2022] [Indexed: 12/22/2022]
Abstract
Umami is an important element affecting food taste, and the development of umami peptides is a topic of interest in food-flavoring research. The existing technology used for traditional screening of umami peptides is time-consuming and labor-intensive, making it difficult to meet the requirements of high-throughput screening, which limits the rapid development of umami peptides. The difficulty in performing a standard measurement of umami intensity is another problem that restricts the development of umami peptides. The existing methods are not sensitive and specific, making it difficult to achieve a standard evaluation of umami taste. This review summarizes the umami receptors and umami peptides, focusing on the problems restricting the development of umami peptides, high-throughput screening, and establishment of evaluation standards. The rapid screening of umami peptides was realized based on molecular docking technology and a machine learning method, and the standard evaluation of umami could be realized with a bionic taste sensor. The progress of rapid screening and evaluation methods significantly promotes the study of umami peptides and increases its application in the seasoning industry.
Affiliation(s)
- Lulu Qi
  - State Key Laboratory for Managing Biotic and Chemical Threats to the Quality and Safety of AgroProducts, College of Food and Pharmaceutical Sciences, Ningbo University, Ningbo, Zhejiang, China
- Xinchang Gao
  - Department of Chemistry, Tsinghua University, Beijing, China
- Daodong Pan
  - State Key Laboratory for Managing Biotic and Chemical Threats to the Quality and Safety of AgroProducts, College of Food and Pharmaceutical Sciences, Ningbo University, Ningbo, Zhejiang, China
  - National R&D Center for Freshwater Fish Processing, Jiangxi Normal University, Nanchang, Jiangxi, China
- Yangying Sun
  - State Key Laboratory for Managing Biotic and Chemical Threats to the Quality and Safety of AgroProducts, College of Food and Pharmaceutical Sciences, Ningbo University, Ningbo, Zhejiang, China
- Zhendong Cai
  - State Key Laboratory for Managing Biotic and Chemical Threats to the Quality and Safety of AgroProducts, College of Food and Pharmaceutical Sciences, Ningbo University, Ningbo, Zhejiang, China
- Yongzhao Xiong
  - State Key Laboratory for Managing Biotic and Chemical Threats to the Quality and Safety of AgroProducts, College of Food and Pharmaceutical Sciences, Ningbo University, Ningbo, Zhejiang, China
- Yali Dang
  - State Key Laboratory for Managing Biotic and Chemical Threats to the Quality and Safety of AgroProducts, College of Food and Pharmaceutical Sciences, Ningbo University, Ningbo, Zhejiang, China
|
77
|
Bonidia RP, Domingues DS, Sanches DS, de Carvalho ACPLF. MathFeature: feature extraction package for DNA, RNA and protein sequences based on mathematical descriptors. Brief Bioinform 2022; 23:bbab434. [PMID: 34750626 PMCID: PMC8769707 DOI: 10.1093/bib/bbab434] [Citation(s) in RCA: 9] [Impact Index Per Article: 4.5] [Received: 06/29/2021] [Revised: 09/18/2021] [Accepted: 09/20/2021] [Indexed: 12/24/2022] Open
Abstract
One of the main challenges in applying machine learning algorithms to biological sequence data is how to represent a sequence as a numeric input vector. Feature extraction techniques capable of extracting numerical information from biological sequences have been reported in the literature. However, many of these techniques, such as mathematical descriptors, are not available in existing packages. This paper presents a new package, MathFeature, which implements mathematical descriptors able to extract relevant numerical information from biological sequences, i.e. DNA, RNA and proteins (prediction of structural features along the primary sequence of amino acids). MathFeature makes available 20 numerical feature extraction descriptors based on approaches found in the literature, e.g. multiple numeric mappings, genomic signal processing, chaos game theory, entropy and complex networks. MathFeature also allows the extraction of alternative features, complementing the existing packages. To ensure that our descriptors are robust and to assess their relevance, experimental results are presented in nine case studies. According to these results, the features extracted by MathFeature showed high performance (0.6350-0.9897, accuracy), both when applying only mathematical descriptors and when hybridizing them with well-known descriptors from the literature. Finally, using MathFeature, we outperformed several studies on eight benchmark datasets, exemplifying the robustness and viability of the proposed package. MathFeature advances the area by bringing descriptors not available in other packages, as well as allowing non-experts to use feature extraction techniques.
Affiliation(s)
- Robson P Bonidia
  - Institute of Mathematics and Computer Sciences, University of São Paulo, São Carlos 13566-590, Brazil
- Douglas S Domingues
  - Group of Genomics and Transcriptomes in Plants, Institute of Biosciences, São Paulo State University (UNESP), Rio Claro 13506-900, Brazil
- Danilo S Sanches
  - Department of Computer Science, Federal University of Technology - Paraná, UTFPR, Cornélio Procópio 86300-000, Brazil
- André C P L F de Carvalho
  - Institute of Mathematics and Computer Sciences, University of São Paulo, São Carlos 13566-590, Brazil
|
78
|
HAMIN NETO YAA, GARZON NGDR, COITINHO LB, SOBRAL LM, LEOPOLDINO AM, CATALDI TR, LABATE CA, CABRAL H. Fungal metalloprotease generate whey-derived peptides that may be involved in apoptosis in B16F10 melanoma cells. FOOD SCIENCE AND TECHNOLOGY 2022. [DOI: 10.1590/fst.43022] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 11/22/2022]
|
79
|
Abstract
Background:
Therapeutic peptide prediction is critical for drug development and therapy. Researchers have been studying this essential task, developing several computational methods to identify different therapeutic peptide types.
Objective:
Most predictors are the specific methods for certain peptides. Currently, developing methods to predict the presence of multiple peptides remains a challenging problem. Moreover, it is still challenging to combine different features to make the therapeutic prediction.
Method:
In this paper, we proposed a new ensemble method TP-MV for general therapeutic peptide recognition. TP-MV is developed using the stacking framework in conjunction with the KNN, SVM, ET, RF, and XGB. Then TP-MV constructs a multi-view learning model as meta-classifiers to extract the discriminative feature for different peptides.
Results:
In the experiment, the proposed method outperforms the other existing methods on the benchmark datasets, indicating that the proposed method has the ability to predict multiple therapeutic peptides simultaneously.
Conclusion:
The TP-MV is a useful tool for predicting therapeutic peptides.
Affiliation(s)
- Ke Yan
  - School of Computer Science and Technology, Beijing Institute of Technology, Beijing, China
- Hongwu Lv
  - School of Computer Science and Technology, Beijing Institute of Technology, Beijing, China
- Yichen Guo
  - School of Computer Science and Technology, Beijing Institute of Technology, Beijing, China
- Jie Wen
  - School of Computer Science and Technology, Harbin Institute of Technology, Shenzhen, Guangdong, China
- Bin Liu
  - School of Computer Science and Technology, Beijing Institute of Technology, Beijing, China
|
80
|
Natural Peptides Inducing Cancer Cell Death: Mechanisms and Properties of Specific Candidates for Cancer Therapeutics. Molecules 2021; 26:molecules26247453. [PMID: 34946535 PMCID: PMC8708364 DOI: 10.3390/molecules26247453] [Citation(s) in RCA: 4] [Impact Index Per Article: 1.3] [Received: 11/18/2021] [Revised: 12/05/2021] [Accepted: 12/07/2021] [Indexed: 01/10/2023] Open
Abstract
Nowadays, cancer has become the second highest leading cause of death, and it is expected to continue to affect the population in forthcoming years. Additionally, treatment options will become less accessible to the public as cases continue to grow and disease mechanisms expand. Hence, specific candidates with confirmed anticancer effects are required to develop new drugs. Among the novel therapeutic options, proteins are considered a relevant source, given that they have bioactive peptides encrypted within their sequences. These bioactive peptides, which are molecules consisting of 2–50 amino acids, have specific activities when administered, producing anticancer effects. Current databases report the effects of peptides. However, uncertainty is found when their molecular mechanisms are investigated. Furthermore, analyses addressing their interaction networks or their directly implicated mechanisms are needed to elucidate their effects on cancer cells entirely. Therefore, relevant peptides considered as candidates for cancer therapeutics with specific sequences and known anticancer mechanisms were accurately reviewed. Likewise, those features which turn certain peptides into candidates and the mechanisms by which peptides mediate tumor cell death were highlighted. This information will make robust the knowledge of these candidate peptides with recognized mechanisms and enhance their non-toxic capacity in relation to healthy cells and further avoid cell resistance.
|
81
|
Trinidad-Calderón PA, Varela-Chinchilla CD, García-Lara S. Natural Peptides Inducing Cancer Cell Death: Mechanisms and Properties of Specific Candidates for Cancer Therapeutics. Molecules 2021. [DOI: 10.3390/molecules26247453] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 01/10/2023] Open
|
82
|
Ahmed S, Muhammod R, Khan ZH, Adilina S, Sharma A, Shatabda S, Dehzangi A. ACP-MHCNN: an accurate multi-headed deep-convolutional neural network to predict anticancer peptides. Sci Rep 2021; 11:23676. [PMID: 34880291 PMCID: PMC8654959 DOI: 10.1038/s41598-021-02703-3] [Citation(s) in RCA: 29] [Impact Index Per Article: 9.7] [Received: 01/21/2021] [Accepted: 11/17/2021] [Indexed: 01/10/2023] Open
Abstract
Although advancing therapeutic alternatives for treating deadly cancers has gained much attention globally, primary methods such as chemotherapy still have significant downsides and low specificity. Most recently, anticancer peptides (ACPs) have emerged as a potential alternative with far fewer negative side effects. However, the identification of ACPs through wet-lab experiments is expensive and time-consuming. Hence, computational methods have emerged as viable alternatives. During the past few years, several computational ACP identification techniques using hand-engineered features have been proposed to solve this problem. In this study, we propose a new multi-headed deep convolutional neural network model called ACP-MHCNN for extracting and combining discriminative features from different information sources in an interactive way. Our model extracts sequence, physicochemical, and evolutionary based features for ACP identification using different numerical peptide representations while restraining parameter overhead. Rigorous experiments using cross-validation and an independent dataset show that ACP-MHCNN outperforms other models for anticancer peptide identification by a substantial margin on our employed benchmarks. ACP-MHCNN outperforms the state-of-the-art model by 6.3%, 8.6%, 3.7%, 4.0%, and 0.20 in terms of accuracy, sensitivity, specificity, precision, and MCC, respectively. ACP-MHCNN and its relevant codes and datasets are publicly available at https://github.com/mrzResearchArena/Anticancer-Peptides-CNN. ACP-MHCNN is also publicly available as an online predictor at https://anticancer.pythonanywhere.com/.
Affiliation(s)
- Sajid Ahmed
- Department of Computer Science and Engineering, United International University, Dhaka, Bangladesh
- Rafsanjani Muhammod
- Department of Computer Science and Engineering, United International University, Dhaka, Bangladesh
- Zahid Hossain Khan
- Department of Computer Science and Engineering, United International University, Dhaka, Bangladesh
- Sheikh Adilina
- Department of Computer Science and Engineering, United International University, Dhaka, Bangladesh
- Alok Sharma
- Laboratory for Medical Science Mathematics, RIKEN Center for Integrative Medical Sciences, Yokohama, 230-0045, Japan
- Institute for Integrated and Intelligent Systems, Griffith University, Brisbane, QLD, 4111, Australia
- Swakkhar Shatabda
- Department of Computer Science and Engineering, United International University, Dhaka, Bangladesh.
- Abdollah Dehzangi
- Department of Computer Science, Rutgers University, Camden, NJ, 08102, USA.
- Center for Computational and Integrative Biology, Rutgers University, Camden, NJ, 08102, USA.
|
83
|
Guo Y, Yan K, Lv H, Liu B. PreTP-EL: prediction of therapeutic peptides based on ensemble learning. Brief Bioinform 2021; 22:6359002. [PMID: 34459488 DOI: 10.1093/bib/bbab358] [Citation(s) in RCA: 22] [Impact Index Per Article: 7.3] [Received: 06/15/2021] [Revised: 07/27/2021] [Accepted: 08/11/2021] [Indexed: 01/02/2023] Open
Abstract
Therapeutic peptides are important for understanding the correlation between peptides and their therapeutic diagnostic potential. The therapeutic peptides can be further divided into different types based on therapeutic function sharing different characteristics. Although some computational approaches have been proposed to predict different types of therapeutic peptides, they failed to accurately predict all types of therapeutic peptides. In this study, a predictor called PreTP-EL has been proposed via employing the ensemble learning approach to fuse the different features and machine learning techniques in order to capture the different characteristics of various therapeutic peptides. Experimental results showed that PreTP-EL outperformed other competing methods. Availability and implementation: A user-friendly web-server of PreTP-EL predictor is available at http://bliulab.net/PreTP-EL.
Affiliation(s)
- Yichen Guo
- School of Computer Science and Technology, Beijing Institute of Technology, Beijing, China
- Ke Yan
- School of Computer Science and Technology, Beijing Institute of Technology, Beijing, China
- Hongwu Lv
- School of Computer Science and Technology, Beijing Institute of Technology, Beijing, China
- Bin Liu
- School of Computer Science and Technology, Beijing Institute of Technology, Beijing, China
|
84
|
Wang H, Zhao J, Zhao H, Li H, Wang J. CL-ACP: a parallel combination of CNN and LSTM anticancer peptide recognition model. BMC Bioinformatics 2021; 22:512. [PMID: 34670488 PMCID: PMC8527680 DOI: 10.1186/s12859-021-04433-9] [Citation(s) in RCA: 3] [Impact Index Per Article: 1.0] [Received: 05/10/2021] [Accepted: 10/05/2021] [Indexed: 01/10/2023] Open
Abstract
Background:
Anticancer peptides are defence substances with innate immune functions that can selectively act on cancer cells without harming normal cells, and many studies have been conducted to identify anticancer peptides. In this paper, we introduce the anticancer peptide secondary structures as additional features and propose an effective computational model, CL-ACP, that uses a combined network and attention mechanism to predict anticancer peptides.
Results:
The CL-ACP model uses secondary structures and original sequences of anticancer peptides to construct the feature space. The long short-term memory and convolutional neural network are used to extract the contextual dependence and local correlations of the feature space. Furthermore, a multi-head self-attention mechanism is used to strengthen the anticancer peptide sequences. Finally, three categories of feature information are classified by cascading. CL-ACP was validated using two types of datasets, anticancer peptide datasets and antimicrobial peptide datasets, on which it achieved good results compared to previous methods. CL-ACP achieved the highest AUC values of 0.935 and 0.972 on the anticancer peptide and antimicrobial peptide datasets, respectively.
Conclusions:
CL-ACP can effectively recognize antimicrobial peptides, especially anticancer peptides, and the parallel combined neural network structure of CL-ACP does not require complex feature design and high time cost. It is suitable for application as a useful tool in antimicrobial peptide design.
Affiliation(s)
- Huiqing Wang
- College of Information and Computer, Taiyuan University of Technology, Taiyuan, 030024, China
- Jian Zhao
- College of Information and Computer, Taiyuan University of Technology, Taiyuan, 030024, China.
- Hong Zhao
- College of Information and Computer, Taiyuan University of Technology, Taiyuan, 030024, China
- Haolin Li
- College of Information and Computer, Taiyuan University of Technology, Taiyuan, 030024, China
- Juan Wang
- College of Information and Computer, Taiyuan University of Technology, Taiyuan, 030024, China
|
85
|
You H, Yu L, Tian S, Ma X, Xing Y, Song J, Wu W. Anti-cancer Peptide Recognition Based on Grouped Sequence and Spatial Dimension Integrated Networks. Interdiscip Sci 2021; 14:196-208. [PMID: 34637113 DOI: 10.1007/s12539-021-00481-0] [Citation(s) in RCA: 1] [Impact Index Per Article: 0.3] [Received: 05/08/2021] [Revised: 09/05/2021] [Accepted: 09/09/2021] [Indexed: 11/24/2022]
Abstract
The diversification of the characteristic sequences of anti-cancer peptides has imposed difficulties on research. To effectively predict new anti-cancer peptides, this paper proposes a more suitable feature grouping sequence and spatial dimension-integrated network algorithm for anti-cancer peptide sequence prediction called GRCI-Net. The main process is as follows: First, we implemented the fusion reduction of binary structure features and K-mer sparse matrix features through principal component analysis and generated a set of new features; second, we constructed a new bidirectional long short-term memory network. We used traditional convolution and dilated convolution to acquire features in the spatial dimension using the memory network's grouping sequence model, which is designed to better handle the diversification of anti-cancer peptide feature sequences and to fully learn the contextual information between features. Finally, we achieved the fusion of grouping sequence features and spatial dimensional integration features through two sets of dense network layers, achieved the prediction of anti-cancer peptides through the sigmoid function, and verified the approach with two public datasets, ACP740 (accuracy reached 0.8230) and ACP240 (accuracy reached 0.8750). The following is a link to the model code and datasets mentioned in this article: https://github.com/YouHongfeng101/ACP-DL.
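The K-mer sparse matrix feature mentioned in this abstract can be illustrated with a minimal sketch. The abstract does not give the exact encoding, so the layout below (one row per possible k-mer, one column per sequence position, a 1 where that k-mer starts) is an assumption, and the function name `kmer_sparse_matrix` is hypothetical.

```python
from itertools import product

# The 20 standard amino acids (one-letter codes).
AMINO_ACIDS = "ACDEFGHIKLMNPQRSTVWY"


def kmer_sparse_matrix(sequence, k=2, alphabet=AMINO_ACIDS):
    """Return a 0/1 matrix with len(alphabet)**k rows and len(sequence)-k+1 columns.

    Entry [i][j] is 1 when the k-mer starting at position j of the sequence
    equals the i-th k-mer in the order produced by itertools.product.
    This layout is an illustrative assumption, not the paper's definition.
    """
    kmers = ["".join(p) for p in product(alphabet, repeat=k)]
    index = {kmer: i for i, kmer in enumerate(kmers)}
    n_positions = max(len(sequence) - k + 1, 0)
    matrix = [[0] * n_positions for _ in kmers]
    for j in range(n_positions):
        row = index.get(sequence[j:j + k])
        if row is not None:  # k-mers with non-standard residues are skipped
            matrix[row][j] = 1
    return matrix
```

Each column contains at most one nonzero entry, which is why such a matrix is sparse and why a dimensionality reduction step such as principal component analysis is a natural follow-up.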
Affiliation(s)
- Hongfeng You
- College of Information Science and Engineering, Xinjiang University, 666 Shengli Road, Tianshan District, Urumqi, Xinjiang, China
- Long Yu
- Network Center, Xinjiang University, Xinjiang, China.
- Shengwei Tian
- School of Software, Xinjiang University, Tianshan District, 666 Shengli Road, Urumqi, Xinjiang, China
- Xiang Ma
- Department of Cardiology, The First Affiliated Hospital of Xinjiang Medical University, Urumqi, 830011, China
- Yan Xing
- Imaging Center, The First Affiliated Hospital of Xinjiang Medical University, No. 137, LiYuShan South Road, Urumqi, Xinjiang, China
- Jinmiao Song
- College of Information Science and Engineering, Xinjiang University, Urumqi, Xinjiang, China
- Weidong Wu
- People's Hospital of Xinjiang Uygur Autonomous Region, Urumqi, Xinjiang, China
|
86
|
Alghamdi W, Alzahrani E, Ullah MZ, Khan YD. 4mC-RF: Improving the prediction of 4mC sites using composition and position relative features and statistical moment. Anal Biochem 2021; 633:114385. [PMID: 34571005 DOI: 10.1016/j.ab.2021.114385] [Citation(s) in RCA: 14] [Impact Index Per Article: 4.7] [Received: 12/17/2020] [Revised: 09/09/2021] [Accepted: 09/13/2021] [Indexed: 01/28/2023]
Abstract
N4-methylcytosine (4mC) is an important epigenetic modification that occurs enzymatically by the action of DNA methyltransferases. 4mC sites exist in prokaryotes and eukaryotes and play a vital role in regulating gene expression, DNA replication, and the cell cycle. Efficient and accurate prediction of 4mC sites contributes significantly to insight into the biological properties and functions of 4mC. Therefore, a sequence-based predictor, namely 4mC-RF, is proposed for identifying 4mC sites through the integration of statistical moments with position- and composition-dependent features. Relative and absolute position-based features are computed to extract optimal features. A popular machine learning classifier, random forest, was used for training the model. Validation results were obtained through rigorous processes of self-consistency, 10-fold cross-validation, independent set testing, and jackknife testing, yielding accuracies of 95.1%, 95.2%, 97.0%, and 94.7%, respectively. Our proposed model achieves the highest prediction accuracies compared to existing models. Subsequently, the developed 4mC-RF model was deployed as a web server. A more accurate predictor of 4mC sites helps experimental scientists obtain results faster, more efficiently, and more cost-effectively.
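The 10-fold cross-validation and jackknife protocols named in this abstract differ only in how the samples are partitioned. The sketch below generates the index splits for both; the function names are hypothetical, the folds are contiguous (shuffling beforehand is left to the caller), and nothing here reflects the authors' actual implementation.

```python
def k_fold_indices(n_samples, k):
    """Yield (train_indices, test_indices) pairs for k contiguous folds."""
    if not 2 <= k <= n_samples:
        raise ValueError("k must be between 2 and n_samples")
    # Spread the remainder over the first folds so sizes differ by at most 1.
    fold_sizes = [n_samples // k + (1 if i < n_samples % k else 0) for i in range(k)]
    start = 0
    for size in fold_sizes:
        test = list(range(start, start + size))
        train = list(range(0, start)) + list(range(start + size, n_samples))
        yield train, test
        start += size


def jackknife_indices(n_samples):
    """Jackknife (leave-one-out) testing is k-fold with k equal to n_samples."""
    return k_fold_indices(n_samples, n_samples)
```

In jackknife testing every sample serves as the test set exactly once, so it needs as many model fits as there are samples, whereas 10-fold cross-validation needs only ten.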
Affiliation(s)
- Wajdi Alghamdi
- Department of Information Technology, Faculty of Computing and Information Technology, King Abdulaziz University, P. O. Box 80221, Jeddah 21589, Saudi Arabia.
- Ebraheem Alzahrani
- Department of Mathematics, Faculty of Science, King Abdulaziz University, P. O. Box 80203, Jeddah 21589, Saudi Arabia.
- Malik Zaka Ullah
- Department of Mathematics, Faculty of Science, King Abdulaziz University, P. O. Box 80203, Jeddah 21589, Saudi Arabia.
- Yaser Daanial Khan
- Department of Computer Science, University of Management and Technology, Lahore 54770, Pakistan.
|
87
|
Virtual Screening for Biomimetic Anti-Cancer Peptides from Cordyceps militaris Putative Pepsinized Peptidome and Validation on Colon Cancer Cell Line. Molecules 2021; 26:molecules26195767. [PMID: 34641308 PMCID: PMC8510206 DOI: 10.3390/molecules26195767] [Citation(s) in RCA: 1] [Impact Index Per Article: 0.3] [Received: 07/03/2021] [Revised: 09/18/2021] [Accepted: 09/19/2021] [Indexed: 02/07/2023] Open
Abstract
Colorectal cancer is one of the leading causes of cancer-related death in Thailand and many other countries. The standard practice for curing this cancer is surgery with an adjuvant chemotherapy treatment. However, the unfavorable side effects of chemotherapeutic drugs are undeniable. Recently, protein hydrolysates and anticancer peptides have become popular alternative options for colon cancer treatment. Therefore, we aimed to screen and select the anticancer peptide candidates from the in silico pepsin hydrolysate of a Cordyceps militaris (CM) proteome using machine-learning-based prediction servers for anticancer prediction, i.e., AntiCP, iACP, and MLACP. The selected CM-anticancer peptide candidates could be an alternative treatment or co-treatment agent for colorectal cancer, reducing the use of chemotherapeutic drugs. To confirm the anticancer properties, an in vitro assay was performed with "CM-biomimetic peptides" on the non-metastatic colon cancer cell line (HT-29). According to the 3-(4,5-dimethylthiazol-2-yl)-2,5-diphenyltetrazolium bromide (MTT) assay results from peptide candidate treatments at 0-400 µM, the IC50 dose of the non-toxic, cancer-cell-penetrating CM-biomimetic peptide, the original C. militaris biomimetic peptide (C-ori), against the HT-29 cell line was 114.9 µM at 72 hours. The effects of C-ori were compared with those of doxorubicin, a conventional chemotherapeutic drug for colon cancer treatment, and the combination effects of the CM-anticancer peptide and doxorubicin were observed. The results showed that C-ori increased the overall efficiency in the combination treatment with doxorubicin. According to the acridine orange/propidium iodide (AO/PI) staining assay, C-ori can significantly induce apoptosis in HT-29 cells, as confirmed by chromatin condensation, membrane blebbing, apoptotic bodies, and late apoptosis observed under a fluorescence microscope.
|
88
|
Cai L, Wang L, Fu X, Zeng X. Active Semisupervised Model for Improving the Identification of Anticancer Peptides. ACS OMEGA 2021; 6:23998-24008. [PMID: 34568678 PMCID: PMC8459422 DOI: 10.1021/acsomega.1c03132] [Citation(s) in RCA: 4] [Impact Index Per Article: 1.3] [Received: 06/15/2021] [Indexed: 06/13/2023]
Abstract
Cancer is one of the most dangerous threats to human health. Accurate identification of anticancer peptides (ACPs) is valuable for the development and design of new anticancer agents. However, most machine-learning algorithms have limited ability to identify ACPs, and their accuracy is sensitive to the amount of labeled data. In this paper, we construct a new method, called ACP-ALPM, that combines active learning (AL) and the label propagation (LP) algorithm to solve this problem. First, we develop an efficient feature representation method based on various descriptor information and coding information of the peptide sequence. Then, an AL strategy is used to filter out the most informative data for model training, and a more powerful LP classifier is built through continuous iterations. Finally, we evaluate the performance of ACP-ALPM and compare it with that of some state-of-the-art and classic methods; experimental results show that our method is significantly superior to them. In addition, an experimental comparison of random selection and AL on three public data sets shows that the AL strategy is more effective. Notably, a visualization experiment further verified that AL can utilize unlabeled data to improve the performance of the model. We hope that our method can be extended to other types of peptides and provide more inspiration for other similar work.
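The step of filtering out the most informative data can be sketched with uncertainty sampling, a common active learning query strategy. The abstract does not say which strategy ACP-ALPM uses, so this choice and the function name `select_most_uncertain` are assumptions made only for illustration.

```python
def select_most_uncertain(probabilities, n_queries):
    """Return indices of the samples whose predicted positive-class
    probability lies closest to 0.5, most uncertain first.

    probabilities: classifier outputs in [0, 1] for the unlabeled pool.
    n_queries: how many samples to send for labeling in this round.
    """
    ranked = sorted(
        range(len(probabilities)),
        key=lambda i: abs(probabilities[i] - 0.5),
    )
    return ranked[:n_queries]
```

For example, with pool predictions `[0.95, 0.52, 0.10, 0.45, 0.70]` and two queries, the function picks the samples at indices 1 and 3, the two the classifier is least sure about. Those samples would then be labeled and added to the training set before the next iteration.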
Affiliation(s)
- Lijun Cai
- Department of Information Science and Technology, Hunan University, Changsha, Hunan 410000, China
- Li Wang
- Department of Information Science and Technology, Hunan University, Changsha, Hunan 410000, China
- Xiangzheng Fu
- Department of Information Science and Technology, Hunan University, Changsha, Hunan 410000, China
- Xiangxiang Zeng
- Department of Information Science and Technology, Hunan University, Changsha, Hunan 410000, China
|
89
|
Khan YD, Khan NS, Naseer S, Butt AH. iSUMOK-PseAAC: prediction of lysine sumoylation sites using statistical moments and Chou's PseAAC. PeerJ 2021; 9:e11581. [PMID: 34430072 PMCID: PMC8349168 DOI: 10.7717/peerj.11581] [Citation(s) in RCA: 20] [Impact Index Per Article: 6.7] [Received: 02/18/2021] [Accepted: 05/19/2021] [Indexed: 01/25/2023] Open
Abstract
Sumoylation is a post-translational modification that is involved in the adaptation of cells and the functional properties of a large number of proteins. Sumoylation has key importance in subcellular concentration, transcriptional synchronization, chromatin remodeling, response to stress, and regulation of mitosis. Sumoylation is associated with developmental defects in many human diseases such as cancer, Huntington's, Alzheimer's, Parkinson's, spinocerebellar ataxia 1, and amyotrophic lateral sclerosis. The covalent bonding of sumoylation is essential to inheriting part of the operative characteristics of some other proteins. For that reason, the prediction of sumoylation sites has significance in the scientific community. A novel and efficient technique is proposed to predict the sumoylation sites in proteins by incorporating Chou's Pseudo Amino Acid Composition (PseAAC) with statistical moments-based features. The outcomes from the proposed system using 10-fold cross-validation testing are 94.51%, 94.24%, and 94.79% for accuracy, sensitivity, and specificity, respectively, with an MCC of 0.8903. The performance of the proposed system is so far the best in comparison to the other state-of-the-art methods. The codes for the current study are available on the GitHub repository using the link: https://github.com/csbioinfopk/iSumoK-PseAAC.
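The four metrics reported in this abstract are all derived from the counts of a binary confusion matrix. A minimal sketch using the standard definitions follows; the function name `binary_metrics` is hypothetical, and the code assumes both classes are present in the test data. Note that the Matthews correlation coefficient (MCC) ranges from -1 to 1 rather than being a percentage.

```python
import math


def binary_metrics(tp, tn, fp, fn):
    """Compute accuracy, sensitivity, specificity, and MCC from the
    true/false positive/negative counts of a binary classifier."""
    accuracy = (tp + tn) / (tp + tn + fp + fn)
    sensitivity = tp / (tp + fn)  # recall on the positive class
    specificity = tn / (tn + fp)  # recall on the negative class
    denom = math.sqrt((tp + fp) * (tp + fn) * (tn + fp) * (tn + fn))
    # By convention MCC is set to 0 when the denominator vanishes.
    mcc = (tp * tn - fp * fn) / denom if denom else 0.0
    return {
        "accuracy": accuracy,
        "sensitivity": sensitivity,
        "specificity": specificity,
        "mcc": mcc,
    }
```

Because MCC uses all four cells of the confusion matrix, it stays informative on imbalanced data sets where accuracy alone can look deceptively high.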
Affiliation(s)
- Yaser Daanial Khan
- Department of Computer Science, School of Systems and Technology, University of Management and Technology, Lahore, Punjab, Pakistan
- Nabeel Sabir Khan
- Department of Computer Science, School of Systems and Technology, University of Management and Technology, Lahore, Punjab, Pakistan
- Sheraz Naseer
- Department of Computer Science, School of Systems and Technology, University of Management and Technology, Lahore, Punjab, Pakistan
- Ahmad Hassan Butt
- Department of Computer Science, School of Systems and Technology, University of Management and Technology, Lahore, Punjab, Pakistan
|
90
|
iBitter-Fuse: A Novel Sequence-Based Bitter Peptide Predictor by Fusing Multi-View Features. Int J Mol Sci 2021; 22:ijms22168958. [PMID: 34445663 PMCID: PMC8396555 DOI: 10.3390/ijms22168958] [Citation(s) in RCA: 24] [Impact Index Per Article: 8.0] [Received: 07/08/2021] [Revised: 08/08/2021] [Accepted: 08/17/2021] [Indexed: 12/19/2022] Open
Abstract
Accurate identification of bitter peptides is of great importance for better understanding their biochemical and biophysical properties. To date, machine learning-based methods have become effective approaches for identifying potential bitter peptides from large-scale protein datasets. Although a few machine learning-based predictors have been developed for identifying the bitterness of peptides, their prediction performance could be improved. In this study, we developed a new predictor (named iBitter-Fuse) for achieving more accurate identification of bitter peptides. In the proposed iBitter-Fuse, we have integrated a variety of feature encoding schemes to provide sufficient information from different aspects, namely compositional information and physicochemical properties. To enhance the predictive performance, a customized genetic algorithm utilizing self-assessment-report (GA-SAR) was employed to identify informative features, and the optimal ones were then input into a support vector machine (SVM)-based classifier to develop the final model (iBitter-Fuse). Benchmarking experiments based on both 10-fold cross-validation and independent tests indicated that iBitter-Fuse achieved more accurate performance than state-of-the-art methods. To facilitate the high-throughput identification of bitter peptides, the iBitter-Fuse web server was established and made freely available online. It is anticipated that iBitter-Fuse will be a useful tool for aiding the discovery and de novo design of bitter peptides.
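As an illustration of what a compositional feature encoding looks like, the sketch below computes amino acid composition (AAC), a 20-dimensional vector of residue frequencies. The abstract does not list the encodings iBitter-Fuse actually uses, so AAC is shown here only as one common example, and the function name is hypothetical.

```python
from collections import Counter

# The 20 standard amino acids (one-letter codes), which fix the feature order.
AMINO_ACIDS = "ACDEFGHIKLMNPQRSTVWY"


def amino_acid_composition(sequence):
    """Return the fraction of each standard amino acid in the peptide.

    Non-standard characters are ignored, so the fractions sum to 1 over
    the standard residues that are present.
    """
    counts = Counter(sequence.upper())
    length = sum(counts[aa] for aa in AMINO_ACIDS)
    if length == 0:
        raise ValueError("sequence contains no standard amino acids")
    return [counts[aa] / length for aa in AMINO_ACIDS]
```

The resulting fixed-length vector can be concatenated with other descriptors before feature selection, regardless of the peptide's length.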
|
91
|
Guo W, Liu X, Ma Y, Zhang R. iRspot-DCC: Recombination hot/cold spots identification based on dinucleotide-based correlation coefficient and convolutional neural network. JOURNAL OF INTELLIGENT & FUZZY SYSTEMS 2021. [DOI: 10.3233/jifs-210213] [Citation(s) in RCA: 1] [Impact Index Per Article: 0.3] [Indexed: 12/29/2022]
Abstract
The correct identification of gene recombination cold/hot spots is of great significance for studying meiotic recombination and genetic evolution. However, most existing recombination spot recognition methods ignore the global sequence information hidden in the DNA sequence, resulting in low recognition accuracy. A computational predictor called iRspot-DCC is proposed in this paper to improve the accuracy of cold/hot spot identification. In this approach, we propose a feature extraction method based on dinucleotide correlation coefficients that focuses on extracting potential global DNA sequence information. Then, 234 representative feature vectors are selected by SVM weight calculation. Finally, a convolutional neural network with better performance than SVM is selected as the classifier. The experimental results of a 5-fold cross-validation test on two standard benchmark datasets showed that the prediction accuracy of our recognition method reached 95.11% and the Matthews correlation coefficient (MCC) reached 90.04%, outperforming most other methods. Therefore, iRspot-DCC is a high-precision cold/hot spot identification method for gene recombination that effectively extracts potential global sequence information from DNA sequences.
Affiliation(s)
- Wang Guo
- Chongqing Key Laboratory of Complex Systems and Bionic Control, Chongqing University of Posts and Telecommunications, Chongqing, China
- Xingmou Liu
- Chongqing Key Laboratory of Complex Systems and Bionic Control, Chongqing University of Posts and Telecommunications, Chongqing, China
- You Ma
- Chongqing Key Laboratory of Complex Systems and Bionic Control, Chongqing University of Posts and Telecommunications, Chongqing, China
- Rongjie Zhang
- Chongqing Key Laboratory of Complex Systems and Bionic Control, Chongqing University of Posts and Telecommunications, Chongqing, China
|
92
|
Charoenkwan P, Anuwongcharoen N, Nantasenamat C, Hasan MM, Shoombuatong W. In Silico Approaches for the Prediction and Analysis of Antiviral Peptides: A Review. Curr Pharm Des 2021; 27:2180-2188. [PMID: 33138759 DOI: 10.2174/1381612826666201102105827] [Citation(s) in RCA: 15] [Impact Index Per Article: 5.0] [Received: 05/10/2020] [Accepted: 08/20/2020] [Indexed: 11/22/2022]
Abstract
In light of the growing resistance toward current antiviral drugs, efforts to discover novel and effective antiviral therapeutic agents remain a pressing scientific effort. Antiviral peptides (AVPs) represent promising therapeutic agents due to their extraordinary advantages in terms of potency, efficacy and pharmacokinetic properties. The growing volume of newly discovered peptide sequences in the post-genomic era requires computational approaches for timely and accurate identification of AVPs. Machine learning (ML) methods such as random forest and support vector machine represent robust learning algorithms that are instrumental in successful peptide-based drug discovery. Therefore, this review summarizes the current state-of-the-art application of ML methods for identifying AVPs directly from the sequence information. We compare the efficiency of these methods in terms of the underlying characteristics of the dataset used along with feature encoding methods, ML algorithms, cross-validation methods and prediction performance. Finally, guidelines for the development of robust AVP models are also discussed. It is anticipated that this review will serve as a useful guide for the design and development of robust AVP and related therapeutic peptide predictors in the future.
Affiliation(s)
- Phasit Charoenkwan
- Modern Management and Information Technology, College of Arts, Media and Technology, Chiang Mai University, Chiang Mai 50200, Thailand
- Nuttapat Anuwongcharoen
- Center of Data Mining and Biomedical Informatics, Faculty of Medical Technology, Mahidol University, Bangkok, 10700, Thailand
- Chanin Nantasenamat
- Center of Data Mining and Biomedical Informatics, Faculty of Medical Technology, Mahidol University, Bangkok, 10700, Thailand
- Md Mehedi Hasan
- Department of Bioscience and Bioinformatics, Kyushu Institute of Technology, 680-4 Kawazu, Iizuka, Fukuoka 820-8502, Japan
- Watshara Shoombuatong
- Center of Data Mining and Biomedical Informatics, Faculty of Medical Technology, Mahidol University, Bangkok, 10700, Thailand
|
93
|
Nasiri F, Atanaki FF, Behrouzi S, Kavousi K, Bagheri M. CpACpP: In Silico Cell-Penetrating Anticancer Peptide Prediction Using a Novel Bioinformatics Framework. ACS OMEGA 2021; 6:19846-19859. [PMID: 34368571 PMCID: PMC8340416 DOI: 10.1021/acsomega.1c02569] [Citation(s) in RCA: 10] [Impact Index Per Article: 3.3] [Received: 05/17/2021] [Accepted: 07/13/2021] [Indexed: 05/12/2023]
Abstract
Cell-penetrating anticancer peptides (Cp-ACPs) are considered promising candidates in solid tumor and hematologic cancer therapies. Current approaches for the design and discovery of Cp-ACPs rely on expensive high-throughput screening, which often gives rise to multiple obstacles, including instrumentation adaptation and experimental handling. The application of machine learning (ML) tools developed for peptide activity prediction is of growing interest. In this study, we applied random forest (RF)-, support vector machine (SVM)-, and eXtreme gradient boosting (XGBoost)-based algorithms to predict active Cp-ACPs using an experimentally validated data set. The model, CpACpP, was developed on the basis of two independent subpredictors, one for cell-penetrating peptides (CPPs) and one for anticancer peptides (ACPs). Various compositional and physicochemical features were combined or selected using the multilayered recursive feature elimination (RFE) method for both data sets. Our results showed that the ACP subclassifiers obtain a mean performance accuracy (ACC) of 0.98 with an area under curve (AUC) ≈ 0.98 vis-à-vis the CPP predictors displaying relevant values of ∼0.94 and ∼0.95 via the hybrid-based features and independent data sets, respectively. Also, evaluation of Cp-ACP prediction gave accuracies of ∼0.79 and 0.89 on a series of independent sequences when applying our CPP and ACP classifiers, respectively, so that our predictors perform better than the earlier reported ACPred, mACPpred, MLCPP, and CPPred-RF. The described consensus-based fusion method additionally reached an AUC of 0.94 for the prediction of Cp-ACPs (http://cbb1.ut.ac.ir/CpACpP/Index).
Affiliation(s)
- Farid Nasiri
- Peptide Chemistry Laboratory, Department of Biochemistry, Institute of Biochemistry and Biophysics (IBB), University of Tehran, Tehran 14176-14335, Iran
- Fereshteh Fallah Atanaki
- Laboratory of Complex Biological Systems and Bioinformatics (CBB), Department of Bioinformatics, Institute of Biochemistry and Biophysics (IBB), University of Tehran, Tehran 14176-14411, Iran
- Saman Behrouzi
- Laboratory of Complex Biological Systems and Bioinformatics (CBB), Department of Bioinformatics, Institute of Biochemistry and Biophysics (IBB), University of Tehran, Tehran 14176-14411, Iran
- Kaveh Kavousi
- Laboratory of Complex Biological Systems and Bioinformatics (CBB), Department of Bioinformatics, Institute of Biochemistry and Biophysics (IBB), University of Tehran, Tehran 14176-14411, Iran
- Mojtaba Bagheri
- Peptide Chemistry Laboratory, Department of Biochemistry, Institute of Biochemistry and Biophysics (IBB), University of Tehran, Tehran 14176-14335, Iran
|
94
|
Cao R, Wang M, Bin Y, Zheng C. DLFF-ACP: prediction of ACPs based on deep learning and multi-view features fusion. PeerJ 2021; 9:e11906. [PMID: 34414035 PMCID: PMC8344685 DOI: 10.7717/peerj.11906] [Citation(s) in RCA: 4] [Impact Index Per Article: 1.3] [Received: 02/04/2021] [Accepted: 07/14/2021] [Indexed: 01/10/2023] Open
Abstract
An emerging type of therapeutic agent, anticancer peptides (ACPs), has attracted attention because of its lower risk of toxic side effects. However, the process of identifying ACPs using experimental methods is both time-consuming and laborious. In this study, we developed a new and efficient algorithm that predicts ACPs by fusing multi-view features based on a dual-channel deep neural network ensemble model. In the model, one channel used a convolutional neural network (CNN) to automatically extract the potential spatial features of a sequence. Another channel was used to process and extract more effective features from handcrafted features. Additionally, an effective feature fusion method was explored for the mutual fusion of different features. Finally, we adopted the neural network to predict ACPs based on the fusion features. The performance comparisons across the single and fusion features showed that the fusion of multi-view features could effectively improve the model's predictive ability. Among these, the fusion of the features extracted by the CNN and the composition of k-spaced amino acid group pairs achieved the best performance. To further validate the performance of our model, we compared it with other existing methods using two independent test sets. The results showed that our model's area under curve was 0.90, which was higher than that of the other existing methods on the first test set and higher than most of the other existing methods on the second test set. The source code and datasets are available at https://github.com/wame-ng/DLFF-ACP.
Affiliation(s)
- Ruifen Cao
- Key Laboratory of Intelligent Computing and Signal Processing of Ministry of Education, School of Computer Science and Technology, Anhui University, Hefei, Anhui, China
- Engineering Research Center of Big Data Application in Private Health Medicine, Fujian Province University, Putian, Fujian, China
- Meng Wang
- Key Laboratory of Intelligent Computing and Signal Processing of Ministry of Education, School of Computer Science and Technology, Anhui University, Hefei, Anhui, China
- Yannan Bin
- Key Laboratory of Intelligent Computing and Signal Processing of Ministry of Education, School of Computer Science and Technology, Anhui University, Hefei, Anhui, China
- Institutes of Physical Science and Information Technology, Anhui University, Hefei, Anhui, China
- Chunhou Zheng
- Key Laboratory of Intelligent Computing and Signal Processing of Ministry of Education, School of Computer Science and Technology, Anhui University, Hefei, Anhui, China
- Engineering Research Center of Big Data Application in Private Health Medicine, Fujian Province University, Putian, Fujian, China
|
95
|
Chen J, Cheong HH, Siu SWI. xDeep-AcPEP: Deep Learning Method for Anticancer Peptide Activity Prediction Based on Convolutional Neural Network and Multitask Learning. J Chem Inf Model 2021; 61:3789-3803. [PMID: 34327990 DOI: 10.1021/acs.jcim.1c00181] [Citation(s) in RCA: 36] [Impact Index Per Article: 12.0] [Indexed: 12/30/2022]
Abstract
Cancer is one of the leading causes of death worldwide. Conventional cancer treatment relies on radiotherapy and chemotherapy, but both methods bring severe side effects to patients, as these therapies not only attack cancer cells but also damage normal cells. Anticancer peptides (ACPs) are a promising alternative as therapeutic agents that are efficient and selective against tumor cells. Here, we propose a deep learning method based on convolutional neural networks to predict biological activity (EC50, LC50, IC50, and LD50) against six tumor cells, including breast, colon, cervix, lung, skin, and prostate. We show that models derived with multitask learning achieve better performance than conventional single-task models. In repeated 5-fold cross validation using the CancerPPD data set, the best models with the applicability domain defined obtain an average mean squared error of 0.1758, Pearson's correlation coefficient of 0.8086, and Kendall's correlation coefficient of 0.6156. As a step toward model interpretability, we infer the contribution of each residue in the sequence to the predicted activity by means of feature importance weights derived from the convolutional layers of the model. The present method, referred to as xDeep-AcPEP, will help to identify effective ACPs in rational peptide design for therapeutic purposes. The data, script files for reproducing the experiments, and the final prediction models can be downloaded from http://github.com/chen709847237/xDeep-AcPEP. The web server to directly access this prediction method is at https://app.cbbio.online/acpep/home.
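The three regression metrics reported in this abstract (mean squared error, Pearson's correlation coefficient, and Kendall's correlation coefficient) can be illustrated with a short pure-Python sketch. This is only an illustration of the standard definitions, not the authors' evaluation code: the function names and the toy values are hypothetical, and the Kendall variant shown is the simple tau-a form, which may differ from the tie-corrected variant used in the paper.

```python
import math

def mse(y_true, y_pred):
    """Mean squared error between observed and predicted values."""
    return sum((t - p) ** 2 for t, p in zip(y_true, y_pred)) / len(y_true)

def pearson(x, y):
    """Pearson's correlation coefficient (assumes neither input is constant)."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sx = math.sqrt(sum((a - mx) ** 2 for a in x))
    sy = math.sqrt(sum((b - my) ** 2 for b in y))
    return cov / (sx * sy)

def kendall_tau(x, y):
    """Kendall's tau-a: (concordant - discordant) / number of pairs.

    Tied pairs count as neither concordant nor discordant (an assumption;
    tie-corrected variants such as tau-b give different values with ties).
    """
    n = len(x)
    score = 0
    for i in range(n):
        for j in range(i + 1, n):
            s = (x[i] - x[j]) * (y[i] - y[j])
            score += (s > 0) - (s < 0)
    return score / (n * (n - 1) / 2)

# Hypothetical activity values, for illustration only.
observed = [1.0, 2.0, 3.0, 4.0]
predicted = [1.5, 1.0, 3.5, 4.0]
print(mse(observed, predicted), pearson(observed, predicted), kendall_tau(observed, predicted))
```

In practice, library routines for these statistics would normally be preferred over hand-written loops, particularly for large data sets or data with ties.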
Affiliation(s)
- Jiarui Chen
- Department of Computer and Information Science, University of Macau, Avenida da Universidade, Taipa, Macau 999078, China
- Hong Hin Cheong
- Department of Computer and Information Science, University of Macau, Avenida da Universidade, Taipa, Macau 999078, China
- Shirley W I Siu
- Department of Computer and Information Science, University of Macau, Avenida da Universidade, Taipa, Macau 999078, China
- School of Pharmaceutical Sciences, Universiti Sains Malaysia, 11800 USM, Penang, Malaysia
|
96
|
He W, Wang Y, Cui L, Su R, Wei L. Learning embedding features based on multi-sense-scaled attention architecture to improve the predictive performance of anticancer peptides. Bioinformatics 2021; 37:4684-4693. [PMID: 34323948 DOI: 10.1093/bioinformatics/btab560] [Citation(s) in RCA: 24] [Impact Index Per Article: 8.0] [Received: 05/08/2021] [Revised: 07/03/2021] [Accepted: 07/28/2021] [Indexed: 01/10/2023] Open
Abstract
MOTIVATION: Anticancer peptides (ACPs) have recently emerged as effective anticancer drugs in cancer therapy. Machine-learning-based predictors have been developed to identify ACPs and achieve satisfactory performance. However, existing methods suffer from experience-based feature engineering, which not only restricts the representation ability of the models to a certain extent but also lacks adaptivity for different data, limiting the further improvement of the predictive performance and impacting the robustness of the predictive models. To alleviate the above problems, we propose a novel deep-learning-based predictor named ACPred-LAF, in which we propose a novel multi-sense and multi-scaled embedding algorithm to automatically learn and extract context sequential characteristics of ACPs. RESULTS: Through the feature comparative analysis, we demonstrate that our learnable and self-adaptive embedding features are better than hand-crafted features in capturing discriminative information, which can effectively improve performance for ACP prediction. In addition, benchmarking comparison results demonstrate that our ACPred-LAF outperforms the state-of-the-art methods both on existing benchmark datasets and our newly constructed dataset. Furthermore, we also validate the robustness of the model via the data interference experiment. To avoid potential evaluation bias, here we construct a new ACP benchmark dataset named ACP-Mixed by integrating existing datasets. We expect our newly constructed dataset to be a gold-standard benchmark dataset in this field. To facilitate the use of our model, we develop a web server as the implementation of ACPred-LAF. AVAILABILITY: Our proposed ACPred-LAF and the newly constructed benchmark dataset ACP-Mixed are open source collaborative initiatives available in the GitHub repository (https://github.com/TearsWaiting/ACPred-LAF). In addition, a web server implementing ACPred-LAF can be accessed at http://server.malab.cn/ACPred-LAF. SUPPLEMENTARY INFORMATION: Supplementary data are available at Bioinformatics online.
Affiliation(s)
- Wenjia He
- School of Software, Shandong University, Jinan, China
- Joint SDU-NTU Centre for Artificial Intelligence Research (C-FAIR), Shandong University, Jinan, China
- Yu Wang
- School of Software, Shandong University, Jinan, China
- Joint SDU-NTU Centre for Artificial Intelligence Research (C-FAIR), Shandong University, Jinan, China
- Lizhen Cui
- School of Software, Shandong University, Jinan, China
- Joint SDU-NTU Centre for Artificial Intelligence Research (C-FAIR), Shandong University, Jinan, China
- Ran Su
- College of Intelligence and Computing, Tianjin University, Tianjin, China
- Leyi Wei
- School of Software, Shandong University, Jinan, China
- Joint SDU-NTU Centre for Artificial Intelligence Research (C-FAIR), Shandong University, Jinan, China
|
97
|
Liang X, Li F, Chen J, Li J, Wu H, Li S, Song J, Liu Q. Large-scale comparative review and assessment of computational methods for anti-cancer peptide identification. Brief Bioinform 2021; 22:bbaa312. [PMID: 33316035 PMCID: PMC8294543 DOI: 10.1093/bib/bbaa312] [Citation(s) in RCA: 44] [Impact Index Per Article: 14.7] [Received: 08/18/2020] [Revised: 09/30/2020] [Accepted: 08/25/2020] [Indexed: 12/13/2022] Open
Abstract
Anti-cancer peptides (ACPs) are known as potential therapeutics for cancer. Due to their unique ability to target cancer cells without affecting healthy cells directly, they have been extensively studied. Many peptide-based drugs are currently being evaluated in preclinical and clinical trials. Accurate identification of ACPs has received considerable attention in recent years; as such, a number of machine learning-based methods for in silico identification of ACPs have been developed. These methods promote research on the mechanism of ACP therapeutics against cancer to some extent. These methods differ vastly in terms of their training/testing datasets, machine learning algorithms, feature encoding schemes, feature selection methods and evaluation strategies. Therefore, it is desirable to summarize the advantages and disadvantages of the existing methods and to provide useful insights and suggestions for the development and improvement of novel computational tools to characterize and identify ACPs. With this in mind, we first comprehensively investigate 16 state-of-the-art predictors for ACPs in terms of their core algorithms, feature encoding schemes, performance evaluation metrics and webserver/software usability. Then, a comprehensive performance assessment is conducted to evaluate the robustness and scalability of the existing predictors using a well-prepared benchmark dataset. We provide potential strategies for improving model performance. Moreover, we propose a novel ensemble learning framework, termed ACPredStackL, for the accurate identification of ACPs. ACPredStackL is developed based on the stacking ensemble strategy combined with SVM, Naïve Bayesian, lightGBM and KNN. Empirical benchmarking experiments against the state-of-the-art methods demonstrate that ACPredStackL achieves a comparative performance for predicting ACPs. The webserver and source code of ACPredStackL are freely available at http://bigdata.biocie.cn/ACPredStackL/ and https://github.com/liangxiaoq/ACPredStackL, respectively.
Affiliation(s)
- Xiao Liang
- College of Information Engineering, Northwest A&F University, Yangling, 712100, China
- Shaanxi Key Laboratory of Agricultural Information Perception and Intelligent Service, Yangling, Shaanxi 712100, China
- Fuyi Li
- Monash Biomedicine Discovery Institute and Department of Biochemistry and Molecular Biology, Monash University, Melbourne, VIC 3800, Australia
- Monash Centre for Data Science, Monash University, Melbourne, VIC 3800, Australia
- Department of Microbiology and Immunology, Peter Doherty Institute for Infection and Immunity, University of Melbourne, Melbourne, Victoria, Australia
- Jinxiang Chen
- College of Information Engineering, Northwest A&F University, Yangling, 712100, China
- Junlong Li
- College of Information Engineering, Northwest A&F University, Yangling, 712100, China
- Hao Wu
- College of Information Engineering, Northwest A&F University, Yangling, 712100, China
- Shuqin Li
- College of Information Engineering, Northwest A&F University, Yangling, 712100, China
- Shaanxi Key Laboratory of Agricultural Information Perception and Intelligent Service, Yangling, Shaanxi 712100, China
- Jiangning Song
- Monash Biomedicine Discovery Institute and Department of Biochemistry and Molecular Biology, Monash University, Melbourne, VIC 3800, Australia
- Monash Centre for Data Science, Monash University, Melbourne, VIC 3800, Australia
- ARC Centre of Excellence in Advanced Molecular Imaging, Monash University, Melbourne, VIC 3800, Australia
- Quanzhong Liu
- College of Information Engineering, Northwest A&F University, Yangling, 712100, China
- Shaanxi Key Laboratory of Agricultural Information Perception and Intelligent Service, Yangling, Shaanxi 712100, China
|
98
|
Chen XG, Zhang W, Yang X, Li C, Chen H. ACP-DA: Improving the Prediction of Anticancer Peptides Using Data Augmentation. Front Genet 2021; 12:698477. [PMID: 34276801 PMCID: PMC8279753 DOI: 10.3389/fgene.2021.698477] [Citation(s) in RCA: 17] [Impact Index Per Article: 5.7] [Received: 04/21/2021] [Accepted: 06/07/2021] [Indexed: 12/09/2022] Open
Abstract
Anticancer peptides (ACPs) have provided a promising perspective for cancer treatment, and the prediction of ACPs is very important for the discovery of new cancer treatment drugs. It is time consuming and expensive to use experimental methods to identify ACPs, so computational methods for ACP identification are urgently needed. There have been many effective computational methods, especially machine learning-based methods, proposed for such predictions. Most of the current machine learning methods try to find suitable features or design effective feature learning techniques to accurately represent ACPs. However, the performance of these methods can be further improved for cases with insufficient numbers of samples. In this article, we propose an ACP prediction model called ACP-DA (Data Augmentation), which uses data augmentation for insufficient samples to improve the prediction performance. In our method, to better exploit the information of peptide sequences, peptide sequences are represented by integrating binary profile features and AAindex features, and then the samples in the training set are augmented in the feature space. After data augmentation, the samples are used to train the machine learning model, which is used to predict ACPs. The performance of ACP-DA exceeds that of existing methods, and ACP-DA achieves better performance in the prediction of ACPs compared with a method without data augmentation. The proposed method is available at http://github.com/chenxgscuec/ACPDA.
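The pipeline described in this abstract (encode each peptide, then augment the training samples in feature space) can be sketched as follows. This is a minimal illustration under stated assumptions, not the ACP-DA implementation: the abstract does not give the number of encoded residues, the AAindex properties used, or the perturbation scheme, so the prefix length `k`, the Gaussian noise, and the function names `binary_profile` and `augment` are all hypothetical. AAindex features are omitted.

```python
import random

AMINO_ACIDS = "ACDEFGHIKLMNPQRSTVWY"  # the 20 standard residues

def binary_profile(sequence, k=5):
    """One-hot encode the first k residues of a peptide.

    Positions past the end of the sequence, and non-standard residues,
    are left as all-zero rows. The choice k=5 is an assumption.
    """
    features = []
    for i in range(k):
        row = [0.0] * len(AMINO_ACIDS)
        if i < len(sequence):
            idx = AMINO_ACIDS.find(sequence[i])
            if idx >= 0:
                row[idx] = 1.0
        features.extend(row)
    return features

def augment(vectors, copies=2, scale=0.05, seed=0):
    """Return the original vectors followed by noisy copies of each one.

    Adding Gaussian noise is an assumed perturbation scheme, chosen only
    to show what augmenting "in the feature space" can look like.
    """
    rng = random.Random(seed)
    augmented = [list(v) for v in vectors]
    for v in vectors:
        for _ in range(copies):
            augmented.append([x + rng.gauss(0.0, scale) for x in v])
    return augmented

# Hypothetical training peptides, for illustration only.
train = [binary_profile("FLPIVGKLLS"), binary_profile("GLFDIVKKVV")]
train_augmented = augment(train, copies=2)
print(len(train), len(train_augmented), len(train_augmented[0]))
```

Only the training set is augmented here, in line with the abstract; applying the same perturbation to test samples would bias the evaluation.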
Affiliation(s)
- Xian-Gan Chen
- School of Biomedical Engineering, South-Central University for Nationalities, Wuhan, China
- Hubei Key Laboratory of Medical Information Analysis and Tumor Diagnosis & Treatment, South-Central University for Nationalities, Wuhan, China
- Key Laboratory of Cognitive Science (South-Central University for Nationalities), State Ethnic Affairs Commission, Wuhan, China
- Wen Zhang
- College of Informatics, Huazhong Agricultural University, Wuhan, China
- Hubei Engineering Technology Research Center of Agricultural Big Data, Wuhan, China
- Xiaofei Yang
- School of Biomedical Engineering, South-Central University for Nationalities, Wuhan, China
- Hubei Key Laboratory of Medical Information Analysis and Tumor Diagnosis & Treatment, South-Central University for Nationalities, Wuhan, China
- Key Laboratory of Cognitive Science (South-Central University for Nationalities), State Ethnic Affairs Commission, Wuhan, China
- Chenhong Li
- School of Biomedical Engineering, South-Central University for Nationalities, Wuhan, China
- Hubei Key Laboratory of Medical Information Analysis and Tumor Diagnosis & Treatment, South-Central University for Nationalities, Wuhan, China
- Key Laboratory of Cognitive Science (South-Central University for Nationalities), State Ethnic Affairs Commission, Wuhan, China
- Hengling Chen
- School of Biomedical Engineering, South-Central University for Nationalities, Wuhan, China
- Hubei Key Laboratory of Medical Information Analysis and Tumor Diagnosis & Treatment, South-Central University for Nationalities, Wuhan, China
- Key Laboratory of Cognitive Science (South-Central University for Nationalities), State Ethnic Affairs Commission, Wuhan, China
|
99
|
Huang KY, Tseng YJ, Kao HJ, Chen CH, Yang HH, Weng SL. Identification of subtypes of anticancer peptides based on sequential features and physicochemical properties. Sci Rep 2021; 11:13594. [PMID: 34193950 PMCID: PMC8245499 DOI: 10.1038/s41598-021-93124-9] [Citation(s) in RCA: 14] [Impact Index Per Article: 4.7] [Received: 11/10/2020] [Accepted: 06/08/2021] [Indexed: 11/25/2022] Open
Abstract
Anticancer peptides (ACPs) are a kind of bioactive peptide that could be used as a novel type of anticancer drug with several advantages over chemistry-based drugs, including high specificity, strong tumor penetration capacity, and low toxicity to normal cells. As the number of experimentally verified bioactive peptides has increased significantly, various in silico approaches are imperative for investigating the characteristics of ACPs. However, methods for investigating the differences in physicochemical properties of ACPs are lacking. In this study, we compared the N- and C-terminal amino acid composition of each peptide; three major subtypes of ACPs are defined based on the distribution of positively charged residues. For the first time, we were motivated to develop a two-step machine learning model for identification of the subtypes of ACPs, which classifies the input data into the corresponding group before applying the classifier. Further, to improve the predictive power, hybrid feature sets were considered for prediction. Evaluation by five-fold cross-validation showed that the two-step model trained with sequence-based features and physicochemical properties was most effective in discriminating between ACPs and non-ACPs. The two-step model trained with the hybrid features performed well, with a sensitivity of 86.75%, a specificity of 85.75%, an accuracy of 86.08%, and a Matthews correlation coefficient (MCC) of 0.703. Furthermore, the model also performs consistently on the independent testing set, with a sensitivity of 77.6%, a specificity of 94.74%, an accuracy of 88.99% and an MCC of 0.75. Finally, the two-step model has been implemented as a web-based tool, namely iDACP, which is now freely available at http://mer.hc.mmh.org.tw/iDACP/.
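The four performance figures quoted in this abstract (sensitivity, specificity, accuracy, and Matthews correlation coefficient) all derive from the same binary confusion matrix. The sketch below shows the standard definitions; the function name `binary_metrics` and the confusion-matrix counts are hypothetical and are not taken from the paper.

```python
import math

def binary_metrics(tp, fp, tn, fn):
    """Compute standard binary classification metrics from confusion counts.

    Assumes at least one positive and one negative sample, so that
    sensitivity and specificity are defined.
    """
    sensitivity = tp / (tp + fn)                 # true positive rate
    specificity = tn / (tn + fp)                 # true negative rate
    accuracy = (tp + tn) / (tp + fp + tn + fn)
    denom = math.sqrt((tp + fp) * (tp + fn) * (tn + fp) * (tn + fn))
    # MCC is conventionally set to 0 when the denominator vanishes.
    mcc = (tp * tn - fp * fn) / denom if denom else 0.0
    return {
        "sensitivity": sensitivity,
        "specificity": specificity,
        "accuracy": accuracy,
        "mcc": mcc,
    }

# Hypothetical counts, for illustration only.
print(binary_metrics(tp=80, fp=10, tn=90, fn=20))
```

Unlike accuracy, the MCC uses all four cells of the confusion matrix, which is why it is commonly reported alongside sensitivity and specificity when the two classes are unbalanced.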
Affiliation(s)
- Kai-Yao Huang
- Department of Medical Research, Hsinchu Mackay Memorial Hospital, Hsinchu City, 300, Taiwan
- Department of Medicine, Mackay Medical College, New Taipei City, 252, Taiwan
- Yi-Jhan Tseng
- Department of Medical Research, Hsinchu Mackay Memorial Hospital, Hsinchu City, 300, Taiwan
- Hui-Ju Kao
- Department of Medical Research, Hsinchu Mackay Memorial Hospital, Hsinchu City, 300, Taiwan
- Chia-Hung Chen
- Department of Medical Research, Hsinchu Mackay Memorial Hospital, Hsinchu City, 300, Taiwan
- Hsiao-Hsiang Yang
- Department of Medical Research, Hsinchu Mackay Memorial Hospital, Hsinchu City, 300, Taiwan
- Shun-Long Weng
- Department of Medicine, Mackay Medical College, New Taipei City, 252, Taiwan
- Department of Obstetrics and Gynecology, Hsinchu Mackay Memorial Hospital, Hsinchu City, 300, Taiwan
- Mackay Junior College of Medicine, Medicine, Nursing and Management College, Taipei City, 112, Taiwan
|
100
|
Feng P, Feng L, Tang C. Comparison and Analysis of Computational Methods for Identifying N6-Methyladenosine Sites in Saccharomyces cerevisiae. Curr Pharm Des 2021; 27:1219-1229. [PMID: 33167827 DOI: 10.2174/1381612826666201109110703] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 05/09/2020] [Accepted: 07/20/2020] [Indexed: 11/22/2022]
Abstract
BACKGROUND: N6-methyladenosine (m6A) plays critical roles in a broad range of biological processes. Knowledge about the precise locations of m6A sites in the transcriptome is vital for deciphering its biological functions. Although experimental techniques have made substantial contributions to identifying m6A, they are still labor intensive and time consuming. As a complement to experimental methods, a series of computational approaches have been proposed in the past few years to identify m6A sites. METHODS: To help researchers select appropriate methods for identifying m6A sites, it is necessary to conduct a comprehensive review and comparison of existing methods. RESULTS: Since research works on m6A in Saccharomyces cerevisiae are relatively clear, in this review, we summarized recent progress in the computational prediction of m6A sites in S. cerevisiae and assessed the performance of existing computational methods. Finally, future directions for computationally identifying m6A sites are presented. CONCLUSION: Taken together, we anticipate that this review will serve as an important guide for computational analysis of m6A modifications.
Affiliation(s)
- Pengmian Feng
- School of Basic Medical Sciences, Chengdu University of Traditional Chinese Medicine, Chengdu 611730, China
- Lijing Feng
- School of Sciences, North China University of Science and Technology, Tangshan 063000, China
- Chaohui Tang
- School of Basic Medical Sciences, Chengdu University of Traditional Chinese Medicine, Chengdu 611730, China
|