1. Cui Z, Wang SG, He Y, Chen ZH, Zhang QH. DeepTPpred: A Deep Learning Approach With Matrix Factorization for Predicting Therapeutic Peptides by Integrating Length Information. IEEE J Biomed Health Inform 2023; 27:4611-4622. [PMID: 37368803 DOI: 10.1109/jbhi.2023.3290014] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 06/29/2023]
Abstract
The abuse of traditional antibiotics has led to increased resistance of bacteria and viruses. Efficient therapeutic peptide prediction is critical for peptide drug discovery. However, most existing methods make effective predictions for only one class of therapeutic peptides, and no current predictive method treats sequence length as a distinct feature of therapeutic peptides. In this article, we propose DeepTPpred, a novel deep learning approach with matrix factorization that predicts therapeutic peptides by integrating length information. The matrix factorization layer learns the latent features of the encoded sequence by first compressing and then restoring it. The sequence-length features of the therapeutic peptides are embedded together with the encoded amino acid sequences. These latent features are then fed into neural networks with a self-attention mechanism, which learn to predict therapeutic peptides automatically. On eight therapeutic peptide datasets, DeepTPpred achieved excellent prediction results. Based on these datasets, we first merged the eight datasets into a full integrated therapeutic peptide dataset. Then, we obtained two functional integration datasets based on the functional similarity of the peptides. Finally, we conducted experiments on the latest versions of the ACP and CPP datasets. Overall, the experimental results show that our work is effective for the identification of therapeutic peptides.
2. Guo Y, Yan K, Lv H, Liu B. PreTP-EL: prediction of therapeutic peptides based on ensemble learning. Brief Bioinform 2021; 22:6359002. [PMID: 34459488 DOI: 10.1093/bib/bbab358] [Citation(s) in RCA: 22] [Impact Index Per Article: 7.3] [Received: 06/15/2021] [Revised: 07/27/2021] [Accepted: 08/11/2021] [Indexed: 01/02/2023] Open
Abstract
Therapeutic peptides are important for understanding the correlation between peptides and their therapeutic and diagnostic potential. They can be further divided into types by therapeutic function, and each type has different characteristics. Although some computational approaches have been proposed to predict different types of therapeutic peptides, they fail to predict all types accurately. In this study, we propose a predictor called PreTP-EL, which uses ensemble learning to fuse different features and machine learning techniques in order to capture the distinct characteristics of various therapeutic peptides. Experimental results showed that PreTP-EL outperformed other competing methods. Availability and implementation: A user-friendly web server for the PreTP-EL predictor is available at http://bliulab.net/PreTP-EL.
Affiliation(s)
- Yichen Guo
  - School of Computer Science and Technology, Beijing Institute of Technology, Beijing, China
- Ke Yan
  - School of Computer Science and Technology, Beijing Institute of Technology, Beijing, China
- Hongwu Lv
  - School of Computer Science and Technology, Beijing Institute of Technology, Beijing, China
- Bin Liu
  - School of Computer Science and Technology, Beijing Institute of Technology, Beijing, China
3. Toropova AP, Toropov AA. Application of the Monte Carlo Method for the Prediction of Behavior of Peptides. Curr Protein Pept Sci 2019; 20:1151-1157. [DOI: 10.2174/1389203720666190123163907] [Citation(s) in RCA: 4] [Impact Index Per Article: 0.8] [Received: 10/23/2018] [Revised: 12/17/2018] [Accepted: 12/20/2018] [Indexed: 12/26/2022]
Abstract
Prediction of the physicochemical and biochemical behavior of peptides is an important and attractive task of the modern natural sciences, since these substances have a key role in life processes. The Monte Carlo technique is a possible way to solve this task. The Monte Carlo method is a tool with different applications to the study of peptides: (i) analysis of the 3D configurations (conformers); (ii) establishment of quantitative structure-property/activity relationships (QSPRs/QSARs); and (iii) development of databases on the biopolymers. Current ideas related to application of the Monte Carlo technique for studying peptides and biopolymers are discussed in this review.
Affiliation(s)
- Alla P. Toropova
  - Istituto di Ricerche Farmacologiche Mario Negri IRCCS, Via La Masa 19, 20156 Milano, Italy
- Andrey A. Toropov
  - Istituto di Ricerche Farmacologiche Mario Negri IRCCS, Via La Masa 19, 20156 Milano, Italy
4. Munteanu CR, Gestal M, Martínez-Acevedo YG, Pedreira N, Pazos A, Dorado J. Improvement of Epitope Prediction Using Peptide Sequence Descriptors and Machine Learning. Int J Mol Sci 2019; 20:ijms20184362. [PMID: 31491969 PMCID: PMC6770149 DOI: 10.3390/ijms20184362] [Citation(s) in RCA: 4] [Impact Index Per Article: 0.8] [Received: 07/31/2019] [Revised: 08/26/2019] [Accepted: 08/30/2019] [Indexed: 01/27/2023] Open
Abstract
In this work, we improved a previous model used for the prediction of proteomes as new B-cell epitopes in vaccine design. The predicted epitope activity of a queried peptide is based on its sequence and on a known reference epitope sequence under specific experimental conditions. The peptide sequences were transformed into molecular descriptors of sequence recurrence networks and were mixed under experimental conditions. The new models were generated using 709,100 instances of pair descriptors for query and reference peptide sequences. Using perturbations of the initial descriptors under sequence or assay conditions, 10 transformed features were used as inputs for seven machine learning methods. The best model was obtained with random forest classifiers, with an area under the receiver operating characteristic curve (AUROC) of 0.981 ± 0.0005 for the external validation series (five-fold cross-validation). The database included information about 83,683 peptide sequences, 1448 epitope organisms, 323 host organisms, 15 types of in vivo processes, 28 experimental techniques, and 505 adjuvant additives. The current model could improve the in silico predictions of epitopes for vaccine design. The script and results are available as a free repository.
Affiliation(s)
- Cristian R Munteanu
  - RNASA-IMEDIR, Computer Science Faculty, University of A Coruna, 15071 A Coruña, Spain
  - Biomedical Research Institute of A Coruña (INIBIC), University Hospital Complex of A Coruña (CHUAC), 15006 A Coruña, Spain
  - Centro de Investigación en Tecnologías de la Información y las Comunicaciones (CITIC), Campus de Elviña s/n, 15071 A Coruña, Spain
- Marcos Gestal
  - RNASA-IMEDIR, Computer Science Faculty, University of A Coruna, 15071 A Coruña, Spain
  - Centro de Investigación en Tecnologías de la Información y las Comunicaciones (CITIC), Campus de Elviña s/n, 15071 A Coruña, Spain
- Yunuen G Martínez-Acevedo
  - RNASA-IMEDIR, Computer Science Faculty, University of A Coruna, 15071 A Coruña, Spain
  - Unidad Profesional Interdisciplinaria de Biotecnología, National Polytechnic Institute (IPN), Ticoman, 07340 Mexico City, Mexico
- Nieves Pedreira
  - RNASA-IMEDIR, Computer Science Faculty, University of A Coruna, 15071 A Coruña, Spain
- Alejandro Pazos
  - RNASA-IMEDIR, Computer Science Faculty, University of A Coruna, 15071 A Coruña, Spain
  - Biomedical Research Institute of A Coruña (INIBIC), University Hospital Complex of A Coruña (CHUAC), 15006 A Coruña, Spain
- Julián Dorado
  - RNASA-IMEDIR, Computer Science Faculty, University of A Coruna, 15071 A Coruña, Spain
  - Centro de Investigación en Tecnologías de la Información y las Comunicaciones (CITIC), Campus de Elviña s/n, 15071 A Coruña, Spain
5. From biomedicinal to in silico models and back to therapeutics: a review on the advancement of peptidic modeling. Future Med Chem 2019; 11:2313-2331. [PMID: 31581914 DOI: 10.4155/fmc-2018-0365] [Citation(s) in RCA: 2] [Impact Index Per Article: 0.4] [Indexed: 02/02/2023] Open
Abstract
Bioactive peptides participate in numerous metabolic functions of living organisms and have emerged as potential therapeutics for a diverse range of diseases. Although peptide design is not without challenges, substantial advances in in silico methodologies have expanded the scope of peptide-based drug design and discovery to an unprecedented extent. Framing the discussion as in silico modeling versus experimental validation, this review summarizes and discusses how different in silico techniques currently contribute to the design of peptide-based molecules. Published in silico results from 2014 to 2018 were selected and sorted into major methodological groups, allowing a cross-cutting analysis, giving an overview of the field, and underlining the increasing value of these methods in drug design.
6. Wei L, Zhou C, Su R, Zou Q. PEPred-Suite: improved and robust prediction of therapeutic peptides using adaptive feature representation learning. Bioinformatics 2019; 35:4272-4280. [DOI: 10.1093/bioinformatics/btz246] [Citation(s) in RCA: 80] [Impact Index Per Article: 16.0] [Received: 08/12/2018] [Revised: 01/28/2019] [Accepted: 04/11/2019] [Indexed: 11/13/2022] Open
Abstract
Motivation
Prediction of therapeutic peptides is critical for the discovery of novel and efficient peptide-based therapeutics. Computational methods, especially machine learning based methods, have been developed to address this need. However, most existing methods are peptide-specific; currently, there is no generic predictor for multiple peptide types. Moreover, it is still challenging to extract informative feature representations from primary sequences.
Results
In this study, we have developed PEPred-Suite, a bioinformatics tool for the generic prediction of therapeutic peptides. In PEPred-Suite, we introduce an adaptive feature representation strategy that can learn the most representative features for different peptide types. Specifically, we train diverse sequence-based feature descriptors, integrate the learnt class information into our features, and use a two-step feature optimization strategy based on the area under the receiver operating characteristic curve to extract the most discriminative features. Using the learnt representative features, we trained eight random forest models, one for each of eight types of functional peptides. Benchmarking results showed that, compared with existing predictors, PEPred-Suite achieves better and more robust performance across different peptides. As far as we know, PEPred-Suite is currently the first tool capable of predicting so many peptide types simultaneously. In addition, our work demonstrates that the learnt features can reliably predict different peptides.
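The idea of scoring features by the area under the ROC curve can be illustrated with a minimal sketch. This is not the PEPred-Suite procedure, whose two-step optimization is not detailed in the abstract; it only ranks each feature on its own by AUROC. The function names `auroc` and `rank_features_by_auroc` and the toy data are assumptions made for illustration.

```python
from typing import List, Sequence, Tuple


def auroc(scores: Sequence[float], labels: Sequence[int]) -> float:
    """AUROC as the fraction of (positive, negative) pairs in which the
    positive example scores higher; ties count as half a win."""
    pos = [s for s, y in zip(scores, labels) if y == 1]
    neg = [s for s, y in zip(scores, labels) if y == 0]
    if not pos or not neg:
        raise ValueError("need at least one positive and one negative example")
    wins = 0.0
    for p in pos:
        for n in neg:
            if p > n:
                wins += 1.0
            elif p == n:
                wins += 0.5
    return wins / (len(pos) * len(neg))


def rank_features_by_auroc(
    rows: Sequence[Sequence[float]], labels: Sequence[int]
) -> List[Tuple[int, float]]:
    """Return (feature_index, score) pairs, most discriminative first."""
    ranked = []
    for j in range(len(rows[0])):
        column = [row[j] for row in rows]
        a = auroc(column, labels)
        # An AUROC below 0.5 means the feature separates the classes
        # in the opposite direction, so score it by its distance from chance.
        ranked.append((j, max(a, 1.0 - a)))
    ranked.sort(key=lambda pair: pair[1], reverse=True)
    return ranked


# Toy data (made up): three features for six peptides, label 1 = therapeutic.
rows = [
    [0.9, 0.2, 0.5],
    [0.8, 0.7, 0.5],
    [0.7, 0.4, 0.5],
    [0.3, 0.6, 0.5],
    [0.2, 0.3, 0.5],
    [0.1, 0.5, 0.5],
]
labels = [1, 1, 1, 0, 0, 0]
ranking = rank_features_by_auroc(rows, labels)
print(ranking)
```

In this toy example the first feature separates the classes perfectly, the second only weakly, and the third (a constant) not at all, so a threshold on the score would keep the first feature and discard the third.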
Availability and implementation
The user-friendly webserver implementing the proposed PEPred-Suite is freely accessible at http://server.malab.cn/PEPred-Suite.
Supplementary information
Supplementary data are available at Bioinformatics online.
Affiliation(s)
- Leyi Wei
  - School of Computer Science and Technology, Tianjin University, Tianjin, China
- Chen Zhou
  - School of Computer Science and Technology, Tianjin University, Tianjin, China
- Ran Su
  - School of Computer Software, College of Intelligence and Computing, Tianjin University, Tianjin, China
- Quan Zou
  - Institute of Fundamental and Frontier Sciences, University of Electronic Science and Technology of China, Chengdu, China
7. Martínez-Arzate SG, Tenorio-Borroto E, Barbabosa Pliego A, Díaz-Albiter HM, Vázquez-Chagoyán JC, González-Díaz H. PTML Model for Proteome Mining of B-Cell Epitopes and Theoretical–Experimental Study of Bm86 Protein Sequences from Colima, Mexico. J Proteome Res 2017; 16:4093-4103. [DOI: 10.1021/acs.jproteome.7b00477] [Citation(s) in RCA: 36] [Impact Index Per Article: 5.1] [Indexed: 01/09/2023]
Affiliation(s)
- Saúl G. Martínez-Arzate
  - Molecular Biology Laboratory, CIESA, FMVZ, Autonomous University of The State of Mexico (UAEM), Toluca, 50200 Mexico State, Mexico
- Esvieta Tenorio-Borroto
  - Molecular Biology Laboratory, CIESA, FMVZ, Autonomous University of The State of Mexico (UAEM), Toluca, 50200 Mexico State, Mexico
- Alberto Barbabosa Pliego
  - Molecular Biology Laboratory, CIESA, FMVZ, Autonomous University of The State of Mexico (UAEM), Toluca, 50200 Mexico State, Mexico
- Héctor M. Díaz-Albiter
  - Laboratory of Biochemistry and Physiology of Insects, Oswaldo Cruz Institute, FIOCRUZ, 4365 Rio de Janeiro, Brazil
  - Wellcome Trust Centre for Molecular Parasitology, University of Glasgow, University Place, Glasgow G12 8TA, United Kingdom
- Juan C. Vázquez-Chagoyán
  - Molecular Biology Laboratory, CIESA, FMVZ, Autonomous University of The State of Mexico (UAEM), Toluca, 50200 Mexico State, Mexico
- Humbert González-Díaz
  - Department of Organic Chemistry II, University of the Basque Country (UPV/EHU), Bilbao, 48940 Biscay, Spain
  - IKERBASQUE, Basque Foundation for Science, Bilbao, 48011 Biscay, Spain
8. A study of the Immune Epitope Database for some fungi species using network topological indices. Mol Divers 2017; 21:713-718. [DOI: 10.1007/s11030-017-9749-4] [Citation(s) in RCA: 4] [Impact Index Per Article: 0.6] [Received: 12/19/2016] [Accepted: 05/09/2017] [Indexed: 10/19/2022]