1
Zhong G, Liu H, Deng L. Ensemble Machine Learning and Predicted Properties Promote Antimicrobial Peptide Identification. Interdiscip Sci 2024:10.1007/s12539-024-00640-z. [PMID: 38972032 DOI: 10.1007/s12539-024-00640-z] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 01/22/2024] [Revised: 06/04/2024] [Accepted: 06/07/2024] [Indexed: 07/08/2024]
Abstract
The emergence of antibiotic-resistant microbes raises a pressing demand for novel alternative treatments. One promising alternative is antimicrobial peptides (AMPs), a class of innate immunity mediators within the therapeutic peptide realm. AMPs offer salient advantages such as high specificity, cost-effective synthesis, and reduced toxicity. Although some computational methodologies have been proposed to identify potential AMPs with the rapid development of artificial intelligence techniques, there is still ample room to improve their performance. This study proposes a predictive framework that ensembles deep learning and statistical learning methods to screen peptides with antimicrobial activity. We integrate multiple LightGBM classifiers and convolutional neural networks that leverage various predicted sequential, structural and physicochemical properties, extracted from the residue sequences by diverse machine learning paradigms. Comparative experiments show that our method outperforms other state-of-the-art approaches on an independent test dataset in terms of representative capability measures. In addition, we analyse the discrimination quality under different varieties of attribute information, which reveals that combining multiple features could improve prediction. A case study is also carried out to illustrate the favorable identification effect. We establish a web application at http://amp.denglab.org to provide convenient usage of our proposal and make the predictive framework, source code, and datasets publicly accessible at https://github.com/researchprotein/amp.
Affiliation(s)
- Guolun Zhong
- School of Computer Science and Engineering, Central South University, Changsha, 410083, China
- Hui Liu
- College of Computer and Information Engineering, Nanjing Tech University, Nanjing, 211816, China
- Lei Deng
- School of Computer Science and Engineering, Central South University, Changsha, 410083, China
2
Shanthappa PM, Suravajhala R, Kumar G, Melethadathil N. Computational exploration of novel antimicrobial modalities targeting fucose-binding lectins and ribosomes in Mycobacterium smegmatis using tRNA-encoded peptides. J Biomol Struct Dyn 2024:1-13. [PMID: 38676533 DOI: 10.1080/07391102.2024.2335555] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 09/21/2023] [Accepted: 03/19/2024] [Indexed: 04/29/2024]
Abstract
tRNA-Encoded Peptides (tREPs), encoded by small open reading frames (smORFs) within tRNA genes, have recently emerged as a new class of functional peptides exhibiting antiparasitic activity. The discovery of tREPs has led to a re-evaluation of the role of tRNAs in biology and has expanded our understanding of the genetic code. This presents an immense, unexplored potential in the realm of tRNA-peptide interactions, paving the way for groundbreaking discoveries and innovative applications in various biological functions. This study explores the antimicrobial potential of tREPs against protein targets by employing a computational method that uses verified data sources and highly recognized predictive algorithms to provide a sorted list of likely antimicrobial peptides, which were then filtered for toxicity, cell permeability, allergenicity and half-life. These peptides were then docked with screened protein targets and computationally validated using molecular dynamics (MD) simulations for 150 ns, and the binding free energy was estimated. The peptides Pep2 (VVLWRKPRVRKTG) and Pep6 (HRLRLRRRKPWW) exhibited good binding affinities of -110.5 +/- 2.5 and -129.0 +/- 3.9, respectively, with RMSD values of 0.4 and 0.25 nm against the fucose-binding lectin (7NEF) and the 30S ribosome of Mycobacterium smegmatis (5O5J) protein targets. The 7NEF-Pep2 and 5O5J-Pep6 complexes indicated higher negative binding free energies of -52.55 kcal/mol and -55.52 kcal/mol, respectively, as calculated by Molecular Mechanics Poisson-Boltzmann Surface Area (MMPBSA). Thus, the tREP-derived peptides designed as part of this study provide novel approaches for potential anti-bacterial therapeutic modalities. Communicated by Ramaswamy H. Sarma.
Affiliation(s)
- Pallavi M Shanthappa
- Department of Computer Science, School of Computing, Amrita Vishwa Vidyapeetham, Mysuru, India
- Geetha Kumar
- School of Biotechnology, Amrita Vishwa Vidyapeetham, Amritapuri, India
3
Zhuang J, Gao W, Su R. EnAMP: A novel deep learning ensemble antibacterial peptide recognition algorithm based on multi-features. J Bioinform Comput Biol 2024; 22:2450001. [PMID: 38406833 DOI: 10.1142/s021972002450001x] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 02/27/2024]
Abstract
Antimicrobial peptides (AMPs), as preferred alternatives to antibiotics, have wide application and good prospects. Identifying AMPs through wet lab experiments remains expensive, time-consuming and challenging. Many machine learning methods have been proposed to predict AMPs and have achieved good results. In this work, we combine two kinds of word embedding features with the statistical features of peptide sequences to develop an ensemble classifier named EnAMP. Two deep neural networks are trained on Word2vec and GloVe word embedding features of peptide sequences, respectively, while random forest and support vector machine classifiers are trained on statistical features of the sequences. The average of the four classifiers' outputs is the final prediction result. Compared with other state-of-the-art algorithms on six datasets, EnAMP outperforms most existing models with similar computational costs, and its performance is comparable even to that of high-computational-cost algorithms based on Bidirectional Encoder Representations from Transformers (BERT). EnAMP source code and the data are available at https://github.com/ruisue/EnAMP.
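The averaging step described in this abstract can be illustrated with a minimal sketch. It is not the EnAMP implementation: the function name `average_ensemble`, the model keys, the example scores and the 0.5 decision threshold are all assumptions introduced here for illustration, and the abstract does not state how the averaged score is thresholded.

```python
from statistics import mean


def average_ensemble(probabilities, threshold=0.5):
    """Average per-model AMP probabilities and compare the mean to a threshold.

    `threshold` is an illustrative assumption, not a value from the abstract.
    """
    if not probabilities:
        raise ValueError("need at least one model score")
    score = mean(probabilities)
    return score, score >= threshold


# Hypothetical scores for one peptide from the four model types the abstract names.
scores = {
    "dnn_word2vec": 0.75,
    "dnn_glove": 0.5,
    "random_forest": 1.0,
    "svm": 0.25,
}
score, is_amp = average_ensemble(list(scores.values()))
print(score, is_amp)
```

Averaging probabilities (soft voting) lets a confident model outweigh an uncertain one, which a majority vote over hard labels would not do.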
Affiliation(s)
- Jujuan Zhuang
- School of Science, Dalian Maritime University, Dalian, Liaoning, P. R. China
- Wanquan Gao
- School of Science, Dalian Maritime University, Dalian, Liaoning, P. R. China
- Rui Su
- School of Science, Dalian Maritime University, Dalian, Liaoning, P. R. China
4
Chung CR, Liou JT, Wu LC, Horng JT, Lee TY. Multi-label classification and features investigation of antimicrobial peptides with various functional classes. iScience 2023; 26:108250. [PMID: 38025779 PMCID: PMC10679894 DOI: 10.1016/j.isci.2023.108250] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 03/04/2023] [Revised: 07/15/2023] [Accepted: 10/16/2023] [Indexed: 12/01/2023] Open
Abstract
The challenge of drug-resistant bacteria to global public health has led to increased attention on antimicrobial peptides (AMPs) as a targeted therapeutic alternative with a lower risk of resistance. However, high production costs and limitations in functional class prediction have hindered progress in this field. In this study, we used multi-label classifiers with binary relevance and algorithm adaptation techniques to predict different functions of AMPs across a wide range of pathogen categories, including bacteria, mammalian cells, fungi, viruses, and cancer cells. Our classifiers attained promising AUC scores varying from 0.8492 to 0.9126 on independent testing data. Forward feature selection identified sequence order and charge as critical, with specific amino acids (C and E) as discriminative. These findings provide valuable insights for the design of antimicrobial peptides (AMPs) with multiple functionalities, thus contributing to the broader effort to combat drug-resistant pathogens.
Affiliation(s)
- Chia-Ru Chung
- Department of Computer Science and Information Engineering, National Central University, Taoyuan, Taiwan
- Jhen-Ting Liou
- Department of Computer Science and Information Engineering, National Central University, Taoyuan, Taiwan
- Li-Ching Wu
- Department of Biomedical Sciences and Engineering, National Central University, Taoyuan, Taiwan
- Jorng-Tzong Horng
- Department of Computer Science and Information Engineering, National Central University, Taoyuan, Taiwan
- Department of Bioinformatics and Medical Engineering, Asia University, Taoyuan City, Taiwan
- Tzong-Yi Lee
- Institute of Bioinformatics and Systems Biology, National Yang Ming Chiao Tung University, Hsinchu City, Taiwan
- Center for Intelligent Drug Systems and Smart Biodevices (IDS2B), National Yang Ming Chiao Tung University, Hsinchu City, Taiwan
5
Sowers A, Wang G, Xing M, Li B. Advances in Antimicrobial Peptide Discovery via Machine Learning and Delivery via Nanotechnology. Microorganisms 2023; 11:1129. [PMID: 37317103 PMCID: PMC10223199 DOI: 10.3390/microorganisms11051129] [Citation(s) in RCA: 2] [Impact Index Per Article: 2.0] [Received: 03/01/2023] [Revised: 04/13/2023] [Accepted: 04/19/2023] [Indexed: 06/16/2023] Open
Abstract
Antimicrobial peptides (AMPs) have been investigated for their potential use as an alternative to antibiotics due to the increased demand for new antimicrobial agents. AMPs, widely found in nature and obtained from microorganisms, have a broad range of antimicrobial protection, allowing them to be applied in the treatment of infections caused by various pathogenic microorganisms. Since these peptides are primarily cationic, they prefer anionic bacterial membranes due to electrostatic interactions. However, the applications of AMPs are currently limited owing to their hemolytic activity, poor bioavailability, degradation from proteolytic enzymes, and high-cost production. To overcome these limitations, nanotechnology has been used to improve AMP bioavailability, permeation across barriers, and/or protection against degradation. In addition, machine learning has been investigated due to its time-saving and cost-effective algorithms to predict AMPs. There are numerous databases available to train machine learning models. In this review, we focus on nanotechnology approaches for AMP delivery and advances in AMP design via machine learning. The AMP sources, classification, structures, antimicrobial mechanisms, their role in diseases, peptide engineering technologies, currently available databases, and machine learning techniques used to predict AMPs with minimal toxicity are discussed in detail.
Affiliation(s)
- Alexa Sowers
- Department of Orthopaedics, School of Medicine, West Virginia University, Morgantown, WV 26506, USA
- School of Pharmacy, West Virginia University, Morgantown, WV 26506, USA
- Guangshun Wang
- Department of Pathology and Microbiology, College of Medicine, University of Nebraska Medical Center, 985900 Nebraska Medical Center, Omaha, NE 68198, USA
- Malcolm Xing
- Department of Mechanical Engineering, University of Manitoba, Winnipeg, MB R3T 2N2, Canada
- Bingyun Li
- Department of Orthopaedics, School of Medicine, West Virginia University, Morgantown, WV 26506, USA
6
García-Jacas CR, García-González LA, Martinez-Rios F, Tapia-Contreras IP, Brizuela CA. Handcrafted versus non-handcrafted (self-supervised) features for the classification of antimicrobial peptides: complementary or redundant? Brief Bioinform 2022; 23:6754757. [PMID: 36215083 DOI: 10.1093/bib/bbac428] [Citation(s) in RCA: 4] [Impact Index Per Article: 2.0] [Received: 06/22/2022] [Revised: 08/28/2022] [Accepted: 09/02/2022] [Indexed: 12/14/2022] Open
Abstract
Antimicrobial peptides (AMPs) have received a great deal of attention given their potential to become a plausible option to fight multi-drug resistant bacteria as well as other pathogens. Quantitative sequence-activity models (QSAMs) have been helpful in discovering new AMPs because they make it possible to explore a large universe of peptide sequences and help reduce the number of wet lab experiments. A main aspect in the building of QSAMs based on shallow learning is to determine an optimal set of protein descriptors (features) required to discriminate between sequences with different antimicrobial activities. These features are generally handcrafted from peptide sequence datasets that are labeled with specific antimicrobial activities. However, recent developments have shown that unsupervised approaches can be used to determine features that outperform human-engineered (handcrafted) features. Thus, knowing which of these two approaches contributes to a better classification of AMPs is a fundamental question in designing more accurate models. Here, we present a systematic and rigorous study to compare both types of features. Experimental outcomes show that non-handcrafted features lead to better performance than handcrafted features. However, the experiments also show that performance improves when both types of features are merged. A relevance analysis reveals that non-handcrafted features have higher information content than handcrafted features, while an interaction-based importance analysis reveals that handcrafted features are more important. These findings suggest that there is complementarity between both types of features. Comparisons with state-of-the-art deep models show that shallow models yield better performance both when fed with non-handcrafted features alone and when fed with non-handcrafted and handcrafted features together.
Affiliation(s)
- César R García-Jacas
- Cátedras CONACYT - Departamento de Ciencias de la Computación, Centro de Investigación Científica y de Educación Superior de Ensenada (CICESE), 22860 Ensenada, Baja California, México
- Luis A García-González
- Departamento de Ciencias de la Computación, Centro de Investigación Científica y de Educación Superior de Ensenada (CICESE), 22860 Ensenada, Baja California, México
- Issac P Tapia-Contreras
- Departamento de Ciencias de la Computación, Centro de Investigación Científica y de Educación Superior de Ensenada (CICESE), 22860 Ensenada, Baja California, México
- Carlos A Brizuela
- Departamento de Ciencias de la Computación, Centro de Investigación Científica y de Educación Superior de Ensenada (CICESE), 22860 Ensenada, Baja California, México
7
Madanchi H, Rahmati S, Doaee Y, Sardari S, Mousavi Maleki MS, Rostamian M, Ebrahimi Kiasari R, Seyed Mousavi SJ, Ghods E, Ardakanian M. Determination of antifungal activity and action mechanism of the modified Aurein 1.2 peptide derivatives. Microb Pathog 2022; 173:105866. [DOI: 10.1016/j.micpath.2022.105866] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 09/15/2022] [Revised: 10/29/2022] [Accepted: 10/31/2022] [Indexed: 11/06/2022]
8
Expression of cathelicidin, ERK, MyD88, and TLR-9 in the blood of women in the pre-pregnancy, pregnancy, and their infant cord blood. Hum Immunol 2022; 83:826-831. [PMID: 36058765 DOI: 10.1016/j.humimm.2022.08.014] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 01/18/2022] [Revised: 07/22/2022] [Accepted: 08/26/2022] [Indexed: 11/21/2022]
Abstract
During pregnancy, the immune responses are modulated to protect mothers and infants from different pathogens. Cathelicidin, an antimicrobial peptide, has a defensive role against many pathogens. In this study, to better understand the role of the cathelicidin peptide and three of its related proteins in immune pathways (ERK, MyD88, and TLR-9) during pregnancy, we examined their expression in the blood of non-pregnant and pregnant women and in their infants' cord blood. Blood samples were taken, and their peripheral blood mononuclear cells (PBMCs) were obtained. The expression level of cathelicidin was determined by quantitative PCR. Also, the expression of cathelicidin, ERK, MyD88, and TLR-9 was assessed by Western blotting. A higher level of cathelicidin mRNA was detected in the cord blood samples compared to other samples. The Western blotting results showed higher levels of cathelicidin, ERK, MyD88, and TLR-9 in the cord blood samples than in the blood of both pregnant and non-pregnant women. Also, the level of all molecules was higher in pregnant than in non-pregnant women. These high levels of the mentioned molecules are necessary to protect the mother and fetus against various pathogens, although understanding their mechanism of action needs more studies.
9
Agüero-Chapin G, Galpert-Cañizares D, Domínguez-Pérez D, Marrero-Ponce Y, Pérez-Machado G, Teijeira M, Antunes A. Emerging Computational Approaches for Antimicrobial Peptide Discovery. Antibiotics (Basel) 2022; 11:936. [PMID: 35884190 PMCID: PMC9311958 DOI: 10.3390/antibiotics11070936] [Citation(s) in RCA: 13] [Impact Index Per Article: 6.5] [Received: 06/14/2022] [Revised: 07/01/2022] [Accepted: 07/08/2022] [Indexed: 02/05/2023] Open
Abstract
In the last two decades, many reports have addressed the application of artificial intelligence (AI) in the search and design of antimicrobial peptides (AMPs). AI has been represented by machine learning (ML) algorithms that use sequence-based features for the discovery of new peptidic scaffolds with promising biological activity. From an AI perspective, evolutionary algorithms have also been applied to the rational generation of peptide libraries aimed at the optimization/design of AMPs. However, the literature has scarcely been dedicated to other emerging non-conventional in silico approaches for the search/design of such bioactive peptides. Thus, the first motivation here is to bring up some non-standard peptide features that have been used to build classical ML predictive models. Secondly, it is valuable to highlight emerging ML algorithms and alternative computational tools to predict/design AMPs as well as to explore their chemical space. Another point worthy of mention is the recent application of evolutionary algorithms that actually simulate sequence evolution to both the generation of diversity-oriented peptide libraries and the optimization of hit peptides. Last but not least, included here are some new considerations in proteogenomic analyses currently incorporated into the computational workflow for unravelling AMPs in natural sources.
Affiliation(s)
- Guillermin Agüero-Chapin
- CIIMAR—Centro Interdisciplinar de Investigação Marinha e Ambiental, Universidade do Porto, Terminal de Cruzeiros do Porto de Leixões, Av. General Norton de Matos, s/n, 4450-208 Porto, Portugal
- Departamento de Biologia, Faculdade de Ciências, Universidade do Porto, Rua do Campo Alegre, 4169-007 Porto, Portugal
- Correspondence: (G.A.-C.); (A.A.); Tel.: +351-22-340-1813 (G.A.-C. & A.A.)
- Deborah Galpert-Cañizares
- Departamento de Ciencia de la Computación, Universidad Central Marta Abreu de Las Villas (UCLV), Santa Clara 54830, Cuba
- Dany Domínguez-Pérez
- CIIMAR—Centro Interdisciplinar de Investigação Marinha e Ambiental, Universidade do Porto, Terminal de Cruzeiros do Porto de Leixões, Av. General Norton de Matos, s/n, 4450-208 Porto, Portugal
- Proquinorte, Unipessoal, Lda, Avenida 5 de Outubro, 124, 7º Piso, Avenidas Novas, 1050-061 Lisboa, Portugal
- Yovani Marrero-Ponce
- Universidad San Francisco de Quito (USFQ), Grupo de Medicina Molecular y Translacional (MeM&T), Colegio de Ciencias de la Salud (COCSA), Escuela de Medicina, Edificio de Especialidades Médicas and Instituto de Simulación Computacional (ISC-USFQ), Diego de Robles y vía Interoceánica, Quito 170157, Ecuador
- Gisselle Pérez-Machado
- EpiDisease S.L—Spin-Off of Centro de Investigación Biomédica en Red de Enfermedades Raras (CIBERER), 46980 Valencia, Spain
- Marta Teijeira
- Departamento de Química Orgánica, Facultade de Química, Universidade de Vigo, 36310 Vigo, Spain
- Instituto de Investigación Sanitaria Galicia Sur, Hospital Álvaro Cunqueiro, 36213 Vigo, Spain
- Agostinho Antunes
- CIIMAR—Centro Interdisciplinar de Investigação Marinha e Ambiental, Universidade do Porto, Terminal de Cruzeiros do Porto de Leixões, Av. General Norton de Matos, s/n, 4450-208 Porto, Portugal
- Departamento de Biologia, Faculdade de Ciências, Universidade do Porto, Rua do Campo Alegre, 4169-007 Porto, Portugal
- Correspondence: (G.A.-C.); (A.A.); Tel.: +351-22-340-1813 (G.A.-C. & A.A.)
10
Assessing sequence-based protein-protein interaction predictors for use in therapeutic peptide engineering. Sci Rep 2022; 12:9610. [PMID: 35688894 PMCID: PMC9187631 DOI: 10.1038/s41598-022-13227-9] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 10/01/2021] [Accepted: 04/25/2022] [Indexed: 12/01/2022] Open
Abstract
Engineering peptides to achieve a desired therapeutic effect through the inhibition of a specific target activity or protein interaction is a non-trivial task. Few of the existing in silico peptide design algorithms generate target-specific peptides. Instead, many methods produce peptides that achieve a desired effect through an unknown mechanism. In contrast with resource-intensive high-throughput experiments, in silico screening is a cost-effective alternative that can prune the space of candidates when engineering target-specific peptides. Using a set of FDA-approved peptides we curated specifically for this task, we assess the applicability of several sequence-based protein–protein interaction predictors as a screening tool within the context of peptide therapeutic engineering. We show that similarity-based protein–protein interaction predictors are more suitable for this purpose than the state-of-the-art deep learning methods publicly available at the time of writing. We also show that this approach is mostly useful when designing new peptides against targets for which naturally-occurring interactors are already known, and that deploying it for de novo peptide engineering tasks may require gathering additional target-specific training data. Taken together, this work offers evidence that supports the use of similarity-based protein–protein interaction predictors for peptide therapeutic engineering, especially peptide analogs.
11
León Madrazo A, Segura Campos MR. In silico prediction of peptide variants from chia (S. hispanica L.) with antimicrobial, antibiofilm, and antioxidant potential. Comput Biol Chem 2022; 98:107695. [DOI: 10.1016/j.compbiolchem.2022.107695] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 01/10/2022] [Revised: 05/03/2022] [Accepted: 05/11/2022] [Indexed: 11/03/2022]
12
García-Jacas CR, Pinacho-Castellanos SA, García-González LA, Brizuela CA. Do deep learning models make a difference in the identification of antimicrobial peptides? Brief Bioinform 2022; 23:6563422. [PMID: 35380616 DOI: 10.1093/bib/bbac094] [Citation(s) in RCA: 5] [Impact Index Per Article: 2.5] [Received: 12/23/2021] [Revised: 02/16/2022] [Accepted: 02/23/2022] [Indexed: 12/21/2022] Open
Abstract
In the last few decades, antimicrobial peptides (AMPs) have been explored as an alternative to classical antibiotics, which in turn motivated the development of machine learning models to predict antimicrobial activities in peptides. The first generation of these predictors was filled with what is now known as shallow learning-based models. These models require the computation and selection of molecular descriptors to characterize each peptide sequence and train the models. The second generation, known as deep learning-based models, which no longer requires the explicit computation and selection of those descriptors, started to be used in the prediction task of AMPs just four years ago. The superior performance claimed for deep models over shallow models has created a prevalent inertia toward using deep learning to identify AMPs. However, methodological flaws and/or modeling biases in the building of deep models do not support such superiority. Here, we analyze the main pitfalls that led to biased conclusions on the leading performance of deep models. Also, we analyze whether deep models truly contribute to better predictions than shallow models by performing fair studies on different state-of-the-art benchmarking datasets. The experiments reveal that deep models do not outperform shallow models in the classification of AMPs, and that both types of models codify similar chemical information since their predictions are highly similar. Thus, according to the currently available datasets, we conclude that deep learning may not be the most suitable approach to develop models to identify AMPs, mainly because shallow models achieve comparable-to-superior performances and are simpler (Ockham's razor principle). Even so, we suggest the use of deep learning only when its capabilities lead to significantly better performance gains worth the additional computational cost.
Affiliation(s)
- César R García-Jacas
- Cátedras CONACYT - Departamento de Ciencias de la Computación, Centro de Investigación Científica y de Educación Superior de Ensenada (CICESE), 22860 Ensenada, Baja California, México
- Sergio A Pinacho-Castellanos
- Departamento de Ciencias de la Computación, Centro de Investigación Científica y de Educación Superior de Ensenada (CICESE), 22860 Ensenada, Baja California, México
- Centro de Investigación y Desarrollo de Tecnología Digital (CITEDI), Instituto Politécnico Nacional (IPN), 22435 Tijuana, Baja California, México
- Luis A García-González
- Departamento de Ciencias de la Computación, Centro de Investigación Científica y de Educación Superior de Ensenada (CICESE), 22860 Ensenada, Baja California, México
- Carlos A Brizuela
- Departamento de Ciencias de la Computación, Centro de Investigación Científica y de Educación Superior de Ensenada (CICESE), 22860 Ensenada, Baja California, México
13
Ramazi S, Mohammadi N, Allahverdi A, Khalili E, Abdolmaleki P. A review on antimicrobial peptides databases and the computational tools. Database (Oxford) 2022; 2022:6550847. [PMID: 35305010 PMCID: PMC9216472 DOI: 10.1093/database/baac011] [Citation(s) in RCA: 30] [Impact Index Per Article: 15.0] [Received: 10/22/2021] [Revised: 02/04/2022] [Accepted: 02/28/2022] [Indexed: 12/29/2022]
Abstract
Antimicrobial Peptides (AMPs) have been considered as potential alternatives for infection therapeutics since antibiotic resistance has been raised as a global problem. The AMPs are a group of natural peptides that play a crucial role in the immune system in various organisms. AMPs have features such as a short length and efficiency against microbes. Importantly, they have shown low toxicity in mammals, which makes them potential candidates for peptide-based drugs. Nevertheless, the discovery of AMPs is accompanied by several issues which are associated with labour-intensive and time-consuming wet-lab experiments. During the last decades, numerous studies have been conducted on the investigation of AMPs, either natural or synthetic type, and relevant data are now available in many databases. Through the advancement of computational methods, a great number of AMP data are obtained from publicly accessible databanks, which are valuable resources for mining patterns to design new models for AMP prediction. However, due to the current flaws in assessing computational methods, more interrogations are warranted for accurate evaluation/analysis. Considering the diversity of AMPs and newly reported ones, an improvement in Machine Learning algorithms is crucial. In this review, we aim to provide valuable information about different types of AMPs, their mechanism of action and a landscape of current databases and computational tools as resources to collect AMPs and beneficial tools for the prediction and design of a computational model for new active AMPs.
Affiliation(s)
- Shahin Ramazi
- Department of Biophysics, Faculty of Biological Sciences, Tarbiat Modares University, Jalal Ale Ahmad Highway, Tehran 14115-111, Iran
- Neda Mohammadi
- Department of Medical Biotechnology, Faculty of Allied Medical Sciences, Iran University of Medical Sciences, Hemmat Highway, Tehran 1449614535, Iran
- Institute of Pharmacology and Toxicology, University of Bonn, Biomedical Center, Venusberg Campus 1, Bonn 53127, Germany
- Abdollah Allahverdi
- Department of Biophysics, Faculty of Biological Sciences, Tarbiat Modares University, Jalal Ale Ahmad Highway, Tehran 14115-111, Iran
- Elham Khalili
- Department of Plant Biology, Faculty of Biological Sciences, Tarbiat Modares University, Jalal Ale Ahmad Highway, Tehran 14115-111, Iran
14
Quintans ILADCR, de Araújo JVA, Rocha LNM, de Andrade AEB, do Rêgo TG, Deyholos MK. An overview of databases and bioinformatics tools for plant antimicrobial peptides. Curr Protein Pept Sci 2021; 23:6-19. [PMID: 34951361 DOI: 10.2174/1389203723666211222170342] [Citation(s) in RCA: 1] [Impact Index Per Article: 0.3] [Received: 06/28/2021] [Revised: 10/15/2021] [Accepted: 10/27/2021] [Indexed: 11/22/2022]
Abstract
Antimicrobial peptides (AMPs) are small, ribosomally synthesized proteins found in nearly all forms of life. In plants, AMPs play a central role in plant defense due to their distinct physicochemical properties. Due to their broad-spectrum antimicrobial activity and rapid killing action, plant AMPs have become important candidates for the development of new drugs to control plant and animal pathogens that are resistant to multiple drugs. Further research is required to explore the potential uses of these natural compounds. Computational strategies have been increasingly used to understand key aspects of antimicrobial peptides. These strategies will help to minimize the time and cost of "wet-lab" experimentation. Researchers have developed various tools and databases to provide updated information on AMPs. However, despite the increased availability of antimicrobial peptide resources in biological databases, finding AMPs from plants can still be a difficult task. The number of plant AMP sequences in current databases is still small and yet often redundant. To facilitate further characterization of plant AMPs, we have summarized information on the location, distribution, and annotations of plant AMPs available in the most relevant databases for AMPs research. We also mapped and categorized the bioinformatics tools available in these databases. We expect that this will allow researchers to advance in the discovery and development of new plant AMPs with potent biological properties. We hope to provide insights to further expand the application of AMPs in the fields of biotechnology, pharmacy, and agriculture.
Affiliation(s)
- Michael K Deyholos
- IK Barber School of Arts and Sciences, University of British Columbia, Kelowna, BC, Canada
|
15
|
Schistocins: Novel antimicrobial peptides encrypted in the Schistosoma mansoni Kunitz Inhibitor SmKI-1. Biochim Biophys Acta Gen Subj 2021; 1865:129989. [PMID: 34389467 DOI: 10.1016/j.bbagen.2021.129989] [Citation(s) in RCA: 1] [Impact Index Per Article: 0.3] [Received: 05/31/2021] [Revised: 07/30/2021] [Accepted: 08/06/2021] [Indexed: 02/06/2023]
Abstract
BACKGROUND Here we describe a new class of cryptides (peptides encrypted within a larger protein) with antimicrobial properties, named schistocins, derived from SmKI-1, a key protein in Schistosoma mansoni survival. This is a multi-functional protein with potential biotechnological use as a therapeutic molecule in inflammatory diseases and to control schistosomiasis. METHODS We used our algorithm, enCrypted, to perform an in silico proteolysis of SmKI-1 and a screening for potential antimicrobial activity. The selected peptides were chemically synthesized, tested in vitro and evaluated by both structural (CD, NMR) and biophysical (ITC) studies to assess their structure-function relationship. RESULTS EnCrypted was capable of predicting AMPs in SmKI-1. Our biophysical analyses described a membrane-induced conformational change from random coil to α-helix and a peptide-membrane equilibrium for all schistocins. Our structural data allowed us to suggest a well-known mode of peptide-membrane interaction in which electrostatic attraction between the cationic peptides and anionic membranes results in bilayer disordering. Moreover, the NMR H/D exchange data, together with the higher entropic contribution observed for the peptide-membrane interaction, showed that schistocins have different orientations upon the membrane. CONCLUSIONS This work demonstrates the robustness of using the physicochemical features of predicted peptides to identify new bioactive cryptides, as well as the relevance of combining these analyses with biophysical methods to understand peptide-membrane affinity and improve further algorithms. GENERAL SIGNIFICANCE Bioprospecting cryptides can be conducted through data mining of protein databases, demonstrating the success of our strategy. The peptide-based agents derived from SmKI-1 might have high impact for systems biology and biotechnology.
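The abstract does not describe how enCrypted performs its in silico proteolysis or scores fragments, so the following is only a minimal sketch of the general idea and not the authors' algorithm. It assumes a trypsin-like cleavage rule (cut after K or R unless the next residue is P) and a crude net-charge filter. The function names, the length window and the charge threshold are all hypothetical choices made for illustration.

```python
import re


def tryptic_digest(protein: str) -> list[str]:
    """Split a protein after K or R, except when the next residue is P."""
    # Zero-width split points: preceded by K/R and not followed by P.
    # Splitting on empty matches requires Python 3.7 or newer.
    fragments = re.split(r"(?<=[KR])(?!P)", protein.upper())
    return [f for f in fragments if f]


def net_charge(peptide: str) -> int:
    """Crude net charge at neutral pH: +1 per K/R, -1 per D/E (His ignored)."""
    positive = sum(peptide.count(r) for r in "KR")
    negative = sum(peptide.count(r) for r in "DE")
    return positive - negative


def candidate_cryptides(protein, min_len=5, max_len=40, min_charge=2):
    """Keep fragments inside a hypothetical length window that are cationic."""
    return [
        f for f in tryptic_digest(protein)
        if min_len <= len(f) <= max_len and net_charge(f) >= min_charge
    ]


if __name__ == "__main__":
    toy_protein = "MAGDEKRPLLKKWFGRAAEDK"  # made-up sequence, not SmKI-1
    print(tryptic_digest(toy_protein))
    print(candidate_cryptides(toy_protein, min_len=3))
```

A real screen would digest with several proteases and use a trained activity predictor rather than a charge cutoff; the cutoff here only mirrors the abstract's point that cationic peptides are attracted to anionic membranes.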
|
16
|
Lawrence TJ, Carper DL, Spangler MK, Carrell AA, Rush TA, Minter SJ, Weston DJ, Labbé JL. amPEPpy 1.0: a portable and accurate antimicrobial peptide prediction tool. Bioinformatics 2021; 37:2058-2060. [PMID: 33135060 DOI: 10.1093/bioinformatics/btaa917] [Citation(s) in RCA: 35] [Impact Index Per Article: 11.7] [Received: 07/31/2020] [Revised: 10/07/2020] [Accepted: 10/16/2020] [Indexed: 01/16/2023] Open
Abstract
SUMMARY Antimicrobial peptides (AMPs) are promising alternative antimicrobial agents. Currently, however, portable, user-friendly and efficient methods for predicting AMP sequences from genome-scale data are not readily available. Here we present amPEPpy, an open-source, multi-threaded command-line application for predicting AMP sequences using a random forest classifier. AVAILABILITY AND IMPLEMENTATION amPEPpy is implemented in Python 3 and is freely available through GitHub (https://github.com/tlawrence3/amPEPpy). SUPPLEMENTARY INFORMATION Supplementary data are available at Bioinformatics online.
Affiliation(s)
- Travis J Lawrence
- Biosciences Division, Oak Ridge National Laboratory, Oak Ridge, TN 37831, USA
- Dana L Carper
- Biosciences Division, Oak Ridge National Laboratory, Oak Ridge, TN 37831, USA
- Margaret K Spangler
- Biosciences Division, Oak Ridge National Laboratory, Oak Ridge, TN 37831, USA
- Alyssa A Carrell
- Biosciences Division, Oak Ridge National Laboratory, Oak Ridge, TN 37831, USA
- Tomás A Rush
- Biosciences Division, Oak Ridge National Laboratory, Oak Ridge, TN 37831, USA
- David J Weston
- Biosciences Division, Oak Ridge National Laboratory, Oak Ridge, TN 37831, USA
- Jessy L Labbé
- Biosciences Division, Oak Ridge National Laboratory, Oak Ridge, TN 37831, USA
|
17
|
Singh O, Hsu WL, Su ECY. Co-AMPpred for in silico-aided predictions of antimicrobial peptides by integrating composition-based features. BMC Bioinformatics 2021; 22:389. [PMID: 34330209 PMCID: PMC8325260 DOI: 10.1186/s12859-021-04305-2] [Citation(s) in RCA: 8] [Impact Index Per Article: 2.7] [Received: 04/14/2021] [Accepted: 07/21/2021] [Indexed: 12/24/2022] Open
Abstract
Background Antimicrobial peptides (AMPs) are oligopeptides that act as crucial components of innate immunity, occur naturally in all multicellular organisms, and are involved in the first line of defense. Recent studies showed that AMPs have great potential that is not limited to antimicrobial activity. They are also crucial regulators of host immune responses that can modulate a wide range of activities, such as immune regulation, wound healing, and apoptosis. However, the ability of microorganisms to adapt to and resist existing antibiotics has prompted the scientific community to develop alternatives to conventional antibiotics. To address this issue, we proposed Co-AMPpred, an in silico-aided AMP prediction method based on compositional features of amino acid residues to classify AMPs and non-AMPs. Results In our study, we developed a prediction method that incorporates composition-based sequence and physicochemical features into various machine-learning algorithms. Then, the Boruta feature-selection algorithm was used to identify discriminative biological features, and only these features were used to develop our model. Additionally, we performed stratified tenfold cross-validation to validate the predictive performance of our AMP prediction model and evaluated it on the independent holdout test dataset. A benchmark dataset was collected from previous studies to evaluate the predictive performance of our model. Conclusions Experimental results show that combining composition-based and physicochemical features outperformed existing methods on both the benchmark training dataset and a reduced training dataset. Finally, our proposed method achieved 80.8% accuracy and an area under the receiver operating characteristic curve of 0.871 on the independent test set. Our code and datasets are available at https://github.com/onkarS23/CoAMPpred.
Supplementary Information The online version contains supplementary material available at 10.1186/s12859-021-04305-2.
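To make two of the ingredients named in this abstract concrete, here is a small sketch of an amino acid composition feature vector and a stratified k-fold split, written with the Python standard library only. It is not the Co-AMPpred code (that is in the authors' repository); the function names are hypothetical, the physicochemical features are left out, and the Boruta selection step is omitted entirely.

```python
import random
from collections import defaultdict

AMINO_ACIDS = "ACDEFGHIKLMNPQRSTVWY"  # the 20 standard residues


def aa_composition(seq: str) -> list[float]:
    """Fraction of each standard amino acid in the sequence (20 values)."""
    seq = seq.upper()
    if not seq:
        return [0.0] * len(AMINO_ACIDS)
    return [seq.count(a) / len(seq) for a in AMINO_ACIDS]


def stratified_folds(labels, k=10, seed=0):
    """Assign sample indices to k folds while preserving class proportions."""
    rng = random.Random(seed)
    by_class = defaultdict(list)
    for index, label in enumerate(labels):
        by_class[label].append(index)
    folds = [[] for _ in range(k)]
    for indices in by_class.values():
        rng.shuffle(indices)
        # Deal each class round-robin so every fold gets its share.
        for position, index in enumerate(indices):
            folds[position % k].append(index)
    return folds


if __name__ == "__main__":
    print(aa_composition("GIGKFLHSAKKFGKAFVGEIMNS"))
    toy_labels = [1] * 20 + [0] * 20  # hypothetical AMP / non-AMP labels
    for fold in stratified_folds(toy_labels, k=10):
        print(sorted(fold))
```

Each fold serves once as the validation set while the remaining nine are used for training, which is what keeps the AMP to non-AMP ratio stable across the tenfold validation described above.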
Affiliation(s)
- Onkar Singh
- Bioinformatics Program, Taiwan International Graduate Program, Institute of Information Science, Academia Sinica, Taipei, Taiwan; Institute of Biomedical Informatics, National Yang Ming Chiao Tung University, Taipei, Taiwan; Graduate Institute of Biomedical Informatics, College of Medical Science and Technology, Taipei Medical University, 250 Wu-Xing Street, Taipei, 11031, Taiwan
- Wen-Lian Hsu
- Bioinformatics Program, Taiwan International Graduate Program, Institute of Information Science, Academia Sinica, Taipei, Taiwan; Institute of Biomedical Informatics, National Yang Ming Chiao Tung University, Taipei, Taiwan
- Emily Chia-Yu Su
- Graduate Institute of Biomedical Informatics, College of Medical Science and Technology, Taipei Medical University, 250 Wu-Xing Street, Taipei, 11031, Taiwan; Clinical Big Data Research Center, Taipei Medical University Hospital, Taipei, Taiwan
|
18
|
Aronica PGA, Reid LM, Desai N, Li J, Fox SJ, Yadahalli S, Essex JW, Verma CS. Computational Methods and Tools in Antimicrobial Peptide Research. J Chem Inf Model 2021; 61:3172-3196. [PMID: 34165973 DOI: 10.1021/acs.jcim.1c00175] [Citation(s) in RCA: 43] [Impact Index Per Article: 14.3] [Indexed: 12/13/2022]
Abstract
The evolution of antibiotic-resistant bacteria is an ongoing and troubling development that has increased the number of diseases and infections that risk going untreated. There is an urgent need to develop alternative strategies and treatments to address this issue. One class of molecules that is attracting significant interest is that of antimicrobial peptides (AMPs). Their design and development has been aided considerably by the applications of molecular models, and we review these here. These methods include the use of tools to explore the relationships between their structures, dynamics, and functions and the increasing application of machine learning and molecular dynamics simulations. This review compiles resources such as AMP databases, AMP-related web servers, and commonly used techniques, together aimed at aiding researchers in the area toward complementing experimental studies with computational approaches.
Affiliation(s)
- Pietro G A Aronica
- Bioinformatics Institute at A*STAR (Agency for Science, Technology and Research), 30 Biopolis Street, #07-01 Matrix, Singapore 138671
- Lauren M Reid
- Bioinformatics Institute at A*STAR (Agency for Science, Technology and Research), 30 Biopolis Street, #07-01 Matrix, Singapore 138671; School of Chemistry, University of Southampton, Highfield Southampton, Hampshire, U.K. SO17 1BJ; MedChemica Ltd, Alderley Park, Macclesfield, Cheshire, U.K. SK10 4TG
- Nirali Desai
- Bioinformatics Institute at A*STAR (Agency for Science, Technology and Research), 30 Biopolis Street, #07-01 Matrix, Singapore 138671; Division of Biological and Life Sciences, Ahmedabad University, Central Campus, Ahmedabad, Gujarat, India 380009
- Jianguo Li
- Bioinformatics Institute at A*STAR (Agency for Science, Technology and Research), 30 Biopolis Street, #07-01 Matrix, Singapore 138671; Singapore Eye Research Institute, 20 College Road Discovery Tower, Singapore 169856
- Stephen J Fox
- Bioinformatics Institute at A*STAR (Agency for Science, Technology and Research), 30 Biopolis Street, #07-01 Matrix, Singapore 138671
- Shilpa Yadahalli
- Bioinformatics Institute at A*STAR (Agency for Science, Technology and Research), 30 Biopolis Street, #07-01 Matrix, Singapore 138671
- Jonathan W Essex
- School of Chemistry, University of Southampton, Highfield Southampton, Hampshire, U.K. SO17 1BJ
- Chandra S Verma
- Bioinformatics Institute at A*STAR (Agency for Science, Technology and Research), 30 Biopolis Street, #07-01 Matrix, Singapore 138671; Department of Biological Sciences, National University of Singapore, 14 Science Drive 4, 117543 Singapore; School of Biological Sciences, Nanyang Technological University, 50 Nanyang Drive, 637551 Singapore
|
19
|
Pinacho-Castellanos SA, García-Jacas CR, Gilson MK, Brizuela CA. Alignment-Free Antimicrobial Peptide Predictors: Improving Performance by a Thorough Analysis of the Largest Available Data Set. J Chem Inf Model 2021; 61:3141-3157. [PMID: 34081438 DOI: 10.1021/acs.jcim.1c00251] [Citation(s) in RCA: 12] [Impact Index Per Article: 4.0] [Indexed: 01/13/2023]
Abstract
In the last two decades, a large number of machine-learning-based predictors for the activities of antimicrobial peptides (AMPs) have been proposed. These predictors differ from one another in the learning method and in the training and testing data sets used. Unfortunately, the training data sets present several drawbacks, such as a low representativeness regarding the experimentally validated AMP space, and duplicated peptide sequences between negative and positive data sets. These limitations give a low confidence to most of the approaches to be used in prospective studies. To address these weaknesses, we propose novel modeling and assessing data sets from the largest experimentally validated nonredundant peptide data set reported to date. From these novel data sets, alignment-free quantitative sequence-activity models (AF-QSAMs) based on Random Forest are created to identify general AMPs and their antibacterial, antifungal, antiparasitic, and antiviral functional types. An applicability domain analysis is carried out to determine the reliability of the predictions obtained, which, to the best of our knowledge, is performed for the first time for AMP recognition. A benchmarking is undertaken between the models proposed and several models from the literature that are freely available in 13 programs (ClassAMP, iAMP-2L, ADAM, MLAMP, AMPScanner v2.0, AntiFP, AMPfun, PEPred-suite, AxPEP, CAMPR3, iAMPpred, APIN, and Meta-iAVP). The models proposed are those with the best performance in all of the endpoints modeled, while most of the methods from the literature have weak-to-random predictive agreements. The models proposed are also assessed through Y-scrambling and repeated k-fold cross-validation tests, demonstrating that the outcomes obtained by them are not given by chance. Three chemometric analyses also confirmed the relevance of the peptides descriptors used in the modeling. 
Therefore, it can be concluded that the models built by fixing the drawbacks existing in the literature contribute to identifying antibacterial, antifungal, antiparasitic, and antiviral peptides with high effectivity and reliability. Models are freely available via the AMPDiscover tool at https://biocom-ampdiscover.cicese.mx/.
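One drawback named in this abstract, duplicated peptide sequences shared between the positive and negative data sets, is easy to check for. The sketch below is an assumption about how such a cleanup might look, not the authors' curation procedure: it removes exact duplicates within each set and drops any sequence that carries both labels. The function name and the decision to discard conflicted sequences from both sides are hypothetical.

```python
def clean_datasets(positives, negatives):
    """Deduplicate each set and drop sequences labelled both AMP and non-AMP."""
    # dict.fromkeys keeps first-seen order while removing exact duplicates.
    pos = list(dict.fromkeys(s.strip().upper() for s in positives))
    neg = list(dict.fromkeys(s.strip().upper() for s in negatives))
    conflicted = set(pos) & set(neg)
    pos_clean = [s for s in pos if s not in conflicted]
    neg_clean = [s for s in neg if s not in conflicted]
    return pos_clean, neg_clean, conflicted


if __name__ == "__main__":
    # Toy sequences for illustration only.
    amps = ["GIGK", "gigk", "KLAK"]
    non_amps = ["KLAK", "AAAA"]
    print(clean_datasets(amps, non_amps))
```

This only catches exact matches after case normalisation; building a nonredundant set in the sense of the abstract would also need a sequence-identity threshold, which is beyond this sketch.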
Affiliation(s)
- Sergio A Pinacho-Castellanos
- Departamento de Ciencias de la Computación, Centro de Investigación Científica y de Educación Superior de Ensenada (CICESE), 22860 Ensenada, Baja California, México; Centro de Investigación y Desarrollo de Tecnología Digital (CITEDI), Instituto Politécnico Nacional (IPN), 22435 Tijuana, Baja California, México
- César R García-Jacas
- Cátedras CONACYT-Departamento de Ciencias de la Computación, Centro de Investigación Científica y de Educación Superior de Ensenada (CICESE), 22860 Ensenada, Baja California, México
- Michael K Gilson
- Skaggs School of Pharmacy and Pharmaceutical Sciences, University of California, San Diego, La Jolla, California 92093, United States
- Carlos A Brizuela
- Departamento de Ciencias de la Computación, Centro de Investigación Científica y de Educación Superior de Ensenada (CICESE), 22860 Ensenada, Baja California, México
|
20
|
Zhang Y, Lin J, Zhao L, Zeng X, Liu X. A novel antibacterial peptide recognition algorithm based on BERT. Brief Bioinform 2021; 22:6284370. [PMID: 34037687 DOI: 10.1093/bib/bbab200] [Citation(s) in RCA: 39] [Impact Index Per Article: 13.0] [Received: 12/08/2020] [Revised: 04/19/2021] [Accepted: 05/03/2021] [Indexed: 12/31/2022] Open
Abstract
As the best substitute for antibiotics, antimicrobial peptides (AMPs) have important research significance. Due to the high cost and difficulty of experimental methods for identifying AMPs, more and more research is focused on using computational methods to solve this problem. Most existing computational methods can identify AMPs from the sequence itself, but there is still room for improvement in recognition accuracy, and the constructed models do not generalize across datasets. The pre-training strategy has been applied to many tasks in natural language processing (NLP) and has achieved gratifying results. It also has great application prospects in the field of AMP recognition and prediction. In this paper, we apply the pre-training strategy to the model training of AMP classifiers and propose a novel recognition algorithm. Our model is constructed based on the BERT model, pre-trained with protein data from UniProt, and then fine-tuned and evaluated on six AMP datasets with large differences. Our model is superior to the existing methods and achieves accurate identification on datasets with small sample sizes. We try different word segmentation methods for peptide chains and demonstrate the influence of pre-training steps and dataset balancing on recognition performance. We find that pre-training on a large number of diverse AMP data, followed by fine-tuning on new data, is beneficial for capturing both the specific features of the new data and the common features shared between AMP sequences. Finally, we construct a new AMP dataset, on which we train a general AMP recognition model.
Affiliation(s)
- Yue Zhang
- Xiamen University, Xiamen 361005, China
|
21
|
Xu J, Li F, Leier A, Xiang D, Shen HH, Marquez Lago TT, Li J, Yu DJ, Song J. Comprehensive assessment of machine learning-based methods for predicting antimicrobial peptides. Brief Bioinform 2021; 22:6189771. [PMID: 33774670 DOI: 10.1093/bib/bbab083] [Citation(s) in RCA: 49] [Impact Index Per Article: 16.3] [Received: 12/14/2020] [Revised: 02/20/2021] [Accepted: 02/22/2021] [Indexed: 12/13/2022] Open
Abstract
Antimicrobial peptides (AMPs) are a unique and diverse group of molecules that play a crucial role in a myriad of biological processes and cellular functions. AMP-related studies have become increasingly popular in recent years due to antimicrobial resistance, which is becoming an emerging global concern. Systematic experimental identification of AMPs faces many difficulties due to the limitations of current methods. Given its significance, more than 30 computational methods have been developed for accurate prediction of AMPs. These approaches show high diversity in their data set size, data quality, core algorithms, feature extraction, feature selection techniques and evaluation strategies. Here, we provide a comprehensive survey on a variety of current approaches for AMP identification and point at the differences between these methods. In addition, we evaluate the predictive performance of the surveyed tools based on an independent test data set containing 1536 AMPs and 1536 non-AMPs. Furthermore, we construct six validation data sets based on six different common AMP databases and compare different computational methods based on these data sets. The results indicate that amPEPpy achieves the best predictive performance and outperforms the other compared methods. As the predictive performances are affected by the different data sets used by different methods, we additionally perform the 5-fold cross-validation test to benchmark different traditional machine learning methods on the same data set. These cross-validation results indicate that random forest, support vector machine and eXtreme Gradient Boosting achieve comparatively better performances than other machine learning methods and are often the algorithms of choice of multiple AMP prediction tools.
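Benchmarks of this kind usually reduce each tool's predictions on the test set to a confusion matrix and a handful of summary measures. The abstract does not list the measures used, so the sketch below assumes the common ones (sensitivity, specificity, accuracy and the Matthews correlation coefficient). The function name is hypothetical and the counts in the usage example are invented; only the 1536 AMP and 1536 non-AMP totals come from the abstract.

```python
import math


def binary_metrics(tp: int, fn: int, tn: int, fp: int) -> dict:
    """Summary measures from confusion-matrix counts (AMP = positive class).

    Assumes at least one positive and one negative sample.
    """
    sensitivity = tp / (tp + fn)   # fraction of true AMPs recovered
    specificity = tn / (tn + fp)   # fraction of non-AMPs correctly rejected
    accuracy = (tp + tn) / (tp + fn + tn + fp)
    denom = math.sqrt((tp + fp) * (tp + fn) * (tn + fp) * (tn + fn))
    # MCC is conventionally set to 0 when any marginal total is zero.
    mcc = (tp * tn - fp * fn) / denom if denom else 0.0
    return {
        "sensitivity": sensitivity,
        "specificity": specificity,
        "accuracy": accuracy,
        "mcc": mcc,
    }


if __name__ == "__main__":
    # Invented counts for a hypothetical tool on a 1536 + 1536 test set.
    print(binary_metrics(tp=1400, fn=136, tn=1300, fp=236))
```

On a balanced test set such as this one, accuracy is simply the mean of sensitivity and specificity, so reporting MCC alongside it adds little here but matters once the classes are imbalanced.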
Affiliation(s)
- Jing Xu
- Department of Biochemistry and Molecular Biology and Biomedicine Discovery Institute, Monash University, Australia
- Fuyi Li
- Department of Microbiology and Immunology, the Peter Doherty Institute for Infection and Immunity, the University of Melbourne, Australia
- André Leier
- Department of Genetics, UAB School of Medicine, USA
- Dongxu Xiang
- Department of Biochemistry and Molecular Biology and Biomedicine Discovery Institute, Monash University, Australia
- Hsin-Hui Shen
- Department of Biochemistry & Molecular Biology and Department of Materials Science & Engineering, Monash University, Australia
- Jian Li
- Monash Biomedicine Discovery Institute and Department of Microbiology, Monash University, Australia
- Dong-Jun Yu
- School of Computer Science and Engineering, Nanjing University of Science and Technology, China
- Jiangning Song
- Monash Biomedicine Discovery Institute, Monash University, Australia
|
22
|
Santos-Júnior CD, Pan S, Zhao XM, Coelho LP. Macrel: antimicrobial peptide screening in genomes and metagenomes. PeerJ 2020; 8:e10555. [PMID: 33384902 PMCID: PMC7751412 DOI: 10.7717/peerj.10555] [Citation(s) in RCA: 28] [Impact Index Per Article: 7.0] [Received: 07/24/2020] [Accepted: 11/22/2020] [Indexed: 12/21/2022] Open
Abstract
Motivation Antimicrobial peptides (AMPs) have the potential to tackle multidrug-resistant pathogens in both clinical and non-clinical contexts. The recent growth in the availability of genomes and metagenomes provides an opportunity for in silico prediction of novel AMP molecules. However, due to the small size of these peptides, standard gene prospection methods cannot be applied in this domain and alternative approaches are necessary. In particular, standard gene prediction methods have low precision for short peptides, and functional classification by homology results in low recall. Results Here, we present Macrel (for metagenomic AMP classification and retrieval), which is an end-to-end pipeline for the prospection of high-quality AMP candidates from (meta)genomes. For this, we introduce a novel set of 22 peptide features. These were used to build classifiers which perform similarly to the state-of-the-art in the prediction of both antimicrobial and hemolytic activity of peptides, but with enhanced precision (using standard benchmarks as well as a stricter testing regime). We demonstrate that Macrel recovers high-quality AMP candidates using realistic simulations and real data. Availability Macrel is implemented in Python 3. It is available as open source at https://github.com/BigDataBiology/macrel and through bioconda. Classification of peptides or prediction of AMPs in contigs can also be performed on the webserver: https://big-data-biology.org/software/macrel.
Affiliation(s)
- Célio Dias Santos-Júnior
- Institute of Science and Technology for Brain-Inspired Intelligence, Fudan University, Shanghai, China; Ministry of Education, Key Laboratory of Computational Neuroscience and Brain-Inspired Intelligence, Shanghai, China
- Shaojun Pan
- Institute of Science and Technology for Brain-Inspired Intelligence, Fudan University, Shanghai, China; Ministry of Education, Key Laboratory of Computational Neuroscience and Brain-Inspired Intelligence, Shanghai, China
- Xing-Ming Zhao
- Institute of Science and Technology for Brain-Inspired Intelligence, Fudan University, Shanghai, China; Ministry of Education, Key Laboratory of Computational Neuroscience and Brain-Inspired Intelligence, Shanghai, China
- Luis Pedro Coelho
- Institute of Science and Technology for Brain-Inspired Intelligence, Fudan University, Shanghai, China; Ministry of Education, Key Laboratory of Computational Neuroscience and Brain-Inspired Intelligence, Shanghai, China
|
23
|
Machine learning-guided discovery and design of non-hemolytic peptides. Sci Rep 2020; 10:16581. [PMID: 33024236 PMCID: PMC7538962 DOI: 10.1038/s41598-020-73644-6] [Citation(s) in RCA: 48] [Impact Index Per Article: 12.0] [Received: 07/19/2020] [Accepted: 09/18/2020] [Indexed: 12/13/2022] Open
Abstract
Reducing hurdles to clinical trials without compromising the therapeutic promises of peptide candidates becomes an essential step in peptide-based drug design. Machine-learning models are cost-effective and time-saving strategies used to predict biological activities from primary sequences. Their limitations lie in the diversity of peptide sequences and biological information within these models. Additional outlier detection methods are needed to set the boundaries for reliable predictions, that is, the applicability domain. Antimicrobial peptides (AMPs) constitute an extensive library of peptides offering promising avenues against antibiotic-resistant infections. Most AMPs present in clinical trials are administered topically due to their hemolytic toxicity. Here we developed machine learning models and outlier detection methods that ensure robust predictions for the discovery of AMPs and the design of novel peptides with reduced hemolytic activity. Our best models, gradient boosting classifiers, predicted the hemolytic nature from any peptide sequence with 95–97% accuracy. Nearly 70% of AMPs were predicted as hemolytic peptides. Applying multivariate outlier detection models, we found that 273 AMPs (~9%) could not be predicted reliably. Our combined approach led to the discovery of 34 high-confidence non-hemolytic natural AMPs, the de novo design of 507 non-hemolytic peptides, and the guidelines for non-hemolytic peptide design.
|
24
|
Nava Lara RA, Beltrán JA, Brizuela CA, Del Rio G. Relevant Features of Polypharmacologic Human-Target Antimicrobials Discovered by Machine-Learning Techniques. Pharmaceuticals (Basel) 2020; 13:ph13090204. [PMID: 32825532 PMCID: PMC7559829 DOI: 10.3390/ph13090204] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 07/11/2020] [Revised: 08/07/2020] [Accepted: 08/07/2020] [Indexed: 11/16/2022] Open
Abstract
Polypharmacologic human-targeted antimicrobials (polyHAM) are potentially useful in the treatment of complex human diseases where the microbiome is important (e.g., diabetes, hypertension). We previously reported a machine-learning approach to identify polyHAM from FDA-approved human-targeted drugs using a heterologous approach (training with peptides and non-peptide compounds). Here we discover that polyHAM are more likely to be found among antimicrobials displaying broad-spectrum antibiotic activity and that topological, but not chemical, features are most informative for classifying this activity. A heterologous machine-learning approach was trained with broad-spectrum antimicrobials and tested with human metabolites; these metabolites were labeled as antimicrobials or non-antimicrobials based on a naïve text-mining approach. Human metabolites are not commonly recognized as antimicrobials, yet they circulate in the human body where microbes are found, and our heterologous model was able to classify those with antimicrobial activity. These results provide a basis for developing applications aimed at designing human diets that purposely alter the proportions of metabolic compounds as a way to control the human microbiome.
Affiliation(s)
- Rodrigo A. Nava Lara
- Department of Biochemistry and Structural Biology, Instituto de Fisiologia Celular, UNAM, Mexico City 04510, Mexico
- Jesús A. Beltrán
- Department of Computer Science, CICESE Research Center, Ensenada 22860, Mexico
- Carlos A. Brizuela
- Department of Computer Science, CICESE Research Center, Ensenada 22860, Mexico
- Gabriel Del Rio
- Department of Biochemistry and Structural Biology, Instituto de Fisiologia Celular, UNAM, Mexico City 04510, Mexico
|