1. Jin C, Jia C, Hu W, Xu H, Shen Y, Yue M. Predicting antimicrobial resistance in E. coli with discriminative position fused deep learning classifier. Comput Struct Biotechnol J 2024; 23:559-565. [PMID: 38274998 PMCID: PMC10809114 DOI: 10.1016/j.csbj.2023.12.041] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 10/04/2023] [Revised: 12/26/2023] [Accepted: 12/26/2023] [Indexed: 01/27/2024] [Open access]
Abstract
Escherichia coli (E. coli) has become a particular concern due to the increasing incidence of antimicrobial resistance (AMR) observed worldwide. Using machine learning (ML) to predict E. coli AMR is a more efficient method than traditional laboratory testing. However, further improvement in the predictive performance of existing models remains challenging. In this study, we collected 1937 high-quality whole genome sequencing (WGS) data from public databases with an antimicrobial resistance phenotype and modified the existing workflow by adding an attention mechanism to enable the modified workflow to focus more on core single nucleotide polymorphisms (SNPs) that may significantly lead to the development of AMR in E. coli. While comparing the model performance before and after adding the attention mechanism, we also performed a cross-comparison among the published models using random forest (RF), support vector machine (SVM), logistic regression (LR), and convolutional neural network (CNN). Our study demonstrates that the discriminative positional colors of Chaos Game Representation (CGR) images can selectively influence and highlight genome regions without prior knowledge, enhancing prediction accuracy. Furthermore, we developed an online tool (https://github.com/tjiaa/E.coli-ML/tree/main) for assisting clinicians in the rapid prediction of the AMR phenotype of E. coli and accelerating clinical decision-making.
Affiliation(s)
- Canghong Jin
  - School of Computer and Computing Science, Hangzhou City University, Hangzhou 310015, China
- Chenghao Jia
  - Institute of Preventive Veterinary Sciences and Department of Veterinary Medicine, Zhejiang University College of Animal Sciences, Hangzhou 310058, China
- Wenkang Hu
  - School of Computer and Computing Science, Hangzhou City University, Hangzhou 310015, China
  - College of Computer Science and Technology, Zhejiang University, Hangzhou 310027, China
- Haidong Xu
  - School of Computer and Computing Science, Hangzhou City University, Hangzhou 310015, China
- Yanyi Shen
  - School of Computer and Computing Science, Hangzhou City University, Hangzhou 310015, China
- Min Yue
  - Institute of Preventive Veterinary Sciences and Department of Veterinary Medicine, Zhejiang University College of Animal Sciences, Hangzhou 310058, China
  - Hainan Institute of Zhejiang University, Sanya 572000, China
  - Zhejiang Provincial Key Laboratory of Preventive Veterinary Medicine, Hangzhou 310058, China
  - State Key Laboratory for Diagnosis and Treatment of Infectious Diseases, National Clinical Research Center for Infectious Diseases, National Medical Center for Infectious Diseases, The First Affiliated Hospital, College of Medicine, Zhejiang University, Hangzhou 310003, China
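The Chaos Game Representation (CGR) named in the abstract of entry 1 maps a nucleotide sequence to points in the unit square by repeatedly moving halfway toward the corner assigned to each base. The following is a minimal sketch of that general idea, not the authors' pipeline: the corner assignment, the function names `cgr_points` and `cgr_grid`, and the grid resolution are all assumptions chosen for illustration.

```python
# Corner assignment is one common convention, assumed here (not taken from the paper).
CORNERS = {"A": (0.0, 0.0), "C": (0.0, 1.0), "G": (1.0, 1.0), "T": (1.0, 0.0)}


def cgr_points(seq):
    """Return the chaos-game trajectory of a DNA sequence in the unit square."""
    x, y = 0.5, 0.5  # start at the centre of the square
    points = []
    for base in seq.upper():
        if base not in CORNERS:
            continue  # skip ambiguous symbols such as N
        cx, cy = CORNERS[base]
        # Move halfway from the current point toward the corner of this base.
        x, y = (x + cx) / 2.0, (y + cy) / 2.0
        points.append((x, y))
    return points


def cgr_grid(seq, k=2):
    """Bin the trajectory into a 2^k x 2^k grid of counts (a coarse CGR image)."""
    size = 2 ** k
    grid = [[0] * size for _ in range(size)]
    for x, y in cgr_points(seq):
        col = min(int(x * size), size - 1)
        row = min(int(y * size), size - 1)
        grid[row][col] += 1
    return grid
```

Calling `cgr_grid(sequence, k)` on a genome fragment gives a small count matrix that could be rendered as an image; how such an image is colored or fed to a classifier is outside the scope of this sketch.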
2. Bizzotto E, Zampieri G, Treu L, Filannino P, Di Cagno R, Campanaro S. Classification of bioactive peptides: A systematic benchmark of models and encodings. Comput Struct Biotechnol J 2024; 23:2442-2452. [PMID: 38867723 PMCID: PMC11168199 DOI: 10.1016/j.csbj.2024.05.040] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 03/19/2024] [Revised: 05/10/2024] [Accepted: 05/22/2024] [Indexed: 06/14/2024] [Open access]
Abstract
Bioactive peptides are short amino acid chains possessing biological activity and exerting physiological effects relevant to human health. Despite their therapeutic value, their identification remains a major problem, as it mainly relies on time-consuming in vitro tests. While bioinformatic tools for the identification of bioactive peptides are available, they are focused on specific functional classes and have not been systematically tested on realistic settings. To tackle this problem, bioactive peptide sequences and functions were here gathered from a variety of databases to generate a unified collection of bioactive peptides from microbial fermentation. This collection was organized into nine functional classes including some previously studied and some unexplored such as immunomodulatory, opioid and cardiovascular peptides. Upon assessing their sequence properties, four alternative encoding methods were tested in combination with a multitude of machine learning algorithms, from basic classifiers like logistic regression to advanced algorithms like BERT. Tests on a total of 171 models showed that, while some functions are intrinsically easier to detect, no single combination of classifiers and encoders worked universally well for all classes. For this reason, we unified all the best individual models for each class and generated CICERON (Classification of bIoaCtive pEptides fRom micrObial fermeNtation), a classification tool for the functional classification of peptides. State-of-the-art classifiers were found to underperform on our realistic benchmark dataset compared to the models included in CICERON. Altogether, our work provides a tool for real-world peptide classification and can serve as a benchmark for future model development.
Affiliation(s)
- Edoardo Bizzotto
  - Department of Biology, University of Padua, Via U. Bassi 58/b, Padova 35131, Italy
- Guido Zampieri
  - Department of Biology, University of Padua, Via U. Bassi 58/b, Padova 35131, Italy
- Laura Treu
  - Department of Biology, University of Padua, Via U. Bassi 58/b, Padova 35131, Italy
- Pasquale Filannino
  - Department of Soil, Plant and Food Science, University of Bari Aldo Moro, Via G. Amendola 165/a, Bari 70126, Italy
- Raffaella Di Cagno
  - Faculty of Agricultural, Environmental and Food Sciences, Free University of Bolzano, Piazza Universita, 5, Bolzano 39100, Italy
- Stefano Campanaro
  - Department of Biology, University of Padua, Via U. Bassi 58/b, Padova 35131, Italy
3. Kim N, Ma J, Kim W, Kim J, Belenky P, Lee I. Genome-resolved metagenomics: a game changer for microbiome medicine. Exp Mol Med 2024; 56:1501-1512. [PMID: 38945961 PMCID: PMC11297344 DOI: 10.1038/s12276-024-01262-7] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 12/13/2023] [Revised: 03/06/2024] [Accepted: 03/25/2024] [Indexed: 07/02/2024] [Open access]
Abstract
Recent substantial evidence implicating commensal bacteria in human diseases has given rise to a new domain in biomedical research: microbiome medicine. This emerging field aims to understand and leverage the human microbiota and derivative molecules for disease prevention and treatment. Despite the complex and hierarchical organization of this ecosystem, most research over the years has relied on 16S amplicon sequencing, a legacy of bacterial phylogeny and taxonomy. Although advanced sequencing technologies have enabled cost-effective analysis of entire microbiota, translating the relatively short nucleotide information into the functional and taxonomic organization of the microbiome has posed challenges until recently. In the last decade, genome-resolved metagenomics, which aims to reconstruct microbial genomes directly from whole-metagenome sequencing data, has made significant strides and continues to unveil the mysteries of various human-associated microbial communities. There has been a rapid increase in the volume of whole metagenome sequencing data and in the compilation of novel metagenome-assembled genomes and protein sequences in public depositories. This review provides an overview of the capabilities and methods of genome-resolved metagenomics for studying the human microbiome, with a focus on investigating the prokaryotic microbiota of the human gut. Just as decoding the human genome and its variations marked the beginning of the genomic medicine era, unraveling the genomes of commensal microbes and their sequence variations is ushering us into the era of microbiome medicine. Genome-resolved metagenomics stands as a pivotal tool in this transition and can accelerate our journey toward achieving these scientific and medical milestones.
Affiliation(s)
- Nayeon Kim
  - Department of Biotechnology, College of Life Science and Biotechnology, Yonsei University, Seoul, 03722, Republic of Korea
- Junyeong Ma
  - Department of Biotechnology, College of Life Science and Biotechnology, Yonsei University, Seoul, 03722, Republic of Korea
- Wonjong Kim
  - Department of Biotechnology, College of Life Science and Biotechnology, Yonsei University, Seoul, 03722, Republic of Korea
- Jungyeon Kim
  - Department of Biotechnology, College of Life Science and Biotechnology, Yonsei University, Seoul, 03722, Republic of Korea
- Peter Belenky
  - Department of Molecular Microbiology and Immunology, Brown University, Providence, RI, 02912, USA
- Insuk Lee
  - Department of Biotechnology, College of Life Science and Biotechnology, Yonsei University, Seoul, 03722, Republic of Korea
  - POSTECH Biotech Center, Pohang University of Science and Technology (POSTECH), Pohang, 37673, Republic of Korea
4. Yuan L, Wang K, Fang Y, Xu X, Chen Y, Zhao D, Lu K. Interaction of Cecropin A (1-7) Analogs with DNA Analyzed by Multi-spectroscopic Methods. Protein J 2024; 43:274-282. [PMID: 38265732 DOI: 10.1007/s10930-023-10177-7] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Accepted: 12/12/2023] [Indexed: 01/25/2024]
Abstract
Cecropin A (1-7) is a cationic antimicrobial peptide that contains many basic amino acids. To understand the effect of basic amino acids on cecropin A (1-7), the analogues CA2, CA3 and CA4, which carry additional arginine or lysine residues at the N-terminus or C-terminus, were designed and synthesized. The interaction of cecropin A (1-7) and its analogues with DNA was studied using ultraviolet-visible spectroscopy, fluorescence spectroscopy and circular dichroism spectroscopy. Multispectral analysis showed that basic amino acids improved the interaction between the analogues and DNA, and that the interaction was most pronounced for CA4. Fluorescence spectra indicated a Ksv value of 1.19 × 10⁵ L mol⁻¹ for CA4, compared with 3.73 × 10⁴ L mol⁻¹ for the original peptide cecropin A (1-7). Antimicrobial experiments with cecropin A (1-7) and its analogues showed that basic amino acids enhanced the antimicrobial effect of the analogues. The antimicrobial activity of CA4 against E. coli was eightfold higher than that of cecropin A (1-7). These results reveal the importance of basic amino acids in peptides and provide useful information for subsequent studies of antimicrobial peptides.
Affiliation(s)
- Libo Yuan
  - College of Chemistry and Chemical Engineering, Henan University of Technology, Zhengzhou, 450001, People's Republic of China
- Ke Wang
  - College of Chemistry and Chemical Engineering, Henan University of Technology, Zhengzhou, 450001, People's Republic of China
- Yuan Fang
  - Pharmacy Department, Zhengzhou People's Hospital, Zhengzhou, 450003, People's Republic of China
- Xiujuan Xu
  - College of Chemistry and Chemical Engineering, Henan University of Technology, Zhengzhou, 450001, People's Republic of China
- Yingcun Chen
  - College of Chemistry and Chemical Engineering, Henan University of Technology, Zhengzhou, 450001, People's Republic of China
- Dongxin Zhao
  - College of Chemistry and Chemical Engineering, Henan University of Technology, Zhengzhou, 450001, People's Republic of China
- Kui Lu
  - College of Chemistry and Chemical Engineering, Henan University of Technology, Zhengzhou, 450001, People's Republic of China
5. Okasha H. Fundamental Uses of Peptides as a New Model in Both Treatment and Diagnosis. Recent Pat Biotechnol 2024; 18:110-127. [PMID: 38282442 DOI: 10.2174/1872208317666230512143508] [Citation(s) in RCA: 1] [Impact Index Per Article: 1.0] [Received: 02/08/2023] [Revised: 03/16/2023] [Accepted: 04/04/2023] [Indexed: 01/30/2024]
Abstract
A peptide is a short chain of amino acids. Peptide bonds link the amino acids of a peptide together in a particular order. Their shorter length distinguishes peptides from proteins. Peptides are classified in different ways, including by chain length, source, or biological function. Because peptides serve several purposes, there is scope for improving peptide production and structure to enhance their activity. In addition, many patents on peptides for therapeutic and diagnostic approaches have been obtained. This review aims to give an overview of peptides used recently in treatment and diagnosis.
Affiliation(s)
- Hend Okasha
  - Department of Biochemistry and Molecular Biology, Theodor Bilharz Research Institute, Giza, 12411, Egypt
6. Haselbeck F, John M, Zhang Y, Pirnay J, Fuenzalida-Werner J, Costa R, Grimm D. Superior protein thermophilicity prediction with protein language model embeddings. NAR Genom Bioinform 2023; 5:lqad087. [PMID: 37829176 PMCID: PMC10566323 DOI: 10.1093/nargab/lqad087] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 04/14/2023] [Revised: 07/14/2023] [Accepted: 09/18/2023] [Indexed: 10/14/2023] [Open access]
Abstract
Protein thermostability is important in many areas of biotechnology, including enzyme engineering and protein-hybrid optoelectronics. Ever-growing protein databases and information on stability at different temperatures allow the training of machine learning models to predict whether proteins are thermophilic. In silico predictions could reduce costs and accelerate the development process by guiding researchers to more promising candidates. Existing models for predicting protein thermophilicity rely mainly on features derived from physicochemical properties. Recently, modern protein language models that directly use sequence information have demonstrated superior performance in several tasks. In this study, we evaluate the usefulness of protein language model embeddings for thermophilicity prediction with ProLaTherm, a Protein Language model-based Thermophilicity predictor. ProLaTherm significantly outperforms all feature-, sequence- and literature-based comparison partners on multiple evaluation metrics. In terms of the Matthews correlation coefficient, ProLaTherm outperforms the second-best competitor by 18.1% in a nested cross-validation setup. Using proteins from species not overlapping with species from the training data, ProLaTherm outperforms all competitors by at least 9.7%. On these data, it misclassified only one nonthermophilic protein as thermophilic. Furthermore, it correctly identified 97.4% of all thermophilic proteins in our test set with an optimal growth temperature above 70°C.
Affiliation(s)
- Florian Haselbeck
  - Technical University of Munich, Campus Straubing for Biotechnology and Sustainability, Bioinformatics, 94315 Straubing, Germany
  - Weihenstephan-Triesdorf University of Applied Sciences, Bioinformatics, 94315 Straubing, Germany
- Maura John
  - Technical University of Munich, Campus Straubing for Biotechnology and Sustainability, Bioinformatics, 94315 Straubing, Germany
  - Weihenstephan-Triesdorf University of Applied Sciences, Bioinformatics, 94315 Straubing, Germany
- Yuqi Zhang
  - Technical University of Munich, Campus Straubing for Biotechnology and Sustainability, Bioinformatics, 94315 Straubing, Germany
- Jonathan Pirnay
  - Technical University of Munich, Campus Straubing for Biotechnology and Sustainability, Bioinformatics, 94315 Straubing, Germany
  - Weihenstephan-Triesdorf University of Applied Sciences, Bioinformatics, 94315 Straubing, Germany
- Juan Pablo Fuenzalida-Werner
  - Technical University of Munich, Campus Straubing for Biotechnology and Sustainability, Chair of Biogenic Functional Materials, 94315 Straubing, Germany
- Rubén D Costa
  - Technical University of Munich, Campus Straubing for Biotechnology and Sustainability, Chair of Biogenic Functional Materials, 94315 Straubing, Germany
- Dominik G Grimm
  - Technical University of Munich, Campus Straubing for Biotechnology and Sustainability, Bioinformatics, 94315 Straubing, Germany
  - Weihenstephan-Triesdorf University of Applied Sciences, Bioinformatics, 94315 Straubing, Germany
  - Technical University of Munich, TUM School of Computation, Information and Technology (CIT), 85748 Garching, Germany
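The abstract of entry 6 reports results in terms of the Matthews correlation coefficient (MCC), which summarizes a binary confusion matrix as a single value between -1 and 1. The sketch below shows the standard formula; the function name and the convention of returning 0.0 when the denominator is zero are assumptions made for this illustration, and no numbers from the paper are reproduced.

```python
import math


def matthews_corrcoef(tp, tn, fp, fn):
    """Matthews correlation coefficient from binary confusion-matrix counts.

    tp, tn, fp, fn are the numbers of true positives, true negatives,
    false positives and false negatives.
    """
    numerator = tp * tn - fp * fn
    denominator = math.sqrt((tp + fp) * (tp + fn) * (tn + fp) * (tn + fn))
    # Assumed convention: report 0.0 when any marginal total is zero.
    return numerator / denominator if denominator else 0.0
```

Unlike plain accuracy, the MCC only approaches 1 when both classes are predicted well, which is one reason it is often preferred on imbalanced data such as thermophilic versus nonthermophilic proteins.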
7. Mwangi J, Kamau PM, Thuku RC, Lai R. Design methods for antimicrobial peptides with improved performance. Zool Res 2023; 44:1095-1114. [PMID: 37914524 PMCID: PMC10802102 DOI: 10.24272/j.issn.2095-8137.2023.246] [Citation(s) in RCA: 2] [Impact Index Per Article: 2.0] [Received: 08/02/2023] [Accepted: 09/20/2023] [Indexed: 11/03/2023] [Open access]
Abstract
The recalcitrance of pathogens to traditional antibiotics has made treating and eradicating bacterial infections more difficult. In this regard, developing new antimicrobial agents to combat antibiotic-resistant strains has become a top priority. Antimicrobial peptides (AMPs), a ubiquitous class of naturally occurring compounds with broad-spectrum antipathogenic activity, hold significant promise as an effective solution to the current antimicrobial resistance (AMR) crisis. Several AMPs have been identified and evaluated for their therapeutic application, with many already in the drug development pipeline. Their distinct properties, such as high target specificity, potency, and ability to bypass microbial resistance mechanisms, make AMPs a promising alternative to traditional antibiotics. Nonetheless, several challenges, such as high toxicity, lability to proteolytic degradation, low stability, poor pharmacokinetics, and high production costs, continue to hamper their clinical applicability. Therefore, recent research has focused on optimizing the properties of AMPs to improve their performance. By understanding the physicochemical properties of AMPs that correspond to their activity, such as amphipathicity, hydrophobicity, structural conformation, amino acid distribution, and composition, researchers can design AMPs with desired and improved performance. In this review, we highlight some of the key strategies used to optimize the performance of AMPs, including rational design and de novo synthesis. We also discuss the growing role of predictive computational tools, utilizing artificial intelligence and machine learning, in the design and synthesis of highly efficacious lead drug candidates.
Affiliation(s)
- James Mwangi
  - Key Laboratory of Bioactive Peptides of Yunnan Province, Engineering Laboratory of Peptides of Chinese Academy of Sciences, KIZ-CUHK Joint Laboratory of Bioresources and Molecular Research in Common Diseases, National Resource Centre for Non-Human Primates, Kunming Primate Research Centre, National Research Facility for Phenotypic & Genetic Analysis of Model Animals (Primate Facility), Sino-African Joint Research Centre, New Cornerstone Science Institute, Kunming Institute of Zoology, Chinese Academy of Sciences, Kunming, Yunnan 650107, China
  - Kunming College of Life Science, University of Chinese Academy of Sciences, Kunming, Yunnan 650204, China
- Peter Muiruri Kamau
  - Key Laboratory of Bioactive Peptides of Yunnan Province, Engineering Laboratory of Peptides of Chinese Academy of Sciences, KIZ-CUHK Joint Laboratory of Bioresources and Molecular Research in Common Diseases, National Resource Centre for Non-Human Primates, Kunming Primate Research Centre, National Research Facility for Phenotypic & Genetic Analysis of Model Animals (Primate Facility), Sino-African Joint Research Centre, New Cornerstone Science Institute, Kunming Institute of Zoology, Chinese Academy of Sciences, Kunming, Yunnan 650107, China
  - Kunming College of Life Science, University of Chinese Academy of Sciences, Kunming, Yunnan 650204, China
- Rebecca Caroline Thuku
  - Key Laboratory of Bioactive Peptides of Yunnan Province, Engineering Laboratory of Peptides of Chinese Academy of Sciences, KIZ-CUHK Joint Laboratory of Bioresources and Molecular Research in Common Diseases, National Resource Centre for Non-Human Primates, Kunming Primate Research Centre, National Research Facility for Phenotypic & Genetic Analysis of Model Animals (Primate Facility), Sino-African Joint Research Centre, New Cornerstone Science Institute, Kunming Institute of Zoology, Chinese Academy of Sciences, Kunming, Yunnan 650107, China
  - Kunming College of Life Science, University of Chinese Academy of Sciences, Kunming, Yunnan 650204, China
- Ren Lai
  - Key Laboratory of Bioactive Peptides of Yunnan Province, Engineering Laboratory of Peptides of Chinese Academy of Sciences, KIZ-CUHK Joint Laboratory of Bioresources and Molecular Research in Common Diseases, National Resource Centre for Non-Human Primates, Kunming Primate Research Centre, National Research Facility for Phenotypic & Genetic Analysis of Model Animals (Primate Facility), Sino-African Joint Research Centre, New Cornerstone Science Institute, Kunming Institute of Zoology, Chinese Academy of Sciences, Kunming, Yunnan 650107, China
  - Centre for Evolution and Conservation Biology, Southern Marine Science and Engineering Guangdong Laboratory, Guangzhou, Guangdong 511458, China
8. Mousa WK, Ghemrawi R, Abu-Izneid T, Ramadan A, Al-Marzooq F. Discovery of Lactomodulin, a Unique Microbiome-Derived Peptide That Exhibits Dual Anti-Inflammatory and Antimicrobial Activity against Multidrug-Resistant Pathogens. Int J Mol Sci 2023; 24:6901. [PMID: 37108065 PMCID: PMC10138793 DOI: 10.3390/ijms24086901] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 03/07/2023] [Revised: 04/05/2023] [Accepted: 04/06/2023] [Indexed: 04/29/2023] [Open access]
Abstract
The human body is a superorganism that harbors trillions of microbes, most of which inhabit the gut. To colonize our bodies, these microbes have evolved strategies to regulate the immune system and maintain intestinal immune homeostasis by secreting chemical mediators. There is much interest in deciphering these chemicals and furthering their development as novel therapeutics. In this work, we present a combined experimental and computational approach to identifying functional immunomodulatory molecules from the gut microbiome. Based on this approach, we report the discovery of lactomodulin, a unique peptide from Lactobacillus rhamnosus that exhibits dual anti-inflammatory and antibiotic activities and minimal cytotoxicity in human cell lines. Lactomodulin reduces several secreted proinflammatory cytokines, including IL-8, IL-6, IL-1β, and TNF-α. As an antibiotic, lactomodulin is effective against a range of human pathogens, and is most potent against antibiotic-resistant strains such as methicillin-resistant Staphylococcus aureus (MRSA) and vancomycin-resistant Enterococcus faecium (VRE). The multifunctional activity of lactomodulin affirms that the microbiome encodes evolved functional molecules with promising therapeutic potential.
Affiliation(s)
- Walaa K. Mousa
  - College of Pharmacy, Al Ain University, Abu Dhabi P.O. Box 112612, United Arab Emirates
  - AAU Health and Biomedical Research Center, Al Ain University, Abu Dhabi P.O. Box 112612, United Arab Emirates
  - College of Pharmacy, Mansoura University, Mansoura 35516, Egypt
- Rose Ghemrawi
  - College of Pharmacy, Al Ain University, Abu Dhabi P.O. Box 112612, United Arab Emirates
  - AAU Health and Biomedical Research Center, Al Ain University, Abu Dhabi P.O. Box 112612, United Arab Emirates
- Tareq Abu-Izneid
  - College of Pharmacy, Al Ain University, Abu Dhabi P.O. Box 112612, United Arab Emirates
  - AAU Health and Biomedical Research Center, Al Ain University, Abu Dhabi P.O. Box 112612, United Arab Emirates
- Azza Ramadan
  - College of Pharmacy, Al Ain University, Abu Dhabi P.O. Box 112612, United Arab Emirates
  - AAU Health and Biomedical Research Center, Al Ain University, Abu Dhabi P.O. Box 112612, United Arab Emirates
- Farah Al-Marzooq
  - Department of Medical Microbiology and Immunology, College of Medicine and Health Sciences, UAE University, Al Ain P.O. Box 15551, United Arab Emirates
9. Spänig S, Michel A, Heider D. Unsupervised encoding selection through ensemble pruning for biomedical classification. BioData Min 2023; 16:10. [PMID: 36927546 PMCID: PMC10018861 DOI: 10.1186/s13040-022-00317-7] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 01/24/2022] [Accepted: 11/27/2022] [Indexed: 03/18/2023] [Open access]
Abstract
BACKGROUND: Owing to the rising levels of multi-resistant pathogens, antimicrobial peptides, an alternative strategy to classic antibiotics, have received more attention. A crucial part of this is the costly identification and validation. With the ever-growing amount of annotated peptides, researchers leverage artificial intelligence to circumvent the cumbersome, wet-lab-based identification and automate the detection of promising candidates. However, the prediction of a peptide's function is not limited to antimicrobial efficiency. To date, multiple studies have successfully classified additional properties, e.g., antiviral or cell-penetrating effects. In this light, ensemble classifiers are employed with the aim of further improving the prediction. Although we recently presented a workflow to significantly narrow the initial encoding choice, a fully unsupervised encoding selection that considers various machine learning models is still lacking.
RESULTS: We developed a workflow that automatically selects encodings and generates classifier ensembles by employing sophisticated pruning methods. We observed that Pareto frontier pruning is a good method to create encoding ensembles for the datasets at hand. In addition, encodings combined with the Decision Tree classifier as the base model are often superior. However, our results also demonstrate that none of the ensemble building techniques is outstanding for all datasets.
CONCLUSION: The workflow conducts multiple pruning methods to evaluate ensemble classifiers composed from a wide range of peptide encodings and base models. Consequently, researchers can use the workflow for unsupervised encoding selection and ensemble creation. Ultimately, the extensible workflow can be used as a plugin for the PEPTIDE REACToR, further establishing it as a versatile tool in the domain.
Affiliation(s)
- Sebastian Spänig
  - Data Science in Biomedicine, Department of Mathematics and Computer Science, University of Marburg, Marburg, Germany
- Alexander Michel
  - Data Science in Biomedicine, Department of Mathematics and Computer Science, University of Marburg, Marburg, Germany
- Dominik Heider
  - Data Science in Biomedicine, Department of Mathematics and Computer Science, University of Marburg, Marburg, Germany
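The abstract of entry 9 names Pareto frontier pruning as a way to pick ensemble members. As a generic illustration of Pareto-front selection, and not a reconstruction of the authors' workflow, the sketch below keeps every candidate that no other candidate beats on both of two scores. The two objectives, the candidate names, and the function `pareto_front` are hypothetical.

```python
def pareto_front(candidates):
    """Return the names of non-dominated candidates.

    candidates: list of (name, score_a, score_b) tuples, where higher is
    better for both scores (both objectives are assumptions of this sketch).
    """
    front = []
    for name, a, b in candidates:
        # A candidate is dominated if another one is at least as good on
        # both scores and strictly better on at least one of them.
        dominated = any(
            (oa >= a and ob >= b) and (oa > a or ob > b)
            for _, oa, ob in candidates
        )
        if not dominated:
            front.append(name)
    return front


# Hypothetical encodings scored by, say, accuracy and ensemble diversity.
example = [
    ("enc1", 0.9, 0.2),
    ("enc2", 0.8, 0.5),
    ("enc3", 0.7, 0.4),
    ("enc4", 0.6, 0.9),
]
selected = pareto_front(example)
```

Here `enc3` is dropped because `enc2` is better on both scores, while the other three each represent a different trade-off and remain on the front.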
10. You Y, Liu H, Zhu Y, Zheng H. Rational design of stapled antimicrobial peptides. Amino Acids 2023; 55:421-442. [PMID: 36781451 DOI: 10.1007/s00726-023-03245-w] [Citation(s) in RCA: 7] [Impact Index Per Article: 7.0] [Received: 09/24/2022] [Accepted: 01/30/2023] [Indexed: 02/15/2023]
Abstract
The global increase in antimicrobial drug resistance has dramatically reduced the effectiveness of traditional antibiotics. Structurally diverse antibiotics are urgently needed to combat multiple-resistant bacterial infections. As part of innate immunity, antimicrobial peptides have been recognized as the most promising candidates because they comprise diverse sequences and mechanisms of action and have a relatively low induction rate of resistance. However, because of their low chemical stability, susceptibility to proteases, and high hemolytic effect, their usage is subject to many restrictions. Chemical modifications such as D-amino acid substitution, cyclization, and unnatural amino acid modification have been used to improve the stability of antimicrobial peptides for decades. Among them, a side-chain covalent bridge modification, the so-called stapled peptide, has attracted much attention. The stapled side-chain bridge stabilizes the secondary structure, induces protease resistance, and increases cell penetration and biological activity. Recent progress in computer-aided drug design and artificial intelligence methods has also been used in the design of stapled antimicrobial peptides and has led to the successful discovery of many prospective peptides. This article reviews the possible structure-activity relationships of stapled antimicrobial peptides, the physicochemical properties that influence their activity (such as net charge, hydrophobicity, helicity, and dipole moment), and computer-aided methods of stapled peptide design. Antimicrobial peptides under clinical trial: Pexiganan (NCT01594762, 2012-05-07). Omiganan (NCT02576847, 2015-10-13).
Affiliation(s)
- YuHao You
  - School of Life Science and Technology, China Pharmaceutical University, Nanjing, 210009, People's Republic of China
- HongYu Liu
  - School of Life Science and Technology, China Pharmaceutical University, Nanjing, 210009, People's Republic of China
- YouZhuo Zhu
  - School of Life Science and Technology, China Pharmaceutical University, Nanjing, 210009, People's Republic of China
- Heng Zheng
  - School of Life Science and Technology, China Pharmaceutical University, Nanjing, 210009, People's Republic of China
11. Hattab G, Anžel A, Spänig S, Neumann N, Heider D. A parametric approach for molecular encodings using multilevel atomic neighborhoods applied to peptide classification. NAR Genom Bioinform 2023; 5:lqac103. [PMID: 36632611 PMCID: PMC9830542 DOI: 10.1093/nargab/lqac103] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 08/11/2022] [Revised: 11/26/2022] [Accepted: 12/19/2022] [Indexed: 01/11/2023] [Open access]
Abstract
Exploring new ways to represent and discover organic molecules is critical to the development of new therapies. Fingerprinting algorithms are used to encode or machine-read organic molecules. Molecular encodings facilitate the computation of distance and similarity measurements to support tasks such as similarity search or virtual screening. Motivated by the ubiquity of carbon and the emerging structured patterns, we propose a parametric approach for molecular encodings using carbon-based multilevel atomic neighborhoods. It implements a walk along the carbon chain of a molecule to compute different representations of the neighborhoods in the form of a binary or numerical array that can later be exported into an image. Applied to the task of binary peptide classification, the evaluation was performed by using forty-nine encodings of twenty-nine data sets from various biomedical fields, resulting in well over 1421 machine learning models. By design, the parametric approach is domain- and task-agnostic and scopes all organic molecules including unnatural and exotic amino acids as well as cyclic peptides. Applied to peptide classification, our results point to a number of promising applications and extensions. The parametric approach was developed as a Python package (cmangoes), the source code and documentation of which can be found at https://github.com/ghattab/cmangoes and https://doi.org/10.5281/zenodo.7483771.
Affiliation(s)
- Aleksandar Anžel
- Department of Mathematics and Computer Science, Philipps-Universität Marburg, Marburg 35032, Germany
- Sebastian Spänig
- Department of Mathematics and Computer Science, Philipps-Universität Marburg, Marburg 35032, Germany
- Nils Neumann
- Department of Mathematics and Computer Science, Philipps-Universität Marburg, Marburg 35032, Germany
- Dominik Heider
- Department of Mathematics and Computer Science, Philipps-Universität Marburg, Marburg 35032, Germany
|
12
|
Bohnsack KS, Kaden M, Abel J, Villmann T. Alignment-Free Sequence Comparison: A Systematic Survey From a Machine Learning Perspective. IEEE/ACM Transactions on Computational Biology and Bioinformatics 2023; 20:119-135. [PMID: 34990369 DOI: 10.1109/tcbb.2022.3140873] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 05/04/2023]
Abstract
The encounter of large amounts of biological sequence data generated during the last decades and the algorithmic and hardware improvements have offered the possibility to apply machine learning techniques in bioinformatics. While the machine learning community is aware of the necessity to rigorously distinguish data transformation from data comparison and adopt reasonable combinations thereof, this awareness is often lacking in the field of comparative sequence analysis. With the realization of the disadvantages of alignments for sequence comparison, some typical applications increasingly use so-called alignment-free approaches. In light of this development, we present a conceptual framework for alignment-free sequence comparison, which highlights the delineation of: 1) the sequence data transformation comprising adequate mathematical sequence coding and feature generation, from 2) the subsequent (dis-)similarity evaluation of the transformed data by means of problem-specific but mathematically consistent proximity measures. We consider coding to be an information-loss free data transformation in order to get an appropriate representation, whereas feature generation is inevitably information-lossy with the intention to extract just the task-relevant information. This distinction sheds light on the plethora of methods available and assists in identifying suitable methods in machine learning and data analysis to compare the sequences under these premises.
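The two-stage view described in this abstract (transform the sequences first, then compare the transformed data) can be illustrated with a minimal sketch. The choice of k-mer frequency profiles as the feature generation step and Euclidean distance as the proximity measure is an assumption made for illustration, not the survey's recommendation, and the function names are hypothetical.

```python
from collections import Counter
from itertools import product
import math


def kmer_profile(seq, k=2, alphabet="ACGT"):
    """Feature generation (lossy): relative frequency of every k-mer over the alphabet."""
    kmers = ["".join(p) for p in product(alphabet, repeat=k)]
    counts = Counter(seq[i:i + k] for i in range(len(seq) - k + 1))
    total = sum(counts[km] for km in kmers)
    # Sequences shorter than k yield an all-zero profile.
    return [counts[km] / total if total else 0.0 for km in kmers]


def euclidean(u, v):
    """Proximity measure applied to the transformed data, not to the raw sequences."""
    return math.sqrt(sum((a - b) ** 2 for a, b in zip(u, v)))


# Sequences of different lengths map to profiles of equal length (4**k entries).
d = euclidean(kmer_profile("ACGTACGTAC"), kmer_profile("ACGTTTTT"))
```

Keeping the transformation and the distance as separate functions makes it easy to swap either one independently, which is the separation the abstract argues for.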
|
13
|
Dong B, Li M, Jiang B, Gao B, Li D, Zhang T. Antimicrobial Peptides Prediction method based on sequence multidimensional feature embedding. Front Genet 2022; 13:1069558. [PMID: 36468005 PMCID: PMC9714691 DOI: 10.3389/fgene.2022.1069558] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 10/14/2022] [Accepted: 11/02/2022] [Indexed: 09/10/2024] Open
Abstract
Antimicrobial peptides (AMPs) are alkaline substances with efficient bactericidal activity produced in living organisms. As the best substitute for antibiotics, they have received increasing attention in scientific research and clinical application. AMPs can be produced by almost all organisms and are capable of killing a wide variety of pathogenic microorganisms. In addition to being antibacterial, natural AMPs have many other therapeutically important activities, such as wound healing, antioxidant and immunomodulatory effects. Discovering new AMPs with wet-lab experimental methods is expensive and difficult, and bioinformatics technology can effectively solve this problem. Recently, some deep learning methods have been applied to the prediction of AMPs and achieved good results. To further improve the prediction accuracy of AMPs, this paper designs a new deep learning method based on sequence multidimensional representation. By encoding and embedding sequence features and then feeding them into the model to identify AMPs, high-precision classification of AMPs and non-AMPs with lengths of 10-200 is achieved. The results show that our method improved accuracy by 1.05% compared to the most advanced model in independent data validation without decreasing other indicators.
Affiliation(s)
- Benzhi Dong
- College of Information and Computer Engineering, Northeast Forestry University, Harbin, China
- Mengna Li
- College of Information and Computer Engineering, Northeast Forestry University, Harbin, China
- Bei Jiang
- Tianjin Second People's Hospital, Tianjin Institute of Hepatology, Tianjin, China
- Bo Gao
- Department of Radiology, The Second Affiliated Hospital of Harbin Medical University, Harbin, China
- Dan Li
- College of Information and Computer Engineering, Northeast Forestry University, Harbin, China
- Tianjiao Zhang
- College of Information and Computer Engineering, Northeast Forestry University, Harbin, China
|
14
|
Yan J, Cai J, Zhang B, Wang Y, Wong DF, Siu SWI. Recent Progress in the Discovery and Design of Antimicrobial Peptides Using Traditional Machine Learning and Deep Learning. Antibiotics (Basel) 2022; 11:1451. [PMID: 36290108 PMCID: PMC9598685 DOI: 10.3390/antibiotics11101451] [Citation(s) in RCA: 26] [Impact Index Per Article: 13.0] [Received: 09/20/2022] [Revised: 10/11/2022] [Accepted: 10/13/2022] [Indexed: 11/16/2022] Open
Abstract
Antimicrobial resistance has become a critical global health problem due to the abuse of conventional antibiotics and the rise of multi-drug-resistant microbes. Antimicrobial peptides (AMPs) are a group of natural peptides that show promise as next-generation antibiotics due to their low toxicity to the host, broad spectrum of biological activity, including antibacterial, antifungal, antiviral, and anti-parasitic activities, and great therapeutic potential, such as anticancer, anti-inflammatory, etc. Most importantly, AMPs kill bacteria by damaging cell membranes using multiple mechanisms of action rather than targeting a single molecule or pathway, making it difficult for bacterial drug resistance to develop. However, experimental approaches used to discover and design new AMPs are very expensive and time-consuming. In recent years, there has been considerable interest in using in silico methods, including traditional machine learning (ML) and deep learning (DL) approaches, to drug discovery. While there are a few papers summarizing computational AMP prediction methods, none of them focused on DL methods. In this review, we aim to survey the latest AMP prediction methods achieved by DL approaches. First, the biology background of AMP is introduced, then various feature encoding methods used to represent the features of peptide sequences are presented. We explain the most popular DL techniques and highlight the recent works based on them to classify AMPs and design novel peptide sequences. Finally, we discuss the limitations and challenges of AMP prediction.
Affiliation(s)
- Jielu Yan
- PAMI Research Group, Department of Computer and Information Science, University of Macau, Taipa, Macau, China
- Jianxiu Cai
- Faculty of Applied Sciences, Macao Polytechnic University, Macau, China
- Institute of Science and Environment, University of Saint Joseph, Estr. Marginal da Ilha Verde, Macau, China
- Bob Zhang
- PAMI Research Group, Department of Computer and Information Science, University of Macau, Taipa, Macau, China
- Yapeng Wang
- Faculty of Applied Sciences, Macao Polytechnic University, Macau, China
- Derek F. Wong
- NLP2CT Lab, Department of Computer and Information Science, University of Macau, Taipa, Macau, China
- Shirley W. I. Siu
- Institute of Science and Environment, University of Saint Joseph, Estr. Marginal da Ilha Verde, Macau, China
- School of Pharmaceutical Sciences, Universiti Sains Malaysia, Pulau Pinang 11800, Malaysia
|
15
|
Agüero-Chapin G, Galpert-Cañizares D, Domínguez-Pérez D, Marrero-Ponce Y, Pérez-Machado G, Teijeira M, Antunes A. Emerging Computational Approaches for Antimicrobial Peptide Discovery. Antibiotics (Basel) 2022; 11:936. [PMID: 35884190 PMCID: PMC9311958 DOI: 10.3390/antibiotics11070936] [Citation(s) in RCA: 15] [Impact Index Per Article: 7.5] [Received: 06/14/2022] [Revised: 07/01/2022] [Accepted: 07/08/2022] [Indexed: 02/05/2023] Open
Abstract
In the last two decades, many reports have addressed the application of artificial intelligence (AI) in the search and design of antimicrobial peptides (AMPs). AI has been represented by machine learning (ML) algorithms that use sequence-based features for the discovery of new peptidic scaffolds with promising biological activity. From an AI perspective, evolutionary algorithms have also been applied to the rational generation of peptide libraries aimed at the optimization/design of AMPs. However, the literature has scarcely been dedicated to other emerging non-conventional in silico approaches for the search/design of such bioactive peptides. Thus, the first motivation here is to bring up some non-standard peptide features that have been used to build classical ML predictive models. Secondly, it is valuable to highlight emerging ML algorithms and alternative computational tools to predict/design AMPs as well as to explore their chemical space. Another point worthy of mention is the recent application of evolutionary algorithms that actually simulate sequence evolution to both the generation of diversity-oriented peptide libraries and the optimization of hit peptides. Last but not least, we include some new considerations in proteogenomic analyses currently incorporated into the computational workflow for unravelling AMPs in natural sources.
Affiliation(s)
- Guillermin Agüero-Chapin
- CIIMAR—Centro Interdisciplinar de Investigação Marinha e Ambiental, Universidade do Porto, Terminal de Cruzeiros do Porto de Leixões, Av. General Norton de Matos, s/n, 4450-208 Porto, Portugal
- Departamento de Biologia, Faculdade de Ciências, Universidade do Porto, Rua do Campo Alegre, 4169-007 Porto, Portugal
- Correspondence: (G.A.-C.); (A.A.); Tel.: +351-22-340-1813 (G.A.-C. & A.A.)
- Deborah Galpert-Cañizares
- Departamento de Ciencia de la Computación, Universidad Central Marta Abreu de Las Villas (UCLV), Santa Clara 54830, Cuba
- Dany Domínguez-Pérez
- CIIMAR—Centro Interdisciplinar de Investigação Marinha e Ambiental, Universidade do Porto, Terminal de Cruzeiros do Porto de Leixões, Av. General Norton de Matos, s/n, 4450-208 Porto, Portugal
- Proquinorte, Unipessoal, Lda, Avenida 5 de Outubro, 124, 7º Piso, Avenidas Novas, 1050-061 Lisboa, Portugal
- Yovani Marrero-Ponce
- Universidad San Francisco de Quito (USFQ), Grupo de Medicina Molecular y Translacional (MeM&T), Colegio de Ciencias de la Salud (COCSA), Escuela de Medicina, Edificio de Especialidades Médicas and Instituto de Simulación Computacional (ISC-USFQ), Diego de Robles y vía Interoceánica, Quito 170157, Ecuador
- Gisselle Pérez-Machado
- EpiDisease S.L—Spin-Off of Centro de Investigación Biomédica en Red de Enfermedades Raras (CIBERER), 46980 Valencia, Spain
- Marta Teijeira
- Departamento de Química Orgánica, Facultade de Química, Universidade de Vigo, 36310 Vigo, Spain
- Instituto de Investigación Sanitaria Galicia Sur, Hospital Álvaro Cunqueiro, 36213 Vigo, Spain
- Agostinho Antunes
- CIIMAR—Centro Interdisciplinar de Investigação Marinha e Ambiental, Universidade do Porto, Terminal de Cruzeiros do Porto de Leixões, Av. General Norton de Matos, s/n, 4450-208 Porto, Portugal
- Departamento de Biologia, Faculdade de Ciências, Universidade do Porto, Rua do Campo Alegre, 4169-007 Porto, Portugal
- Correspondence: (G.A.-C.); (A.A.); Tel.: +351-22-340-1813 (G.A.-C. & A.A.)
|
16
|
Zhuang Y, Liu X, Zhong Y, Wu L. A Deep Ensemble Predictor for Identifying Anti-Hypertensive Peptides Using Pretrained Protein Embedding. IEEE/ACM Transactions on Computational Biology and Bioinformatics 2022; 19:1986-1992. [PMID: 33760739 DOI: 10.1109/tcbb.2021.3068381] [Citation(s) in RCA: 1] [Impact Index Per Article: 0.5] [Indexed: 06/12/2023]
Abstract
Hypertension (HT), or high blood pressure, is one of the most common and main causes of cardiovascular diseases, and is also related to a series of detrimental diseases in humans. Deficiencies in the effective treatment of HT are often associated with a series of diseases including multi-infarct dementia, amputation, and renal failure. Therefore, identifying anti-hypertensive peptides is of vital practical significance. Although many bioactive peptides have been developed to reduce blood pressure, developing them is time-consuming and laborious. In view of the obstacles of the intrinsic methods in antihypertensive peptide (AHTP) classification, computational methods are suggested as a supplement to identify AHTPs. In this study, we develop a comprehensive feature representation algorithm based on a pretrained model and a convolutional neural network and apply a deep ensemble model to construct the prediction model. The new predictor is used to identify AHTPs in benchmark and independent datasets. On the independent test set, its performance is better than that of recent methods. Comparative results indicate that our model can shed some light on hypertension therapy and offers more insight into classifying AHTPs. The implementation and code can be found at https://github.com/yuanying566/AHPred-DE.
|
17
|
Otović E, Njirjak M, Kalafatovic D, Mauša G. Sequential Properties Representation Scheme for Recurrent Neural Network-Based Prediction of Therapeutic Peptides. J Chem Inf Model 2022; 62:2961-2972. [PMID: 35704881 DOI: 10.1021/acs.jcim.2c00526] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 11/30/2022]
Abstract
The discovery of therapeutic peptides is often accelerated by means of virtual screening supported by machine learning-based predictive models. The predictive performance of such models is sensitive to the choice of data and its representation scheme. While the peptide physicochemical and compositional representations fail to distinguish sequence permutations, the amino acid arrangement within the sequence lacks the important information contained in physicochemical, conformational, topological, and geometrical properties. In this paper, we propose a solution to the identified information gap by implementing a hybrid scheme that complements the best traits from both approaches with the aim of predicting antimicrobial and antiviral activities based on experimental data from DRAMP 2.0, AVPdb, and Uniprot data repositories. Using the Friedman test of statistical significance, we compared our hybrid, sequential properties approach to peptide properties, one-hot vector encoding, and word embedding schemes in the 10-fold cross-validation setting, with respect to the F1 score, Matthews correlation coefficient, geometric mean, recall, and precision evaluation metrics. Moreover, the sequence modeling neural network was employed to gain insight into the synergic effect of both properties- and amino acid order-based predictions. The results suggest that sequential properties significantly (P < 0.01) surpasses the aforementioned state-of-the-art representation schemes. This makes it a strong candidate for increasing the predictive power of screening methods based on machine learning, applicable to any category of peptides.
Affiliation(s)
- Erik Otović
- University of Rijeka, Faculty of Engineering, 51000 Rijeka, Croatia
- Marko Njirjak
- University of Rijeka, Faculty of Engineering, 51000 Rijeka, Croatia
- Daniela Kalafatovic
- University of Rijeka, Department of Biotechnology, 51000 Rijeka, Croatia
- University of Rijeka, Center for Artificial Intelligence and Cybersecurity, 51000 Rijeka, Croatia
- Goran Mauša
- University of Rijeka, Faculty of Engineering, 51000 Rijeka, Croatia
- University of Rijeka, Center for Artificial Intelligence and Cybersecurity, 51000 Rijeka, Croatia
|
18
|
Wan F, Kontogiorgos-Heintz D, de la Fuente-Nunez C. Deep generative models for peptide design. Digital Discovery 2022; 1:195-208. [PMID: 35769205 PMCID: PMC9189861 DOI: 10.1039/d1dd00024a] [Citation(s) in RCA: 28] [Impact Index Per Article: 14.0] [Received: 10/15/2021] [Accepted: 03/19/2022] [Indexed: 12/13/2022]
Abstract
Computers can already be programmed for superhuman pattern recognition of images and text. For machines to discover novel molecules, they must first be trained to sort through the many characteristics of molecules and determine which properties should be retained, suppressed, or enhanced to optimize functions of interest. Machines need to be able to understand, read, write, and eventually create new molecules. Today, this creative process relies on deep generative models, which have gained popularity since powerful deep neural networks were introduced to generative model frameworks. In recent years, they have demonstrated excellent ability to model complex distributions of real-world data (e.g., images, audio, text, molecules, and biological sequences). Deep generative models can generate data beyond those provided in training samples, thus yielding an efficient and rapid tool for exploring the massive search space of high-dimensional data such as DNA/protein sequences and facilitating the design of biomolecules with desired functions. Here, we review the emerging field of deep generative models applied to peptide science. In particular, we discuss several popular deep generative model frameworks as well as their applications to generate peptides with various kinds of properties (e.g., antimicrobial, anticancer, cell penetration, etc.). We conclude our review with a discussion of current limitations and future perspectives in this emerging field.
Affiliation(s)
- Fangping Wan
- Machine Biology Group, Departments of Psychiatry and Microbiology, Institute for Biomedical Informatics, Institute for Translational Medicine and Therapeutics, Perelman School of Medicine, University of Pennsylvania Philadelphia Pennsylvania USA
- Departments of Bioengineering and Chemical and Biomolecular Engineering, School of Engineering and Applied Science, University of Pennsylvania Philadelphia Pennsylvania USA
- Penn Institute for Computational Science, University of Pennsylvania Philadelphia Pennsylvania USA
- Daphne Kontogiorgos-Heintz
- Machine Biology Group, Departments of Psychiatry and Microbiology, Institute for Biomedical Informatics, Institute for Translational Medicine and Therapeutics, Perelman School of Medicine, University of Pennsylvania Philadelphia Pennsylvania USA
- Departments of Bioengineering and Chemical and Biomolecular Engineering, School of Engineering and Applied Science, University of Pennsylvania Philadelphia Pennsylvania USA
- Penn Institute for Computational Science, University of Pennsylvania Philadelphia Pennsylvania USA
- Department of Computer and Information Science, School of Engineering and Applied Science, University of Pennsylvania Philadelphia Pennsylvania USA
- Cesar de la Fuente-Nunez
- Machine Biology Group, Departments of Psychiatry and Microbiology, Institute for Biomedical Informatics, Institute for Translational Medicine and Therapeutics, Perelman School of Medicine, University of Pennsylvania Philadelphia Pennsylvania USA
- Departments of Bioengineering and Chemical and Biomolecular Engineering, School of Engineering and Applied Science, University of Pennsylvania Philadelphia Pennsylvania USA
- Penn Institute for Computational Science, University of Pennsylvania Philadelphia Pennsylvania USA
|
19
|
Sequeira AM, Lousa D, Rocha M. ProPythia: A Python package for protein classification based on machine and deep learning. Neurocomputing 2022. [DOI: 10.1016/j.neucom.2021.07.102] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 11/25/2022]
|
20
|
Prediction of Linear Cationic Antimicrobial Peptides Active against Gram-Negative and Gram-Positive Bacteria Based on Machine Learning Models. Applied Sciences-Basel 2022. [DOI: 10.3390/app12073631] [Citation(s) in RCA: 6] [Impact Index Per Article: 3.0] [Indexed: 11/16/2022]
Abstract
Antimicrobial peptides (AMPs) are considered promising alternatives to conventional antibiotics for overcoming the growing problem of antibiotic resistance. Computational prediction approaches are receiving increasing interest as a way to identify and design the best candidate AMPs prior to in vitro tests. In this study, we focused on linear cationic peptides with non-hemolytic activity, which were downloaded from the Database of Antimicrobial Activity and Structure of Peptides (DBAASP). Referring to the MIC (minimum inhibitory concentration) values, we assigned a positive label to a peptide if it shows antimicrobial activity; otherwise, the peptide is labeled as negative. Here, we considered the peptides showing antimicrobial activity against Gram-negative and against Gram-positive bacteria separately, and we created two datasets accordingly. Ten different physico-chemical properties of the peptides are calculated and used as features in our study. Following data exploration and data preprocessing steps, a variety of classification algorithms are used with 100-fold Monte Carlo Cross-Validation to build models and to predict the antimicrobial activity of the peptides. Among the generated models, Random Forest yielded the best performance metrics for both the Gram-negative dataset (Accuracy: 0.98, Recall: 0.99, Specificity: 0.97, Precision: 0.97, AUC: 0.99, F1: 0.98) and the Gram-positive dataset (Accuracy: 0.95, Recall: 0.95, Specificity: 0.95, Precision: 0.90, AUC: 0.97, F1: 0.92) after outlier elimination is applied. This prediction approach might be useful to evaluate the antibacterial potential of a candidate peptide sequence before moving to experimental studies.
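The Monte Carlo cross-validation mentioned in this abstract repeatedly draws a fresh random train/test split instead of partitioning the data into fixed folds. The following is a minimal sketch of that idea only. The 20% test fraction, the accuracy metric, the function names, and the trivial majority-class baseline are all assumptions for illustration and are not taken from the study.

```python
import random


def monte_carlo_cv(samples, labels, fit_predict, n_splits=100, test_fraction=0.2, seed=0):
    """Repeat a random train/test split n_splits times and collect test accuracies.

    fit_predict(train_x, train_y, test_x) must return one predicted label per test sample.
    """
    rng = random.Random(seed)
    n = len(samples)
    n_test = max(1, int(round(n * test_fraction)))
    accuracies = []
    for _ in range(n_splits):
        idx = list(range(n))
        rng.shuffle(idx)  # a new random split each round; splits may overlap
        test_idx, train_idx = idx[:n_test], idx[n_test:]
        preds = fit_predict(
            [samples[i] for i in train_idx],
            [labels[i] for i in train_idx],
            [samples[i] for i in test_idx],
        )
        correct = sum(p == labels[i] for p, i in zip(preds, test_idx))
        accuracies.append(correct / n_test)
    return accuracies


def majority_class(train_x, train_y, test_x):
    """Hypothetical baseline: always predict the most frequent training label."""
    most_common = max(set(train_y), key=train_y.count)
    return [most_common] * len(test_x)


# Toy data: 10 feature vectors with binary activity labels.
scores = monte_carlo_cv([[i] for i in range(10)], [1] * 7 + [0] * 3, majority_class)
mean_accuracy = sum(scores) / len(scores)
```

Any real classifier can be plugged in through the `fit_predict` callback; reporting the mean and spread of the collected scores is the usual way to summarise the repeated splits.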
|
21
|
Identification of antimicrobial peptides from the human gut microbiome using deep learning. Nat Biotechnol 2022; 40:921-931. [PMID: 35241840 DOI: 10.1038/s41587-022-01226-0] [Citation(s) in RCA: 143] [Impact Index Per Article: 71.5] [Received: 07/06/2021] [Accepted: 01/19/2022] [Indexed: 02/07/2023]
Abstract
The human gut microbiome encodes a large variety of antimicrobial peptides (AMPs), but the short lengths of AMPs pose a challenge for computational prediction. Here we combined multiple natural language processing neural network models, including LSTM, Attention and BERT, to form a unified pipeline for candidate AMP identification from human gut microbiome data. Of 2,349 sequences identified as candidate AMPs, 216 were chemically synthesized, with 181 showing antimicrobial activity (a positive rate of >83%). Most of these peptides have less than 40% sequence homology to AMPs in the training set. Further characterization of the 11 most potent AMPs showed high efficacy against antibiotic-resistant, Gram-negative pathogens and demonstrated significant efficacy in lowering bacterial load by more than tenfold against a mouse model of bacterial lung infection. Our study showcases the potential of machine learning approaches for mining functional peptides from metagenome data and accelerating the discovery of promising AMP candidate molecules for in-depth investigations.
|
22
|
Grønning AGB, Kacprowski T, Schéele C. MultiPep: a hierarchical deep learning approach for multi-label classification of peptide bioactivities. Biol Methods Protoc 2021; 6:bpab021. [PMID: 34909478 PMCID: PMC8665375 DOI: 10.1093/biomethods/bpab021] [Citation(s) in RCA: 1] [Impact Index Per Article: 0.3] [Received: 09/08/2021] [Revised: 10/28/2021] [Accepted: 11/17/2021] [Indexed: 11/14/2022] Open
Abstract
Peptide-based therapeutics are here to stay and will prosper in the future. A key step in identifying novel peptide-drugs is the determination of their bioactivities. Recent advances in peptidomics screening approaches hold promise as a strategy for identifying novel drug targets. However, these screenings typically generate an immense number of peptides, and tools for ranking these peptides prior to planning functional studies are warranted. Whereas a couple of tools in the literature predict multiple classes, these are constructed using multiple binary classifiers. We here aimed to use an innovative deep learning approach to generate an improved peptide bioactivity classifier capable of distinguishing between multiple classes. We present MultiPep: a deep learning multi-label classifier that assigns peptides to zero or more of 20 bioactivity classes. We train and test MultiPep on data from several publicly available databases. The same data are used for a hierarchical clustering, whose dendrogram shapes the architecture of MultiPep. We test a new loss function that combines a customized version of Matthews correlation coefficient with binary cross entropy (BCE), and show that this is better than using class-weighted BCE as the loss function. Further, we show that MultiPep surpasses state-of-the-art peptide bioactivity classifiers and that it predicts known and novel bioactivities of FDA-approved therapeutic peptides. In conclusion, we present innovative machine learning techniques used to produce a peptide prediction tool to aid peptide-based therapy development and hypothesis generation.
Affiliation(s)
- Alexander G B Grønning
- Novo Nordisk Foundation Center for Basic Metabolic Research, Faculty of Health and Medical Sciences, University of Copenhagen, 2200 Copenhagen, Denmark
- Tim Kacprowski
- Division Data Science in Biomedicine, Peter L. Reichertz Institute for Medical Informatics, TU Braunschweig and Hannover Medical School, 38106 Braunschweig, Germany
- Braunschweig Integrated Centre for Systems Biology (BRICS), 38106 Braunschweig, Germany
- Camilla Schéele
- Novo Nordisk Foundation Center for Basic Metabolic Research, Faculty of Health and Medical Sciences, University of Copenhagen, 2200 Copenhagen, Denmark
|
23
|
Beinecke J, Heider D. Gaussian noise up-sampling is better suited than SMOTE and ADASYN for clinical decision making. BioData Min 2021; 14:49. [PMID: 34844620 PMCID: PMC8628399 DOI: 10.1186/s13040-021-00283-6] [Citation(s) in RCA: 11] [Impact Index Per Article: 3.7] [Received: 05/17/2021] [Accepted: 11/10/2021] [Indexed: 02/08/2023] Open
Abstract
Clinical data sets have very special properties and suffer from many caveats in machine learning. They typically show a high class imbalance, have a small number of samples and a large number of parameters, and have missing values. While feature selection approaches and imputation techniques address the former problems, the class imbalance is typically addressed using augmentation techniques. However, these techniques have been developed for big data analytics, and their suitability for clinical data sets is unclear. This study analyzed different augmentation techniques for use in clinical data sets and subsequent employment of machine learning-based classification. It turns out that Gaussian Noise Up-Sampling (GNUS) is generally, though not always, as good as SMOTE and ADASYN and even outperforms them on some data sets. However, it has also been shown that augmentation does not improve classification at all in some cases.
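The abstract does not spell out how Gaussian noise up-sampling works, so the following is only a sketch of the general idea under stated assumptions: synthetic minority-class samples are produced by picking existing minority samples at random and perturbing every feature with zero-mean Gaussian noise. The fixed noise scale `sigma`, the function name, and the sampling-with-replacement scheme are assumptions and may differ from the procedure evaluated in the study.

```python
import random


def gaussian_noise_upsample(minority, n_new, sigma=0.05, seed=0):
    """Create n_new synthetic samples by adding N(0, sigma^2) noise to random minority samples.

    minority: list of numeric feature vectors from the under-represented class.
    sigma: assumed noise scale; in practice it would be chosen per feature.
    """
    rng = random.Random(seed)
    synthetic = []
    for _ in range(n_new):
        base = rng.choice(minority)  # sample an existing minority point with replacement
        synthetic.append([x + rng.gauss(0.0, sigma) for x in base])
    return synthetic


# Toy example: grow a 3-sample minority class by 5 noisy copies.
minority_class_samples = [[0.9, 1.2], [1.1, 0.8], [1.0, 1.0]]
augmented = minority_class_samples + gaussian_noise_upsample(minority_class_samples, n_new=5)
```

Unlike SMOTE and ADASYN, this sketch does not interpolate between neighbouring samples, so it needs no nearest-neighbour search and stays usable when the minority class has very few members.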
Affiliation(s)
- Jacqueline Beinecke
- Department of Mathematics and Computer Science, Philipps-University of Marburg, Hans-Meerwein-Str. 6, 35043, Marburg, Germany
- Dominik Heider
- Department of Mathematics and Computer Science, Philipps-University of Marburg, Hans-Meerwein-Str. 6, 35043, Marburg, Germany
|
24
|
López-Vidal EM, Schissel CK, Mohapatra S, Bellovoda K, Wu CL, Wood JA, Malmberg AB, Loas A, Gómez-Bombarelli R, Pentelute BL. Deep Learning Enables Discovery of a Short Nuclear Targeting Peptide for Efficient Delivery of Antisense Oligomers. JACS Au 2021; 1:2009-2020. [PMID: 34841414 PMCID: PMC8611673 DOI: 10.1021/jacsau.1c00327] [Citation(s) in RCA: 9] [Impact Index Per Article: 3.0] [Received: 07/23/2021] [Indexed: 06/01/2023]
Abstract
Therapeutic macromolecules such as proteins and oligonucleotides can be highly efficacious but are often limited to extracellular targets due to the cell's impermeable membrane. Cell-penetrating peptides (CPPs) are able to deliver such macromolecules into cells, but limited structure-activity relationships and inconsistent literature reports make it difficult to design effective CPPs for a given cargo. For example, polyarginine motifs are common in CPPs, promoting cell uptake at the expense of systemic toxicity. Machine learning may be able to address this challenge by bridging gaps between experimental data in order to discern sequence-activity relationships that evade our intuition. Our earlier data set and deep learning model led to the design of miniproteins (>40 amino acids) for antisense delivery. Here, we leveraged and expanded our model with data augmentation in the short CPP sequence space of the data set to extrapolate and discover short, low-arginine-content CPPs that would be easier to synthesize and amenable to rapid conjugation to desired cargo, and with minimal in vivo toxicity. The lead predicted peptide, termed P6, is as active as a polyarginine CPP for the delivery of an antisense oligomer, while having only one arginine side chain and 18 total residues. We determined the pentalysine motif and the C-terminal cysteine of P6 to be the main drivers of activity. The antisense conjugate was able to enhance corrective splicing in an animal model to produce functional eGFP in heart tissue in vivo while remaining nontoxic up to a dose of 60 mg/kg. In addition, P6 was able to deliver an enzyme to the cytosol of cells. Our findings suggest that, given a data set of long CPPs, we can discover by extrapolation short, active sequences that deliver antisense oligomers.
Collapse
Affiliation(s)
- Eva M. López-Vidal
- Department of Chemistry, Massachusetts Institute of Technology, 77 Massachusetts Avenue, Cambridge, Massachusetts 02139, United States
| | - Carly K. Schissel
- Department of Chemistry, Massachusetts Institute of Technology, 77 Massachusetts Avenue, Cambridge, Massachusetts 02139, United States
| | - Somesh Mohapatra
- Department of Materials Science and Engineering, Massachusetts Institute of Technology, 77 Massachusetts Avenue, Cambridge, Massachusetts 02139, United States
| | - Kamela Bellovoda
- Sarepta Therapeutics, 215 First Street, Cambridge, Massachusetts 02142, United States
| | - Chia-Ling Wu
- Sarepta Therapeutics, 215 First Street, Cambridge, Massachusetts 02142, United States
| | - Jenna A. Wood
- Sarepta Therapeutics, 215 First Street, Cambridge, Massachusetts 02142, United States
| | - Annika B. Malmberg
- Sarepta Therapeutics, 215 First Street, Cambridge, Massachusetts 02142, United States
| | - Andrei Loas
- Department of Chemistry, Massachusetts Institute of Technology, 77 Massachusetts Avenue, Cambridge, Massachusetts 02139, United States
| | - Rafael Gómez-Bombarelli
- Department of Materials Science and Engineering, Massachusetts Institute of Technology, 77 Massachusetts Avenue, Cambridge, Massachusetts 02139, United States
| | - Bradley L. Pentelute
- Department of Chemistry, Massachusetts Institute of Technology, 77 Massachusetts Avenue, Cambridge, Massachusetts 02139, United States
- The Koch Institute for Integrative Cancer Research, Massachusetts Institute of Technology, 500 Main Street, Cambridge, Massachusetts 02142, United States
- Center for Environmental Health Sciences, Massachusetts Institute of Technology, 77 Massachusetts Avenue, Cambridge, Massachusetts 02139, United States
- Broad Institute of MIT and Harvard, 415 Main Street, Cambridge, Massachusetts 02142, United States
| |
|
25
|
Löchel HF, Heider D. Chaos game representation and its applications in bioinformatics. Comput Struct Biotechnol J 2021; 19:6263-6271. [PMID: 34900136 PMCID: PMC8636998 DOI: 10.1016/j.csbj.2021.11.008] [Citation(s) in RCA: 19] [Impact Index Per Article: 6.3] [Received: 08/26/2021] [Revised: 11/04/2021] [Accepted: 11/05/2021] [Indexed: 11/18/2022] Open
Abstract
Chaos game representation (CGR), a milestone in graphical bioinformatics, has become a powerful tool regarding alignment-free sequence comparison and feature encoding for machine learning. The algorithm maps a sequence to 2-dimensional space, while an extension of the CGR, the so-called frequency matrix representation (FCGR), transforms sequences of different lengths into equal-sized images or matrices. The CGR is a generalized Markov chain and includes various properties, which allow a unique representation of a sequence. Therefore, it has a broad spectrum of applications in bioinformatics, such as sequence comparison and phylogenetic analysis and as an encoding of sequences for machine learning. This review introduces the construction of CGRs and FCGRs, their applications on DNA and proteins, and gives an overview of recent applications and progress in bioinformatics.
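The mapping described in this abstract can be illustrated with a minimal Python sketch. It is not taken from the review: the corner assignment for A, C, G and T, the function names `cgr_points` and `fcgr`, and the choice to start at the centre of the unit square are assumptions made for illustration. Each nucleotide moves the current point halfway toward its corner (CGR), and the FCGR counts how many points fall into each cell of a 2^k x 2^k grid, so sequences of any length yield a matrix of the same size.

```python
from typing import List, Tuple

# Assumed corner convention for the unit square; other conventions exist.
CORNERS = {"A": (0.0, 0.0), "C": (0.0, 1.0), "G": (1.0, 1.0), "T": (1.0, 0.0)}


def cgr_points(seq: str) -> List[Tuple[float, float]]:
    """Return the CGR coordinates of a DNA sequence (A, C, G, T only)."""
    x, y = 0.5, 0.5  # start at the centre of the square
    points = []
    for base in seq.upper():
        if base not in CORNERS:
            raise ValueError(f"unsupported nucleotide: {base!r}")
        cx, cy = CORNERS[base]
        # Move halfway from the current point toward the base's corner.
        x, y = (x + cx) / 2.0, (y + cy) / 2.0
        points.append((x, y))
    return points


def fcgr(seq: str, k: int) -> List[List[int]]:
    """Count CGR points per cell of a 2^k x 2^k grid (rows index y, columns index x)."""
    size = 2 ** k
    grid = [[0] * size for _ in range(size)]
    for i, (x, y) in enumerate(cgr_points(seq)):
        if i < k - 1:
            continue  # the first k-1 points do not yet reflect a full k-mer
        col = min(int(x * size), size - 1)  # clamp guards against rounding to 1.0
        row = min(int(y * size), size - 1)
        grid[row][col] += 1
    return grid


if __name__ == "__main__":
    for row in fcgr("ACGTACGTAC", 2):
        print(row)
```

Because each cell at resolution k corresponds to one k-mer, the matrix is effectively a spatially arranged k-mer count table, which is what makes it usable as a fixed-size image input for a classifier.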
Affiliation(s)
- Hannah Franziska Löchel
- Department of Mathematics and Computer Science, University of Marburg, Hans-Meerwein-Str. 6, D-35032 Marburg, Germany
| | - Dominik Heider
- Department of Mathematics and Computer Science, University of Marburg, Hans-Meerwein-Str. 6, D-35032 Marburg, Germany
| |
|
26
|
Lach J, Jęcz P, Strapagiel D, Matera-Witkiewicz A, Stączek P. The Methods of Digging for "Gold" within the Salt: Characterization of Halophilic Prokaryotes and Identification of Their Valuable Biological Products Using Sequencing and Genome Mining Tools. Genes (Basel) 2021; 12:1756. [PMID: 34828362 PMCID: PMC8619533 DOI: 10.3390/genes12111756] [Citation(s) in RCA: 3] [Impact Index Per Article: 1.0] [Received: 08/18/2021] [Revised: 10/18/2021] [Accepted: 10/30/2021] [Indexed: 02/06/2023] Open
Abstract
Halophiles, the salt-loving organisms, have been investigated for at least a hundred years. They are found in all three domains of life, namely Archaea, Bacteria, and Eukarya, and occur in saline and hypersaline environments worldwide. They are already a valuable source of various biomolecules for biotechnological, pharmaceutical, cosmetological and industrial applications. In the present era of multidrug-resistant bacteria, cancer expansion, and extreme environmental pollution, the demand for new, effective compounds is higher and more urgent than ever before. Thus, the unique metabolism of halophilic microorganisms, their low nutritional requirements and their ability to adapt to harsh conditions (high salinity, high pressure and UV radiation, low oxygen concentration, hydrophobic conditions, extreme temperatures and pH, toxic compounds and heavy metals) make them promising candidates as a fruitful source of bioactive compounds. The main aim of this review is to highlight the nucleic acid sequencing experimental strategies used in halophile studies in concert with the presentation of recent examples of bioproducts and functions discovered in silico in the halophile's genomes. We point out methodological gaps and solutions based on in silico methods that are helpful in the identification of valuable bioproducts synthesized by halophiles. We also show the potential of an increasing number of publicly available genomic and metagenomic data for halophilic organisms that can be analysed to identify such new bioproducts and their producers.
Affiliation(s)
- Jakub Lach
- Department of Molecular Microbiology, Faculty of Biology and Environmental Protection, University of Lodz, 93-338 Lodz, Poland
- Biobank Lab, Department of Molecular Biophysics, Faculty of Environmental Protection, University of Lodz, 93-338 Lodz, Poland
| | - Paulina Jęcz
- Department of Molecular Microbiology, Faculty of Biology and Environmental Protection, University of Lodz, 93-338 Lodz, Poland
| | - Dominik Strapagiel
- Biobank Lab, Department of Molecular Biophysics, Faculty of Environmental Protection, University of Lodz, 93-338 Lodz, Poland
| | - Agnieszka Matera-Witkiewicz
- Screening Laboratory of Biological Activity Tests and Collection of Biological Material, Faculty of Pharmacy, Wroclaw Medical University, 50-368 Wroclaw, Poland
| | - Paweł Stączek
- Department of Molecular Microbiology, Faculty of Biology and Environmental Protection, University of Lodz, 93-338 Lodz, Poland
| |
|
27
|
Gandouz M, Holzmann H, Heider D. Machine learning with asymmetric abstention for biomedical decision-making. BMC Med Inform Decis Mak 2021; 21:294. [PMID: 34702225 PMCID: PMC8549182 DOI: 10.1186/s12911-021-01655-y] [Citation(s) in RCA: 1] [Impact Index Per Article: 0.3] [Received: 08/11/2021] [Accepted: 10/13/2021] [Indexed: 02/08/2023] Open
Abstract
Machine learning and artificial intelligence have entered biomedical decision-making for diagnostics, prognostics, or therapy recommendations. However, these methods need to be interpreted with care because of the severe consequences for patients. In contrast to human decision-making, computational models typically make a decision even when confidence is low. Machine learning with abstention better reflects human decision-making by introducing a reject option for samples with low confidence. The abstention intervals are typically symmetric intervals around the decision boundary. In the current study, we use asymmetric abstention intervals, which we demonstrate to be better suited for biomedical data, which are typically highly imbalanced. We evaluate symmetric and asymmetric abstention on three real-world biomedical datasets and show that both approaches can significantly improve classification performance. However, asymmetric abstention rejects as many or fewer samples compared to symmetric abstention and thus should be used for imbalanced data.
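The reject option described in this abstract can be sketched in a few lines of Python. This is an illustration only: the function name `predict_with_abstention`, the decision boundary of 0.5 and the interval widths `below` and `above` are hypothetical, and the abstract does not say how the study chooses its intervals. A sample is rejected (returned as `None`) when its predicted probability falls inside the interval around the boundary; setting `below` and `above` to different values makes the interval asymmetric.

```python
from typing import List, Optional, Sequence


def predict_with_abstention(
    probs: Sequence[float],
    threshold: float = 0.5,  # assumed decision boundary
    below: float = 0.1,      # hypothetical interval width under the boundary
    above: float = 0.1,      # hypothetical interval width over the boundary
) -> List[Optional[int]]:
    """Return 1, 0, or None (abstain) for each positive-class probability."""
    lower, upper = threshold - below, threshold + above
    decisions: List[Optional[int]] = []
    for p in probs:
        if p >= upper:
            decisions.append(1)     # confident positive
        elif p <= lower:
            decisions.append(0)     # confident negative
        else:
            decisions.append(None)  # inside the abstention interval: reject
    return decisions


if __name__ == "__main__":
    scores = [0.05, 0.35, 0.45, 0.65, 0.9]
    print(predict_with_abstention(scores))                        # symmetric interval
    print(predict_with_abstention(scores, below=0.1, above=0.3))  # asymmetric interval
```

With the symmetric setting only the score 0.45 is rejected; widening the upper side alone also rejects 0.65 while leaving the negative calls untouched, which is the kind of one-sided caution an imbalanced dataset may call for.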
Affiliation(s)
- Mariem Gandouz
- Department of Data Science in Biomedicine, Faculty of Mathematics and Computer Science, University of Marburg, 35032, Marburg, Germany
| | - Hajo Holzmann
- Department of Statistics, Faculty of Mathematics and Computer Science, University of Marburg, 35032, Marburg, Germany
| | - Dominik Heider
- Department of Data Science in Biomedicine, Faculty of Mathematics and Computer Science, University of Marburg, 35032, Marburg, Germany.
| |
|
28
|
Ren Y, Chakraborty T, Doijad S, Falgenhauer L, Falgenhauer J, Goesmann A, Hauschild AC, Schwengers O, Heider D. Prediction of antimicrobial resistance based on whole-genome sequencing and machine learning. Bioinformatics 2021; 38:325-334. [PMID: 34613360 PMCID: PMC8722762 DOI: 10.1093/bioinformatics/btab681] [Citation(s) in RCA: 44] [Impact Index Per Article: 14.7] [Received: 06/29/2021] [Revised: 08/27/2021] [Accepted: 09/24/2021] [Indexed: 02/03/2023] Open
Abstract
MOTIVATION Antimicrobial resistance (AMR) is one of the biggest global problems threatening human and animal health. Rapid and accurate AMR diagnostic methods are thus very urgently needed. However, traditional antimicrobial susceptibility testing (AST) is time-consuming, low throughput and viable only for cultivable bacteria. Machine learning methods may pave the way for automated AMR prediction based on genomic data of the bacteria. However, comparing different machine learning methods for the prediction of AMR based on different encodings and whole-genome sequencing data without previously known knowledge remains to be done. RESULTS In this study, we evaluated logistic regression (LR), support vector machine (SVM), random forest (RF) and convolutional neural network (CNN) for the prediction of AMR for the antibiotics ciprofloxacin, cefotaxime, ceftazidime and gentamicin. We could demonstrate that these models can effectively predict AMR with label encoding, one-hot encoding and frequency matrix chaos game representation (FCGR encoding) on whole-genome sequencing data. We trained these models on a large AMR dataset and evaluated them on an independent public dataset. Generally, RFs and CNNs perform better than LR and SVM with AUCs up to 0.96. Furthermore, we were able to identify mutations that are associated with AMR for each antibiotic. AVAILABILITY AND IMPLEMENTATION Source code in data preparation and model training are provided at GitHub website (https://github.com/YunxiaoRen/ML-iAMR). SUPPLEMENTARY INFORMATION Supplementary data are available at Bioinformatics online.
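Two of the encodings named in this abstract, label encoding and one-hot encoding, can be sketched as follows. The integer codes, the treatment of ambiguous calls such as N, and the function names are assumptions for illustration; the encoding actually used in the study is in the authors' repository and may differ.

```python
from typing import Dict, List

# Hypothetical integer codes; 0 is reserved for missing or ambiguous calls.
LABEL_CODES: Dict[str, int] = {"A": 1, "C": 2, "G": 3, "T": 4}

# One indicator per nucleotide, in the fixed order A, C, G, T.
ONE_HOT: Dict[str, List[int]] = {
    "A": [1, 0, 0, 0],
    "C": [0, 1, 0, 0],
    "G": [0, 0, 1, 0],
    "T": [0, 0, 0, 1],
}


def label_encode(snps: str) -> List[int]:
    """Map each SNP position to a single integer."""
    return [LABEL_CODES.get(base, 0) for base in snps.upper()]


def one_hot_encode(snps: str) -> List[int]:
    """Map each SNP position to four indicators and flatten the result."""
    flat: List[int] = []
    for base in snps.upper():
        flat.extend(ONE_HOT.get(base, [0, 0, 0, 0]))  # all zeros for unknown bases
    return flat


if __name__ == "__main__":
    print(label_encode("ACGTN"))
    print(one_hot_encode("AT"))
```

Label encoding keeps one feature per position but imposes an arbitrary order on the nucleotides, whereas one-hot encoding avoids that order at the cost of four times as many features.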
Affiliation(s)
- Yunxiao Ren
- Department of Data Science in Biomedicine, Faculty of Mathematics and Computer Science, Philipps-University of Marburg, Marburg 35032, Germany
| | - Trinad Chakraborty
- Institute of Medical Microbiology, Justus Liebig University Giessen, Giessen 35392, Germany; German Center for Infection Research, Partner site Giessen-Marburg-Langen, Giessen 35392, Germany
| | - Swapnil Doijad
- Institute of Medical Microbiology, Justus Liebig University Giessen, Giessen 35392, Germany; German Center for Infection Research, Partner site Giessen-Marburg-Langen, Giessen 35392, Germany
| | - Linda Falgenhauer
- German Center for Infection Research, Partner site Giessen-Marburg-Langen, Giessen 35392, Germany; Institute of Hygiene and Environmental Medicine, Justus Liebig University Giessen, Giessen 35392, Germany; Hessisches universitäres Kompetenzzentrum Krankenhaushygiene, Giessen 35392, Germany
| | - Jane Falgenhauer
- Institute of Medical Microbiology, Justus Liebig University Giessen, Giessen 35392, Germany; German Center for Infection Research, Partner site Giessen-Marburg-Langen, Giessen 35392, Germany
| | - Alexander Goesmann
- German Center for Infection Research, Partner site Giessen-Marburg-Langen, Giessen 35392, Germany; Department of Bioinformatics and Systems Biology, Justus Liebig University Giessen, Giessen 35392, Germany
| | - Anne-Christin Hauschild
- Department of Data Science in Biomedicine, Faculty of Mathematics and Computer Science, Philipps-University of Marburg, Marburg 35032, Germany
| | - Oliver Schwengers
- German Center for Infection Research, Partner site Giessen-Marburg-Langen, Giessen 35392, Germany; Department of Bioinformatics and Systems Biology, Justus Liebig University Giessen, Giessen 35392, Germany
| | | |
|
29
|
Schissel CK, Mohapatra S, Wolfe JM, Fadzen CM, Bellovoda K, Wu CL, Wood JA, Malmberg AB, Loas A, Gómez-Bombarelli R, Pentelute BL. Deep learning to design nuclear-targeting abiotic miniproteins. Nat Chem 2021; 13:992-1000. [PMID: 34373596 PMCID: PMC8819921 DOI: 10.1038/s41557-021-00766-3] [Citation(s) in RCA: 25] [Impact Index Per Article: 8.3] [Received: 06/26/2020] [Accepted: 07/05/2021] [Indexed: 02/08/2023]
Abstract
There are more amino acid permutations within a 40-residue sequence than atoms on Earth. This vast chemical search space hinders the use of human learning to design functional polymers. Here we show how machine learning enables the de novo design of abiotic nuclear-targeting miniproteins to traffic antisense oligomers to the nucleus of cells. We combined high-throughput experimentation with a directed evolution-inspired deep-learning approach in which the molecular structures of natural and unnatural residues are represented as topological fingerprints. The model is able to predict activities beyond the training dataset, and simultaneously deciphers and visualizes sequence-activity predictions. The predicted miniproteins, termed 'Mach', reach an average mass of 10 kDa, are more effective than any previously known variant in cells and can also deliver proteins into the cytosol. The Mach miniproteins are non-toxic and efficiently deliver antisense cargo in mice. These results demonstrate that deep learning can decipher design principles to generate highly active biomolecules that are unlikely to be discovered by empirical approaches.
Affiliation(s)
- Carly K. Schissel
- Massachusetts Institute of Technology, Department of Chemistry, 77 Massachusetts Avenue, Cambridge, MA 02139, USA
| | - Somesh Mohapatra
- Massachusetts Institute of Technology, Department of Materials Science and Engineering, 77 Massachusetts Avenue, Cambridge, MA 02139, USA
| | - Justin M. Wolfe
- Massachusetts Institute of Technology, Department of Chemistry, 77 Massachusetts Avenue, Cambridge, MA 02139, USA
| | - Colin M. Fadzen
- Massachusetts Institute of Technology, Department of Chemistry, 77 Massachusetts Avenue, Cambridge, MA 02139, USA
| | - Kamela Bellovoda
- Sarepta Therapeutics, 215 First Street, Cambridge, MA 02142, USA
| | - Chia-Ling Wu
- Sarepta Therapeutics, 215 First Street, Cambridge, MA 02142, USA
| | - Jenna A. Wood
- Sarepta Therapeutics, 215 First Street, Cambridge, MA 02142, USA
| | | | - Andrei Loas
- Massachusetts Institute of Technology, Department of Chemistry, 77 Massachusetts Avenue, Cambridge, MA 02139, USA
| | - Rafael Gómez-Bombarelli
- Massachusetts Institute of Technology, Department of Materials Science and Engineering, 77 Massachusetts Avenue, Cambridge, MA 02139, USA (corresponding author)
| | - Bradley L. Pentelute
- Massachusetts Institute of Technology, Department of Chemistry, 77 Massachusetts Avenue, Cambridge, MA 02139, USA; The Koch Institute for Integrative Cancer Research, Massachusetts Institute of Technology, 500 Main Street, Cambridge, MA 02142, USA; Center for Environmental Health Sciences, Massachusetts Institute of Technology, 77 Massachusetts Avenue, Cambridge, MA 02139, USA; Broad Institute of MIT and Harvard, 415 Main Street, Cambridge, MA 02142, USA (corresponding author)
| |
|
30
|
Li H, Tamang T, Nantasenamat C. Toward insights on antimicrobial selectivity of host defense peptides via machine learning model interpretation. Genomics 2021; 113:3851-3863. [PMID: 34480984 DOI: 10.1016/j.ygeno.2021.08.023] [Citation(s) in RCA: 2] [Impact Index Per Article: 0.7] [Received: 07/30/2020] [Revised: 08/22/2021] [Accepted: 08/25/2021] [Indexed: 10/20/2022]
Abstract
Host defense peptides are promising candidates for the development of novel antibiotics. To realize their therapeutic potential, high levels of target selectivity are essential. This study aims to identify factors governing selectivity by using the random forest algorithm to correlate peptide sequence information with bioactivity data. Satisfactory predictive models were achieved from out-of-bag prediction, which yielded accuracies and Matthews correlation coefficients in excess of 0.80 and 0.57, respectively. Model interpretation through the use of variable importance metrics and partial dependence plots indicated that selectivity was heavily influenced by the composition and distribution patterns of molecular charge and solubility-related parameters. Furthermore, the three investigated bacterial target species (Escherichia coli, Pseudomonas aeruginosa and Staphylococcus aureus) likely had a significant influence on how selectivity was realized, as there appears to be a similar underlying selectivity mechanism on the basis of charge-solubility properties, but one that is tailored to the target in question.
Affiliation(s)
- Hao Li
- Center of Data Mining and Biomedical Informatics, Faculty of Medical Technology, Mahidol University, Bangkok 10700, Thailand
| | - Thinam Tamang
- Madan Bhandari Memorial College, Institute of Science and Technology, Tribhuvan University, Kathmandu 44602, Nepal
| | - Chanin Nantasenamat
- Center of Data Mining and Biomedical Informatics, Faculty of Medical Technology, Mahidol University, Bangkok 10700, Thailand.
| |
|
31
|
Akbar S, Ahmad A, Hayat M, Rehman AU, Khan S, Ali F. iAtbP-Hyb-EnC: Prediction of antitubercular peptides via heterogeneous feature representation and genetic algorithm based ensemble learning model. Comput Biol Med 2021; 137:104778. [PMID: 34481183 DOI: 10.1016/j.compbiomed.2021.104778] [Citation(s) in RCA: 42] [Impact Index Per Article: 14.0] [Received: 06/13/2021] [Revised: 08/16/2021] [Accepted: 08/17/2021] [Indexed: 11/26/2022]
Abstract
Tuberculosis (TB) is a worldwide illness caused by the bacterium Mycobacterium tuberculosis. Owing to the high prevalence of multidrug-resistant tuberculosis, numerous traditional strategies for developing novel alternative therapies have been presented. The effectiveness and dependability of these procedures are not always consistent. Peptide-based therapy has recently been regarded as a preferable alternative due to its excellent selectivity in targeting specific cells without affecting the normal cells. However, due to the rapid growth of the peptide samples, predicting TB accurately has become a challenging task. To effectively identify antitubercular peptides, an intelligent and reliable prediction model is indispensable. An ensemble learning approach was used in this study to improve expected results by compensating for the shortcomings of individual classification algorithms. Initially, three distinct representation approaches were used to formulate the training samples: k-space amino acid composition, composite physiochemical properties, and one-hot encoding. The feature vectors of the applied feature extraction methods are then combined to generate a heterogeneous vector. Finally, utilizing individual and heterogeneous vectors, five classification models of distinct nature were used to evaluate prediction rates. In addition, a genetic algorithm-based ensemble model was used to improve the suggested model's prediction and training capabilities. Using training and independent datasets, the proposed ensemble model achieved accuracies of 94.47% and 92.68%, respectively. Our proposed "iAtbP-Hyb-EnC" model outperformed existing predictors, reporting ~10% higher training accuracy. The "iAtbP-Hyb-EnC" model is suggested to be a reliable tool for scientists and might play a valuable role in academic research and drug discovery. The source code and all datasets are publicly available at https://github.com/Farman335/iAtbP-Hyb-EnC.
Affiliation(s)
- Shahid Akbar
- Department of Computer Science, Abdul Wali Khan University, Mardan, KP, 23200, Pakistan.
| | - Ashfaq Ahmad
- Department of Computer Science, Abdul Wali Khan University, Mardan, KP, 23200, Pakistan.
| | - Maqsood Hayat
- Department of Computer Science, Abdul Wali Khan University, Mardan, KP, 23200, Pakistan.
| | - Ateeq Ur Rehman
- Department of Information Technology, The University of Haripur, KP, Pakistan.
| | - Salman Khan
- Department of Computer Science, Abdul Wali Khan University, Mardan, KP, 23200, Pakistan.
| | - Farman Ali
- School of Computer Science and Engineering, Nanjing University of Science and Technology, Nanjing, 210094, China.
| |
|
32
|
Zhang J, Zhang Z, Pu L, Tang J, Guo F. AIEpred: An Ensemble Predictive Model of Classifier Chain to Identify Anti-Inflammatory Peptides. IEEE/ACM Transactions on Computational Biology and Bioinformatics 2021; 18:1831-1840. [PMID: 31985437 DOI: 10.1109/tcbb.2020.2968419] [Citation(s) in RCA: 26] [Impact Index Per Article: 8.7] [Indexed: 06/10/2023]
Abstract
Anti-inflammatory peptides (AIEs) have recently emerged as promising therapeutic agents for the treatment of various inflammatory diseases, such as rheumatoid arthritis and Alzheimer's disease. Therefore, detecting the correlation between an amino acid sequence and its anti-inflammatory property is of great importance for the discovery of new AIEs. To address this issue, we propose a novel prediction tool for accurate identification of peptides as anti-inflammatory epitopes or non-anti-inflammatory epitopes. First, we encode the original peptide sequence to better mine and explore its information and patterns, based on three feature representations: amino acid contact, position-specific scoring matrix, and physicochemical properties. At the same time, we exploit several feature extraction models and utilize one feature selection model in order to construct many base classifiers from the various feature representations. More specifically, we develop an effective classification model with which we can extract and learn a set of informative features from the ensemble classifier chain model with different groups of base classifiers. Furthermore, in order to test the predictive power of our model, we conduct comparative experiments using leave-one-out cross-validation and an independent test. The results show that our novel predictor identifies AIEs as accurately as existing outstanding prediction tools. Source codes are available at https://github.com/guofei-tju/Ensemble-classifier-chain-model.
|
33
|
Nguyen-Vo TH, Trinh QH, Nguyen L, Do TTT, Chua MCH, Nguyen BP. Predicting Antimalarial Activity in Natural Products Using Pretrained Bidirectional Encoder Representations from Transformers. J Chem Inf Model 2021; 62:5050-5058. [DOI: 10.1021/acs.jcim.1c00584] [Citation(s) in RCA: 1] [Impact Index Per Article: 0.3] [Indexed: 11/29/2022]
Affiliation(s)
- Thanh-Hoang Nguyen-Vo
- School of Mathematics and Statistics, Victoria University of Wellington, Kelburn Parade, Wellington 6140, New Zealand
| | - Quang H. Trinh
- Computational Biology Center, International University−VNU HCMC, Ho Chi Minh City 700000, Vietnam
| | - Loc Nguyen
- Computational Biology Center, International University−VNU HCMC, Ho Chi Minh City 700000, Vietnam
| | - Trang T. T. Do
- School of Business and Information Technology, Wellington Institute of Technology, 21 Kensington Avenue, Lower Hutt 5012, New Zealand
| | - Matthew Chin Heng Chua
- Institute of Systems Science, National University of Singapore, 29 Heng Mui Keng Terrace, Singapore 119620, Singapore
| | - Binh P. Nguyen
- School of Mathematics and Statistics, Victoria University of Wellington, Kelburn Parade, Wellington 6140, New Zealand
| |
|
34
|
Heinen S, von Rudorff GF, von Lilienfeld OA. Toward the design of chemical reactions: Machine learning barriers of competing mechanisms in reactant space. J Chem Phys 2021; 155:064105. [PMID: 34391351 DOI: 10.1063/5.0059742] [Citation(s) in RCA: 33] [Impact Index Per Article: 11.0] [Indexed: 01/13/2023] Open
Abstract
The interplay of kinetics and thermodynamics governs reactive processes, and their control is key in synthesis efforts. While sophisticated numerical methods for studying equilibrium states have well advanced, quantitative predictions of kinetic behavior remain challenging. We introduce a reactant-to-barrier (R2B) machine learning model that rapidly and accurately infers activation energies and transition state geometries throughout the chemical compound space. R2B exhibits improving accuracy as training set sizes grow and requires as input solely the molecular graph of the reactant and the information of the reaction type. We provide numerical evidence for the applicability of R2B for two competing text-book reactions relevant to organic synthesis, E2 and SN2, trained and tested on chemically diverse quantum data from the literature. After training on 1-1.8k examples, R2B predicts activation energies on average within less than 2.5 kcal/mol with respect to the coupled-cluster singles doubles reference within milliseconds. Principal component analysis of kernel matrices reveals the hierarchy of the multiple scales underpinning reactivity in chemical space: Nucleophiles and leaving groups, substituents, and pairwise substituent combinations correspond to systematic lowering of eigenvalues. Analysis of R2B based predictions of ∼11.5k E2 and SN2 barriers in the gas-phase for previously undocumented reactants indicates that on average, E2 is favored in 75% of all cases and that SN2 becomes likely for chlorine as nucleophile/leaving group and for substituents consisting of hydrogen or electron-withdrawing groups. Experimental reaction design from first principles is enabled due to R2B, which is demonstrated by the construction of decision trees. Numerical R2B based results for interatomic distances and angles of reactant and transition state geometries suggest that Hammond's postulate is applicable to SN2, but not to E2.
Affiliation(s)
- Stefan Heinen
- Faculty of Physics, University of Vienna, Kolingasse 14-16, AT-1090 Wien, Austria
| | | | | |
|
35
|
Chen J, Cheong HH, Siu SWI. xDeep-AcPEP: Deep Learning Method for Anticancer Peptide Activity Prediction Based on Convolutional Neural Network and Multitask Learning. J Chem Inf Model 2021; 61:3789-3803. [PMID: 34327990 DOI: 10.1021/acs.jcim.1c00181] [Citation(s) in RCA: 29] [Impact Index Per Article: 9.7] [Indexed: 12/30/2022]
Abstract
Cancer is one of the leading causes of death worldwide. Conventional cancer treatment relies on radiotherapy and chemotherapy, but both methods bring severe side effects to patients, as these therapies not only attack cancer cells but also damage normal cells. Anticancer peptides (ACPs) are a promising alternative as therapeutic agents that are efficient and selective against tumor cells. Here, we propose a deep learning method based on convolutional neural networks to predict biological activity (EC50, LC50, IC50, and LD50) against six tumor cells, including breast, colon, cervix, lung, skin, and prostate. We show that models derived with multitask learning achieve better performance than conventional single-task models. In repeated 5-fold cross validation using the CancerPPD data set, the best models with the applicability domain defined obtain an average mean squared error of 0.1758, Pearson's correlation coefficient of 0.8086, and Kendall's correlation coefficient of 0.6156. As a step toward model interpretability, we infer the contribution of each residue in the sequence to the predicted activity by means of feature importance weights derived from the convolutional layers of the model. The present method, referred to as xDeep-AcPEP, will help to identify effective ACPs in rational peptide design for therapeutic purposes. The data, script files for reproducing the experiments, and the final prediction models can be downloaded from http://github.com/chen709847237/xDeep-AcPEP. The web server to directly access this prediction method is at https://app.cbbio.online/acpep/home.
Affiliation(s)
- Jiarui Chen
- Department of Computer and Information Science, University of Macau, Avenida da Universidade, Taipa, Macau 999078, China
| | - Hong Hin Cheong
- Department of Computer and Information Science, University of Macau, Avenida da Universidade, Taipa, Macau 999078, China
| | - Shirley W I Siu
- Department of Computer and Information Science, University of Macau, Avenida da Universidade, Taipa, Macau 999078, China; School of Pharmaceutical Sciences, Universiti Sains Malaysia, 11800 USM, Penang, Malaysia
| |
|
36
|
Singh O, Hsu WL, Su ECY. Co-AMPpred for in silico-aided predictions of antimicrobial peptides by integrating composition-based features. BMC Bioinformatics 2021; 22:389. [PMID: 34330209 PMCID: PMC8325260 DOI: 10.1186/s12859-021-04305-2] [Citation(s) in RCA: 8] [Impact Index Per Article: 2.7] [Received: 04/14/2021] [Accepted: 07/21/2021] [Indexed: 12/24/2022] Open
Abstract
BACKGROUND Antimicrobial peptides (AMPs) are oligopeptides that act as crucial components of innate immunity, naturally occur in all multicellular organisms, and are involved in the first line of defense. Recent studies showed that AMPs possess great potential that is not limited to antimicrobial activity. They are also crucial regulators of host immune responses that can modulate a wide range of activities, such as immune regulation, wound healing, and apoptosis. However, the ability of microorganisms to adapt to and resist existing antibiotics has prompted the scientific community to develop alternatives to conventional antibiotics. Therefore, to address this issue, we proposed Co-AMPpred, an in silico-aided AMP prediction method based on compositional features of amino acid residues to classify AMPs and non-AMPs. RESULTS In our study, we developed a prediction method that incorporates composition-based sequence and physicochemical features into various machine-learning algorithms. Then, the Boruta feature-selection algorithm was used to identify discriminative biological features, and only these features were used to develop our model. Additionally, we performed stratified tenfold cross-validation to validate the predictive performance of our AMP prediction model and evaluated it on an independent holdout test dataset. A benchmark dataset was collected from previous studies to evaluate the predictive performance of our model. CONCLUSIONS Experimental results show that combining composition-based and physicochemical features outperformed existing methods on both the benchmark training dataset and a reduced training dataset. Finally, our proposed method achieved an accuracy of 80.8% and an area under the receiver operating characteristic curve of 0.871 on the independent test set. Our code and datasets are available at https://github.com/onkarS23/CoAMPpred.
Supplementary Information The online version contains supplementary material available at 10.1186/s12859-021-04305-2.
Affiliation(s)
- Onkar Singh
- Bioinformatics Program, Taiwan International Graduate Program, Institute of Information Science, Academia Sinica, Taipei, Taiwan; Institute of Biomedical Informatics, National Yang Ming Chiao Tung University, Taipei, Taiwan; Graduate Institute of Biomedical Informatics, College of Medical Science and Technology, Taipei Medical University, 250 Wu-Xing Street, Taipei, 11031, Taiwan
- Wen-Lian Hsu
- Bioinformatics Program, Taiwan International Graduate Program, Institute of Information Science, Academia Sinica, Taipei, Taiwan; Institute of Biomedical Informatics, National Yang Ming Chiao Tung University, Taipei, Taiwan
- Emily Chia-Yu Su
- Graduate Institute of Biomedical Informatics, College of Medical Science and Technology, Taipei Medical University, 250 Wu-Xing Street, Taipei, 11031, Taiwan; Clinical Big Data Research Center, Taipei Medical University Hospital, Taipei, Taiwan
|
37
|
Zervou MA, Doutsi E, Pavlidis P, Tsakalides P. Structural classification of proteins based on the computationally efficient recurrence quantification analysis and horizontal visibility graphs. Bioinformatics 2021; 37:1796-1804. [PMID: 34048559 DOI: 10.1093/bioinformatics/btab407] [Citation(s) in RCA: 1] [Impact Index Per Article: 0.3] [Received: 01/11/2021] [Revised: 04/13/2021] [Accepted: 05/27/2021] [Indexed: 11/14/2022] Open
Abstract
MOTIVATION Protein structural class prediction is one of the most significant problems in bioinformatics, as it has a prominent role in understanding the function and evolution of proteins. Designing a prediction method that is both computationally efficient and accurate remains a pressing issue, especially for sequences for which sufficient homologous information cannot be obtained from existing protein sequence databases. Several studies demonstrate the potential of utilizing chaos game representation (CGR) along with time series analysis tools such as recurrence quantification analysis (RQA), complex networks, horizontal visibility graphs (HVG) and others. However, the majority of existing works involve a large number of features and require an exhaustive, time-consuming search for the optimal parameters. To address these problems, this work adopts the generalized multidimensional recurrence quantification analysis (GmdRQA) as an efficient tool that processes a multidimensional time series concurrently and reduces the number of features. In addition, two data-driven algorithms, namely average mutual information (AMI) and false nearest neighbors (FNN), are utilized to define the optimal GmdRQA parameters quickly yet precisely. RESULTS The classification accuracy is improved by the combination of GmdRQA with the HVG. Experimental evaluation on a real benchmark dataset demonstrates that our methods achieve performance similar to the state of the art but at a smaller computational cost. AVAILABILITY The code to reproduce all the results is available at https://github.com/aretiz/protein_structure_classification/tree/main. SUPPLEMENTARY INFORMATION Supplementary data are available at Bioinformatics online.
Affiliation(s)
- Michaela Areti Zervou
- Department of Computer Science, University of Crete, Heraklion, 700 13, Greece; Institute of Computer Science, Foundation for Research and Technology-Hellas, Heraklion, 700 13, Greece
- Effrosyni Doutsi
- Institute of Computer Science, Foundation for Research and Technology-Hellas, Heraklion, 700 13, Greece
- Pavlos Pavlidis
- Institute of Computer Science, Foundation for Research and Technology-Hellas, Heraklion, 700 13, Greece
- Panagiotis Tsakalides
- Department of Computer Science, University of Crete, Heraklion, 700 13, Greece; Institute of Computer Science, Foundation for Research and Technology-Hellas, Heraklion, 700 13, Greece
|
38
|
Spänig S, Mohsen S, Hattab G, Hauschild AC, Heider D. A large-scale comparative study on peptide encodings for biomedical classification. NAR Genom Bioinform 2021; 3:lqab039. [PMID: 34046590 PMCID: PMC8140742 DOI: 10.1093/nargab/lqab039] [Citation(s) in RCA: 8] [Impact Index Per Article: 2.7] [Received: 01/15/2021] [Revised: 04/13/2021] [Accepted: 04/26/2021] [Indexed: 01/19/2023] Open
Abstract
Owing to the great variety of distinct peptide encodings, choosing one for a given biomedical classification task is challenging. Researchers have to determine encodings capable of representing underlying patterns as numerical input for the subsequent machine learning. A general guideline is lacking in the literature; thus, we present here the first large-scale comprehensive study to investigate the performance of a wide range of encodings on multiple datasets from different biomedical domains. For the sake of completeness, we added additional sequence- and structure-based encodings. In particular, we collected 50 biomedical datasets and defined a fixed parameter space for 48 encoding groups, leading to a total of 397 700 encoded datasets. Our results demonstrate that none of the encodings are superior for all biomedical domains. Nevertheless, some encodings often outperform others, thus reducing the initial encoding selection substantially. Our work allows researchers to objectively compare novel encodings with the state of the art. Our findings pave the way for a more sophisticated encoding optimization, for example, as part of automated machine learning pipelines. The work presented here is implemented as a large-scale, end-to-end workflow designed for easy reproducibility and extensibility. All standardized datasets and results are available for download to comply with FAIR standards.
Affiliation(s)
- Sebastian Spänig
- Data Science in Biomedicine, Department of Mathematics and Computer Science, University of Marburg, Hans-Meerwein-Str. 6, D-35032 Marburg, Germany
- Siba Mohsen
- Data Science in Biomedicine, Department of Mathematics and Computer Science, University of Marburg, Hans-Meerwein-Str. 6, D-35032 Marburg, Germany
- Georges Hattab
- Data Science in Biomedicine, Department of Mathematics and Computer Science, University of Marburg, Hans-Meerwein-Str. 6, D-35032 Marburg, Germany
- Anne-Christin Hauschild
- Data Science in Biomedicine, Department of Mathematics and Computer Science, University of Marburg, Hans-Meerwein-Str. 6, D-35032 Marburg, Germany
- Dominik Heider
- To whom correspondence should be addressed. Tel: +49 6421 2821579
|
39
|
Boone K, Wisdom C, Camarda K, Spencer P, Tamerler C. Combining genetic algorithm with machine learning strategies for designing potent antimicrobial peptides. BMC Bioinformatics 2021; 22:239. [PMID: 33975547 PMCID: PMC8111958 DOI: 10.1186/s12859-021-04156-x] [Citation(s) in RCA: 22] [Impact Index Per Article: 7.3] [Received: 11/04/2020] [Accepted: 04/27/2021] [Indexed: 12/17/2022] Open
Abstract
BACKGROUND Current methods in machine learning provide approaches for solving challenging, multiple constraint design problems. While deep learning and related neural networking methods have state-of-the-art performance, their vulnerability in decision making processes leading to irrational outcomes is a major concern for their implementation. With the rising antibiotic resistance, antimicrobial peptides (AMPs) have increasingly gained attention as novel therapeutic agents. This challenging design problem requires peptides which meet the multiple constraints of limiting drug-resistance in bacteria, preventing secondary infections from imbalanced microbial flora, and avoiding immune system suppression. AMPs offer a promising, bioinspired design space for targeting antimicrobial activity, but their versatility also requires the curated selection from a combinatorial sequence space. This space is too large for brute-force methods or currently known rational design approaches outside of machine learning. While there has been progress in using the design space to more effectively target AMP activity, a widely applicable approach has been elusive. The lack of transparency in machine learning has limited the advancement of scientific knowledge of how AMPs are related to each other, and the lack of general applicability for fully rational approaches has limited a broader understanding of the design space. METHODS Here we combined an evolutionary method with rough set theory, a transparent machine learning approach, for designing antimicrobial peptides (AMPs). Our method achieves the customization of AMPs using supervised learning boundaries. Our system employs in vitro bacterial assays to measure fitness, codon-representation of peptides to gain flexibility of sequence selection in DNA-space with a genetic algorithm and machine learning to further accelerate the process. RESULTS We use supervised machine learning and a genetic algorithm to find a peptide active against S. epidermidis, a common bacterial strain for implant infections, with an improved aggregation propensity average for an improved ease of synthesis. CONCLUSIONS Our results demonstrate that AMP design can be customized to maintain activity and simplify production. To our knowledge, this is the first time that codon-based genetic algorithms combined with rough set theory methods are used for computational search on peptide sequences.
Affiliation(s)
- Kyle Boone
- Bioengineering Program, University of Kansas, Institute of Bioengineering Research, University of Kansas, 1530 W 15th Street, Learned Hall, Room 5109, Lawrence, KS 66045 USA
- Cate Wisdom
- Bioengineering Program, University of Kansas, Institute of Bioengineering Research, University of Kansas, 1530 W 15th Street, Learned Hall, Room 5109, Lawrence, KS 66045 USA
- Kyle Camarda
- Chemical and Petroleum Engineering Department, University of Kansas, 1530 West 15th Street, Learned Hall, Room 4154, Lawrence, KS 66045 USA
- Paulette Spencer
- Mechanical Engineering Department, University of Kansas, 1530 West 15th Street, Learned Hall, Room 3111, Lawrence, KS 66045 USA
- Institute of Bioengineering Research, University of Kansas, 1530 West 15th Street, Learned Hall, Room 3111, Lawrence, KS 66045 USA
- Candan Tamerler
- Mechanical Engineering Department, University of Kansas, 1530 W 15th St, Learned Hall, Room 3135A, Lawrence, KS 66045 USA
- Institute of Bioengineering Research, University of Kansas, 1530 W 15th St, Learned Hall, Room 3135A, Lawrence, KS 66045 USA
|
40
|
Tarasova O, Poroikov V. Machine Learning in Discovery of New Antivirals and Optimization of Viral Infections Therapy. Curr Med Chem 2021; 28:7840-7861. [PMID: 33949929 DOI: 10.2174/0929867328666210504114351] [Citation(s) in RCA: 3] [Impact Index Per Article: 1.0] [Received: 11/10/2020] [Revised: 02/13/2021] [Accepted: 02/24/2021] [Indexed: 11/22/2022]
Abstract
Nowadays, computational approaches play an important role in the design of new drug-like compounds and optimization of pharmacotherapeutic treatment of diseases. The emerging growth of viral infections, including those caused by the Human Immunodeficiency Virus (HIV), Ebola virus, recently detected coronavirus, and some others, leads to many newly infected people with a high risk of death or severe complications. A huge amount of chemical, biological, clinical data is at the disposal of the researchers. Therefore, there are many opportunities to find the relationships between the particular features of chemical data and the antiviral activity of biologically active compounds based on machine learning approaches. Biological and clinical data can also be used for building models to predict relationships between viral genotype and drug resistance, which might help determine the clinical outcome of treatment. In the current study, we consider machine-learning approaches in the antiviral research carried out during the past decade. We overview in detail the application of machine-learning methods for the design of new potential antiviral agents and vaccines, drug resistance prediction, and analysis of virus-host interactions. Our review also covers the perspectives of using the machine-learning approaches for antiviral research, including Dengue, Ebola viruses, Influenza A, Human Immunodeficiency Virus, coronaviruses, and some others.
Affiliation(s)
- Olga Tarasova
- Department of Bioinformatics, Institute of Biomedical Chemistry, Moscow, Russian Federation
- Vladimir Poroikov
- Department of Bioinformatics, Institute of Biomedical Chemistry, Moscow, Russian Federation
|
41
|
Lei M, Jayaraman A, Van Deventer JA, Lee K. Engineering Selectively Targeting Antimicrobial Peptides. Annu Rev Biomed Eng 2021; 23:339-357. [PMID: 33852346 DOI: 10.1146/annurev-bioeng-010220-095711] [Citation(s) in RCA: 30] [Impact Index Per Article: 10.0] [Indexed: 11/09/2022]
Abstract
The rise of antibiotic-resistant strains of bacterial pathogens has necessitated the development of new therapeutics. Antimicrobial peptides (AMPs) are a class of compounds with potentially attractive therapeutic properties, including the ability to target specific groups of bacteria. In nature, AMPs exhibit remarkable structural and functional diversity, which may be further enhanced through genetic engineering, high-throughput screening, and chemical modification strategies. In this review, we discuss the molecular mechanisms underlying AMP selectivity and highlight recent computational and experimental efforts to design selectively targeting AMPs. While there has been an extensive effort to find broadly active and highly potent AMPs, it remains challenging to design targeting peptides to discriminate between different bacteria on the basis of physicochemical properties. We also review approaches for measuring AMP activity, point out the challenges faced in assaying for selectivity, and discuss the potential for increasing AMP diversity through chemical modifications.
Affiliation(s)
- Ming Lei
- Department of Chemical and Biological Engineering, Tufts University, Medford, Massachusetts 02155, USA
- Arul Jayaraman
- Artie McFerrin Department of Chemical Engineering and Department of Biomedical Engineering, Texas A&M University, College Station, Texas 77843, USA; Department of Microbial Pathogenesis and Immunology, College of Medicine, Texas Health Science Center, Texas A&M University, College Station, Texas 77843, USA
- James A Van Deventer
- Department of Chemical and Biological Engineering, Tufts University, Medford, Massachusetts 02155, USA; Department of Biomedical Engineering, Tufts University, Medford, Massachusetts 02155, USA
- Kyongbum Lee
- Department of Chemical and Biological Engineering, Tufts University, Medford, Massachusetts 02155, USA
|
42
|
Nie T, Meng F, Zhou L, Lu F, Bie X, Lu Z, Lu Y. In Silico Development of Novel Chimeric Lysins with Highly Specific Inhibition against Salmonella by Computer-Aided Design. J Agric Food Chem 2021; 69:3751-3760. [PMID: 33565867 DOI: 10.1021/acs.jafc.0c07450] [Citation(s) in RCA: 7] [Impact Index Per Article: 2.3] [Indexed: 06/12/2023]
Abstract
Four novel chimeric lysins (P361, P362, P371, and P372), which were the fusion of Salmonella phage lysins and novel antimicrobial peptide LeuA-P, were obtained using bioinformatics analysis and in silico design. The recombinant chimeric lysins were expressed in E. coli BL21(DE3) strain and showed highly specific inhibition against Salmonella. The minimal inhibitory concentrations (MICs) of P362 and P372 to S. typhi CMCC 50071 were 8 and 16 μg/mL, respectively. Both 1 × MIC P362 and P372 could increase the outer membrane permeability and cleave the cell wall peptidoglycan, causing the leakage of intracellular nucleic acids and proteins and ultimately killing Salmonella efficiently without drug resistance. The combination of P362, P372, and potassium sorbate reduced more than 3 log CFU/g counts of microorganisms in contaminated chilled chicken and extended the shelf life by 7 days. The strategy of antimicrobial peptide (AMP)-lysin chimera inspired the inability of phage lysin to specifically inhibit Gram-negative bacteria with dense outer membranes in vitro.
Affiliation(s)
- Ting Nie
- College of Food Science and Technology, Nanjing Agricultural University, Nanjing, Jiangsu Province 210095, China
- Fanqiang Meng
- College of Food Science and Technology, Nanjing Agricultural University, Nanjing, Jiangsu Province 210095, China
- Libang Zhou
- College of Food Science and Technology, Nanjing Agricultural University, Nanjing, Jiangsu Province 210095, China
- Fengxia Lu
- College of Food Science and Technology, Nanjing Agricultural University, Nanjing, Jiangsu Province 210095, China
- Xiaomei Bie
- College of Food Science and Technology, Nanjing Agricultural University, Nanjing, Jiangsu Province 210095, China
- Zhaoxin Lu
- College of Food Science and Technology, Nanjing Agricultural University, Nanjing, Jiangsu Province 210095, China
- Yingjian Lu
- College of Food Science and Engineering, Nanjing University of Finance and Economics, Nanjing, Jiangsu Province 210023, China
|
43
|
|
44
|
Machine-learning Applications to Membrane Active Peptides. Systems Medicine 2021. [DOI: 10.1016/b978-0-12-801238-3.11544-2] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 11/23/2022] Open
|
45
|
Santos-Júnior CD, Pan S, Zhao XM, Coelho LP. Macrel: antimicrobial peptide screening in genomes and metagenomes. PeerJ 2020; 8:e10555. [PMID: 33384902 PMCID: PMC7751412 DOI: 10.7717/peerj.10555] [Citation(s) in RCA: 28] [Impact Index Per Article: 7.0] [Received: 07/24/2020] [Accepted: 11/22/2020] [Indexed: 12/21/2022] Open
Abstract
Motivation Antimicrobial peptides (AMPs) have the potential to tackle multidrug-resistant pathogens in both clinical and non-clinical contexts. The recent growth in the availability of genomes and metagenomes provides an opportunity for in silico prediction of novel AMP molecules. However, due to the small size of these peptides, standard gene prospection methods cannot be applied in this domain and alternative approaches are necessary. In particular, standard gene prediction methods have low precision for short peptides, and functional classification by homology results in low recall. Results Here, we present Macrel (for metagenomic AMP classification and retrieval), which is an end-to-end pipeline for the prospection of high-quality AMP candidates from (meta)genomes. For this, we introduce a novel set of 22 peptide features. These were used to build classifiers which perform similarly to the state-of-the-art in the prediction of both antimicrobial and hemolytic activity of peptides, but with enhanced precision (using standard benchmarks as well as a stricter testing regime). We demonstrate that Macrel recovers high-quality AMP candidates using realistic simulations and real data. Availability Macrel is implemented in Python 3. It is available as open source at https://github.com/BigDataBiology/macrel and through bioconda. Classification of peptides or prediction of AMPs in contigs can also be performed on the webserver: https://big-data-biology.org/software/macrel.
Affiliation(s)
- Célio Dias Santos-Júnior
- Institute of Science and Technology for Brain-Inspired Intelligence, Fudan University, Shanghai, China; Ministry of Education, Key Laboratory of Computational Neuroscience and Brain-Inspired Intelligence, Shanghai, China
- Shaojun Pan
- Institute of Science and Technology for Brain-Inspired Intelligence, Fudan University, Shanghai, China; Ministry of Education, Key Laboratory of Computational Neuroscience and Brain-Inspired Intelligence, Shanghai, China
- Xing-Ming Zhao
- Institute of Science and Technology for Brain-Inspired Intelligence, Fudan University, Shanghai, China; Ministry of Education, Key Laboratory of Computational Neuroscience and Brain-Inspired Intelligence, Shanghai, China
- Luis Pedro Coelho
- Institute of Science and Technology for Brain-Inspired Intelligence, Fudan University, Shanghai, China; Ministry of Education, Key Laboratory of Computational Neuroscience and Brain-Inspired Intelligence, Shanghai, China
|
46
|
Enhanced prediction of anti-tubercular peptides from sequence information using divergence measure-based intuitionistic fuzzy-rough feature selection. Soft Comput 2020. [DOI: 10.1007/s00500-020-05363-z] [Citation(s) in RCA: 5] [Impact Index Per Article: 1.3] [Indexed: 10/23/2022]
|
47
|
Fodor A, Abate BA, Deák P, Fodor L, Gyenge E, Klein MG, Koncz Z, Muvevi J, Ötvös L, Székely G, Vozik D, Makrai L. Multidrug Resistance (MDR) and Collateral Sensitivity in Bacteria, with Special Attention to Genetic and Evolutionary Aspects and to the Perspectives of Antimicrobial Peptides-A Review. Pathogens 2020; 9:522. [PMID: 32610480 PMCID: PMC7399985 DOI: 10.3390/pathogens9070522] [Citation(s) in RCA: 32] [Impact Index Per Article: 8.0] [Received: 03/23/2020] [Revised: 06/23/2020] [Accepted: 06/23/2020] [Indexed: 12/18/2022] Open
Abstract
Antibiotic poly-resistance (multidrug-, extreme-, and pan-drug resistance) is controlled by adaptive evolution. Darwinian and Lamarckian interpretations of resistance evolution are discussed. Arguments for, and against, pessimistic forecasts on a fatal “post-antibiotic era” are evaluated. In commensal niches, the appearance of a new antibiotic resistance often reduces fitness, but compensatory mutations may counteract this tendency. The appearance of new antibiotic resistance is frequently accompanied by a collateral sensitivity to other resistances. Organisms with an expanding open pan-genome, such as Acinetobacter baumannii, Pseudomonas aeruginosa, and Klebsiella pneumoniae, can withstand an increased number of resistances by exploiting their evolutionary plasticity and disseminating clonally or poly-clonally. Multidrug-resistant pathogen clones can become predominant under antibiotic stress conditions but, under the influence of negative frequency-dependent selection, are prevented from rising to dominance in a population in a commensal niche. Antimicrobial peptides have a great potential to combat multidrug resistance, since antibiotic-resistant bacteria have shown a high frequency of collateral sensitivity to antimicrobial peptides. In addition, the mobility patterns of antibiotic resistance, and antimicrobial peptide resistance, genes are completely different. The integron trade in commensal niches is fortunately limited by the species-specificity of resistance genes. Hence, we theorize that the suggested post-antibiotic era has not yet come, and indeed might never come.
Affiliation(s)
- András Fodor
- Department of Genetics, University of Szeged, H-6726 Szeged, Hungary
- Correspondence: Tel.: +36-(30)-490-9294 (A.F.); +36-(30)-271-2513 (L.M.)
- Birhan Addisie Abate
- Ethiopian Biotechnology Institute, Agricultural Biotechnology Directorate, Addis Ababa 5954, Ethiopia
- Péter Deák
- Department of Genetics, University of Szeged, H-6726 Szeged, Hungary
- Institute of Biochemistry, Biological Research Centre, H-6726 Szeged, Hungary
- László Fodor
- Department of Microbiology and Infectious Diseases, University of Veterinary Medicine, P.O. Box 22, H-1581 Budapest, Hungary
- Ervin Gyenge
- Hungarian Department of Biology and Ecology, Faculty of Biology and Geology, Babeș-Bolyai University, 5-7 Clinicilor St., 400006 Cluj-Napoca, Romania
- Institute for Research-Development-Innovation in Applied Natural Sciences, Babeș-Bolyai University, 30 Fântânele St., 400294 Cluj-Napoca, Romania
- Michael G. Klein
- Department of Entomology, The Ohio State University, 1680 Madison Ave., Wooster, OH 44691, USA
- Zsuzsanna Koncz
- Max-Planck Institut für Pflanzenzüchtungsforschung, Carl-von-Linné-Weg 10, D-50829 Köln, Germany
- László Ötvös
- OLPE, LLC, Audubon, PA 19403-1965, USA
- Institute of Medical Microbiology, Semmelweis University, H-1085 Budapest, Hungary
- Arrevus, Inc., Raleigh, NC 27612, USA
- Gyöngyi Székely
- Hungarian Department of Biology and Ecology, Faculty of Biology and Geology, Babeș-Bolyai University, 5-7 Clinicilor St., 400006 Cluj-Napoca, Romania
- Institute for Research-Development-Innovation in Applied Natural Sciences, Babeș-Bolyai University, 30 Fântânele St., 400294 Cluj-Napoca, Romania
- Centre for Systems Biology, Biodiversity and Bioresources, Babeș-Bolyai University, 5-7 Clinicilor St., 400006 Cluj-Napoca, Romania
- Dávid Vozik
- Research Institute on Bioengineering, Membrane Technology and Energetics, Faculty of Engineering, University of Veszprem, H-8200 Veszprém, Hungary
- László Makrai
- Department of Microbiology and Infectious Diseases, University of Veterinary Medicine, P.O. Box 22, H-1581 Budapest, Hungary
- Correspondence: Tel.: +36-(30)-490-9294 (A.F.); +36-(30)-271-2513 (L.M.)
|
48
|
León R, Ruiz M, Valero Y, Cárdenas C, Guzman F, Vila M, Cuesta A. Exploring small cationic peptides of different origin as potential antimicrobial agents in aquaculture. Fish Shellfish Immunol 2020; 98:720-727. [PMID: 31730928 DOI: 10.1016/j.fsi.2019.11.019] [Citation(s) in RCA: 31] [Impact Index Per Article: 7.8] [Received: 10/01/2019] [Revised: 11/04/2019] [Accepted: 11/07/2019] [Indexed: 06/10/2023]
Abstract
Antimicrobial peptides (AMPs) form part of the innate immune response, which is of vital importance in fish, especially in eggs and early larval stages. Compared to antibiotics, AMPs show action against a wider spectrum of pathogens, including viruses, fungi and parasites, are more friendly to the environment, and do not seem to generate resistance in bacteria. Thus, we have tested in vitro the potential use of several synthetic peptides as antimicrobial agents in aquaculture: frog Caerin1.1, European sea bass Dicentracin (Dic) and NK-lysin peptides (NKLPs) and sole NKLP27. Our results demonstrate that the highest bactericidal activity against both human and fish pathogens was obtained with Caerin1.1 followed by sea bass Dic and NKLPs, whereas sea bass NKLP20.2 had little to no activity. Interestingly, Aeromonas salmonicida was refractory to all the fish peptides tested. Regarding the antiviral activity, synthetic peptides were able to inhibit the viral infection of nodavirus (NNV), viral haemorrhagic septicaemia virus (VHSV), infectious pancreatic necrosis virus (IPNV) and spring viremia of carp virus (SVCV), which are some of the most devastating viruses for aquaculture. However, their effectiveness was highly dependent on the type of virus. Strikingly, IPNV was the most resistant virus, since Caerin1.1 and sea bass NKLP20.2 were unable to reduce its titre and the other peptides tested only reduced it to values in the 43-78% range. These data demonstrate that synthetic peptides have great antibacterial and antiviral in vitro activity against important fish pathogens and point to their use as potential therapeutic agents in aquaculture.
Affiliation(s)
- Rosa León
- Laboratorio de Bioquímica, Facultad de Ciencias Experimentales, Campus de Excelencia Internacional del Mar (CEIMAR), Universidad de Huelva, 2110, Huelva, Spain
- María Ruiz
- Fish Innate Immune System Group, Department of Cell Biology and Histology, Faculty of Biology, Campus Regional de Excelencia Internacional "Campus Mare Nostrum", University of Murcia, 30100, Murcia, Spain
- Yulema Valero
- Fish Innate Immune System Group, Department of Cell Biology and Histology, Faculty of Biology, Campus Regional de Excelencia Internacional "Campus Mare Nostrum", University of Murcia, 30100, Murcia, Spain; Grupo de Marcadores Inmunológicos, Laboratorio de Genética e Inmunología Molecular, Instituto de Biología, Pontificia Universidad Católica de Valparaíso, Valparaíso, Chile
- Constanza Cárdenas
- Núcleo Biotecnológico de Curauma (NBC), Pontificia Universidad Católica de Valparaíso, Valparaíso, Chile
- Fanny Guzman
- Núcleo Biotecnológico de Curauma (NBC), Pontificia Universidad Católica de Valparaíso, Valparaíso, Chile
- Marta Vila
- Laboratorio de Bioquímica, Facultad de Ciencias Experimentales, Campus de Excelencia Internacional del Mar (CEIMAR), Universidad de Huelva, 2110, Huelva, Spain
- Alberto Cuesta
- Fish Innate Immune System Group, Department of Cell Biology and Histology, Faculty of Biology, Campus Regional de Excelencia Internacional "Campus Mare Nostrum", University of Murcia, 30100, Murcia, Spain
|
49
|
Yang L, Sun Y, Xu Y, Hang B, Wang L, Zhen K, Hu B, Chen Y, Xia X, Hu J. Antibacterial Peptide BSN-37 Kills Extra- and Intra-Cellular Salmonella enterica Serovar Typhimurium by a Nonlytic Mode of Action. Front Microbiol 2020; 11:174. [PMID: 32117178 PMCID: PMC7019029 DOI: 10.3389/fmicb.2020.00174] [Citation(s) in RCA: 10] [Impact Index Per Article: 2.5] [Received: 10/26/2019] [Accepted: 01/24/2020] [Indexed: 01/08/2023] Open
Abstract
The increasing rates of resistance to traditional anti-Salmonella agents have made the treatment of invasive salmonellosis more problematic, which necessitates the search for new antimicrobial compounds. In this study, the action mode of BSN-37, a novel antibacterial peptide (AMP) from bovine spleen neutrophils, was investigated against Salmonella enterica serovar Typhimurium (S. Typhimurium). Minimum inhibitory concentrations (MICs) and time-kill kinetics of BSN-37 were determined. The cell membrane changes of S. Typhimurium CVCC541 (ST) treated with BSN-37 were investigated by testing the fluorescence intensity of membrane probes and the release of cytoplasmic β-galactosidase activity. Likewise, cell morphological and ultrastructural changes were also observed using scanning and transmission electron microscopes. Furthermore, the cytotoxicity of BSN-37 was detected by a CCK-8 kit and real-time cell assay. The proliferation inhibition of BSN-37 against intracellular S. Typhimurium was performed in Madin-Darby canine kidney (MDCK) cells. The results demonstrated that BSN-37 exhibited strong antibacterial activity against ST (MICs, 16.67 μg/ml), which was not remarkably affected by the serum salts at a physiological concentration. However, the presence of CaCl2 led to an increase in MIC of BSN-37 by about 4-fold compared to that of ST. BSN-37 at the concentration of 100 μg/ml could completely kill ST after co-incubation for 6 h. Likewise, BSN-37 at different concentrations (50, 100, and 200 μg/ml) could increase the outer membrane permeability of ST but not impair its inner membrane integrity. Moreover, no broken and ruptured cells were found in the figures of scanning and transmission electron microscopes. These results demonstrate that BSN-37 exerts its antibacterial activity against S. Typhimurium by a non-lytic mode of action. Importantly, BSN-37 had no toxicity to the tested eukaryotic cells, even at a concentration of 800 μg/ml. BSN-37 could significantly inhibit the proliferation of intracellular S. Typhimurium.
Affiliation(s)
- Lei Yang
- College of Animal Science and Veterinary Medicine, Henan Institute of Science and Technology, Xinxiang, China
- Yawei Sun
- College of Animal Science and Veterinary Medicine, Henan Institute of Science and Technology, Xinxiang, China
- Yanzhao Xu
- College of Animal Science and Veterinary Medicine, Henan Institute of Science and Technology, Xinxiang, China
- Bolin Hang
- College of Animal Science and Veterinary Medicine, Henan Institute of Science and Technology, Xinxiang, China
- Lei Wang
- College of Animal Science and Veterinary Medicine, Henan Institute of Science and Technology, Xinxiang, China
- Ke Zhen
- College of Animal Science and Veterinary Medicine, Henan Institute of Science and Technology, Xinxiang, China
- Bing Hu
- College of Animal Science and Veterinary Medicine, Henan Institute of Science and Technology, Xinxiang, China
- Yanan Chen
- College of Animal Science and Veterinary Medicine, Henan Institute of Science and Technology, Xinxiang, China
- Xiaojing Xia
- College of Animal Science and Veterinary Medicine, Henan Institute of Science and Technology, Xinxiang, China
- Jianhe Hu
- College of Animal Science and Veterinary Medicine, Henan Institute of Science and Technology, Xinxiang, China
50
Lissabet JFB, Belén LH, Farias JG. PPLK+C: A Bioinformatics Tool for Predicting Peptide Ligands of Potassium Channels Based on Primary Structure Information. Interdiscip Sci 2020; 12:258-263. [PMID: 31912313 DOI: 10.1007/s12539-019-00356-5] [Citation(s) in RCA: 7] [Impact Index Per Article: 1.8] [Received: 07/16/2019] [Revised: 12/22/2019] [Accepted: 12/24/2019] [Indexed: 10/25/2022]
Abstract
Potassium channels play a key role in regulating the flow of ions through the plasma membrane, orchestrating many cellular processes including cell volume regulation, hormone secretion and electrical impulse formation. Peptide ligands of potassium channels are used in basic and applied research and are now considered promising alternatives in the treatment of many diseases, such as cardiovascular diseases and cancer. Currently, there are various bioinformatics tools focused on the prediction of peptides with different activities. However, none of the current tools can predict peptide ligands of potassium channels. In this work, we developed PPLK+C, the first tool that can predict peptide ligands of potassium channels. We also evaluated several amino acid molecular features and four machine-learning algorithms for the prediction of potassium channel peptide ligands: random forest, nearest neighbors, support vector machine and artificial neural network. All the biological data used in this study for training and validating models were obtained from peptides with experimentally verified activity. PPLK+C is a bioinformatics tool written in Python; its random forest model showed high predictive capacity: 0.77 sensitivity, 0.94 specificity, 0.91 accuracy and 0.70 Matthews correlation coefficient. PPLK+C is a novel tool with a friendly interface that can be used for the discovery of novel peptide ligands of potassium channels with high reliability, using only primary structure information.
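For readers unfamiliar with the four figures reported above, the following is a minimal sketch of how sensitivity, specificity, accuracy and the Matthews correlation coefficient are conventionally computed from a binary confusion matrix. It is not the PPLK+C code: the function name `binary_metrics` and the example counts are hypothetical and chosen only for illustration, and they do not reproduce the paper's data.

```python
import math


def binary_metrics(tp: int, fp: int, tn: int, fn: int) -> dict:
    """Standard binary-classification metrics from confusion-matrix counts.

    tp/fn: positives (e.g. true ligands) predicted correctly/incorrectly.
    tn/fp: negatives (non-ligands) predicted correctly/incorrectly.
    """

    def ratio(num: float, den: float) -> float:
        # Return 0.0 instead of raising when a denominator is empty.
        return num / den if den else 0.0

    sensitivity = ratio(tp, tp + fn)  # true positive rate
    specificity = ratio(tn, tn + fp)  # true negative rate
    accuracy = ratio(tp + tn, tp + fp + tn + fn)
    # MCC = (TP*TN - FP*FN) / sqrt((TP+FP)(TP+FN)(TN+FP)(TN+FN))
    mcc_den = math.sqrt((tp + fp) * (tp + fn) * (tn + fp) * (tn + fn))
    mcc = ratio(tp * tn - fp * fn, mcc_den)
    return {
        "sensitivity": sensitivity,
        "specificity": specificity,
        "accuracy": accuracy,
        "mcc": mcc,
    }


# Hypothetical counts for illustration only (not taken from the paper).
example = binary_metrics(tp=40, fp=10, tn=90, fn=10)
for name, value in example.items():
    print(f"{name}: {value:.2f}")
```

Unlike accuracy, the Matthews correlation coefficient uses all four cells of the confusion matrix, which is why it is often reported alongside accuracy when the positive and negative classes differ in size.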
Affiliation(s)
- Jorge Félix Beltrán Lissabet
- Department of Chemical Engineering, Faculty of Engineering and Sciences, Universidad de La Frontera, Av. Francisco Salazar 01145, P.O. Box 54-D, 4811230, Temuco, Chile
- Lisandra Herrera Belén
- Department of Chemical Engineering, Faculty of Engineering and Sciences, Universidad de La Frontera, Av. Francisco Salazar 01145, P.O. Box 54-D, 4811230, Temuco, Chile
- Jorge G Farias
- Department of Chemical Engineering, Faculty of Engineering and Sciences, Universidad de La Frontera, Av. Francisco Salazar 01145, P.O. Box 54-D, 4811230, Temuco, Chile
|