1
|
Schaduangrat N, Khemawoot P, Jiso A, Charoenkwan P, Shoombuatong W. MetaCGRP is a high-precision meta-model for large-scale identification of CGRP inhibitors using multi-view information. Sci Rep 2024; 14:24764. [PMID: 39433940 PMCID: PMC11494111 DOI: 10.1038/s41598-024-75487-x] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Reference Citation Analysis] [Abstract] [Key Words] [MESH Headings] [Grants] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Received: 07/24/2024] [Accepted: 10/07/2024] [Indexed: 10/23/2024] Open
Abstract
Migraine is considered one of the debilitating primary headache conditions with an estimated worldwide occurrence of approximately 14-15%, contributing highly to factors responsible for global disability. Calcitonin gene-related peptide (CGRP) is a neuropeptide that plays a crucial role in the pathophysiology of migraines and thus, its inhibition can help relieve migraine symptoms. However, conventional process of CGRP drug development has been laborious and time-consuming with incurred costs exceeding one billion dollars. On the other hand, machine learning (ML)-based approaches that are capable of accurately identifying CGRP inhibitors could greatly facilitate in expediting the discovery of novel CGRP drugs. Therefore, this study proposes a novel and high-accuracy meta-model, namely MetaCGRP, that can precisely identify CGRP inhibitors. To the best of our knowledge, MetaCGRP is the first SMILES-based approach that has been developed to identify CGRP inhibitors without the use of 3D structural information. In brief, we initially employed different molecular representation methods coupled with popular ML algorithms to construct a pool of baseline models. Then, all baseline models were optimized and used to generate multi-view features. Finally, we employed the feature selection method to optimize the multi-view features and determine the best feature subset to enable the construction of the meta-model. Both cross-validation and independent tests indicated that MetaCGRP clearly outperforms several conventional ML classifiers, with accuracies of 0.898 and 0.799 on the training and independent test datasets, respectively. In addition, MetaCGRP in conjunction with molecular docking was utilized to identify five potential natural product candidates from Thai herbal pharmacopoeia and analyze their binding affinity and interactions to CGRP. To facilitate community-wide efforts in expediting the discovery of novel CGRP inhibitors, a user-friendly web server for MetaCGRP is freely available at https://pmlabqsar.pythonanywhere.com/MetaCGRP .
Collapse
Affiliation(s)
- Nalini Schaduangrat
- Center for Research Innovation and Biomedical Informatics, Faculty of Medical Technology, Mahidol University, Bangkok, 10700, Thailand
| | - Phisit Khemawoot
- Chakri Naruebodindra Medical Institute, Faculty of Medicine Ramathibodi Hospital, Mahidol University, Samut Prakan, 10540, Thailand
| | - Apisada Jiso
- Chakri Naruebodindra Medical Institute, Faculty of Medicine Ramathibodi Hospital, Mahidol University, Samut Prakan, 10540, Thailand
| | - Phasit Charoenkwan
- Modern Management and Information Technology, College of Arts, Media and Technology, Chiang Mai University, Chiang Mai, 50200, Thailand.
| | - Watshara Shoombuatong
- Center for Research Innovation and Biomedical Informatics, Faculty of Medical Technology, Mahidol University, Bangkok, 10700, Thailand.
| |
Collapse
|
2
|
Ge R, Xia Y, Jiang M, Jia G, Jing X, Li Y, Cai Y. HybAVPnet: A Novel Hybrid Network Architecture for Antiviral Peptides Prediction. IEEE/ACM TRANSACTIONS ON COMPUTATIONAL BIOLOGY AND BIOINFORMATICS 2024; 21:1358-1365. [PMID: 38587961 DOI: 10.1109/tcbb.2024.3385635] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Reference Citation Analysis] [Abstract] [MESH Headings] [Track Full Text] [Subscribe] [Scholar Register] [Indexed: 04/10/2024]
Abstract
Viruses pose a great threat to human production and life, thus the research and development of antiviral drugs is urgently needed. Antiviral peptides play an important role in drug design and development. Compared with the time-consuming and laborious wet chemical experiment methods, it is critical to use computational methods to predict antiviral peptides accurately and rapidly. However, due to limited data, accurate prediction of antiviral peptides is still challenging and extracting effective feature representations from sequences is crucial for creating accurate models. This study introduces a novel two-step approach, named HybAVPnet, to predict antiviral peptides with a hybrid network architecture based on neural networks and traditional machine learning methods. We adopted a stacking-like structure to capture both the long-term dependencies and local evolution information to achieve a comprehensive and diverse prediction using the predicted labels and probabilities. Using an ensemble technique with the different kinds of features can reduce the variance without increasing the bias. The experimental result shows HybAVPnet can achieve better and more robust performance compared with the state-of-the-art methods, which makes it useful for the research and development of antiviral drugs. Meanwhile, it can also be extended to other peptide recognition problems because of its generalization ability.
Collapse
|
3
|
Lefin N, Herrera-Belén L, Farias JG, Beltrán JF. Review and perspective on bioinformatics tools using machine learning and deep learning for predicting antiviral peptides. Mol Divers 2024; 28:2365-2374. [PMID: 37626205 DOI: 10.1007/s11030-023-10718-3] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Reference Citation Analysis] [Abstract] [Key Words] [MESH Headings] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Received: 05/02/2023] [Accepted: 08/15/2023] [Indexed: 08/27/2023]
Abstract
Viruses constitute a constant threat to global health and have caused millions of human and animal deaths throughout human history. Despite advances in the discovery of antiviral compounds that help fight these pathogens, finding a solution to this problem continues to be a task that consumes time and financial resources. Currently, artificial intelligence (AI) has revolutionized many areas of the biological sciences, making it possible to decipher patterns in amino acid sequences that encode different functions and activities. Within the field of AI, machine learning, and deep learning algorithms have been used to discover antimicrobial peptides. Due to their effectiveness and specificity, antimicrobial peptides (AMPs) hold excellent promise for treating various infections caused by pathogens. Antiviral peptides (AVPs) are a specific type of AMPs that have activity against certain viruses. Unlike the research focused on the development of tools and methods for the prediction of antimicrobial peptides, those related to the prediction of AVPs are still scarce. Given the significance of AVPs as potential pharmaceutical options for human and animal health and the ongoing AI revolution, we have reviewed and summarized the current machine learning and deep learning-based tools and methods available for predicting these types of peptides.
Collapse
Affiliation(s)
- Nicolás Lefin
- Department of Chemical Engineering, Faculty of Engineering and Science, University of La Frontera, Ave. Francisco Salazar, 01145, Temuco, Chile
| | - Lisandra Herrera-Belén
- Departamento de Ciencias Básicas, Facultad de Ciencias, Universidad Santo Tomás, Temuco, Chile
| | - Jorge G Farias
- Department of Chemical Engineering, Faculty of Engineering and Science, University of La Frontera, Ave. Francisco Salazar, 01145, Temuco, Chile
| | - Jorge F Beltrán
- Department of Chemical Engineering, Faculty of Engineering and Science, University of La Frontera, Ave. Francisco Salazar, 01145, Temuco, Chile.
| |
Collapse
|
4
|
Wanas AS, Radwan MM, Marzouk AA, Elkaeed EB, Alsfouk BA, Mostafa AE, Eissa IH, Metwaly AM, ElSohly MA. Isolation and in silico investigation of cannflavins from Cannabis sativa leaves as potential anti-SARS-CoV-2 agents targeting the Papain-Like Protease. Nat Prod Res 2023:1-14. [PMID: 38100380 DOI: 10.1080/14786419.2023.2294111] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Reference Citation Analysis] [Abstract] [Key Words] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Received: 06/15/2023] [Accepted: 12/03/2023] [Indexed: 12/17/2023]
Abstract
This study aimed to isolate and identify three prenylflavonoids (cannflavin A, B, and C) from Cannabis sativa leaves using different chromatographic techniques. The potential of the isolated compounds against SARS-CoV-2 was suggested through several in silico analysis. Structural similarity studies against nine co-crystallized ligands of SARS-CoV-2's proteins indicated the similarities of the isolated cannflavins with the SARS-CoV-2 Papain-Like Protease (PLP) ligand, Y95. Then, flexible allignment study confirmed this similarity. Docking experiments showed successful binding of all cannflavins within the active pocket of PLP, with energies comparable to Y95. Among them, cannflavin A demonstrated the most similar binding mode, while cannflavin C exhibited the best energy. Molecular dynamics (MD) simulations and MM-GPSA confirmed the accurate binding of cannflavin A to the PLP. In silico ADMET studies indicated favourable drug-like properties for all three compounds, suggesting their potential as anti-SARS-CoV-2 agents. Further In vitro and In vivo investigations are necessary to validate these findings and establish their efficacy and safety profiles.
Collapse
Affiliation(s)
- Amira S Wanas
- National Center for Natural Products Research, School of Pharmacy, University of Mississippi, University, Mississippi, USA
- Department of Pharmacognosy, Faculty of Pharmacy, Minia University, Minia, Egypt
| | - Mohamed M Radwan
- National Center for Natural Products Research, School of Pharmacy, University of Mississippi, University, Mississippi, USA
| | - Adel A Marzouk
- National Center for Natural Products Research, School of Pharmacy, University of Mississippi, University, Mississippi, USA
- Department of Pharmaceutical Chemistry, Faculty of Pharmacy, Al-Azhar University, Assiut, Egypt
| | - Eslam B Elkaeed
- Department of Pharmaceutical Sciences, College of Pharmacy, AlMaarefa University, Riyadh, Saudi Arabia
| | - Bshra A Alsfouk
- Department of Pharmaceutical Sciences, College of Pharmacy, Princess Nourah bint Abdulrahman University, Riyadh, Saudi Arabia
| | - Ahmad E Mostafa
- Pharmacognosy and Medicinal Plants Department, Faculty of Pharmacy (Boys), Al-Azhar University, Cairo, Egypt
| | - Ibrahim H Eissa
- Pharmaceutical Medicinal Chemistry & Drug Design Department, Faculty of Pharmacy (Boys), Al-Azhar University, Cairo, Egypt
| | - Ahmed M Metwaly
- Pharmacognosy and Medicinal Plants Department, Faculty of Pharmacy (Boys), Al-Azhar University, Cairo, Egypt
- Biopharmaceutical Products Research Department, Genetic Engineering and Biotechnology Research Institute, City of Scientific Research and Technological Applications (SRTA-City), Alexandria, Egypt
| | - Mahmoud A ElSohly
- National Center for Natural Products Research, School of Pharmacy, University of Mississippi, University, Mississippi, USA
- Department of Pharmaceutics and Drug Delivery, School of Pharmacy, University, Mississippi, USA
| |
Collapse
|
5
|
Ma X, Liang Y, Zhang S. iAVPs-ResBi: Identifying antiviral peptides by using deep residual network and bidirectional gated recurrent unit. MATHEMATICAL BIOSCIENCES AND ENGINEERING : MBE 2023; 20:21563-21587. [PMID: 38124610 DOI: 10.3934/mbe.2023954] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Reference Citation Analysis] [Abstract] [Key Words] [MESH Headings] [Track Full Text] [Subscribe] [Scholar Register] [Indexed: 12/23/2023]
Abstract
Human history is also the history of the fight against viral diseases. From the eradication of viruses to coexistence, advances in biomedicine have led to a more objective understanding of viruses and a corresponding increase in the tools and methods to combat them. More recently, antiviral peptides (AVPs) have been discovered, which due to their superior advantages, have achieved great impact as antiviral drugs. Therefore, it is very necessary to develop a prediction model to accurately identify AVPs. In this paper, we develop the iAVPs-ResBi model using k-spaced amino acid pairs (KSAAP), encoding based on grouped weight (EBGW), enhanced grouped amino acid composition (EGAAC) based on the N5C5 sequence, composition, transition and distribution (CTD) based on physicochemical properties for multi-feature extraction. Then we adopt bidirectional long short-term memory (BiLSTM) to fuse features for obtaining the most differentiated information from multiple original feature sets. Finally, the deep model is built by combining improved residual network and bidirectional gated recurrent unit (BiGRU) to perform classification. The results obtained are better than those of the existing methods, and the accuracies are 95.07, 98.07, 94.29 and 97.50% on the four datasets, which show that iAVPs-ResBi can be used as an effective tool for the identification of antiviral peptides. The datasets and codes are freely available at https://github.com/yunyunliang88/iAVPs-ResBi.
Collapse
Affiliation(s)
- Xinyan Ma
- School of Science, Xi'an Polytechnic University, Xi'an 710048, China
| | - Yunyun Liang
- School of Science, Xi'an Polytechnic University, Xi'an 710048, China
| | - Shengli Zhang
- School of Mathematics and Statistics, Xidian University, Xi'an 710071, China
| |
Collapse
|
6
|
Hemmati S, Rasekhi Kazerooni H. Polypharmacological Cell-Penetrating Peptides from Venomous Marine Animals Based on Immunomodulating, Antimicrobial, and Anticancer Properties. Mar Drugs 2022; 20:md20120763. [PMID: 36547910 PMCID: PMC9787916 DOI: 10.3390/md20120763] [Citation(s) in RCA: 8] [Impact Index Per Article: 2.7] [Reference Citation Analysis] [Abstract] [Key Words] [MESH Headings] [Track Full Text] [Figures] [Journal Information] [Subscribe] [Scholar Register] [Received: 10/21/2022] [Revised: 11/25/2022] [Accepted: 11/30/2022] [Indexed: 12/09/2022] Open
Abstract
Complex pathological diseases, such as cancer, infection, and Alzheimer's, need to be targeted by multipronged curative. Various omics technologies, with a high rate of data generation, demand artificial intelligence to translate these data into druggable targets. In this study, 82 marine venomous animal species were retrieved, and 3505 cryptic cell-penetrating peptides (CPPs) were identified in their toxins. A total of 279 safe peptides were further analyzed for antimicrobial, anticancer, and immunomodulatory characteristics. Protease-resistant CPPs with endosomal-escape ability in Hydrophis hardwickii, nuclear-localizing peptides in Scorpaena plumieri, and mitochondrial-targeting peptides from Synanceia horrida were suitable for compartmental drug delivery. A broad-spectrum S. horrida-derived antimicrobial peptide with a high binding-affinity to bacterial membranes was an antigen-presenting cell (APC) stimulator that primes cytokine release and naïve T-cell maturation simultaneously. While antibiofilm and wound-healing peptides were detected in Synanceia verrucosa, APC epitopes as universal adjuvants for antiviral vaccination were in Pterois volitans and Conus monile. Conus pennaceus-derived anticancer peptides showed antiangiogenic and IL-2-inducing properties with moderate BBB-permeation and were defined to be a tumor-homing peptide (THP) with the ability to inhibit programmed death ligand-1 (PDL-1). Isoforms of RGD-containing peptides with innate antiangiogenic characteristics were in Conus tessulatus for tumor targeting. Inhibitors of neuropilin-1 in C. pennaceus are proposed for imaging probes or therapeutic delivery. A Conus betulinus cryptic peptide, with BBB-permeation, mitochondrial-targeting, and antioxidant capacity, was a stimulator of anti-inflammatory cytokines and non-inducer of proinflammation proposed for Alzheimer's. Conclusively, we have considered the dynamic interaction of cells, their microenvironment, and proportional-orchestrating-host- immune pathways by multi-target-directed CPPs resembling single-molecule polypharmacology. This strategy might fill the therapeutic gap in complex resistant disorders and increase the candidates' clinical-translation chance.
Collapse
Affiliation(s)
- Shiva Hemmati
- Department of Pharmaceutical Biotechnology, School of Pharmacy, Shiraz University of Medical Sciences, Shiraz 71345-1583, Iran
- Department of Pharmaceutical Biology, Faculty of Pharmaceutical Sciences, UCSI University, Cheras, Kuala Lumpur 56000, Malaysia
- Biotechnology Research Center, Shiraz University of Medical Sciences, Shiraz 71345-1583, Iran
- Correspondence: ; Tel.: +98-7132-424-128
| | | |
Collapse
|
7
|
Hasegawa K, Moriwaki Y, Terada T, Wei C, Shimizu K. Feedback-AVPGAN: Feedback-guided generative adversarial network for generating antiviral peptides. J Bioinform Comput Biol 2022; 20:2250026. [PMID: 36514872 DOI: 10.1142/s0219720022500263] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Reference Citation Analysis] [Abstract] [Key Words] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Indexed: 11/24/2022]
Abstract
In this study, we propose Feedback-AVPGAN, a system that aims to computationally generate novel antiviral peptides (AVPs). This system relies on the key premise of the Generative Adversarial Network (GAN) model and the Feedback method. GAN, a generative modeling approach that uses deep learning methods, comprises a generator and a discriminator. The generator is used to generate peptides; the generated proteins are fed to the discriminator to distinguish between the AVPs and non-AVPs. The original GAN design uses actual data to train the discriminator. However, not many AVPs have been experimentally obtained. To solve this problem, we used the Feedback method to allow the discriminator to learn from the existing as well as generated synthetic data. We implemented this method using a classifier module that classifies each peptide sequence generated by the GAN generator as AVP or non-AVP. The classifier uses the transformer network and achieves high classification accuracy. This mechanism enables the efficient generation of peptides with a high probability of exhibiting antiviral activity. Using the Feedback method, we evaluated various algorithms and their performance. Moreover, we modeled the structure of the generated peptides using AlphaFold2 and determined the peptides having similar physicochemical properties and structures to those of known AVPs, although with different sequences.
Collapse
Affiliation(s)
- Kano Hasegawa
- Department of Biotechnology, Graduate School of Agricultural and Life Sciences, Faculty of Agriculture The University of Tokyo, 1-1-1, Yayoi, Bunkyo-ku, Tokyo 113-8657, Japan
| | - Yoshitaka Moriwaki
- Department of Biotechnology, Graduate School of Agricultural and Life Sciences, Faculty of Agriculture The University of Tokyo, 1-1-1, Yayoi, Bunkyo-ku, Tokyo 113-8657, Japan.,Collaborative Research Institute for Innovative Microbiology, The Institute of Medical Science The University of Tokyo, 1-1-1, Yayoi, Bunkyo-ku, Tokyo 113-8657, Japan
| | - Tohru Terada
- Department of Biotechnology, Graduate School of Agricultural and Life Sciences, Faculty of Agriculture The University of Tokyo, 1-1-1, Yayoi, Bunkyo-ku, Tokyo 113-8657, Japan.,Collaborative Research Institute for Innovative Microbiology, The Institute of Medical Science The University of Tokyo, 1-1-1, Yayoi, Bunkyo-ku, Tokyo 113-8657, Japan
| | - Cao Wei
- Research Center for Agricultural Information Technology, National Agriculture and Food Research Organization, Tsukuba, Ibaraki 305-8517, Japan
| | - Kentaro Shimizu
- Department of Biotechnology, Graduate School of Agricultural and Life Sciences, Faculty of Agriculture The University of Tokyo, 1-1-1, Yayoi, Bunkyo-ku, Tokyo 113-8657, Japan.,Collaborative Research Institute for Innovative Microbiology, The Institute of Medical Science The University of Tokyo, 1-1-1, Yayoi, Bunkyo-ku, Tokyo 113-8657, Japan
| |
Collapse
|
8
|
Cheng Q, Zeng P, Chi Chan EW, Chen S. Development of Peptide-based Metallo-β-lactamase Inhibitors as a New Strategy to Combat Antimicrobial Resistance: A Mini-review. Curr Pharm Des 2022; 28:3538-3545. [PMID: 36177630 DOI: 10.2174/1381612828666220929154255] [Citation(s) in RCA: 1] [Impact Index Per Article: 0.3] [Reference Citation Analysis] [Abstract] [Key Words] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Received: 05/24/2022] [Revised: 08/10/2022] [Accepted: 08/22/2022] [Indexed: 01/28/2023]
Abstract
Global dissemination of antimicrobial resistance (AMR) not only poses a significant threat to human health, food security, and social development but also results in millions of deaths each year. In Gram-negative bacteria, the primary mechanism of resistance to β-lactam antibiotics is the production of β-lactamases, one of which is carbapenem-hydrolyzing β-lactamases known as carbapenemases. As a general scheme, these enzymes are divided into Ambler class A, B, C, and D based on their protein sequence homology. Class B β-lactamases are also known as metallo-β-lactamases (MBLs). The incidence of recovery of bacteria expressing metallo-β- lactamases (MBLs) has increased dramatically in recent years, almost reaching a pandemic proportion. MBLs can be further divided into three subclasses (B1, B2, and B3) based on the homology of protein sequences as well as the differences in zinc coordination. The development of inhibitors is one effective strategy to suppress the activities of MBLs and restore the activity of β-lactam antibiotics. Although thousands of MBL inhibitors have been reported, none have been approved for clinical use. This review describes the clinical application potential of peptide-based drugs that exhibit inhibitory activity against MBLs identified in past decades. In this report, peptide-based inhibitors of MBLs are divided into several groups based on the mode of action, highlighting compounds of promising properties that are suitable for further advancement. We discuss how traditional computational tools, such as in silico screening and molecular docking, along with new methods, such as deep learning and machine learning, enable a more accurate and efficient design of peptide-based inhibitors of MBLs.
Collapse
Affiliation(s)
- Qipeng Cheng
- Anhui Provincial Key Laboratory of Molecular Enzymology and Mechanism of Major Diseases and Key Laboratory of Biomedicine in Gene Diseases and Health of Anhui Higher Education Institutes, College of Life Sciences, Anhui Normal University, Wuhu, Anhui, China
| | - Ping Zeng
- School of Pharmacy, The Chinese University of Hong Kong, Shatin, Hong Kong
| | - Edward Wai Chi Chan
- Department of Infectious Diseases and Public Health, Jockey Club College of Veterinary Medicine and Life Sciences, City University of Hong Kong, Kowloon, Hong Kong
| | - Sheng Chen
- Department of Infectious Diseases and Public Health, Jockey Club College of Veterinary Medicine and Life Sciences, City University of Hong Kong, Kowloon, Hong Kong
| |
Collapse
|
9
|
Schaduangrat N, Anuwongcharoen N, Moni MA, Lio' P, Charoenkwan P, Shoombuatong W. StackPR is a new computational approach for large-scale identification of progesterone receptor antagonists using the stacking strategy. Sci Rep 2022; 12:16435. [PMID: 36180453 PMCID: PMC9525257 DOI: 10.1038/s41598-022-20143-5] [Citation(s) in RCA: 9] [Impact Index Per Article: 3.0] [Reference Citation Analysis] [Abstract] [Key Words] [MESH Headings] [Grants] [Track Full Text] [Download PDF] [Figures] [Journal Information] [Subscribe] [Scholar Register] [Received: 07/18/2022] [Accepted: 09/09/2022] [Indexed: 11/24/2022] Open
Abstract
Progesterone receptors (PRs) are implicated in various cancers since their presence/absence can determine clinical outcomes. The overstimulation of progesterone can facilitate oncogenesis and thus, its modulation through PR inhibition is urgently needed. To address this issue, a novel stacked ensemble learning approach (termed StackPR) is presented for fast, accurate, and large-scale identification of PR antagonists using only SMILES notation without the need for 3D structural information. We employed six popular machine learning (ML) algorithms (i.e., logistic regression, partial least squares, k-nearest neighbor, support vector machine, extremely randomized trees, and random forest) coupled with twelve conventional molecular descriptors to create 72 baseline models. Then, a genetic algorithm in conjunction with the self-assessment-report approach was utilized to determine m out of the 72 baseline models as means of developing the final meta-predictor using the stacking strategy and tenfold cross-validation test. Experimental results on the independent test dataset show that StackPR achieved impressive predictive performance with an accuracy of 0.966 and Matthew's coefficient correlation of 0.925. In addition, analysis based on the SHapley Additive exPlanation algorithm and molecular docking indicates that aliphatic hydrocarbons and nitrogen-containing substructures were the most important features for having PR antagonist activity. Finally, we implemented an online webserver using StackPR, which is freely accessible at http://pmlabstack.pythonanywhere.com/StackPR . StackPR is anticipated to be a powerful computational tool for the large-scale identification of unknown PR antagonist candidates for follow-up experimental validation.
Collapse
Affiliation(s)
- Nalini Schaduangrat
- Center of Data Mining and Biomedical Informatics, Faculty of Medical Technology, Mahidol University, Bangkok, 10700, Thailand
| | - Nuttapat Anuwongcharoen
- Center of Data Mining and Biomedical Informatics, Faculty of Medical Technology, Mahidol University, Bangkok, 10700, Thailand
| | - Mohammad Ali Moni
- Artificial Intelligence & Digital Health Data Science, School of Health and Rehabilitation Sciences, Faculty of Health and Behavioural Sciences, The University of Queensland, St Lucia, QLD, 4072, Australia
| | - Pietro Lio'
- Department of Computer Science and Technology, University of Cambridge, Cambridge, CB3 0FD, UK
| | - Phasit Charoenkwan
- Modern Management and Information Technology, College of Arts, Media and Technology, Chiang Mai University, Chiang Mai, 50200, Thailand.
| | - Watshara Shoombuatong
- Center of Data Mining and Biomedical Informatics, Faculty of Medical Technology, Mahidol University, Bangkok, 10700, Thailand.
| |
Collapse
|
10
|
Kurata H, Tsukiyama S, Manavalan B. iACVP: markedly enhanced identification of anti-coronavirus peptides using a dataset-specific word2vec model. Brief Bioinform 2022; 23:6623727. [PMID: 35772910 DOI: 10.1093/bib/bbac265] [Citation(s) in RCA: 16] [Impact Index Per Article: 5.3] [Reference Citation Analysis] [Abstract] [Key Words] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Received: 04/06/2022] [Revised: 05/23/2022] [Accepted: 06/06/2022] [Indexed: 01/22/2023] Open
Abstract
The COVID-19 pandemic caused several million deaths worldwide. Development of anti-coronavirus drugs is thus urgent. Unlike conventional non-peptide drugs, antiviral peptide drugs are highly specific, easy to synthesize and modify, and not highly susceptible to drug resistance. To reduce the time and expense involved in screening thousands of peptides and assaying their antiviral activity, computational predictors for identifying anti-coronavirus peptides (ACVPs) are needed. However, few experimentally verified ACVP samples are available, even though a relatively large number of antiviral peptides (AVPs) have been discovered. In this study, we attempted to predict ACVPs using an AVP dataset and a small collection of ACVPs. Using conventional features, a binary profile and a word-embedding word2vec (W2V), we systematically explored five different machine learning methods: Transformer, Convolutional Neural Network, bidirectional Long Short-Term Memory, Random Forest (RF) and Support Vector Machine. Via exhaustive searches, we found that the RF classifier with W2V consistently achieved better performance on different datasets. The two main controlling factors were: (i) the dataset-specific W2V dictionary was generated from the training and independent test datasets instead of the widely used general UniProt proteome and (ii) a systematic search was conducted and determined the optimal k-mer value in W2V, which provides greater discrimination between positive and negative samples. Therefore, our proposed method, named iACVP, consistently provides better prediction performance compared with existing state-of-the-art methods. To assist experimentalists in identifying putative ACVPs, we implemented our model as a web server accessible via the following link: http://kurata35.bio.kyutech.ac.jp/iACVP.
Collapse
Affiliation(s)
- Hiroyuki Kurata
- Department of Bioscience and Bioinformatics, Kyushu Institute of Technology, 680-4 Kawazu, Iizuka, Fukuoka 820-8502, Japan
| | - Sho Tsukiyama
- Department of Bioscience and Bioinformatics, Kyushu Institute of Technology, 680-4 Kawazu, Iizuka, Fukuoka 820-8502, Japan
| | - Balachandran Manavalan
- Computational Biology and Bioinformatics Laboratory, Department of Integrative Biotechnology, College of Biotechnology and Bioengineering, Sungkyunkwan University, Suwon 16419, Gyeonggi-do, Republic of Korea
| |
Collapse
|
11
|
Charoenkwan P, Schaduangrat N, Lio' P, Moni MA, Manavalan B, Shoombuatong W. NEPTUNE: A novel computational approach for accurate and large-scale identification of tumor homing peptides. Comput Biol Med 2022; 148:105700. [PMID: 35715261 DOI: 10.1016/j.compbiomed.2022.105700] [Citation(s) in RCA: 13] [Impact Index Per Article: 4.3] [Reference Citation Analysis] [Abstract] [Key Words] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Received: 04/14/2022] [Revised: 05/31/2022] [Accepted: 06/04/2022] [Indexed: 11/16/2022]
Abstract
Tumor homing peptides (THPs) play a crucial role in recognizing and specifically binding to cancer cells. Although experimental approaches can facilitate the precise identification of THPs, they are usually time-consuming, labor-intensive, and not cost-effective. However, computational approaches can identify THPs by utilizing sequence information alone, thus highlighting their great potential for large-scale identification of THPs. Herein, we propose NEPTUNE, a novel computational approach for the accurate and large-scale identification of THPs from sequence information. Specifically, we constructed variant baseline models from multiple feature encoding schemes coupled with six popular machine learning algorithms. Subsequently, we comprehensively assessed and investigated the effects of these baseline models on THP prediction. Finally, the probabilistic information generated by the optimal baseline models is fed into a support vector machine-based classifier to construct the final meta-predictor (NEPTUNE). Cross-validation and independent tests demonstrated that NEPTUNE achieved superior performance for THP prediction compared with its constituent baseline models and the existing methods. Moreover, we employed the powerful SHapley additive exPlanations method to improve the interpretation of NEPTUNE and elucidate the most important features for identifying THPs. Finally, we implemented an online web server using NEPTUNE, which is available at http://pmlabstack.pythonanywhere.com/NEPTUNE. NEPTUNE could be beneficial for the large-scale identification of unknown THP candidates for follow-up experimental validation.
Collapse
Affiliation(s)
- Phasit Charoenkwan
- Modern Management and Information Technology, College of Arts, Media and Technology, Chiang Mai University, Chiang Mai, 50200, Thailand
| | - Nalini Schaduangrat
- Center of Data Mining and Biomedical Informatics, Faculty of Medical Technology, Mahidol University, Bangkok, 10700, Thailand
| | - Pietro Lio'
- Department of Computer Science and Technology, University of Cambridge, Cambridge, CB3 0FD, UK
| | - Mohammad Ali Moni
- Artificial Intelligence & Digital Health, School of Health and Rehabilitation Sciences, Faculty of Health and Behavioural Sciences, The University of Queensland St Lucia, QLD, 4072, Australia
| | - Balachandran Manavalan
- Computational Biology and Bioinformatics Laboratory, Department of Integrative Biotechnology, College of Biotechnology and Bioengineering, Sungkyunkwan University, Suwon, 16419, Gyeonggi-do, Republic of Korea.
| | - Watshara Shoombuatong
- Center of Data Mining and Biomedical Informatics, Faculty of Medical Technology, Mahidol University, Bangkok, 10700, Thailand.
| |
Collapse
|
12
|
Charoenkwan P, Schaduangrat N, Hasan MM, Moni MA, Lió P, Shoombuatong W. Empirical comparison and analysis of machine learning-based predictors for predicting and analyzing of thermophilic proteins. EXCLI JOURNAL 2022; 21:554-570. [PMID: 35651661 PMCID: PMC9150013 DOI: 10.17179/excli2022-4723] [Citation(s) in RCA: 2] [Impact Index Per Article: 0.7] [Reference Citation Analysis] [Abstract] [Key Words] [Download PDF] [Figures] [Subscribe] [Scholar Register] [Received: 01/28/2022] [Accepted: 02/21/2022] [Indexed: 12/15/2022]
Abstract
Thermophilic proteins (TPPs) are critical for basic research and in the food industry due to their ability to maintain a thermodynamically stable fold at extremely high temperatures. Thus, the expeditious identification of novel TPPs through computational models from protein sequences is very desirable. Over the last few decades, a number of computational methods, especially machine learning (ML)-based methods, for in silico prediction of TPPs have been developed. Therefore, it is desirable to revisit these methods and summarize their advantages and disadvantages in order to further develop new computational approaches to achieve more accurate and improved prediction of TPPs. With this goal in mind, we comprehensively investigate a large collection of fourteen state-of-the-art TPP predictors in terms of their dataset size, feature encoding schemes, feature selection strategies, ML algorithms, evaluation strategies and web server/software usability. To the best of our knowledge, this article represents the first comprehensive review on the development of ML-based methods for in silico prediction of TPPs. Among these TPP predictors, they can be classified into two groups according to the interpretability of ML algorithms employed (i.e., computational black-box methods and computational white-box methods). In order to perform the comparative analysis, we conducted a comparative study on several currently available TPP predictors based on two benchmark datasets. Finally, we provide future perspectives for the design and development of new computational models for TPP prediction. We hope that this comprehensive review will facilitate researchers in selecting an appropriate TPP predictor that is the most suitable one to deal with their purposes and provide useful perspectives for the development of more effective and accurate TPP predictors.
Collapse
Affiliation(s)
- Phasit Charoenkwan
- Modern Management and Information Technology, College of Arts, Media and Technology, Chiang Mai University, Chiang Mai, Thailand, 50200
| | - Nalini Schaduangrat
- Center of Data Mining and Biomedical Informatics, Faculty of Medical Technology, Mahidol University, Bangkok, Thailand, 10700
| | - Md Mehedi Hasan
- Tulane Center for Biomedical Informatics and Genomics, Division of Biomedical Informatics and Genomics, John W. Deming Department of Medicine, School of Medicine, Tulane University, New Orleans, LA 70112, USA
| | - Mohammad Ali Moni
- School of Health and Rehabilitation Sciences, Faculty of Health and Behavioural Sciences, the University of Queensland, St Lucia, QLD 4072, Australia
| | - Pietro Lió
- Department of Computer Science and Technology, University of Cambridge, Cambridge, CB3 0FD, UK
| | - Watshara Shoombuatong
- Center of Data Mining and Biomedical Informatics, Faculty of Medical Technology, Mahidol University, Bangkok, Thailand, 10700
| |
Collapse
|
13
|
Parra ALC, Bezerra LP, Shawar DE, Neto NAS, Mesquita FP, da Silva GO, Souza PFN. Synthetic antiviral peptides: a new way to develop targeted antiviral drugs. Future Virol 2022. [DOI: 10.2217/fvl-2021-0308] [Citation(s) in RCA: 1] [Impact Index Per Article: 0.3] [Reference Citation Analysis] [Abstract] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Indexed: 02/06/2023]
Abstract
The global concern over emerging and re-emerging viral infections has spurred the search for novel antiviral agents. Peptides with antiviral activity stand out, by overcoming limitations of the current drugs utilized, due to their biocompatibility, specificity and effectiveness. Synthetic peptides have been shown to be viable alternatives to natural peptides due to several difficulties of using of the latter in clinical trials. Various platforms have been utilized by researchers to predict the most effective peptide sequences against HIV, influenza, dengue, MERS and SARS. Synthetic peptides are already employed in the treatment of HIV infection. The novelty of this study is to discuss, for the first time, the potential of synthetic peptides as antiviral molecules. We conclude that synthetic peptides can act as new weapons against viral threats to humans.
Collapse
Affiliation(s)
- Aura LC Parra
- Department of Biochemistry & Molecular Biology, Federal University of Ceara, Fortaleza, Ceara, 60440-554, Brazil
| | - Leandro P Bezerra
- Department of Biochemistry & Molecular Biology, Federal University of Ceara, Fortaleza, Ceara, 60440-554, Brazil
| | - Dur E Shawar
- Department of Biochemistry & Molecular Biology, Federal University of Ceara, Fortaleza, Ceara, 60440-554, Brazil
| | - Nilton AS Neto
- Department of Biochemistry & Molecular Biology, Federal University of Ceara, Fortaleza, Ceara, 60440-554, Brazil
| | - Felipe P Mesquita
- Drug Research & Development Center (NPDM), Federal University of Ceará, Cel. Nunes de Melo, Rodolfo Teófilo, 1000, Fortaleza, Brazil
| | - Gabrielly O da Silva
- Department of Biochemistry & Molecular Biology, Federal University of Ceara, Fortaleza, Ceara, 60440-554, Brazil
| | - Pedro FN Souza
- Department of Biochemistry & Molecular Biology, Federal University of Ceara, Fortaleza, Ceara, 60440-554, Brazil
- Drug Research & Development Center (NPDM), Federal University of Ceará, Cel. Nunes de Melo, Rodolfo Teófilo, 1000, Fortaleza, Brazil
| |
Collapse
|
14
|
Manavalan B, Basith S, Lee G. Comparative analysis of machine learning-based approaches for identifying therapeutic peptides targeting SARS-CoV-2. Brief Bioinform 2022; 23:bbab412. [PMID: 34595489 PMCID: PMC8500067 DOI: 10.1093/bib/bbab412] [Citation(s) in RCA: 11] [Impact Index Per Article: 3.7] [Reference Citation Analysis] [Abstract] [Key Words] [MESH Headings] [Grants] [Track Full Text] [Download PDF] [Figures] [Journal Information] [Subscribe] [Scholar Register] [Received: 07/01/2021] [Revised: 08/27/2021] [Accepted: 09/07/2021] [Indexed: 01/08/2023] Open
Abstract
Coronavirus disease 2019 (COVID-19) has impacted public health as well as societal and economic well-being. In the last two decades, various prediction algorithms and tools have been developed for predicting antiviral peptides (AVPs). The current COVID-19 pandemic has underscored the need to develop more efficient and accurate machine learning (ML)-based prediction algorithms for the rapid identification of therapeutic peptides against severe acute respiratory syndrome coronavirus-2 (SARS-CoV-2). Several peptide-based ML approaches, including anti-coronavirus peptides (ACVPs), IL-6 inducing epitopes and other epitopes targeting SARS-CoV-2, have been implemented in COVID-19 therapeutics. Owing to the growing interest in the COVID-19 field, it is crucial to systematically compare the existing ML algorithms based on their performances. Accordingly, we comprehensively evaluated the state-of-the-art IL-6 and AVP predictors against coronaviruses in terms of core algorithms, feature encoding schemes, performance evaluation metrics and software usability. A comprehensive performance assessment was then conducted to evaluate the robustness and scalability of the existing predictors using well-constructed independent validation datasets. Additionally, we discussed the advantages and disadvantages of the existing methods, providing useful insights into the development of novel computational tools for characterizing and identifying epitopes or ACVPs. The insights gained from this review are anticipated to provide critical guidance to the scientific community in the rapid design and development of accurate and efficient next-generation in silico tools against SARS-CoV-2.
Collapse
Affiliation(s)
| | - Shaherin Basith
- Department of Physiology, Ajou University School of Medicine, Suwon 16499, Korea
| | - Gwang Lee
- Department of Physiology, Ajou University School of Medicine, Suwon 16499, Korea
| |
Collapse
|
15
|
βLact-Pred: A Predictor Developed for Identification of Beta-Lactamases Using Statistical Moments and PseAAC via 5-Step Rule. COMPUTATIONAL INTELLIGENCE AND NEUROSCIENCE 2021; 2021:8974265. [PMID: 34956358 PMCID: PMC8709780 DOI: 10.1155/2021/8974265] [Citation(s) in RCA: 4] [Impact Index Per Article: 1.0] [Reference Citation Analysis] [Abstract] [Track Full Text] [Download PDF] [Figures] [Subscribe] [Scholar Register] [Received: 09/22/2021] [Accepted: 11/22/2021] [Indexed: 12/02/2022]
Abstract
Beta-lactamase (β-lactamase) produced by different bacteria confers resistance against β-lactam-containing drugs. The gene encoding β-lactamase is plasmid-borne and can easily be transferred from one bacterium to another during conjugation. By such transformations, the recipient also acquires resistance against the drugs of the β-lactam family. β-Lactam antibiotics play a vital significance in clinical treatment of disastrous diseases like soft tissue infections, gonorrhoea, skin infections, urinary tract infections, and bronchitis. Herein, we report a prediction classifier named as βLact-Pred for the identification of β-lactamase proteins. The computational model uses the primary amino acid sequence structure as its input. Various metrics are derived from the primary structure to form a feature vector. Experimentally determined data of positive and negative beta-lactamases are collected and transformed into feature vectors. An operating algorithm based on the artificial neural network is used by integrating the position relative features and sequence statistical moments in PseAAC for training the neural networks. The results for the proposed computational model were validated by employing numerous types of approach, i.e., self-consistency testing, jackknife testing, cross-validation, and independent testing. The overall accuracy of the predictor for self-consistency, jackknife testing, cross-validation, and independent testing presents 99.76%, 96.07%, 94.20%, and 91.65%, respectively, for the proposed model. Stupendous experimental results demonstrated that the proposed predictor “βLact-Pred” has surpassed results from the existing methods.
Collapse
|
16
|
Auliah FN, Nilamyani AN, Shoombuatong W, Alam MA, Hasan MM, Kurata H. PUP-Fuse: Prediction of Protein Pupylation Sites by Integrating Multiple Sequence Representations. Int J Mol Sci 2021; 22:ijms22042120. [PMID: 33672741 PMCID: PMC7924619 DOI: 10.3390/ijms22042120] [Citation(s) in RCA: 5] [Impact Index Per Article: 1.3] [Reference Citation Analysis] [Abstract] [Key Words] [MESH Headings] [Download PDF] [Figures] [Journal Information] [Subscribe] [Scholar Register] [Received: 01/16/2021] [Revised: 02/12/2021] [Accepted: 02/18/2021] [Indexed: 12/30/2022] Open
Abstract
Pupylation is a type of reversible post-translational modification of proteins, which plays a key role in the cellular function of microbial organisms. Several proteomics methods have been developed for the prediction and analysis of pupylated proteins and pupylation sites. However, the traditional experimental methods are laborious and time-consuming. Hence, computational algorithms are highly needed that can predict potential pupylation sites using sequence features. In this research, a new prediction model, PUP-Fuse, has been developed for pupylation site prediction by integrating multiple sequence representations. Meanwhile, we explored the five types of feature encoding approaches and three machine learning (ML) algorithms. In the final model, we integrated the successive ML scores using a linear regression model. The PUP-Fuse achieved a Mathew correlation value of 0.768 by a 10-fold cross-validation test. It also outperformed existing predictors in an independent test. The web server of the PUP-Fuse with curated datasets is freely available.
Collapse
Affiliation(s)
- Firda Nurul Auliah
- Department of Bioscience and Bioinformatics, Kyushu Institute of Technology, 680-4 Kawazu, Iizuka, Fukuoka 820-8502, Japan; (F.N.A.); (A.N.N.); (M.M.H.)
| | - Andi Nur Nilamyani
- Department of Bioscience and Bioinformatics, Kyushu Institute of Technology, 680-4 Kawazu, Iizuka, Fukuoka 820-8502, Japan; (F.N.A.); (A.N.N.); (M.M.H.)
| | - Watshara Shoombuatong
- Center of Data Mining and Biomedical Informatics, Faculty of Medical Technology, Mahidol University, Bangkok 10700, Thailand;
| | - Md Ashad Alam
- Tulane Center for Biomedical Informatics and Genomics, Division of Biomedical Informatics and Genomics, John W. Deming Department of Medicine, School of Medicine, Tulane University, New Orleans, LA 70112, USA;
| | - Md Mehedi Hasan
- Department of Bioscience and Bioinformatics, Kyushu Institute of Technology, 680-4 Kawazu, Iizuka, Fukuoka 820-8502, Japan; (F.N.A.); (A.N.N.); (M.M.H.)
- Japan Society for the Promotion of Science, 5-3-1 Kojimachi, Chiyoda-ku, Tokyo 102-0083, Japan
| | - Hiroyuki Kurata
- Department of Bioscience and Bioinformatics, Kyushu Institute of Technology, 680-4 Kawazu, Iizuka, Fukuoka 820-8502, Japan; (F.N.A.); (A.N.N.); (M.M.H.)
- Correspondence:
| |
Collapse
|
17
|
Hasan MM, Alam MA, Shoombuatong W, Kurata H. IRC-Fuse: improved and robust prediction of redox-sensitive cysteine by fusing of multiple feature representations. J Comput Aided Mol Des 2021; 35:315-323. [PMID: 33392948 DOI: 10.1007/s10822-020-00368-0] [Citation(s) in RCA: 4] [Impact Index Per Article: 1.0] [Reference Citation Analysis] [Abstract] [Key Words] [Journal Information] [Subscribe] [Scholar Register] [Received: 06/11/2020] [Accepted: 12/06/2020] [Indexed: 12/11/2022]
Abstract
Redox-sensitive cysteine (RSC) thiol contributes to many biological processes. The identification of RSC plays an important role in clarifying some mechanisms of redox-sensitive factors; nonetheless, experimental investigation of RSCs is expensive and time-consuming. The computational approaches that quickly and accurately identify candidate RSCs using the sequence information are urgently needed. Herein, an improved and robust computational predictor named IRC-Fuse was developed to identify the RSC by fusing of multiple feature representations. To enhance the performance of our model, we integrated the probability scores evaluated by the random forest models implementing different encoding schemes. Cross-validation results exhibited that the IRC-Fuse achieved accuracy and AUC of 0.741 and 0.807, respectively. The IRC-Fuse outperformed exiting methods with improvement of 10% and 13% on accuracy and MCC, respectively, over independent test data. Comparative analysis suggested that the IRC-Fuse was more effective and promising than the existing predictors. For the convenience of experimental scientists, the IRC-Fuse online web server was implemented and publicly accessible at http://kurata14.bio.kyutech.ac.jp/IRC-Fuse/ .
Collapse
Affiliation(s)
- Md Mehedi Hasan
- Department of Bioscience and Bioinformatics, Kyushu Institute of Technology, 680-4 Kawazu, Iizuka, Fukuoka, 820-8502, Japan. .,Japan Society for the Promotion of Science, 5-3-1 Kojimachi, Chiyoda-ku, Tokyo, 102-0083, Japan.
| | - Md Ashad Alam
- Tulane Center of Biomedical Informatics and Genomics, Division of Biomedical Informatics and Genomics, John W. Deming Department of Medicine, School of Medicine, Tulane University, New Orleans, LA, 70112, USA
| | - Watshara Shoombuatong
- Center of Data Mining and Biomedical Informatics, Faculty of Medical Technology, Mahidol University, Bangkok, 10700, Thailand
| | - Hiroyuki Kurata
- Department of Bioscience and Bioinformatics, Kyushu Institute of Technology, 680-4 Kawazu, Iizuka, Fukuoka, 820-8502, Japan.
| |
Collapse
|