1. Li W, Liu X, Liu Y, Zheng Z. High-Accuracy Identification and Structure-Activity Analysis of Antioxidant Peptides via Deep Learning and Quantum Chemistry. J Chem Inf Model 2025; 65:603-612. [PMID: 39772654 DOI: 10.1021/acs.jcim.4c01713] [Indexed: 01/11/2025]
Abstract
Antioxidant peptides (AOPs) hold great promise for mitigating oxidative-stress-related diseases, but their discovery is hindered by inefficient and time-consuming traditional methods. To address this, we developed an innovative framework combining machine learning and quantum chemistry to accelerate AOP identification and analyze structure-activity relationships. A Bi-LSTM-based model, AOPP, achieved superior performance with accuracies of 0.9043 and 0.9267, precisions of 0.9767 and 0.9848, and Matthews correlation coefficients (MCCs) of 0.818 and 0.859 on two data sets, outperforming existing methods. Compared with XGBoost and LightGBM, AOPP demonstrated a 4.67% improvement in accuracy. Feature fusion significantly enhanced classification, as validated by UMAP visualization. Experimental validation of ten peptides confirmed the antioxidant activity, with LLA exhibiting the highest DPPH and ABTS scavenging rates (0.108 and 0.437 mmol/g, respectively). Quantum chemical calculations identified LLA's lowest HOMO-LUMO gap (ΔE = 0.26 eV) and C3-H26 as the key active site contributing to its superior antioxidant potential. This study highlights the synergy of machine learning and quantum chemistry, offering an efficient framework for AOP discovery with broad applications in therapeutics and functional foods.
Affiliation(s)
- Wanxing Li
  - School of Food Science and Technology, Jiangnan University, Wuxi 214122, China
- Xuejing Liu
  - School of Food Science and Technology, Jiangnan University, Wuxi 214122, China
- Yuanfa Liu
  - School of Food Science and Technology, Jiangnan University, Wuxi 214122, China
- Zhaojun Zheng
  - School of Food Science and Technology, Jiangnan University, Wuxi 214122, China
2. Basith S, Manavalan B, Lee G. AntiT2DMP-Pred: Leveraging feature fusion and optimization for superior machine learning prediction of type 2 diabetes mellitus. Methods 2025; 234:264-274. [PMID: 39798942 DOI: 10.1016/j.ymeth.2025.01.003] [Received: 10/31/2024] [Revised: 12/26/2024] [Accepted: 01/04/2025] [Indexed: 01/15/2025] [Open Access]
Abstract
Pancreatic α-amylase breaks down starch into isomaltose and maltose, which are further hydrolyzed by α-glucosidase in the intestine into monosaccharides, rapidly raising blood sugar levels and contributing to type 2 diabetes mellitus (T2DM). Synthetic inhibitors of carbohydrate-digesting enzymes are used to manage T2DM but may harm organ function over time. Bioactive peptides offer a safer alternative, avoiding such adverse effects. Computational methods for predicting antidiabetic peptides (ADPs) can significantly reduce the time and cost of experimental testing. While machine learning (ML) has been applied to identify ADPs, advancements in data analysis and algorithms continue to drive progress in the field. To address this, we developed AntiT2DMP-Pred, the first ML-based tool specifically designed for predicting type 2 antidiabetic peptides (T2ADPs). This tool employs a feature fusion strategy, combining ten highly discriminative feature descriptors chosen from a pool of 32 descriptors and eight ML algorithms, tested across a range of baseline models. AntiT2DMP-Pred demonstrated excellent performance, surpassing both baseline and feature-optimized models, with an accuracy (ACC) and Matthews' correlation coefficient (MCC) of 0.976 and 0.953 on the training dataset, and an ACC and MCC of 0.957 and 0.851 on the independent dataset. The web server (https://balalab-skku.org/AntiT2DMP-Pred) is freely accessible, enabling researchers worldwide to utilize it in their experimental workflows and contribute to the discovery and understanding of T2ADPs, ultimately supporting peptide-based therapeutic development for diabetes management.
Affiliation(s)
- Shaherin Basith
  - Department of Physiology, Ajou University School of Medicine, Suwon 16499, Republic of Korea
- Balachandran Manavalan
  - Department of Integrative Biotechnology, College of Biotechnology and Bioengineering, Sungkyunkwan University, Suwon 16419, Republic of Korea
- Gwang Lee
  - Department of Physiology, Ajou University School of Medicine, Suwon 16499, Republic of Korea; Department of Molecular Science and Technology, Ajou University, Suwon 16499, Republic of Korea
3. Ramasundaram M, Sohn H, Madhavan T. A bird's-eye view of the biological mechanism and machine learning prediction approaches for cell-penetrating peptides. Front Artif Intell 2025; 7:1497307. [PMID: 39839972 PMCID: PMC11747587 DOI: 10.3389/frai.2024.1497307] [Received: 09/16/2024] [Accepted: 12/13/2024] [Indexed: 01/23/2025] [Open Access]
Abstract
Cell-penetrating peptides (CPPs) are highly effective at passing through eukaryotic membranes with various cargo molecules, like drugs, proteins, nucleic acids, and nanoparticles, without causing significant harm. Creating drug delivery systems with CPP is associated with cancer, genetic disorders, and diabetes due to their unique chemical properties. Wet lab experiments in drug discovery methodologies are time-consuming and expensive. Machine learning (ML) techniques can enhance and accelerate the drug discovery process with accurate and intricate data quality. ML classifiers, such as support vector machine (SVM), random forest (RF), gradient-boosted decision trees (GBDT), and different types of artificial neural networks (ANN), are commonly used for CPP prediction with cross-validation performance evaluation. Functional CPP prediction is improved by using these ML strategies by using CPP datasets produced by high-throughput sequencing and computational methods. This review focuses on several ML-based CPP prediction tools. We discussed the CPP mechanism to understand the basic functioning of CPPs through cells. A comparative analysis of diverse CPP prediction methods was conducted based on their algorithms, dataset size, feature encoding, software utilities, assessment metrics, and prediction scores. The performance of the CPP prediction was evaluated based on accuracy, sensitivity, specificity, and Matthews correlation coefficient (MCC) on independent datasets. In conclusion, this review will encourage the use of ML algorithms for finding effective CPPs, which will have a positive impact on future research on drug delivery and therapeutics.
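The four evaluation metrics named in this abstract (accuracy, sensitivity, specificity, and MCC) can all be derived from a binary confusion matrix. The following is a minimal sketch, not code from the review; the function name `binary_metrics` and the toy labels are hypothetical and chosen only for illustration.

```python
import math


def binary_metrics(y_true, y_pred):
    """Return accuracy, sensitivity, specificity and MCC for 0/1 labels."""
    pairs = list(zip(y_true, y_pred))
    tp = sum(1 for t, p in pairs if t == 1 and p == 1)
    tn = sum(1 for t, p in pairs if t == 0 and p == 0)
    fp = sum(1 for t, p in pairs if t == 0 and p == 1)
    fn = sum(1 for t, p in pairs if t == 1 and p == 0)

    total = tp + tn + fp + fn
    # Guard every ratio against an empty denominator.
    accuracy = (tp + tn) / total if total else 0.0
    sensitivity = tp / (tp + fn) if (tp + fn) else 0.0
    specificity = tn / (tn + fp) if (tn + fp) else 0.0
    denom = math.sqrt((tp + fp) * (tp + fn) * (tn + fp) * (tn + fn))
    mcc = (tp * tn - fp * fn) / denom if denom else 0.0
    return {
        "accuracy": accuracy,
        "sensitivity": sensitivity,
        "specificity": specificity,
        "mcc": mcc,
    }


# Hypothetical labels: 1 = CPP, 0 = non-CPP (not data from any cited study).
y_true = [1, 1, 1, 1, 0, 0, 0, 0]
y_pred = [1, 1, 1, 0, 0, 0, 0, 1]
result = binary_metrics(y_true, y_pred)
print(result)
```

Unlike accuracy, MCC uses all four cells of the confusion matrix, which is one reason it is commonly reported alongside accuracy for imbalanced peptide datasets.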
Affiliation(s)
- Maduravani Ramasundaram
  - Department of Genetic Engineering, Computational Biology Lab, School of Bioengineering, SRM Institute of Science and Technology, SRM Nagar, Chennai, India
- Honglae Sohn
  - Department of Chemistry and Department of Carbon Materials, Chosun University, Gwangju, Republic of Korea
- Thirumurthy Madhavan
  - Department of Genetic Engineering, Computational Biology Lab, School of Bioengineering, SRM Institute of Science and Technology, SRM Nagar, Chennai, India
4. Saraswat A, Sharma U, Gandotra A, Wasan L, Artham S, Maitra A, Singh B. Pred-AHCP: Robust Feature Selection-Enabled Sequence-Specific Prediction of Anti-Hepatitis C Peptides via Machine Learning. J Chem Inf Model 2024; 64:9111-9124. [PMID: 39505690 DOI: 10.1021/acs.jcim.4c00900] [Indexed: 11/08/2024]
Abstract
Every year, an estimated 1.5 million people worldwide contract Hepatitis C, a significant contributor to liver problems. Although many studies have explored machine learning's potential to predict antiviral peptides, very few have addressed the problem of predicting peptides against specific viruses such as Hepatitis C. In this study, we demonstrate the application and fine-tuning of machine learning (ML) algorithms to predict peptides that are effective against Hepatitis C virus (HCV). We developed a fine-tuned and explainable ML model that harnesses the amino acid sequence of a peptide to predict its anti-hepatitis C potential. Specifically, features were computed based on sequence and physicochemical properties. The feature selection was performed using a combined strategy of mutual information and variance inflation factor. This facilitated the removal of redundant and multicollinear features, enhancing the model's generalizability in predicting anti-hepatitis C peptides (AHCPs). The model using the random forest algorithm produced the best performance with an accuracy of about 92%. The feature analysis highlights that the distributions of hydrophobicity, polarizability, coil-forming residues, frequency of glycine residues and the existence of dipeptide motifs VL, LV, and CC emerged as the key predictors for identifying AHCPs targeting different components of HCV. The developed model can be accessed through the Pred-AHCP web server, provided at http://tinyurl.com/web-Pred-AHCP. This resource facilitates the prediction and re-engineering of AHCPs for designing peptide-based therapeutics while also proposing an exploration of similar strategies for designing peptide inhibitors effective against other viruses. The developed ML model can also be used for validating peptide sequences generated using generative artificial intelligence methods for further optimization.
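As a rough illustration of the mutual-information half of the feature selection described above, the sketch below scores binary "dipeptide motif present" features against class labels. It is not the Pred-AHCP implementation: the helper names, the toy peptide sequences, and the labels are all hypothetical, and the variance inflation factor step used by the authors to remove multicollinear features is not covered here.

```python
import math
from collections import Counter


def mutual_information(xs, ys):
    """Mutual information (in bits) between two discrete sequences."""
    n = len(xs)
    joint = Counter(zip(xs, ys))
    px = Counter(xs)
    py = Counter(ys)
    mi = 0.0
    for (x, y), count in joint.items():
        p_xy = count / n
        mi += p_xy * math.log2(p_xy / ((px[x] / n) * (py[y] / n)))
    return mi


def has_motif(sequence, motif):
    """Binary feature: 1 if the dipeptide motif occurs in the sequence."""
    return int(motif in sequence)


# Hypothetical toy data (not from the paper): 1 = active, 0 = inactive.
peptides = ["AVLK", "GVLC", "KVLA", "AGKK", "GKAC", "KAGA"]
labels = [1, 1, 1, 0, 0, 0]

# Score each candidate motif by how much it tells us about the label.
scores = {
    motif: mutual_information([has_motif(s, motif) for s in peptides], labels)
    for motif in ("VL", "LV", "CC")
}
ranked = sorted(scores, key=scores.get, reverse=True)
print(scores, ranked)
```

In this toy set the "VL" feature matches the labels exactly and so carries one full bit of information, while motifs that never occur carry none; a real pipeline would rank many more descriptors and then prune correlated ones.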
Affiliation(s)
- Akash Saraswat
  - Department of Applied Sciences, School of Engineering and Technology, BML Munjal University, Gurugram, Haryana 122413, India
- Utsav Sharma
  - Department of Computer Science and Engineering, School of Engineering and Technology, BML Munjal University, Gurugram, Haryana 122413, India
- Aryan Gandotra
  - Department of Computer Science and Engineering, School of Engineering and Technology, BML Munjal University, Gurugram, Haryana 122413, India
- Lakshit Wasan
  - Department of Computer Science and Engineering, School of Engineering and Technology, BML Munjal University, Gurugram, Haryana 122413, India
- Sainithin Artham
  - Department of Computer Science and Engineering, School of Engineering and Technology, BML Munjal University, Gurugram, Haryana 122413, India
- Arijit Maitra
  - Department of Applied Sciences, School of Engineering and Technology, BML Munjal University, Gurugram, Haryana 122413, India
- Bipin Singh
  - Centre for Life Sciences, Mahindra University, Hyderabad, Telangana 500043, India
5. Niu S, Fan H, Wang F, Yang X, Xia J. Identification of Multi-functional Therapeutic Peptides Based on Prototypical Supervised Contrastive Learning. Interdiscip Sci 2024:10.1007/s12539-024-00674-3. [PMID: 39714581 DOI: 10.1007/s12539-024-00674-3] [Received: 06/16/2024] [Revised: 10/31/2024] [Accepted: 11/04/2024] [Indexed: 12/24/2024]
Abstract
High-throughput sequencing has exponentially increased peptide sequences, necessitating a computational method to identify multi-functional therapeutic peptides (MFTP) from their sequences. However, existing computational methods are challenged by class imbalance, particularly in learning effective sequence representations. To address this, we propose PSCFA, a prototypical supervised contrastive learning with a feature augmentation method for MFTP prediction. We employ a two-stage training scheme to train the feature extractor and the classifier respectively, underpinned by the principle that better feature representation boosts classification accuracy. In the first stage, we utilize a prototypical supervised contrastive learning strategy to enhance the uniformity of feature space distribution, ensuring that the characteristics of samples within the same category are tightly clustered while those from different categories are more dispersed. In the second stage, a feature augmentation strategy that focuses on infrequent labels (tail labels) is used to refine the learning process of the classifier. We use a prototype-based variational autoencoder to capture semantic links among common labels (head labels) and their prototypes. This knowledge is then transferred to tail labels, generating enhanced features for classifier training. The experiments prove that the PSCFA method significantly outperforms existing methods for MFTP prediction, making a significant advancement in therapeutic peptide identification.
Affiliation(s)
- Sitong Niu
  - College of Mathematics and System sciences, Xinjiang University, Urumqi, 830046, Xinjiang, China
- Henghui Fan
  - Institutes of Physical Science and Information Technology, Anhui University, Hefei, 230601, Anhui, China
  - Information Materials and Intelligent Sensing Laboratory of Anhui Province, Key Laboratory of Intelligent Computing and Signal Processing of Ministry of Education, Institutes of Physical Science and Information Technology, Anhui University, Hefei, 230601, Anhui, China
- Fei Wang
  - School of Artificial Intelligence, Anhui University, Hefei, 230601, Anhui, China
- Xiaomei Yang
  - College of Mathematics and System sciences, Xinjiang University, Urumqi, 830046, Xinjiang, China
- Junfeng Xia
  - Institutes of Physical Science and Information Technology, Anhui University, Hefei, 230601, Anhui, China
6. Hwang JS, Kim SG, George NP, Kwon M, Jang YE, Lee SS, Lee G. Biological Function Analysis of MicroRNAs and Proteins in the Cerebrospinal Fluid of Patients with Parkinson's Disease. Int J Mol Sci 2024; 25:13260. [PMID: 39769025 PMCID: PMC11678473 DOI: 10.3390/ijms252413260] [Received: 11/07/2024] [Revised: 12/01/2024] [Accepted: 12/05/2024] [Indexed: 01/11/2025] [Open Access]
Abstract
Parkinson's disease (PD) is a progressive neurodegenerative disorder characterized by alpha-synuclein aggregation into Lewy bodies in the neurons. Cerebrospinal fluid (CSF) is considered the most suited source for investigating PD pathogenesis and identifying biomarkers. While microRNA (miRNA) profiling can aid in the investigation of post-transcriptional regulation in neurodegenerative diseases, information on miRNAs in the CSF of patients with PD remains limited. This review combines miRNA analysis with proteomic profiling to explore the collective impact of CSF miRNAs on the neurodegenerative mechanisms in PD. We constructed separate networks for altered miRNAs and proteomes using a bioinformatics method. Altered miRNAs were poorly linked to biological functions owing to limited information; however, changes in protein expression were strongly associated with biological functions. Subsequently, the networks were integrated for further analysis. In silico prediction from the integrated network revealed relationships between miRNAs and proteins, highlighting increased reactive oxygen species generation, neuronal loss, and neurodegeneration and suppressed ATP synthesis, mitochondrial function, and neurotransmitter release in PD. The approach suggests the potential of miRNAs as biomarkers for critical mechanisms underlying PD. The combined strategy could enhance our understanding of the complex biochemical networks of miRNAs in PD and support the development of diagnostic and therapeutic strategies for precision medicine.
Affiliation(s)
- Ji Su Hwang
  - Department of Molecular Science and Technology, Ajou University, Suwon 16499, Republic of Korea
  - Department of Physiology, Ajou University School of Medicine, Suwon 16499, Republic of Korea
- Seok Gi Kim
  - Department of Molecular Science and Technology, Ajou University, Suwon 16499, Republic of Korea
  - Department of Physiology, Ajou University School of Medicine, Suwon 16499, Republic of Korea
- Nimisha Pradeep George
  - Department of Molecular Science and Technology, Ajou University, Suwon 16499, Republic of Korea
  - Department of Physiology, Ajou University School of Medicine, Suwon 16499, Republic of Korea
- Minjun Kwon
  - Department of Molecular Science and Technology, Ajou University, Suwon 16499, Republic of Korea
  - Department of Physiology, Ajou University School of Medicine, Suwon 16499, Republic of Korea
- Yong Eun Jang
  - Department of Molecular Science and Technology, Ajou University, Suwon 16499, Republic of Korea
  - Department of Physiology, Ajou University School of Medicine, Suwon 16499, Republic of Korea
- Sang Seop Lee
  - Department of Pharmacology, Inje University College of Medicine, Busan 47392, Republic of Korea
- Gwang Lee
  - Department of Molecular Science and Technology, Ajou University, Suwon 16499, Republic of Korea
  - Department of Physiology, Ajou University School of Medicine, Suwon 16499, Republic of Korea
7. Shukla R, Singh TR. AlzGenPred - CatBoost-based gene classifier for predicting Alzheimer's disease using high-throughput sequencing data. Sci Rep 2024; 14:30294. [PMID: 39639110 PMCID: PMC11621786 DOI: 10.1038/s41598-024-82208-x] [Received: 08/02/2024] [Accepted: 12/03/2024] [Indexed: 12/07/2024] [Open Access]
Abstract
AD is a progressive neurodegenerative disorder characterized by memory loss. Due to the advancement in next-generation sequencing, an enormous amount of AD-associated genomics data is available. However, the information about the involvement of these genes in AD association is still a research topic. Therefore, AlzGenPred is developed to identify the AD-associated genes using machine learning. A total of 13,504 features derived from eight sequence-encoding schemes were generated and evaluated using 16 machine learning algorithms. Network-based features significantly outperformed sequence-based features, effectively distinguishing AD-associated genes. In contrast, sequence-based features failed to classify accurately. To improve performance, we generated 24 fused features (6020 D) from sequence-based encodings, increasing accuracy by 5-7% using a two-step LightGBM-based recursive feature selection method. However, accuracy remained below 70% even after hyperparameter tuning. Therefore, network-based features were used to generate the CatBoost-based ML method AlzGenPred with 96.55% accuracy and 98.99% AUROC. The developed method is tested on the AlzGene dataset where it showed 96.43% accuracy. Then the model was validated using the transcriptomics dataset. AlzGenPred provides a reliable and user-friendly tool for identifying potential AD biomarkers, accelerating biomarker discovery, and advancing our understanding of AD. It is available at https://www.bioinfoindia.org/alzgenpred/ and https://github.com/shuklarohit815/AlzGenPred.
Affiliation(s)
- Rohit Shukla
  - Department of Biotechnology and Bioinformatics, Jaypee University of Information Technology (JUIT), Waknaghat, Solan, 173234, H.P., India
  - Center of Excellence for Aging and Brain Repair, Morsani College of Medicine, University of South Florida, Tampa, 33613, FL, USA
- Tiratha Raj Singh
  - Department of Biotechnology and Bioinformatics, Jaypee University of Information Technology (JUIT), Waknaghat, Solan, 173234, H.P., India
  - Centre of Healthcare Technologies and Informatics (CEHTI), Jaypee University of Information Technology (JUIT), Waknaghat, Solan, 173234, H.P., India
8. Peng Q, Jiang L, Shen Y, Xu Y, Shen X, Zou L, Zhu Y, Shen Y. LC-MS metabolomics analysis of serum metabolites during neoadjuvant chemoradiotherapy in locally advanced rectal cancer. Clin Transl Oncol 2024; 26:3150-3168. [PMID: 38831193 DOI: 10.1007/s12094-024-03537-x] [Received: 04/24/2024] [Accepted: 05/18/2024] [Indexed: 06/05/2024]
Abstract
BACKGROUND: This study aimed to investigate the serum metabolite profiles during neoadjuvant chemoradiotherapy (NCRT) in locally advanced rectal cancer (LARC) using liquid chromatography-mass spectrometry (LC-MS) metabolomics analysis. METHODS: 60 serum samples were collected from 20 patients with LARC before, during, and after radiotherapy. LC-MS metabolomics analysis was performed to identify the metabolite variations. Functional annotation was applied to discover altered metabolic pathways. The key metabolites were screened and their ability to predict sensitivity to radiotherapy was calculated using random forests and ROC curves. RESULTS: The results showed that NCRT led to significant changes in the serum metabolite profiles. The serum metabolic profiles showed an apparent separation between different time points and different sensitivity groups. Moreover, the functional annotation showed that the differential metabolites were associated with a series of important metabolic pathways. Pre-radiotherapy (3Z,6Z)-3,6-Nonadiena and pro-radiotherapy 1-Hydroxyibuprofen showed good predictive performance in discriminating the sensitive and non-sensitive group to NCRT, with an AUC of 0.812 and 0.75, respectively. Importantly, the combination of different metabolites significantly increased the predictive ability. CONCLUSION: This study demonstrated the potential of LC-MS metabolomics for revealing the serum metabolite profiles during NCRT in LARC. The identified metabolites may serve as potential biomarkers and therapeutic targets for the management of this disease. Furthermore, the understanding of the affected metabolic pathways may help design more personalized therapeutic strategies for LARC patients.
Affiliation(s)
- Qiliang Peng
  - Department of Radiotherapy and Oncology, The Second Affiliated Hospital of Soochow University, Suzhou, China
  - Institute of Radiotherapy & Oncology, Soochow University, Suzhou, China
  - State Key Laboratory of Radiation Medicine and Protection, Soochow University, Suzhou, China
- Lili Jiang
  - Department of Oncology, Nantong Haimen District People's Hospital, Jiangsu, China
- Yi Shen
  - Department of Radiation Oncology, Suzhou Hospital, Affiliated Hospital of Medical School, Nanjing University, Suzhou, China
- Yao Xu
  - Department of Radiology, The Second Affiliated Hospital of Soochow University, Suzhou, China
- Xinan Shen
  - Department of Radiotherapy and Oncology, The Second Affiliated Hospital of Soochow University, Suzhou, China
  - Institute of Radiotherapy & Oncology, Soochow University, Suzhou, China
- Li Zou
  - Department of Radiotherapy and Oncology, The Second Affiliated Hospital of Soochow University, Suzhou, China
  - Institute of Radiotherapy & Oncology, Soochow University, Suzhou, China
- Yaqun Zhu
  - Department of Radiotherapy and Oncology, The Second Affiliated Hospital of Soochow University, Suzhou, China
  - Institute of Radiotherapy & Oncology, Soochow University, Suzhou, China
- Yuntian Shen
  - Department of Radiotherapy and Oncology, The Second Affiliated Hospital of Soochow University, Suzhou, China
  - Institute of Radiotherapy & Oncology, Soochow University, Suzhou, China
9. Basith S, Sangaraju VK, Manavalan B, Lee G. mHPpred: Accurate identification of peptide hormones using multi-view feature learning. Comput Biol Med 2024; 183:109297. [PMID: 39442438 DOI: 10.1016/j.compbiomed.2024.109297] [Received: 08/20/2024] [Revised: 10/04/2024] [Accepted: 10/15/2024] [Indexed: 10/25/2024]
Abstract
Peptide hormones were first used in medicine in the early 20th century, with the pivotal event being the isolation and purification of insulin in 1921. These hormones are integral to a sophisticated system that emerged early in evolution to regulate growth, development, and homeostasis. They serve as targeted signaling molecules that transfer specific information between cells and organs, ensuring coordinated and precise physiological responses. While experimental methods for identifying peptide hormones present challenges such as low abundance, stability issues, and complexity, computational methods offer promising alternatives. Advances in machine learning and bioinformatics have facilitated the prediction of peptide hormones, further enhancing their therapeutic potential. In this study, we explored three different computational frameworks for peptide hormone identification and determined that the meta-approach was the most suitable. Firstly, we evaluated the discriminative power of 26 feature descriptors using a series of baseline models and identified seven feature descriptors with high predictive potential. Through a systematic approach, we then selected the top 20 performing baseline models and integrated their predicted probabilities to train a meta-model, leveraging the strengths of multiple prediction strategies. Our final light gradient boosting-based meta-model, mHPpred, significantly outperformed the existing method, HOPPred, on both benchmarking and independent datasets. Notably, mHPpred also demonstrated superior performance compared to the hybrid and integrative framework approaches employed in this study. This superiority demonstrates the effectiveness of our multi-view feature learning strategy in capturing discriminative features and providing a more accurate prediction model for peptide hormones. mHPpred is publicly accessible at: https://balalab-skku.org/mHPpred.
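The meta-approach described above, in which predicted probabilities from baseline models become the input features of a meta-model, can be sketched as follows. This is an illustrative toy, not the mHPpred pipeline: the authors report a light gradient boosting meta-model trained on 20 baseline models, whereas this sketch substitutes a tiny logistic regression fitted by gradient descent on three hypothetical probability columns, and every number in it is made up.

```python
import math


def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))


def predict_proba(w, b, x):
    """Meta-model probability for one row of baseline probabilities."""
    return sigmoid(sum(wj * xj for wj, xj in zip(w, x)) + b)


def train_logistic(X, y, lr=0.5, epochs=5000):
    """Fit logistic regression with plain batch gradient descent."""
    n, d = len(X), len(X[0])
    w, b = [0.0] * d, 0.0
    for _ in range(epochs):
        grad_w, grad_b = [0.0] * d, 0.0
        for xi, yi in zip(X, y):
            err = predict_proba(w, b, xi) - yi
            for j, xj in enumerate(xi):
                grad_w[j] += err * xj
            grad_b += err
        w = [wj - lr * gj / n for wj, gj in zip(w, grad_w)]
        b -= lr * grad_b / n
    return w, b


# Hypothetical meta-features: each row holds the probabilities that three
# baseline models assigned to one peptide (assumed values, not real output).
meta_X = [
    [0.9, 0.8, 0.7],
    [0.8, 0.6, 0.9],
    [0.7, 0.9, 0.6],
    [0.2, 0.3, 0.1],
    [0.1, 0.4, 0.3],
    [0.3, 0.2, 0.2],
]
meta_y = [1, 1, 1, 0, 0, 0]  # 1 = peptide hormone, 0 = non-hormone

w, b = train_logistic(meta_X, meta_y)
train_pred = [int(predict_proba(w, b, x) >= 0.5) for x in meta_X]
print(train_pred)

# Scoring a new peptide means running the baselines first, then the meta-model.
print(predict_proba(w, b, [0.6, 0.7, 0.8]))
```

In stacking setups generally, the baseline probabilities used to train the meta-model are produced by cross-validation (out-of-fold predictions) so that the meta-model does not learn from leaked training labels; the abstract does not state how this was handled here.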
Affiliation(s)
- Shaherin Basith
  - Department of Physiology, Ajou University School of Medicine, Suwon, 16499, Republic of Korea
- Vinoth Kumar Sangaraju
  - Department of Integrative Biotechnology, College of Biotechnology and Bioengineering, Sungkyunkwan University, Suwon, 16419, Republic of Korea
- Balachandran Manavalan
  - Department of Integrative Biotechnology, College of Biotechnology and Bioengineering, Sungkyunkwan University, Suwon, 16419, Republic of Korea
- Gwang Lee
  - Department of Physiology, Ajou University School of Medicine, Suwon, 16499, Republic of Korea; Department of Molecular Science and Technology, Ajou University, Suwon, 16499, Republic of Korea
10. Hashemi S, Vosough P, Taghizadeh S, Savardashtaki A. Therapeutic peptide development revolutionized: Harnessing the power of artificial intelligence for drug discovery. Heliyon 2024; 10:e40265. [PMID: 39605829 PMCID: PMC11600032 DOI: 10.1016/j.heliyon.2024.e40265] [Received: 02/20/2024] [Revised: 10/07/2024] [Accepted: 11/07/2024] [Indexed: 11/29/2024] [Open Access]
Abstract
Due to the spread of antibiotic resistance, global attention is focused on its inhibition and the expansion of effective medicinal compounds. The novel functional properties of peptides have opened up new horizons in personalized medicine. With artificial intelligence methods combined with therapeutic peptide products, pharmaceuticals and biotechnology advance drug development rapidly and reduce costs. Short-chain peptides inhibit a wide range of pathogens and have great potential for targeting diseases. To address the challenges of synthesis and sustainability, artificial intelligence methods, namely machine learning, must be integrated into their production. Learning methods can use complicated computations to select the active and toxic compounds of the drug and its metabolic activity. Through this comprehensive review, we investigated the artificial intelligence method as a potential tool for finding peptide-based drugs and providing a more accurate analysis of peptides through the introduction of predictable databases for effective selection and development.
Affiliation(s)
- Samaneh Hashemi
  - Student Research Committee, Shiraz University of Medical Sciences, Shiraz, Iran
  - Department of Medical Biotechnology, School of Advanced Medical Sciences and Technologies, Shiraz University of Medical Sciences, Shiraz, Iran
- Parisa Vosough
  - Student Research Committee, Shiraz University of Medical Sciences, Shiraz, Iran
  - Department of Medical Biotechnology, School of Advanced Medical Sciences and Technologies, Shiraz University of Medical Sciences, Shiraz, Iran
- Saeed Taghizadeh
  - Department of Medical Biotechnology, School of Advanced Medical Sciences and Technologies, Shiraz University of Medical Sciences, Shiraz, Iran
  - Pharmaceutical Science Research Center, Shiraz University of Medical Sciences, Shiraz, Iran
- Amir Savardashtaki
  - Department of Medical Biotechnology, School of Advanced Medical Sciences and Technologies, Shiraz University of Medical Sciences, Shiraz, Iran
  - Infertility Research Center, Shiraz University of Medical Sciences, Shiraz, Iran
11. Fan X, Ye J, Zhong W, Shen H, Li H, Liu Z, Bai J, Du S. The Promoting Effect of Animal Bioactive Proteins and Peptide Components on Wound Healing: A Review. Int J Mol Sci 2024; 25:12561. [PMID: 39684273 DOI: 10.3390/ijms252312561] [Received: 10/18/2024] [Revised: 11/15/2024] [Accepted: 11/21/2024] [Indexed: 12/18/2024] [Open Access]
Abstract
The skin is the first line of defense to protect the host from external environmental damage. When the skin is damaged, the wound provides convenience for the invasion of external substances. The prolonged nonhealing of wounds can also lead to numerous subsequent complications, seriously affecting the quality of life of patients. To solve this problem, proteins and peptide components that promote wound healing have been discovered in animals, which can act on key pathways involved in wound healing, such as the PI3K/AKT, TGF-β, NF-κB, and JAK/STAT pathways. So far, some formulations for topical drug delivery have been developed, including hydrogels, microneedles, and electrospinning nanofibers. In addition, some high-performance dressings have been utilized, which also have great potential in wound healing. Here, research progress on the promotion of wound healing by animal-derived proteins and peptide components is summarized, and future research directions are discussed.
Collapse
Affiliation(s)
- Xiaoyu Fan
- College of Traditional Chinese Medicine, Beijing University of Chinese Medicine, Beijing 102488, China
| | - Jinhong Ye
- College of Traditional Chinese Medicine, Beijing University of Chinese Medicine, Beijing 102488, China
| | - Wanling Zhong
- College of Traditional Chinese Medicine, Beijing University of Chinese Medicine, Beijing 102488, China
| | - Huijuan Shen
- College of Traditional Chinese Medicine, Beijing University of Chinese Medicine, Beijing 102488, China
| | - Huahua Li
- College of Traditional Chinese Medicine, Beijing University of Chinese Medicine, Beijing 102488, China
| | - Zhuyuan Liu
- College of Traditional Chinese Medicine, Beijing University of Chinese Medicine, Beijing 102488, China
| | - Jie Bai
- College of Traditional Chinese Medicine, Beijing University of Chinese Medicine, Beijing 102488, China
| | - Shouying Du
- College of Traditional Chinese Medicine, Beijing University of Chinese Medicine, Beijing 102488, China
| |
|
12
|
Charoenkwan P, Chumnanpuen P, Schaduangrat N, Shoombuatong W. Stack-AVP: A Stacked Ensemble Predictor Based on Multi-view Information for Fast and Accurate Discovery of Antiviral Peptides. J Mol Biol 2024:168853. [PMID: 39510347 DOI: 10.1016/j.jmb.2024.168853] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 07/11/2024] [Revised: 10/22/2024] [Accepted: 10/31/2024] [Indexed: 11/15/2024]
Abstract
AVPs, or antiviral peptides, are short chains of amino acids capable of inhibiting viral replication, preventing viral entry, or disrupting viral membranes. They represent a promising area of research for developing new antiviral therapies due to their potential to target a broad spectrum of viruses, including those resistant to traditional antiviral drugs. However, traditional experimental methods for identifying AVPs are often costly and labour-intensive. Thus far, multiple computational methods have been introduced for the in silico identification of AVPs, but these methods still have certain shortcomings. In this study, we propose a novel stacked ensemble learning framework, termed Stack-AVP, for fast and accurate AVP identification. In Stack-AVP, we investigated heterogeneous prediction models, which were trained with 12 commonly used machine learning algorithms coupled with a wide range of feature encoding schemes. Subsequently, these prediction models were adopted to generate multi-view features providing class information and probability information. Finally, we applied our feature selection method to determine the best feature subset for the construction of the final stacked model. Comparative assessments on the independent test dataset revealed that Stack-AVP surpassed the performance of current state-of-the-art methods, with an accuracy of 0.930, MCC of 0.860, and AUC of 0.975. Furthermore, it was found that our multi-view features exhibited a crucial mechanism to improve the prediction performance of AVPs. To facilitate experimental scientists in performing high-throughput identification of AVPs, the prediction server Stack-AVP is publicly accessible at https://pmlabqsar.pythonanywhere.com/Stack-AVP.
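For readers less familiar with the metrics quoted in this abstract, the following minimal sketch shows how accuracy and the Matthews correlation coefficient (MCC) are computed from a binary confusion matrix. The counts are hypothetical and are not taken from the Stack-AVP paper; they are simply one balanced 200-sample confusion matrix that happens to give an accuracy of 0.93 and an MCC of 0.86.

```python
import math


def accuracy_and_mcc(tp, tn, fp, fn):
    """Return (accuracy, MCC) for a binary confusion matrix."""
    total = tp + tn + fp + fn
    accuracy = (tp + tn) / total
    denom = math.sqrt((tp + fp) * (tp + fn) * (tn + fp) * (tn + fn))
    # By convention MCC is reported as 0 when any marginal count is zero.
    mcc = (tp * tn - fp * fn) / denom if denom else 0.0
    return accuracy, mcc


# Hypothetical counts chosen for illustration only.
acc, mcc = accuracy_and_mcc(tp=93, tn=93, fp=7, fn=7)
print(f"accuracy={acc:.3f}, MCC={mcc:.3f}")
```

Unlike accuracy, MCC uses all four cells of the confusion matrix, which is why predictors in this list typically report both.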
Affiliation(s)
- Phasit Charoenkwan
- Modern Management and Information Technology, College of Arts, Media and Technology, Chiang Mai University, Chiang Mai 50200, Thailand
| | - Pramote Chumnanpuen
- Department of Zoology, Faculty of Science, Kasetsart University, Bangkok 10900, Thailand; Kasetsart University International College (KUIC), Kasetsart University, Bangkok 10900, Thailand
| | - Nalini Schaduangrat
- Center for Research Innovation and Biomedical Informatics, Faculty of Medical Technology, Mahidol University, Bangkok 10700, Thailand
| | - Watshara Shoombuatong
- Center for Research Innovation and Biomedical Informatics, Faculty of Medical Technology, Mahidol University, Bangkok 10700, Thailand.
| |
|
13
|
Shaon MSH, Karim T, Ali MM, Ahmed K, Bui FM, Chen L, Moni MA. A robust deep learning approach for identification of RNA 5-methyluridine sites. Sci Rep 2024; 14:25688. [PMID: 39465261 PMCID: PMC11514282 DOI: 10.1038/s41598-024-76148-9] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 05/16/2024] [Accepted: 10/10/2024] [Indexed: 10/29/2024] Open
Abstract
RNA 5-methyluridine (m5U) sites play a significant role in understanding RNA modifications, which influence numerous biological processes such as gene expression and cellular functioning. Consequently, the identification of m5U sites can play a vital role in understanding the integrity, structure, and function of RNA molecules. Therefore, this study introduces GRUpred-m5U, a novel deep learning framework based on a gated recurrent unit, applied to mature RNA and full transcript RNA datasets. We used three descriptor groups: nucleic acid composition, pseudo nucleic acid composition, and physicochemical properties, which include five feature extraction methods: ENAC, Kmer, DPCP, DPCP type 2, and PseDNC. Initially, we aggregated all the feature extraction methods and created a new merged set. Three hybrid models were developed employing deep-learning methods and evaluated through 10-fold cross-validation with seven evaluation metrics. After a comprehensive evaluation, the GRUpred-m5U model outperformed the other applied models, obtaining 98.41% and 96.70% accuracy on the two datasets, respectively. To our knowledge, the proposed model outperformed all existing state-of-the-art methods. The proposed supervised machine learning model was evaluated using unsupervised machine learning techniques such as principal component analysis (PCA), and it was observed that the proposed method provided a valid performance for identifying m5U. Considering its multi-layered construction, the GRUpred-m5U model has tremendous potential for future applications in the biological industry. The model, which consisted of neurons processing complicated input, excelled at pattern recognition and produced reliable results. Despite its greater size, the model obtained accurate results, essential in detecting m5U.
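Of the five encodings named in this abstract, Kmer is the simplest to illustrate. The sketch below assumes normalized k-mer counts over the alphabet ACGU with k = 2; the value of k, the normalization, and the example sequence are assumptions for illustration, not settings reported for GRUpred-m5U.

```python
from itertools import product


def kmer_frequencies(seq, k=2, alphabet="ACGU"):
    """Return normalized k-mer frequencies in lexicographic k-mer order."""
    kmers = ["".join(p) for p in product(alphabet, repeat=k)]
    counts = dict.fromkeys(kmers, 0)
    # Slide a window of width k along the sequence.
    for i in range(len(seq) - k + 1):
        window = seq[i:i + k]
        if window in counts:  # windows with ambiguous bases are skipped
            counts[window] += 1
    total = sum(counts.values())
    return [counts[m] / total if total else 0.0 for m in kmers]


# Made-up RNA fragment; a 2-mer encoding always has 4**2 = 16 entries.
vec = kmer_frequencies("AUGGCUA", k=2)
print(len(vec), round(sum(vec), 6))
```

Vectors produced this way have a fixed length regardless of sequence length, which makes them easy to concatenate with other descriptors into a merged feature set.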
Affiliation(s)
| | - Tasmin Karim
- Department of Computer Science and Informatics, Oakland University, Rochester, MI, 48309, USA
| | - Md Mamun Ali
- Division of Biomedical Engineering, University of Saskatchewan, 57 Campus Drive, Saskatoon, SK, S7N 5A9, Canada
- Department of Software Engineering, Daffodil Smart City (DSC), Daffodil International University, Birulia, Savar, Dhaka, 1216, Bangladesh
| | - Kawsar Ahmed
- Department of Electrical and Computer Engineering, University of Saskatchewan, 57 Campus Drive, Saskatoon, SK, S7N 5A9, Canada.
- Group of Bio-photomatiχ, Department of Information and Communication Technology, Mawlana Bhashani Science and Technology University, Santosh, 1902, Tangail, Bangladesh.
- Health Informatics Research Lab, Department of Computer Science and Engineering, Daffodil International University, Daffodil Smart City, Dhaka, 1216, Birulia, Bangladesh.
| | - Francis M Bui
- Department of Electrical and Computer Engineering, University of Saskatchewan, 57 Campus Drive, Saskatoon, SK, S7N 5A9, Canada
| | - Li Chen
- Department of Electrical and Computer Engineering, University of Saskatchewan, 57 Campus Drive, Saskatoon, SK, S7N 5A9, Canada
| | - Mohammad Ali Moni
- AI & Digital Health Technology, Artificial Intelligence & Cyber Future Institute, Charles Sturt University, Bathurst, NSW, 2795, Australia.
- AI & Digital Health Technology, Rural Health Research Institute, Charles Sturt University, Orange, NSW, 2800, Australia.
| |
|
14
|
Nissan N, Allen MC, Sabatino D, Biggar KK. Future Perspective: Harnessing the Power of Artificial Intelligence in the Generation of New Peptide Drugs. Biomolecules 2024; 14:1303. [PMID: 39456236 PMCID: PMC11505729 DOI: 10.3390/biom14101303] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 09/12/2024] [Revised: 10/10/2024] [Accepted: 10/12/2024] [Indexed: 10/28/2024] Open
Abstract
The expansive field of drug discovery is continually seeking innovative approaches to identify and develop novel peptide-based therapeutics. With the advent of artificial intelligence (AI), there has been a transformative shift in the generation of new peptide drugs. AI offers a range of computational tools and algorithms that enable researchers to accelerate the therapeutic peptide pipeline. This review explores the current landscape of AI applications in peptide drug discovery, highlighting its potential, challenges, and ethical considerations. Additionally, it presents case studies and future perspectives that demonstrate the impact of AI on the generation of new peptide drugs.
Affiliation(s)
- Nour Nissan
- Institute of Biochemistry, Departments of Biology & Chemistry, Carleton University, Ottawa, ON K1S 5B6, Canada
- NuvoBio Corporation, Ottawa, ON K1S 5B6, Canada
| | - Mitchell C. Allen
- Institute of Biochemistry, Departments of Biology & Chemistry, Carleton University, Ottawa, ON K1S 5B6, Canada
| | - David Sabatino
- Institute of Biochemistry, Departments of Biology & Chemistry, Carleton University, Ottawa, ON K1S 5B6, Canada
| | - Kyle K. Biggar
- Institute of Biochemistry, Departments of Biology & Chemistry, Carleton University, Ottawa, ON K1S 5B6, Canada
- NuvoBio Corporation, Ottawa, ON K1S 5B6, Canada
| |
|
15
|
Han Z, Shen Z, Pei J, You Q, Zhang Q, Wang L. Transformation of peptides to small molecules in medicinal chemistry: Challenges and opportunities. Acta Pharm Sin B 2024; 14:4243-4265. [PMID: 39525591 PMCID: PMC11544290 DOI: 10.1016/j.apsb.2024.06.019] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 03/07/2024] [Revised: 05/14/2024] [Accepted: 06/11/2024] [Indexed: 11/16/2024] Open
Abstract
Peptides are native binders involved in numerous physiological processes, such as cellular signaling, and serve as ready-made regulators of biochemical processes. Meanwhile, small molecules compose many drugs owing to their outstanding advantages of physicochemical properties and synthetic convenience. A novel field of research is converting peptides into small molecules, providing a convenient portable solution for drug design or peptidomic research. Endowing properties of peptides onto small molecules can evolutionarily combine the advantages of both moieties and improve the biological druggability of molecules. Herein, we present eight representative recent cases in this conversion and elaborate on the transformation process of each case. We discuss the innovative technological methods and research approaches involved, and analyze the applicability conditions of the approaches and methods in each case, guiding further modifications of peptides to small molecules. Finally, based on the aforementioned cases, we summarize a general procedure for peptide-to-small molecule modifications, listing the technological methods available for each transformation step and providing our insights on the applicable scenarios for these methods. This review aims to present the progress of peptide-to-small molecule modifications and propose our thoughts and perspectives for future research in this field.
Affiliation(s)
- Zeyu Han
- State Key Laboratory of Natural Medicines and Jiangsu Key Laboratory of Drug Design and Optimization, China Pharmaceutical University, Nanjing 210009, China
- Department of Medicinal Chemistry, School of Pharmacy, China Pharmaceutical University, Nanjing 210009, China
| | - Zekai Shen
- State Key Laboratory of Natural Medicines and Jiangsu Key Laboratory of Drug Design and Optimization, China Pharmaceutical University, Nanjing 210009, China
- Department of Medicinal Chemistry, School of Pharmacy, China Pharmaceutical University, Nanjing 210009, China
| | - Jiayue Pei
- State Key Laboratory of Natural Medicines and Jiangsu Key Laboratory of Drug Design and Optimization, China Pharmaceutical University, Nanjing 210009, China
- Department of Medicinal Chemistry, School of Pharmacy, China Pharmaceutical University, Nanjing 210009, China
| | - Qidong You
- State Key Laboratory of Natural Medicines and Jiangsu Key Laboratory of Drug Design and Optimization, China Pharmaceutical University, Nanjing 210009, China
- Department of Medicinal Chemistry, School of Pharmacy, China Pharmaceutical University, Nanjing 210009, China
| | - Qiuyue Zhang
- State Key Laboratory of Natural Medicines and Jiangsu Key Laboratory of Drug Design and Optimization, China Pharmaceutical University, Nanjing 210009, China
- Department of Medicinal Chemistry, School of Pharmacy, China Pharmaceutical University, Nanjing 210009, China
| | - Lei Wang
- State Key Laboratory of Natural Medicines and Jiangsu Key Laboratory of Drug Design and Optimization, China Pharmaceutical University, Nanjing 210009, China
- Department of Medicinal Chemistry, School of Pharmacy, China Pharmaceutical University, Nanjing 210009, China
| |
|
16
|
Ge R, Xia Y, Jiang M, Jia G, Jing X, Li Y, Cai Y. HybAVPnet: A Novel Hybrid Network Architecture for Antiviral Peptides Prediction. IEEE/ACM Trans Comput Biol Bioinform 2024; 21:1358-1365. [PMID: 38587961 DOI: 10.1109/tcbb.2024.3385635] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 04/10/2024]
Abstract
Viruses pose a great threat to human production and life, so the research and development of antiviral drugs is urgently needed. Antiviral peptides play an important role in drug design and development. Because wet-lab experimental methods are time-consuming and laborious, it is critical to use computational methods to predict antiviral peptides accurately and rapidly. However, due to limited data, accurate prediction of antiviral peptides is still challenging, and extracting effective feature representations from sequences is crucial for creating accurate models. This study introduces a novel two-step approach, named HybAVPnet, to predict antiviral peptides with a hybrid network architecture based on neural networks and traditional machine learning methods. We adopted a stacking-like structure to capture both the long-term dependencies and local evolution information to achieve a comprehensive and diverse prediction using the predicted labels and probabilities. Using an ensemble technique with the different kinds of features can reduce the variance without increasing the bias. The experimental results show that HybAVPnet can achieve better and more robust performance compared with the state-of-the-art methods, which makes it useful for the research and development of antiviral drugs. Meanwhile, it can also be extended to other peptide recognition problems because of its generalization ability.
|
17
|
Puszkarska AM, Taddese B, Revell J, Davies G, Field J, Hornigold DC, Buchanan A, Vaughan TJ, Colwell LJ. Machine learning designs new GCGR/GLP-1R dual agonists with enhanced biological potency. Nat Chem 2024; 16:1436-1444. [PMID: 38755312 PMCID: PMC11374683 DOI: 10.1038/s41557-024-01532-x] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 12/26/2021] [Accepted: 04/08/2024] [Indexed: 05/18/2024]
Abstract
Several peptide dual agonists of the human glucagon receptor (GCGR) and the glucagon-like peptide-1 receptor (GLP-1R) are in development for the treatment of type 2 diabetes, obesity and their associated complications. Candidates must have high potency at both receptors, but it is unclear whether the limited experimental data available can be used to train models that accurately predict the activity at both receptors of new peptide variants. Here we use peptide sequence data labelled with in vitro potency at human GCGR and GLP-1R to train several models, including a deep multi-task neural-network model using multiple loss optimization. Model-guided sequence optimization was used to design three groups of peptide variants, with distinct ranges of predicted dual activity. We found that three of the model-designed sequences are potent dual agonists with superior biological activity. With our designs we were able to achieve up to sevenfold potency improvement at both receptors simultaneously compared to the best dual-agonist in the training set.
Affiliation(s)
- Anna M Puszkarska
- Yusuf Hamied Department of Chemistry, University of Cambridge, Cambridge, UK
- Biologics Engineering, Oncology R&D, AstraZeneca, Cambridge, UK
| | - Bruck Taddese
- Discovery Sciences, R&D, AstraZeneca, Cambridge, UK
- Biologics Center (NBC) at the Novartis Institute for BioMedical Research (NIBR), Basel, Switzerland
| | | | - Graeme Davies
- Research and Early Development, Cardiovascular, Renal and Metabolism (CVRM), BioPharmaceuticals R&D, AstraZeneca, Cambridge, UK
| | - Joss Field
- Research and Early Development, Cardiovascular, Renal and Metabolism (CVRM), BioPharmaceuticals R&D, AstraZeneca, Cambridge, UK
| | - David C Hornigold
- Research and Early Development, Cardiovascular, Renal and Metabolism (CVRM), BioPharmaceuticals R&D, AstraZeneca, Cambridge, UK
| | - Andrew Buchanan
- Biologics Engineering, Oncology R&D, AstraZeneca, Cambridge, UK
| | - Tristan J Vaughan
- Biologics Engineering, Oncology R&D, AstraZeneca, Cambridge, UK
- Immunocore Ltd., Abingdon, UK
| | - Lucy J Colwell
- Yusuf Hamied Department of Chemistry, University of Cambridge, Cambridge, UK.
- Google DeepMind, Cambridge, MA, USA.
| |
|
18
|
Rathore AS, Choudhury S, Arora A, Tijare P, Raghava GPS. ToxinPred 3.0: An improved method for predicting the toxicity of peptides. Comput Biol Med 2024; 179:108926. [PMID: 39038391 DOI: 10.1016/j.compbiomed.2024.108926] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 08/24/2023] [Revised: 05/17/2024] [Accepted: 07/17/2024] [Indexed: 07/24/2024]
Abstract
Toxicity emerges as a prominent challenge in the design of therapeutic peptides, causing the failure of numerous peptides during clinical trials. In 2013, our group developed ToxinPred, a computational method that has been extensively adopted by the scientific community for predicting peptide toxicity. In this paper, we propose a refined variant of ToxinPred that showcases improved reliability and accuracy in predicting peptide toxicity. Initially, we utilized a similarity/alignment-based approach employing BLAST to predict toxic peptides, which yielded satisfactory accuracy; however, the method suffered from inadequate coverage. Subsequently, we employed a motif-based approach using MERCI software to uncover specific patterns or motifs that are exclusively observed in toxic peptides. The search for these motifs in peptides allowed us to predict toxic peptides with a high level of specificity but poor sensitivity. To overcome the coverage limitations, we developed alignment-free methods using machine/deep learning techniques to balance sensitivity and specificity of prediction. A deep learning model (ANN - LSTM with fixed sequence length) developed using one-hot encoding achieved a maximum AUROC of 0.93 with MCC of 0.71 on an independent dataset. A machine learning model (extra tree) developed using compositional features of peptides achieved a maximum AUROC of 0.95 with MCC of 0.78. We also developed large language models and achieved a maximum AUC of 0.93 using ESM2-t33. Finally, we developed hybrid or ensemble methods combining two or more methods to enhance performance. Our specific hybrid method, which combines a motif-based approach with a machine learning-based model, achieved a maximum AUROC of 0.98 with MCC 0.81 on an independent dataset. In this study, all models were trained and tested on 80% of the data using five-fold cross-validation and evaluated on the remaining 20% of the data, called the independent dataset.
The evaluation of all methods on an independent dataset revealed that the method proposed in this study exhibited better performance than existing methods. To cater to the needs of the scientific community, we have developed standalone software, a pip package, and a web-based server, ToxinPred3 (https://github.com/raghavagps/toxinpred3 and https://webs.iiitd.edu.in/raghava/toxinpred3/).
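The evaluation protocol described in this abstract (five-fold cross-validation on 80% of the data, with the remaining 20% held out as an independent dataset) can be sketched with plain index bookkeeping. This is an illustrative sketch only: the function name, the random seed, and the unstratified shuffling are assumptions and do not reproduce the ToxinPred3 code.

```python
import random


def split_and_folds(n_samples, test_fraction=0.2, n_folds=5, seed=0):
    """Hold out an independent set, then build CV folds on the rest."""
    indices = list(range(n_samples))
    random.Random(seed).shuffle(indices)
    n_test = int(round(n_samples * test_fraction))
    independent = indices[:n_test]   # never used during model selection
    development = indices[n_test:]   # used for cross-validation
    folds = []
    for f in range(n_folds):
        val = development[f::n_folds]  # every n_folds-th sample, offset f
        val_set = set(val)
        train = [i for i in development if i not in val_set]
        folds.append((train, val))
    return independent, folds


independent, folds = split_and_folds(100)
for train, val in folds:
    print(len(train), len(val))
```

Keeping the independent indices out of every fold is the point of the design: hyperparameters are tuned on the folds, and the held-out set is touched only for the final comparison.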
Affiliation(s)
- Anand Singh Rathore
- Department of Computational Biology, Indraprastha Institute of Information Technology, Okhla Phase 3, New Delhi, 110020, India.
| | - Shubham Choudhury
- Department of Computational Biology, Indraprastha Institute of Information Technology, Okhla Phase 3, New Delhi, 110020, India.
| | - Akanksha Arora
- Department of Computational Biology, Indraprastha Institute of Information Technology, Okhla Phase 3, New Delhi, 110020, India.
| | - Purva Tijare
- Department of Computational Biology, Indraprastha Institute of Information Technology, Okhla Phase 3, New Delhi, 110020, India.
| | - Gajendra P S Raghava
- Department of Computational Biology, Indraprastha Institute of Information Technology, Okhla Phase 3, New Delhi, 110020, India.
| |
|
19
|
Pham NT, Zhang Y, Rakkiyappan R, Manavalan B. HOTGpred: Enhancing human O-linked threonine glycosylation prediction using integrated pretrained protein language model-based features and multi-stage feature selection approach. Comput Biol Med 2024; 179:108859. [PMID: 39029431 DOI: 10.1016/j.compbiomed.2024.108859] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 03/20/2024] [Revised: 06/19/2024] [Accepted: 07/06/2024] [Indexed: 07/21/2024]
Abstract
O-linked glycosylation is a complex post-translational modification (PTM) in human proteins that plays a critical role in regulating various cellular metabolic and signaling pathways. In contrast to N-linked glycosylation, O-linked glycosylation lacks specific sequence features and maintains an unstable core structure. Identifying O-linked threonine glycosylation sites (OTGs) remains challenging, requiring extensive experimental tests. While bioinformatics tools have emerged for predicting OTGs, their reliance on limited conventional features and absence of well-defined feature selection strategies limit their effectiveness. To address these limitations, we introduced HOTGpred (Human O-linked Threonine Glycosylation predictor), employing a multi-stage feature selection process to identify the optimal feature set for accurately identifying OTGs. Initially, we assessed 25 different feature sets derived from various pretrained protein language model (PLM)-based embeddings and conventional feature descriptors using nine classifiers. Subsequently, we integrated the top five embeddings linearly and determined the most effective scoring function for ranking hybrid features, identifying the optimal feature set through a process of sequential forward search. Among the classifiers, the extreme gradient boosting (XGBT)-based model, using the optimal feature set (HOTGpred), achieved 92.03 % accuracy on the training dataset and 88.25 % on the balanced independent dataset. Notably, HOTGpred significantly outperformed the current state-of-the-art methods on both the balanced and imbalanced independent datasets, demonstrating its superior prediction capabilities. Additionally, SHapley Additive exPlanations (SHAP) and ablation analyses were conducted to identify the features contributing most significantly to HOTGpred. 
Finally, we developed an easy-to-navigate web server, accessible at https://balalab-skku.org/HOTGpred/, to support glycobiologists in their research on glycosylation structure and function.
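The sequential forward search mentioned in this abstract can be sketched as follows, assuming the features have already been ranked by some scoring function and that candidate subsets are the growing prefixes of that ranking. The feature names, gain values, and `toy_score` function are hypothetical stand-ins; in practice the score would be a cross-validated metric of the classifier, and the details of HOTGpred's procedure may differ.

```python
def sequential_forward_search(ranked_features, score_fn):
    """Grow the subset along the ranking and keep the best-scoring prefix."""
    best_subset, best_score = [], float("-inf")
    current = []
    for feature in ranked_features:
        current.append(feature)
        score = score_fn(current)
        if score > best_score:
            best_subset, best_score = list(current), score
    return best_subset, best_score


# Hypothetical per-feature gains with a small penalty per added feature.
GAINS = {"f1": 0.30, "f2": 0.20, "f3": 0.01, "f4": 0.00}


def toy_score(subset):
    return sum(GAINS[f] for f in subset) - 0.02 * len(subset)


subset, score = sequential_forward_search(["f1", "f2", "f3", "f4"], toy_score)
print(subset, round(score, 2))
```

Because only prefixes of the ranking are evaluated, the search costs one model evaluation per feature rather than one per possible subset.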
Affiliation(s)
- Nhat Truong Pham
- Department of Integrative Biotechnology, College of Biotechnology and Bioengineering, Sungkyunkwan University, Suwon, 16419, Gyeonggi-do, Republic of Korea
| | - Ying Zhang
- Beidahuang Industry Group General Hospital, Harbin, China
| | - Rajan Rakkiyappan
- Department of Mathematics, Bharathiar University, Coimbatore, 641046, Tamil Nadu, India.
| | - Balachandran Manavalan
- Department of Integrative Biotechnology, College of Biotechnology and Bioengineering, Sungkyunkwan University, Suwon, 16419, Gyeonggi-do, Republic of Korea.
| |
|
20
|
Zhang X, Wu Y, Lin J, Lu S, Lu X, Cheng A, Chen H, Zhang W, Luan X. Insights into therapeutic peptides in the cancer-immunity cycle: Update and challenges. Acta Pharm Sin B 2024; 14:3818-3833. [PMID: 39309492 PMCID: PMC11413705 DOI: 10.1016/j.apsb.2024.05.013] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 12/20/2023] [Revised: 03/05/2024] [Accepted: 04/12/2024] [Indexed: 09/25/2024] Open
Abstract
Immunotherapies hold immense potential for achieving durable potency and long-term survival opportunities in cancer therapy. As vital biological mediators, peptides with high tissue penetration and superior selectivity offer significant promise for enhancing cancer immunotherapies (CITs). However, physicochemical peptide features such as conformation and stability pose challenges to their on-target efficacy. This review provides a comprehensive overview of recent advancements in therapeutic peptides targeting key steps of the cancer-immunity cycle (CIC), including tumor antigen presentation, immune cell regulation, and immune checkpoint signaling. Particular attention is given to the opportunities and challenges associated with these peptides in boosting CIC within the context of clinical progress. Furthermore, possible future developments in this field are also discussed to provide insights into emerging CITs with robust efficacy and safety profiles.
Affiliation(s)
- Xiaokun Zhang
- Shanghai Frontiers Science Center for Chinese Medicine Chemical Biology, Institute of Interdisciplinary Integrative Medicine Research and Shuguang Hospital, Shanghai University of Traditional Chinese Medicine, Shanghai 201203, China
| | - Ye Wu
- Shanghai Frontiers Science Center for Chinese Medicine Chemical Biology, Institute of Interdisciplinary Integrative Medicine Research and Shuguang Hospital, Shanghai University of Traditional Chinese Medicine, Shanghai 201203, China
| | - Jiayi Lin
- Shanghai Frontiers Science Center for Chinese Medicine Chemical Biology, Institute of Interdisciplinary Integrative Medicine Research and Shuguang Hospital, Shanghai University of Traditional Chinese Medicine, Shanghai 201203, China
| | - Shengxin Lu
- Shanghai Frontiers Science Center for Chinese Medicine Chemical Biology, Institute of Interdisciplinary Integrative Medicine Research and Shuguang Hospital, Shanghai University of Traditional Chinese Medicine, Shanghai 201203, China
| | - Xinchen Lu
- Shanghai Frontiers Science Center for Chinese Medicine Chemical Biology, Institute of Interdisciplinary Integrative Medicine Research and Shuguang Hospital, Shanghai University of Traditional Chinese Medicine, Shanghai 201203, China
- Department of Pharmacology, School of Pharmacy, Fudan University, Shanghai 201203, China
| | - Aoyu Cheng
- Shanghai Frontiers Science Center for Chinese Medicine Chemical Biology, Institute of Interdisciplinary Integrative Medicine Research and Shuguang Hospital, Shanghai University of Traditional Chinese Medicine, Shanghai 201203, China
| | - Hongzhuan Chen
- Shanghai Frontiers Science Center for Chinese Medicine Chemical Biology, Institute of Interdisciplinary Integrative Medicine Research and Shuguang Hospital, Shanghai University of Traditional Chinese Medicine, Shanghai 201203, China
| | - Weidong Zhang
- Shanghai Frontiers Science Center for Chinese Medicine Chemical Biology, Institute of Interdisciplinary Integrative Medicine Research and Shuguang Hospital, Shanghai University of Traditional Chinese Medicine, Shanghai 201203, China
- Institute of Medicinal Plant Development, Chinese Academy of Medical Sciences & Peking Union Medical College, Beijing 100193, China
- School of Pharmacy, Second Military Medical University, Shanghai 200433, China
| | - Xin Luan
- Shanghai Frontiers Science Center for Chinese Medicine Chemical Biology, Institute of Interdisciplinary Integrative Medicine Research and Shuguang Hospital, Shanghai University of Traditional Chinese Medicine, Shanghai 201203, China
| |
|
21
|
de Llano García D, Marrero-Ponce Y, Agüero-Chapin G, Ferri FJ, Antunes A, Martinez-Rios F, Rodríguez H. Innovative Alignment-Based Method for Antiviral Peptide Prediction. Antibiotics (Basel) 2024; 13:768. [PMID: 39200068 PMCID: PMC11350826 DOI: 10.3390/antibiotics13080768] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 07/14/2024] [Revised: 08/08/2024] [Accepted: 08/09/2024] [Indexed: 09/01/2024] Open
Abstract
Antiviral peptides (AVPs) represent a promising strategy for addressing the global challenges of viral infections and their growing resistance to traditional drugs. Lab-based AVP discovery methods are resource-intensive, highlighting the need for efficient computational alternatives. In this study, we developed five non-trained but supervised multi-query similarity search models (MQSSMs) integrated into the StarPep toolbox. Rigorous testing and validation across diverse AVP datasets confirmed the models' robustness and reliability. The top-performing model, M13+, demonstrated impressive results, with an accuracy of 0.969 and a Matthews correlation coefficient of 0.71. To assess their competitiveness, the top five models were benchmarked against 14 publicly available machine-learning and deep-learning AVP predictors. The MQSSMs outperformed these predictors, highlighting their efficiency in terms of resource demand and public accessibility. Another significant achievement of this study is the creation of the most comprehensive dataset of antiviral sequences to date. In general, these results suggest that MQSSMs are promising tools for developing good alignment-based models that can be successfully applied in the screening of large datasets for new AVP discovery.
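The general idea of a multi-query similarity search can be sketched as below: each target sequence is compared against a set of known active queries and is flagged when its best similarity reaches a threshold. This is a rough illustration under stated assumptions, not the StarPep implementation: `difflib.SequenceMatcher` is used here as a simple stand-in for a proper sequence alignment score, and the sequences and the 0.6 threshold are made up.

```python
from difflib import SequenceMatcher


def multi_query_search(queries, targets, threshold=0.6):
    """Return {target: best similarity} for targets matching any query."""
    hits = {}
    for target in targets:
        # Best similarity of this target against the whole query set.
        best = max(
            SequenceMatcher(None, query, target, autojunk=False).ratio()
            for query in queries
        )
        if best >= threshold:
            hits[target] = best
    return hits


# Made-up strings standing in for known AVPs and screening candidates.
queries = ["GLFDIVKKVV", "KWKLFKKIEK"]
targets = ["GLFDIVKKVL", "PPPPPPPPPP"]
hits = multi_query_search(queries, targets)
print(hits)
```

No model is fitted in this scheme; the labelled query set plays the role of the training data, which is consistent with the abstract's description of the models as non-trained but supervised.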
Affiliation(s)
- Daniela de Llano García
- School of Chemical Sciences and Engineering, Yachay Tech University, Hda. San José s/n y Proyecto Yachay, Urcuquí 100119, Imbabura, Ecuador
| | - Yovani Marrero-Ponce
- Universidad San Francisco de Quito (USFQ), Grupo de Medicina Molecular y Traslacional (MeM&T), Colegio de Ciencias de la Salud (COCSA), Escuela de Medicina, Edificio de Especialidades Médicas, Instituto de Simulación Computacional (ISC-USFQ), Diego de Robles y vía Interoceánica, Quito 170157, Pichincha, Ecuador
- Facultad de Ingeniería, Universidad Panamericana, Augusto Rodin 498, Benito Juárez 03920, Ciudad de México, Mexico
- Computer Science Department, Universitat de València, 46100 Valencia, Burjassot, Spain
| | - Guillermin Agüero-Chapin
- CIIMAR—Centro Interdisciplinar de Investigação Marinha e Ambiental, Universidade do Porto, Terminal de Cruzeiros do Porto de Leixões, Av. General Norton de Matos, s/n, 4450-208 Porto, Portugal;
- Departamento de Biologia, Faculdade de Ciências, Universidade do Porto, Rua do Campo Alegre, 4169-007 Porto, Portugal
| | - Francesc J. Ferri
- Computer Science Department, Universitat de València, 46100 Valencia, Burjassot, Spain;
| | - Agostinho Antunes
- CIIMAR—Centro Interdisciplinar de Investigação Marinha e Ambiental, Universidade do Porto, Terminal de Cruzeiros do Porto de Leixões, Av. General Norton de Matos, s/n, 4450-208 Porto, Portugal;
- Departamento de Biologia, Faculdade de Ciências, Universidade do Porto, Rua do Campo Alegre, 4169-007 Porto, Portugal
| | - Felix Martinez-Rios
- Facultad de Ingeniería, Universidad Panamericana, Augusto Rodin 498, Benito Juárez 03920, Ciudad de México, Mexico;
| | - Hortensia Rodríguez
- School of Chemical Sciences and Engineering, Yachay Tech University, Hda. San José s/n y Proyecto Yachay, Urcuquí 100119, Imbabura, Ecuador; (D.d.L.G.); (H.R.)
|
22
|
Garai S, Thomas J, Dey P, Das D. LGBM-ACp: an ensemble model for anticancer peptide prediction and in silico screening with potential drug targets. Mol Divers 2024; 28:1965-1981. [PMID: 36637711 DOI: 10.1007/s11030-023-10602-0] [Citation(s) in RCA: 2] [Impact Index Per Article: 2.0] [Received: 11/07/2022] [Accepted: 01/06/2023] [Indexed: 01/14/2023]
Abstract
Conventional cancer therapies are highly expensive and have serious complications. An alternative approach now emphasizes the development of small, biologically active peptides without acute toxicity. Experimental screening to find curative anticancer peptides (ACPs) often gives rise to multiple obstacles and is time-consuming. Consequently, an effective computational technique to identify promising ACP candidates prior to preclinical research is in high demand. This study proposed a machine-learning framework that used the light gradient-boosting machine as a classifier and two compositional and two binary profile features as input. The ensemble model displayed an accuracy, MCC, and AUROC of 97.52%, 0.91, and 0.98, respectively, which outclassed most of the existing sequence-based computational tools. A distinct dataset of non-mutagenic, non-toxic, and cytochrome P-450 non-inhibitory peptides was used to validate the hybrid model. The most relevant ACP in the alternative dataset was compared with two standard ACPs, beta defensin 2 and cecropin-A. Molecular docking of the predicted peptide revealed that it has a strong binding affinity with twenty-five anticancer drug targets, most notably phosphoenolpyruvate carboxykinase (-7.2 kcal/mol). Additionally, molecular dynamics simulation and principal component analysis supported the stability of the peptide-receptor complex. Overall, the present findings take a step forward in rational drug design through rapid identification and screening of therapeutic peptides.
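The abstract does not name the two compositional and two binary profile features it uses. As a rough illustration of what encodings of these two kinds commonly look like, the sketch below computes an amino acid composition vector and a one-hot profile of the N-terminal residues. The function names, the window length of 5, and the example sequence are assumptions made for this sketch and should not be read as the authors' feature set.

```python
from typing import List

AMINO_ACIDS = "ACDEFGHIKLMNPQRSTVWY"  # the 20 standard residues


def amino_acid_composition(seq: str) -> List[float]:
    """Compositional feature: fraction of each standard residue in the peptide."""
    seq = seq.upper()
    if not seq:
        raise ValueError("empty sequence")
    return [seq.count(aa) / len(seq) for aa in AMINO_ACIDS]


def n_terminal_binary_profile(seq: str, window: int = 5) -> List[int]:
    """Binary profile feature: one-hot encode the first `window` residues.

    Peptides shorter than `window` are padded with all-zero positions.
    """
    seq = seq.upper()[:window]
    profile = []
    for position in range(window):
        for aa in AMINO_ACIDS:
            hit = position < len(seq) and seq[position] == aa
            profile.append(1 if hit else 0)
    return profile


# Hypothetical peptide used only to exercise the functions.
peptide = "KWKLFKKI"
aac = amino_acid_composition(peptide)       # 20 values
bpf = n_terminal_binary_profile(peptide)    # 5 positions x 20 residues = 100 values
features = aac + bpf                        # concatenated input for a classifier
```

A fixed-length vector like `features` is the kind of input a gradient-boosting classifier such as LightGBM can consume directly.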
Affiliation(s)
- Swarnava Garai
- Department of Bioengineering, NIT Agartala, Tripura, 799046, India
| | - Juanit Thomas
- Department of Bioengineering, NIT Agartala, Tripura, 799046, India
| | - Palash Dey
- Civil Engineering Department, The ICFAI University, Tripura, 799210, India
| | - Deeplina Das
- Department of Bioengineering, NIT Agartala, Tripura, 799046, India.
|
23
|
Balaji PD, Selvam S, Sohn H, Madhavan T. MLASM: Machine learning based prediction of anticancer small molecules. Mol Divers 2024; 28:2153-2161. [PMID: 38554168 DOI: 10.1007/s11030-024-10823-x] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 01/03/2024] [Accepted: 02/10/2024] [Indexed: 04/01/2024]
Abstract
Cancer is the second leading cause of death globally, so the development of effective anticancer treatments is crucial in the field of medicine. Anticancer peptides (ACPs) have shown promising therapeutic potential in cancer treatment compared to traditional methods. However, the process of identifying ACPs through experimental means is often time-intensive and expensive. To overcome this issue, we employed a machine learning-based approach for the first time to develop an anticancer model using small molecules. Anticancer small molecules (ACSMs) are compounds that have been developed to target and inhibit cancer cells. In this study, we used 10,000 compounds to develop machine learning models using five algorithms: Random Forest (RF), Light gradient boosting machine (LightGBM), K-nearest neighbors (KNN), Decision tree (DT), and Extreme Gradient Boosting (XGB). The developed models were evaluated using the test set, and the top three models were identified (RF, LightGBM, and XGB). Furthermore, to validate the predictive performance of our models, we performed external validation using FDA-approved anticancer compounds/drugs. Following this analysis, we found that our LightGBM model correctly predicted 9 of 10 compounds as active, whereas RF and XGB showed some limitations, predicting 8 and 7 compounds as active, respectively. These results demonstrate that, compared to RF and XGB, the LightGBM model showcases robust prediction capabilities, achieving a superior accuracy of 79% with an AUC of 0.88. These findings provide promising insights into the potential of our approach for predicting anticancer small molecules, highlighting the role of machine learning in advancing cancer treatment research.
Affiliation(s)
- Priya Dharshini Balaji
- Computational Biology Laboratory, Department of Genetic Engineering, School of Bio-Engineering, SRM Institute of Science and Technology, Kattankulathur, Chengalpattu, Tamil Nadu, 603203, India
| | - Subathra Selvam
- Computational Biology Laboratory, Department of Genetic Engineering, School of Bio-Engineering, SRM Institute of Science and Technology, Kattankulathur, Chengalpattu, Tamil Nadu, 603203, India
| | - Honglae Sohn
- Department of Chemistry, Department of Carbon Materials, Chosun University, Gwangju, South Korea
| | - Thirumurthy Madhavan
- Computational Biology Laboratory, Department of Genetic Engineering, School of Bio-Engineering, SRM Institute of Science and Technology, Kattankulathur, Chengalpattu, Tamil Nadu, 603203, India.
|
24
|
Arif M, Musleh S, Fida H, Alam T. PLMACPred prediction of anticancer peptides based on protein language model and wavelet denoising transformation. Sci Rep 2024; 14:16992. [PMID: 39043738 PMCID: PMC11266708 DOI: 10.1038/s41598-024-67433-8] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 03/31/2024] [Accepted: 07/11/2024] [Indexed: 07/25/2024] Open
Abstract
Anticancer peptides (ACPs) play a promising role in the discovery of anticancer drugs. Research on ACPs as therapeutic agents is growing because of their minimal side effects. However, identifying novel ACPs using wet-lab experiments is generally time-consuming, labor-intensive, and expensive. Leveraging computational methods for fast and accurate prediction of ACPs would benefit the drug discovery process. Herein, a machine learning-based predictor, called PLMACPred, is developed for identifying ACPs from the peptide sequence only. PLMACPred adopted a set of encoding schemes representing evolutionary properties, composition properties, and protein language model (PLM) embeddings, i.e., evolutionary scale modeling (ESM-2)- and ProtT5-based embeddings, to encode peptides. Then, two-dimensional (2D) wavelet denoising (WD) was employed to remove noise from the extracted features. Finally, an ensemble-based cascade deep forest (CDF) model was developed to identify ACPs. The PLMACPred model attained superior performance on all three benchmark datasets, namely ACPmain, ACPAlter, and ACP740, over tenfold cross-validation and an independent dataset. PLMACPred outperformed the existing models and improved the prediction accuracy by 18.53%, 2.4%, and 7.59% on the ACPmain, ACPAlter, and ACP740 datasets, respectively. We showed that embeddings from ProtT5 and ESM-2 captured better contextual information from the entire sequence than the other encoding schemes for ACP prediction. For the explainability of the proposed model, the SHAP (SHapley Additive exPlanations) method was used to analyze the feature effects on ACP prediction. A list of novel sequence motifs was proposed from the ACP sequences using the MEME Suite. We believe PLMACPred will help accelerate the discovery of novel ACPs as well as other activities of microbial peptides.
Affiliation(s)
- Muhammad Arif
- College of Science and Engineering, Hamad Bin Khalifa University, Doha, Qatar
| | - Saleh Musleh
- College of Science and Engineering, Hamad Bin Khalifa University, Doha, Qatar
| | - Huma Fida
- Department of Microbiology, Abdul Wali Khan University, Mardan, KPK, Pakistan
| | - Tanvir Alam
- College of Science and Engineering, Hamad Bin Khalifa University, Doha, Qatar.
|
25
|
Jain S, Gupta S, Patiyal S, Raghava GPS. THPdb2: compilation of FDA approved therapeutic peptides and proteins. Drug Discov Today 2024; 29:104047. [PMID: 38830503 DOI: 10.1016/j.drudis.2024.104047] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 02/01/2024] [Revised: 04/30/2024] [Accepted: 05/29/2024] [Indexed: 06/05/2024]
Abstract
During the past 20 years, there has been a significant increase in the number of protein-based drugs approved by the US Food and Drug Administration (FDA). This paper presents THPdb2, an updated version of the THPdb database, which holds information about all types of protein-based drugs, including peptides, antibodies, and biosimilar proteins. THPdb2 contains a total of 6,385 entries, providing comprehensive information about 894 FDA-approved therapeutic proteins, including 354 monoclonal antibodies and 85 peptides or polypeptides. Each entry includes the name of the therapeutic molecule, the amino acid sequence, physical and chemical properties, and the route of drug administration. The therapeutic molecules included in the database target a wide range of biological molecules, such as receptors, factors, and proteins, and have been approved for the treatment of various diseases, including cancers, infectious diseases, and immune disorders.
Affiliation(s)
- Shipra Jain
- Department of Computational Biology, Indraprastha Institute of Information Technology, Okhla Phase 3, New Delhi 110020, India
| | - Srijanee Gupta
- Department of Computational Biology, Indraprastha Institute of Information Technology, Okhla Phase 3, New Delhi 110020, India
| | - Sumeet Patiyal
- Cancer and Data Science Laboratory (CDSL), National Cancer Institute, National Institutes of Health, Bethesda, MD 20892, USA
| | - Gajendra P S Raghava
- Department of Computational Biology, Indraprastha Institute of Information Technology, Okhla Phase 3, New Delhi 110020, India.
|
26
|
Basith S, Pham NT, Manavalan B, Lee G. SEP-AlgPro: An efficient allergen prediction tool utilizing traditional machine learning and deep learning techniques with protein language model features. Int J Biol Macromol 2024; 273:133085. [PMID: 38871100 DOI: 10.1016/j.ijbiomac.2024.133085] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 11/24/2023] [Revised: 05/20/2024] [Accepted: 06/09/2024] [Indexed: 06/15/2024]
Abstract
Allergy is a hypersensitive condition in which individuals develop objective symptoms when exposed to harmless substances at a dose that would cause no harm to a "normal" person. Most current computational methods for allergen identification rely on homology or on conventional machine learning with a limited set of feature descriptors or validation on specific datasets, making them inefficient and inaccurate. Here, we propose SEP-AlgPro for the accurate identification of allergen proteins from sequence information. We analyzed 10 conventional protein-based features and 14 different features derived from protein language models to gauge their effectiveness in differentiating allergens from non-allergens using 15 different classifiers. However, the final optimized model employs the top 10 feature descriptors with the top seven machine learning classifiers. Results show that the features derived from protein language models exhibit superior discriminative capabilities compared to traditional feature sets. This enabled us to select the most discriminatory baseline models, whose predicted outputs were aggregated and used as input to a deep neural network for the final allergen prediction. Extensive case studies showed that SEP-AlgPro outperforms state-of-the-art predictors in accurately identifying allergens. A user-friendly web server was developed and made freely available at https://balalab-skku.org/SEP-AlgPro/, making it a powerful tool for identifying potential allergens.
Affiliation(s)
- Shaherin Basith
- Department of Physiology, Ajou University School of Medicine, Suwon 16499, Republic of Korea.
| | - Nhat Truong Pham
- Department of Integrative Biotechnology, College of Biotechnology and Bioengineering, Sungkyunkwan University, Suwon 16419, Republic of Korea
| | - Balachandran Manavalan
- Department of Integrative Biotechnology, College of Biotechnology and Bioengineering, Sungkyunkwan University, Suwon 16419, Republic of Korea.
| | - Gwang Lee
- Department of Physiology, Ajou University School of Medicine, Suwon 16499, Republic of Korea; Department of Molecular Science and Technology, Ajou University, Suwon 16499, Republic of Korea.
|
27
|
Pham NT, Terrance AT, Jeon YJ, Rakkiyappan R, Manavalan B. ac4C-AFL: A high-precision identification of human mRNA N4-acetylcytidine sites based on adaptive feature representation learning. MOLECULAR THERAPY. NUCLEIC ACIDS 2024; 35:102192. [PMID: 38779332 PMCID: PMC11108997 DOI: 10.1016/j.omtn.2024.102192] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 09/07/2023] [Accepted: 04/18/2024] [Indexed: 05/25/2024]
Abstract
RNA N4-acetylcytidine (ac4C) is a highly conserved RNA modification that plays a crucial role in controlling mRNA stability, processing, and translation. Consequently, accurate identification of ac4C sites across the genome is critical for understanding gene expression regulation mechanisms. In this study, we have developed ac4C-AFL, a bioinformatics tool that precisely identifies ac4C sites from primary RNA sequences. In ac4C-AFL, we identified the optimal sequence length for model building and implemented an adaptive feature representation strategy that is capable of extracting the most representative features from RNA. To identify the most relevant features, we proposed a novel ensemble feature importance scoring strategy to rank features effectively. We then used this ranking to conduct a sequential forward search, which individually determines the optimal feature set for each of the 16 sequence-derived feature descriptors. Utilizing these optimal feature descriptors, we constructed 176 baseline models using 11 popular classifiers. The most efficient baseline models were identified using a two-step feature selection approach, and their predicted scores were integrated and trained with an appropriate classifier to develop the final prediction model. Our rigorous cross-validations and independent tests demonstrate that ac4C-AFL surpasses contemporary tools in predicting ac4C sites. Moreover, we have developed a publicly accessible web server at https://balalab-skku.org/ac4C-AFL/.
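The abstract describes ranking features and then running a sequential forward search over that ranking. One common way to do this, shown here as a minimal sketch and not necessarily the authors' exact procedure, is to score nested prefixes of the ranked list and keep the best-scoring prefix. The feature names, gains, and the toy scoring function are hypothetical; in practice the score would be something like a cross-validated metric.

```python
from typing import Callable, List, Sequence, Tuple


def sequential_forward_search(
    ranked_features: Sequence[str],
    score_fn: Callable[[List[str]], float],
) -> Tuple[List[str], float]:
    """Add features in ranked order and return the best-scoring prefix."""
    best_subset: List[str] = []
    best_score = float("-inf")
    current: List[str] = []
    for feature in ranked_features:
        current.append(feature)
        score = score_fn(current)
        if score > best_score:  # keep the first prefix that reaches the maximum
            best_score = score
            best_subset = list(current)
    return best_subset, best_score


# Hypothetical per-feature gains with a small penalty per feature used.
GAINS = {"f1": 0.30, "f2": 0.20, "f3": 0.01, "f4": 0.00}


def toy_score(subset: List[str]) -> float:
    """Stand-in for a cross-validated score (assumption for illustration)."""
    return sum(GAINS[f] for f in subset) - 0.02 * len(subset)


subset, score = sequential_forward_search(["f1", "f2", "f3", "f4"], toy_score)
print(subset, round(score, 2))
```

Scoring only nested prefixes keeps the number of model evaluations linear in the number of features, at the cost of never revisiting a feature that ranked low.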
Affiliation(s)
- Nhat Truong Pham
- Department of Integrative Biotechnology, College of Biotechnology and Bioengineering, Sungkyunkwan University, Suwon, Gyeonggi-do 16419, Republic of Korea
| | - Annie Terrina Terrance
- Department of Integrative Biotechnology, College of Biotechnology and Bioengineering, Sungkyunkwan University, Suwon, Gyeonggi-do 16419, Republic of Korea
| | - Young-Jun Jeon
- Department of Integrative Biotechnology, College of Biotechnology and Bioengineering, Sungkyunkwan University, Suwon, Gyeonggi-do 16419, Republic of Korea
| | - Rajan Rakkiyappan
- Department of Mathematics, Bharathiar University, Coimbatore, Tamil Nadu 641046, India
| | - Balachandran Manavalan
- Department of Integrative Biotechnology, College of Biotechnology and Bioengineering, Sungkyunkwan University, Suwon, Gyeonggi-do 16419, Republic of Korea
|
28
|
Coelho LP, Santos-Júnior CD, de la Fuente-Nunez C. Challenges in computational discovery of bioactive peptides in 'omics data. Proteomics 2024; 24:e2300105. [PMID: 38458994 PMCID: PMC11537280 DOI: 10.1002/pmic.202300105] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 10/13/2023] [Revised: 02/06/2024] [Accepted: 02/06/2024] [Indexed: 03/10/2024]
Abstract
Peptides have a plethora of activities in biological systems that can potentially be exploited biotechnologically. Several peptides are used clinically, as well as in industry and agriculture. The increase in available 'omics data has recently provided a large opportunity for mining novel enzymes, biosynthetic gene clusters, and molecules. While these data primarily consist of DNA sequences, other types of data provide important complementary information. Due to their size, the approaches proven successful at discovering novel proteins of canonical size cannot be naïvely applied to the discovery of peptides. Peptides can be encoded directly in the genome as short open reading frames (smORFs), or they can be derived from larger proteins by proteolysis. Both of these peptide classes pose challenges as simple methods for their prediction result in large numbers of false positives. Similarly, functional annotation of larger proteins, traditionally based on sequence similarity to infer orthology and then transferring functions between characterized proteins and uncharacterized ones, cannot be applied for short sequences. The use of these techniques is much more limited and alternative approaches based on machine learning are used instead. Here, we review the limitations of traditional methods as well as the alternative methods that have recently been developed for discovering novel bioactive peptides with a focus on prokaryotic genomes and metagenomes.
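To make the false-positive problem with simple smORF prediction concrete, the following is a naive sketch of a forward-strand scanner that reports every ATG-to-stop stretch within a codon-length window. It is an illustration under stated assumptions, not a method from the review: the function name, the length thresholds, and the example sequence are hypothetical, and real pipelines also handle the reverse strand, alternative start codons, and statistical filtering.

```python
from typing import List, Tuple

STOP_CODONS = {"TAA", "TAG", "TGA"}


def find_smorfs(
    dna: str, min_codons: int = 10, max_codons: int = 100
) -> List[Tuple[int, int, int]]:
    """Return (start, end, frame) for forward-strand ORFs in the length window.

    Length is counted in codons from the ATG up to, but not including, the stop.
    `end` is the index just past the stop codon.
    """
    dna = dna.upper()
    hits = []
    for frame in range(3):
        start = None
        for i in range(frame, len(dna) - 2, 3):
            codon = dna[i:i + 3]
            if start is None and codon == "ATG":
                start = i  # first in-frame start codon opens a candidate ORF
            elif start is not None and codon in STOP_CODONS:
                n_codons = (i - start) // 3
                if min_codons <= n_codons <= max_codons:
                    hits.append((start, i + 3, frame))
                start = None  # look for the next ORF in this frame
    return hits


# Hypothetical sequence: 2 nt of padding, ATG, 11 alanine codons, a stop, 2 nt.
example = "CC" + "ATG" + "GCT" * 11 + "TAA" + "GG"
smorfs = find_smorfs(example)
print(smorfs)
```

Because short in-frame ATG-to-stop stretches are easy to satisfy by chance, a scanner like this would be expected to flag many candidates on a real genome, which is the kind of over-prediction the abstract refers to.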
Affiliation(s)
- Luis Pedro Coelho
- Centre for Microbiome Research, School of Biomedical Sciences, Queensland University of Technology, Woolloongabba, Queensland, Australia
- Institute of Science and Technology for Brain-Inspired Intelligence – ISTBI, Fudan University, Shanghai, China
| | - Célio Dias Santos-Júnior
- Institute of Science and Technology for Brain-Inspired Intelligence – ISTBI, Fudan University, Shanghai, China
- Laboratory of Microbial Processes & Biodiversity – LMPB, Hydrobiology Department, Federal University of São Carlos – UFSCar, São Paulo, Brazil
| | - Cesar de la Fuente-Nunez
- Machine Biology Group, Departments of Psychiatry and Microbiology, Institute for Biomedical Informatics, Institute for Translational Medicine and Therapeutics, Perelman School of Medicine, University of Pennsylvania, Philadelphia, Pennsylvania, USA
- Departments of Bioengineering and Chemical and Biomolecular Engineering, School of Engineering and Applied Science, University of Pennsylvania, Philadelphia, Pennsylvania, USA
- Department of Chemistry, School of Arts and Sciences, University of Pennsylvania, Philadelphia, Pennsylvania, USA
- Penn Institute for Computational Science, University of Pennsylvania, Philadelphia, Pennsylvania, USA
|
29
|
Liao YH, Chen SZ, Bin YN, Zhao JP, Feng XL, Zheng CH. UsIL-6: An unbalanced learning strategy for identifying IL-6 inducing peptides by undersampling technique. COMPUTER METHODS AND PROGRAMS IN BIOMEDICINE 2024; 250:108176. [PMID: 38677081 DOI: 10.1016/j.cmpb.2024.108176] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 08/18/2022] [Revised: 03/26/2024] [Accepted: 04/11/2024] [Indexed: 04/29/2024]
Abstract
BACKGROUND AND OBJECTIVE: Interleukin-6 (IL-6) is the critical factor of early warning, monitoring, and prognosis in the inflammatory storm of COVID-19 cases. IL-6 inducing peptides, which can induce cytokine IL-6 production, are very important for the development of diagnosis and immunotherapy. Although the existing methods have had some success in predicting IL-6 inducing peptides, there is still room for improvement in the performance of these models in practical application. METHODS: In this study, we proposed UsIL-6, a high-performance bioinformatics tool for identifying IL-6 inducing peptides. First, we extracted five groups of physicochemical properties and sequence structural information from IL-6 inducing peptide sequences, obtaining a 636-dimensional feature vector; we also employed the NearMiss3 undersampling method and the StandardScaler normalization method to process the data. Then, a 40-dimensional optimal feature vector was obtained by the Boruta feature selection method. Finally, we combined this feature vector with an extremely randomized trees classifier to build the final model, UsIL-6. RESULTS: The AUC value of UsIL-6 on the independent test dataset was 0.87, and the BACC value was 0.808, which indicated that UsIL-6 had better performance than the existing methods in IL-6 inducing peptide recognition. CONCLUSIONS: The performance comparison on the independent test dataset confirmed that UsIL-6 could achieve the highest performance, best robustness, and most excellent generalization ability. We hope that UsIL-6 will become a valuable method to identify, annotate, and characterize new IL-6 inducing peptides.
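Balanced accuracy (BACC), reported above, is the mean of sensitivity and specificity, so it is less affected by class imbalance than plain accuracy. The sketch below shows the calculation on a hypothetical confusion matrix; the counts are invented for illustration and are not results from the study.

```python
def balanced_accuracy(tp: int, tn: int, fp: int, fn: int) -> float:
    """Mean of sensitivity (recall on positives) and specificity (recall on negatives)."""
    sensitivity = tp / (tp + fn)
    specificity = tn / (tn + fp)
    return (sensitivity + specificity) / 2


def plain_accuracy(tp: int, tn: int, fp: int, fn: int) -> float:
    """Fraction of all predictions that are correct."""
    return (tp + tn) / (tp + tn + fp + fn)


# Hypothetical imbalanced test set: 100 positives, 1000 negatives.
tp, fn, tn, fp = 30, 70, 990, 10
bacc = balanced_accuracy(tp, tn, fp, fn)
acc_plain = plain_accuracy(tp, tn, fp, fn)
print(f"BACC={bacc:.3f}, accuracy={acc_plain:.3f}")
```

In this toy case the classifier misses most positives, which plain accuracy largely hides and BACC exposes. This is one reason undersampling studies tend to report BACC alongside AUC.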
Affiliation(s)
- Yan-Hong Liao
- School of Mathematics and System Science, Xinjiang University, Urumqi, Xinjiang 830017, China
| | - Shou-Zhi Chen
- School of Mathematics and System Science, Xinjiang University, Urumqi, Xinjiang 830017, China
| | - Yan-Nan Bin
- School of Computer Science and Technology, Anhui University, Hefei, Anhui 230601, China
| | - Jian-Ping Zhao
- School of Mathematics and System Science, Xinjiang University, Urumqi, Xinjiang 830017, China.
| | - Xin-Long Feng
- School of Mathematics and System Science, Xinjiang University, Urumqi, Xinjiang 830017, China.
| | - Chun-Hou Zheng
- School of Mathematics and System Science, Xinjiang University, Urumqi, Xinjiang 830017, China; School of Computer Science and Technology, Anhui University, Hefei, Anhui 230601, China
|
30
|
Li F, Xu B, Lu Z, Chen J, Fu Y, Huang J, Wang Y, Li X. Hollow CoFe Nanozymes Integrated with Oncolytic Peptides Designed via Machine-Learning for Tumor Therapy. SMALL (WEINHEIM AN DER BERGSTRASSE, GERMANY) 2024; 20:e2311101. [PMID: 38234132 DOI: 10.1002/smll.202311101] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 12/01/2023] [Revised: 12/23/2023] [Indexed: 01/19/2024]
Abstract
Developing novel substances to synergize with nanozymes is a challenging yet indispensable task to enable nanozyme-based therapeutics to tackle individual variations in tumor physicochemical properties. The advancement of machine learning (ML) has provided a useful tool to enhance the accuracy and efficiency of developing synergistic substances. In this study, ML models are applied to mine low-cytotoxicity oncolytic peptides. The filtering Pipeline is constructed using a traversal design and the AutoGluon framework. Through the Pipeline, 37 novel peptides with high oncolytic activity against cancer cells and low cytotoxicity to normal cells are identified from a library of 25,740 sequences. Combining dataset testing with cytotoxicity experiments, an 80% accuracy rate is achieved, verifying the reliability of the ML predictions. Peptide C2 is proven to possess membranolytic functions specifically for tumor cells, as targeted by the Pipeline. Peptide C2 is then integrated with a CoFe hollow hydroxide nanozyme (H-CF) to form the peptide/H-CF composite. The new composite exhibits acid-triggered membranolytic function and potent peroxidase-like (POD-like) activity, which induce ferroptosis in tumor cells and inhibit tumor growth. The study suggests that this novel ML-assisted design approach can offer an accurate and efficient paradigm for developing both oncolytic peptides and synergistic peptides for catalytic materials.
Affiliation(s)
- Feiyu Li
- State Key Laboratory of Silicon and Advanced Semiconductor Materials, School of Materials Science and Engineering, Zhejiang University, Hangzhou, 310058, China
- ZJU-Hangzhou Global Science and Technology Innovation Center, Zhejiang University, Hangzhou, 311215, China
| | - Bocheng Xu
- ZJU-Hangzhou Global Science and Technology Innovation Center, Zhejiang University, Hangzhou, 311215, China
- Institute of Feed Science, College of Animal Science, Zhejiang University, Hangzhou, 310058, China
| | - Zijie Lu
- State Key Laboratory of Silicon and Advanced Semiconductor Materials, School of Materials Science and Engineering, Zhejiang University, Hangzhou, 310058, China
- ZJU-Hangzhou Global Science and Technology Innovation Center, Zhejiang University, Hangzhou, 311215, China
| | - Jiafei Chen
- Affiliated Hospital of Stomatology, Medical College, Zhejiang University, Hangzhou, 310000, China
| | - Yike Fu
- State Key Laboratory of Silicon and Advanced Semiconductor Materials, School of Materials Science and Engineering, Zhejiang University, Hangzhou, 310058, China
- ZJU-Hangzhou Global Science and Technology Innovation Center, Zhejiang University, Hangzhou, 311215, China
| | - Jie Huang
- Department of Mechanical Engineering, University College London, London, WC1E 7JE, UK
| | - Yizhen Wang
- ZJU-Hangzhou Global Science and Technology Innovation Center, Zhejiang University, Hangzhou, 311215, China
- Institute of Feed Science, College of Animal Science, Zhejiang University, Hangzhou, 310058, China
| | - Xiang Li
- State Key Laboratory of Silicon and Advanced Semiconductor Materials, School of Materials Science and Engineering, Zhejiang University, Hangzhou, 310058, China
- ZJU-Hangzhou Global Science and Technology Innovation Center, Zhejiang University, Hangzhou, 311215, China
|
31
|
Fang Y, Luo M, Ren Z, Wei L, Wei DQ. CELA-MFP: a contrast-enhanced and label-adaptive framework for multi-functional therapeutic peptides prediction. Brief Bioinform 2024; 25:bbae348. [PMID: 39038935 PMCID: PMC11262836 DOI: 10.1093/bib/bbae348] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 04/03/2024] [Revised: 05/27/2024] [Accepted: 07/08/2024] [Indexed: 07/24/2024] Open
Abstract
Functional peptides play crucial roles in various biological processes and hold significant potential in many fields such as drug discovery and biotechnology. Accurately predicting the functions of peptides is essential for understanding their diverse effects and designing peptide-based therapeutics. Here, we propose CELA-MFP, a deep learning framework that incorporates feature Contrastive Enhancement and Label Adaptation for predicting Multi-Functional therapeutic Peptides. CELA-MFP utilizes a protein language model (pLM) to extract features from peptide sequences, which are then fed into a Transformer decoder for function prediction, effectively modeling correlations between different functions. To enhance the representation of each peptide sequence, contrastive learning is employed during training. Experimental results demonstrate that CELA-MFP outperforms state-of-the-art methods on most evaluation metrics for two widely used datasets, MFBP and MFTP. The interpretability of CELA-MFP is demonstrated by visualizing attention patterns in pLM and Transformer decoder. Finally, a user-friendly online server for predicting multi-functional peptides is established as the implementation of the proposed CELA-MFP and can be freely accessed at http://dreamai.cmii.online/CELA-MFP.
Affiliation(s)
- Yitian Fang
- State Key Laboratory of Microbial Metabolism, Joint International Research Laboratory of Metabolic and Developmental Sciences, School of Life Sciences and Biotechnology, Shanghai Jiao Tong University, 800 Dongchuan Road, Shanghai 200240, China
- Peng Cheng Laboratory, 2 Xingke 1st Street, Nanshan District, Shenzhen 518055, China
| | - Mingshuang Luo
- Peng Cheng Laboratory, 2 Xingke 1st Street, Nanshan District, Shenzhen 518055, China
| | - Zhixiang Ren
- Peng Cheng Laboratory, 2 Xingke 1st Street, Nanshan District, Shenzhen 518055, China
| | - Leyi Wei
- Centre for Artificial Intelligence Driven Drug Discovery, Faculty of Applied Sciences, Macao Polytechnic University, R. de Luís Gonzaga Gomes, Macao 999078, China
- School of Informatics, Xiamen University, 422 Siming South Road, Xiamen 361005, China
| | - Dong-Qing Wei
- State Key Laboratory of Microbial Metabolism, Joint International Research Laboratory of Metabolic and Developmental Sciences, School of Life Sciences and Biotechnology, Shanghai Jiao Tong University, 800 Dongchuan Road, Shanghai 200240, China
- Peng Cheng Laboratory, 2 Xingke 1st Street, Nanshan District, Shenzhen 518055, China
|
32
|
Niu Y, Li Z, Chen Z, Huang W, Tan J, Tian F, Yang T, Fan Y, Wei J, Mu J. Efficient screening of pharmacological broad-spectrum anti-cancer peptides utilizing advanced bidirectional Encoder representation from Transformers strategy. Heliyon 2024; 10:e30373. [PMID: 38765108 PMCID: PMC11101728 DOI: 10.1016/j.heliyon.2024.e30373] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 12/12/2023] [Revised: 04/24/2024] [Accepted: 04/24/2024] [Indexed: 05/21/2024] Open
Abstract
In the vanguard of oncological advancement, this investigation delineates the integration of deep learning paradigms to refine the screening process for Anticancer Peptides (ACPs), epitomizing a new frontier in broad-spectrum oncolytic therapeutics renowned for their targeted antitumor efficacy and specificity. Conventional methodologies for ACP identification are marred by prohibitive time and financial exigencies, representing a formidable impediment to the evolution of precision oncology. In response, our research heralds the development of a groundbreaking screening apparatus that marries Natural Language Processing (NLP) with the Pseudo Amino Acid Composition (PseAAC) technique, thereby inaugurating a comprehensive ACP compendium for the extraction of quintessential primary and secondary structural attributes. This innovative methodological approach is augmented by an optimized BERT model, meticulously calibrated for ACP detection, which conspicuously surpasses existing BERT variants and traditional machine learning algorithms in both accuracy and selectivity. Subjected to rigorous validation via five-fold cross-validation and external assessment, our model exhibited exemplary performance, boasting an average Area Under the Curve (AUC) of 0.9726 and an F1 score of 0.9385, with external validation further affirming its prowess (AUC of 0.9848 and F1 of 0.9371). These findings vividly underscore the method's unparalleled efficacy and prospective utility in the precise identification and prognostication of ACPs, significantly ameliorating the financial and temporal burdens traditionally associated with ACP research and development. Ergo, this pioneering screening paradigm promises to catalyze the discovery and clinical application of ACPs, constituting a seminal stride towards the realization of more efficacious and economically viable precision oncology interventions.
Affiliation(s)
- Yupeng Niu
- College of Information Engineering, Sichuan Agricultural University, Ya'an 625000, China
- Artificial intelligence laboratory, Sichuan Agricultural University, Ya'an 625000, China
| | - Zhenghao Li
- College of Information Engineering, Sichuan Agricultural University, Ya'an 625000, China
- Artificial intelligence laboratory, Sichuan Agricultural University, Ya'an 625000, China
| | - Ziao Chen
- College of Law, Sichuan Agricultural University, Ya'an 625000, China
- Artificial intelligence laboratory, Sichuan Agricultural University, Ya'an 625000, China
| | - Wenyuan Huang
- College of Information Engineering, Sichuan Agricultural University, Ya'an 625000, China
- Artificial intelligence laboratory, Sichuan Agricultural University, Ya'an 625000, China
| | - Jingxuan Tan
- College of Information Engineering, Sichuan Agricultural University, Ya'an 625000, China
- Artificial intelligence laboratory, Sichuan Agricultural University, Ya'an 625000, China
| | - Fa Tian
- College of Information Engineering, Sichuan Agricultural University, Ya'an 625000, China
| | - Tao Yang
- College of Information Engineering, Sichuan Agricultural University, Ya'an 625000, China
- Artificial intelligence laboratory, Sichuan Agricultural University, Ya'an 625000, China
| | - Yamin Fan
- College of Information Engineering, Sichuan Agricultural University, Ya'an 625000, China
- Artificial intelligence laboratory, Sichuan Agricultural University, Ya'an 625000, China
| | - Jiangshu Wei
- College of Information Engineering, Sichuan Agricultural University, Ya'an 625000, China
| | - Jiong Mu
- College of Information Engineering, Sichuan Agricultural University, Ya'an 625000, China
- Artificial intelligence laboratory, Sichuan Agricultural University, Ya'an 625000, China
| |
|
33
|
Dash R, Jabbari E. A Structure Independent Molecular Fragment Interfuse Model for Mesoscale Dissipative Particle Dynamics Simulation of Peptides. ACS Omega 2024; 9:18001-18022. [PMID: 38680324 PMCID: PMC11044228 DOI: 10.1021/acsomega.3c09534] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 11/29/2023] [Revised: 03/07/2024] [Accepted: 04/02/2024] [Indexed: 05/01/2024]
Abstract
There is a need to develop robust computational models for mesoscale simulation of the structure of peptides over large length scales toward the discovery of novel peptides for medical applications to address the issues of peptide aggregation, enzymatic degradation, and short half-life. The primary objective was to predict the structure and conformation of peptides whose native structures are not known. This work presents a new model for computation of interaction parameters between the beads in coarse-grained dissipative particle dynamics (DPD) simulation that is properly calibrated for amino acids, supports the compressibility requirement of water molecules, and accounts for subtle differences in the structure of amino acids and the charge in the side chain of charged amino acids. This new model is referred to as the Structure Independent Molecular Fragment Interfuse Model, abbreviated as SIMFIM, because it accounts for specific interactions between different beads, which represent molecular fragments of the amino acids, in calculating nonbonded interaction parameters without knowledge of the actual peptide structure. The electrostatic interactions are incorporated in this model by using a normal distribution of charges around the center of the beads to prevent the collapse of oppositely charged soft beads. The uniquely parameterized DPD force field in the SIMFIM model is optimized for a given peptide with respect to the degree of coarse graining for simulating the peptide over long times and length scales. The SIMFIM model was tested in this work using four peptides, namely, TrpZip2, Rubrivinodin, Lihuanodin, and IC3-CB1/Gai peptides, whose structures were sourced from the Protein Data Bank. The SIMFIM model predicted radius of gyration (Rg) values for the peptides closer to the actual structures as compared to the conventional model, and there was less deviation between the predicted and actual structures of the peptides.
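The radius of gyration (Rg) used above to compare predicted and reference structures follows the standard mass-weighted definition, Rg^2 = (1/M) * sum_i m_i |r_i - r_cm|^2. The sketch below computes it for a set of coarse-grained beads. The function name `radius_of_gyration`, the default of equal bead masses, and the toy coordinates are assumptions for illustration, not part of the SIMFIM model itself.

```python
import math


def radius_of_gyration(coords, masses=None):
    """Mass-weighted radius of gyration of beads given as (x, y, z) tuples."""
    if not coords:
        raise ValueError("at least one bead is required")
    if masses is None:
        masses = [1.0] * len(coords)  # assumption: equal bead masses
    total_mass = sum(masses)
    # Center of mass, one component at a time.
    center = [
        sum(m * c[axis] for m, c in zip(masses, coords)) / total_mass
        for axis in range(3)
    ]
    # Mass-weighted mean squared distance from the center of mass.
    weighted_sq = sum(
        m * sum((c[axis] - center[axis]) ** 2 for axis in range(3))
        for m, c in zip(masses, coords)
    )
    return math.sqrt(weighted_sq / total_mass)


# Toy bead positions (hypothetical): four beads on the corners of a square.
square = [(1.0, 1.0, 0.0), (1.0, -1.0, 0.0), (-1.0, 1.0, 0.0), (-1.0, -1.0, 0.0)]
rg_square = radius_of_gyration(square)
```

Comparing this value for a simulated bead configuration against the same quantity computed from a reference structure gives the kind of deviation the abstract refers to.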
Affiliation(s)
- Ricky Anshuman Dash
- Biomimetic Materials and Tissue Engineering Laboratory, Chemical Engineering Department, University of South Carolina, 301 Main Street, Columbia, South Carolina 29208, United States
| | - Esmaiel Jabbari
- Biomimetic Materials and Tissue Engineering Laboratory, Chemical Engineering Department, University of South Carolina, 301 Main Street, Columbia, South Carolina 29208, United States
| |
|
34
|
Xu M, Pang J, Ye Y, Zhang Z. Integrating Traditional Machine Learning and Deep Learning for Precision Screening of Anticancer Peptides: A Novel Approach for Efficient Drug Discovery. ACS Omega 2024; 9:16820-16831. [PMID: 38617603 PMCID: PMC11007766 DOI: 10.1021/acsomega.4c01374] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 02/12/2024] [Revised: 03/03/2024] [Accepted: 03/22/2024] [Indexed: 04/16/2024]
Abstract
The rapid and effective identification of anticancer peptides (ACPs) by computer technology provides a new perspective for cancer treatment. In the identification process of ACPs, accurate sequence encoding and effective classification models are crucial for predicting their biological activity. Traditional machine learning methods have been widely applied in sequence analysis, but deep learning provides a new approach to capture sequence complexity. In this study, a two-stage ACPs classification model was innovatively proposed. Three novel coding strategies were explored; two mainstream Natural Language Processing (NLP) models and 11 machine learning models were fused to identify ACPs, which significantly improved the prediction accuracy of ACPs. We analyzed the correlation between peptide chain amino acids and evaluated the relevant performance of the model by the ROC curve and t-SNE dimensionality reduction technique. The results indicated that the deep learning and machine learning fusion models of M3E-base and KNeighborsDist models, especially when considering the semantic information on amino acid sequences, achieved the highest average accuracy (AvgAcc) of 0.939, with an AUC value as high as 0.97. Then, in vitro cell experiments were used to verify that the two ACPs predicted by the model had antitumor efficacy. This study provides a convenient and effective method for screening ACPs. With further optimization and testing, these strategies have the potential to play an important role in drug discovery and design.
Affiliation(s)
- Meiqi Xu
- Key Laboratory of Novel Targets and Drug Study for Neural Repair of Zhejiang Province, School of Medicine, Hangzhou City University, Hangzhou 310015, Zhejiang, China
| | - Jiefu Pang
- School of Computer Science, Hangzhou Dianzi University, Hangzhou 310018, Zhejiang, China
| | - Yangyang Ye
- Key Laboratory of Novel Targets and Drug Study for Neural Repair of Zhejiang Province, School of Medicine, Hangzhou City University, Hangzhou 310015, Zhejiang, China
| | - Ziyi Zhang
- Key Laboratory of Novel Targets and Drug Study for Neural Repair of Zhejiang Province, School of Medicine, Hangzhou City University, Hangzhou 310015, Zhejiang, China
| |
|
35
|
Scalzitti N, Miralavy I, Korenchan DE, Farrar CT, Gilad AA, Banzhaf W. Computational peptide discovery with a genetic programming approach. J Comput Aided Mol Des 2024; 38:17. [PMID: 38570405 PMCID: PMC11416381 DOI: 10.1007/s10822-024-00558-0] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 12/08/2023] [Accepted: 03/07/2024] [Indexed: 04/05/2024]
Abstract
The development of peptides for therapeutic targets or biomarkers for disease diagnosis is a challenging task in protein engineering. Current approaches are tedious, often time-consuming and require complex laboratory data due to the vast search spaces that need to be considered. In silico methods can accelerate research and substantially reduce costs. Evolutionary algorithms are a promising approach for exploring large search spaces and can facilitate the discovery of new peptides. This study presents the development and use of a new variant of the genetic-programming-based POET algorithm, called POETRegex, where individuals are represented by a list of regular expressions. This algorithm was trained on a small curated dataset and employed to generate new peptides improving the sensitivity of peptides in magnetic resonance imaging with chemical exchange saturation transfer (CEST). The resulting model achieves a performance gain of 20% over the initial POET models and is able to predict a candidate peptide with a 58% performance increase compared to the gold-standard peptide. By combining the power of genetic programming with the flexibility of regular expressions, new peptide targets were identified that improve the sensitivity of detection by CEST. This approach provides a promising research direction for the efficient identification of peptides with therapeutic or diagnostic potential.
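The idea of an individual represented by a list of regular expressions can be sketched as follows. The abstract does not specify how POETRegex scores a peptide, so everything here is a hypothetical illustration: the pairing of each pattern with a weight, the additive scoring rule, the function name `score_peptide`, and the example patterns are all assumptions, not the published algorithm.

```python
import re


def score_peptide(peptide, rules):
    """Sum the weights of all rules whose pattern occurs in the peptide.

    `rules` is a list of (regex pattern, weight) pairs; each rule
    contributes at most once, however many times its pattern matches.
    """
    total = 0.0
    for pattern, weight in rules:
        if re.search(pattern, peptide):
            total += weight
    return total


# A hypothetical "individual": motifs and weights chosen only for illustration.
individual = [
    (r"K[ST]", 1.5),   # lysine followed by serine or threonine
    (r"^R", 0.5),      # arginine at the N-terminus
    (r"W{2,}", -1.0),  # two or more consecutive tryptophans are penalized
]

candidates = ["RKSKT", "AWWK", "GGGG"]
ranked = sorted(candidates, key=lambda p: score_peptide(p, individual), reverse=True)
```

In a genetic-programming setting, such rule lists would be mutated and recombined, and each individual's fitness would be measured by how well its scores agree with experimental values for the training peptides.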
Affiliation(s)
- Nicolas Scalzitti
- BEACON Center of Evolution in Action, Michigan State University, East Lansing, MI, USA
- Department of Computer Science and Engineering, Michigan State University, East Lansing, MI, USA
| | - Iliya Miralavy
- BEACON Center of Evolution in Action, Michigan State University, East Lansing, MI, USA
- Department of Computer Science and Engineering, Michigan State University, East Lansing, MI, USA
| | - David E Korenchan
- Athinoula A. Martinos Center for Biomedical Imaging, Department of Radiology, Massachusetts General Hospital and Harvard Medical School, Boston, MA, USA
| | - Christian T Farrar
- Athinoula A. Martinos Center for Biomedical Imaging, Department of Radiology, Massachusetts General Hospital and Harvard Medical School, Boston, MA, USA
| | - Assaf A Gilad
- BEACON Center of Evolution in Action, Michigan State University, East Lansing, MI, USA.
- Department of Chemical Engineering, Michigan State University, East Lansing, MI, USA.
- Department of Radiology, Michigan State University, East Lansing, MI, USA.
| | - Wolfgang Banzhaf
- BEACON Center of Evolution in Action, Michigan State University, East Lansing, MI, USA.
- Department of Computer Science and Engineering, Michigan State University, East Lansing, MI, USA.
| |
|
36
|
Zhao Z, Laps S, Gichtin JS, Metanis N. Selenium chemistry for spatio-selective peptide and protein functionalization. Nat Rev Chem 2024; 8:211-229. [PMID: 38388838 DOI: 10.1038/s41570-024-00579-1] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Accepted: 01/15/2024] [Indexed: 02/24/2024]
Abstract
The ability to construct a peptide or protein in a spatio-specific manner is of great interest for therapeutic and biochemical research. However, the various functional groups present in peptide sequences and the need to perform chemistry under mild and aqueous conditions make selective protein functionalization one of the greatest synthetic challenges. The fascinating paradox of selenium (Se) - being found in both toxic compounds and also harnessed by nature for essential biochemical processes - has inspired the recent exploration of selenium chemistry for site-selective functionalization of peptides and proteins. In this Review, we discuss such approaches, including metal-free and metal-catalysed transformations, as well as traceless chemical modifications. We report their advantages, limitations and applications, as well as future research avenues.
Affiliation(s)
- Zhenguang Zhao
- Institute of Chemistry, The Hebrew University of Jerusalem, Jerusalem, Israel
- Department of Chemistry and Chemical Biology, Harvard University, Cambridge, MA, USA
| | - Shay Laps
- Institute of Chemistry, The Hebrew University of Jerusalem, Jerusalem, Israel
| | - Jacob S Gichtin
- Institute of Chemistry, The Hebrew University of Jerusalem, Jerusalem, Israel
| | - Norman Metanis
- Institute of Chemistry, The Hebrew University of Jerusalem, Jerusalem, Israel.
- Casali Center for Applied Chemistry, The Hebrew University of Jerusalem, Jerusalem, Israel.
- The Center for Nanoscience and Nanotechnology, The Hebrew University of Jerusalem, Jerusalem, Israel.
| |
|
37
|
Almeida JR. The Century-Long Journey of Peptide-Based Drugs. Antibiotics (Basel) 2024; 13:196. [PMID: 38534631 DOI: 10.3390/antibiotics13030196] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 01/27/2024] [Accepted: 02/17/2024] [Indexed: 03/28/2024] Open
Abstract
The pioneering medical application of peptides as therapeutics began approximately a century ago; however, they remain clinically relevant candidates garnering more attention on the drug development agenda [...].
Affiliation(s)
- José R Almeida
- Biomolecules Discovery Group, Universidad Regional Amazónica Ikiam, Km 7 Via Muyuna, Tena 150101, Ecuador
- School of Pharmacy, University of Reading, Reading RG6 6UB, UK
| |
|
38
|
Pan X, Li Y, Huang P, Staecker H, He M. Extracellular vesicles for developing targeted hearing loss therapy. J Control Release 2024; 366:460-478. [PMID: 38182057 DOI: 10.1016/j.jconrel.2023.12.050] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 10/12/2023] [Revised: 12/19/2023] [Accepted: 12/28/2023] [Indexed: 01/07/2024]
Abstract
Substantial efforts have been made toward local administration of small molecules or biologics for treating hearing loss caused by trauma, genetic mutations, or drug ototoxicity. Recently, extracellular vesicles (EVs) naturally secreted from cells have drawn increasing attention for attenuating hearing impairment in both preclinical and clinical studies. A rapidly emerging field that uses diverse bioengineering technologies to develop EVs as bioderived therapeutic materials, together with artificial intelligence (AI)-based targeting toolkits, sheds light on the unique properties of EVs specific to inner ear delivery. This review covers this research field, from the fundamentals of the hearing-protective functions of EVs to biotechnology advancements and the potential clinical translation of functionalized EVs. Specifically, the advancements in assessing targeting ligands using AI algorithms are systematically discussed. The overall translational potential of EVs is reviewed in the context of the auditory sensing system for developing next-generation gene therapy.
Affiliation(s)
- Xiaoshu Pan
- Department of Pharmaceutics, College of Pharmacy, University of Florida, Gainesville, Florida 32610, United States
| | - Yanjun Li
- Department of Medicinal Chemistry, Center for Natural Products, Drug Discovery and Development, University of Florida, Gainesville, Florida 32610, United States
| | - Peixin Huang
- Department of Otolaryngology, Head and Neck Surgery, University of Kansas School of Medicine, Kansas City, Kansas 66160, United States
| | - Hinrich Staecker
- Department of Otolaryngology, Head and Neck Surgery, University of Kansas School of Medicine, Kansas City, Kansas 66160, United States.
| | - Mei He
- Department of Pharmaceutics, College of Pharmacy, University of Florida, Gainesville, Florida 32610, United States.
| |
|
39
|
Karim T, Shaon MSH, Sultan MF, Hasan MZ, Kafy AA. ANNprob-ACPs: A novel anticancer peptide identifier based on probabilistic feature fusion approach. Comput Biol Med 2024; 169:107915. [PMID: 38171261 DOI: 10.1016/j.compbiomed.2023.107915] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 08/20/2023] [Revised: 12/28/2023] [Accepted: 12/29/2023] [Indexed: 01/05/2024]
Abstract
Anticancer Peptides (ACPs) offer significant potential as cancer treatment drugs. Quickly identifying active compounds from protein sequences is crucial for healthcare and cancer treatment. In this paper, ANNprob-ACPs, a novel and effective model for detecting ACPs, has been implemented based on nine feature encoding techniques: AAC, CC, W2V, DPC, PAAC, QSO, CTDC, CTDT, and CKSAAGP. After analyzing the performance of several machine learning models, the six best models were selected based on their overall performance in every evaluation metric. The probability scores of each model were subsequently aggregated and used as input to our meta-model, called ANNprob-ACPs. Our model outperformed all others and has the potential to substantially improve the identification of ACPs. The results of this study showed notable improvement in 10-fold cross-validation and an independent test, with accuracies of 93.72% and 90.62%, respectively. Our proposed model, ANNprob-ACPs, outperformed existing approaches in terms of accuracy and effectiveness in discovering ACPs. Using SHAP, this study found that the physicochemical properties of QSO and the compositional properties of DPC, AAC, and PAAC are more impactful for the model's performance; these properties have a major impact on a drug's interactions and future discoveries. Consequently, this model is expected to be valuable for future work and has a high probability of detecting ACPs. We developed a web server for ANNprob-ACPs, which is accessible at ANNprob-ACPs webserver.
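Two of the compositional encodings named above, AAC (amino acid composition) and DPC (dipeptide composition), have simple standard definitions: the relative frequency of each of the 20 standard residues, and of each of the 400 ordered residue pairs. A minimal sketch follows. The function names, the alphabetical residue ordering, and the handling of non-standard residues are assumptions for illustration and may differ from the encodings used in ANNprob-ACPs.

```python
from itertools import product

# The 20 standard amino acids in alphabetical one-letter order (assumed ordering).
AMINO_ACIDS = "ACDEFGHIKLMNPQRSTVWY"


def aac(sequence):
    """Amino acid composition: a 20-dimensional vector of residue frequencies."""
    if not sequence:
        raise ValueError("sequence must not be empty")
    # Non-standard residues are not counted but still add to the length.
    return [sequence.count(residue) / len(sequence) for residue in AMINO_ACIDS]


def dpc(sequence):
    """Dipeptide composition: a 400-dimensional vector of adjacent-pair frequencies."""
    if len(sequence) < 2:
        raise ValueError("sequence must contain at least two residues")
    # Overlapping adjacent pairs, e.g. "AAKK" -> ["AA", "AK", "KK"].
    pairs = [sequence[i:i + 2] for i in range(len(sequence) - 1)]
    return [
        pairs.count(first + second) / len(pairs)
        for first, second in product(AMINO_ACIDS, repeat=2)
    ]


# Toy peptide (hypothetical), used only to show the shape of the features.
toy_aac = aac("AAKK")
toy_dpc = dpc("AAKK")
```

Feature vectors like these would be fed to the base classifiers, whose predicted probabilities are then stacked as the input of the meta-model.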
Affiliation(s)
- Tasmin Karim
- Department of Computer Science & Engineering, Daffodil International University, Daffodil Smart City, Birulia, Dhaka, 1216, Bangladesh; Health Informatics Research Lab, Department of Computer Science and Engineering, Daffodil International University, Daffodil Smart City, Birulia, Dhaka, 1216, Bangladesh.
| | - Md Shazzad Hossain Shaon
- Department of Computer Science & Engineering, Daffodil International University, Daffodil Smart City, Birulia, Dhaka, 1216, Bangladesh; Health Informatics Research Lab, Department of Computer Science and Engineering, Daffodil International University, Daffodil Smart City, Birulia, Dhaka, 1216, Bangladesh.
| | - Md Fahim Sultan
- Department of Computer Science & Engineering, Daffodil International University, Daffodil Smart City, Birulia, Dhaka, 1216, Bangladesh; Health Informatics Research Lab, Department of Computer Science and Engineering, Daffodil International University, Daffodil Smart City, Birulia, Dhaka, 1216, Bangladesh.
| | - Md Zahid Hasan
- Department of Computer Science & Engineering, Daffodil International University, Daffodil Smart City, Birulia, Dhaka, 1216, Bangladesh; Health Informatics Research Lab, Department of Computer Science and Engineering, Daffodil International University, Daffodil Smart City, Birulia, Dhaka, 1216, Bangladesh.
| | - Abdulla-Al Kafy
- Department of Urban & Regional Planning, Rajshahi University of Engineering & Technology (RUET), Rajshahi, 6204, Bangladesh.
| |
|
40
|
Feijoo-Coronel ML, Mendes B, Ramírez D, Peña-Varas C, de los Monteros-Silva NQE, Proaño-Bolaños C, de Oliveira LC, Lívio DF, da Silva JA, da Silva JMSF, Pereira MGAG, Rodrigues MQRB, Teixeira MM, Granjeiro PA, Patel K, Vaiyapuri S, Almeida JR. Antibacterial and Antiviral Properties of Chenopodin-Derived Synthetic Peptides. Antibiotics (Basel) 2024; 13:78. [PMID: 38247637 PMCID: PMC10812719 DOI: 10.3390/antibiotics13010078] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 11/29/2023] [Revised: 01/10/2024] [Accepted: 01/11/2024] [Indexed: 01/23/2024] Open
Abstract
Antimicrobial peptides have been developed based on plant-derived molecular scaffolds for the treatment of infectious diseases. Chenopodin is an abundant seed storage protein in quinoa, an Andean plant with high nutritional and therapeutic properties. Here, we used computer- and physicochemical-based strategies and designed four peptides derived from the primary structure of Chenopodin. Two peptides reproduce natural fragments of 14 amino acids from Chenopodin, named Chen1 and Chen2, and two engineered peptides of the same length were designed based on the Chen1 sequence. The two amino acids of Chen1 containing amide side chains were replaced by arginine (ChenR) or tryptophan (ChenW) to generate engineered cationic and hydrophobic peptides. The evaluation of these 14-mer peptides on Staphylococcus aureus and Escherichia coli showed that Chen1 does not have antibacterial activity up to 512 µM against these strains, while other peptides exhibited antibacterial effects at lower concentrations. The chemical substitutions of glutamine and asparagine by amino acids with cationic or aromatic side chains significantly favoured their antibacterial effects. These peptides did not show significant hemolytic activity. The fluorescence microscopy analysis highlighted the membranolytic nature of Chenopodin-derived peptides. Using molecular dynamics simulations, we found that a pore is formed when multiple peptides are assembled in the membrane. By contrast, some of them form secondary structures when interacting with the membrane, allowing water translocation during the simulations. Finally, Chen2 and ChenR significantly reduced SARS-CoV-2 infection. These findings demonstrate that Chenopodin is a highly useful template for the design, engineering, and manufacturing of non-toxic, antibacterial, and antiviral peptides.
Affiliation(s)
- Marcia L. Feijoo-Coronel
- Biomolecules Discovery Group, Universidad Regional Amazónica Ikiam, Km 7 Via Muyuna, Tena 150101, Ecuador
| | - Bruno Mendes
- Biomolecules Discovery Group, Universidad Regional Amazónica Ikiam, Km 7 Via Muyuna, Tena 150101, Ecuador
| | - David Ramírez
- Departamento de Farmacología, Facultad de Ciencias Biológicas, Universidad de Concepción, Concepción 4030000, Chile
| | - Carlos Peña-Varas
- Departamento de Farmacología, Facultad de Ciencias Biológicas, Universidad de Concepción, Concepción 4030000, Chile
| | | | - Carolina Proaño-Bolaños
- Biomolecules Discovery Group, Universidad Regional Amazónica Ikiam, Km 7 Via Muyuna, Tena 150101, Ecuador
| | - Leonardo Camilo de Oliveira
- Centro de Pesquisa e Desenvolvimento de Fármacos, Departamento de Bioquímica e Imunologia, Instituto de Ciências Biológicas, Federal University of Minas Gerais, Belo Horizonte 31270-901, Brazil
| | - Diego Fernandes Lívio
- Campus Centro Oeste, Federal University of São João Del-Rei, Rua Sebastião Gonçalves Filho, n 400, Chanadour, Divinópolis 35501-296, Brazil
| | - José Antônio da Silva
- Campus Centro Oeste, Federal University of São João Del-Rei, Rua Sebastião Gonçalves Filho, n 400, Chanadour, Divinópolis 35501-296, Brazil
| | - José Maurício S. F. da Silva
- Departamento de Bioquímica, Centro de Ciências Biomédicas, Federal University of Alfenas, Rua Gabriel Monteiro da Silva, 700, Sala E209, Alfenas 37130-001, Brazil
| | - Marília Gabriella A. G. Pereira
- Departamento de Bioquímica, Centro de Ciências Biomédicas, Federal University of Alfenas, Rua Gabriel Monteiro da Silva, 700, Sala E209, Alfenas 37130-001, Brazil
| | - Marina Q. R. B. Rodrigues
- Departamento de Bioquímica, Centro de Ciências Biomédicas, Federal University of Alfenas, Rua Gabriel Monteiro da Silva, 700, Sala E209, Alfenas 37130-001, Brazil
- Departamento de Engenharia de Biossistemas, Campus Dom Bosco, Federal University of São João Del-Rei, Praça Dom Helvécio, 74, Fábricas, São João del-Rei 36301-160, Brazil
| | - Mauro M. Teixeira
- Centro de Pesquisa e Desenvolvimento de Fármacos, Departamento de Bioquímica e Imunologia, Instituto de Ciências Biológicas, Federal University of Minas Gerais, Belo Horizonte 31270-901, Brazil
| | - Paulo Afonso Granjeiro
- Campus Centro Oeste, Federal University of São João Del-Rei, Rua Sebastião Gonçalves Filho, n 400, Chanadour, Divinópolis 35501-296, Brazil
| | - Ketan Patel
- School of Biological Sciences, University of Reading, Reading RG6 6UB, UK
| | | | - José R. Almeida
- Biomolecules Discovery Group, Universidad Regional Amazónica Ikiam, Km 7 Via Muyuna, Tena 150101, Ecuador
- School of Pharmacy, University of Reading, Reading RG6 6UB, UK
| |
|
41
|
Wong YH, Lee SH. Short Fragmented Peptides from Pardachirus Marmoratus Exhibit Stronger Anticancer Activities in In Silico Residue Replacement and Analyses. Curr Drug Discov Technol 2024; 21:e220224227304. [PMID: 38409702 DOI: 10.2174/0115701638290855240207114727] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 11/02/2023] [Revised: 01/16/2024] [Accepted: 01/24/2024] [Indexed: 02/28/2024]
Abstract
BACKGROUND Cancer is a worldwide issue. It has been observed that conventional therapies face many problems, such as side effects and drug resistance. Recent research reportedly used marine-derived products to treat various diseases and explored their potential in treating cancers. OBJECTIVE This study aims to discover short-length anticancer peptides derived from pardaxin 6 through an in silico approach. METHODS Fragmented peptides ranging from 5 to 15 amino acids were derived from the pardaxin 6 parental peptide. These peptides were further replaced with one residue and, along with the original fragmented peptides, were predicted for their SVM scores and physicochemical properties. The top 5 derivative peptides were further examined for their toxicity, hemolytic probability, peptide structures, docking models, and energy scores using various web servers. The trend of in silico analysis outputs across 5 to 15 amino acid fragments was further analyzed. RESULTS Results showed that when the amino acids were increased, SVM scores of the original fragmented peptides were also increased. Designed peptides had increased SVM scores, which was aligned with previous studies where the single residue replacement transformed the non-anticancer peptide into an anticancer agent. Moreover, in vitro studies validated that the designed peptides retained or enhanced anticancer effects against different cancer cell lines. Interestingly, a decreasing trend was observed in those fragmented derivative peptides. CONCLUSION Single residue replacement in fragmented pardaxin 6 was found to produce stronger anticancer agents through in silico predictions. Through bioinformatics tools, fragmented peptides improved the efficiency of marine-derived drugs with higher efficacy and lower hemolytic effects in treating cancers.
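The fragment-and-replace step described in the METHODS above can be sketched as two small enumerations. The abstract does not say how the fragments were cut, so the sliding-window scheme here is an assumption, as are the function names and the placeholder parent sequence, which is not the pardaxin 6 sequence.

```python
# The 20 standard amino acids, one-letter codes.
AMINO_ACIDS = "ACDEFGHIKLMNPQRSTVWY"


def fragments(parent, min_len=5, max_len=15):
    """All contiguous fragments of the parent with min_len..max_len residues."""
    return [
        parent[start:start + length]
        for length in range(min_len, max_len + 1)
        for start in range(len(parent) - length + 1)
    ]


def single_substitutions(peptide):
    """All variants that differ from the peptide at exactly one position."""
    variants = []
    for position, original in enumerate(peptide):
        for replacement in AMINO_ACIDS:
            if replacement != original:
                variants.append(
                    peptide[:position] + replacement + peptide[position + 1:]
                )
    return variants


# Placeholder parent (hypothetical, 15 residues), not a real peptide sequence.
toy_parent = "ACDEFGHIKLMNPQR"
toy_fragments = fragments(toy_parent)
toy_variants = single_substitutions(toy_fragments[0])
```

Each fragment and each variant would then be submitted to a predictor (the study used SVM scores from web servers), and the highest-scoring candidates carried forward to toxicity, hemolysis, and docking analyses.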
Affiliation(s)
- Yong Hui Wong
- School of Biosciences, Faculty of Health and Medical Sciences, Taylor's University, Subang Jaya, 47500, Malaysia
| | - Sau Har Lee
- School of Biosciences, Faculty of Health and Medical Sciences, Taylor's University, Subang Jaya, 47500, Malaysia
- Digital Health and Medical Advancements Impact Lab, Taylor's University, Subang Jaya, 47500, Malaysia
| |
|
42
|
Peng Q, Tao J, Xu Y, Shen Y, Wang Y, Jiao Y, Mao Y, Zhu Y, Liu Y, Tian Y. Lipid metabolism-associated genes serve as potential predictive biomarkers in neoadjuvant chemoradiotherapy combined with immunotherapy in rectal cancer. Transl Oncol 2024; 39:101828. [PMID: 38000147 DOI: 10.1016/j.tranon.2023.101828] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 07/29/2023] [Revised: 10/26/2023] [Accepted: 11/09/2023] [Indexed: 11/26/2023] Open
Abstract
BACKGROUND The aim of this study was to investigate the potential role of lipid metabolism-associated genes (LMAGs) in neoadjuvant chemoradiotherapy (nCRT) and immunotherapy for rectal cancer. METHODS Differential LMAGs were characterized and functional enrichment analysis was performed. Multiple machine learning algorithms were combined to explore candidate LMAGs. ROC analysis was performed to evaluate the predicting accuracy of candidate LMAGs. The expression patterns, prognostic value, genetic alterations, and immune cell infiltration of the top-ranked LMAGs were investigated. RESULTS We identified 45 LMAGs that were differentially expressed in tumor samples of nCRT responders and non-responders. These LMAGs were closely associated with lipid metabolism-related biological processes and pathways. ROC analysis revealed that the SREBF2 gene, an important transcription factor in regulating lipid metabolism, was the highest predictor of nCRT in rectal cancer. SREBF2 was highly expressed in rectal cancer tissues and high expression of SREBF2 was associated with favorable prognosis. Multivariate analysis showed that SREBF2 was an independent prognostic factor, and we integrated it with other clinical factors to establish an effective prognostic nomogram. SREBF2 also played a synergistic role with its co-expressed genes in the prognostic process of rectal cancer. Furthermore, SREBF2 was demonstrated to be closely associated with multiple immune infiltrating cells, and immunotherapy-related genes and may be used to predict the response to immunotherapy. CONCLUSION Our study suggests that LMAGs may serve as promising biomarkers in nCRT combined with immunotherapy for rectal cancer. However, large-scale clinical trials and biological experiments are necessary to demonstrate the efficacy and underlying mechanisms.
Affiliation(s)
- Qiliang Peng
- Department of Radiotherapy & Oncology, The Second Affiliated Hospital of Soochow University, Suzhou, China; Institute of Radiotherapy & Oncology, Soochow University, Suzhou, China; State Key Laboratory of Radiation Medicine and Protection, Soochow University, Suzhou, China
| | - Jialong Tao
- Department of Oncology, The Second Affiliated Hospital of Soochow University, Suzhou, China
| | - Yingjie Xu
- Department of Cardiology, The Second Affiliated Hospital of Soochow University, Suzhou, China
| | - Yi Shen
- Department of Radiation Oncology, Suzhou Hospital, Affiliated Hospital of Medical School, Nanjing University, Suzhou, China
| | - Yong Wang
- Department of Radiotherapy & Oncology, The Second Affiliated Hospital of Soochow University, Suzhou, China; Institute of Radiotherapy & Oncology, Soochow University, Suzhou, China
| | - Yang Jiao
- Re-Stem Biotechnology Co., Ltd, Suzhou, China
| | - Yiheng Mao
- Department of Radiotherapy & Oncology, The Second Affiliated Hospital of Soochow University, Suzhou, China; Institute of Radiotherapy & Oncology, Soochow University, Suzhou, China
| | - Yaqun Zhu
- Department of Radiotherapy & Oncology, The Second Affiliated Hospital of Soochow University, Suzhou, China; Institute of Radiotherapy & Oncology, Soochow University, Suzhou, China.
| | - Yulong Liu
- Department of Oncology, The Second Affiliated Hospital of Soochow University, Suzhou, China; State Key Laboratory of Radiation Medicine and Protection, School of Radiation Medicine and Protection, Medical College of Soochow University, Suzhou, China.
| | - Ye Tian
- Department of Radiotherapy & Oncology, The Second Affiliated Hospital of Soochow University, Suzhou, China; Institute of Radiotherapy & Oncology, Soochow University, Suzhou, China.
| |
|
43
|
Feng H, Wang F, Li N, Xu Q, Zheng G, Sun X, Hu M, Li X, Xing G, Zhang G. Use of tree-based machine learning methods to screen affinitive peptides based on docking data. Mol Inform 2023; 42:e202300143. [PMID: 37696773 DOI: 10.1002/minf.202300143] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 06/13/2023] [Revised: 09/03/2023] [Accepted: 09/11/2023] [Indexed: 09/13/2023]
Abstract
Screening peptides with good affinity is an important step in peptide-drug discovery. Recent advances in computer and data science have made machine learning a useful tool for accurate screening of affinitive peptides. In the current study, four tree-based algorithms, classification and regression trees (CART), the C5.0 decision tree (C50), bagged CART (BAG) and random forest (RF), were employed to explore the relationship between experimental peptide affinities and virtual docking data, and the performance of the models was compared in parallel. All four algorithms performed better on the dataset pre-processed by scaling, centering and PCA than on the other pre-processed datasets. After model rebuilding and hyperparameter optimization, the optimal C50 model (C50O) showed the best performance in terms of Accuracy, Kappa, Sensitivity, Specificity, F1, MCC and AUC when validated on the test data and on an unknown PEDV dataset (Accuracy = 80.4%). BAG and RFO (the optimal RF), the two best models during training, did not perform as expected in the test and unknown-dataset validations. Furthermore, the high correlation of the RFO and BAG predictions with those of C50O implied high stability and robustness of their predictions. In contrast, despite its good performance on the unknown dataset, the poor performance of CARTO in test-data validation and correlation analysis indicated that it could not be used for future data prediction. To accurately evaluate peptide affinity, the current study presents a first comparison of tree models for affinitive peptide prediction using virtual docking data, which should expand the application of machine learning algorithms in studying PepPIs and benefit the development of peptide therapeutics.
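The abstract above compares models by Accuracy, Kappa, Sensitivity, Specificity, F1 and MCC. As a reading aid, the following is a minimal sketch of how these metrics follow from the four cells of a binary confusion matrix. The function name `binary_metrics` and the example counts are illustrative assumptions, not taken from the study; AUC is omitted because it needs ranked scores rather than counts.

```python
import math


def binary_metrics(tp, fp, tn, fn):
    """Compute common binary-classification metrics from confusion counts.

    Hypothetical helper for illustration; not the authors' code.
    """
    total = tp + fp + tn + fn
    accuracy = (tp + tn) / total
    sensitivity = tp / (tp + fn) if (tp + fn) else 0.0  # true positive rate
    specificity = tn / (tn + fp) if (tn + fp) else 0.0  # true negative rate
    precision = tp / (tp + fp) if (tp + fp) else 0.0
    f1 = (2 * precision * sensitivity / (precision + sensitivity)
          if (precision + sensitivity) else 0.0)
    # Cohen's kappa: observed agreement corrected for chance agreement.
    p_expected = ((tp + fp) * (tp + fn) + (tn + fn) * (tn + fp)) / total ** 2
    kappa = (accuracy - p_expected) / (1 - p_expected) if p_expected != 1 else 0.0
    # Matthews correlation coefficient; defined as 0 when a margin is empty.
    denom = math.sqrt((tp + fp) * (tp + fn) * (tn + fp) * (tn + fn))
    mcc = (tp * tn - fp * fn) / denom if denom else 0.0
    return {
        "accuracy": accuracy,
        "kappa": kappa,
        "sensitivity": sensitivity,
        "specificity": specificity,
        "f1": f1,
        "mcc": mcc,
    }


if __name__ == "__main__":
    # Made-up counts, only to show the call.
    for name, value in binary_metrics(tp=40, fp=10, tn=45, fn=5).items():
        print(f"{name}: {value:.3f}")
```

Kappa and MCC both correct for class imbalance, which is presumably why such studies report them alongside raw accuracy.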
Affiliation(s)
- Hua Feng
- Henan Key Laboratory of Animal Immunology, Henan Academy of Agricultural Sciences, Zhengzhou, China
| | - Fangyu Wang
- Henan Key Laboratory of Animal Immunology, Henan Academy of Agricultural Sciences, Zhengzhou, China
| | - Ning Li
- College of Food Science and Technology, Henan Agricultural University, Zhengzhou, China
| | - Qian Xu
- Henan Key Laboratory of Animal Immunology, Henan Academy of Agricultural Sciences, Zhengzhou, China
| | - Guanming Zheng
- Public Health and Preventive Medicine Teaching and Research Center, Henan University of Chinese Medicine, Zhengzhou, Henan, China
| | - Xuefeng Sun
- Henan Key Laboratory of Animal Immunology, Henan Academy of Agricultural Sciences, Zhengzhou, China
| | - Man Hu
- Henan Key Laboratory of Animal Immunology, Henan Academy of Agricultural Sciences, Zhengzhou, China
| | - Xuewu Li
- Henan Key Laboratory of Animal Immunology, Henan Academy of Agricultural Sciences, Zhengzhou, China
| | - Guangxu Xing
- Henan Key Laboratory of Animal Immunology, Henan Academy of Agricultural Sciences, Zhengzhou, China
| | - Gaiping Zhang
- Henan Key Laboratory of Animal Immunology, Henan Academy of Agricultural Sciences, Zhengzhou, China
- Longhu Modern Immunology Laboratory, Zhengzhou, China
- School of Advanced Agricultural sciences, Peking University, Beijing, China
- Jiangsu Co-Innovation Center for the Prevention and Control of Important Animal Infectious Diseases and Zoonoses, Yangzhou University, Yangzhou, Jiangsu, China
| |
|
44
|
Pham NT, Rakkiyapan R, Park J, Malik A, Manavalan B. H2Opred: a robust and efficient hybrid deep learning model for predicting 2'-O-methylation sites in human RNA. Brief Bioinform 2023; 25:bbad476. [PMID: 38180830 PMCID: PMC10768780 DOI: 10.1093/bib/bbad476] [Citation(s) in RCA: 16] [Impact Index Per Article: 8.0] [Received: 09/30/2023] [Revised: 11/22/2023] [Accepted: 11/28/2023] [Indexed: 01/07/2024] Open
Abstract
2'-O-methylation (2OM) is the most common post-transcriptional modification of RNA. It plays a crucial role in RNA splicing, RNA stability and innate immunity. Despite advances in high-throughput detection, the chemical stability of 2OM makes it difficult to detect and map in messenger RNA. Therefore, bioinformatics tools have been developed using machine learning (ML) algorithms to identify 2OM sites. These tools have made significant progress, but their performance remains unsatisfactory and needs further improvement. In this study, we introduced H2Opred, a novel hybrid deep learning (HDL) model for accurately identifying 2OM sites in human RNA. Notably, this is the first application of HDL in developing four nucleotide-specific models [adenine (A2OM), cytosine (C2OM), guanine (G2OM) and uracil (U2OM)] as well as a generic model (N2OM). H2Opred incorporated both stacked 1D convolutional neural network (1D-CNN) blocks and stacked attention-based bidirectional gated recurrent unit (Bi-GRU-Att) blocks. 1D-CNN blocks learned effective feature representations from 14 conventional descriptors, while Bi-GRU-Att blocks learned feature representations from five natural language processing-based embeddings extracted from RNA sequences. H2Opred integrated these feature representations to make the final prediction. Rigorous cross-validation analysis demonstrated that H2Opred consistently outperforms conventional ML-based single-feature models on five different datasets. Moreover, the generic model of H2Opred demonstrated remarkable performance on both training and testing datasets, significantly outperforming the existing predictor and the four nucleotide-specific H2Opred models. To enhance accessibility and usability, we have deployed a user-friendly web server for H2Opred, accessible at https://balalab-skku.org/H2Opred/. This platform will serve as an invaluable tool for accurately predicting 2OM sites within human RNA, thereby facilitating broader applications in relevant research endeavors.
Affiliation(s)
- Nhat Truong Pham
- Department of Integrative Biotechnology, College of Biotechnology and Bioengineering, Sungkyunkwan University, Suwon, 16419, Republic of Korea
| | - Rajan Rakkiyapan
- Department of Mathematics, Bharathiar University, Coimbatore - 641046, Tamil Nadu, India
| | - Jongsun Park
- InfoBoss inc. and InfoBoss Research Center, Gangnam-gu, Seoul 06278, Republic of Korea
| | - Adeel Malik
- Institute of Intelligence Informatics Technology, Sangmyung University, Seoul, 03016, Republic of Korea
| | - Balachandran Manavalan
- Department of Integrative Biotechnology, College of Biotechnology and Bioengineering, Sungkyunkwan University, Suwon, 16419, Republic of Korea
| |
|
45
|
Yu S, Liao B, Zhu W, Peng D, Wu F. Accurate prediction and key protein sequence feature identification of cyclins. Brief Funct Genomics 2023; 22:411-419. [PMID: 37118891 DOI: 10.1093/bfgp/elad014] [Citation(s) in RCA: 4] [Impact Index Per Article: 2.0] [Received: 01/14/2023] [Revised: 03/03/2023] [Accepted: 03/17/2023] [Indexed: 04/30/2023] Open
Abstract
Cyclin proteins are a group of proteins that activate the cell cycle by forming complexes with cyclin-dependent kinases. Identifying cyclins correctly can provide key clues to understanding their function. However, due to the low similarity between cyclin protein sequences, the development of a machine learning-based approach to identify cyclins is urgently needed. In this study, cyclin protein sequence features were extracted using the profile-based auto-cross covariance method. Then the features were ranked and selected with maximum relevance-maximum distance (MRMD) 1.0 and MRMD2.0. Finally, the prediction model was assessed through 10-fold cross-validation. The computational experiments showed that the best protein sequence features generated by MRMD1.0 could correctly predict 98.2% of cyclins using the random forest (RF) classifier, whereas seven-dimensional key protein sequence features identified with MRMD2.0 could correctly predict 96.1% of cyclins, which was superior to previous studies on the same dataset in terms of both dimensionality and performance. Therefore, our work provides a valuable tool for identifying cyclins. The model data can be downloaded from https://github.com/YUshunL/cyclin.
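The abstract above assesses its model by 10-fold cross-validation. The sketch below illustrates the general procedure only: each sample is held out exactly once, and the model is refitted on the remaining folds. The names `k_fold_indices` and `cross_validate`, the fixed shuffle seed, and the `fit`/`predict` callables are assumptions for illustration; the study's own random forest and feature pipeline are not reproduced here.

```python
import random


def k_fold_indices(n_samples, k=10, seed=0):
    """Yield (train_indices, test_indices) pairs for k-fold cross-validation."""
    indices = list(range(n_samples))
    random.Random(seed).shuffle(indices)  # fixed seed keeps the split reproducible
    folds = [indices[i::k] for i in range(k)]  # k near-equal, disjoint folds
    for i in range(k):
        test_idx = folds[i]
        train_idx = [j for f, fold in enumerate(folds) if f != i for j in fold]
        yield train_idx, test_idx


def cross_validate(samples, labels, fit, predict, k=10, seed=0):
    """Return the pooled accuracy over all held-out folds.

    `fit(train_x, train_y)` returns a model and `predict(model, x)` returns
    a label; both are placeholders for whatever classifier is being assessed.
    """
    correct = 0
    for train_idx, test_idx in k_fold_indices(len(samples), k, seed):
        model = fit([samples[i] for i in train_idx],
                    [labels[i] for i in train_idx])
        correct += sum(predict(model, samples[i]) == labels[i] for i in test_idx)
    return correct / len(samples)
```

Any feature ranking or selection step would normally be repeated inside each training fold, so that held-out samples do not influence which features are kept.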
Affiliation(s)
- Shaoyou Yu
- Key Laboratory of Computational Science and Application of Hainan Province, Haikou, China
- Key Laboratory of Data Science and Intelligence Education, Hainan Normal University, Ministry of Education, Haikou, China
- School of Mathematics and Statistics, Hainan Normal University, Haikou, China
| | - Bo Liao
- Key Laboratory of Computational Science and Application of Hainan Province, Haikou, China
- Key Laboratory of Data Science and Intelligence Education, Hainan Normal University, Ministry of Education, Haikou, China
- School of Mathematics and Statistics, Hainan Normal University, Haikou, China
| | - Wen Zhu
- Key Laboratory of Computational Science and Application of Hainan Province, Haikou, China
- Key Laboratory of Data Science and Intelligence Education, Hainan Normal University, Ministry of Education, Haikou, China
- School of Mathematics and Statistics, Hainan Normal University, Haikou, China
| | - Dejun Peng
- Key Laboratory of Computational Science and Application of Hainan Province, Haikou, China
- Key Laboratory of Data Science and Intelligence Education, Hainan Normal University, Ministry of Education, Haikou, China
- School of Mathematics and Statistics, Hainan Normal University, Haikou, China
| | - Fangxiang Wu
- Key Laboratory of Computational Science and Application of Hainan Province, Haikou, China
- Key Laboratory of Data Science and Intelligence Education, Hainan Normal University, Ministry of Education, Haikou, China
- School of Mathematics and Statistics, Hainan Normal University, Haikou, China
| |
|
46
|
Ferreira R, Amado F, Vitorino R. Empowering peptidomics: utilizing computational tools and approaches. Bioanalysis 2023; 15:1315-1325. [PMID: 37737150 DOI: 10.4155/bio-2023-0102] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 09/23/2023] Open
Abstract
Bioinformatics plays a critical role in the advancement of peptidomics by providing powerful tools for data analysis, interpretation and integration. Peptidomics is concerned with the study of peptides, short chains of amino acids with diverse biological functions. This area includes peptide identification and characterization, database construction, de novo sequencing, functional annotation, omics data integration and systems biology. Artificial intelligence techniques, such as machine learning and natural language processing, aid in the interpretation of peptide sequence data and the generation of biological insights. By using bioinformatics approaches, peptidomics researchers can accelerate peptide discovery, understand their functions and gain insights into complex molecular interactions.
Affiliation(s)
- Rita Ferreira
- LAQV-REQUIMTE, Department of Chemistry, University of Aveiro, 3810-193 Aveiro, Portugal
| | - Francisco Amado
- LAQV-REQUIMTE, Department of Chemistry, University of Aveiro, 3810-193 Aveiro, Portugal
| | - Rui Vitorino
- LAQV-REQUIMTE, Department of Chemistry, University of Aveiro, 3810-193 Aveiro, Portugal
- Unidade de Investigação Cardiovascular, Departamento de Cirurgia e Fisiologia, Universidade do Porto, Porto, Portugal
- iBiMED, Department of Medical Sciences, University of Aveiro, Aveiro, Portugal
| |
|
47
|
Lv H, Yan K, Liu B. TPpred-LE: therapeutic peptide function prediction based on label embedding. BMC Biol 2023; 21:238. [PMID: 37904157 PMCID: PMC10617231 DOI: 10.1186/s12915-023-01740-w] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 02/17/2023] [Accepted: 10/17/2023] [Indexed: 11/01/2023] Open
Abstract
BACKGROUND Therapeutic peptides play an essential role in human physiology, treatment paradigms and bio-pharmacy. Several computational methods have been developed to identify the functions of therapeutic peptides based on binary classification and multi-label classification. However, these methods fail to explicitly exploit the relationship information among different functions, which limits further improvement of prediction performance. In addition, as peptide detection technology develops, more peptide functions will be discovered. Therefore, it is necessary to explore computational methods for detecting therapeutic peptide functions with limited labeled data. RESULTS In this study, a novel method called TPpred-LE, based on the Transformer framework, was proposed for predicting the multiple functions of therapeutic peptides. It explicitly extracts function correlation information by using a label embedding methodology and exploits specificity information through function-specific classifiers. In addition, we incorporated the multi-label classifier retraining approach (MCRT) into TPpred-LE to detect new therapeutic functions with limited labeled data. Experimental results demonstrate that TPpred-LE outperforms the other state-of-the-art methods, and that TPpred-LE with MCRT is robust to limited labeled data. CONCLUSIONS In summary, TPpred-LE is a function-specific classifier for accurate therapeutic peptide function prediction, demonstrating the importance of relationship information for therapeutic peptide function prediction. MCRT is a simple but effective strategy for detecting functions with limited labeled data.
Affiliation(s)
- Hongwu Lv
- School of Computer Science and Technology, Beijing Institute of Technology, Beijing, 100081, China
| | - Ke Yan
- School of Computer Science and Technology, Beijing Institute of Technology, Beijing, 100081, China
| | - Bin Liu
- School of Computer Science and Technology, Beijing Institute of Technology, Beijing, 100081, China.
- Advanced Research Institute of Multidisciplinary Science, Beijing Institute of Technology, No. 5, South Zhongguancun Street, Haidian District, Beijing, 100081, China.
| |
|
48
|
Guan C, Luo J, Li S, Tan ZL, Wang Y, Chen H, Yamamoto N, Zhang C, Lu Y, Chen J, Xing XH. Exploration of DPP-IV Inhibitory Peptide Design Rules Assisted by the Deep Learning Pipeline That Identifies the Restriction Enzyme Cutting Site. ACS Omega 2023; 8:39662-39672. [PMID: 37901493 PMCID: PMC10601436 DOI: 10.1021/acsomega.3c05571] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 07/30/2023] [Accepted: 09/27/2023] [Indexed: 10/31/2023]
Abstract
The mining of antidiabetic dipeptidyl peptidase IV (DPP-IV) inhibitory peptides (DPP-IV-IPs) is currently a costly and laborious process. Due to the absence of rational peptide design rules, it relies on cumbersome screening of unknown enzyme hydrolysates. Here, we present an enhanced deep learning model called BERT-DPPIV (BERT: bidirectional encoder representations from transformers), specifically designed to classify DPP-IV-IPs and explore their design rules to discover potent candidates. The end-to-end model uses a fine-tuned BERT architecture to extract structural/functional information from input peptides and accurately identify DPP-IV-IPs among them. Experimental results on the benchmark data set showed that BERT-DPPIV yielded state-of-the-art accuracy and MCC of 0.894 and 0.790, surpassing the 0.797 and 0.594 obtained by the sequence-feature model. Furthermore, we leveraged the attention mechanism to show that our model could recognize the restriction enzyme cutting site and specific residues that contribute to the inhibition of DPP-IV. Moreover, guided by BERT-DPPIV, proposed design rules for DPP-IV inhibitory tripeptides and pentapeptides were validated, and they can be used to screen potent DPP-IV-IPs.
Affiliation(s)
- Changge Guan
- Key Laboratory for Industrial Biocatalysis, Ministry of Education of China, Department of Chemical Engineering, Tsinghua University, Beijing 100084, China
| | - Jiawei Luo
- Department of Computer Science and Technology, Harbin Institute of Technology, Shenzhen 518055, China
| | - Shucheng Li
- Key Laboratory for Industrial Biocatalysis, Ministry of Education of China, Department of Chemical Engineering, Tsinghua University, Beijing 100084, China
| | - Zheng Lin Tan
- School of Life Science and Technology, Tokyo Institute of Technology, 4259 Nagatsutacho, Midori Ward, Yokohama, Kanagawa Prefecture 226-0026, Japan
| | - Yi Wang
- Key Laboratory for Industrial Biocatalysis, Ministry of Education of China, Department of Chemical Engineering, Tsinghua University, Beijing 100084, China
| | - Haihong Chen
- Institute of Biopharmaceutical and Health Engineering, Tsinghua Shenzhen International Graduate School, Shenzhen 518055, China
- Institute of Biomedical Health Technology and Engineering, Shenzhen Bay Laboratory, Shenzhen 518118, China
| | - Naoyuki Yamamoto
- School of Life Science and Technology, Tokyo Institute of Technology, 4259 Nagatsutacho, Midori Ward, Yokohama, Kanagawa Prefecture 226-0026, Japan
| | - Chong Zhang
- Key Laboratory for Industrial Biocatalysis, Ministry of Education of China, Department of Chemical Engineering, Tsinghua University, Beijing 100084, China
- Center for Synthetic and Systems Biology, Tsinghua University, Beijing 100084, China
| | - Yuan Lu
- Key Laboratory for Industrial Biocatalysis, Ministry of Education of China, Department of Chemical Engineering, Tsinghua University, Beijing 100084, China
| | - Junjie Chen
- Department of Computer Science and Technology, Harbin Institute of Technology, Shenzhen 518055, China
| | - Xin-Hui Xing
- Key Laboratory for Industrial Biocatalysis, Ministry of Education of China, Department of Chemical Engineering, Tsinghua University, Beijing 100084, China
- Institute of Biopharmaceutical and Health Engineering, Tsinghua Shenzhen International Graduate School, Shenzhen 518055, China
- Institute of Biomedical Health Technology and Engineering, Shenzhen Bay Laboratory, Shenzhen 518118, China
- Center for Synthetic and Systems Biology, Tsinghua University, Beijing 100084, China
| |
|
49
|
Basith S, Pham NT, Song M, Lee G, Manavalan B. ADP-Fuse: A novel two-layer machine learning predictor to identify antidiabetic peptides and diabetes types using multiview information. Comput Biol Med 2023; 165:107386. [PMID: 37619323 DOI: 10.1016/j.compbiomed.2023.107386] [Citation(s) in RCA: 14] [Impact Index Per Article: 7.0] [Received: 06/22/2023] [Revised: 08/03/2023] [Accepted: 08/14/2023] [Indexed: 08/26/2023]
Abstract
Diabetes mellitus has become a major public health concern associated with high mortality and reduced life expectancy and can cause blindness, heart attacks, kidney failure, lower limb amputations, and strokes. A new generation of antidiabetic peptides (ADPs) that act on β-cells or T-cells to regulate insulin production is being developed to alleviate the effects of diabetes. However, the lack of effective peptide-mining tools has hampered the discovery of these promising drugs. Hence, novel computational tools need to be developed urgently. In this study, we present ADP-Fuse, a novel two-layer prediction framework capable of accurately identifying ADPs or non-ADPs and categorizing them into type 1 and type 2 ADPs. First, we comprehensively evaluated 22 peptide sequence-derived features coupled with eight notable machine learning algorithms. Subsequently, the most suitable feature descriptors and classifiers for both layers were identified. The output of these single-feature models, embedded with multiview information, was trained with an appropriate classifier to provide the final prediction. Comprehensive cross-validation and independent tests substantiate that ADP-Fuse surpasses single-feature models and the feature fusion approach for the prediction of ADPs and their types. In addition, the SHapley Additive exPlanation method was used to elucidate the contributions of individual features to the prediction of ADPs and their types. Finally, a user-friendly web server for ADP-Fuse was developed and made publicly accessible (https://balalab-skku.org/ADP-Fuse), enabling the swift screening and identification of novel ADPs and their types. This framework is expected to contribute significantly to antidiabetic peptide identification.
Affiliation(s)
- Shaherin Basith
- Department of Physiology, Ajou University School of Medicine, Suwon, 16499, Republic of Korea
| | - Nhat Truong Pham
- Department of Integrative Biotechnology, College of Biotechnology and Bioengineering, Sungkyunkwan University, Suwon, 16419, Republic of Korea
| | - Minkyung Song
- Department of Integrative Biotechnology, College of Biotechnology and Bioengineering, Sungkyunkwan University, Suwon, 16419, Republic of Korea; Department of Biopharmaceutical Convergence, Sungkyunkwan University, Suwon, 16419, Republic of Korea.
| | - Gwang Lee
- Department of Physiology, Ajou University School of Medicine, Suwon, 16499, Republic of Korea; Department of Molecular Science and Technology, Ajou University, Suwon, 16499, Republic of Korea.
| | - Balachandran Manavalan
- Department of Integrative Biotechnology, College of Biotechnology and Bioengineering, Sungkyunkwan University, Suwon, 16419, Republic of Korea.
| |
|
50
|
Charoenkwan P, Kongsompong S, Schaduangrat N, Chumnanpuen P, Shoombuatong W. TIPred: a novel stacked ensemble approach for the accelerated discovery of tyrosinase inhibitory peptides. BMC Bioinformatics 2023; 24:356. [PMID: 37735626 PMCID: PMC10512532 DOI: 10.1186/s12859-023-05463-1] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 03/30/2023] [Accepted: 09/01/2023] [Indexed: 09/23/2023] Open
Abstract
BACKGROUND Tyrosinase is an enzyme involved in melanin production in the skin. Several hyperpigmentation disorders involve the overproduction of melanin and instability of tyrosinase activity, resulting in darker, discolored patches on the skin. Therefore, discovering tyrosinase inhibitory peptides (TIPs) is of great significance for basic research and clinical treatments. However, the identification of TIPs using experimental methods is generally costly and time-consuming. RESULTS Herein, a stacked ensemble learning approach, called TIPred, is proposed for the accurate and quick identification of TIPs by using sequence information. TIPred explored a comprehensive set of baseline models derived from well-known machine learning (ML) algorithms and heterogeneous feature encoding schemes from multiple perspectives, such as chemical structure properties, physicochemical properties, and composition information. Subsequently, 130 baseline models were trained and optimized to create new probabilistic features. Finally, a feature selection approach was used to determine the optimal feature vector for developing TIPred. Both tenfold cross-validation and independent tests were employed to assess the predictive capability of TIPred built with the stacking strategy. Experimental results showed that TIPred significantly outperformed the state-of-the-art method on the independent test, with an accuracy of 0.923, an MCC of 0.757 and an AUC of 0.977. CONCLUSIONS The proposed TIPred approach could be a valuable tool for rapidly discovering novel TIPs and effectively identifying potential TIP candidates for follow-up experimental validation. Moreover, an online webserver of TIPred is publicly available at http://pmlabstack.pythonanywhere.com/TIPred.
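The stacking strategy summarized above (baseline models produce probabilistic features, and a second-stage learner is trained on them) can be sketched generically as follows. This is a toy illustration under stated assumptions, not TIPred itself: the baseline models are arbitrary callables that return a probability, the meta-classifier is assumed to be a plain logistic regression fitted by gradient descent, and the feature selection step is omitted. All function names are hypothetical.

```python
import math


def probabilistic_features(sequence, baseline_models):
    """Stack each baseline model's P(positive) for one sequence into a vector."""
    return [model(sequence) for model in baseline_models]


def _sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))


def train_meta_classifier(features, labels, lr=1.0, epochs=2000):
    """Fit a logistic-regression meta-classifier by batch gradient descent.

    `features` is a list of probabilistic feature vectors (values in [0, 1])
    and `labels` is a list of 0/1 class labels.
    """
    n = len(features)
    weights, bias = [0.0] * len(features[0]), 0.0
    for _ in range(epochs):
        grad_w, grad_b = [0.0] * len(weights), 0.0
        for x, y in zip(features, labels):
            error = _sigmoid(sum(w * v for w, v in zip(weights, x)) + bias) - y
            for j, v in enumerate(x):
                grad_w[j] += error * v
            grad_b += error
        weights = [w - lr * g / n for w, g in zip(weights, grad_w)]
        bias -= lr * grad_b / n
    return weights, bias


def predict_proba(x, weights, bias):
    """Return the meta-classifier's P(positive) for one feature vector."""
    return _sigmoid(sum(w * v for w, v in zip(weights, x)) + bias)
```

In a real stacked ensemble, the probabilistic features used to train the meta-classifier are usually produced out-of-fold, so that the second stage never sees predictions made on a baseline model's own training data.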
Affiliation(s)
- Phasit Charoenkwan
- Modern Management and Information Technology, College of Arts, Media and Technology, Chiang Mai University, Chiang Mai, 50200, Thailand
| | - Sasikarn Kongsompong
- Interdisciplinary Graduate Program in Bioscience, Faculty of Science, Kasetsart University, Bangkok, 10900, Thailand
| | - Nalini Schaduangrat
- Center for Research Innovation and Biomedical Informatics, Faculty of Medical Technology, Mahidol University, Bangkok, 10700, Thailand
| | - Pramote Chumnanpuen
- Department of Zoology, Faculty of Science, Kasetsart University, Bangkok, 10900, Thailand.
- Omics Center for Agriculture, Bioresources, Food, and Health, Kasetsart University (OmiKU), Bangkok, 10900, Thailand.
| | - Watshara Shoombuatong
- Center for Research Innovation and Biomedical Informatics, Faculty of Medical Technology, Mahidol University, Bangkok, 10700, Thailand.
| |
|