1
Bizzotto E, Zampieri G, Treu L, Filannino P, Di Cagno R, Campanaro S. Classification of bioactive peptides: A systematic benchmark of models and encodings. Comput Struct Biotechnol J 2024; 23:2442-2452. [PMID: 38867723 PMCID: PMC11168199 DOI: 10.1016/j.csbj.2024.05.040] [Received: 03/19/2024] [Revised: 05/10/2024] [Accepted: 05/22/2024] [Indexed: 06/14/2024] [Open access]
Abstract
Bioactive peptides are short amino acid chains possessing biological activity and exerting physiological effects relevant to human health. Despite their therapeutic value, their identification remains a major problem, as it mainly relies on time-consuming in vitro tests. While bioinformatic tools for the identification of bioactive peptides are available, they are focused on specific functional classes and have not been systematically tested on realistic settings. To tackle this problem, bioactive peptide sequences and functions were here gathered from a variety of databases to generate a unified collection of bioactive peptides from microbial fermentation. This collection was organized into nine functional classes including some previously studied and some unexplored such as immunomodulatory, opioid and cardiovascular peptides. Upon assessing their sequence properties, four alternative encoding methods were tested in combination with a multitude of machine learning algorithms, from basic classifiers like logistic regression to advanced algorithms like BERT. Tests on a total of 171 models showed that, while some functions are intrinsically easier to detect, no single combination of classifiers and encoders worked universally well for all classes. For this reason, we unified all the best individual models for each class and generated CICERON (Classification of bIoaCtive pEptides fRom micrObial fermeNtation), a classification tool for the functional classification of peptides. State-of-the-art classifiers were found to underperform on our realistic benchmark dataset compared to the models included in CICERON. Altogether, our work provides a tool for real-world peptide classification and can serve as a benchmark for future model development.
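The abstract refers to four encoding methods without naming them. As a hedged illustration of what a fixed-length peptide encoding can look like, the sketch below computes amino acid composition (AAC), one of the simplest sequence encodings. Whether AAC is among the four encodings benchmarked is an assumption not confirmed by the abstract, and `aac_encode` and the example tripeptide are hypothetical.

```python
from collections import Counter

AMINO_ACIDS = "ACDEFGHIKLMNPQRSTVWY"  # the 20 standard residues, one-letter codes


def aac_encode(peptide: str) -> list[float]:
    """Return the fraction of each standard amino acid in the peptide.

    Non-standard characters are ignored, so the 20 fractions sum to 1.
    """
    counts = Counter(peptide.upper())
    valid = sum(counts[a] for a in AMINO_ACIDS)
    if valid == 0:
        raise ValueError("no standard amino acids found")
    return [counts[a] / valid for a in AMINO_ACIDS]


# Arbitrary example tripeptide (one valine, two prolines).
vector = aac_encode("VPP")
```

Every peptide maps to a 20-dimensional vector regardless of its length, which is why such encodings pair easily with basic classifiers like logistic regression.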
Affiliation(s)
- Edoardo Bizzotto
  - Department of Biology, University of Padua, Via U. Bassi 58/b, Padova 35131, Italy
- Guido Zampieri
  - Department of Biology, University of Padua, Via U. Bassi 58/b, Padova 35131, Italy
- Laura Treu
  - Department of Biology, University of Padua, Via U. Bassi 58/b, Padova 35131, Italy
- Pasquale Filannino
  - Department of Soil, Plant and Food Science, University of Bari Aldo Moro, Via G. Amendola 165/a, Bari 70126, Italy
- Raffaella Di Cagno
  - Faculty of Agricultural, Environmental and Food Sciences, Free University of Bolzano, Piazza Universita, 5, Bolzano 39100, Italy
- Stefano Campanaro
  - Department of Biology, University of Padua, Via U. Bassi 58/b, Padova 35131, Italy
2
Sultan MF, Shaon MSH, Karim T, Ali MM, Hasan MZ, Ahmed K, Bui FM, Chen L, Dhasarathan V, Moni MA. MLAFP-XN: Leveraging neural network model for development of antifungal peptide identification tool. Heliyon 2024; 10:e37820. [PMID: 39323787 PMCID: PMC11422610 DOI: 10.1016/j.heliyon.2024.e37820] [Received: 06/01/2024] [Revised: 08/23/2024] [Accepted: 09/10/2024] [Indexed: 09/27/2024] [Open access]
Abstract
Infectious fungi have been an increasing global concern in the present era. A promising approach to tackle this pressing concern involves utilizing Antifungal peptides (AFP) to develop an antifungal drug that can selectively eliminate fungal pathogens from a host with minimal toxicity to the host. Accordingly, identifying precise therapeutic antifungal peptides is crucial for developing effective drugs and treatments. This study proposed MLAFP-XN, a neural network-based strategy for accurately detecting active AFP in sequencing data to achieve this objective. In this work, eight feature extraction techniques and the XGB feature selection strategy are utilized together to present an enhanced methodology. A total of 24 classification models were evaluated, and the most effective four have been selected. Each of these models demonstrated superior accuracy on independent test sets, with respective scores of 97.93 %, 99.47 %, and 99.48 %. Our model outperforms current state of the art methods. In addition, we created a companion website to demonstrate our AFP recognition process and use SHAP to identify the most influential properties.
Affiliation(s)
- Md. Fahim Sultan
  - Department of Computer Science and Engineering, Daffodil International University, Daffodil Smart City (DSC), Birulia, Savar, Dhaka, 1216, Bangladesh
- Md. Shazzad Hossain Shaon
  - Department of Computer Science and Engineering, Daffodil International University, Daffodil Smart City (DSC), Birulia, Savar, Dhaka, 1216, Bangladesh
- Tasmin Karim
  - Department of Computer Science and Engineering, Daffodil International University, Daffodil Smart City (DSC), Birulia, Savar, Dhaka, 1216, Bangladesh
- Md. Mamun Ali
  - Division of Biomedical Engineering, University of Saskatchewan, 57 Campus Drive, Saskatoon, SK, S7N 5A9, Canada
  - Department of Software Engineering, Daffodil International University, Daffodil Smart City (DSC), Birulia, Savar, Dhaka, 1216, Bangladesh
  - Health Informatics Research Lab, Department of Computer Science and Engineering, Daffodil International University, Daffodil Smart City (DSC), Birulia, Savar, Dhaka, 1216, Bangladesh
- Md. Zahid Hasan
  - Department of Computer Science and Engineering, Daffodil International University, Daffodil Smart City (DSC), Birulia, Savar, Dhaka, 1216, Bangladesh
- Kawsar Ahmed
  - Department of Electrical and Computer Engineering, University of Saskatchewan, 57 Campus Drive, Saskatoon, SK, S7N 5A9, Canada
  - Group of Bio-photomatiχ, Information and Communication Technology, Mawlana Bhashani Science and Technology University, Santosh, Tangail, 1902, Bangladesh
  - Health Informatics Research Lab, Department of Computer Science and Engineering, Daffodil International University, Daffodil Smart City (DSC), Birulia, Savar, Dhaka, 1216, Bangladesh
- Francis M. Bui
  - Department of Electrical and Computer Engineering, University of Saskatchewan, 57 Campus Drive, Saskatoon, SK, S7N 5A9, Canada
- Li Chen
  - Department of Electrical and Computer Engineering, University of Saskatchewan, 57 Campus Drive, Saskatoon, SK, S7N 5A9, Canada
- Vigneswaran Dhasarathan
  - Department of ECE, Centre for IoT and AI (CITI), KPR Institute of Engineering and Technology, Coimbatore, Tamil Nadu, India
- Mohammad Ali Moni
  - AI & Digital Health Technology, Artificial Intelligence & Cyber Future Institute, Charles Sturt University, Bathurst, NSW, 2795, Australia
  - AI & Digital Health Technology, Rural Health Research Institute, Charles Sturt University, Orange, NSW 2800, Australia
3
Li J, Liao L, Jia M, Chen Z, Liu X. Latent relation shared learning for endometrial cancer diagnosis with incomplete multi-modality medical images. iScience 2024; 27:110509. [PMID: 39161958 PMCID: PMC11332793 DOI: 10.1016/j.isci.2024.110509] [Received: 02/25/2024] [Revised: 05/22/2024] [Accepted: 07/11/2024] [Indexed: 08/21/2024] [Open access]
Abstract
Magnetic resonance imaging (MRI), ultrasound (US), and contrast-enhanced ultrasound (CEUS) can provide different image data about uterus, which have been used in the preoperative assessment of endometrial cancer. In practice, not all the patients have complete multi-modality medical images due to the high cost or long examination period. Most of the existing methods need to perform data cleansing or discard samples with missing modalities, which will influence the performance of the model. In this work, we propose an incomplete multi-modality images data fusion method based on latent relation shared to overcome this limitation. The shared space contains the common latent feature representation and modality-specific latent feature representation from the complete and incomplete multi-modality data, which jointly exploits both consistent and complementary information among multiple images. The experimental results show that our method outperforms the current representative approaches in terms of classification accuracy, sensitivity, specificity, and area under curve (AUC). Furthermore, our method performs well under varying imaging missing rates.
Affiliation(s)
- Jiaqi Li
  - School of Computer Science and Technology, Beijing Institute of Technology, Beijing 100081, China
  - Beijing Engineering Research Center of High Volume Language Information Processing and Cloud Computing Applications, Beijing 100081, China
- Lejian Liao
  - School of Computer Science and Technology, Beijing Institute of Technology, Beijing 100081, China
  - Beijing Engineering Research Center of High Volume Language Information Processing and Cloud Computing Applications, Beijing 100081, China
- Meihuizi Jia
  - School of Computer Science and Technology, Beijing Institute of Technology, Beijing 100081, China
  - Beijing Engineering Research Center of High Volume Language Information Processing and Cloud Computing Applications, Beijing 100081, China
- Zhendong Chen
  - School of Computer Science and Technology, Beijing Institute of Technology, Beijing 100081, China
  - Beijing Engineering Research Center of High Volume Language Information Processing and Cloud Computing Applications, Beijing 100081, China
- Xin Liu
  - Department of Ultrasound, Beijing Tiantan Hospital, Capital Medical University, Beijing 100070, China
4
Xu Y, Zhang S, Zhu F, Liang Y. A deep learning model for anti-inflammatory peptides identification based on deep variational autoencoder and contrastive learning. Sci Rep 2024; 14:18451. [PMID: 39117712 PMCID: PMC11310449 DOI: 10.1038/s41598-024-69419-y] [Received: 05/24/2024] [Accepted: 08/05/2024] [Indexed: 08/10/2024] [Open access]
Abstract
As a class of biologically active molecules with significant immunomodulatory and anti-inflammatory effects, anti-inflammatory peptides have important application value in the medical and biotechnology fields due to their unique biological functions. Research on the identification of anti-inflammatory peptides provides important theoretical foundations and practical value for a deeper understanding of the biological mechanisms of inflammation and immune regulation, as well as for the development of new drugs and biotechnological applications. Therefore, it is necessary to develop more advanced computational models for identifying anti-inflammatory peptides. In this study, we propose a deep learning model named DAC-AIPs based on variational autoencoder and contrastive learning for accurate identification of anti-inflammatory peptides. In the sequence encoding part, the incorporation of multi-hot encoding helps capture richer sequence information. The autoencoder, composed of convolutional layers and linear layers, can learn latent features and reconstruct features, with variational inference enhancing the representation capability of latent features. Additionally, the introduction of contrastive learning aims to improve the model's classification ability. Through cross-validation and independent dataset testing experiments, DAC-AIPs achieves superior performance compared to existing state-of-the-art models. In cross-validation, the classification accuracy of DAC-AIPs reached around 88%, which is 7% higher than previous models. Furthermore, various ablation experiments and interpretability experiments validate the effectiveness of DAC-AIPs. Finally, a user-friendly online predictor is designed to enhance the practicality of the model, and the server is freely accessible at http://dac-aips.online .
Affiliation(s)
- Yujie Xu
  - School of Mathematics and Statistics, Xidian University, Xi'an, 710071, People's Republic of China
- Shengli Zhang
  - School of Mathematics and Statistics, Xidian University, Xi'an, 710071, People's Republic of China
- Feng Zhu
  - Center for Translational Medicine, The First Affiliated Hospital of Xi'an Jiaotong University, Xi'an, 710061, People's Republic of China
- Yunyun Liang
  - School of Science, Xi'an Polytechnic University, Xi'an, 710048, People's Republic of China
5
Hao Y, Liu X, Fu H, Shao X, Cai W. PGAT-ABPp: harnessing protein language models and graph attention networks for antibacterial peptide identification with remarkable accuracy. Bioinformatics 2024; 40:btae497. [PMID: 39120878 PMCID: PMC11338452 DOI: 10.1093/bioinformatics/btae497] [Received: 03/29/2024] [Revised: 07/24/2024] [Accepted: 08/08/2024] [Indexed: 08/10/2024] [Open access]
Abstract
MOTIVATION The emergence of drug-resistant pathogens represents a formidable challenge to global health. Using computational methods to identify the antibacterial peptides (ABPs), an alternative antimicrobial agent, has demonstrated advantages in further drug design studies. Most of the current approaches, however, rely on handcrafted features and underutilize structural information, which may affect prediction performance. RESULTS To present an ultra-accurate model for ABP identification, we propose a novel deep learning approach, PGAT-ABPp. PGAT-ABPp leverages structures predicted by AlphaFold2 and a pretrained protein language model, ProtT5-XL-U50 (ProtT5), to construct graphs. Then the graph attention network (GAT) is adopted to learn global discriminative features from the graphs. PGAT-ABPp outperforms the other fourteen state-of-the-art models in terms of accuracy, F1-score and Matthews Correlation Coefficient on the independent test dataset. The results show that ProtT5 has significant advantages in the identification of ABPs and the introduction of spatial information further improves the prediction performance of the model. The interpretability analysis of key residues in known active ABPs further underscores the superiority of PGAT-ABPp. AVAILABILITY AND IMPLEMENTATION The datasets and source codes for the PGAT-ABPp model are available at https://github.com/moonseter/PGAT-ABPp/.
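The abstract states that graphs are constructed from AlphaFold2-predicted structures and ProtT5 features, but it does not say how edges are defined. One common convention, assumed here purely for illustration, is to connect residues whose C-alpha atoms lie within a distance cutoff. The function name `build_contact_graph`, the 8.0 cutoff, and the toy coordinates are all hypothetical and are not taken from the PGAT-ABPp code.

```python
import math


def build_contact_graph(ca_coords, cutoff=8.0):
    """Return undirected edges (i, j) with i < j for residue pairs whose
    C-alpha atoms are closer than `cutoff` (same units as the coordinates)."""
    edges = []
    for i in range(len(ca_coords)):
        for j in range(i + 1, len(ca_coords)):
            if math.dist(ca_coords[i], ca_coords[j]) < cutoff:
                edges.append((i, j))
    return edges


# Made-up coordinates for a 4-residue chain, spaced 3.8 units apart on a line.
toy_coords = [(0.0, 0.0, 0.0), (3.8, 0.0, 0.0), (7.6, 0.0, 0.0), (11.4, 0.0, 0.0)]
edges = build_contact_graph(toy_coords)
```

In a setup like the one described, each node would additionally carry a per-residue embedding from the language model, and the edge list would be passed to the graph attention layers.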
Affiliation(s)
- Yuelei Hao
  - Research Center for Analytical Sciences, Tianjin Key Laboratory of Biosensing and Molecular Recognition, State Key Laboratory of Medicinal Chemical Biology, College of Chemistry, Nankai University, Tianjin 300071, China
  - Haihe Laboratory of Sustainable Chemical Transformations, Tianjin 300192, China
- Xuyang Liu
  - Research Center for Analytical Sciences, Tianjin Key Laboratory of Biosensing and Molecular Recognition, State Key Laboratory of Medicinal Chemical Biology, College of Chemistry, Nankai University, Tianjin 300071, China
  - Haihe Laboratory of Sustainable Chemical Transformations, Tianjin 300192, China
- Haohao Fu
  - Research Center for Analytical Sciences, Tianjin Key Laboratory of Biosensing and Molecular Recognition, State Key Laboratory of Medicinal Chemical Biology, College of Chemistry, Nankai University, Tianjin 300071, China
  - Haihe Laboratory of Sustainable Chemical Transformations, Tianjin 300192, China
- Xueguang Shao
  - Research Center for Analytical Sciences, Tianjin Key Laboratory of Biosensing and Molecular Recognition, State Key Laboratory of Medicinal Chemical Biology, College of Chemistry, Nankai University, Tianjin 300071, China
  - Haihe Laboratory of Sustainable Chemical Transformations, Tianjin 300192, China
- Wensheng Cai
  - Research Center for Analytical Sciences, Tianjin Key Laboratory of Biosensing and Molecular Recognition, State Key Laboratory of Medicinal Chemical Biology, College of Chemistry, Nankai University, Tianjin 300071, China
  - Haihe Laboratory of Sustainable Chemical Transformations, Tianjin 300192, China
6
Song H, Lin X, Zhang H, Yin H. ACP-ESM2: The prediction of anticancer peptides based on pre-trained classifier. Comput Biol Chem 2024; 110:108091. [PMID: 38735271 DOI: 10.1016/j.compbiolchem.2024.108091] [Received: 02/03/2024] [Revised: 04/07/2024] [Accepted: 04/29/2024] [Indexed: 05/14/2024]
Abstract
Anticancer peptides (ACPs) are a type of protein molecule that has anti-cancer activity and can inhibit cancer cell growth and survival. Traditional classification approaches for ACPs are expensive and time-consuming. This paper proposes a pre-trained classifier model, ESM2-GRU, for ACP prediction to make it easier to predict ACPs, gain a better understanding of the structural and functional differences of anti-cancer peptides, and optimize the design for the development of more effective anti-cancer treatment strategies. The model is made up of the ESM2 pre-trained model, a bidirectional GRU recurrent neural network, and a fully connected layer. ACP sequences are first fed into the ESM2 model, which then expands the dimensions before feeding the findings back into the bidirectional GRU recurrent neural network. Finally, the fully connected layer generates the ultimate output. Experimental validation demonstrates that the ESM2-GRU model greatly improves classification performance on the benchmark dataset ACP606, with AUC, ACC, and MCC values of 0.975, 0.852, and 0.738, respectively. This exceptional prediction potential helps to identify specific types of anti-cancer peptides, improving their targeting and selectivity and, therefore, furthering the development of tailored medicine and treatments.
Affiliation(s)
- Huijia Song
  - School of Information Engineering, Beijing Institute of Petrochemical Technology, Beijing 102617, China
- Xiaozhu Lin
  - School of Information Engineering, Beijing Institute of Petrochemical Technology, Beijing 102617, China
- Huainian Zhang
  - School of Information Engineering, Beijing Institute of Petrochemical Technology, Beijing 102617, China
- Huijuan Yin
  - School of Information Engineering, Beijing Institute of Petrochemical Technology, Beijing 102617, China
7
Fang Y, Luo M, Ren Z, Wei L, Wei DQ. CELA-MFP: a contrast-enhanced and label-adaptive framework for multi-functional therapeutic peptides prediction. Brief Bioinform 2024; 25:bbae348. [PMID: 39038935 PMCID: PMC11262836 DOI: 10.1093/bib/bbae348] [Received: 04/03/2024] [Revised: 05/27/2024] [Accepted: 07/08/2024] [Indexed: 07/24/2024] [Open access]
Abstract
Functional peptides play crucial roles in various biological processes and hold significant potential in many fields such as drug discovery and biotechnology. Accurately predicting the functions of peptides is essential for understanding their diverse effects and designing peptide-based therapeutics. Here, we propose CELA-MFP, a deep learning framework that incorporates feature Contrastive Enhancement and Label Adaptation for predicting Multi-Functional therapeutic Peptides. CELA-MFP utilizes a protein language model (pLM) to extract features from peptide sequences, which are then fed into a Transformer decoder for function prediction, effectively modeling correlations between different functions. To enhance the representation of each peptide sequence, contrastive learning is employed during training. Experimental results demonstrate that CELA-MFP outperforms state-of-the-art methods on most evaluation metrics for two widely used datasets, MFBP and MFTP. The interpretability of CELA-MFP is demonstrated by visualizing attention patterns in pLM and Transformer decoder. Finally, a user-friendly online server for predicting multi-functional peptides is established as the implementation of the proposed CELA-MFP and can be freely accessed at http://dreamai.cmii.online/CELA-MFP.
Affiliation(s)
- Yitian Fang
  - State Key Laboratory of Microbial Metabolism, Joint International Research Laboratory of Metabolic and Developmental Sciences, School of Life Sciences and Biotechnology, Shanghai Jiao Tong University, 800 Dongchuan Road, Shanghai 200240, China
  - Peng Cheng Laboratory, 2 Xingke 1st Street, Nanshan District, Shenzhen 518055, China
- Mingshuang Luo
  - Peng Cheng Laboratory, 2 Xingke 1st Street, Nanshan District, Shenzhen 518055, China
- Zhixiang Ren
  - Peng Cheng Laboratory, 2 Xingke 1st Street, Nanshan District, Shenzhen 518055, China
- Leyi Wei
  - Centre for Artificial Intelligence Driven Drug Discovery, Faculty of Applied Sciences, Macao Polytechnic University, R. de Luís Gonzaga Gomes, Macao 999078, China
  - School of Informatics, Xiamen University, 422 Siming South Road, Xiamen 361005, China
- Dong-Qing Wei
  - State Key Laboratory of Microbial Metabolism, Joint International Research Laboratory of Metabolic and Developmental Sciences, School of Life Sciences and Biotechnology, Shanghai Jiao Tong University, 800 Dongchuan Road, Shanghai 200240, China
  - Peng Cheng Laboratory, 2 Xingke 1st Street, Nanshan District, Shenzhen 518055, China
8
Akbar S, Zou Q, Raza A, Alarfaj FK. iAFPs-Mv-BiTCN: Predicting antifungal peptides using self-attention transformer embedding and transform evolutionary based multi-view features with bidirectional temporal convolutional networks. Artif Intell Med 2024; 151:102860. [PMID: 38552379 DOI: 10.1016/j.artmed.2024.102860] [Received: 10/27/2023] [Revised: 02/21/2024] [Accepted: 03/25/2024] [Indexed: 04/26/2024]
Abstract
Globally, fungal infections have become a major health concern in humans. Fungal diseases generally occur when an invading fungus appears on a specific portion of the body and becomes hard for the human immune system to resist. The recent emergence of COVID-19 has sharply increased various nosocomial fungal infections. The existing wet-laboratory-based medications are expensive, time-consuming, and may have adverse side effects on normal cells. In the last decade, peptide therapeutics have gained significant attention due to their high specificity in targeting affected cells without affecting healthy cells. Motivated by the significance of peptide-based therapies, we developed a highly discriminative prediction scheme called iAFPs-Mv-BiTCN to predict antifungal peptides correctly. The training peptides are encoded using word embedding methods such as skip-gram and the attention-based bidirectional encoder representations from transformers (BERT). Additionally, transform-based evolutionary features are generated using the pseudo position-specific scoring matrix with discrete wavelet transform (PsePSSM-DWT). The fused vector of word embedding and evolutionary descriptors is formed to compensate for the limitations of single encoding methods. A Shapley Additive exPlanations (SHAP) based global interpolation approach is applied to reduce training costs by choosing the optimal feature set. The selected feature set is trained using a bi-directional temporal convolutional network (BiTCN). The proposed iAFPs-Mv-BiTCN model achieved a predictive accuracy of 98.15% and an AUC of 0.99 using training samples. On the independent samples, our model obtained an accuracy of 94.11% and an AUC of 0.98. Our iAFPs-Mv-BiTCN model outperformed existing models with ~4% and ~5% higher accuracy using training and independent samples, respectively. The reliability and efficacy of the proposed iAFPs-Mv-BiTCN model make it a valuable tool for scientists, and it may play a beneficial role in pharmaceutical design and academic research.
Affiliation(s)
- Shahid Akbar
  - Institute of Fundamental and Frontier Sciences, University of Electronic Science and Technology of China, Chengdu 610054, China
  - Department of Computer Science, Abdul Wali Khan University Mardan, KP 23200, Pakistan
- Quan Zou
  - Institute of Fundamental and Frontier Sciences, University of Electronic Science and Technology of China, Chengdu 610054, China
  - Yangtze Delta Region Institute (Quzhou), University of Electronic Science and Technology of China, Quzhou 324000, PR China
- Ali Raza
  - Department of Physical and Numerical Sciences, Qurtuba University of Science and Information Technology, Peshawar, KP 25124, Pakistan
- Fawaz Khaled Alarfaj
  - Department of Management Information Systems (MIS), School of Business, King Faisal University (KFU), Al-Ahsa 31982, Saudi Arabia
9
Wu JS, Liu Y, Ge F, Yu DJ. Prediction of protein-ATP binding residues using multi-view feature learning via contextual-based co-attention network. Comput Biol Med 2024; 172:108227. [PMID: 38460308 DOI: 10.1016/j.compbiomed.2024.108227] [Received: 10/27/2023] [Revised: 01/17/2024] [Accepted: 02/25/2024] [Indexed: 03/11/2024]
Abstract
Accurately predicting protein-ATP binding residues is critical for protein function annotation and drug discovery. Computational methods dedicated to the prediction of binding residues based on protein sequence information have exhibited notable advancements in predictive accuracy. Nevertheless, these methods continue to grapple with several formidable challenges, including limited means of extracting more discriminative features and inadequate algorithms for integrating protein and residue information. To address the problems, we propose ATP-Deep, a novel protein-ATP binding residues predictor. ATP-Deep harnesses the capabilities of unsupervised pre-trained language models and incorporates domain-specific evolutionary context information from homologous sequences. It further refines the embedding at the residue level through integration with corresponding protein-level information and employs a contextual-based co-attention mechanism to adeptly fuse multiple sources of features. The performance evaluation results on the benchmark datasets reveal that ATP-Deep achieves an AUC of 0.954 and 0.951, respectively, surpassing the performance of the state-of-the-art model. These findings underscore the effectiveness of assimilating protein-level information and deploying a contextual-based co-attention mechanism grounded in context to bolster the prediction performance of protein-ATP binding residues.
Affiliation(s)
- Jia-Shun Wu
  - School of Computer Science and Engineering, Nanjing University of Science and Technology, 200 Xiaolingwei, Nanjing, 210094, China
- Yan Liu
  - School of Information Engineering, Yangzhou University, 196 West Huayang, Yangzhou, 225100, China
- Fang Ge
  - State Key Laboratory of Organic Electronics and Information Displays & Institute of Advanced Materials (IAM), Nanjing University of Posts & Telecommunications, 9 Wenyuan Road, Nanjing 210023, China
- Dong-Jun Yu
  - School of Computer Science and Engineering, Nanjing University of Science and Technology, 200 Xiaolingwei, Nanjing, 210094, China
10
Xu J, Ruan X, Yang J, Hu B, Li S, Hu J. SME-MFP: A novel spatiotemporal neural network with multiangle initialization embedding toward multifunctional peptides prediction. Comput Biol Chem 2024; 109:108033. [PMID: 38412804 DOI: 10.1016/j.compbiolchem.2024.108033] [Received: 11/11/2023] [Revised: 01/09/2024] [Accepted: 02/17/2024] [Indexed: 02/29/2024]
Abstract
As a promising alternative to conventional antibiotic drugs in the biomedical field, functional peptides have been widely used in disease treatment owing to their low toxicity, high absorption rate, and biological activity. Recently, several machine learning methods have been developed for functional peptide prediction. However, most existing research relies heavily on statistical features, and few studies consider multifunctional peptide identification. We therefore propose SME-MFP, a novel predictor for imbalanced multi-label functional peptide datasets. First, we employ physicochemical and evolutionary information to represent the initialization features of the peptide sequence from multiple perspectives. Second, the features are fused and passed to spatial feature extractors, where residual connections and a multiscale convolutional neural network extract more discriminative features from peptide sequences of different lengths. We also design AFT-based temporal feature extractors to fully capture the global interactions of the sequences. Finally, we devise a new loss to replace the traditional cross-entropy loss and address the class imbalance problem. The results show that our framework not only enhances the model's ability to capture sequence features effectively, but also improves accuracy by 3.89% over existing methods on public peptide datasets.
Affiliation(s)
- Jing Xu
  - State Key Laboratory of Public Big Data, Guizhou University, Guiyang 550025, China
- Xiaoli Ruan
  - State Key Laboratory of Public Big Data, Guizhou University, Guiyang 550025, China
- Jing Yang
  - State Key Laboratory of Public Big Data, Guizhou University, Guiyang 550025, China
- Bingqi Hu
  - State Key Laboratory of Public Big Data, Guizhou University, Guiyang 550025, China
- Shaobo Li
  - State Key Laboratory of Public Big Data, Guizhou University, Guiyang 550025, China
- Jianjun Hu
  - Department of Computer Science and Engineering, University of South Carolina, Columbia 29208, USA
11
Zhang S, Zhao Y, Liang Y. AACFlow: an end-to-end model based on attention augmented convolutional neural network and flow-attention mechanism for identification of anticancer peptides. Bioinformatics 2024; 40:btae142. [PMID: 38452348 PMCID: PMC10973939 DOI: 10.1093/bioinformatics/btae142] [Received: 12/24/2023] [Revised: 03/01/2024] [Accepted: 03/06/2024] [Indexed: 03/09/2024] [Open access]
Abstract
MOTIVATION Anticancer peptides (ACPs) have natural cationic properties and can act on the anionic cell membrane of cancer cells to kill cancer cells. Therefore, ACPs have become a potential anticancer drug with good research value and prospect. RESULTS In this article, we propose AACFlow, an end-to-end model for identification of ACPs based on deep learning. End-to-end models have more room to automatically adjust according to the data, making the overall fit better and reducing error propagation. The combination of attention augmented convolutional neural network (AAConv) and multi-layer convolutional neural network (CNN) forms a deep representation learning module, which is used to obtain global and local information on the sequence. Based on the concept of flow network, multi-head flow-attention mechanism is introduced to mine the deep features of the sequence to improve the efficiency of the model. On the independent test dataset, the ACC, Sn, Sp, and AUC values of AACFlow are 83.9%, 83.0%, 84.8%, and 0.892, respectively, which are 4.9%, 1.5%, 8.0%, and 0.016 higher than those of the baseline model. The MCC value is 67.85%. In addition, we visualize the features extracted by each module to enhance the interpretability of the model. Various experiments show that our model is more competitive in predicting ACPs.
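The abstract reports ACC, Sn, Sp, and MCC. As a reminder of how these four quantities relate to a binary confusion matrix, the sketch below computes them from raw counts using the standard definitions. The function name `binary_metrics` and the counts are made up for illustration and do not reproduce the AACFlow results.

```python
import math


def binary_metrics(tp, fp, tn, fn):
    """Compute accuracy, sensitivity, specificity, and Matthews correlation
    coefficient from confusion-matrix counts.

    Assumes both classes are present, i.e. tp + fn > 0 and tn + fp > 0.
    """
    acc = (tp + tn) / (tp + fp + tn + fn)
    sn = tp / (tp + fn)  # sensitivity: recall on the positive class
    sp = tn / (tn + fp)  # specificity: recall on the negative class
    denom = math.sqrt((tp + fp) * (tp + fn) * (tn + fp) * (tn + fn))
    mcc = (tp * tn - fp * fn) / denom if denom else 0.0
    return {"ACC": acc, "Sn": sn, "Sp": sp, "MCC": mcc}


# Toy counts, chosen only to exercise the formulas.
metrics = binary_metrics(tp=40, fp=10, tn=45, fn=5)
```

MCC ranges from -1 to 1, so a value quoted as a percentage, as in the abstract, corresponds to the coefficient multiplied by 100.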
Affiliation(s)
- Shengli Zhang
  - School of Mathematics and Statistics, Xidian University, Xi'an 710071, China
- Ya Zhao
  - School of Mathematics and Statistics, Xidian University, Xi'an 710071, China
- Yunyun Liang
  - School of Science, Xi'an Polytechnic University, Xi'an 710048, China
|
12
|
Ma Y, Zhang B, Liu Z, Liu Y, Wang J, Li X, Feng F, Ni Y, Li S. IAS-FET: An intelligent assistant system and an online platform for enhancing successful rate of in-vitro fertilization embryo transfer technology based on clinical features. Computer Methods and Programs in Biomedicine 2024; 245:108050. [PMID: 38301430 DOI: 10.1016/j.cmpb.2024.108050] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 11/14/2023] [Revised: 01/20/2024] [Accepted: 01/23/2024] [Indexed: 02/03/2024]
Abstract
BACKGROUND: Among all assisted reproductive technology (ART) methods, in vitro fertilization-embryo transfer (IVF-ET) holds a prominent position as a key solution for overcoming infertility. However, its success rate hovers at a modest 30% to 70%. Adding to the challenge is the absence of effective models and clinical tools capable of predicting the outcome of IVF-ET before embryo formation. Our study aims to fill this critical gap by predicting IVF-ET outcomes and ultimately enhancing the success rate of the procedure. METHODS: In this retrospective study, infertile patients who received artificial assisted pregnancy treatment at Gansu Provincial Maternity and Child-care Hospital in China were enrolled from 2016 to 2020. Individuals' clinical information was studied with a cascade XGBoost method to build an intelligent assistant system for predicting the outcome of IVF-ET, called IAS-FET. The cascade XGBoost model was trained using clinical information from 2292 couples and externally tested using clinical information from 573 couples. In addition, IAS-FET suggested several schemes to help patients adjust their physical condition and improve their ART success rate. RESULTS: IAS-FET predicted the outcome of IVF-ET with an area under the curve (AUC) of 0.8759 on the external test set. IAS-FET can also provide several schemes to improve the success rate of IVF-ET. The tool is available as a free online platform at http://www.cppdd.cn/ART. CONCLUSIONS: The results suggest a significant influence of personal clinical features on the success of ART. The proposed IAS-FET system, based on the top 27 factors, could be a promising tool to predict the outcome of ART and propose a plan for the patient's physical adjustment. With the help of IAS-FET, patients can take informed steps towards increasing their chances of a successful outcome on their journey to parenthood.
Affiliation(s)
- Ying Ma
  - Gansu Provincial Maternity and Child-care Hospital, Lanzhou, Gansu 730030, China
- Bowen Zhang
  - School of Medical Information and Engineering, Xuzhou Medical University, Xuzhou, Jiangsu 221004, China; School of Electronic Information and Communications, Huazhong University of Science and Technology, Wuhan, Hubei 430073, China
- Zhaoqing Liu
  - School of Medical Information and Engineering, Xuzhou Medical University, Xuzhou, Jiangsu 221004, China
- Yujie Liu
  - School of Medical Information and Engineering, Xuzhou Medical University, Xuzhou, Jiangsu 221004, China
- Jiarui Wang
  - School of Medical Information and Engineering, Xuzhou Medical University, Xuzhou, Jiangsu 221004, China
- Xingxuan Li
  - School of Chemistry and Chemical Engineering, Lanzhou University, Lanzhou, Gansu 730030, China
- Fan Feng
  - Gansu Provincial Maternity and Child-care Hospital, Lanzhou, Gansu 730030, China
- Yali Ni
  - Gansu Provincial Maternity and Child-care Hospital, Lanzhou, Gansu 730030, China
- Shuyan Li
  - School of Medical Information and Engineering, Xuzhou Medical University, Xuzhou, Jiangsu 221004, China
|
13
|
Feifei W, Wenrou S, Sining K, Siyu Z, Xiaolei F, Junxiang L, Congfen H, Xuhui L. A novel functional peptide, named EQ-9 (ESETRILLQ), identified by virtual screening from regenerative cell secretome and its potential anti-aging and restoration effects in topical applications. Peptides 2023; 169:171078. [PMID: 37579838 DOI: 10.1016/j.peptides.2023.171078] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 04/24/2023] [Revised: 08/10/2023] [Accepted: 08/11/2023] [Indexed: 08/16/2023]
Abstract
Skin aging is a degenerative process that is affected and regulated by intrinsic and extrinsic factors. The mesenchymal stem cell secretome contains a considerable number of regenerative molecules with anti-aging effects in a wide variety of circumstances. However, identifying specific compounds among thousands of natural molecules using conventional methods is complex, time-consuming, and costly. Drawing on computational biology and machine learning, an efficient workflow was developed to identify novel peptides with anti-aging and skin restoration potential. One candidate peptide was discovered and subsequently truncated to a novel peptide named EQ-9, which showed promising anti-aging effects for topical applications at a concentration of 10 ppm, as confirmed by experimental validation. The paradigm described above is expected to be further applied to the virtual screening of novel peptide molecules targeting specific biological functions from a wide variety of natural resources.
Affiliation(s)
- Wang Feifei
  - Yunnan Botanee Bio-technology Group Co., Ltd., Yunnan, China; Yunnan Yunke Characteristic Plant Extraction Laboratory Co., Ltd., Yunnan, China
- Su Wenrou
  - Yunnan Botanee Bio-technology Group Co., Ltd., Yunnan, China; Yunnan Yunke Characteristic Plant Extraction Laboratory Co., Ltd., Yunnan, China
- Kang Sining
  - AGECODE R&D Center, Yangtze Delta Region Institute of Tsinghua University, Zhejiang, China; Harvest Biotech (Zhejiang) Co., Ltd., Zhejiang, China
- Zhu Siyu
  - AGECODE R&D Center, Yangtze Delta Region Institute of Tsinghua University, Zhejiang, China; Harvest Biotech (Zhejiang) Co., Ltd., Zhejiang, China
- Fu Xiaolei
  - AGECODE R&D Center, Yangtze Delta Region Institute of Tsinghua University, Zhejiang, China; Harvest Biotech (Zhejiang) Co., Ltd., Zhejiang, China
- Li Junxiang
  - AGECODE R&D Center, Yangtze Delta Region Institute of Tsinghua University, Zhejiang, China; Harvest Biotech (Zhejiang) Co., Ltd., Zhejiang, China
- He Congfen
  - Beijing Technology and Business University, Beijing Key Lab of Plant Resources Research and Development, Beijing, China
- Li Xuhui
  - AGECODE R&D Center, Yangtze Delta Region Institute of Tsinghua University, Zhejiang, China; Zhejiang Provincial Key Laboratory of Applied Enzymology, Yangtze Delta Region Institute of Tsinghua University, Zhejiang, China
|
14
|
Yao L, Zhang Y, Li W, Chung C, Guan J, Zhang W, Chiang Y, Lee T. DeepAFP: An effective computational framework for identifying antifungal peptides based on deep learning. Protein Sci 2023; 32:e4758. [PMID: 37595093 PMCID: PMC10503419 DOI: 10.1002/pro.4758] [Citation(s) in RCA: 2] [Impact Index Per Article: 2.0] [Received: 05/05/2023] [Revised: 08/02/2023] [Accepted: 08/10/2023] [Indexed: 08/20/2023]
Abstract
Fungal infections have become a significant global health issue, affecting millions worldwide. Antifungal peptides (AFPs) have emerged as a promising alternative to conventional antifungal drugs due to their low toxicity and low propensity for inducing resistance. In this study, we developed a deep learning-based framework called DeepAFP to efficiently identify AFPs. DeepAFP fully leverages and mines composition information, evolutionary information, and physicochemical properties of peptides by employing combined kernels from multiple branches of convolutional neural network with bi-directional long short-term memory layers. In addition, DeepAFP integrates a transfer learning strategy to obtain efficient representations of peptides for improving model performance. DeepAFP demonstrates strong predictive ability on carefully curated datasets, yielding an accuracy of 93.29% and an F1-score of 93.45% on the DeepAFP-Main dataset. The experimental results show that DeepAFP outperforms existing AFP prediction tools, achieving state-of-the-art performance. Finally, we provide a downloadable AFP prediction tool to meet the demands of large-scale prediction and facilitate the usage of our framework by the public or other researchers. Our framework can accurately identify AFPs in a short time without requiring significant human and material resources, and hence can accelerate the development of AFPs as well as contribute to the treatment of fungal infections. Furthermore, our method can provide new perspectives for other biological sequence analysis tasks.
Affiliation(s)
- Lantian Yao
  - Kobilka Institute of Innovative Drug Discovery, School of Medicine, The Chinese University of Hong Kong, Shenzhen, China
  - School of Science and Engineering, The Chinese University of Hong Kong, Shenzhen, China
- Yuntian Zhang
  - School of Medicine, The Chinese University of Hong Kong, Shenzhen, China
- Wenshuo Li
  - School of Science and Engineering, The Chinese University of Hong Kong, Shenzhen, China
- Chia‐Ru Chung
  - Department of Computer Science and Information Engineering, National Central University, Taoyuan, Taiwan
- Jiahui Guan
  - School of Medicine, The Chinese University of Hong Kong, Shenzhen, China
- Wenyang Zhang
  - School of Medicine, The Chinese University of Hong Kong, Shenzhen, China
- Ying‐Chih Chiang
  - Kobilka Institute of Innovative Drug Discovery, School of Medicine, The Chinese University of Hong Kong, Shenzhen, China
  - School of Medicine, The Chinese University of Hong Kong, Shenzhen, China
- Tzong‐Yi Lee
  - Institute of Bioinformatics and Systems Biology, National Yang Ming Chiao Tung University, Hsinchu, Taiwan
  - Center for Intelligent Drug Systems and Smart Bio‐devices (IDS2B), National Yang Ming Chiao Tung University, Hsinchu, Taiwan
|