1
Kundu P, Beura S, Mondal S, Das AK, Ghosh A. Machine learning for the advancement of genome-scale metabolic modeling. Biotechnol Adv 2024; 74:108400. [PMID: 38944218 DOI: 10.1016/j.biotechadv.2024.108400] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 10/25/2023] [Revised: 05/13/2024] [Accepted: 06/23/2024] [Indexed: 07/01/2024]
Abstract
Constraint-based modeling (CBM) has evolved as the core systems biology tool to map the interrelations between genotype, phenotype, and external environment. The recent advancement of high-throughput experimental approaches and multi-omics strategies has generated a plethora of new and precise information from wide-ranging biological domains. On the other hand, the continuously growing field of machine learning (ML) and its specialized branch of deep learning (DL) provide essential computational architectures for decoding complex and heterogeneous biological data. In recent years, both multi-omics and ML have assisted in the escalation of CBM. Condition-specific omics data, such as transcriptomics and proteomics, helped contextualize the model prediction while analyzing a particular phenotypic signature. At the same time, the advanced ML tools have eased the model reconstruction and analysis to increase the accuracy and prediction power. However, the development of these multi-disciplinary methodological frameworks mainly occurs independently, which limits the concatenation of biological knowledge from different domains. Hence, we have reviewed the potential of integrating multi-disciplinary tools and strategies from various fields, such as synthetic biology, CBM, omics, and ML, to explore the biochemical phenomenon beyond the conventional biological dogma. How the integrative knowledge of these intersected domains has improved bioengineering and biomedical applications has also been highlighted. We categorically explained the conventional genome-scale metabolic model (GEM) reconstruction tools and their improvement strategies through ML paradigms. Further, the crucial role of ML and DL in omics data restructuring for GEM development has also been briefly discussed. Finally, the case-study-based assessment of the state-of-the-art method for improving biomedical and metabolic engineering strategies has been elaborated. 
Therefore, this review demonstrates how integrating experimental and in silico strategies can help map the ever-expanding knowledge of biological systems driven by condition-specific cellular information. This multiview approach will elevate the application of ML-based CBM in the biomedical and bioengineering fields for the betterment of society and the environment.
Affiliation(s)
- Pritam Kundu
  - School of Energy Science and Engineering, Indian Institute of Technology Kharagpur, West Bengal 721302, India
- Satyajit Beura
  - Department of Bioscience and Biotechnology, Indian Institute of Technology, Kharagpur, West Bengal 721302, India
- Suman Mondal
  - P.K. Sinha Centre for Bioenergy and Renewables, Indian Institute of Technology Kharagpur, West Bengal 721302, India
- Amit Kumar Das
  - Department of Bioscience and Biotechnology, Indian Institute of Technology, Kharagpur, West Bengal 721302, India
- Amit Ghosh
  - School of Energy Science and Engineering, Indian Institute of Technology Kharagpur, West Bengal 721302, India
  - P.K. Sinha Centre for Bioenergy and Renewables, Indian Institute of Technology Kharagpur, West Bengal 721302, India
2
Bhattarai S, Tayara H, Chong KT. Advancing Peptide-Based Cancer Therapy with AI: In-Depth Analysis of State-of-the-Art AI Models. J Chem Inf Model 2024; 64:4941-4957. [PMID: 38874445 DOI: 10.1021/acs.jcim.4c00295] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 06/15/2024]
Abstract
Anticancer peptides (ACPs) play a vital role in selectively targeting and eliminating cancer cells. Evaluating and comparing predictions from various machine learning (ML) and deep learning (DL) techniques is challenging but crucial for anticancer drug research. We conducted a comprehensive analysis of 15 ML and 10 DL models, including the models released after 2022, and found that support vector machines (SVMs) with feature combination and selection significantly enhance overall performance. DL models, especially convolutional neural networks (CNNs) with light gradient boosting machine (LGBM) based feature selection approaches, demonstrate improved characterization. Assessment using a new test data set (ACP10) identifies ACPred, MLACP 2.0, AI4ACP, mACPred, and AntiCP2.0_AAC as successive optimal predictors, showcasing robust performance. Our review underscores current prediction tool limitations and advocates for an omnidirectional ACP prediction framework to propel ongoing research.
Affiliation(s)
- Sadik Bhattarai
  - Department of Electronics and Information Engineering, Jeonbuk National University, Jeonju-si, 54896 Jeollabuk-do, South Korea
- Hilal Tayara
  - School of International Engineering and Science, Jeonbuk National University, Jeonju-si, 54896 Jeollabuk-do, South Korea
- Kil To Chong
  - Department of Electronics and Information Engineering, Jeonbuk National University, Jeonju-si, 54896 Jeollabuk-do, South Korea
  - Advanced Electronics and Information Research Center, Jeonbuk National University, Jeonju-si, 54896 Jeollabuk-do, South Korea
3
Kang Y, Zhang H, Wang X, Yang Y, Jia Q. MMDB: Multimodal dual-branch model for multi-functional bioactive peptide prediction. Anal Biochem 2024; 690:115491. [PMID: 38460901 DOI: 10.1016/j.ab.2024.115491] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 11/12/2023] [Revised: 01/21/2024] [Accepted: 02/19/2024] [Indexed: 03/11/2024]
Abstract
Bioactive peptides can hinder oxidative processes and microbial spoilage in foodstuffs and play important roles in treating diverse diseases and disorders. While most of the methods focus on single-functional bioactive peptides and have obtained promising prediction performance, it is still a significant challenge to accurately detect complex and diverse functions simultaneously with the quick increase of multi-functional bioactive peptides. In contrast to previous research on multi-functional bioactive peptide prediction based solely on sequence, we propose a novel multimodal dual-branch (MMDB) lightweight deep learning model that designs two different branches to effectively capture the complementary information of peptide sequence and structural properties. Specifically, a multi-scale dilated convolution with Bi-LSTM branch is presented to effectively model the different scales sequence properties of peptides while a multi-layer convolution branch is proposed to capture structural information. To the best of our knowledge, this is the first effective extraction of peptide sequence features using multi-scale dilated convolution without parameter increase. Multimodal features from both branches are integrated via a fully connected layer for multi-label classification. Compared to state-of-the-art methods, our MMDB model exhibits competitive results across metrics, with a 9.1% Coverage increase and 5.3% and 3.5% improvements in Precision and Accuracy, respectively.
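The "multi-scale dilated convolution without parameter increase" mentioned in this abstract can be illustrated with a minimal sketch. It is not the MMDB implementation; the function names, the toy signal, and the kernel are assumptions chosen for illustration. The point is that a 1D kernel of size k applied with dilation d covers (k - 1) * d + 1 positions while still using only k weights, so several scales can share one small kernel.

```python
def receptive_field(kernel_size, dilation):
    """Number of input positions covered by one output position."""
    return (kernel_size - 1) * dilation + 1


def dilated_conv1d(signal, kernel, dilation=1):
    """Valid-mode 1D dilated cross-correlation (no padding, stride 1)."""
    k = len(kernel)
    span = receptive_field(k, dilation)
    out = []
    for start in range(len(signal) - span + 1):
        # Kernel taps are spaced `dilation` positions apart.
        out.append(sum(kernel[j] * signal[start + j * dilation] for j in range(k)))
    return out


# Toy input (hypothetical): a ramp standing in for one feature channel.
signal = [1, 2, 3, 4, 5, 6, 7, 8, 9]
kernel = [1, 0, -1]  # the same three weights are reused at every scale

# One output per dilation rate: wider context, identical parameter count.
multi_scale = {d: dilated_conv1d(signal, kernel, d) for d in (1, 2, 3)}
```

In a real model the per-scale outputs would be padded to a common length and concatenated or summed before the recurrent layer; that step is omitted here.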
Affiliation(s)
- Yan Kang
  - National Pilot School of Software, Yunnan University, Kunming, 650091, Yunnan, China
  - Yunnan Key Laboratory of Software Engineering, China
- Huadong Zhang
  - National Pilot School of Software, Yunnan University, Kunming, 650091, Yunnan, China
- Xinchao Wang
  - National Pilot School of Software, Yunnan University, Kunming, 650091, Yunnan, China
- Yun Yang
  - National Pilot School of Software, Yunnan University, Kunming, 650091, Yunnan, China
  - Yunnan Key Laboratory of Software Engineering, China
- Qi Jia
  - School of Information Science, Yunnan University, Kunming, 650091, Yunnan, China
4
Ghafoor H, Asim MN, Ibrahim MA, Ahmed S, Dengel A. CAPTURE: Comprehensive anti-cancer peptide predictor with a unique amino acid sequence encoder. Comput Biol Med 2024; 176:108538. [PMID: 38759585 DOI: 10.1016/j.compbiomed.2024.108538] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 01/08/2024] [Revised: 04/26/2024] [Accepted: 04/28/2024] [Indexed: 05/19/2024]
Abstract
Anticancer peptides (ACPs) key properties including bioactivity, high efficacy, low toxicity, and lack of drug resistance make them ideal candidates for cancer therapies. To deeply explore the potential of ACPs and accelerate development of cancer therapies, although 53 Artificial Intelligence supported computational predictors have been developed for ACPs and non ACPs classification but only one predictor has been developed for ACPs functional types annotations. Moreover, these predictors extract amino acids distribution patterns to transform peptides sequences into statistical vectors that are further fed to classifiers for discriminating peptides sequences and annotating peptides functional classes. Overall, these predictors remain fail in extracting diverse types of amino acids distribution patterns from peptide sequences. The paper in hand presents a unique CARE encoder that transforms peptides sequences into statistical vectors by extracting 4 different types of distribution patterns including correlation, distribution, composition, and transition. Across public benchmark dataset, proposed encoder potential is explored under two different evaluation settings namely; intrinsic and extrinsic. Extrinsic evaluation indicates that 12 different machine learning classifiers achieve superior performance with the proposed encoder as compared to 55 existing encoders. Furthermore, an intrinsic evaluation reveals that, unlike existing encoders, the proposed encoder generates more discriminative clusters for ACPs and non-ACPs classes. Across 8 public benchmark ACPs and non-ACPs classification datasets, proposed encoder and Adaboost classifier based CAPTURE predictor outperforms existing predictors with an average accuracy, recall and MCC score of 1%, 4%, and 2% respectively. In generalizeability evaluation case study, across 7 benchmark anti-microbial peptides classification datasets, CAPTURE surpasses existing predictors by an average AU-ROC of 2%. 
CAPTURE predictive pipeline along with label powerset method outperforms state-of-the-art ACPs functional types predictor by 5%, 5%, 5%, 6%, and 3% in terms of average accuracy, subset accuracy, precision, recall, and F1 respectively. CAPTURE web application is available at https://sds_genetic_analysis.opendfki.de/CAPTURE.
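The abstract describes encoders that turn a peptide sequence into a statistical vector, with "composition" as one of the four pattern types. A minimal sketch of the simplest such descriptor, plain amino acid composition, follows. This is the generic AAC descriptor, not the CARE encoder, whose details the abstract does not give; the function name and the example peptide are made up for illustration.

```python
# The 20 standard amino acids, in a fixed order that defines the vector layout.
AMINO_ACIDS = "ACDEFGHIKLMNPQRSTVWY"


def amino_acid_composition(peptide):
    """Return a 20-dimensional vector of residue frequencies for one peptide."""
    peptide = peptide.upper()
    if not peptide:
        raise ValueError("peptide must not be empty")
    length = len(peptide)
    # Fraction of the sequence made up by each standard residue.
    return [peptide.count(aa) / length for aa in AMINO_ACIDS]


# Hypothetical example peptide (not taken from any benchmark dataset).
vector = amino_acid_composition("GLFKKLLK")
```

A vector like this is what gets handed to a downstream classifier. Transition, distribution, and correlation descriptors extend the same idea to residue order and position.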
Affiliation(s)
- Hina Ghafoor
  - Department of Computer Science, Rhineland-Palatinate Technical University of Kaiserslautern-Landau, Kaiserslautern, 67663, Germany
  - German Research Center for Artificial Intelligence GmbH, Kaiserslautern, 67663, Germany
- Muhammad Nabeel Asim
  - German Research Center for Artificial Intelligence GmbH, Kaiserslautern, 67663, Germany
- Muhammad Ali Ibrahim
  - Department of Computer Science, Rhineland-Palatinate Technical University of Kaiserslautern-Landau, Kaiserslautern, 67663, Germany
  - German Research Center for Artificial Intelligence GmbH, Kaiserslautern, 67663, Germany
- Sheraz Ahmed
  - German Research Center for Artificial Intelligence GmbH, Kaiserslautern, 67663, Germany
- Andreas Dengel
  - Department of Computer Science, Rhineland-Palatinate Technical University of Kaiserslautern-Landau, Kaiserslautern, 67663, Germany
  - German Research Center for Artificial Intelligence GmbH, Kaiserslautern, 67663, Germany
5
Song H, Lin X, Zhang H, Yin H. ACP-ESM2: The prediction of anticancer peptides based on pre-trained classifier. Comput Biol Chem 2024; 110:108091. [PMID: 38735271 DOI: 10.1016/j.compbiolchem.2024.108091] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 02/03/2024] [Revised: 04/07/2024] [Accepted: 04/29/2024] [Indexed: 05/14/2024]
Abstract
Anticancer peptides (ACPs) are a type of protein molecule that has anti-cancer activity and can inhibit cancer cell growth and survival. Traditional classification approaches for ACPs are expensive and time-consuming. This paper proposes a pre-trained classifier model, ESM2-GRU, for ACP prediction to make it easier to predict ACPs, gain a better understanding of the structural and functional differences of anti-cancer peptides, and optimize the design for the development of more effective anti-cancer treatment strategies. The model is made up of the ESM2 pre-trained model, a bidirectional GRU recurrent neural network, and a fully connected layer. ACP sequences are first fed into the ESM2 model, which then expands the dimensions before feeding the findings back into the bidirectional GRU recurrent neural network. Finally, the fully connected layer generates the ultimate output. Experimental validation demonstrates that the ESM2-GRU model greatly improves classification performance on the benchmark dataset ACP606, with AUC, ACC, and MCC values of 0.975, 0.852, and 0.738, respectively. This exceptional prediction potential helps to identify specific types of anti-cancer peptides, improving their targeting and selectivity and, therefore, furthering the development of tailored medicine and treatments.
Affiliation(s)
- Huijia Song
  - School of Information Engineering, Beijing Institute of Petrochemical Technology, Beijing 102617, China
- Xiaozhu Lin
  - School of Information Engineering, Beijing Institute of Petrochemical Technology, Beijing 102617, China
- Huainian Zhang
  - School of Information Engineering, Beijing Institute of Petrochemical Technology, Beijing 102617, China
- Huijuan Yin
  - School of Information Engineering, Beijing Institute of Petrochemical Technology, Beijing 102617, China
6
Liao YH, Chen SZ, Bin YN, Zhao JP, Feng XL, Zheng CH. UsIL-6: An unbalanced learning strategy for identifying IL-6 inducing peptides by undersampling technique. Comput Methods Programs Biomed 2024; 250:108176. [PMID: 38677081 DOI: 10.1016/j.cmpb.2024.108176] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 08/18/2022] [Revised: 03/26/2024] [Accepted: 04/11/2024] [Indexed: 04/29/2024]
Abstract
BACKGROUND AND OBJECTIVE Interleukin-6 (IL-6) is the critical factor of early warning, monitoring, and prognosis in the inflammatory storm of COVID-19 cases. IL-6 inducing peptides, which can induce cytokine IL-6 production, are very important for the development of diagnosis and immunotherapy. Although the existing methods have some success in predicting IL-6 inducing peptides, there is still room for improvement in the performance of these models in practical application. METHODS In this study, we proposed UsIL-6, a high-performance bioinformatics tool for identifying IL-6 inducing peptides. First, we extracted five groups of physicochemical properties and sequence structural information from IL-6 inducing peptide sequences, and obtained a 636-dimensional feature vector, we also employed NearMiss3 undersampling method and normalization method StandardScaler to process the data. Then, a 40-dimensional optimal feature vector was obtained by Boruta feature selection method. Finally, we combined this feature vector with extreme randomization tree classifier to build the final model UsIL-6. RESULTS The AUC value of UsIL-6 on the independent test dataset was 0.87, and the BACC value was 0.808, which indicated that UsIL-6 had better performance than the existing methods in IL-6 inducing peptide recognition. CONCLUSIONS The performance comparison on independent test dataset confirmed that UsIL-6 could achieve the highest performance, best robustness, and most excellent generalization ability. We hope that UsIL-6 will become a valuable method to identify, annotate and characterize new IL-6 inducing peptides.
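The abstract reports a BACC value, presumably balanced accuracy, which is the usual companion to AUC when classes are unbalanced. A minimal sketch of the metric follows, assuming binary labels with 1 for the positive class; the labels below are invented for illustration and are not taken from the UsIL-6 data.

```python
def balanced_accuracy(y_true, y_pred):
    """Mean of sensitivity (recall on positives) and specificity (recall on negatives)."""
    tp = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 1)
    fn = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 0)
    tn = sum(1 for t, p in zip(y_true, y_pred) if t == 0 and p == 0)
    fp = sum(1 for t, p in zip(y_true, y_pred) if t == 0 and p == 1)
    if tp + fn == 0 or tn + fp == 0:
        raise ValueError("both classes must be present in y_true")
    sensitivity = tp / (tp + fn)
    specificity = tn / (tn + fp)
    return (sensitivity + specificity) / 2


# Made-up unbalanced test set: 2 positives, 8 negatives.
y_true = [1, 1, 0, 0, 0, 0, 0, 0, 0, 0]
# A degenerate classifier that always predicts the majority class.
y_pred = [0] * 10

plain_accuracy = sum(t == p for t, p in zip(y_true, y_pred)) / len(y_true)
bacc = balanced_accuracy(y_true, y_pred)
```

The degenerate classifier scores well on plain accuracy but only at chance level on balanced accuracy, which is why the latter is preferred for unbalanced peptide datasets.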
Affiliation(s)
- Yan-Hong Liao
  - School of Mathematics and System Science, Xinjiang University, Urumqi, Xinjiang 830017, China
- Shou-Zhi Chen
  - School of Mathematics and System Science, Xinjiang University, Urumqi, Xinjiang 830017, China
- Yan-Nan Bin
  - School of Computer Science and Technology, Anhui University, Hefei, Anhui 230601, China
- Jian-Ping Zhao
  - School of Mathematics and System Science, Xinjiang University, Urumqi, Xinjiang 830017, China
- Xin-Long Feng
  - School of Mathematics and System Science, Xinjiang University, Urumqi, Xinjiang 830017, China
- Chun-Hou Zheng
  - School of Mathematics and System Science, Xinjiang University, Urumqi, Xinjiang 830017, China
  - School of Computer Science and Technology, Anhui University, Hefei, Anhui 230601, China
7
Chen Z, Wang R, Guo J, Wang X. The role and future prospects of artificial intelligence algorithms in peptide drug development. Biomed Pharmacother 2024; 175:116709. [PMID: 38713945 DOI: 10.1016/j.biopha.2024.116709] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 03/10/2024] [Revised: 05/01/2024] [Accepted: 05/02/2024] [Indexed: 05/09/2024] Open
Abstract
Peptide medications have been more well-known in recent years due to their many benefits, including low side effects, high biological activity, specificity, effectiveness, and so on. Over 100 peptide medications have been introduced to the market to treat a variety of illnesses. Most of these peptide medications are developed on the basis of endogenous peptides or natural peptides, which frequently required expensive, time-consuming, and extensive tests to confirm. As artificial intelligence advances quickly, it is now possible to build machine learning or deep learning models that screen a large number of candidate sequences for therapeutic peptides. Therapeutic peptides, such as those with antibacterial or anticancer properties, have been developed by the application of artificial intelligence algorithms. The process of finding and developing peptide drugs is outlined in this review, along with a few related cases that were helped by AI and conventional methods. These resources will open up new avenues for peptide drug development and discovery, helping to meet the pressing needs of clinical patients for disease treatment. Although peptide drugs are a new class of biopharmaceuticals that distinguish them from chemical and small molecule drugs, their clinical purpose and value cannot be ignored. However, the traditional peptide drug research and development has a long development cycle and high investment, and the creation of peptide medications will be substantially hastened by the AI-assisted (AI+) mode, offering a new boost for combating diseases.
Affiliation(s)
- Zhiheng Chen
  - School of Biological Science and Medical Engineering, Beihang University, Beijing 100083, China
- Ruoxi Wang
  - School of Biological Science and Medical Engineering, Beihang University, Beijing 100083, China
- Junqi Guo
  - School of Biological Science and Medical Engineering, Beihang University, Beijing 100083, China
- Xiaogang Wang
  - Guangdong Provincial Key Laboratory of Bone and Joint Degenerative Diseases, The Third Affiliated Hospital of Southern Medical University, Guangzhou, Guangdong 510630, China
8
Fang Y, Luo M, Ren Z, Wei L, Wei DQ. CELA-MFP: a contrast-enhanced and label-adaptive framework for multi-functional therapeutic peptides prediction. Brief Bioinform 2024; 25:bbae348. [PMID: 39038935 PMCID: PMC11262836 DOI: 10.1093/bib/bbae348] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 04/03/2024] [Revised: 05/27/2024] [Accepted: 07/08/2024] [Indexed: 07/24/2024] Open
Abstract
Functional peptides play crucial roles in various biological processes and hold significant potential in many fields such as drug discovery and biotechnology. Accurately predicting the functions of peptides is essential for understanding their diverse effects and designing peptide-based therapeutics. Here, we propose CELA-MFP, a deep learning framework that incorporates feature Contrastive Enhancement and Label Adaptation for predicting Multi-Functional therapeutic Peptides. CELA-MFP utilizes a protein language model (pLM) to extract features from peptide sequences, which are then fed into a Transformer decoder for function prediction, effectively modeling correlations between different functions. To enhance the representation of each peptide sequence, contrastive learning is employed during training. Experimental results demonstrate that CELA-MFP outperforms state-of-the-art methods on most evaluation metrics for two widely used datasets, MFBP and MFTP. The interpretability of CELA-MFP is demonstrated by visualizing attention patterns in pLM and Transformer decoder. Finally, a user-friendly online server for predicting multi-functional peptides is established as the implementation of the proposed CELA-MFP and can be freely accessed at http://dreamai.cmii.online/CELA-MFP.
Affiliation(s)
- Yitian Fang
  - State Key Laboratory of Microbial Metabolism, Joint International Research Laboratory of Metabolic and Developmental Sciences, School of Life Sciences and Biotechnology, Shanghai Jiao Tong University, 800 Dongchuan Road, Shanghai 200240, China
  - Peng Cheng Laboratory, 2 Xingke 1st Street, Nanshan District, Shenzhen 518055, China
- Mingshuang Luo
  - Peng Cheng Laboratory, 2 Xingke 1st Street, Nanshan District, Shenzhen 518055, China
- Zhixiang Ren
  - Peng Cheng Laboratory, 2 Xingke 1st Street, Nanshan District, Shenzhen 518055, China
- Leyi Wei
  - Centre for Artificial Intelligence Driven Drug Discovery, Faculty of Applied Sciences, Macao Polytechnic University, R. de Luís Gonzaga Gomes, Macao 999078, China
  - School of Informatics, Xiamen University, 422 Siming South Road, Xiamen 361005, China
- Dong-Qing Wei
  - State Key Laboratory of Microbial Metabolism, Joint International Research Laboratory of Metabolic and Developmental Sciences, School of Life Sciences and Biotechnology, Shanghai Jiao Tong University, 800 Dongchuan Road, Shanghai 200240, China
  - Peng Cheng Laboratory, 2 Xingke 1st Street, Nanshan District, Shenzhen 518055, China
9
Karakaya O, Kilimci ZH. An efficient consolidation of word embedding and deep learning techniques for classifying anticancer peptides: FastText+BiLSTM. PeerJ Comput Sci 2024; 10:e1831. [PMID: 38435607 PMCID: PMC10909209 DOI: 10.7717/peerj-cs.1831] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 09/19/2023] [Accepted: 12/31/2023] [Indexed: 03/05/2024]
Abstract
Anticancer peptides (ACPs) are a group of peptides that exhibit antineoplastic properties. The utilization of ACPs in cancer prevention can present a viable substitute for conventional cancer therapeutics, as they possess a higher degree of selectivity and safety. Recent scientific advancements generate an interest in peptide-based therapies which offer the advantage of efficiently treating intended cells without negatively impacting normal cells. However, as the number of peptide sequences continues to increase rapidly, developing a reliable and precise prediction model becomes a challenging task. In this work, our motivation is to advance an efficient model for categorizing anticancer peptides employing the consolidation of word embedding and deep learning models. First, Word2Vec, GloVe, FastText, One-Hot-Encoding approaches are evaluated as embedding techniques for the purpose of extracting peptide sequences. Then, the output of embedding models are fed into deep learning approaches CNN, LSTM, BiLSTM. To demonstrate the contribution of proposed framework, extensive experiments are carried on widely-used datasets in the literature, ACPs250 and independent. Experiment results show the usage of proposed model enhances classification accuracy when compared to the state-of-the-art studies. The proposed combination, FastText+BiLSTM, exhibits 92.50% of accuracy for ACPs250 dataset, and 96.15% of accuracy for the Independent dataset, thence determining new state-of-the-art.
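Of the four embedding approaches this abstract compares, one-hot encoding is the simplest baseline and can be sketched in a few lines. The code below is an illustrative assumption rather than the authors' pipeline: the residue ordering, the zero-padding to a fixed length, and the treatment of unknown residues as all-zero rows are choices made here for the example.

```python
# The 20 standard amino acids; the position in this string is the "hot" index.
AMINO_ACIDS = "ACDEFGHIKLMNPQRSTVWY"
INDEX = {aa: i for i, aa in enumerate(AMINO_ACIDS)}


def one_hot_encode(peptide, max_len=None):
    """Encode a peptide as a list of 20-dimensional indicator rows."""
    rows = []
    for residue in peptide.upper():
        row = [0] * len(AMINO_ACIDS)
        if residue in INDEX:  # non-standard residues stay all-zero (assumption)
            row[INDEX[residue]] = 1
        rows.append(row)
    if max_len is not None:
        # Truncate or zero-pad so every sequence has the same number of rows,
        # as a recurrent or convolutional model fed in batches would need.
        rows = rows[:max_len]
        rows += [[0] * len(AMINO_ACIDS) for _ in range(max_len - len(rows))]
    return rows


# Hypothetical three-residue peptide padded to length 5.
encoded = one_hot_encode("ACK", max_len=5)
```

Learned embeddings such as Word2Vec, GloVe, or FastText replace these sparse indicator rows with dense vectors trained on the sequence corpus.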
Affiliation(s)
- Onur Karakaya
  - Research and Development Inc., Turkcell Technology, İstanbul, Turkey
- Zeynep Hilal Kilimci
  - Department of Information Systems Engineering, Kocaeli University, Kocaeli, Turkey
10
Liu GY, Yu D, Fan MM, Zhang X, Jin ZY, Tang C, Liu XF. Antimicrobial resistance crisis: could artificial intelligence be the solution? Mil Med Res 2024; 11:7. [PMID: 38254241 PMCID: PMC10804841 DOI: 10.1186/s40779-024-00510-1] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 05/18/2023] [Accepted: 01/08/2024] [Indexed: 01/24/2024] Open
Abstract
Antimicrobial resistance is a global public health threat, and the World Health Organization (WHO) has announced a priority list of the most threatening pathogens against which novel antibiotics need to be developed. The discovery and introduction of novel antibiotics are time-consuming and expensive. According to WHO's report of antibacterial agents in clinical development, only 18 novel antibiotics have been approved since 2014. Therefore, novel antibiotics are critically needed. Artificial intelligence (AI) has been rapidly applied to drug development since its recent technical breakthrough and has dramatically improved the efficiency of the discovery of novel antibiotics. Here, we first summarized recently marketed novel antibiotics, and antibiotic candidates in clinical development. In addition, we systematically reviewed the involvement of AI in antibacterial drug development and utilization, including small molecules, antimicrobial peptides, phage therapy, essential oils, as well as resistance mechanism prediction, and antibiotic stewardship.
Affiliation(s)
- Guang-Yu Liu
  - Department of Immunology and Pathogen Biology, School of Basic Medical Sciences, Hangzhou Normal University, Key Laboratory of Aging and Cancer Biology of Zhejiang Province, Key Laboratory of Inflammation and Immunoregulation of Hangzhou, Hangzhou Normal University, Hangzhou, 311121, China
- Dan Yu
  - National Key Discipline of Pediatrics Key Laboratory of Major Diseases in Children Ministry of Education, Laboratory of Dermatology, Beijing Pediatric Research Institute, Beijing Children's Hospital, Capital Medical University, National Center for Children's Health, Beijing, 100045, China
- Mei-Mei Fan
  - Department of Immunology and Pathogen Biology, School of Basic Medical Sciences, Hangzhou Normal University, Key Laboratory of Aging and Cancer Biology of Zhejiang Province, Key Laboratory of Inflammation and Immunoregulation of Hangzhou, Hangzhou Normal University, Hangzhou, 311121, China
- Xu Zhang
  - Robert and Arlene Kogod Center on Aging, Mayo Clinic, Rochester, MN, 55905, USA
  - Department of Biochemistry and Molecular Biology, Mayo Clinic, Rochester, MN, 55905, USA
- Ze-Yu Jin
  - Department of Molecular and Human Genetics, Baylor College of Medicine, Houston, TX, 77030, USA
- Christoph Tang
  - Sir William Dunn School of Pathology, University of Oxford, Oxford, OX1 3RE, UK
- Xiao-Fen Liu
  - Institute of Antibiotics, Huashan Hospital, Fudan University, Key Laboratory of Clinical Pharmacology of Antibiotics, National Health Commission of the People's Republic of China, National Clinical Research Centre for Aging and Medicine, Huashan Hospital, Fudan University, Shanghai, 200040, China
11
Purohit K, Reddy N, Sunna A. Exploring the Potential of Bioactive Peptides: From Natural Sources to Therapeutics. Int J Mol Sci 2024; 25:1391. [PMID: 38338676 PMCID: PMC10855437 DOI: 10.3390/ijms25031391] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 12/01/2023] [Revised: 01/18/2024] [Accepted: 01/21/2024] [Indexed: 02/12/2024] Open
Abstract
Bioactive peptides, specific protein fragments with positive health effects, are gaining traction in drug development for advantages like enhanced penetration, low toxicity, and rapid clearance. This comprehensive review navigates the intricate landscape of peptide science, covering discovery to functional characterization. Beginning with a peptidomic exploration of natural sources, the review emphasizes the search for novel peptides. Extraction approaches, including enzymatic hydrolysis, microbial fermentation, and specialized methods for disulfide-linked peptides, are extensively covered. Mass spectrometric analysis techniques for data acquisition and identification, such as liquid chromatography, capillary electrophoresis, untargeted peptide analysis, and bioinformatics, are thoroughly outlined. The exploration of peptide bioactivity incorporates various methodologies, from in vitro assays to in silico techniques, including advanced approaches like phage display and cell-based assays. The review also discusses the structure-activity relationship in the context of antimicrobial peptides (AMPs), ACE-inhibitory peptides (ACEs), and antioxidative peptides (AOPs). Concluding with key findings and future research directions, this interdisciplinary review serves as a comprehensive reference, offering a holistic understanding of peptides and their potential therapeutic applications.
Affiliation(s)
- Kruttika Purohit
  - School of Natural Sciences, Macquarie University, Sydney, NSW 2109, Australia
  - Australian Research Council Industrial Transformation Training Centre for Facilitated Advancement of Australia’s Bioactives (FAAB), Sydney, NSW 2109, Australia
- Narsimha Reddy
  - Australian Research Council Industrial Transformation Training Centre for Facilitated Advancement of Australia’s Bioactives (FAAB), Sydney, NSW 2109, Australia
  - School of Science, Parramatta Campus, Western Sydney University, Penrith, NSW 2751, Australia
- Anwar Sunna
  - School of Natural Sciences, Macquarie University, Sydney, NSW 2109, Australia
  - Australian Research Council Industrial Transformation Training Centre for Facilitated Advancement of Australia’s Bioactives (FAAB), Sydney, NSW 2109, Australia
  - Biomolecular Discovery Research Centre, Macquarie University, Sydney, NSW 2109, Australia
12
Chang L, Mondal A, Singh B, Martínez-Noa Y, Perez A. Revolutionizing Peptide-Based Drug Discovery: Advances in the Post-AlphaFold Era. Wiley Interdiscip Rev Comput Mol Sci 2024; 14:e1693. [PMID: 38680429 PMCID: PMC11052547 DOI: 10.1002/wcms.1693] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 08/06/2023] [Accepted: 09/18/2023] [Indexed: 05/01/2024]
Abstract
Peptide-based drugs offer high specificity, potency, and selectivity. However, their inherent flexibility and differences in conformational preferences between their free and bound states create unique challenges that have hindered progress in effective drug discovery pipelines. The emergence of AlphaFold (AF) and Artificial Intelligence (AI) presents new opportunities for enhancing peptide-based drug discovery. We explore recent advancements that facilitate a successful peptide drug discovery pipeline, considering peptides' attractive therapeutic properties and strategies to enhance their stability and bioavailability. AF enables efficient and accurate prediction of peptide-protein structures, addressing a critical requirement in computational drug discovery pipelines. In the post-AF era, we are witnessing rapid progress with the potential to revolutionize peptide-based drug discovery such as the ability to rank peptide binders or classify them as binders/non-binders and the ability to design novel peptide sequences. However, AI-based methods are struggling due to the lack of well-curated datasets, for example to accommodate modified amino acids or unconventional cyclization. Thus, physics-based methods, such as docking or molecular dynamics simulations, continue to hold a complementary role in peptide drug discovery pipelines. Moreover, MD-based tools offer valuable insights into binding mechanisms, as well as the thermodynamic and kinetic properties of complexes. As we navigate this evolving landscape, a synergistic integration of AI and physics-based methods holds the promise of reshaping the landscape of peptide-based drug discovery.
Affiliation(s)
- Liwei Chang
- Department of Chemistry, University of Florida, Gainesville, FL 32611
| | - Arup Mondal
- Department of Chemistry, University of Florida, Gainesville, FL 32611
| | - Bhumika Singh
- Department of Chemistry, University of Florida, Gainesville, FL 32611
| | | | - Alberto Perez
- Department of Chemistry and Quantum Theory Project, University of Florida, Gainesville, FL 32611
| |
|
13
|
Wang Z, Meng J, Li H, Xia S, Wang Y, Luan Y. PAMPred: A hierarchical evolutionary ensemble framework for identifying plant antimicrobial peptides. Comput Biol Med 2023; 166:107545. [PMID: 37806057 DOI: 10.1016/j.compbiomed.2023.107545] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 08/02/2023] [Revised: 09/04/2023] [Accepted: 09/28/2023] [Indexed: 10/10/2023]
Abstract
Antimicrobial peptides (AMPs) play a crucial role in plant immune regulation, growth, and development, and have attracted significant attention in recent years. Because wet-lab experiments are laborious and cost-prohibitive, computational methods are needed to discover novel plant AMPs accurately. In this study, we present a hierarchical evolutionary ensemble framework, named PAMPred, which consists of a multi-level heterogeneous architecture to identify plant AMPs. Specifically, to address the class imbalance problem, a cluster-based resampling method was adopted to build multiple balanced subsets. Then, several peptide features, including sequence information-based and physicochemical property-based features, were fed into different types of base learners to increase the ensemble diversity. To boost the predictive capability of PAMPred, an improved particle swarm optimization (PSO) algorithm and a dynamic ensemble pruning strategy were used to adaptively optimize the weights at different levels. Extensive ten-fold cross-validation and independent testing demonstrated that PAMPred achieved excellent prediction performance and generalization ability, and outperformed the state-of-the-art methods. The results also indicate that the proposed method could serve as an effective auxiliary tool for identifying plant AMPs, which would help in exploring the immune regulatory mechanisms of plants.
Affiliation(s)
- Zhaowei Wang
- School of Computer Science and Technology, Dalian University of Technology, Dalian, Liaoning 116024, China
| | - Jun Meng
- School of Computer Science and Technology, Dalian University of Technology, Dalian, Liaoning 116024, China.
| | - Haibin Li
- School of Computer Science and Technology, Dalian University of Technology, Dalian, Liaoning 116024, China
| | - Shihao Xia
- School of Computer Science and Technology, Dalian University of Technology, Dalian, Liaoning 116024, China
| | - Yu Wang
- School of Computer Science and Technology, Dalian University of Technology, Dalian, Liaoning 116024, China
| | - Yushi Luan
- School of Bioengineering, Dalian University of Technology, Dalian, Liaoning 116024, China
| |
|
14
|
Sun M, Hu H, Pang W, Zhou Y. ACP-BC: A Model for Accurate Identification of Anticancer Peptides Based on Fusion Features of Bidirectional Long Short-Term Memory and Chemically Derived Information. Int J Mol Sci 2023; 24:15447. [PMID: 37895128 PMCID: PMC10607064 DOI: 10.3390/ijms242015447] [Citation(s) in RCA: 1] [Impact Index Per Article: 1.0] [Received: 08/12/2023] [Revised: 09/10/2023] [Accepted: 10/20/2023] [Indexed: 10/29/2023] Open
Abstract
Anticancer peptides (ACPs) have been proven to possess potent anticancer activities. Although computational methods have emerged for rapid ACP identification, their accuracy still needs improvement. In this study, we propose ACP-BC, a three-channel end-to-end model that utilizes various combinations of data augmentation techniques. In the first channel, features are extracted from the raw sequence using a bidirectional long short-term memory network. In the second channel, the entire sequence is converted into a chemical molecular formula, which is further simplified using Simplified Molecular Input Line Entry System (SMILES) notation to obtain deep abstract features through a bidirectional encoder representations from transformers (BERT) model. In the third channel, we manually selected four effective features: dipeptide composition, binary profile feature, k-mer sparse matrix, and pseudo amino acid composition. Notably, the application of chemical BERT to ACP prediction is novel and was successfully integrated into our model. To validate the performance of our model, we selected two benchmark datasets, ACPs740 and ACPs240. ACP-BC achieved prediction accuracies of 87% and 90% on these two datasets, respectively, representing improvements of 1.3% and 7% over existing state-of-the-art methods on these datasets. Systematic comparative experiments therefore show that ACP-BC can effectively identify anticancer peptides.
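Of the handcrafted features named in the abstract above, dipeptide composition is the simplest to illustrate. The following is a minimal sketch, not the authors' implementation: the function name `dipeptide_composition`, the residue ordering, and the choice to skip pairs containing non-standard residues are all assumptions made for illustration.

```python
from itertools import product

# The 20 standard amino acids; this ordering is an arbitrary choice for the sketch.
AMINO_ACIDS = "ACDEFGHIKLMNPQRSTVWY"
DIPEPTIDES = ["".join(pair) for pair in product(AMINO_ACIDS, repeat=2)]


def dipeptide_composition(sequence):
    """Return a 400-dimensional vector of dipeptide frequencies.

    Each entry is the fraction of adjacent residue pairs in the sequence
    that match the corresponding dipeptide in DIPEPTIDES.
    """
    seq = sequence.upper()
    counts = dict.fromkeys(DIPEPTIDES, 0)
    total = 0
    for i in range(len(seq) - 1):
        pair = seq[i:i + 2]
        if pair in counts:  # assumption: ignore pairs with non-standard residues
            counts[pair] += 1
            total += 1
    if total == 0:
        return [0.0] * len(DIPEPTIDES)
    return [counts[d] / total for d in DIPEPTIDES]


if __name__ == "__main__":
    vector = dipeptide_composition("AAKK")
    print(len(vector), sum(vector))
```

A fixed-length vector like this can be concatenated with the other handcrafted features regardless of peptide length, which is presumably why such compositions are common in sequence-based predictors.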
Affiliation(s)
- Mingwei Sun
- Key Laboratory of Symbol Computation and Knowledge Engineering of Ministry of Education, College of Computer Science and Technology, Jilin University, Changchun 130012, China; (M.S.); (H.H.)
| | - Haoyuan Hu
- Key Laboratory of Symbol Computation and Knowledge Engineering of Ministry of Education, College of Computer Science and Technology, Jilin University, Changchun 130012, China; (M.S.); (H.H.)
| | - Wei Pang
- School of Mathematical and Computer Sciences, Heriot-Watt University, Edinburgh EH14 4AS, UK;
| | - You Zhou
- Key Laboratory of Symbol Computation and Knowledge Engineering of Ministry of Education, College of Computer Science and Technology, Jilin University, Changchun 130012, China; (M.S.); (H.H.)
- College of Software, Jilin University, Changchun 130012, China
| |
|
15
|
Tao H, Shan S, Fu H, Zhu C, Liu B. An Augmented Sample Selection Framework for Prediction of Anticancer Peptides. Molecules 2023; 28:6680. [PMID: 37764455 PMCID: PMC10535447 DOI: 10.3390/molecules28186680] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 08/09/2023] [Revised: 09/14/2023] [Accepted: 09/15/2023] [Indexed: 09/29/2023] Open
Abstract
Anticancer peptides (ACPs) have promising prospects for cancer treatment. Traditional ACP identification experiments have the limitations of low efficiency and high cost. In recent years, data-driven deep learning techniques have shown significant potential for ACP prediction. However, data-driven prediction models rely heavily on extensive training data. Furthermore, the current publicly accessible ACP dataset is limited in size, leading to inadequate model generalization. While data augmentation effectively expands dataset size, existing techniques for augmenting ACP data often generate noisy samples, adversely affecting prediction performance. Therefore, this paper proposes a novel augmented sample selection framework for the prediction of anticancer peptides (ACPs-ASSF). First, the prediction model is trained using raw data. Then, the augmented samples generated using the data augmentation technique are fed into the trained model to compute pseudo-labels and estimate the uncertainty of the model prediction. Finally, samples with low uncertainty, high confidence, and pseudo-labels consistent with the original labels are selected and incorporated into the training set to retrain the model. The evaluation results for the ACP240 and ACP740 datasets show that ACPs-ASSF achieved accuracy improvements of up to 5.41% and 5.68%, respectively, compared to the traditional data augmentation method.
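The selection step described in the abstract above (keep augmented samples with low uncertainty, high confidence, and a pseudo-label that matches the original label) can be sketched as follows. This is an illustration under stated assumptions, not the ACPs-ASSF code: the abstract does not say how uncertainty is estimated, so the sketch assumes it is the variance of the positive-class probability over several stochastic forward passes, and the function name and both thresholds are hypothetical.

```python
from statistics import mean, pvariance


def select_augmented(samples, min_confidence=0.9, max_uncertainty=0.01):
    """Filter augmented samples before retraining.

    samples: list of (sample_id, original_label, probs), where probs holds the
    positive-class probability from several stochastic forward passes of the
    already-trained model (an assumed uncertainty estimator).
    Returns the ids of the samples that pass all three criteria.
    """
    selected = []
    for sample_id, original_label, probs in samples:
        p_pos = mean(probs)
        pseudo_label = 1 if p_pos >= 0.5 else 0
        confidence = max(p_pos, 1.0 - p_pos)  # probability of the predicted class
        uncertainty = pvariance(probs)        # spread across the forward passes
        if (pseudo_label == original_label
                and confidence >= min_confidence
                and uncertainty <= max_uncertainty):
            selected.append(sample_id)
    return selected


if __name__ == "__main__":
    augmented = [
        ("a", 1, [0.95, 0.97, 0.96]),  # consistent, confident, stable
        ("b", 1, [0.10, 0.05, 0.08]),  # pseudo-label disagrees with the label
        ("c", 0, [0.05, 0.95, 0.02]),  # unstable and not confident
        ("d", 0, [0.02, 0.03, 0.01]),  # consistent, confident, stable
    ]
    print(select_augmented(augmented))
```

Only the selected samples would then be appended to the training set for the retraining pass; the thresholds control how aggressively noisy augmentations are discarded.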
Affiliation(s)
- Huawei Tao
- Key Laboratory of Food Information Processing and Control, Ministry of Education, Henan University of Technology, Zhengzhou 450001, China; (H.T.); (S.S.); (H.F.); (C.Z.)
- Henan Engineering Laboratory of Grain IOT Technology, Henan University of Technology, Zhengzhou 450001, China
| | - Shuai Shan
- Key Laboratory of Food Information Processing and Control, Ministry of Education, Henan University of Technology, Zhengzhou 450001, China; (H.T.); (S.S.); (H.F.); (C.Z.)
- Henan Engineering Laboratory of Grain IOT Technology, Henan University of Technology, Zhengzhou 450001, China
| | - Hongliang Fu
- Key Laboratory of Food Information Processing and Control, Ministry of Education, Henan University of Technology, Zhengzhou 450001, China; (H.T.); (S.S.); (H.F.); (C.Z.)
- Henan Engineering Laboratory of Grain IOT Technology, Henan University of Technology, Zhengzhou 450001, China
| | - Chunhua Zhu
- Key Laboratory of Food Information Processing and Control, Ministry of Education, Henan University of Technology, Zhengzhou 450001, China; (H.T.); (S.S.); (H.F.); (C.Z.)
- Henan Engineering Laboratory of Grain IOT Technology, Henan University of Technology, Zhengzhou 450001, China
| | - Boye Liu
- College of Food Science and Engineering, Henan University of Technology, Zhengzhou 450001, China
| |
|
16
|
Zhang Y, Yu L, Jing R, Han B, Luo J. Fast and Efficient Design of Deep Neural Networks for Predicting N7-Methylguanosine Sites Using autoBioSeqpy. ACS OMEGA 2023; 8:19728-19740. [PMID: 37305295 PMCID: PMC10249100 DOI: 10.1021/acsomega.3c01371] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 02/28/2023] [Accepted: 05/10/2023] [Indexed: 06/13/2023]
Abstract
N7-Methylguanosine (m7G) is a crucial post-transcriptional RNA modification that plays a pivotal role in regulating gene expression. Accurately identifying m7G sites is a fundamental step in understanding the biological functions and regulatory mechanisms associated with this modification. While whole-genome sequencing is the gold standard for RNA modification site detection, it is a time-consuming, expensive, and intricate process. Recently, computational approaches, especially deep learning (DL) techniques, have gained popularity in achieving this objective. Convolutional neural networks and recurrent neural networks are examples of DL algorithms that have emerged as versatile tools for modeling biological sequence data. However, developing an efficient network architecture with superior performance remains a challenging task, requiring significant expertise, time, and effort. To address this, we previously introduced a tool called autoBioSeqpy, which streamlines the design and implementation of DL networks for biological sequence classification. In this study, we utilized autoBioSeqpy to develop, train, evaluate, and fine-tune sequence-level DL models for predicting m7G sites. We provided detailed descriptions of these models, along with a step-by-step guide on their execution. The same methodology can be applied to other systems dealing with similar biological questions. The benchmark data and code utilized in this study can be accessed for free at http://github.com/jingry/autoBioSeeqpy/tree/2.0/examples/m7G.
Affiliation(s)
- Yonglin Zhang
- Department of Pharmacy, Affiliated Hospital of North Sichuan Medical College, Nanchong 637000, China
| | - Lezheng Yu
- School of Chemistry and Materials Science, Guizhou Education University, Guiyang 550024, China
| | - Runyu Jing
- School of Cyber Science and Engineering, Sichuan University, Chengdu 610017, China
| | - Bin Han
- GCP Center/Institute of Drug Clinical Trials, Affiliated Hospital of North Sichuan Medical College, Nanchong 637503, China
| | - Jiesi Luo
- Basic Medical College, Southwest Medical University, Luzhou 646099, Sichuan, China
- Key Medical Laboratory of New Drug Discovery and Druggability Evaluation, Luzhou Key Laboratory of Activity Screening and Druggability Evaluation for Chinese Materia Medica, Southwest Medical University, Luzhou 646099, China
| |
|
17
|
Kazmirchuk TDD, Bradbury-Jost C, Withey TA, Gessese T, Azad T, Samanfar B, Dehne F, Golshani A. Peptides of a Feather: How Computation Is Taking Peptide Therapeutics under Its Wing. Genes (Basel) 2023; 14:1194. [PMID: 37372372 DOI: 10.3390/genes14061194] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 03/30/2023] [Revised: 05/24/2023] [Accepted: 05/26/2023] [Indexed: 06/29/2023] Open
Abstract
Leveraging computation in the development of peptide therapeutics has garnered increasing recognition as a valuable way to generate novel therapeutics for disease-related targets. To this end, computation has transformed the field of peptide design by identifying novel therapeutics that exhibit enhanced pharmacokinetic properties and reduced toxicity. The process of in-silico peptide design involves the application of molecular docking, molecular dynamics simulations, and machine learning algorithms. Three primary approaches to peptide therapeutic design have been predominantly adopted: structure-based design, protein mimicry, and short motif design. Despite the ongoing progress made in this field, significant challenges remain, including enhancing the accuracy of computational methods, improving the success rate of preclinical and clinical trials, and developing better strategies to predict pharmacokinetics and toxicity. In this review, we discuss past and present research pertaining to the design and development of in-silico peptide therapeutics and highlight the potential of computation and artificial intelligence in the future of disease therapeutics.
Affiliation(s)
- Thomas David Daniel Kazmirchuk
- Department of Biology, and the Ottawa Institute of Systems Biology (OISB), Carleton University, Ottawa, ON K1S 5B6, Canada
| | - Calvin Bradbury-Jost
- Department of Biology, and the Ottawa Institute of Systems Biology (OISB), Carleton University, Ottawa, ON K1S 5B6, Canada
| | - Taylor Ann Withey
- Department of Biology, and the Ottawa Institute of Systems Biology (OISB), Carleton University, Ottawa, ON K1S 5B6, Canada
| | - Tadesse Gessese
- Department of Biology, and the Ottawa Institute of Systems Biology (OISB), Carleton University, Ottawa, ON K1S 5B6, Canada
| | - Taha Azad
- Department of Microbiology and Infectious Diseases, Université de Sherbrooke, Sherbrooke, QC J1E 4K8, Canada
- Centre de Recherche du Centre Hospitalier Universitaire de Sherbrooke (CHUS), Sherbrooke, QC J1H 5N4, Canada
| | - Bahram Samanfar
- Department of Biology, and the Ottawa Institute of Systems Biology (OISB), Carleton University, Ottawa, ON K1S 5B6, Canada
- Agriculture and Agri-Food Canada, Ottawa Research and Development Centre (ORDC), Ottawa, ON K1A 0C6, Canada
| | - Frank Dehne
- School of Computer Science, Carleton University, Ottawa, ON K1S 5B6, Canada
| | - Ashkan Golshani
- Department of Biology, and the Ottawa Institute of Systems Biology (OISB), Carleton University, Ottawa, ON K1S 5B6, Canada
| |
|
18
|
Li Y, Ma D, Chen D, Chen Y. ACP-GBDT: An improved anticancer peptide identification method with gradient boosting decision tree. Front Genet 2023; 14:1165765. [PMID: 37065496 PMCID: PMC10090421 DOI: 10.3389/fgene.2023.1165765] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 02/14/2023] [Accepted: 03/09/2023] [Indexed: 03/31/2023] Open
Abstract
Cancer is one of the most dangerous diseases in the world, killing millions of people every year. Drugs composed of anticancer peptides have been used to treat cancer with low side effects in recent years. Therefore, identifying anticancer peptides has become a focus of research. In this study, an improved anticancer peptide predictor named ACP-GBDT, based on gradient boosting decision tree (GBDT) and sequence information, is proposed. To encode the peptide sequences included in the anticancer peptide dataset, ACP-GBDT uses a merged-feature composed of AAIndex and SVMProt-188D. A GBDT is adopted to train the prediction model in ACP-GBDT. Independent testing and ten-fold cross-validation show that ACP-GBDT can effectively distinguish anticancer peptides from non-anticancer ones. The comparison results of the benchmark dataset show that ACP-GBDT is simpler and more effective than other existing anticancer peptide prediction methods.
Affiliation(s)
- Yanjuan Li
- College of Electrical and Information Engineering, Quzhou University, Quzhou, China
| | - Di Ma
- College of Computer, Hangzhou Dianzi University, Hangzhou, China
| | - Dong Chen
- College of Electrical and Information Engineering, Quzhou University, Quzhou, China
- *Correspondence: Dong Chen; Yu Chen
| | - Yu Chen
- College of Information and Computer Engineering, Northeast Forestry University, Harbin, China
- *Correspondence: Dong Chen; Yu Chen
| |
|
19
|
Deep learning drives efficient discovery of novel antihypertensive peptides from soybean protein isolate. Food Chem 2023; 404:134690. [DOI: 10.1016/j.foodchem.2022.134690] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 07/25/2022] [Revised: 09/29/2022] [Accepted: 10/17/2022] [Indexed: 11/06/2022]
|
20
|
Yao L, Li W, Zhang Y, Deng J, Pang Y, Huang Y, Chung CR, Yu J, Chiang YC, Lee TY. Accelerating the Discovery of Anticancer Peptides through Deep Forest Architecture with Deep Graphical Representation. Int J Mol Sci 2023; 24:ijms24054328. [PMID: 36901759 PMCID: PMC10001941 DOI: 10.3390/ijms24054328] [Citation(s) in RCA: 9] [Impact Index Per Article: 9.0] [Received: 01/07/2023] [Revised: 02/02/2023] [Accepted: 02/07/2023] [Indexed: 02/24/2023] Open
Abstract
Cancer is one of the leading diseases threatening human life and health worldwide. Peptide-based therapies have attracted much attention in recent years. Therefore, the precise prediction of anticancer peptides (ACPs) is crucial for discovering and designing novel cancer treatments. In this study, we proposed a novel machine learning framework (GRDF) that incorporates deep graphical representation and deep forest architecture for identifying ACPs. Specifically, GRDF extracts graphical features based on the physicochemical properties of peptides and integrates their evolutionary information along with binary profiles for constructing models. Moreover, we employ the deep forest algorithm, which adopts a layer-by-layer cascade architecture similar to deep neural networks, enabling excellent performance on small datasets but without complicated tuning of hyperparameters. The experiment shows GRDF exhibits state-of-the-art performance on two elaborate datasets (Set 1 and Set 2), achieving 77.12% accuracy and 77.54% F1-score on Set 1, as well as 94.10% accuracy and 94.15% F1-score on Set 2, exceeding existing ACP prediction methods. Our models exhibit greater robustness than the baseline algorithms commonly used for other sequence analysis tasks. In addition, GRDF is well-interpretable, enabling researchers to better understand the features of peptide sequences. The promising results demonstrate that GRDF is remarkably effective in identifying ACPs. Therefore, the framework presented in this study could assist researchers in facilitating the discovery of anticancer peptides and contribute to developing novel cancer treatments.
Affiliation(s)
- Lantian Yao
- Kobilka Institute of Innovative Drug Discovery, School of Medicine, The Chinese University of Hong Kong (Shenzhen), 2001 Longxiang Road, Shenzhen 518172, China
- School of Science and Engineering, The Chinese University of Hong Kong (Shenzhen), 2001 Longxiang Road, Shenzhen 518172, China
| | - Wenshuo Li
- School of Science and Engineering, The Chinese University of Hong Kong (Shenzhen), 2001 Longxiang Road, Shenzhen 518172, China
- Warshel Institute for Computational Biology, School of Medicine, The Chinese University of Hong Kong (Shenzhen), 2001 Longxiang Road, Shenzhen 518172, China
| | - Yuntian Zhang
- Warshel Institute for Computational Biology, School of Medicine, The Chinese University of Hong Kong (Shenzhen), 2001 Longxiang Road, Shenzhen 518172, China
- School of Medicine, The Chinese University of Hong Kong (Shenzhen), 2001 Longxiang Road, Shenzhen 518172, China
| | - Junyang Deng
- School of Medicine, The Chinese University of Hong Kong (Shenzhen), 2001 Longxiang Road, Shenzhen 518172, China
| | - Yuxuan Pang
- School of Science and Engineering, The Chinese University of Hong Kong (Shenzhen), 2001 Longxiang Road, Shenzhen 518172, China
- Warshel Institute for Computational Biology, School of Medicine, The Chinese University of Hong Kong (Shenzhen), 2001 Longxiang Road, Shenzhen 518172, China
| | - Yixian Huang
- Warshel Institute for Computational Biology, School of Medicine, The Chinese University of Hong Kong (Shenzhen), 2001 Longxiang Road, Shenzhen 518172, China
- School of Medicine, The Chinese University of Hong Kong (Shenzhen), 2001 Longxiang Road, Shenzhen 518172, China
| | - Chia-Ru Chung
- Kobilka Institute of Innovative Drug Discovery, School of Medicine, The Chinese University of Hong Kong (Shenzhen), 2001 Longxiang Road, Shenzhen 518172, China
| | - Jinhan Yu
- Warshel Institute for Computational Biology, School of Medicine, The Chinese University of Hong Kong (Shenzhen), 2001 Longxiang Road, Shenzhen 518172, China
- School of Medicine, The Chinese University of Hong Kong (Shenzhen), 2001 Longxiang Road, Shenzhen 518172, China
| | - Ying-Chih Chiang
- Kobilka Institute of Innovative Drug Discovery, School of Medicine, The Chinese University of Hong Kong (Shenzhen), 2001 Longxiang Road, Shenzhen 518172, China
- Correspondence: (Y.-C.C.); (T.-Y.L.)
| | - Tzong-Yi Lee
- Warshel Institute for Computational Biology, School of Medicine, The Chinese University of Hong Kong (Shenzhen), 2001 Longxiang Road, Shenzhen 518172, China
- Correspondence: (Y.-C.C.); (T.-Y.L.)
| |
|
21
|
Zhang H, Saravanan KM, Wei Y, Jiao Y, Yang Y, Pan Y, Wu X, Zhang JZH. Deep Learning-Based Bioactive Therapeutic Peptide Generation and Screening. J Chem Inf Model 2023; 63:835-845. [PMID: 36724090 DOI: 10.1021/acs.jcim.2c01485] [Citation(s) in RCA: 9] [Impact Index Per Article: 9.0] [Indexed: 02/02/2023]
Abstract
Many bioactive peptides have demonstrated therapeutic effects, such as antiviral, antibacterial, and anticancer activity, against complicated diseases. Using known bioactive peptides as a training set, deep learning can generate a large number of potentially bioactive peptides in a manner analogous to the de novo generation of chemical compounds. Such generative techniques would be significant for drug development since peptides are much easier and cheaper to synthesize than compounds. Given the limited availability of deep learning-based peptide-generating models, we built an LSTM model (called LSTM_Pep) to generate de novo peptides and fine-tuned it to generate de novo peptides with specific prospective therapeutic benefits. Remarkably, the Antimicrobial Peptide Database was effectively utilized to generate various kinds of potentially active de novo peptides. We propose a pipeline for screening the generated peptides against a given target and use the main protease of SARS-CoV-2 as a proof of concept. Moreover, we developed a deep learning-based protein-peptide prediction model (DeepPep) for rapid screening of the generated peptides against given targets. Together with the generative model, we demonstrate that iterating fine-tuning, generation, and screening can yield peptides with higher predicted binding affinity. Our work sheds light on developing deep learning-based methods and pipelines to effectively generate and obtain bioactive peptides with a specific therapeutic effect and showcases how artificial intelligence can help discover de novo bioactive peptides that can bind to a particular target.
Affiliation(s)
- Haiping Zhang
- Shenzhen Institute of Synthetic Biology, Faculty of Synthetic Biology, Shenzhen Institute of Advanced Technology, Chinese Academy of Sciences, Shenzhen 518055, Guangdong, China
| | - Konda Mani Saravanan
- Department of Biotechnology, Bharath Institute of Higher Education and Research, Chennai 600073, Tamil Nadu, India
| | - Yanjie Wei
- Center for High Performance Computing, Joint Engineering Research Center for Health Big Data Intelligent Analysis Technology, Shenzhen Institutes of Advanced Technology, Chinese Academy of Sciences, Shenzhen 518055, Guangdong, China
| | - Yang Jiao
- Faculty of Computer Science and Control Engineering, Shenzhen Institute of Advanced Technology, Chinese Academy of Sciences, Shenzhen 518055, China
| | - Yang Yang
- Shenzhen Key Laboratory of Pathogen and Immunity, National Clinical Research Center for infectious disease, State Key Discipline of Infectious Disease, Shenzhen Third People's Hospital, Second Hospital Affiliated to Southern University of Science and Technology, Shenzhen 518112, China
| | - Yi Pan
- Center for High Performance Computing, Joint Engineering Research Center for Health Big Data Intelligent Analysis Technology, Shenzhen Institutes of Advanced Technology, Chinese Academy of Sciences, Shenzhen 518055, Guangdong, China; Faculty of Computer Science and Control Engineering, Shenzhen Institute of Advanced Technology, Chinese Academy of Sciences, Shenzhen 518055, China
| | - Xuli Wu
- School of Medicine, Shenzhen University, Shenzhen 518060, Guangdong, China
| | - John Z H Zhang
- Shenzhen Institute of Synthetic Biology, Faculty of Synthetic Biology, Shenzhen Institute of Advanced Technology, Chinese Academy of Sciences, Shenzhen 518055, Guangdong, China; East China Normal University, Shanghai 200062, China; NYU-ECNU Center for Computational Chemistry at NYU Shanghai, Shanghai 200062, China
| |
|
22
|
Ghaly G, Tallima H, Dabbish E, Badr ElDin N, Abd El-Rahman MK, Ibrahim MAA, Shoeib T. Anti-Cancer Peptides: Status and Future Prospects. Molecules 2023; 28:molecules28031148. [PMID: 36770815 PMCID: PMC9920184 DOI: 10.3390/molecules28031148] [Citation(s) in RCA: 12] [Impact Index Per Article: 12.0] [Received: 11/27/2022] [Revised: 12/26/2022] [Accepted: 01/19/2023] [Indexed: 01/26/2023] Open
Abstract
The dramatic rise in cancer incidence, alongside treatment deficiencies, has elevated cancer to the second-leading cause of death globally. The increasing morbidity and mortality of this disease can be traced back to a number of causes, including treatment-related side effects, drug resistance, inadequate curative treatment and tumor relapse. Recently, anti-cancer bioactive peptides (ACPs) have emerged as a potential therapeutic choice within the pharmaceutical arsenal due to their high penetration, specificity and fewer side effects. In this contribution, we present a general overview of the literature concerning the conformational structures, modes of action and membrane interaction mechanisms of ACPs, as well as provide recent examples of their successful employment as targeting ligands in cancer treatment. The use of ACPs as a diagnostic tool is summarized, and their advantages in these applications are highlighted. This review expounds on the main approaches for peptide synthesis along with their reconstruction and modification needed to enhance their therapeutic effect. Computational approaches that could predict therapeutic efficacy and suggest ACP candidates for experimental studies are discussed. Future research prospects in this rapidly expanding area are also offered.
Affiliation(s)
- Gehane Ghaly
- Department of Chemistry, The American University in Cairo, New Cairo 11835, Egypt
| | - Hatem Tallima
- Department of Chemistry, The American University in Cairo, New Cairo 11835, Egypt
| | - Eslam Dabbish
- Department of Chemistry, The American University in Cairo, New Cairo 11835, Egypt
| | - Norhan Badr ElDin
- Analytical Chemistry Department, Faculty of Pharmacy, Cairo University, Kasr-El Aini Street, Cairo 11562, Egypt
| | - Mohamed K. Abd El-Rahman
- Analytical Chemistry Department, Faculty of Pharmacy, Cairo University, Kasr-El Aini Street, Cairo 11562, Egypt
- Department of Chemistry and Chemical Biology, Harvard University, 12 Oxford Street, Cambridge, MA 02138, USA
| | - Mahmoud A. A. Ibrahim
- Computational Chemistry Laboratory, Chemistry Department, Faculty of Science, Minia University, Minia 61519, Egypt
- School of Health Sciences, University of Kwa-Zulu-Natal, Westville, Durban 4000, South Africa
| | - Tamer Shoeib
- Department of Chemistry, The American University in Cairo, New Cairo 11835, Egypt
- Correspondence:
| |
|
23
|
Yuan Q, Chen K, Yu Y, Le NQK, Chua MCH. Prediction of anticancer peptides based on an ensemble model of deep learning and machine learning using ordinal positional encoding. Brief Bioinform 2023; 24:6987656. [PMID: 36642410 DOI: 10.1093/bib/bbac630] [Citation(s) in RCA: 32] [Impact Index Per Article: 32.0] [Received: 08/03/2022] [Revised: 12/01/2022] [Accepted: 12/28/2022] [Indexed: 01/17/2023] Open
Abstract
Anticancer peptides (ACPs) are peptides that have been demonstrated to have anticancer activity. Using ACPs to prevent cancer could be a viable alternative to conventional cancer treatments because they are safer and display higher selectivity. Because experimental ACP identification is lab-limited, expensive, and lengthy, this study proposes a computational method to predict ACPs from sequence information. The process includes the input of the peptide sequences, feature extraction in terms of ordinal encoding with positional information and handcrafted features, and finally feature selection. The whole model comprises two modules: a deep learning module and a machine learning module. The deep learning module contains two channels: a bidirectional long short-term memory (BiLSTM) network and a convolutional neural network (CNN). Light Gradient Boosting Machine (LightGBM) is used in the machine learning module. Finally, the classification results of the three paths are combined by voting in the model ensemble layer. This study provides insights into ACP prediction using a novel method and shows promising performance. It used a benchmark dataset for further exploration and improvement compared with previous studies. Our final model has an accuracy of 0.7895, sensitivity of 0.8153 and specificity of 0.7676, an increase of at least 2% over state-of-the-art studies in all metrics. Hence, this paper presents a novel method that can potentially predict ACPs more effectively and efficiently. The work and source codes are made available to the community of researchers and developers at https://github.com/khanhlee/acp-ope/.
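The ensemble step and the three reported metrics can be illustrated with a short sketch. It assumes hard majority voting over 0/1 predictions, which the abstract does not confirm (the authors may weight or soft-vote), and the function names and toy predictions are hypothetical. The authors' actual code is in the repository linked above.

```python
def majority_vote(*model_predictions):
    """Combine per-sample 0/1 predictions from an odd number of models."""
    n_models = len(model_predictions)
    return [1 if 2 * sum(votes) > n_models else 0
            for votes in zip(*model_predictions)]


def binary_metrics(y_true, y_pred):
    """Return (accuracy, sensitivity, specificity) for 0/1 labels."""
    tp = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 1)
    tn = sum(1 for t, p in zip(y_true, y_pred) if t == 0 and p == 0)
    fp = sum(1 for t, p in zip(y_true, y_pred) if t == 0 and p == 1)
    fn = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 0)
    total = tp + tn + fp + fn
    accuracy = (tp + tn) / total if total else 0.0
    sensitivity = tp / (tp + fn) if (tp + fn) else 0.0  # true positive rate
    specificity = tn / (tn + fp) if (tn + fp) else 0.0  # true negative rate
    return accuracy, sensitivity, specificity


if __name__ == "__main__":
    # Toy outputs standing in for the BiLSTM, CNN and LightGBM paths.
    bilstm = [1, 0, 1, 0]
    cnn = [1, 1, 0, 0]
    lightgbm = [0, 1, 1, 0]
    ensemble = majority_vote(bilstm, cnn, lightgbm)
    print(ensemble, binary_metrics([1, 0, 1, 0], ensemble))
```

With three voters a tie is impossible, which is one practical reason to ensemble an odd number of paths.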
Affiliation(s)
- Qitong Yuan: Institute of Systems Science, National University of Singapore, 25 Heng Mui Keng Terrace, 119615, Singapore, Singapore
- Keyi Chen: Institute of Systems Science, National University of Singapore, 25 Heng Mui Keng Terrace, 119615, Singapore, Singapore
- Yimin Yu: Institute of Systems Science, National University of Singapore, 25 Heng Mui Keng Terrace, 119615, Singapore, Singapore
- Nguyen Quoc Khanh Le: Professional Master Program in Artificial Intelligence in Medicine, College of Medicine, Taipei Medical University, 250 Wuxing St, 106, Taipei, Taiwan; Research Center for Artificial Intelligence in Medicine, Taipei Medical University, 250 Wuxing St, 106, Taipei, Taiwan; Translational Imaging Research Center, Taipei Medical University Hospital, 252 Wuxing St, 110, Taipei, Taiwan
- Matthew Chin Heng Chua: Institute of Systems Science, National University of Singapore, 25 Heng Mui Keng Terrace, 119615, Singapore, Singapore
|
24
|
Liang Y, Ma X. iACP-GE: accurate identification of anticancer peptides by using gradient boosting decision tree and extra tree. SAR QSAR Environ Res 2023; 34:1-19. [PMID: 36562289 DOI: 10.1080/1062936x.2022.2160011] [Citation(s) in RCA: 3] [Impact Index Per Article: 3.0] [Received: 10/24/2022] [Accepted: 12/12/2022] [Indexed: 06/17/2023]
Abstract
Cancer is one of the main diseases threatening human life, accounting for millions of deaths around the world each year. Traditional physical and chemical methods for cancer treatment are extremely time-consuming, lab-intensive, expensive, inefficient and difficult to apply in a high-throughput way. Hence, it is an urgent task to develop automated computational methods to enable fast and accurate identification of anticancer peptides (ACPs). In this paper, we develop a novel model named iACP-GE to identify ACPs. Multiple features are extracted by using binary encoding, enhanced grouped amino acid composition and BLOSUM62 encoding based on the N5C5 sequence, as well as detrended forward moving-average auto-cross correlation analysis based on physicochemical properties of the 20 natural amino acids. Thus, 835 features are obtained for each sample. To avoid information redundancy, a gradient boosting decision tree is adopted as the feature selection strategy. Then, the optimal feature subset is input to the extra tree classifier. The accuracies on the ACP740 and ACP240 datasets with 5-fold cross-validation were 90.54% and 91.25%, respectively. Experimental results indicate that iACP-GE significantly outperforms several existing models on the ACP740 and ACP240 datasets and can be used as an effective tool for the identification of ACPs. The datasets and source code for iACP-GE are available at https://github.com/yunyunliang88/iACP-GE.
Affiliation(s)
- Y Liang: School of Science, Xi'an Polytechnic University, Xi'an, P. R. China
- X Ma: School of Science, Xi'an Polytechnic University, Xi'an, P. R. China
|
25
|
Zhou C, Peng D, Liao B, Jia R, Wu F. ACP_MS: prediction of anticancer peptides based on feature extraction. Brief Bioinform 2022; 23:6793775. [PMID: 36326080 DOI: 10.1093/bib/bbac462] [Citation(s) in RCA: 2] [Impact Index Per Article: 1.0] [Received: 06/30/2022] [Revised: 09/10/2022] [Accepted: 09/27/2022] [Indexed: 11/06/2022] Open
Abstract
Anticancer peptides (ACPs) are bioactive peptides with antitumor activity and have become some of the most promising drugs in the treatment of cancer. Therefore, the accurate prediction of ACPs is of great significance to cancer research. In this paper, we developed a more efficient prediction model called ACP_MS. First, the monoMonoKGap method is used to extract the characteristics of anticancer peptide sequences and convert them into numerical features. Then, the AdaBoost model is used to select the most discriminating of these features. Finally, a stochastic gradient descent algorithm is introduced to identify anticancer peptide sequences. We adopt 7-fold cross-validation and independent test set validation, and the final accuracy on the main dataset reached 92.653% and 91.597%, respectively. The accuracy on the alternate dataset reached 98.678% and 98.317%, respectively. Compared with other advanced prediction models, the ACP_MS model improves the identification of anticancer peptide sequences. The data for this model can be downloaded for free from https://github.com/Zhoucaimao1998/Zc.
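The 7-fold cross-validation protocol mentioned in this abstract can be sketched in plain Python as follows. This only illustrates how samples are partitioned. The function name `k_fold_indices`, the shuffling seed, and the sample count of 21 are hypothetical, and the feature extraction and classifier of ACP_MS are not reproduced here.

```python
import random


def k_fold_indices(n_samples, k=7, seed=0):
    """Yield (train_idx, test_idx) pairs for shuffled k-fold cross-validation."""
    indices = list(range(n_samples))
    random.Random(seed).shuffle(indices)  # fixed seed for reproducible folds
    # Deal the shuffled indices into k folds of near-equal size.
    folds = [indices[i::k] for i in range(k)]
    for i, test_idx in enumerate(folds):
        # Every fold except the i-th one forms the training set.
        train_idx = [j for f, fold in enumerate(folds) if f != i for j in fold]
        yield train_idx, test_idx


# Hypothetical dataset of 21 peptides split into 7 folds.
splits = list(k_fold_indices(n_samples=21, k=7))
for train_idx, test_idx in splits:
    print(len(train_idx), len(test_idx))
```

Each sample appears in exactly one test fold, so the accuracy averaged over folds uses every sample once for evaluation.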
Affiliation(s)
- Caimao Zhou: Key Laboratory of Computational Science and Application of Hainan Province, Haikou, China; Key Laboratory of Data Science and Intelligence Education, Hainan Normal University, Ministry of Education, Haikou, China; School of Mathematics and Statistics, Hainan Normal University, Haikou, China
- Dejun Peng: Key Laboratory of Computational Science and Application of Hainan Province, Haikou, China; Key Laboratory of Data Science and Intelligence Education, Hainan Normal University, Ministry of Education, Haikou, China; School of Mathematics and Statistics, Hainan Normal University, Haikou, China
- Bo Liao: Key Laboratory of Computational Science and Application of Hainan Province, Haikou, China; Key Laboratory of Data Science and Intelligence Education, Hainan Normal University, Ministry of Education, Haikou, China; School of Mathematics and Statistics, Hainan Normal University, Haikou, China
- Ranran Jia: Key Laboratory of Computational Science and Application of Hainan Province, Haikou, China; Key Laboratory of Data Science and Intelligence Education, Hainan Normal University, Ministry of Education, Haikou, China; School of Mathematics and Statistics, Hainan Normal University, Haikou, China
- Fangxiang Wu: Key Laboratory of Computational Science and Application of Hainan Province, Haikou, China; Key Laboratory of Data Science and Intelligence Education, Hainan Normal University, Ministry of Education, Haikou, China; School of Mathematics and Statistics, Hainan Normal University, Haikou, China
|
26
|
Pandiyan S, Wang L. A comprehensive review on recent approaches for cancer drug discovery associated with artificial intelligence. Comput Biol Med 2022; 150:106140. [PMID: 36179510 DOI: 10.1016/j.compbiomed.2022.106140] [Citation(s) in RCA: 5] [Impact Index Per Article: 2.5] [Received: 04/23/2022] [Revised: 07/20/2022] [Accepted: 09/18/2022] [Indexed: 11/03/2022]
Abstract
Through the revolutionization of artificial intelligence (AI) technologies in clinical research, significant improvement is observed in the diagnosis of cancer. Utilization of these AI technologies, such as machine and deep learning, is imperative for discovering novel anticancer drugs and for improving existing and ongoing cancer therapeutics. However, building a model for complicated cancers and their types remains a challenge due to the lack of effective therapeutics, which hinders the establishment of effective computational tools. In this review, we examine recent approaches and the state of the art in implementing AI methods for anticancer drug discovery, and discuss how advances in these applications need to be considered in current cancer therapeutics. Considering the immense potential of AI, we explore molecular docking and the resulting interactions to recognize metabolic activities that support drug design. Finally, we highlight corresponding strategies for applying machine and deep learning methods to various types of cancer, with their pros and cons.
Affiliation(s)
- Sanjeevi Pandiyan: Research Center for Intelligent Information Technology, Nantong University, Nantong, China; School of Information Science and Technology, Nantong University, Nantong, China; Nantong Research Institute for Advanced Communication Technologies, Nantong, China
- Li Wang: Research Center for Intelligent Information Technology, Nantong University, Nantong, China; School of Information Science and Technology, Nantong University, Nantong, China; Nantong Research Institute for Advanced Communication Technologies, Nantong, China
|
27
|
The dynamic landscape of peptide activity prediction. Comput Struct Biotechnol J 2022; 20:6526-6533. [DOI: 10.1016/j.csbj.2022.11.043] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 10/19/2022] [Revised: 11/21/2022] [Accepted: 11/21/2022] [Indexed: 11/27/2022] Open
|
28
|
ACP-ADA: A Boosting Method with Data Augmentation for Improved Prediction of Anticancer Peptides. Int J Mol Sci 2022; 23:ijms232012194. [PMID: 36293050 PMCID: PMC9603247 DOI: 10.3390/ijms232012194] [Citation(s) in RCA: 5] [Impact Index Per Article: 2.5] [Received: 08/29/2022] [Revised: 10/08/2022] [Accepted: 10/11/2022] [Indexed: 11/30/2022] Open
Abstract
Cancer is the second-leading cause of death worldwide, and therapeutic peptides that target and destroy cancer cells have received a great deal of interest in recent years. Traditional wet-lab experiments are expensive and inefficient for identifying novel anticancer peptides (ACPs); therefore, the development of an effective computational approach is essential to recognize ACP candidates before experimental methods are used. In this study, we proposed an AdaBoost algorithm with random forest as the base learner, called ACP-ADA, which integrates binary profile features, amino acid index, and amino acid composition into a 210-dimensional feature vector to represent the peptides. Training samples in the feature space were augmented to increase the sample size and further improve the performance of the model in the case of insufficient samples. Furthermore, we used five-fold cross-validation to find model parameters, and the cross-validation results showed that ACP-ADA outperforms existing methods for this feature combination with data augmentation in terms of performance metrics. Specifically, ACP-ADA recorded an average accuracy of 86.4% and a Matthews correlation coefficient of 74.01% for dataset ACP740, and 90.83% and 81.65% for dataset ACP240; consequently, it can be a very useful tool in drug development and biomedical research.
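For readers comparing the figures reported across these abstracts, the sketch below shows how accuracy, sensitivity, specificity, and the Matthews correlation coefficient (MCC) are computed from a binary confusion matrix. The function name `binary_metrics` and the example counts are hypothetical and are not taken from any of the cited studies.

```python
import math


def binary_metrics(tp, fp, tn, fn):
    """Compute common binary-classification metrics from confusion counts."""
    accuracy = (tp + tn) / (tp + fp + tn + fn)
    sensitivity = tp / (tp + fn) if (tp + fn) else 0.0  # recall on positives
    specificity = tn / (tn + fp) if (tn + fp) else 0.0  # recall on negatives
    denom = math.sqrt((tp + fp) * (tp + fn) * (tn + fp) * (tn + fn))
    # MCC is conventionally set to 0 when any marginal count is zero.
    mcc = (tp * tn - fp * fn) / denom if denom else 0.0
    return {
        "accuracy": accuracy,
        "sensitivity": sensitivity,
        "specificity": specificity,
        "mcc": mcc,
    }


# Hypothetical counts: 40 true positives, 10 false positives,
# 45 true negatives, 5 false negatives.
metrics = binary_metrics(tp=40, fp=10, tn=45, fn=5)
print(metrics)
```

MCC ranges from -1 to 1, so an MCC quoted as a percentage (for example 74.01%) corresponds to 0.7401 on that scale.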
|
29
|
Chen S, Li Q, Zhao J, Bin Y, Zheng C. NeuroPred-CLQ: incorporating deep temporal convolutional networks and multi-head attention mechanism to predict neuropeptides. Brief Bioinform 2022; 23:6672901. [DOI: 10.1093/bib/bbac319] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 05/09/2022] [Revised: 06/27/2022] [Accepted: 07/14/2022] [Indexed: 11/13/2022] Open
Abstract
Neuropeptides (NPs) are a particular class of signaling substances involved in the immune system and physiological regulation. They play a crucial role in regulating physiological functions at various stages of biological growth and development. In addition, NPs are crucial for developing new drugs for the treatment of neurological diseases. With the development of molecular biology techniques, some data-driven tools have emerged to predict NPs. However, the predictive performance of these tools still needs to be improved. In this study, we developed a deep learning model (NeuroPred-CLQ) based on the temporal convolutional network (TCN) and multi-head attention mechanism to identify NPs effectively, translating the internal relationships of peptide sequences into numerical features with the Word2vec algorithm. The experimental results show that NeuroPred-CLQ learns data information effectively, achieving 93.6% accuracy and 98.8% AUC on the independent test set. The model performs better in identifying NPs than the state-of-the-art predictors. Visualization of features using t-distributed stochastic neighbor embedding (t-SNE) shows that NeuroPred-CLQ can clearly distinguish the positive NPs from the negative ones. We believe NeuroPred-CLQ can facilitate drug development and clinical trial studies to treat neurological disorders.
Affiliation(s)
- Shouzhi Chen: School of Mathematics and System Science, Xinjiang University, Urumqi, China
- Qing Li: School of Mathematics and System Science, Xinjiang University, Urumqi, China
- Jianping Zhao: School of Mathematics and System Science, Xinjiang University, Urumqi, China
- Yannan Bin: School of Computer Science and Technology, Anhui University, Hefei, China
- Chunhou Zheng: School of Mathematics and System Science, Xinjiang University, Urumqi, China; School of Computer Science and Technology, Anhui University, Hefei, China
|
30
|
Zou H, Yang F, Yin Z. Integrating multiple sequence features for identifying anticancer peptides. Comput Biol Chem 2022; 99:107711. [DOI: 10.1016/j.compbiolchem.2022.107711] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 10/25/2021] [Revised: 05/16/2022] [Accepted: 05/29/2022] [Indexed: 11/03/2022]
|
31
|
Zhu L, Ye C, Hu X, Yang S, Zhu C. ACP-check: An anticancer peptide prediction model based on bidirectional long short-term memory and multi-features fusion strategy. Comput Biol Med 2022; 148:105868. [PMID: 35868046 DOI: 10.1016/j.compbiomed.2022.105868] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 04/07/2022] [Revised: 06/14/2022] [Accepted: 07/09/2022] [Indexed: 11/16/2022]
Abstract
The anticancer peptide is an emerging anticancer drug that has become an effective alternative to chemotherapy and targeted therapy due to fewer side effects and less resistance. The traditional biological experimental method for identifying anticancer peptides is a time-consuming and complicated process that hinders large-scale, rapid, and effective identification. In this paper, we propose a model based on a bidirectional long short-term memory network and multi-feature fusion, called ACP-check, which employs a bidirectional long short-term memory network to extract time-dependent features from peptide sequences and combines them with amino acid sequence features including binary profile feature, dipeptide composition, the composition of k-spaced amino acid group pairs, amino acid composition, and sequence-order-coupling number. To verify the performance of the model, six benchmark datasets are selected: ACPred-Fuse, ACPred-FL, ACP240, ACP740, and the main and alternate datasets of AntiCP2.0. In terms of Matthews correlation coefficient, ACP-check obtains 0.37, 0.82, 0.80, 0.75, 0.56, and 0.86 on the six datasets respectively, an improvement of 2%-86% over existing state-of-the-art anticancer peptide prediction methods. Furthermore, ACP-check achieves prediction accuracies of 0.91, 0.91, 0.90, 0.87, 0.78, and 0.93 respectively, increases ranging from 1% to 49%. Overall, the comparison experiment shows that ACP-check can accurately identify anticancer peptides from sequence-level information. The code and data are available at http://www.cczubio.top/ACP-check/.
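Two of the handcrafted descriptors named in this abstract, amino acid composition and dipeptide composition, are simple frequency vectors and can be sketched in a few lines. This is an illustrative sketch rather than the ACP-check code. The alphabet ordering, the function names, and the example peptide are assumptions.

```python
from itertools import product

# The 20 standard amino acids; the ordering is an arbitrary assumption.
AMINO_ACIDS = "ACDEFGHIKLMNPQRSTVWY"


def amino_acid_composition(sequence):
    """Return the 20-dimensional vector of residue frequencies."""
    sequence = sequence.upper()
    if not sequence:
        raise ValueError("sequence must not be empty")
    return [sequence.count(aa) / len(sequence) for aa in AMINO_ACIDS]


def dipeptide_composition(sequence):
    """Return the 400-dimensional vector of adjacent-pair frequencies."""
    sequence = sequence.upper()
    if len(sequence) < 2:
        raise ValueError("sequence must contain at least two residues")
    # Overlapping adjacent pairs, e.g. "AAKK" -> ["AA", "AK", "KK"].
    pairs = [sequence[i:i + 2] for i in range(len(sequence) - 1)]
    return [pairs.count(a + b) / len(pairs)
            for a, b in product(AMINO_ACIDS, repeat=2)]


# Hypothetical four-residue peptide.
aac = amino_acid_composition("AAKK")
dpc = dipeptide_composition("AAKK")
print(len(aac), len(dpc))
```

Both vectors discard residue order beyond adjacent pairs, which is why such descriptors are often combined with sequence models like the BiLSTM described in the abstract.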
Affiliation(s)
- Lun Zhu: School of Computer Science and Artificial Intelligence Aliyun School of Big Data School of Software, Changzhou University, Changzhou, 213164, China
- Chenyang Ye: School of Computer Science and Artificial Intelligence Aliyun School of Big Data School of Software, Changzhou University, Changzhou, 213164, China
- Xuemei Hu: Key Laboratory of Symbol Computation and Knowledge Engineering of Ministry of Education, College of Computer Science and Technology, Jilin University, Changchun, 130012, China
- Sen Yang: School of Computer Science and Artificial Intelligence Aliyun School of Big Data School of Software, Changzhou University, Changzhou, 213164, China; Changzhou No.2 People's Hospital, the Affiliated Hospital of Nanjing Medical University, Changzhou, 213164, China
- Chenyang Zhu: School of Computer Science and Artificial Intelligence Aliyun School of Big Data School of Software, Changzhou University, Changzhou, 213164, China
|
32
|
Otović E, Njirjak M, Kalafatovic D, Mauša G. Sequential Properties Representation Scheme for Recurrent Neural Network-Based Prediction of Therapeutic Peptides. J Chem Inf Model 2022; 62:2961-2972. [PMID: 35704881 DOI: 10.1021/acs.jcim.2c00526] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 11/30/2022]
Abstract
The discovery of therapeutic peptides is often accelerated by means of virtual screening supported by machine learning-based predictive models. The predictive performance of such models is sensitive to the choice of data and its representation scheme. While the peptide physicochemical and compositional representations fail to distinguish sequence permutations, the amino acid arrangement within the sequence lacks the important information contained in physicochemical, conformational, topological, and geometrical properties. In this paper, we propose a solution to the identified information gap by implementing a hybrid scheme that complements the best traits from both approaches with the aim of predicting antimicrobial and antiviral activities based on experimental data from DRAMP 2.0, AVPdb, and Uniprot data repositories. Using the Friedman test of statistical significance, we compared our hybrid, sequential properties approach to peptide properties, one-hot vector encoding, and word embedding schemes in the 10-fold cross-validation setting, with respect to the F1 score, Matthews correlation coefficient, geometric mean, recall, and precision evaluation metrics. Moreover, the sequence modeling neural network was employed to gain insight into the synergic effect of both properties- and amino acid order-based predictions. The results suggest that sequential properties significantly (P < 0.01) surpasses the aforementioned state-of-the-art representation schemes. This makes it a strong candidate for increasing the predictive power of screening methods based on machine learning, applicable to any category of peptides.
Affiliation(s)
- Erik Otović: University of Rijeka, Faculty of Engineering, 51000 Rijeka, Croatia
- Marko Njirjak: University of Rijeka, Faculty of Engineering, 51000 Rijeka, Croatia
- Daniela Kalafatovic: University of Rijeka, Department of Biotechnology, 51000 Rijeka, Croatia; University of Rijeka, Center for Artificial Intelligence and Cybersecurity, 51000 Rijeka, Croatia
- Goran Mauša: University of Rijeka, Faculty of Engineering, 51000 Rijeka, Croatia; University of Rijeka, Center for Artificial Intelligence and Cybersecurity, 51000 Rijeka, Croatia
|
33
|
Multi-channel CNN based anticancer peptides identification. Anal Biochem 2022; 650:114707. [PMID: 35568159 DOI: 10.1016/j.ab.2022.114707] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 08/24/2021] [Revised: 01/27/2022] [Accepted: 04/27/2022] [Indexed: 11/20/2022]
Abstract
Cancer is one of the most dangerous diseases in the world and often leads to misery and death. Current treatments include different kinds of anticancer therapy, which exhibit different types of side effects. Because of certain physicochemical properties, anticancer peptides (ACPs) have opened a new path of treatment for this deadly disease. That is why a well-performing methodology for identifying novel anticancer peptides is of great importance in the fight against cancer. In addition to laboratory techniques, various machine learning and deep learning methodologies have been developed in recent years for this task. Although these models have shown reasonable predictive ability, there is still room for improvement in terms of performance and for exploring new types of algorithms. In this work, we have proposed a novel multi-channel convolutional neural network (CNN) for identifying anticancer peptides from protein sequences. We have collected data from the existing state-of-the-art methodologies and applied binary encoding for data preprocessing. We have also employed k-fold cross-validation to train our models on benchmark datasets and compared our models' performance on the independent datasets. The comparison has indicated our models' superiority on various evaluation metrics. We think our work can be a valuable asset in finding novel anticancer peptides. We have provided a user-friendly web server for academic purposes and it is publicly available at: http://103.99.176.239/iacp-cnn/.
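Binary encoding, the preprocessing step named in this abstract, is commonly implemented as a one-hot matrix with one row per sequence position. The sketch below shows that common form only. The alphabet ordering, the fixed `max_len`, the all-zero rows used for padding and unknown residues, and the name `binary_encode` are assumptions and may differ from the authors' preprocessing.

```python
# The 20 standard amino acids; the ordering is an arbitrary assumption.
AMINO_ACIDS = "ACDEFGHIKLMNPQRSTVWY"


def binary_encode(sequence, max_len):
    """One-hot encode a peptide into a max_len x 20 matrix of 0/1 values."""
    matrix = []
    for position in range(max_len):
        row = [0] * len(AMINO_ACIDS)
        if position < len(sequence):
            index = AMINO_ACIDS.find(sequence[position].upper())
            if index >= 0:  # unknown residues stay as an all-zero row
                row[index] = 1
        matrix.append(row)  # positions past the end are zero padding
    return matrix


# Hypothetical two-residue peptide padded to three positions.
encoded_matrix = binary_encode("AC", max_len=3)
for row in encoded_matrix:
    print(row)
```

A fixed-size matrix of this kind can be fed directly to convolutional layers, with each CNN channel or kernel size scanning a different window of positions.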
|
34
|
Development of Anticancer Peptides Using Artificial Intelligence and Combinational Therapy for Cancer Therapeutics. Pharmaceutics 2022; 14:pharmaceutics14050997. [PMID: 35631583 PMCID: PMC9147327 DOI: 10.3390/pharmaceutics14050997] [Citation(s) in RCA: 14] [Impact Index Per Article: 7.0] [Received: 03/29/2022] [Revised: 04/28/2022] [Accepted: 05/04/2022] [Indexed: 01/27/2023] Open
Abstract
Cancer is a group of diseases causing abnormal cell growth, altering the genome, and invading or spreading to other parts of the body. Among therapeutic peptide drugs, anticancer peptides (ACPs) have been considered for targeting and killing cancer cells because cancer cells have unique characteristics, such as a high negative charge and an abundance of microvilli in the cell membrane, when compared to normal cells. ACPs have several advantages, such as high specificity, cost-effectiveness, low immunogenicity, minimal toxicity, and high tolerance under normal physiological conditions. However, the development and identification of ACPs are time-consuming and expensive in traditional wet-lab-based approaches. Thus, applying artificial intelligence to these approaches can save time and reduce the cost of identifying candidate ACPs. Recently, machine learning (ML), deep learning (DL), and hybrid learning (ML combined with DL) have emerged in the development of ACPs without experimental analysis, owing to advances in computer power and big data from the power system. Additionally, we suggest that combination therapy with classical approaches and ACPs might be one of the impactful approaches to increase the efficiency of cancer therapy.
|
35
|
Alqahtani A. Application of Artificial Intelligence in Discovery and Development of Anticancer and Antidiabetic Therapeutic Agents. Evid Based Complement Alternat Med 2022; 2022:6201067. [PMID: 35509623 PMCID: PMC9060979 DOI: 10.1155/2022/6201067] [Citation(s) in RCA: 4] [Impact Index Per Article: 2.0] [Received: 03/04/2022] [Revised: 03/17/2022] [Accepted: 04/05/2022] [Indexed: 11/18/2022]
Abstract
Spectacular developments in molecular and cellular biology have led to important discoveries in cancer research. Cancer is one of the major causes of morbidity and mortality globally, and diabetes is one of the leading sources of a group of disorders. Artificial intelligence (AI) has been considered the machine of the fourth industrial revolution. The major hurdles in drug discovery and development are the time and expenditure required to sustain the drug research pipeline. Large amounts of data can be explored and generated by AI, which can then be converted into useful knowledge. Because of this, the world's largest drug companies have already begun to use AI in their drug development research. In the present era, AI has huge potential for the rapid discovery and development of new anticancer drugs. Clinical studies, electronic medical records, high-resolution medical imaging, and genomic assessments are just a few of the tools that could aid drug development. Large data sets are available to researchers in the pharmaceutical and medical fields, which can be analyzed by advanced AI systems. This review looked at how computational biology and AI technologies may be utilized in cancer precision drug development by combining knowledge of cancer medicines, drug resistance, and structural biology. This review also provided a realistic assessment of the potential for AI in understanding and managing diabetes.
Affiliation(s)
- Amal Alqahtani: College of Medicine, Imam Abdulrahman Bin Faisal University, Dammam, 31541, Saudi Arabia; Department of Basic Sciences, Deanship of Preparatory Year and Supporting Studies, Imam Abdulrahman Bin Faisal University, P.O. Box 1982, Dammam 34212, Saudi Arabia
|
36
|
Yu L, Zhang Y, Xue L, Liu F, Chen Q, Luo J, Jing R. Systematic Analysis and Accurate Identification of DNA N4-Methylcytosine Sites by Deep Learning. Front Microbiol 2022; 13:843425. [PMID: 35401453 PMCID: PMC8989013 DOI: 10.3389/fmicb.2022.843425] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 12/25/2021] [Accepted: 02/21/2022] [Indexed: 11/13/2022] Open
Abstract
DNA N4-methylcytosine (4mC) is a pivotal epigenetic modification that plays an essential role in DNA replication, repair, expression and differentiation. To gain insight into the biological functions of 4mC, it is critical to identify its modification sites in the genome. Deep learning has become increasingly popular in recent years and is frequently employed for 4mC site identification. However, a systematic analysis of how to build predictive models using deep learning techniques is still lacking. In this work, we first summarized all existing deep learning-based predictors and systematically analyzed their models, features and datasets. Then, using a typical standard dataset with three species (A. thaliana, C. elegans, and D. melanogaster), we assessed the contribution of different model architectures, encoding methods and the attention mechanism in establishing a deep learning-based model for 4mC site prediction. After a series of optimizations, a convolutional-recurrent neural network architecture using one-hot encoding and the attention mechanism achieved the best overall prediction performance. Extensive comparison experiments were conducted based on the same dataset. This work will be helpful for researchers who would like to build 4mC prediction models using deep learning in the future.
Affiliation(s)
- Lezheng Yu: School of Chemistry and Materials Science, Guizhou Education University, Guiyang, China
- Yonglin Zhang: Department of Pharmacology, School of Pharmacy, Southwest Medical University, Luzhou, China
- Li Xue: School of Public Health, Southwest Medical University, Luzhou, China
- Fengjuan Liu: School of Geography and Resources, Guizhou Education University, Guiyang, China
- Qi Chen: Department of Endocrinology and Metabolism, The Affiliated Hospital of Southwest Medical University, Luzhou, China
- Jiesi Luo: Department of Pharmacology, School of Pharmacy, Southwest Medical University, Luzhou, China; Department of Pharmacy, The Affiliated Hospital of Southwest Medical University, Luzhou, China
- Runyu Jing: School of Cyber Science and Engineering, Sichuan University, Chengdu, China
|
37
|
ACPNet: A Deep Learning Network to Identify Anticancer Peptides by Hybrid Sequence Information. Molecules 2022; 27:molecules27051544. [PMID: 35268644 PMCID: PMC8912097 DOI: 10.3390/molecules27051544] [Citation(s) in RCA: 1] [Impact Index Per Article: 0.5] [Received: 01/03/2022] [Revised: 02/20/2022] [Accepted: 02/23/2022] [Indexed: 12/18/2022] Open
Abstract
Cancer is one of the most dangerous threats to human health. One of the issues is drug resistance, which leads to side effects after drug treatment. Numerous therapies have endeavored to relieve drug resistance. Recently, anticancer peptides have emerged as novel and promising anticancer candidates, which can inhibit tumor cell proliferation and migration and suppress the formation of tumor blood vessels, with fewer side effects. However, it is costly, laborious and time-consuming to identify anticancer peptides by biological experiments at high throughput. Therefore, accurately identifying anticancer peptides becomes a key and indispensable step for anticancer peptide therapy. Although some existing computational methods have been developed to predict anticancer peptides, the accuracy still needs to be improved. Thus, in this study, we propose a deep learning-based model, called ACPNet, to distinguish anticancer peptides (ACPs) from non-anticancer peptides (non-ACPs). ACPNet employs three different types of information in the training process: peptide sequence information, peptide physicochemical properties, and auto-encoding features. ACPNet is a hybrid deep learning network that fuses fully connected networks and recurrent neural networks. The comparison with other existing methods on the ACPs82 dataset shows that ACPNet not only achieves improvements of 1.2% in accuracy, 2.0% in F1-score, and 7.2% in recall, but also achieves balanced performance on the Matthews correlation coefficient. Meanwhile, ACPNet is verified on an independent dataset of 20 proven anticancer peptides, of which only one is predicted as a non-ACP. The comparison and independent validation experiment indicate that ACPNet can accurately distinguish anticancer peptides from non-ACPs.
|
38
|
You H, Yu L, Tian S, Ma X, Xing Y, Song J, Wu W. Anti-cancer Peptide Recognition Based on Grouped Sequence and Spatial Dimension Integrated Networks. Interdiscip Sci 2021; 14:196-208. [PMID: 34637113 DOI: 10.1007/s12539-021-00481-0] [Citation(s) in RCA: 1] [Impact Index Per Article: 0.3] [Received: 05/08/2021] [Revised: 09/05/2021] [Accepted: 09/09/2021] [Indexed: 11/24/2022]
Abstract
The diversification of the characteristic sequences of anti-cancer peptides has imposed difficulties on research. To effectively predict new anti-cancer peptides, this paper proposes a more suitable feature grouping sequence and spatial dimension-integrated network algorithm for anti-cancer peptide sequence prediction called GRCI-Net. The main process is as follows. First, we fused and reduced binary structure features and K-mer sparse matrix features through principal component analysis and generated a set of new features. Second, we constructed a new bidirectional long short-term memory network. We used traditional convolution and dilated convolution to acquire features in the spatial dimension using the memory network's grouping sequence model, which is designed to better handle the diversification of anti-cancer peptide feature sequences and to fully learn the contextual information between features. Finally, we fused the grouping sequence features and spatial dimensional integration features through two sets of dense network layers, predicted anti-cancer peptides through the sigmoid function, and verified the approach with two public datasets, ACP740 (accuracy reached 0.8230) and ACP240 (accuracy reached 0.8750). The model code and datasets mentioned in this article are available at https://github.com/YouHongfeng101/ACP-DL.
Affiliation(s)
- Hongfeng You
  - College of Information Science and Engineering, Xinjiang University, 666 Shengli Road, Tianshan District, Urumqi, Xinjiang, China
- Long Yu
  - Network Center, Xinjiang University, Xinjiang, China
- Shengwei Tian
  - School of Software, Xinjiang University, Tianshan District, 666 Shengli Road, Urumqi, Xinjiang, China
- Xiang Ma
  - Department of Cardiology, The First Affiliated Hospital of Xinjiang Medical University, Urumqi, 830011, China
- Yan Xing
  - Imaging Center, The First Affiliated Hospital of Xinjiang Medical University, No. 137, LiYuShan South Road, Urumqi, Xinjiang, China
- Jinmiao Song
  - College of Information Science and Engineering, Xinjiang University, Urumqi, Xinjiang, China
- Weidong Wu
  - People's Hospital of Xinjiang Uygur Autonomous Region, Urumqi, Xinjiang, China

39
Cai L, Wang L, Fu X, Zeng X. Active Semisupervised Model for Improving the Identification of Anticancer Peptides. ACS OMEGA 2021; 6:23998-24008. [PMID: 34568678 PMCID: PMC8459422 DOI: 10.1021/acsomega.1c03132] [Citation(s) in RCA: 4] [Impact Index Per Article: 1.3] [Received: 06/15/2021] [Indexed: 06/13/2023]
Abstract
Cancer is one of the most dangerous threats to human health. Accurate identification of anticancer peptides (ACPs) is valuable for the development and design of new anticancer agents. However, most machine-learning algorithms have limited ability to identify ACPs, and their accuracy is sensitive to the amount of labeled data. In this paper, we construct a new method, called ACP-ALPM, that combines active learning (AL) with a label propagation (LP) algorithm to solve this problem. First, we develop an efficient feature representation method based on various descriptor information and coding information of the peptide sequence. Then, an AL strategy is used to filter out the most informative data for model training, and a more powerful LP classifier is obtained through continuous iterations. Finally, we evaluate the performance of ACP-ALPM and compare it with that of some state-of-the-art and classic methods; experimental results show that our method is significantly superior to them. In addition, an experimental comparison of random selection and AL on three public data sets shows that the AL strategy is more effective. Notably, a visualization experiment further verified that AL can utilize unlabeled data to improve the performance of the model. We hope that our method can be extended to other types of peptides and provide more inspiration for other similar work.
Affiliation(s)
- Lijun Cai
  - Department of Information Science and Technology, Hunan University, Changsha, Hunan 410000, China
- Li Wang
  - Department of Information Science and Technology, Hunan University, Changsha, Hunan 410000, China
- Xiangzheng Fu
  - Department of Information Science and Technology, Hunan University, Changsha, Hunan 410000, China
- Xiangxiang Zeng
  - Department of Information Science and Technology, Hunan University, Changsha, Hunan 410000, China

40
Jiang Y, Yin Z, Zhao J, Sun J, Zhao D, Zeng XA, Li H, Huang M, Wu J. Antioxidant mechanism exploration of the tripeptide Val-Asn-Pro generated from Jiuzao and its potential application in baijiu. Food Chem Toxicol 2021; 155:112402. [PMID: 34246709 DOI: 10.1016/j.fct.2021.112402] [Citation(s) in RCA: 2] [Impact Index Per Article: 0.7] [Received: 05/04/2021] [Revised: 07/05/2021] [Accepted: 07/07/2021] [Indexed: 01/27/2023]
Abstract
Val-Asn-Pro (VNP) was identified from the raw material of baijiu distillation (Jiupei) and exhibited antioxidant activity in vitro. In this study, the residue after baijiu distillation (Jiuzao) was used to seek the antioxidant peptide VNP with the methods reported in the previous study. Its potential antioxidant mechanism in vivo was further assessed. Gene and protein expression of the Nrf2/Keap1-p38MAPK/PI3K-MafK signaling pathway and downstream enzymes (i.e., CAT, GPX1, SOD1, and HO-1) in Sprague-Dawley (SD) rats with AAPH-induced oxidative stress was investigated. The influence of VNP on baijiu characteristics was also investigated. Based on the results, VNP was identified at a content of 5.25 mg/g Jiuzao. VNP significantly mitigated excess oxidative stress via activation of the Nrf2/Keap1-p38MAPK/PI3K-MafK signaling pathway and activated downstream antioxidant enzymes. Furthermore, VNP had no conspicuous influence on the flavor and taste of baijiu when added to it, and its content remained stable during storage. These results indicated that VNP is a potent antioxidant component isolated from Jiuzao that can be used in baijiu to enhance its antioxidant effect without affecting the main flavor and taste. The utilization of these functional components can also increase the added value of Jiuzao.
Affiliation(s)
- Yunsong Jiang
  - School of Food Science and Engineering, South China University of Technology, Guangzhou, 510640, China; Beijing Laboratory of Food Quality and Safety, Beijing Technology and Business University, Beijing, 100048, China
- Zhongtian Yin
  - Beijing Laboratory of Food Quality and Safety, Beijing Technology and Business University, Beijing, 100048, China; College of Food Science and Nutritional Engineering, China Agricultural University, Beijing, 100048, China
- Jiwen Zhao
  - Technical Center of Bandaojing Co. Ltd., Gaoqing, Shandong, 256300, China
- Jinyuan Sun
  - Beijing Laboratory of Food Quality and Safety, Beijing Technology and Business University, Beijing, 100048, China
- Dongrui Zhao
  - Beijing Laboratory of Food Quality and Safety, Beijing Technology and Business University, Beijing, 100048, China
- Xin-An Zeng
  - School of Food Science and Engineering, South China University of Technology, Guangzhou, 510640, China
- Hehe Li
  - Beijing Laboratory of Food Quality and Safety, Beijing Technology and Business University, Beijing, 100048, China; College of Food Science and Nutritional Engineering, China Agricultural University, Beijing, 100048, China
- Mingquan Huang
  - Beijing Laboratory of Food Quality and Safety, Beijing Technology and Business University, Beijing, 100048, China
- Jihong Wu
  - Beijing Laboratory of Food Quality and Safety, Beijing Technology and Business University, Beijing, 100048, China

41
Nasiri F, Atanaki FF, Behrouzi S, Kavousi K, Bagheri M. CpACpP: In Silico Cell-Penetrating Anticancer Peptide Prediction Using a Novel Bioinformatics Framework. ACS OMEGA 2021; 6:19846-19859. [PMID: 34368571 PMCID: PMC8340416 DOI: 10.1021/acsomega.1c02569] [Citation(s) in RCA: 10] [Impact Index Per Article: 3.3] [Received: 05/17/2021] [Accepted: 07/13/2021] [Indexed: 05/12/2023]
Abstract
Cell-penetrating anticancer peptides (Cp-ACPs) are considered promising candidates in solid tumor and hematologic cancer therapies. Current approaches for the design and discovery of Cp-ACPs rely on expensive high-throughput screening, which often gives rise to multiple obstacles, including instrumentation adaptation and experimental handling. The application of machine learning (ML) tools developed for peptide activity prediction is therefore of growing interest. In this study, we applied random forest (RF)-, support vector machine (SVM)-, and eXtreme gradient boosting (XGBoost)-based algorithms to predict active Cp-ACPs using an experimentally validated data set. The model, CpACpP, was developed on the basis of two independent subpredictors, one for cell-penetrating peptides (CPPs) and one for anticancer peptides (ACPs). Various compositional and physicochemical features were combined or selected using the multilayered recursive feature elimination (RFE) method for both data sets. Our results showed that the ACP subclassifiers obtain a mean performance accuracy (ACC) of 0.98 with an area under curve (AUC) ≈ 0.98, whereas the CPP predictors display corresponding values of ∼0.94 and ∼0.95 via the hybrid-based features and independent data sets, respectively. Also, the evaluation of Cp-ACP prediction gave accuracies of ∼0.79 and 0.89 on a series of independent sequences by applying our CPP and ACP classifiers, respectively, which makes the performance of our predictors better than that of the earlier reported ACPred, mACPpred, MLCPP, and CPPred-RF. The described consensus-based fusion method additionally reached an AUC of 0.94 for the prediction of Cp-ACPs (http://cbb1.ut.ac.ir/CpACpP/Index).
Affiliation(s)
- Farid Nasiri
  - Peptide Chemistry Laboratory, Department of Biochemistry, Institute of Biochemistry and Biophysics (IBB), University of Tehran, Tehran 14176-14335, Iran
- Fereshteh Fallah Atanaki
  - Laboratory of Complex Biological Systems and Bioinformatics (CBB), Department of Bioinformatics, Institute of Biochemistry and Biophysics (IBB), University of Tehran, Tehran 14176-14411, Iran
- Saman Behrouzi
  - Laboratory of Complex Biological Systems and Bioinformatics (CBB), Department of Bioinformatics, Institute of Biochemistry and Biophysics (IBB), University of Tehran, Tehran 14176-14411, Iran
- Kaveh Kavousi
  - Laboratory of Complex Biological Systems and Bioinformatics (CBB), Department of Bioinformatics, Institute of Biochemistry and Biophysics (IBB), University of Tehran, Tehran 14176-14411, Iran
- Mojtaba Bagheri
  - Peptide Chemistry Laboratory, Department of Biochemistry, Institute of Biochemistry and Biophysics (IBB), University of Tehran, Tehran 14176-14335, Iran

42
Cao R, Wang M, Bin Y, Zheng C. DLFF-ACP: prediction of ACPs based on deep learning and multi-view features fusion. PeerJ 2021; 9:e11906. [PMID: 34414035 PMCID: PMC8344685 DOI: 10.7717/peerj.11906] [Citation(s) in RCA: 4] [Impact Index Per Article: 1.3] [Received: 02/04/2021] [Accepted: 07/14/2021] [Indexed: 01/10/2023]
Abstract
An emerging type of therapeutic agent, anticancer peptides (ACPs), has attracted attention because of its lower risk of toxic side effects. However, the process of identifying ACPs using experimental methods is both time-consuming and laborious. In this study, we developed a new and efficient algorithm that predicts ACPs by fusing multi-view features based on a dual-channel deep neural network ensemble model. In the model, one channel used a convolutional neural network (CNN) to automatically extract the potential spatial features of a sequence. The other channel was used to process and extract more effective features from handcrafted features. Additionally, an effective feature fusion method was explored for the mutual fusion of different features. Finally, we adopted the neural network to predict ACPs based on the fusion features. The performance comparisons across the single and fusion features showed that the fusion of multi-view features could effectively improve the model's predictive ability. Among these, the fusion of the features extracted by the CNN and the composition of k-spaced amino acid group pairs achieved the best performance. To further validate the performance of our model, we compared it with other existing methods using two independent test sets. The results showed that our model's area under curve was 0.90, which was higher than that of the other existing methods on the first test set and higher than most of the other existing methods on the second test set. The source code and datasets are available at https://github.com/wame-ng/DLFF-ACP.
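The handcrafted descriptor singled out in this abstract, the composition of k-spaced amino acid group pairs, can be sketched as follows. The five-group residue partition used here is an assumption borrowed from common descriptor toolkits, since the abstract does not state the grouping; the function name `cksaagp` and the example peptide `KKAF` are hypothetical, and this is not the DLFF-ACP code.

```python
from itertools import product

# Assumed five-group partition of the 20 standard residues.
GROUPS = {
    "aliphatic": "GAVLMI",
    "aromatic": "FYW",
    "positive": "KRH",
    "negative": "DE",
    "uncharged": "STCPNQ",
}
RESIDUE_TO_GROUP = {res: name
                    for name, members in GROUPS.items()
                    for res in members}


def cksaagp(sequence, gap=0):
    """Frequency of each ordered group pair whose residues are `gap` positions apart."""
    pairs = {f"{a}.{b}": 0 for a, b in product(GROUPS, repeat=2)}
    total = 0
    for i in range(len(sequence) - gap - 1):
        first = RESIDUE_TO_GROUP.get(sequence[i])
        second = RESIDUE_TO_GROUP.get(sequence[i + gap + 1])
        if first is None or second is None:
            continue  # ignore non-standard residue symbols
        pairs[f"{first}.{second}"] += 1
        total += 1
    if total == 0:
        return {key: 0.0 for key in pairs}
    # Normalise counts so the 25 features sum to 1 for this gap.
    return {key: count / total for key, count in pairs.items()}


features = cksaagp("KKAF", gap=1)  # hypothetical peptide; pairs are K_A and K_F
```

With five groups, each gap value contributes 25 features, so concatenating several gaps yields a compact fixed-length vector regardless of peptide length.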
Affiliation(s)
- Ruifen Cao
  - Key Laboratory of Intelligent Computing and Signal Processing of Ministry of Education, School of Computer Science and Technology, Anhui University, Hefei, Anhui, China
  - Engineering Research Center of Big Data Application in Private Health Medicine, Fujian Province University, Putian, Fujian, China
- Meng Wang
  - Key Laboratory of Intelligent Computing and Signal Processing of Ministry of Education, School of Computer Science and Technology, Anhui University, Hefei, Anhui, China
- Yannan Bin
  - Key Laboratory of Intelligent Computing and Signal Processing of Ministry of Education, School of Computer Science and Technology, Anhui University, Hefei, Anhui, China
  - Institutes of Physical Science and Information Technology, Anhui University, Hefei, Anhui, China
- Chunhou Zheng
  - Key Laboratory of Intelligent Computing and Signal Processing of Ministry of Education, School of Computer Science and Technology, Anhui University, Hefei, Anhui, China
  - Engineering Research Center of Big Data Application in Private Health Medicine, Fujian Province University, Putian, Fujian, China

43
Gupta R, Srivastava D, Sahu M, Tiwari S, Ambasta RK, Kumar P. Artificial intelligence to deep learning: machine intelligence approach for drug discovery. Mol Divers 2021; 25:1315-1360. [PMID: 33844136 PMCID: PMC8040371 DOI: 10.1007/s11030-021-10217-3] [Citation(s) in RCA: 264] [Impact Index Per Article: 88.0] [Received: 01/29/2021] [Accepted: 03/22/2021] [Indexed: 02/06/2023]
Abstract
Drug designing and development is an important area of research for pharmaceutical companies and chemical scientists. However, low efficacy, off-target delivery, time consumption, and high cost impose a hurdle and challenges that impact drug design and discovery. Further, complex and big data from genomics, proteomics, microarray data, and clinical trials also impose an obstacle in the drug discovery pipeline. Artificial intelligence and machine learning technology play a crucial role in drug discovery and development. In other words, artificial neural networks and deep learning algorithms have modernized the area. Machine learning and deep learning algorithms have been implemented in several drug discovery processes such as peptide synthesis, structure-based virtual screening, ligand-based virtual screening, toxicity prediction, drug monitoring and release, pharmacophore modeling, quantitative structure-activity relationship, drug repositioning, polypharmacology, and physiochemical activity. Evidence from the past strengthens the implementation of artificial intelligence and deep learning in this field. Moreover, novel data mining, curation, and management techniques provided critical support to recently developed modeling algorithms. In summary, artificial intelligence and deep learning advancements provide an excellent opportunity for rational drug design and discovery process, which will eventually impact mankind. The primary concern associated with drug design and development is time consumption and production cost. Further, inefficiency, inaccurate target delivery, and inappropriate dosage are other hurdles that inhibit the process of drug delivery and development. With advancements in technology, computer-aided drug design integrating artificial intelligence algorithms can eliminate the challenges and hurdles of traditional drug design and development. 
Artificial intelligence is a superset that comprises machine learning, whereas machine learning comprises supervised learning, unsupervised learning, and reinforcement learning. Further, deep learning, a subset of machine learning, has been extensively implemented in drug design and development. Artificial neural networks, deep neural networks, support vector machines, classification and regression, generative adversarial networks, symbolic learning, and meta-learning are examples of the algorithms applied to the drug design and discovery process. Artificial intelligence has been applied to different areas of the drug design and development process, ranging from peptide synthesis to molecule design, virtual screening to molecular docking, quantitative structure-activity relationship to drug repositioning, protein misfolding to protein-protein interactions, and molecular pathway identification to polypharmacology. Artificial intelligence principles have been applied to the classification of active and inactive compounds, monitoring of drug release, pre-clinical and clinical development, primary and secondary drug screening, biomarker development, pharmaceutical manufacturing, bioactivity identification and physiochemical properties, prediction of toxicity, and identification of mode of action.
Affiliation(s)
- Rohan Gupta
  - Molecular Neuroscience and Functional Genomics Laboratory, Department of Biotechnology, Delhi Technological University (Formerly DCE), Shahbad Daulatpur, Bawana Road, Delhi, 110042, India
- Devesh Srivastava
  - Molecular Neuroscience and Functional Genomics Laboratory, Department of Biotechnology, Delhi Technological University (Formerly DCE), Shahbad Daulatpur, Bawana Road, Delhi, 110042, India
- Mehar Sahu
  - Molecular Neuroscience and Functional Genomics Laboratory, Department of Biotechnology, Delhi Technological University (Formerly DCE), Shahbad Daulatpur, Bawana Road, Delhi, 110042, India
- Swati Tiwari
  - Molecular Neuroscience and Functional Genomics Laboratory, Department of Biotechnology, Delhi Technological University (Formerly DCE), Shahbad Daulatpur, Bawana Road, Delhi, 110042, India
- Rashmi K Ambasta
  - Molecular Neuroscience and Functional Genomics Laboratory, Department of Biotechnology, Delhi Technological University (Formerly DCE), Shahbad Daulatpur, Bawana Road, Delhi, 110042, India
- Pravir Kumar
  - Molecular Neuroscience and Functional Genomics Laboratory, Department of Biotechnology, Delhi Technological University (Formerly DCE), Shahbad Daulatpur, Bawana Road, Delhi, 110042, India

44
Chen J, Cheong HH, Siu SWI. xDeep-AcPEP: Deep Learning Method for Anticancer Peptide Activity Prediction Based on Convolutional Neural Network and Multitask Learning. J Chem Inf Model 2021; 61:3789-3803. [PMID: 34327990 DOI: 10.1021/acs.jcim.1c00181] [Citation(s) in RCA: 29] [Impact Index Per Article: 9.7] [Indexed: 12/30/2022]
Abstract
Cancer is one of the leading causes of death worldwide. Conventional cancer treatment relies on radiotherapy and chemotherapy, but both methods bring severe side effects to patients, as these therapies not only attack cancer cells but also damage normal cells. Anticancer peptides (ACPs) are a promising alternative as therapeutic agents that are efficient and selective against tumor cells. Here, we propose a deep learning method based on convolutional neural networks to predict biological activity (EC50, LC50, IC50, and LD50) against six tumor cells, including breast, colon, cervix, lung, skin, and prostate. We show that models derived with multitask learning achieve better performance than conventional single-task models. In repeated 5-fold cross validation using the CancerPPD data set, the best models with the applicability domain defined obtain an average mean squared error of 0.1758, Pearson's correlation coefficient of 0.8086, and Kendall's correlation coefficient of 0.6156. As a step toward model interpretability, we infer the contribution of each residue in the sequence to the predicted activity by means of feature importance weights derived from the convolutional layers of the model. The present method, referred to as xDeep-AcPEP, will help to identify effective ACPs in rational peptide design for therapeutic purposes. The data, script files for reproducing the experiments, and the final prediction models can be downloaded from http://github.com/chen709847237/xDeep-AcPEP. The web server to directly access this prediction method is at https://app.cbbio.online/acpep/home.
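The three regression metrics quoted in this abstract (mean squared error, Pearson's correlation coefficient, and Kendall's correlation coefficient) follow textbook definitions, sketched below in plain Python. The abstract does not say which Kendall variant was used, so the tie-free tau-a form is an assumption, and the activity values are made up for illustration rather than taken from CancerPPD.

```python
import math


def mse(y_true, y_pred):
    """Mean squared error between measured and predicted values."""
    return sum((t - p) ** 2 for t, p in zip(y_true, y_pred)) / len(y_true)


def pearson(x, y):
    """Pearson's linear correlation coefficient."""
    mean_x, mean_y = sum(x) / len(x), sum(y) / len(y)
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    norm_x = math.sqrt(sum((a - mean_x) ** 2 for a in x))
    norm_y = math.sqrt(sum((b - mean_y) ** 2 for b in y))
    return cov / (norm_x * norm_y)


def kendall_tau_a(x, y):
    """Kendall's tau-a: (concordant - discordant) / number of pairs."""
    n = len(x)
    concordant = discordant = 0
    for i in range(n):
        for j in range(i + 1, n):
            sign = (x[i] - x[j]) * (y[i] - y[j])
            if sign > 0:
                concordant += 1
            elif sign < 0:
                discordant += 1
    return (concordant - discordant) / (n * (n - 1) / 2)


# Hypothetical measured vs. predicted activities (arbitrary units).
measured = [1.0, 2.0, 3.0, 4.0]
predicted = [1.5, 1.0, 3.5, 4.0]

error = mse(measured, predicted)
r = pearson(measured, predicted)
tau = kendall_tau_a(measured, predicted)
```

Reporting a rank correlation alongside Pearson's coefficient is useful here because a model that orders peptides correctly can still be valuable for screening even when its absolute activity estimates are off.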
Affiliation(s)
- Jiarui Chen
  - Department of Computer and Information Science, University of Macau, Avenida da Universidade, Taipa, Macau 999078, China
- Hong Hin Cheong
  - Department of Computer and Information Science, University of Macau, Avenida da Universidade, Taipa, Macau 999078, China
- Shirley W I Siu
  - Department of Computer and Information Science, University of Macau, Avenida da Universidade, Taipa, Macau 999078, China
  - School of Pharmaceutical Sciences, Universiti Sains Malaysia, 11800 USM, Penang, Malaysia

45
Chen XG, Zhang W, Yang X, Li C, Chen H. ACP-DA: Improving the Prediction of Anticancer Peptides Using Data Augmentation. Front Genet 2021; 12:698477. [PMID: 34276801 PMCID: PMC8279753 DOI: 10.3389/fgene.2021.698477] [Citation(s) in RCA: 17] [Impact Index Per Article: 5.7] [Received: 04/21/2021] [Accepted: 06/07/2021] [Indexed: 12/09/2022]
Abstract
Anticancer peptides (ACPs) have provided a promising perspective for cancer treatment, and the prediction of ACPs is very important for the discovery of new cancer treatment drugs. It is time consuming and expensive to use experimental methods to identify ACPs, so computational methods for ACP identification are urgently needed. There have been many effective computational methods, especially machine learning-based methods, proposed for such predictions. Most of the current machine learning methods try to find suitable features or design effective feature learning techniques to accurately represent ACPs. However, the performance of these methods can be further improved for cases with insufficient numbers of samples. In this article, we propose an ACP prediction model called ACP-DA (Data Augmentation), which uses data augmentation for insufficient samples to improve the prediction performance. In our method, to better exploit the information of peptide sequences, peptide sequences are represented by integrating binary profile features and AAindex features, and then the samples in the training set are augmented in the feature space. After data augmentation, the samples are used to train the machine learning model, which is used to predict ACPs. The performance of ACP-DA exceeds that of existing methods, and ACP-DA achieves better performance in the prediction of ACPs compared with a method without data augmentation. The proposed method is available at http://github.com/chenxgscuec/ACPDA.
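The two ideas in this abstract, a binary profile representation and augmentation in feature space, can be illustrated with a short sketch. The abstract does not specify how samples are augmented, so the Gaussian perturbation below is an assumed stand-in rather than the ACP-DA procedure; the AAindex features are omitted, and the names `binary_profile` and `augment`, the window length of 5, and the noise scale are hypothetical.

```python
import random

AMINO_ACIDS = "ACDEFGHIKLMNPQRSTVWY"  # the 20 standard residues


def binary_profile(sequence, length=5):
    """One-hot encode the first `length` residues; shorter peptides are zero-padded."""
    vector = []
    for pos in range(length):
        one_hot = [0.0] * len(AMINO_ACIDS)
        if pos < len(sequence) and sequence[pos] in AMINO_ACIDS:
            one_hot[AMINO_ACIDS.index(sequence[pos])] = 1.0
        vector.extend(one_hot)
    return vector


def augment(features, copies=3, scale=0.05, seed=0):
    """Create perturbed copies of a feature vector by adding Gaussian noise.

    Assumption: small random perturbations keep the class label unchanged.
    """
    rng = random.Random(seed)  # seeded for reproducibility
    return [[value + rng.gauss(0.0, scale) for value in features]
            for _ in range(copies)]


base = binary_profile("FLKKL")   # hypothetical peptide
augmented = augment(base)        # three noisy variants of the same sample
```

Augmenting in feature space rather than sequence space avoids having to generate biologically plausible new peptides, at the cost of producing vectors that no real sequence maps to exactly.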
Affiliation(s)
- Xian-Gan Chen
  - School of Biomedical Engineering, South-Central University for Nationalities, Wuhan, China
  - Hubei Key Laboratory of Medical Information Analysis and Tumor Diagnosis & Treatment, South-Central University for Nationalities, Wuhan, China
  - Key Laboratory of Cognitive Science (South-Central University for Nationalities), State Ethnic Affairs Commission, Wuhan, China
- Wen Zhang
  - College of Informatics, Huazhong Agricultural University, Wuhan, China
  - Hubei Engineering Technology Research Center of Agricultural Big Data, Wuhan, China
- Xiaofei Yang
  - School of Biomedical Engineering, South-Central University for Nationalities, Wuhan, China
  - Hubei Key Laboratory of Medical Information Analysis and Tumor Diagnosis & Treatment, South-Central University for Nationalities, Wuhan, China
  - Key Laboratory of Cognitive Science (South-Central University for Nationalities), State Ethnic Affairs Commission, Wuhan, China
- Chenhong Li
  - School of Biomedical Engineering, South-Central University for Nationalities, Wuhan, China
  - Hubei Key Laboratory of Medical Information Analysis and Tumor Diagnosis & Treatment, South-Central University for Nationalities, Wuhan, China
  - Key Laboratory of Cognitive Science (South-Central University for Nationalities), State Ethnic Affairs Commission, Wuhan, China
- Hengling Chen
  - School of Biomedical Engineering, South-Central University for Nationalities, Wuhan, China
  - Hubei Key Laboratory of Medical Information Analysis and Tumor Diagnosis & Treatment, South-Central University for Nationalities, Wuhan, China
  - Key Laboratory of Cognitive Science (South-Central University for Nationalities), State Ethnic Affairs Commission, Wuhan, China

46
Zhao Y, Wang S, Fei W, Feng Y, Shen L, Yang X, Wang M, Wu M. Prediction of Anticancer Peptides with High Efficacy and Low Toxicity by Hybrid Model Based on 3D Structure of Peptides. Int J Mol Sci 2021; 22:5630. [PMID: 34073203 PMCID: PMC8198792 DOI: 10.3390/ijms22115630] [Citation(s) in RCA: 2] [Impact Index Per Article: 0.7] [Received: 03/15/2021] [Revised: 04/30/2021] [Accepted: 05/19/2021] [Indexed: 02/07/2023]
Abstract
Recently, anticancer peptides (ACPs) have emerged as unique and promising therapeutic agents for cancer treatment compared with antibody and small molecule drugs. In addition to experimental methods of ACP discovery, it is also necessary to develop accurate machine learning models for ACP prediction. In this study, features were extracted from the three-dimensional (3D) structure of peptides to develop the model, in contrast to most previous computational models, which are based on sequence information. In order to develop ACPs with more potency, more selectivity, and less toxicity, separate models for predicting ACPs, hemolytic peptides, and toxic peptides were established from peptide 3D structures. Multiple datasets were collected according to whether the peptide sequence was chemically modified. After feature extraction and screening, diverse algorithms were used to build the models. Twelve models with excellent performance (Acc > 90%) on the ACP mixed datasets were used to form a hybrid model to predict candidate ACPs, and then the optimal models for hemolytic peptides (Acc = 73.68%) and toxic peptides (Acc = 85.5%) were used for safety prediction. Novel ACPs were found by using those models, and five peptides were randomly selected to determine their anticancer activity and toxic side effects in in vitro experiments.
Affiliation(s)
- Min Wang
  - State Key Laboratory of Natural Medicines, School of Life Science and Technology, China Pharmaceutical University, Nanjing 210009, China; (Y.Z.); (S.W.); (W.F.); (Y.F.); (L.S.); (X.Y.)
- Min Wu
  - State Key Laboratory of Natural Medicines, School of Life Science and Technology, China Pharmaceutical University, Nanjing 210009, China; (Y.Z.); (S.W.); (W.F.); (Y.F.); (L.S.); (X.Y.)