151
|
Shao Y, Chou KC. pLoc_Deep-mEuk: Predict Subcellular Localization of Eukaryotic Proteins by Deep Learning. Natural Science 2020. [DOI: 10.4236/ns.2020.126034] [Citation(s) in RCA: 6] [Impact Index Per Article: 1.5] [Indexed: 11/20/2022]
|
152
|
iQSP: A Sequence-Based Tool for the Prediction and Analysis of Quorum Sensing Peptides via Chou's 5-Steps Rule and Informative Physicochemical Properties. Int J Mol Sci 2019; 21:ijms21010075. [PMID: 31861928 PMCID: PMC6981611 DOI: 10.3390/ijms21010075] [Citation(s) in RCA: 36] [Impact Index Per Article: 7.2] [Received: 11/18/2019] [Revised: 12/13/2019] [Accepted: 12/18/2019] [Indexed: 01/18/2023] Open
Abstract
Understanding the functional mechanisms of quorum-sensing peptides (QSPs) plays an essential role in finding new opportunities to combat bacterial infections through drug design. With the avalanche of newly available peptide sequences in the post-genomic age, it is highly desirable to develop a computational model for efficient, rapid, and high-throughput QSP identification based on peptide sequence information alone. Although a few methods have been developed for predicting QSPs, their prediction accuracy and interpretability still require further improvement. Thus, in this work, we propose an accurate sequence-based predictor (called iQSP) and a set of interpretable rules (called IR-QSP) for predicting and analyzing QSPs. In iQSP, we use a support vector machine (SVM) together with 18 informative features derived from physicochemical properties (PCPs). A rigorous independent validation test showed that iQSP achieved a maximum accuracy and MCC of 93.00% and 0.86, respectively. Furthermore, the interpretable rule set IR-QSP was extracted using a random forest model and the 18 informative PCPs. Finally, for the convenience of experimental scientists, the iQSP web server was established and made freely available online. It is anticipated that iQSP will become a useful tool, or at least a complement to existing methods, for predicting and analyzing QSPs.
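The accuracy and Matthews correlation coefficient (MCC) quoted in this abstract are both functions of the four cells of a binary confusion matrix. The following is a minimal sketch of the standard definitions. The function names and the example counts are hypothetical, chosen only for illustration, and are not taken from the iQSP test set.

```python
import math


def accuracy(tp: int, tn: int, fp: int, fn: int) -> float:
    """Fraction of all predictions that are correct."""
    return (tp + tn) / (tp + tn + fp + fn)


def mcc(tp: int, tn: int, fp: int, fn: int) -> float:
    """Matthews correlation coefficient of a binary confusion matrix."""
    denom = math.sqrt((tp + fp) * (tp + fn) * (tn + fp) * (tn + fn))
    if denom == 0:
        # Convention: MCC is reported as 0 when any marginal total is empty.
        return 0.0
    return (tp * tn - fp * fn) / denom


# Hypothetical counts for a balanced 100-peptide test set (NOT the paper's data).
tp, tn, fp, fn = 46, 47, 3, 4
acc_value = accuracy(tp, tn, fp, fn)
mcc_value = mcc(tp, tn, fp, fn)
print(f"accuracy={acc_value:.2f}, MCC={mcc_value:.2f}")
```

Unlike accuracy, MCC uses all four cells symmetrically, which is why predictor papers of this kind commonly report the two together.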
|
153
|
Zhong W, Zhong B, Zhang H, Chen Z, Chen Y. Identification of Anti-cancer Peptides Based on Multi-classifier System. Comb Chem High Throughput Screen 2019; 22:694-704. [PMID: 31793417 DOI: 10.2174/1386207322666191203141102] [Citation(s) in RCA: 4] [Impact Index Per Article: 0.8] [Received: 06/20/2019] [Revised: 07/18/2019] [Accepted: 07/30/2019] [Indexed: 01/01/2023]
Abstract
AIMS AND OBJECTIVE: Cancer is one of the deadliest diseases, taking the lives of millions every year. Traditional methods of treating cancer are expensive and toxic to normal cells. Fortunately, anti-cancer peptides (ACPs) can eliminate this side effect. However, identifying and developing new ACPs experimentally takes a lot of time and money; therefore, it is necessary to develop a fast and accurate computational model to identify ACPs. Machine learning algorithms are a good choice. MATERIALS AND METHODS: In our study, a multi-classifier system that combines multiple machine learning models was used to predict anti-cancer peptides. The individual learners are built from different feature information and algorithms, and they form a multi-classifier system by voting. RESULTS AND CONCLUSION: The experiments show that the overall prediction rate of each individual learner is above 80% and that the overall accuracy of the multi-classifier system for anti-cancer peptide prediction can reach 95.93%, which is better than that of existing prediction models.
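The combination step described in this abstract can be sketched as follows, assuming simple majority (hard) voting, which the abstract does not specify. The three toy learners, their thresholds, and the tie-breaking rule are hypothetical stand-ins for illustration. They are not the feature sets or algorithms used by the authors.

```python
from collections import Counter
from typing import Callable, Sequence

# A learner maps a peptide sequence to a label: 1 = ACP, 0 = non-ACP.
Classifier = Callable[[str], int]


def majority_vote(classifiers: Sequence[Classifier], peptide: str) -> int:
    """Return the label predicted by most learners.

    Ties are resolved toward the smaller label (0, non-ACP); this
    tie-breaking rule is an assumption made for this sketch.
    """
    counts = Counter(clf(peptide) for clf in classifiers)
    top = max(counts.values())
    return min(label for label, n in counts.items() if n == top)


# Hypothetical toy learners, each looking at a different kind of feature.
def by_lysine_content(peptide: str) -> int:
    return int(peptide.count("K") / len(peptide) > 0.2)


def by_length(peptide: str) -> int:
    return int(len(peptide) <= 30)


def by_hydrophobic_content(peptide: str) -> int:
    hydrophobic = sum(peptide.count(aa) for aa in "AILMFVW")
    return int(hydrophobic / len(peptide) > 0.4)


learners = [by_lysine_content, by_length, by_hydrophobic_content]
print(majority_vote(learners, "KKLLKKLLKK"))
```

Voting only helps when the individual learners make different mistakes, which is consistent with the abstract's emphasis on learners built from different feature information and algorithms.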
Affiliation(s)
- Wanben Zhong
- School of Computer Science and Technology, Huaqiao University, Xiamen, Fujian, 361021, China
- Bineng Zhong
- School of Computer Science and Technology, Huaqiao University, Xiamen, Fujian, 361021, China; Key Laboratory of Intelligent Perception and Systems for High-Dimensional Information of Ministry of Education, Nanjing University of Science and Technology, Nanjing, 210094, China
- Hongbo Zhang
- School of Computer Science and Technology, Huaqiao University, Xiamen, Fujian, 361021, China
- Ziyi Chen
- School of Computer Science and Technology, Huaqiao University, Xiamen, Fujian, 361021, China
- Yan Chen
- School of Computer Science and Technology, Huaqiao University, Xiamen, Fujian, 361021, China
|
154
|
pLoc_bal-mHum: Predict subcellular localization of human proteins by PseAAC and quasi-balancing training dataset. Genomics 2019; 111:1274-1282. [DOI: 10.1016/j.ygeno.2018.08.007] [Citation(s) in RCA: 56] [Impact Index Per Article: 11.2] [Received: 05/03/2018] [Revised: 08/14/2018] [Accepted: 08/16/2018] [Indexed: 12/17/2022]
|
155
|
iRSpot-DTS: Predict recombination spots by incorporating the dinucleotide-based spare-cross covariance information into Chou's pseudo components. Genomics 2019; 111:1760-1770. [DOI: 10.1016/j.ygeno.2018.11.031] [Citation(s) in RCA: 27] [Impact Index Per Article: 5.4] [Received: 08/21/2018] [Revised: 11/29/2018] [Accepted: 11/30/2018] [Indexed: 12/16/2022]
|
156
|
Chou KC. Impacts of Pseudo Amino Acid Components and 5-steps Rule to Proteomics and Proteome Analysis. Curr Top Med Chem 2019; 19:2283-2300. [DOI: 10.2174/1568026619666191018100141] [Citation(s) in RCA: 24] [Impact Index Per Article: 4.8] [Received: 08/07/2019] [Revised: 08/18/2019] [Accepted: 08/26/2019] [Indexed: 01/27/2023]
Abstract
Stimulated by the 5-steps rule during the last decade or so, computational proteomics has achieved remarkable progress in the following three areas: (1) protein structural class prediction; (2) protein subcellular location prediction; (3) post-translational modification (PTM) site prediction. The results obtained by these predictions are very useful not only for an in-depth study of the functions of proteins and their biological processes in a cell, but also for developing novel drugs against major diseases such as cancers, Alzheimer’s, and Parkinson’s. Moreover, since the targets to be predicted may have the multi-label feature, two sets of metrics are introduced: one for inspecting the global prediction quality, and the other for the local prediction quality. All the predictors covered in this review have a user-friendly web server, through which the majority of experimental scientists can easily obtain their desired data without needing to go through the complicated mathematics.
Affiliation(s)
- Kuo-Chen Chou
- Center for Informational Biology, University of Electronic Science and Technology of China, Chengdu, 610054, China
|
157
|
Schaduangrat N, Nantasenamat C, Prachayasittikul V, Shoombuatong W. Meta-iAVP: A Sequence-Based Meta-Predictor for Improving the Prediction of Antiviral Peptides Using Effective Feature Representation. Int J Mol Sci 2019; 20:ijms20225743. [PMID: 31731751 PMCID: PMC6888698 DOI: 10.3390/ijms20225743] [Citation(s) in RCA: 77] [Impact Index Per Article: 15.4] [Received: 10/24/2019] [Revised: 11/07/2019] [Accepted: 11/13/2019] [Indexed: 12/31/2022] Open
Abstract
In spite of the large-scale production and widespread distribution of vaccines and antiviral drugs, viruses remain a prominent cause of human disease. Recently, antiviral peptides (AVPs) have emerged as influential antiviral agents owing to their extraordinary advantages. With the avalanche of newly found peptide sequences in the post-genomic era, there is a great demand for a sequence-based predictor that can identify AVPs in a timely manner, as this information is very useful for both basic research and drug development. In this study, we propose a novel sequence-based meta-predictor with an effective feature representation, called Meta-iAVP, for the accurate prediction of AVPs from given peptide sequences. Herein, the effective feature representation was extracted from a set of prediction scores derived from various machine learning algorithms and types of features. To the best of our knowledge, the model proposed herein represents the first meta-based approach for the prediction of AVPs. An overall accuracy and Matthews correlation coefficient of 95.20% and 0.90, respectively, were achieved on the independent test set of an objective benchmark dataset. Comparative analysis suggested that Meta-iAVP was superior to existing methods and therefore represents a useful tool for AVP prediction. Finally, to facilitate high-throughput prediction of AVPs, the model was deployed as the Meta-iAVP web server and is made freely available online at http://codes.bio/meta-iavp/ where users can submit query peptide sequences to determine the likelihood that these peptides are AVPs.
Affiliation(s)
- Nalini Schaduangrat
- Center of Data Mining and Biomedical Informatics, Faculty of Medical Technology, Mahidol University, Bangkok 10700, Thailand
- Chanin Nantasenamat
- Center of Data Mining and Biomedical Informatics, Faculty of Medical Technology, Mahidol University, Bangkok 10700, Thailand
- Virapong Prachayasittikul
- Department of Clinical Microbiology and Applied Technology, Faculty of Medical Technology, Mahidol University, Bangkok 10700, Thailand
- Watshara Shoombuatong
- Center of Data Mining and Biomedical Informatics, Faculty of Medical Technology, Mahidol University, Bangkok 10700, Thailand
- Correspondence: Tel.: +66-2441-4371 (ext. 2715)
|
158
|
Rao B, Zhou C, Zhang G, Su R, Wei L. ACPred-Fuse: fusing multi-view information improves the prediction of anticancer peptides. Brief Bioinform 2019; 21:1846-1855. [DOI: 10.1093/bib/bbz088] [Citation(s) in RCA: 57] [Impact Index Per Article: 11.4] [Received: 04/24/2019] [Revised: 06/06/2019] [Accepted: 06/22/2019] [Indexed: 02/04/2023] Open
Abstract
Fast and accurate identification of the peptides with anticancer activity potential from large-scale proteins is currently a challenging task. In this study, we propose a new machine learning predictor, namely, ACPred-Fuse, that can automatically and accurately predict protein sequences with or without anticancer activity in peptide form. Specifically, we establish a feature representation learning model that can explore class and probabilistic information embedded in anticancer peptides (ACPs) by integrating a total of 29 different sequence-based feature descriptors. In order to make full use of various multiview information, we further fused the class and probabilistic features with handcrafted sequential features and then optimized the representation ability of the multiview features, which are ultimately used as input for training our prediction model. By comparing the multiview features and existing feature descriptors, we demonstrate that the fused multiview features have more discriminative ability to capture the characteristics of ACPs. In addition, the information from different views is complementary for the performance improvement. Finally, our benchmarking comparison results showed that the proposed ACPred-Fuse is more precise and promising in the identification of ACPs than existing predictors. To facilitate the use of the proposed predictor, we built a web server, which is now freely available via http://server.malab.cn/ACPred-Fuse.
Affiliation(s)
- Bing Rao
- School of Mechanical Electronic & Information Engineering, China University of Mining & Technology, Beijing, China
- Chen Zhou
- School of Computer Science and Technology, College of Intelligence and Computing, Tianjin University, Tianjin, China
- Guoying Zhang
- School of Mechanical Electronic & Information Engineering, China University of Mining & Technology, Beijing, China
- Ran Su
- School of Software, College of Intelligence and Computing, Tianjin University, Tianjin, China
- Leyi Wei
- School of Computer Science and Technology, College of Intelligence and Computing, Tianjin University, Tianjin, China
|
159
|
Xie NZ, Li JX, Huang RB. Biological Production of (S)-acetoin: A State-of-the-Art Review. Curr Top Med Chem 2019; 19:2348-2356. [PMID: 31648637 DOI: 10.2174/1568026619666191018111424] [Citation(s) in RCA: 3] [Impact Index Per Article: 0.6] [Received: 10/31/2018] [Revised: 08/28/2019] [Accepted: 09/02/2019] [Indexed: 12/24/2022]
Abstract
Acetoin is an important four-carbon compound that has many applications in foods, chemical synthesis, cosmetics, cigarettes, soaps, and detergents. Its stereoisomer (S)-acetoin, a high-value chiral compound, can also be used to synthesize optically active drugs, which could enhance targeting properties and reduce side effects. Recently, considerable progress has been made in the development of biotechnological routes for (S)-acetoin production. In this review, various strategies for biological (S)-acetoin production are summarized, and their constraints and possible solutions are described. Furthermore, future prospects of biological production of (S)-acetoin are discussed.
Affiliation(s)
- Neng-Zhong Xie
- National Engineering Research Center for Non-Food Biorefinery, State Key Laboratory of Non-Food Biomass and Enzyme Technology, Guangxi Key Laboratory of Bio-refinery, Guangxi Biomass Engineering Technology Research Center, Guangxi Academy of Sciences, 98 Daling Road, Nanning, 530007, China
- Jian-Xiu Li
- National Engineering Research Center for Non-Food Biorefinery, State Key Laboratory of Non-Food Biomass and Enzyme Technology, Guangxi Key Laboratory of Bio-refinery, Guangxi Biomass Engineering Technology Research Center, Guangxi Academy of Sciences, 98 Daling Road, Nanning, 530007, China
- Ri-Bo Huang
- National Engineering Research Center for Non-Food Biorefinery, State Key Laboratory of Non-Food Biomass and Enzyme Technology, Guangxi Key Laboratory of Bio-refinery, Guangxi Biomass Engineering Technology Research Center, Guangxi Academy of Sciences, 98 Daling Road, Nanning, 530007, China; State Key Laboratory for Conservation and Utilization of Subtropical Agro-Bioresources, College of Life Science and Technology, Guangxi University, 100 Daxue Road, Nanning, 530004, China
|
160
|
Chou KC. Advances in Predicting Subcellular Localization of Multi-label Proteins and its Implication for Developing Multi-target Drugs. Curr Med Chem 2019; 26:4918-4943. [PMID: 31060481 DOI: 10.2174/0929867326666190507082559] [Citation(s) in RCA: 78] [Impact Index Per Article: 15.6] [Received: 09/28/2018] [Revised: 01/29/2019] [Accepted: 01/31/2019] [Indexed: 12/16/2022]
Abstract
The smallest unit of life is a cell, which contains numerous protein molecules. Most of the functions critical to the cell’s survival are performed by these proteins located in its different organelles, usually called “subcellular locations”. Information on the subcellular localization of a protein can provide useful clues about its function. To reveal the intricate pathways at the cellular level, knowledge of the subcellular localization of proteins in a cell is a prerequisite. Therefore, one of the fundamental goals in molecular cell biology and proteomics is to determine the subcellular locations of proteins in an entire cell. It is also indispensable for prioritizing and selecting the right targets for drug development. Unfortunately, it is both time-consuming and costly to determine the subcellular locations of proteins purely based on experiments. With the avalanche of protein sequences generated in the post-genomic age, it is highly desirable to develop computational methods for rapidly and effectively identifying the subcellular locations of uncharacterized proteins based on their sequence information alone. In fact, considerable progress has been achieved in this regard. This review focuses on those methods that can deal with multi-label proteins, which may simultaneously exist in two or more subcellular location sites. Protein molecules with this characteristic are vitally important for finding multi-target drugs, a current hot trend in drug development. The review also focuses on methods that have user-friendly web servers, so that the majority of experimental scientists can use them to get the desired results without needing to go through the detailed mathematics involved.
Affiliation(s)
- Kuo-Chen Chou
- Gordon Life Science Institute, Boston, MA 02478, United States
|
161
|
Khan YD, Amin N, Hussain W, Rasool N, Khan SA, Chou KC. iProtease-PseAAC(2L): A two-layer predictor for identifying proteases and their types using Chou's 5-step-rule and general PseAAC. Anal Biochem 2019; 588:113477. [PMID: 31654612 DOI: 10.1016/j.ab.2019.113477] [Citation(s) in RCA: 13] [Impact Index Per Article: 2.6] [Received: 02/04/2019] [Revised: 10/02/2019] [Accepted: 10/18/2019] [Indexed: 12/16/2022]
Abstract
Proteases are enzymes that perform proteolysis. Proteolysis normally refers to protein and peptide degradation, which is crucial for the survival, growth, and wellbeing of a cell. Moreover, proteases have a strong association with therapeutics and drug development. Proteases are classified into five types according to their nature and physicochemical characteristics. Most methods used to differentiate proteases from other proteins and to identify their class require a clinical test, which is usually time-consuming and operator-dependent. Herein, we report a classifier named iProtease-PseAAC(2L) for identifying proteases and their classes. The predictor is developed following the 5-step rule, starting from the collection of a benchmark dataset and ending with the development of the predictor. Rigorous verification and validation tests are performed, and metrics are collected to assess the validity of the trained model. Self-consistency validation gives 98.32% accuracy, cross-validation gives 90.71%, and the jackknife test gives 96.07%. The average accuracy for level 2, i.e., protease classification, is 95.77%. Based on these results, it is concluded that iProtease-PseAAC(2L) has a strong ability to identify proteases and their classes from a given protein sequence.
Affiliation(s)
- Yaser Daanial Khan
- Department of Computer Science, School of Systems and Technology, University of Management and Technology, P.O. Box 10033, C-II, Johar Town, Lahore, 54770, Pakistan
- Najm Amin
- Department of Computer Science, School of Systems and Technology, University of Management and Technology, P.O. Box 10033, C-II, Johar Town, Lahore, 54770, Pakistan
- Waqar Hussain
- National Center of Artificial Intelligence, Punjab University College of Information Technology, University of the Punjab, Lahore, Pakistan
- Nouman Rasool
- Dr Panjwani Center for Molecular Medicine and Drug Research, International Center for Chemical and Biological Sciences, University of Karachi, Karachi, 75270, Pakistan
- Sher Afzal Khan
- Faculty of Computing and Information Technology in Rabigh, Jeddah, 21577, Saudi Arabia; Abdul Wali Khan University, Department of Computer Sciences, Mardan, Pakistan
- Kuo-Chen Chou
- Gordon Life Science Institute, Boston, MA, 02478, USA
|
162
|
|
163
|
Liang R, Xie J, Zhang C, Zhang M, Huang H, Huo H, Cao X, Niu B. Identifying Cancer Targets Based on Machine Learning Methods via Chou's 5-steps Rule and General Pseudo Components. Curr Top Med Chem 2019; 19:2301-2317. [PMID: 31622219 DOI: 10.2174/1568026619666191016155543] [Citation(s) in RCA: 7] [Impact Index Per Article: 1.4] [Received: 07/14/2019] [Revised: 07/19/2019] [Accepted: 08/26/2019] [Indexed: 01/09/2023]
Abstract
In recent years, the successful implementation of the Human Genome Project has made people realize that genetic, environmental, and lifestyle factors should be studied together to understand cancer, owing to the complexity and various forms of the disease. The increasing availability and growth rate of 'big data' derived from various omics open a new window for the study and therapy of cancer. In this paper, we introduce the application of machine learning methods in handling cancer big data, including the use of artificial neural networks, support vector machines, ensemble learning, and naïve Bayes classifiers.
Affiliation(s)
- Ruirui Liang
- School of Life Sciences, Shanghai University, Shanghai, 200444, China
- Jiayang Xie
- School of Life Sciences, Shanghai University, Shanghai, 200444, China
- Chi Zhang
- Foshan Huaxia Eye Hospital, Huaxia Eye Hospital Group, Foshan 528000, China
- Mengying Zhang
- School of Life Sciences, Shanghai University, Shanghai, 200444, China
- Hai Huang
- School of Life Sciences, Shanghai University, Shanghai, 200444, China
- Haizhong Huo
- Department of General Surgery, Shanghai Ninth People's Hospital Affiliated to Shanghai Jiao Tong University School of Medicine, Shanghai 200011, China
- Xin Cao
- Zhongshan Hospital, Institute of Clinical Science, Shanghai Medical College, Fudan University, Shanghai 200032, China
- Bing Niu
- School of Life Sciences, Shanghai University, Shanghai, 200444, China
|
164
|
Wu C, Gao R, Zhang Y, De Marinis Y. PTPD: predicting therapeutic peptides by deep learning and word2vec. BMC Bioinformatics 2019; 20:456. [PMID: 31492094 PMCID: PMC6728961 DOI: 10.1186/s12859-019-3006-z] [Citation(s) in RCA: 49] [Impact Index Per Article: 9.8] [Received: 04/28/2019] [Accepted: 07/25/2019] [Indexed: 01/10/2023] Open
Abstract
BACKGROUND: In the search for therapeutic peptides for disease treatments, many efforts have been made to identify various functional peptides from large numbers of peptide sequence databases. In this paper, we propose an effective computational model that uses deep learning and word2vec to predict therapeutic peptides (PTPD). RESULTS: Representation vectors of all k-mers were obtained through word2vec based on k-mer co-existence information. The original peptide sequences were then divided into k-mers using the windowing method. The peptide sequences were mapped to the input layer by the embedding vector obtained by word2vec. Three types of filters in the convolutional layers, as well as dropout and max-pooling operations, were applied to construct feature maps. These feature maps were concatenated into a fully connected dense layer, and rectified linear units (ReLU) and dropout operations were included to avoid over-fitting of PTPD. The classification probabilities were generated by a sigmoid function. PTPD was then validated using two datasets: an independent anticancer peptide dataset and a virulent protein dataset, on which it achieved accuracies of 96% and 94%, respectively. CONCLUSIONS: PTPD identified novel therapeutic peptides efficiently, and it is suitable for application as a useful tool in therapeutic peptide design.
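The k-mer windowing step mentioned in this abstract can be sketched as below. The window size `k = 3`, the stride of 1, the padding length, and the helper names `kmers`, `build_vocab`, and `encode` are assumptions made for illustration. The abstract does not state the authors' actual settings, and the word2vec training and convolutional network are not reproduced here.

```python
from typing import Dict, Iterable, List


def kmers(sequence: str, k: int = 3, stride: int = 1) -> List[str]:
    """Slide a window of width k over the sequence and collect each k-mer."""
    if k <= 0 or stride <= 0:
        raise ValueError("k and stride must be positive")
    return [sequence[i:i + k] for i in range(0, len(sequence) - k + 1, stride)]


def build_vocab(sequences: Iterable[str], k: int = 3) -> Dict[str, int]:
    """Assign each distinct k-mer an integer id; id 0 is reserved for padding."""
    vocab: Dict[str, int] = {}
    for seq in sequences:
        for kmer in kmers(seq, k):
            vocab.setdefault(kmer, len(vocab) + 1)
    return vocab


def encode(sequence: str, vocab: Dict[str, int], k: int, max_len: int) -> List[int]:
    """Map a peptide to a fixed-length list of k-mer ids.

    Unknown k-mers share id 0 with padding in this sketch.
    """
    ids = [vocab.get(kmer, 0) for kmer in kmers(sequence, k)][:max_len]
    return ids + [0] * (max_len - len(ids))


vocab = build_vocab(["ACDEFG"], k=3)
print(kmers("ACDEFG", k=3))
print(encode("ACDE", vocab, k=3, max_len=4))
```

In a pipeline like the one described, the integer ids would index rows of an embedding matrix initialized from the word2vec vectors, so each peptide becomes a sequence of k-mer embeddings that the convolutional filters scan.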
Affiliation(s)
- Chuanyan Wu
- School of Control Science and Engineering, Shandong University, Jingshi Road, Jinan, 250061, China; Diabetes and Endocrinology, Lund University, Malmo, 20502, Sweden
- Rui Gao
- School of Control Science and Engineering, Shandong University, Jingshi Road, Jinan, 250061, China
- Yusen Zhang
- School of Mathematics and Statistics, Shandong University at Weihai, Weihai, 264209, China
- Yang De Marinis
- Diabetes and Endocrinology, Lund University, Malmo, 20502, Sweden
|
165
|
Liu K, Chen W, Lin H. XG-PseU: an eXtreme Gradient Boosting based method for identifying pseudouridine sites. Mol Genet Genomics 2019; 295:13-21. [DOI: 10.1007/s00438-019-01600-9] [Citation(s) in RCA: 45] [Impact Index Per Article: 9.0] [Received: 06/16/2019] [Accepted: 07/29/2019] [Indexed: 01/08/2023]
|
166
|
Gabernet G, Gautschi D, Müller AT, Neuhaus CS, Armbrecht L, Dittrich PS, Hiss JA, Schneider G. In silico design and optimization of selective membranolytic anticancer peptides. Sci Rep 2019; 9:11282. [PMID: 31375699 PMCID: PMC6677754 DOI: 10.1038/s41598-019-47568-9] [Citation(s) in RCA: 35] [Impact Index Per Article: 7.0] [Received: 12/20/2018] [Accepted: 06/17/2019] [Indexed: 12/31/2022] Open
Abstract
Membranolytic anticancer peptides represent a potential strategy in the fight against cancer. However, our understanding of the underlying structure-activity relationships and the mechanisms driving their cell selectivity is still limited. We developed a computational approach as a step towards the rational design of potent and selective anticancer peptides. This machine learning model distinguishes between peptides with and without anticancer activity. This classifier was experimentally validated by synthesizing and testing a selection of 12 computationally generated peptides. In total, 83% of these predictions were correct. We then utilized an evolutionary molecular design algorithm to improve the peptide selectivity for cancer cells. This simulated molecular evolution process led to a five-fold selectivity increase with regard to human dermal microvascular endothelial cells and more than ten-fold improvement towards human erythrocytes. The results of the present study advocate for the applicability of machine learning models and evolutionary algorithms to design and optimize novel synthetic anticancer peptides with reduced hemolytic liability and increased cell-type selectivity.
Affiliation(s)
- Gisela Gabernet
- Department of Chemistry and Applied Biosciences, ETH Zurich, Zurich, Switzerland
- Damian Gautschi
- Department of Chemistry and Applied Biosciences, ETH Zurich, Zurich, Switzerland
- Alex T Müller
- Department of Chemistry and Applied Biosciences, ETH Zurich, Zurich, Switzerland
- Claudia S Neuhaus
- Department of Chemistry and Applied Biosciences, ETH Zurich, Zurich, Switzerland
- Lucas Armbrecht
- Department of Biosystems Science and Engineering, ETH Zurich, Basel, Switzerland
- Petra S Dittrich
- Department of Biosystems Science and Engineering, ETH Zurich, Basel, Switzerland
- Jan A Hiss
- Department of Chemistry and Applied Biosciences, ETH Zurich, Zurich, Switzerland
- Gisbert Schneider
- Department of Chemistry and Applied Biosciences, ETH Zurich, Zurich, Switzerland
|
167
|
Peng LX, Liu XH, Lu B, Liao SM, Zhou F, Huang JM, Chen D, Troy FA, Zhou GP, Huang RB. The Inhibition of Polysialyltranseferase ST8SiaIV Through Heparin Binding to Polysialyltransferase Domain (PSTD). Med Chem 2019; 15:486-495. [PMID: 30569872 DOI: 10.2174/1573406415666181218101623] [Citation(s) in RCA: 8] [Impact Index Per Article: 1.6] [Received: 08/23/2018] [Revised: 10/23/2018] [Accepted: 12/12/2018] [Indexed: 11/22/2022]
Abstract
BACKGROUND: Polysialic acid (polySia) is a unique carbohydrate polymer produced on the surface of the neuronal cell adhesion molecule (NCAM) in a number of cancer cells, and it strongly correlates with the migration and invasion of tumor cells and with aggressive, metastatic disease and poor clinical prognosis. Its synthesis is catalyzed by two polysialyltransferases (polySTs), ST8SiaIV (PST) and ST8SiaII (STX). Selective inhibition of polySTs therefore presents a therapeutic opportunity to inhibit tumor invasion and metastasis due to NCAM polysialylation. Heparin has been found to be effective in inhibiting ST8SiaIV activity, but without a clear molecular rationale. Previous studies have found that the polysialyltransferase domain (PSTD) in polyST plays a significant role in influencing polyST activity and is thus critical for NCAM polysialylation. OBJECTIVE: To determine whether three different types of heparin (unfractionated heparin (UFH), low molecular weight heparin (LMWH), and heparin tetrasaccharide (DP4)) bind to the PSTD and, if so, which residues of the PSTD are critical for these binding complexes. METHODS: Fluorescence quenching analysis, circular dichroism (CD) spectroscopy, and NMR spectroscopy were used to determine and analyze the PSTD-UFH, PSTD-LMWH, and PSTD-DP4 interactions. RESULTS: The fluorescence quenching analysis indicates that PSTD-UFH binding is the strongest and PSTD-DP4 binding is the weakest among the three types of binding; the CD spectra showed that the PSTD-heparin interactions mainly caused a reduction in signal intensity but no marked decrease in α-helix content; the NMR data for the PSTD-DP4 and PSTD-LMWH interactions showed that the different types of heparin shared 12 common binding sites at N247, V251, R252, T253, S257, R265, Y267, W268, L269, V273, I275, and K276, which were mainly distributed in the long α-helix of the PSTD and the short 3-residue loop of the C-terminal PSTD. In addition, three residues, K246, K250, and A254, were bound to LMWH but not to DP4. This suggests that PSTD-LMWH binding is stronger than PSTD-DP4 binding and that LMWH is a more effective inhibitor than DP4. CONCLUSION: The findings of the present study demonstrate that the PSTD is a potential target of heparin and may provide new insights into the molecular rationale of heparin inhibition of NCAM polysialylation.
Affiliation(s)
- Li-Xin Peng
- Life Science and Technology College, Guangxi University, Nanning, Guangxi 530004, China; Institute of Biophysics, Chinese Academy of Sciences, Beijing, China; National Engineering Research Center for Non-food Biorefinery, Guangxi Academy of Sciences, 98 Daling Road, Nanning, Guangxi 530007, China
- Xue-Hui Liu
- Institute of Biophysics, Chinese Academy of Sciences, Beijing, China
- Bo Lu
- National Engineering Research Center for Non-food Biorefinery, Guangxi Academy of Sciences, 98 Daling Road, Nanning, Guangxi 530007, China
- Si-Ming Liao
- National Engineering Research Center for Non-food Biorefinery, Guangxi Academy of Sciences, 98 Daling Road, Nanning, Guangxi 530007, China
- Feng Zhou
- National Engineering Research Center for Non-food Biorefinery, Guangxi Academy of Sciences, 98 Daling Road, Nanning, Guangxi 530007, China
- Ji-Min Huang
- National Engineering Research Center for Non-food Biorefinery, Guangxi Academy of Sciences, 98 Daling Road, Nanning, Guangxi 530007, China
- Dong Chen
- National Engineering Research Center for Non-food Biorefinery, Guangxi Academy of Sciences, 98 Daling Road, Nanning, Guangxi 530007, China
- Frederic A Troy
- Department of Biochemistry and Molecular Medicine, University of California School of Medicine, Davis, CA, United States
- Guo-Ping Zhou
- National Engineering Research Center for Non-food Biorefinery, Guangxi Academy of Sciences, 98 Daling Road, Nanning, Guangxi 530007, China; Gordon Life Science Institute, 53 South Cottage Road, Belmont, MA 02478, United States
- Ri-Bo Huang
- Life Science and Technology College, Guangxi University, Nanning, Guangxi 530004, China; Institute of Biophysics, Chinese Academy of Sciences, Beijing, China; National Engineering Research Center for Non-food Biorefinery, Guangxi Academy of Sciences, 98 Daling Road, Nanning, Guangxi 530007, China
|
168
|
|
169
|
Du X, Diao Y, Liu H, Li S. MsDBP: Exploring DNA-Binding Proteins by Integrating Multiscale Sequence Information via Chou’s Five-Step Rule. J Proteome Res 2019; 18:3119-3132. [DOI: 10.1021/acs.jproteome.9b00226] [Citation(s) in RCA: 58] [Impact Index Per Article: 11.6] [Indexed: 02/07/2023]
Affiliation(s)
- Xiuquan Du
- The School of Computer Science and Technology, Anhui University, Hefei, Anhui, China
- Yanyu Diao
- The School of Computer Science and Technology, Anhui University, Hefei, Anhui, China
- Heng Liu
- Department of Gastroenterology, The First Affiliated Hospital of Anhui Medical University, Hefei, Anhui, China
- Shuo Li
- Department of Medical Imaging, Western University, London, ON N6A 3K7, Canada
|
170
|
Liao SM, Shen NK, Liang G, Lu B, Lu ZL, Peng LX, Zhou F, Du LQ, Wei YT, Zhou GP, Huang RB. Inhibition of α-amylase Activity by Zn2+: Insights from Spectroscopy and Molecular Dynamics Simulations. Med Chem 2019; 15:510-520. [DOI: 10.2174/1573406415666181217114101] [Citation(s) in RCA: 6] [Impact Index Per Article: 1.2] [Received: 08/20/2018] [Revised: 10/23/2018] [Accepted: 12/12/2018] [Indexed: 02/08/2023]
Abstract
Background: Inhibition of α-amylase activity is an important strategy in the treatment of diabetes mellitus. An important treatment for diabetes mellitus is to reduce the digestion of carbohydrates and blood glucose concentrations. Inhibiting the activity of carbohydrate-degrading enzymes such as α-amylase and glucosidase significantly decreases the blood glucose level. Most inhibitors of α-amylase have serious adverse effects, and the α-amylase inactivation mechanisms needed for the design of safer inhibitors are yet to be revealed. Objective: In this study, we focused on the inhibitory effect of Zn2+ on the structure and dynamic characteristics of α-amylase from Anoxybacillus sp. GXS-BL (AGXA), which shares the same catalytic residues and similar structures as human pancreatic and salivary α-amylase (HPA and HSA, respectively). Methods: Circular dichroism (CD) spectra of the protein (AGXA) in the absence and presence of Zn2+ were recorded on a Chirascan instrument. The content of different secondary structures of AGXA in the absence and presence of Zn2+ was analyzed using the online SELCON3 program. An AGXA amino acid sequence similarity search was performed on the BLAST online server to find the most similar protein sequence to use as a template for homology modeling. The pocket volume measurer (POVME) program 3.0 was applied to calculate the active site pocket shape and volume, and molecular dynamics simulations were performed with the Amber14 software package. Results: According to the circular dichroism experiments, the protein secondary structure changed markedly upon Zn2+ binding, with the α-helix content decreasing and the β-sheet, β-turn and random coil content increasing. The structural model of AGXA showed that His217 was near the active site pocket and that Phe178 was at the outer rim of the pocket. Based on the molecular dynamics trajectories, in the free AGXA model, the dihedral angle of C-CA-CB-CG displayed both acute and planar orientations, which corresponded to the open and closed states of the active site pocket, respectively. In the AGXA-Zn model, the dihedral angle of C-CA-CB-CG only showed the planar orientation. As Zn2+ was introduced, the metal center formed a coordination interaction with H217, a cation-π interaction with W244, a coordination interaction with E242 and a cation-π interaction with F178, which prevented F178 from easily rotating to the open state and inhibited the activity of the enzyme. Conclusion: This research may have uncovered a subtle mechanism for inhibiting the activity of α-amylase with transition metal ions, and this finding will help to design more potent and specific inhibitors of α-amylases.
Affiliation(s)
- Si-Ming Liao
- Department of Bioengineering, College of Life Science and Technology, Guangxi University, Nanning, Guangxi, 530004, China
- Nai-Kun Shen
- School of Marine Sciences and Biotechnology, Guangxi University for Nationalities, Nanning, Guangxi, 530008, China
- Ge Liang
- State Key Laboratory of Non-Food Biomass and Enzyme Technology, Guangxi Academy of Sciences, Nanning, Guangxi, 530007, China
- Bo Lu
- State Key Laboratory of Non-Food Biomass and Enzyme Technology, Guangxi Academy of Sciences, Nanning, Guangxi, 530007, China
- Zhi-Long Lu
- Department of Bioengineering, College of Life Science and Technology, Guangxi University, Nanning, Guangxi, 530004, China
- Li-Xin Peng
- Department of Bioengineering, College of Life Science and Technology, Guangxi University, Nanning, Guangxi, 530004, China
- Feng Zhou
- State Key Laboratory of Non-Food Biomass and Enzyme Technology, Guangxi Academy of Sciences, Nanning, Guangxi, 530007, China
- Li-Qin Du
- Department of Bioengineering, College of Life Science and Technology, Guangxi University, Nanning, Guangxi, 530004, China
- Yu-Tuo Wei
- Department of Bioengineering, College of Life Science and Technology, Guangxi University, Nanning, Guangxi, 530004, China
- Guo-Ping Zhou
- State Key Laboratory of Non-Food Biomass and Enzyme Technology, Guangxi Academy of Sciences, Nanning, Guangxi, 530007, China
- Ri-Bo Huang
- Department of Bioengineering, College of Life Science and Technology, Guangxi University, Nanning, Guangxi, 530004, China
|
171
|
Xiao X, Cheng X, Chen G, Mao Q, Chou KC. pLoc_bal-mVirus: Predict Subcellular Localization of Multi-Label Virus Proteins by Chou's General PseAAC and IHTS Treatment to Balance Training Dataset. Med Chem 2019; 15:496-509. [DOI: 10.2174/1573406415666181217114710] [Citation(s) in RCA: 44] [Impact Index Per Article: 8.8] [Received: 09/20/2018] [Revised: 10/23/2018] [Accepted: 12/12/2018] [Indexed: 12/17/2022]
Abstract
Background/Objective: Knowledge of protein subcellular localization is vitally important for both basic research and drug development. Facing the avalanche of protein sequences emerging in the post-genomic age, it is urgent to develop computational tools for timely and effectively identifying their subcellular localization based on the sequence information alone. Recently, a predictor called “pLoc-mVirus” was developed for identifying the subcellular localization of virus proteins. Its performance is overwhelmingly better than that of the other predictors for the same purpose, particularly in dealing with multi-label systems in which some proteins, known as “multiplex proteins”, may simultaneously occur in, or move between, two or more subcellular location sites. Although it is indeed a very powerful predictor, more efforts are definitely needed to further improve it. This is because pLoc-mVirus was trained on an extremely skewed dataset in which some subsets were over 10 times the size of others. Accordingly, it cannot avoid the bias caused by such an uneven training dataset. Methods: Using Chou's general PseAAC (Pseudo Amino Acid Composition) approach and the IHTS (Inserting Hypothetical Training Samples) treatment to balance out the training dataset, we have developed a new predictor called “pLoc_bal-mVirus” for predicting the subcellular localization of multi-label virus proteins. Results: Cross-validation tests on exactly the same experiment-confirmed dataset have indicated that the proposed new predictor is remarkably superior to pLoc-mVirus, the existing state-of-the-art predictor for the same purpose. Conclusion: Its user-friendly web-server is available at http://www.jci-bioinfo.cn/pLoc_balmVirus/, by which the majority of experimental scientists can easily get their desired results without the need to go through the detailed complicated mathematics. Accordingly, pLoc_bal-mVirus will become a very useful tool for designing multi-target drugs and for in-depth understanding of the biological process in a cell.
Affiliation(s)
- Xuan Xiao
- Gordon Life Science Institute, Boston, MA 02478, United States
- Xiang Cheng
- Gordon Life Science Institute, Boston, MA 02478, United States
- Genqiang Chen
- College of Chemistry, Chemical Engineering and Biotechnology, Donghua University, Shanghai 201620, China
- Qi Mao
- College of Information Science and Technology, Donghua University, Shanghai, China
- Kuo-Chen Chou
- Gordon Life Science Institute, Boston, MA 02478, United States
|
172
|
Chou KC, Cheng X, Xiao X. pLoc_bal-mEuk: Predict Subcellular Localization of Eukaryotic Proteins by General PseAAC and Quasi-balancing Training Dataset. Med Chem 2019; 15:472-485. [DOI: 10.2174/1573406415666181218102517] [Citation(s) in RCA: 40] [Impact Index Per Article: 8.0] [Received: 08/28/2018] [Revised: 10/23/2018] [Accepted: 12/12/2018] [Indexed: 12/24/2022]
Abstract
Background/Objective: Information on protein subcellular localization is crucially important for both basic research and drug development. With the explosive growth of protein sequences discovered in the post-genomic age, there is high demand for powerful bioinformatics tools that identify their subcellular localization in a timely and effective manner based on the sequence information alone. Recently, a predictor called “pLoc-mEuk” was developed for identifying the subcellular localization of eukaryotic proteins. Its performance is overwhelmingly better than that of the other predictors for the same purpose, particularly in dealing with multi-label systems where many proteins, called “multiplex proteins”, may simultaneously occur in two or more subcellular locations. Although it is indeed a very powerful predictor, more efforts are definitely needed to further improve it. This is because pLoc-mEuk was trained on an extremely skewed dataset where some subsets were about 200 times the size of others. Accordingly, it cannot avoid the bias caused by such an uneven training dataset. Methods: To alleviate such bias, we have developed a new predictor called pLoc_bal-mEuk by quasi-balancing the training dataset. Cross-validation tests on exactly the same experiment-confirmed dataset have indicated that the proposed new predictor is remarkably superior to pLoc-mEuk, the existing state-of-the-art predictor in identifying the subcellular localization of eukaryotic proteins. It has not escaped our notice that the quasi-balancing treatment can also be used to deal with many other biological systems. Results: To maximize the convenience for most experimental scientists, a user-friendly web-server for the new predictor has been established at http://www.jci-bioinfo.cn/pLoc_bal-mEuk/. Conclusion: It is anticipated that the pLoc_bal-mEuk predictor holds very high potential to become a useful high-throughput tool for identifying the subcellular localization of eukaryotic proteins, particularly for finding multi-target drugs, which is currently a very hot trend in drug development.
Affiliation(s)
- Kuo-Chen Chou
- Gordon Life Science Institute, Boston, MA 02478, United States
- Xiang Cheng
- Gordon Life Science Institute, Boston, MA 02478, United States
- Xuan Xiao
- Gordon Life Science Institute, Boston, MA 02478, United States
|
173
|
Niu B, Liang C, Lu Y, Zhao M, Chen Q, Zhang Y, Zheng L, Chou KC. Glioma stages prediction based on machine learning algorithm combined with protein-protein interaction networks. Genomics 2019; 112:837-847. [PMID: 31150762 DOI: 10.1016/j.ygeno.2019.05.024] [Citation(s) in RCA: 18] [Impact Index Per Article: 3.6] [Received: 04/03/2019] [Accepted: 05/25/2019] [Indexed: 12/18/2022]
Abstract
BACKGROUND Glioma is the most lethal nervous system cancer. Recent studies have made great efforts to study the occurrence and development of glioma, but the molecular mechanisms are still unclear. This study was designed to reveal the molecular mechanisms of glioma based on protein-protein interaction networks combined with machine learning methods. Key differentially expressed genes (DEGs) were screened and selected by using the protein-protein interaction (PPI) networks. RESULTS As a result, 19 genes between grade I and grade II, 21 genes between grade II and grade III, and 20 genes between grade III and grade IV were selected. Then, five machine learning methods were employed to predict the glioma stages based on the selected key genes. After comparison, the Complement Naive Bayes classifier was employed to build the prediction model for grade II-III, with an accuracy of 72.8%. Random forest was employed to build the prediction models for grade I-II and grade III-IV, with accuracies of 97.1% and 83.2%, respectively. Finally, the selected genes were analyzed by PPI networks, Gene Ontology (GO) terms and Kyoto Encyclopedia of Genes and Genomes (KEGG) pathways, and the results improve our understanding of the biological functions of selected DEGs involved in glioma growth. We expect that the key genes expressed have a guiding significance for the occurrence of gliomas or, at the very least, that they are useful for tumor researchers. CONCLUSION Machine learning combined with PPI network, GO and KEGG analyses of selected DEGs improves our understanding of the biological functions involved in glioma growth.
Affiliation(s)
- Bing Niu
- School of Life Sciences, Shanghai University, Shanghai 200444, China; Gordon Life Science Institute, Boston, MA 02478, USA.
- Chaofeng Liang
- Department of Neurosurgery, The Third Affiliated Hospital of Sun Yat-sen University, Guangzhou, China
- Yi Lu
- School of Life Sciences, Shanghai University, Shanghai 200444, China
- Manman Zhao
- School of Life Sciences, Shanghai University, Shanghai 200444, China
- Qin Chen
- School of Life Sciences, Shanghai University, Shanghai 200444, China.
- Yuhui Zhang
- Renji Hospital, Medical School, Shanghai Jiaotong University, 160 Pujian Rd, New Pudong District, Shanghai 200127, China; Changhai Hospital, Second Military Medical University, Shanghai 200433, China.
- Linfeng Zheng
- Department of Radiology, Shanghai General Hospital, Shanghai Jiao Tong University School of Medicine, Shanghai 200080, China; Department of Radiology, Shanghai First People's Hospital, Baoshan Branch, Shanghai 200940, China.
- Kuo-Chen Chou
- Center for Informational Biology, University of Electronic Science and Technology of China, Chengdu 610054, China; Gordon Life Science Institute, Boston, MA 02478, USA.
|
174
|
Schaduangrat N, Nantasenamat C, Prachayasittikul V, Shoombuatong W. ACPred: A Computational Tool for the Prediction and Analysis of Anticancer Peptides. Molecules 2019; 24:E1973. [PMID: 31121946 PMCID: PMC6571645 DOI: 10.3390/molecules24101973] [Citation(s) in RCA: 120] [Impact Index Per Article: 24.0] [Received: 04/01/2019] [Revised: 05/07/2019] [Accepted: 05/17/2019] [Indexed: 01/01/2023] Open
Abstract
Anticancer peptides (ACPs) have emerged as a new class of therapeutic agent for cancer treatment due to their lower toxicity as well as greater efficacy, selectivity and specificity when compared to conventional small molecule drugs. However, the experimental identification of ACPs still remains a time-consuming and expensive endeavor. Therefore, it is desirable to develop and improve upon existing computational models for predicting and characterizing ACPs. In this study, we present a bioinformatics tool called the ACPred, which is an interpretable tool for the prediction and characterization of the anticancer activities of peptides. ACPred was developed by utilizing powerful machine learning models (support vector machine and random forest) and various classes of peptide features. It was observed by a jackknife cross-validation test that ACPred can achieve an overall accuracy of 95.61% in identifying ACPs. In addition, analysis revealed the following distinguishing characteristics that ACPs possess: (i) hydrophobic residue enhances the cationic properties of α-helical ACPs resulting in better cell penetration; (ii) the amphipathic nature of the α-helical structure plays a crucial role in its mechanism of cytotoxicity; and (iii) the formation of disulfide bridges on β-sheets is vital for structural maintenance which correlates with its ability to kill cancer cells. Finally, for the convenience of experimental scientists, the ACPred web server was established and made freely available online.
Affiliation(s)
- Nalini Schaduangrat
- Center of Data Mining and Biomedical Informatics, Faculty of Medical Technology, Mahidol University, Bangkok 10700, Thailand.
- Chanin Nantasenamat
- Center of Data Mining and Biomedical Informatics, Faculty of Medical Technology, Mahidol University, Bangkok 10700, Thailand.
- Virapong Prachayasittikul
- Department of Clinical Microbiology and Applied Technology, Faculty of Medical Technology, Mahidol University, Bangkok 10700, Thailand.
- Watshara Shoombuatong
- Center of Data Mining and Biomedical Informatics, Faculty of Medical Technology, Mahidol University, Bangkok 10700, Thailand.
|
175
|
Ning L, He B, Zhou P, Derda R, Huang J. Molecular Design of Peptide-Fc Fusion Drugs. Curr Drug Metab 2019; 20:203-208. [DOI: 10.2174/1389200219666180821095355] [Citation(s) in RCA: 8] [Impact Index Per Article: 1.6] [Received: 03/09/2017] [Revised: 01/18/2018] [Accepted: 05/29/2018] [Indexed: 12/11/2022]
Abstract
Background: Peptide-Fc fusion drugs, also known as peptibodies, are a category of biological therapeutics in which the Fc region of an antibody is genetically fused to a peptide of interest. However, developing such drugs is laborious and expensive, and rational design is urgently needed. Methods: We summarized the key steps in peptide-Fc fusion technology and stressed the main computational resources, tools, and methods that have been used in the rational design of peptide-Fc fusion drugs. We also raised open questions about the computer-aided molecular design of peptide-Fc. Results: The design of a peptibody consists of four steps. First, identify peptide leads from native ligands, biopanning, and computational design or prediction. Second, select the proper Fc region from different classes or subclasses of immunoglobulin. Third, fuse the peptide leads and Fc together properly. Finally, evaluate the immunogenicity of the constructs. At each step, there are quite a few useful resources and computational tools. Conclusion: Reviewing the molecular design of peptibodies will help make the transition from peptide leads to drugs on the market quicker and cheaper.
Affiliation(s)
- Lin Ning
- Center for Informational Biology, University of Electronic Science and Technology of China, Chengdu, China
- Bifang He
- Center for Informational Biology, University of Electronic Science and Technology of China, Chengdu, China
- Peng Zhou
- Center for Informational Biology, University of Electronic Science and Technology of China, Chengdu, China
- Ratmir Derda
- Department of Chemistry, University of Alberta, Alberta, Canada
- Jian Huang
- Center for Informational Biology, University of Electronic Science and Technology of China, Chengdu, China
|
176
|
Chen W, Feng P, Liu T, Jin D. Recent Advances in Machine Learning Methods for Predicting Heat Shock Proteins. Curr Drug Metab 2019; 20:224-228. [DOI: 10.2174/1389200219666181031105916] [Citation(s) in RCA: 69] [Impact Index Per Article: 13.8] [Received: 01/19/2018] [Revised: 05/21/2018] [Accepted: 08/02/2018] [Indexed: 02/08/2023]
Abstract
Background: As molecular chaperones, Heat Shock Proteins (HSPs) not only play key roles in protein folding and maintaining protein stability, but are also linked with multiple kinds of diseases. Therefore, HSPs have been regarded as a focus of drug design. Since HSPs from different families perform distinct functions, accurately classifying the families of HSPs is the key step to clearly understanding their biological functions. In contrast to labor-intensive and cost-ineffective experimental methods, computational classification of HSP families has emerged as an alternative approach. Methods: We reviewed the papers that described the existing datasets of HSPs and the representative computational approaches developed for the identification and classification of HSPs. Results: The two benchmark datasets of HSPs, namely HSPIR and sHSPdb, were introduced, which provide invaluable resources for computationally identifying HSPs. The gold standard dataset and sequence encoding schemes for building computational methods of classifying HSPs were also introduced. The three representative web-servers for identifying HSPs and their families were described. Conclusion: The existing machine learning methods for identifying the different families of HSPs have yielded quite encouraging results and have played a role in promoting research on HSPs. However, the number of HSPs with known structures is very limited. Therefore, determining the structures of HSPs is also urgent, which will be helpful in revealing their functions.
Affiliation(s)
- Wei Chen
- Innovative Institute of Chinese Medicine and Pharmacy, Chengdu University of Traditional Chinese Medicine, Chengdu 611730, China
- Pengmian Feng
- Hebei Province Key Laboratory of Occupational Health and Safety for Coal Industry, School of Public Health, North China University of Science and Technology, Tangshan 063000, China
- Tao Liu
- School of Sciences, and Center for Genomics and Computational Biology, North China University of Science and Technology, Tangshan 063000, China
- Dianchuan Jin
- School of Sciences, and Center for Genomics and Computational Biology, North China University of Science and Technology, Tangshan 063000, China
|
177
|
Yi HC, You ZH, Zhou X, Cheng L, Li X, Jiang TH, Chen ZH. ACP-DL: A Deep Learning Long Short-Term Memory Model to Predict Anticancer Peptides Using High-Efficiency Feature Representation. MOLECULAR THERAPY. NUCLEIC ACIDS 2019; 17:1-9. [PMID: 31173946 PMCID: PMC6554234 DOI: 10.1016/j.omtn.2019.04.025] [Citation(s) in RCA: 110] [Impact Index Per Article: 22.0] [Received: 03/09/2019] [Revised: 04/08/2019] [Accepted: 04/08/2019] [Indexed: 01/10/2023]
Abstract
Cancer is a well-known killer of human beings, which has led to countless deaths and misery. Anticancer peptides open a promising perspective for cancer treatment, and they have various attractive advantages. Conventional wet experiments are expensive and inefficient for finding and identifying novel anticancer peptides. There is an urgent need to develop a novel computational method to predict novel anticancer peptides. In this study, we propose a deep learning long short-term memory (LSTM) neural network model, ACP-DL, to effectively predict novel anticancer peptides. More specifically, to fully exploit peptide sequence information, we developed an efficient feature representation approach by integrating binary profile feature and k-mer sparse matrix of the reduced amino acid alphabet. Then we implemented a deep LSTM model to automatically learn how to identify anticancer peptides and non-anticancer peptides. To our knowledge, this is the first time that the deep LSTM model has been applied to predict anticancer peptides. It was demonstrated by cross-validation experiments that the proposed ACP-DL remarkably outperformed other comparison methods with high accuracy and satisfied specificity on benchmark datasets. In addition, we also contributed two new anticancer peptides benchmark datasets, ACP740 and ACP240, in this work. The source code and datasets are available at https://github.com/haichengyi/ACP-DL.
Affiliation(s)
- Hai-Cheng Yi
- The Xinjiang Technical Institute of Physics and Chemistry, Chinese Academy of Sciences, Urumqi 830011, China; University of Chinese Academy of Sciences, Beijing 100049, China
- Zhu-Hong You
- The Xinjiang Technical Institute of Physics and Chemistry, Chinese Academy of Sciences, Urumqi 830011, China.
- Xi Zhou
- The Xinjiang Technical Institute of Physics and Chemistry, Chinese Academy of Sciences, Urumqi 830011, China
- Li Cheng
- The Xinjiang Technical Institute of Physics and Chemistry, Chinese Academy of Sciences, Urumqi 830011, China
- Xiao Li
- The Xinjiang Technical Institute of Physics and Chemistry, Chinese Academy of Sciences, Urumqi 830011, China
- Tong-Hai Jiang
- The Xinjiang Technical Institute of Physics and Chemistry, Chinese Academy of Sciences, Urumqi 830011, China
- Zhan-Heng Chen
- The Xinjiang Technical Institute of Physics and Chemistry, Chinese Academy of Sciences, Urumqi 830011, China
|
178
|
Messerli MA, Sarkar A. Advances in Electrochemistry for Monitoring Cellular Chemical Flux. Curr Med Chem 2019; 26:4984-5002. [PMID: 31057100 DOI: 10.2174/0929867326666190506111629] [Citation(s) in RCA: 1] [Impact Index Per Article: 0.2] [Received: 09/28/2018] [Revised: 03/06/2019] [Accepted: 03/12/2019] [Indexed: 11/22/2022]
Abstract
The transport of organic and inorganic molecules, along with inorganic ions across the plasma membrane results in chemical fluxes that reflect the cellular function in healthy and diseased states. Measurement of these chemical fluxes enables the characterization of protein function and transporter stoichiometry, characterization of a single cell and embryo viability prior to implantation, and screening of pharmaceutical agents. Electrochemical sensors emerge as sensitive and non-invasive tools for measuring chemical fluxes immediately outside the cells in the boundary layer, that are capable of monitoring a diverse range of transported analytes including inorganic ions, gases, neurotransmitters, hormones, and pharmaceutical agents. Used on their own or in combination with other methods, these sensors continue to expand our understanding of the function of rare cells and small tissues. Advances in sensor construction and detection strategies continue to improve sensitivity under physiological conditions, diversify analyte detection, and increase throughput. These advances will be discussed in the context of addressing technical challenges to measuring chemical flux in the boundary layer of cells and measuring the resultant changes to the chemical concentration in the bulk media.
Affiliation(s)
- Mark A Messerli
- Department of Biology and Microbiology, South Dakota State University, Brookings, SD, United States
- Anyesha Sarkar
- Department of Biology and Microbiology, South Dakota State University, Brookings, SD, United States
|
179
|
Barukab O, Khan YD, Khan SA, Chou KC. iSulfoTyr-PseAAC: Identify Tyrosine Sulfation Sites by Incorporating Statistical Moments via Chou's 5-steps Rule and Pseudo Components. Curr Genomics 2019; 20:306-320. [PMID: 32030089 PMCID: PMC6983959 DOI: 10.2174/1389202920666190819091609] [Citation(s) in RCA: 22] [Impact Index Per Article: 4.4] [Received: 05/15/2019] [Revised: 08/04/2019] [Accepted: 08/06/2019] [Indexed: 01/01/2023] Open
Abstract
BACKGROUND The amino acid residues in a protein undergo post-translational modification (PTM) during protein synthesis, a process of chemical and physical change in an amino acid that in turn alters the behavioral properties of proteins. Tyrosine sulfation is a ubiquitous post-translational modification that is known to be associated with the regulation of various biological functions and pathological processes. Its identification is therefore necessary to understand its mechanism. Experimental determination through site-directed mutagenesis and high-throughput mass spectrometry is costly and time-consuming, so a reliable computational model is required for the identification of sulfotyrosine sites. METHODOLOGY In this paper, we present a computational model for the prediction of sulfotyrosine sites, named iSulfoTyr-PseAAC, in which feature vectors are constructed using statistical moments of protein amino acid sequences and various position/composition relative features. These features are incorporated into PseAAC. The model is validated by jackknife, cross-validation, self-consistency and independent testing. RESULTS Accuracy determined through validation was 93.93% for the jackknife test, 95.16% for cross-validation, 94.3% for self-consistency and 94.3% for independent testing. CONCLUSION The proposed model performs better than the existing predictors; however, the accuracy can be improved further in the future as the number of known sulfotyrosine sites in proteins increases.
Affiliation(s)
| | | | - Sher Afzal Khan
- Address correspondence to this author at the Department of Information Technology, Faculty of Computing and Information Technology in Rabigh, King Abdulaziz University, P.O. Box 344, Rabigh, 21911, Saudi Arabia; and Department of Computer Sciences, Abdul Wali Khan University, Mardan, Pakistan; E-mail:
|
180
|
Ilyas S, Hussain W, Ashraf A, Khan YD, Khan SA, Chou KC. iMethylK_pseAAC: Improving Accuracy of Lysine Methylation Sites Identification by Incorporating Statistical Moments and Position Relative Features into General PseAAC via Chou's 5-steps Rule. Curr Genomics 2019; 20:275-292. [PMID: 32030087 PMCID: PMC6983956 DOI: 10.2174/1389202920666190809095206] [Citation(s) in RCA: 15] [Impact Index Per Article: 3.0] [Reference Citation Analysis] [Abstract] [Key Words] [Track Full Text] [Download PDF] [Figures] [Journal Information] [Subscribe] [Scholar Register] [Received: 05/08/2019] [Revised: 07/02/2019] [Accepted: 07/26/2019] [Indexed: 02/04/2023] Open
Abstract
BACKGROUND Methylation is one of the most important post-translational modifications in the human body, and it usually arises on lysine, one of the most intensely modified residues. It plays a dynamic role in numerous biological processes, such as regulation of gene expression, regulation of protein function and RNA processing. Identifying lysine methylation sites is therefore an important challenge, as some experimental procedures are time-consuming. OBJECTIVE Herein, we propose a computational predictor named iMethylK_pseAAC to identify lysine methylation sites. METHODS First, we constructed feature vectors based on PseAAC using position and composition relative features and statistical moments. A neural network is trained on the extracted features. The performance of the proposed method is then validated using cross-validation and jackknife testing. RESULTS The objective evaluation of the predictor showed accuracy of 96.7% for self-consistency, 91.61% for 10-fold cross-validation and 93.42% for jackknife testing. CONCLUSION It is concluded that iMethylK_pseAAC outperforms its counterparts for identifying lysine methylation sites, such as iMethyl_pseACC, BPB_pPMS and PMeS.
Affiliation(s)
| | | | | | - Yaser Daanial Khan
- Address correspondence to this author at the Department of Computer Science, School of Systems and Technology, University of Management and Technology, P.O. Box 10033, C-II, Johar Town, Lahore, Pakistan; Tel: +923054440271; E-mail:
|
181
|
SPrenylC-PseAAC: A sequence-based model developed via Chou's 5-steps rule and general PseAAC for identifying S-prenylation sites in proteins. J Theor Biol 2019; 468:1-11. [DOI: 10.1016/j.jtbi.2019.02.007] [Citation(s) in RCA: 98] [Impact Index Per Article: 19.6] [Reference Citation Analysis] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Received: 12/13/2018] [Revised: 02/07/2019] [Accepted: 02/11/2019] [Indexed: 11/22/2022]
|
182
|
mACPpred: A Support Vector Machine-Based Meta-Predictor for Identification of Anticancer Peptides. Int J Mol Sci 2019; 20:ijms20081964. [PMID: 31013619 PMCID: PMC6514805 DOI: 10.3390/ijms20081964] [Citation(s) in RCA: 133] [Impact Index Per Article: 26.6] [Reference Citation Analysis] [Abstract] [Key Words] [Track Full Text] [Download PDF] [Figures] [Journal Information] [Subscribe] [Scholar Register] [Received: 03/15/2019] [Revised: 04/08/2019] [Accepted: 04/18/2019] [Indexed: 12/24/2022] Open
Abstract
Anticancer peptides (ACPs) are promising therapeutic agents for targeting and killing cancer cells. The accurate prediction of ACPs from given peptide sequences remains an open problem in the field of immunoinformatics. Recently, machine learning algorithms have emerged as a promising tool for helping experimental scientists predict ACPs. However, the performance of existing methods still needs to be improved. In this study, we present a novel approach for the accurate prediction of ACPs, which involves the following two steps: (i) We applied a two-step feature selection protocol on seven feature encodings that cover various aspects of sequence information (composition-based, physicochemical properties and profiles) and obtained their corresponding optimal feature-based models. The resultant predicted probabilities of ACPs were further utilized as feature vectors. (ii) The predicted probability feature vectors were in turn used as input to a support vector machine to develop the final prediction model, called mACPpred. Cross-validation analysis showed that the proposed predictor performs significantly better than individual feature encodings. Furthermore, mACPpred significantly outperformed the existing methods compared in this study when objectively evaluated on an independent dataset.
|
183
|
Qu K, Guo F, Liu X, Lin Y, Zou Q. Application of Machine Learning in Microbiology. Front Microbiol 2019; 10:827. [PMID: 31057526 PMCID: PMC6482238 DOI: 10.3389/fmicb.2019.00827] [Citation(s) in RCA: 95] [Impact Index Per Article: 19.0] [Reference Citation Analysis] [Abstract] [Key Words] [Track Full Text] [Download PDF] [Figures] [Journal Information] [Subscribe] [Scholar Register] [Received: 01/31/2019] [Accepted: 04/01/2019] [Indexed: 02/01/2023] Open
Abstract
Microorganisms are ubiquitous and closely related to people's daily lives. Since they were first discovered in the 19th century, researchers have shown great interest in microorganisms. People studied microorganisms through cultivation, but this method is expensive and time-consuming. Moreover, the cultivation method cannot keep pace with the development of high-throughput sequencing technology. To deal with this problem, machine learning (ML) methods have been widely applied to the field of microbiology. Literature reviews have shown that ML can be used in many aspects of microbiology research, especially classification problems, and for exploring the interaction between microorganisms and the surrounding environment. In this study, we summarize the application of ML in microbiology.
Affiliation(s)
- Kaiyang Qu
- College of Intelligence and Computing, Tianjin University, Tianjin, China
| | - Fei Guo
- College of Intelligence and Computing, Tianjin University, Tianjin, China
| | - Xiangrong Liu
- School of Information Science and Technology, Xiamen University, Xiamen, China
| | - Yuan Lin
- School of Information Science and Technology, Xiamen University, Xiamen, China
- Department of System Integration, Sparebanken Vest, Bergen, Norway
| | - Quan Zou
- Institute of Fundamental and Frontier Sciences, University of Electronic Science and Technology of China, Chengdu, China
- Center for Informational Biology, University of Electronic Science and Technology of China, Chengdu, China
|
184
|
Sharma A, Lysenko A, López Y, Dehzangi A, Sharma R, Reddy H, Sattar A, Tsunoda T. HseSUMO: Sumoylation site prediction using half-sphere exposures of amino acids residues. BMC Genomics 2019; 19:982. [PMID: 30999862 PMCID: PMC7402407 DOI: 10.1186/s12864-018-5206-8] [Citation(s) in RCA: 7] [Impact Index Per Article: 1.4] [Reference Citation Analysis] [Abstract] [Track Full Text] [Download PDF] [Figures] [Journal Information] [Subscribe] [Scholar Register] [Received: 05/25/2018] [Accepted: 10/28/2018] [Indexed: 02/06/2023] Open
Abstract
Background Post-translational modifications are viewed as an important mechanism for controlling protein function and are believed to be involved in multiple important diseases. However, their profiling using laboratory-based techniques remains challenging. Therefore, the development of accurate computational methods to predict post-translational modifications is particularly important for making progress in this area of research. Results This work explores the use of four half-sphere exposure-based features for computational prediction of sumoylation sites. Unlike most of the previously proposed approaches, which focused on patterns of amino acid co-occurrence, we were able to demonstrate that protein structure-based features could be sufficiently informative to achieve good predictive performance. The evaluation of our method has demonstrated high sensitivity (0.9), accuracy (0.89) and Matthews correlation coefficient (0.78–0.79). We have compared these results to the recently released pSumo-CD method and were able to demonstrate better performance of our method on the same evaluation dataset. Conclusions The proposed predictor HseSUMO uses half-sphere exposures of amino acids to predict sumoylation sites. It has shown promising results on a benchmark dataset when compared with the state-of-the-art method. The extracted data of this study can be accessed at https://github.com/YosvanyLopez/HseSUMO. Electronic supplementary material The online version of this article (10.1186/s12864-018-5206-8) contains supplementary material, which is available to authorized users.
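The three metrics reported in this abstract (sensitivity, accuracy and the Matthews correlation coefficient) all derive from a binary confusion matrix. The sketch below uses their standard definitions. The function name and the example counts are hypothetical and are not taken from the HseSUMO evaluation.

```python
import math


def binary_metrics(tp, fp, tn, fn):
    """Sensitivity, accuracy and MCC from confusion-matrix counts."""
    sensitivity = tp / (tp + fn)
    accuracy = (tp + tn) / (tp + fp + tn + fn)
    denom = math.sqrt((tp + fp) * (tp + fn) * (tn + fp) * (tn + fn))
    # MCC is conventionally set to 0 when any marginal total is zero.
    mcc = (tp * tn - fp * fn) / denom if denom else 0.0
    return {"sensitivity": sensitivity, "accuracy": accuracy, "mcc": mcc}


# Hypothetical counts chosen only to exercise the formulas.
metrics = binary_metrics(tp=45, fp=10, tn=40, fn=5)
print(metrics)
```

Unlike accuracy, the MCC uses all four cells of the confusion matrix, which is why it is often reported alongside accuracy when the positive and negative classes are imbalanced.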
Affiliation(s)
- Alok Sharma
- Institute for Integrated and Intelligent Systems, Griffith University, Q, Brisbane, LD-4111, Australia. .,Laboratory for Medical Science Mathematics, RIKEN Center for Integrative Medical Sciences, Yokohama, Kanagawa, Japan. .,School of Engineering and Physics, Faculty of Science, Technology and Environment, University of the South Pacific, Suva, Fiji Islands.
| | - Artem Lysenko
- Laboratory for Medical Science Mathematics, RIKEN Center for Integrative Medical Sciences, Yokohama, Kanagawa, Japan
| | - Yosvany López
- Genesis Institute of Genetic Research, Genesis Healthcare Co, Tokyo, Japan
| | - Abdollah Dehzangi
- Department of Computer Science, Morgan State University, Baltimore, MD, USA
| | - Ronesh Sharma
- School of Engineering and Physics, Faculty of Science, Technology and Environment, University of the South Pacific, Suva, Fiji Islands.,School of Electrical and Electronics Engineering, Fiji National University, Suva, Fiji
| | - Hamendra Reddy
- School of Engineering and Physics, Faculty of Science, Technology and Environment, University of the South Pacific, Suva, Fiji Islands
| | - Abdul Sattar
- Institute for Integrated and Intelligent Systems, Griffith University, Q, Brisbane, LD-4111, Australia
| | - Tatsuhiko Tsunoda
- Laboratory for Medical Science Mathematics, RIKEN Center for Integrative Medical Sciences, Yokohama, Kanagawa, Japan. .,Department of Medical Science Mathematics, Medical Research Institute, Tokyo Medical and Dental University, Tokyo, Japan. .,CREST, JST, Tokyo, 113-8510, Japan.
|
185
|
Pan Q, Guo Y, Guo L, Liao S, Zhao C, Wang S, Liu HF. Mechanistic Insights of Chemicals and Drugs as Risk Factors for Systemic Lupus Erythematosus. Curr Med Chem 2019; 27:5175-5188. [PMID: 30947650 DOI: 10.2174/0929867326666190404140658] [Citation(s) in RCA: 4] [Impact Index Per Article: 0.8] [Reference Citation Analysis] [Abstract] [Key Words] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Received: 11/08/2018] [Revised: 03/25/2019] [Accepted: 03/27/2019] [Indexed: 12/21/2022]
Abstract
Systemic Lupus Erythematosus (SLE) is a chronic and relapsing heterogeneous autoimmune disease that primarily affects women of reproductive age. Genetic and environmental risk factors are involved in the pathogenesis of SLE, and susceptibility genes have recently been identified. However, as gene therapy is far from clinical application, further investigation of environmental risk factors could reveal important therapeutic approaches. We systematically explored two groups of environmental risk factors: chemicals (including silica, solvents, pesticides, hydrocarbons, heavy metals, and particulate matter) and drugs (including procainamide, hydralazine, quinidine, D-penicillamine, isoniazid, and methyldopa). Furthermore, the mechanisms underlying risk factors, such as genetic factors, epigenetic change, and disrupted immune tolerance, were explored. This review identifies novel risk factors and their underlying mechanisms. Practicable measures for the management of these risk factors will benefit SLE patients and provide potential therapeutic strategies.
Affiliation(s)
- Qingjun Pan
- Key Laboratory of Prevention and Management of Chronic Kidney Disease of Zhanjiang City, Affiliated Hospital of Guangdong Medical University, 57th South Renmin Road, Zhanjiang 524001, Guangdong, China
| | - Yun Guo
- Key Laboratory of Prevention and Management of Chronic Kidney Disease of Zhanjiang City, Affiliated Hospital of Guangdong Medical University, 57th South Renmin Road, Zhanjiang 524001, Guangdong, China
| | - Linjie Guo
- Key Laboratory of Prevention and Management of Chronic Kidney Disease of Zhanjiang City, Affiliated Hospital of Guangdong Medical University, 57th South Renmin Road, Zhanjiang 524001, Guangdong, China
| | - Shuzhen Liao
- Key Laboratory of Prevention and Management of Chronic Kidney Disease of Zhanjiang City, Affiliated Hospital of Guangdong Medical University, 57th South Renmin Road, Zhanjiang 524001, Guangdong, China
| | - Chunfei Zhao
- Key Laboratory of Prevention and Management of Chronic Kidney Disease of Zhanjiang City, Affiliated Hospital of Guangdong Medical University, 57th South Renmin Road, Zhanjiang 524001, Guangdong, China
| | - Sijie Wang
- Key Laboratory of Prevention and Management of Chronic Kidney Disease of Zhanjiang City, Affiliated Hospital of Guangdong Medical University, 57th South Renmin Road, Zhanjiang 524001, Guangdong, China
| | - Hua-Feng Liu
- Key Laboratory of Prevention and Management of Chronic Kidney Disease of Zhanjiang City, Affiliated Hospital of Guangdong Medical University, 57th South Renmin Road, Zhanjiang 524001, Guangdong, China
|
186
|
Grisoni F, Neuhaus CS, Hishinuma M, Gabernet G, Hiss JA, Kotera M, Schneider G. De novo design of anticancer peptides by ensemble artificial neural networks. J Mol Model 2019; 25:112. [PMID: 30953170 DOI: 10.1007/s00894-019-4007-6] [Citation(s) in RCA: 21] [Impact Index Per Article: 4.2] [Reference Citation Analysis] [Abstract] [Key Words] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Received: 01/28/2019] [Accepted: 03/21/2019] [Indexed: 12/17/2022]
Abstract
Membranolytic anticancer peptides (ACPs) are drawing increasing attention as potential future therapeutics against cancer, due to their ability to hinder the development of cellular resistance and their potential to overcome common hurdles of chemotherapy, e.g., side effects and cytotoxicity. In this work, we present an ensemble machine learning model to design potent ACPs. Four counter-propagation artificial neural-networks were trained to identify peptides that kill breast and/or lung cancer cells. For prospective application of the ensemble model, we selected 14 peptides from a total of 1000 de novo designs, for synthesis and testing in vitro on breast cancer (MCF7) and lung cancer (A549) cell lines. Six de novo designs showed anticancer activity in vitro, five of which against both MCF7 and A549 cell lines. The novel active peptides populate uncharted regions of ACP sequence space.
Affiliation(s)
- Francesca Grisoni
- Department of Chemistry and Applied Biosciences, RETHINK, ETH Zurich, Vladimir-Prelog-Weg 4, 8093, Zurich, Switzerland. .,Department of Earth and Environmental Sciences, University of Milano-Bicocca, Piazza della Scienza 1, 20126, Milan, Italy.
| | - Claudia S Neuhaus
- Department of Chemistry and Applied Biosciences, RETHINK, ETH Zurich, Vladimir-Prelog-Weg 4, 8093, Zurich, Switzerland
| | - Miyabi Hishinuma
- Department of Chemistry and Applied Biosciences, RETHINK, ETH Zurich, Vladimir-Prelog-Weg 4, 8093, Zurich, Switzerland.,Department of Chemical System Engineering, School of Engineering, University of Tokyo, 7-3-1 Hongo, Bunkyo-ku, Tokyo, 113-8656, Japan.,School of Life Science and Technology, Tokyo Institute of Technology, 1-11-5, Midorigaoka, Meguro-ku, Tokyo, 152-0034, Japan
| | - Gisela Gabernet
- Department of Chemistry and Applied Biosciences, RETHINK, ETH Zurich, Vladimir-Prelog-Weg 4, 8093, Zurich, Switzerland
| | - Jan A Hiss
- Department of Chemistry and Applied Biosciences, RETHINK, ETH Zurich, Vladimir-Prelog-Weg 4, 8093, Zurich, Switzerland
| | - Masaaki Kotera
- Department of Chemical System Engineering, School of Engineering, University of Tokyo, 7-3-1 Hongo, Bunkyo-ku, Tokyo, 113-8656, Japan
| | - Gisbert Schneider
- Department of Chemistry and Applied Biosciences, RETHINK, ETH Zurich, Vladimir-Prelog-Weg 4, 8093, Zurich, Switzerland.
|
187
|
Are peptides a solution for the treatment of hyperactivated JAK3 pathways? Inflammopharmacology 2019; 27:433-452. [PMID: 30929155 DOI: 10.1007/s10787-019-00589-2] [Citation(s) in RCA: 1] [Impact Index Per Article: 0.2] [Reference Citation Analysis] [Abstract] [Key Words] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Received: 01/29/2019] [Accepted: 03/18/2019] [Indexed: 01/10/2023]
Abstract
While inactivating mutations that eliminate JAK3 function lead to immunological disorders such as severe combined immunodeficiency, activating mutations that cause constitutive JAK3 signaling are known to trigger various types of cancer or are responsible for autoimmune diseases, such as rheumatoid arthritis, psoriasis, or inflammatory bowel diseases. Treatment of hyperactivated JAK3 is still an obstacle, owing to the differing sensitivity of mutation types to conventional drugs and to unwanted side effects: these drugs are not absolutely specific for JAK3 and thus also inhibit other members of the JAK family. The lack of information on how far sole inhibition of JAK3 is necessary for elimination of the disease calls for the development of isoform-specific JAK3 inhibitors. Besides this strategy, peptides are currently a rising alternative as chemo- or immunotherapeutics, but are still sparsely represented in drug development and clinical trials. Beyond a possible direct inhibitory function, crossing the cancer cell membrane and interfering with disease-causing pathways or triggering apoptosis, peptides could in future be used as adjunct remedies to potentiate traditional therapy and preserve non-affected cells. To discuss these topics, this review deals with the knowledge about the structure-function of JAK3 and the current state of the art of isoform-specific inhibitor development, as well as the function of currently approved drugs or those currently being tested in clinical trials. Furthermore, several strategies for the application of peptide-based drugs for cancer therapy and the physicochemical and structural relations to peptide efficacy are discussed, and an overview of peptide sequences that qualified for clinical trials is given.
|
188
|
Wu Q, Ke H, Li D, Wang Q, Fang J, Zhou J. Recent Progress in Machine Learning-based Prediction of Peptide Activity for Drug Discovery. Curr Top Med Chem 2019; 19:4-16. [DOI: 10.2174/1568026619666190122151634] [Citation(s) in RCA: 20] [Impact Index Per Article: 4.0] [Reference Citation Analysis] [Abstract] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Received: 09/13/2018] [Revised: 11/14/2018] [Accepted: 11/16/2018] [Indexed: 12/25/2022]
Abstract
Over the past decades, peptides as therapeutic candidates have received increasing attention in drug discovery, especially antimicrobial peptides (AMPs), anticancer peptides (ACPs) and anti-inflammatory peptides (AIPs). It is considered that peptides can regulate various complex diseases that were previously untouchable. In recent years, the critical problem of antimicrobial resistance has driven the pharmaceutical industry to look for new therapeutic agents. Compared to organic small-molecule drugs, peptide-based therapy exhibits high specificity and minimal toxicity. Thus, peptides are widely recruited in the design and discovery of new potent drugs. Currently, large-scale screening of peptide activity with traditional approaches is costly, time-consuming and labor-intensive. Hence, in silico methods, mainly machine learning approaches, have been introduced to predict peptide activity because of their accuracy and effectiveness. In this review, we document the recent progress in machine learning-based prediction of peptides, which will be of great benefit to the discovery of potential active AMPs, ACPs and AIPs.
Affiliation(s)
- Qihui Wu
- Institute of Clinical Pharmacology, Guangzhou University of Chinese Medicine, Guangzhou 510405, China
| | - Hanzhong Ke
- Department of Pathobiology, University of Illinois at Urbana-Champaign, Urbana, Illinois 61802, United States
| | - Dongli Li
- Institute of Clinical Pharmacology, Guangzhou University of Chinese Medicine, Guangzhou 510405, China
| | - Qi Wang
- Institute of Clinical Pharmacology, Guangzhou University of Chinese Medicine, Guangzhou 510405, China
| | - Jiansong Fang
- Institute of Clinical Pharmacology, Guangzhou University of Chinese Medicine, Guangzhou 510405, China
| | - Jingwei Zhou
- Institute of Clinical Pharmacology, Guangzhou University of Chinese Medicine, Guangzhou 510405, China
|
189
|
Lu Y, Wang S, Wang J, Zhou G, Zhang Q, Zhou X, Niu B, Chen Q, Chou KC. An Epidemic Avian Influenza Prediction Model Based on Google Trends. LETT ORG CHEM 2019. [DOI: 10.2174/1570178615666180724103325] [Citation(s) in RCA: 24] [Impact Index Per Article: 4.8] [Reference Citation Analysis] [Abstract] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Indexed: 02/03/2023]
Abstract
The occurrence of epidemic avian influenza (EAI) not only hinders the development of a country's agricultural economy, but also seriously affects human life. Recently, the information collected from Google Trends has been increasingly used to predict various epidemics. In this study, using the relevant keywords in Google Trends as well as the multiple linear regression approach, a model was developed to predict the occurrence of epidemic avian influenza. Rigorous cross-validations demonstrated that the success rates achieved by the new model were quite high, indicating that the predictor will become a very useful tool for hospitals and health providers.
Affiliation(s)
- Yi Lu
- School of Life Sciences, Shanghai University, Shanghai, China
| | - Shuo Wang
- School of Life Sciences, Shanghai University, Shanghai, China
| | - Jianying Wang
- School of Life Sciences, Shanghai University, Shanghai, China
| | - Guangya Zhou
- School of Life Sciences, Shanghai University, Shanghai, China
| | - Qiang Zhang
- Technical Center for Animal Plant and Food Inspection and Quarantine, Shanghai, China
| | - Xiang Zhou
- Institute of Heating, Ventilating & Air Conditioning Engineering, School of Mechanical Engineering, Tongji University, 1239 Siping Road, Shanghai, China
| | - Bing Niu
- School of Life Sciences, Shanghai University, Shanghai, China
| | - Qin Chen
- School of Life Sciences, Shanghai University, Shanghai, China
| | - Kuo-Chen Chou
- Gordon Life Science Institute, Boston, MA, United States
|
190
|
Yang W, Zhu XJ, Huang J, Ding H, Lin H. A Brief Survey of Machine Learning Methods in Protein Sub-Golgi Localization. Curr Bioinform 2019. [DOI: 10.2174/1574893613666181113131415] [Citation(s) in RCA: 111] [Impact Index Per Article: 22.2] [Reference Citation Analysis] [Abstract] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Indexed: 01/08/2023]
Abstract
Background: The location of proteins in a cell can provide important clues to their functions in various biological processes. Thus, the application of machine learning methods to the prediction of protein subcellular localization has become a hotspot in bioinformatics. As one of the key organelles, the Golgi apparatus is in charge of protein storage, packaging, and distribution. Objective: The identification of protein location in the Golgi apparatus will provide in-depth insights into protein functions. Thus, machine learning-based methods of predicting protein location in the Golgi apparatus have been extensively explored. The development of protein sub-Golgi apparatus localization prediction should be reviewed to provide a complete background for the field. Method: The benchmark dataset, feature extraction, machine learning methods and published results were summarized. Results: We briefly introduced the recent progress in protein sub-Golgi apparatus localization prediction using machine learning methods and discussed their advantages and disadvantages. Conclusion: We pointed out the perspective of machine learning methods in protein sub-Golgi localization prediction.
Affiliation(s)
- Wuritu Yang
- Key Laboratory for Neuro-Information of Ministry of Education, School of Life Science and Technology, Center for Informational Biology, University of Electronic Science and Technology of China, Chengdu, Sichuan, 610054, China
| | - Xiao-Juan Zhu
- Key Laboratory for Neuro-Information of Ministry of Education, School of Life Science and Technology, Center for Informational Biology, University of Electronic Science and Technology of China, Chengdu, Sichuan, 610054, China
| | - Jian Huang
- Key Laboratory for Neuro-Information of Ministry of Education, School of Life Science and Technology, Center for Informational Biology, University of Electronic Science and Technology of China, Chengdu, Sichuan, 610054, China
| | - Hui Ding
- Key Laboratory for Neuro-Information of Ministry of Education, School of Life Science and Technology, Center for Informational Biology, University of Electronic Science and Technology of China, Chengdu, Sichuan, 610054, China
| | - Hao Lin
- Key Laboratory for Neuro-Information of Ministry of Education, School of Life Science and Technology, Center for Informational Biology, University of Electronic Science and Technology of China, Chengdu, Sichuan, 610054, China
|
191
|
Spänig S, Heider D. Encodings and models for antimicrobial peptide classification for multi-resistant pathogens. BioData Min 2019; 12:7. [PMID: 30867681 PMCID: PMC6399931 DOI: 10.1186/s13040-019-0196-x] [Citation(s) in RCA: 51] [Impact Index Per Article: 10.2] [Reference Citation Analysis] [Abstract] [Key Words] [Track Full Text] [Download PDF] [Figures] [Journal Information] [Subscribe] [Scholar Register] [Received: 12/21/2018] [Accepted: 02/24/2019] [Indexed: 01/10/2023] Open
Abstract
Antimicrobial peptides (AMPs) are part of the inherent immune system. In fact, they occur in almost all organisms including, e.g., plants, animals, and humans. Remarkably, they are also effective against multi-resistant pathogens with a high selectivity. This is especially crucial at a time when society is faced with the major threat of an ever-increasing number of antibiotic-resistant microbes. In addition, AMPs can also exhibit antitumor and antiviral effects; thus, a variety of scientific studies have dealt with the prediction of active peptides in recent years. Due to their potential, even the pharmaceutical industry is keen on discovering and developing novel AMPs. However, AMPs are difficult to verify in vitro, hence researchers conduct sequence similarity experiments against known, active peptides. Unfortunately, this approach is very time-consuming and limits potential candidates to sequences with a high similarity to known AMPs. Machine learning methods offer the opportunity to explore the huge space of sequence variations in a timely manner. These algorithms have, in principle, paved the way for an automated discovery of AMPs. However, machine learning models require a numerical input, thus an informative encoding is very important. Unfortunately, developing an appropriate encoding is a major challenge, which has not been entirely solved so far. For this reason, the development of novel amino acid encodings is established as a stand-alone research branch. The present review introduces state-of-the-art encodings of amino acids as well as their properties in sequence and structure based aggregation. Moreover, although a well-chosen encoding is essential, performant classifiers are required, which is reflected by a tendency towards specifically designed models in the literature. Furthermore, we introduce these models with a particular focus on encodings derived from support vector machines and deep learning approaches.
Although a strong focus has been set on AMP predictions, not all of the mentioned encodings have been elaborated as part of antimicrobial research studies, but rather as general protein or peptide representations.
Affiliation(s)
- Sebastian Spänig
- Department of Bioinformatics, Faculty of Mathematics and Computer Science, Philipps-University of Marburg, Marburg, Germany
| | - Dominik Heider
- Department of Bioinformatics, Faculty of Mathematics and Computer Science, Philipps-University of Marburg, Marburg, Germany
|
192
|
SPalmitoylC-PseAAC: A sequence-based model developed via Chou's 5-steps rule and general PseAAC for identifying S-palmitoylation sites in proteins. Anal Biochem 2019; 568:14-23. [DOI: 10.1016/j.ab.2018.12.019] [Citation(s) in RCA: 93] [Impact Index Per Article: 18.6] [Reference Citation Analysis] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Received: 11/29/2018] [Revised: 12/19/2018] [Accepted: 12/22/2018] [Indexed: 02/06/2023]
|
193
|
MFSC: Multi-voting based feature selection for classification of Golgi proteins by adopting the general form of Chou's PseAAC components. J Theor Biol 2019; 463:99-109. [DOI: 10.1016/j.jtbi.2018.12.017] [Citation(s) in RCA: 39] [Impact Index Per Article: 7.8] [Reference Citation Analysis] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Received: 09/03/2018] [Revised: 12/02/2018] [Accepted: 12/14/2018] [Indexed: 12/29/2022]
|
194
|
Rout S, Mahapatra RK. In silico analysis of plasmodium falciparum CDPK5 protein through molecular modeling, docking and dynamics. J Theor Biol 2019; 461:254-267. [DOI: 10.1016/j.jtbi.2018.10.045] [Citation(s) in RCA: 10] [Impact Index Per Article: 2.0] [Reference Citation Analysis] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Received: 08/21/2018] [Revised: 10/15/2018] [Accepted: 10/22/2018] [Indexed: 10/28/2022]
|
195
|
Jia J, Li X, Qiu W, Xiao X, Chou KC. iPPI-PseAAC(CGR): Identify protein-protein interactions by incorporating chaos game representation into PseAAC. J Theor Biol 2019; 460:195-203. [DOI: 10.1016/j.jtbi.2018.10.021] [Citation(s) in RCA: 78] [Impact Index Per Article: 15.6] [Reference Citation Analysis] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Received: 08/05/2018] [Revised: 09/16/2018] [Accepted: 10/08/2018] [Indexed: 01/11/2023]
|
196
|
Jiang QX. Structural Variability in the RLR-MAVS Pathway and Sensitive Detection of Viral RNAs. Med Chem 2019; 15:443-458. [PMID: 30569868 PMCID: PMC6858087 DOI: 10.2174/1573406415666181219101613] [Citation(s) in RCA: 8] [Impact Index Per Article: 1.6] [Reference Citation Analysis] [Abstract] [Key Words] [MESH Headings] [Grants] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Received: 10/05/2018] [Revised: 10/23/2018] [Accepted: 12/12/2018] [Indexed: 12/25/2022]
Abstract
Cells need high-sensitivity detection of non-self molecules in order to fight against pathogens. These cellular sensors are thus of significant importance to medicinal purposes, especially for treating novel emerging pathogens. RIG-I-like receptors (RLRs) are intracellular sensors for viral RNAs (vRNAs). Their active forms activate mitochondrial antiviral signaling protein (MAVS) and trigger downstream immune responses against viral infection. Functional and structural studies of the RLR-MAVS signaling pathway have revealed significant supramolecular variability in the past few years, which revealed different aspects of the functional signaling pathway. Here I will discuss the molecular events of RLR-MAVS pathway from the angle of detecting single copy or a very low copy number of vRNAs in the presence of non-specific competition from cytosolic RNAs, and review key structural variability in the RLR / vRNA complexes, the MAVS helical polymers, and the adapter-mediated interactions between the active RLR / vRNA complex and the inactive MAVS in triggering the initiation of the MAVS filaments. These structural variations may not be exclusive to each other, but instead may reflect the adaptation of the signaling pathways to different conditions or reach different levels of sensitivity in its response to exogenous vRNAs.
Affiliation(s)
- Qiu-Xing Jiang
- Department of Microbiology and Cell Science, University of Florida, Gainesville, FL 32611, United States
|
197
|
Zhang S, Lin J, Su L, Zhou Z. pDHS-DSET: Prediction of DNase I hypersensitive sites in plant genome using DS evidence theory. Anal Biochem 2019; 564-565:54-63. [DOI: 10.1016/j.ab.2018.10.018] [Citation(s) in RCA: 11] [Impact Index Per Article: 2.2] [Received: 08/23/2018] [Revised: 10/10/2018] [Accepted: 10/15/2018] [Indexed: 10/28/2022]
|
198
|
Xiao X, Xu ZC, Qiu WR, Wang P, Ge HT, Chou KC. iPSW(2L)-PseKNC: A two-layer predictor for identifying promoters and their strength by hybrid features via pseudo K-tuple nucleotide composition. Genomics 2018; 111:1785-1793. [PMID: 30529532 DOI: 10.1016/j.ygeno.2018.12.001] [Citation(s) in RCA: 44] [Impact Index Per Article: 7.3] [Received: 10/31/2018] [Revised: 11/20/2018] [Accepted: 12/04/2018] [Indexed: 12/20/2022]
Abstract
The promoter is a regulatory DNA region about 81-1000 base pairs long, usually located near the transcription start site (TSS) along upstream of a given gene. By combining a certain protein called transcription factor, the promoter provides the starting point for regulated gene transcription, and hence plays a vitally important role in gene transcriptional regulation. With explosive growth of DNA sequences in the post-genomic age, it has become an urgent challenge to develop computational method for effectively identifying promoters because the information thus obtained is very useful for both basic research and drug development. Although some prediction methods were developed in this regard, most of them were limited at merely identifying whether a query DNA sequence being of a promoter or not. However, based on their strength-distinct levels for transcriptional activation and expression, promoter should be divided into two categories: strong and weak types. Here a new two-layer predictor, called "iPSW(2L)-PseKNC", was developed by fusing the physicochemical properties of nucleotides and their nucleotide density into PseKNC (pseudo K-tuple nucleotide composition). Its 1st-layer serves to predict whether a query DNA sequence sample is of promoter or not, while its 2nd-layer is able to predict the strength of promoters. It has been observed through rigorous cross-validations that the 1st-layer sub-predictor is remarkably superior to the existing state-of-the-art predictors in identifying the promoters and non-promoters, and that the 2nd-layer sub-predictor can do what is beyond the reach of the existing predictors. Moreover, the web-server for iPSW(2L)-PseKNC has been established at http://www.jci-bioinfo.cn/iPSW(2L)-PseKNC, by which the majority of experimental scientists can easily get the results they need.
Affiliation(s)
- Xuan Xiao
- Computer Department, Jingdezhen Ceramic Institute, Jingdezhen, China; The Gordon Life Science Institute, Boston, MA 02478, USA
- Zhao-Chun Xu
- Computer Department, Jingdezhen Ceramic Institute, Jingdezhen, China
- Wang-Ren Qiu
- Computer Department, Jingdezhen Ceramic Institute, Jingdezhen, China; The Gordon Life Science Institute, Boston, MA 02478, USA
- Peng Wang
- Computer Department, Jingdezhen Ceramic Institute, Jingdezhen, China
- Hui-Ting Ge
- Computer Department, Jingdezhen Ceramic Institute, Jingdezhen, China
- Kuo-Chen Chou
- The Gordon Life Science Institute, Boston, MA 02478, USA; Center for Informational Biology, University of Electronic Science and Technology of China, Chengdu 610054, China
|
199
|
Cheng X, Xiao X, Chou KC. pLoc_bal-mGneg: Predict subcellular localization of Gram-negative bacterial proteins by quasi-balancing training dataset and general PseAAC. J Theor Biol 2018; 458:92-102. [DOI: 10.1016/j.jtbi.2018.09.005] [Citation(s) in RCA: 65] [Impact Index Per Article: 10.8] [Received: 08/16/2018] [Revised: 09/05/2018] [Accepted: 09/07/2018] [Indexed: 01/03/2023]
|
200
|
Wei L, Zhou C, Chen H, Song J, Su R. ACPred-FL: a sequence-based predictor using effective feature representation to improve the prediction of anti-cancer peptides. Bioinformatics 2018; 34:4007-4016. [PMID: 29868903 PMCID: PMC6247924 DOI: 10.1093/bioinformatics/bty451] [Citation(s) in RCA: 218] [Impact Index Per Article: 36.3] [Received: 04/07/2018] [Revised: 05/14/2018] [Accepted: 05/29/2018] [Indexed: 11/15/2022] Open
Abstract
Motivation: Anti-cancer peptides (ACPs) have recently emerged as promising therapeutic agents for cancer treatment. Due to the avalanche of protein sequence data in the post-genomic era, there is an urgent need to develop automated computational methods to enable fast and accurate identification of novel ACPs within the vast number of candidate proteins and peptides.
Results: To address this, we propose a novel predictor named Anti-Cancer peptide Predictor with Feature representation Learning (ACPred-FL) for accurate prediction of ACPs based on sequence information. More specifically, we develop an effective feature representation learning model, with which we can extract and learn a set of informative features from a pool of support vector machine-based models trained using sequence-based feature descriptors. By doing so, the class label information of data samples is fully utilized. To improve the feature representation, we further employ a two-step feature selection technique, resulting in a most informative five-dimensional feature vector for the final peptide representation. Experimental results show that such five features provide the most discriminative power for identifying ACPs than currently available feature descriptors, highlighting the effectiveness of the proposed feature representation learning approach. The developed ACPred-FL method significantly outperforms state-of-the-art methods.
Availability and implementation: The web-server of ACPred-FL is available at http://server.malab.cn/ACPred-FL.
Supplementary information: Supplementary data are available at Bioinformatics online.
Affiliation(s)
- Leyi Wei
- School of Computer Science and Technology, Tianjin University, Tianjin, China
- State Key Laboratory of Medicinal Chemical Biology, Nankai University, Tianjin, China
- Chen Zhou
- School of Computer Science and Technology, Tianjin University, Tianjin, China
- Huangrong Chen
- School of Computer Science and Technology, Tianjin University, Tianjin, China
- Jiangning Song
- Biomedicine Discovery Institute and Department of Biochemistry and Molecular Biology
- Monash Centre for Data Science, Faculty of Information Technology, Monash University, Clayton, VIC 3800, Australia
- Ran Su
- School of Computer Software, Tianjin University, Tianjin, China
- State Key Laboratory of Medicinal Chemical Biology, Nankai University, Tianjin, China
|