1
Yang W, Zou S, Gao H, Wang L, Ni W. A Novel Method for Targeted Identification of Essential Proteins by Integrating Chemical Reaction Optimization and Naive Bayes Model. IEEE/ACM Transactions on Computational Biology and Bioinformatics 2024; 21:1274-1286. [PMID: 38536675 DOI: 10.1109/tcbb.2024.3382392] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 10/10/2024]
Abstract
Targeted identification of essential proteins is of great significance for species identification, drug manufacturing, and disease treatment. It is a challenge to analyze the binding mechanism between essential proteins and to improve identification speed while preserving accuracy. This paper proposes a novel method called EPCRO, which combines the chemical reaction optimization (CRO) algorithm with the naive Bayes model to detect essential proteins. In EPCRO, the naive Bayes model is employed to analyze the homogeneity between proteins. To improve the identification rate and speed, the protein homogeneity rate is integrated into the CRO algorithm to balance local and global searches. EPCRO is experimentally compared with 17 existing methods (DC, SC, IC, EC, LAC, NC, PeC, WDC, EPD-RW, RWHN, TEGS, CFMM, BSPM, AFSO-EP, CVIM, RWEP, and EPPSO-DC) on biological datasets. The results show that EPCRO is superior to these methods in identification accuracy and speed.
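Several of the baselines named in this abstract are topological centralities, of which DC (degree centrality) is the simplest: a protein is scored by its number of interaction partners. The following is a minimal sketch of that baseline only, not of EPCRO; it assumes the PPI network is given as an undirected edge list, and the function names and toy protein names are hypothetical.

```python
from collections import defaultdict

def degree_centrality(edges):
    """Return {protein: number of distinct interaction partners}."""
    neighbors = defaultdict(set)
    for u, v in edges:
        if u == v:
            continue  # ignore self-interactions
        neighbors[u].add(v)
        neighbors[v].add(u)
    return {p: len(n) for p, n in neighbors.items()}

def rank_by_degree(edges):
    """Proteins sorted by descending degree, ties broken by name."""
    dc = degree_centrality(edges)
    return sorted(dc, key=lambda p: (-dc[p], p))

# Hypothetical toy network (names are illustrative, not from the paper).
toy_edges = [("P1", "P2"), ("P1", "P3"), ("P1", "P4"), ("P2", "P3")]
ranking = rank_by_degree(toy_edges)
```

Under this baseline the top-ranked proteins are taken as the essential-protein candidates.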
2
Ye C, Wu Q, Chen S, Zhang X, Xu W, Wu Y, Zhang Y, Yue Y. ECDEP: identifying essential proteins based on evolutionary community discovery and subcellular localization. BMC Genomics 2024; 25:117. [PMID: 38279081 PMCID: PMC10821549 DOI: 10.1186/s12864-024-10019-5] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 12/07/2023] [Accepted: 01/15/2024] [Indexed: 01/28/2024]
Abstract
BACKGROUND In cellular activities, essential proteins play a vital role and are instrumental in comprehending fundamental biological necessities and identifying pathogenic genes. Current deep learning approaches for predicting essential proteins underutilize the potential of gene expression data, are inadequate for exploring dynamic networks, and have been evaluated on only a limited range of species. RESULTS We introduce ECDEP, an essential protein identification model based on evolutionary community discovery. ECDEP integrates temporal gene expression data with a protein-protein interaction (PPI) network and employs the 3-Sigma rule to eliminate outliers at each time point, constructing a dynamic network. Next, we utilize edge birth and death information to establish an interaction streaming source to feed into the evolutionary community discovery algorithm and then identify overlapping communities during the evolution of the dynamic network. SVM recursive feature elimination (RFE) is applied to extract the most informative communities, which are combined with subcellular localization data for classification predictions. We assess the performance of ECDEP by comparing it against ten centrality methods, four shallow machine learning methods with RFE, and two deep learning methods that incorporate multiple biological data sources on Saccharomyces cerevisiae (S. cerevisiae), Homo sapiens (H. sapiens), Mus musculus, and Caenorhabditis elegans. ECDEP achieves an AP value of 0.86 on the H. sapiens dataset, and the contribution ratio of community features in classification reaches 0.54 on the S. cerevisiae (Krogan) dataset. CONCLUSIONS Our proposed method adeptly integrates network dynamics and yields outstanding results across various datasets. Furthermore, the incorporation of evolutionary community discovery algorithms amplifies the capacity of gene expression data in classification.
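The dynamic-network step can be illustrated with a short sketch. It assumes one common reading of the 3-Sigma rule, namely that a protein counts as active at a time point when its expression exceeds the mean plus k standard deviations of its own profile (k = 3 by default), and that an interaction exists at a time point when both partners are active. The abstract does not state the exact thresholding used in ECDEP, so the function names, the parameter `k`, and the toy profiles are all assumptions.

```python
from statistics import mean, pstdev

def active_time_points(expression, k=3.0):
    """Indices where expression exceeds mean + k * (population) std."""
    threshold = mean(expression) + k * pstdev(expression)
    return {t for t, x in enumerate(expression) if x > threshold}

def dynamic_edges(edges, profiles, k=3.0):
    """Map each time point to the set of edges whose two endpoints are both active."""
    active = {p: active_time_points(v, k) for p, v in profiles.items()}
    snapshots = {}
    for u, v in edges:
        for t in active.get(u, set()) & active.get(v, set()):
            snapshots.setdefault(t, set()).add((u, v))
    return snapshots

# Hypothetical expression profiles over 12 time points.
profiles = {
    "A": [1] * 11 + [13],   # single spike at t = 11
    "B": [2] * 11 + [26],   # single spike at t = 11
    "C": [5] * 12,          # flat profile, never above its own threshold
}
snapshots = dynamic_edges([("A", "B"), ("A", "C")], profiles)
```

Comparing consecutive snapshots then gives the edge birth and death events that the abstract describes as the input stream for community discovery.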
Affiliation(s)
- Chen Ye
- School of Information and Artificial Intelligence, Anhui Agricultural University, Hefei, Anhui, 230036, China
- Anhui Beidou Precision Agriculture Information Engineering Research Center, Anhui Agricultural University, Hefei, 230036, China
- Qi Wu
- School of Information and Artificial Intelligence, Anhui Agricultural University, Hefei, Anhui, 230036, China
- Anhui Beidou Precision Agriculture Information Engineering Research Center, Anhui Agricultural University, Hefei, 230036, China
- Shuxia Chen
- School of Information and Artificial Intelligence, Anhui Agricultural University, Hefei, Anhui, 230036, China
- Anhui Beidou Precision Agriculture Information Engineering Research Center, Anhui Agricultural University, Hefei, 230036, China
- Xuemei Zhang
- School of Information and Artificial Intelligence, Anhui Agricultural University, Hefei, Anhui, 230036, China
- Anhui Beidou Precision Agriculture Information Engineering Research Center, Anhui Agricultural University, Hefei, 230036, China
- Wenwen Xu
- School of Information and Artificial Intelligence, Anhui Agricultural University, Hefei, Anhui, 230036, China
- Anhui Beidou Precision Agriculture Information Engineering Research Center, Anhui Agricultural University, Hefei, 230036, China
- Yunzhi Wu
- School of Information and Artificial Intelligence, Anhui Agricultural University, Hefei, Anhui, 230036, China
- Anhui Beidou Precision Agriculture Information Engineering Research Center, Anhui Agricultural University, Hefei, 230036, China
- Youhua Zhang
- School of Information and Artificial Intelligence, Anhui Agricultural University, Hefei, Anhui, 230036, China
- Anhui Beidou Precision Agriculture Information Engineering Research Center, Anhui Agricultural University, Hefei, 230036, China
- Yi Yue
- School of Information and Artificial Intelligence, Anhui Agricultural University, Hefei, Anhui, 230036, China.
- Anhui Beidou Precision Agriculture Information Engineering Research Center, Anhui Agricultural University, Hefei, 230036, China.
3
Liu P, Liu C, Mao Y, Guo J, Liu F, Cai W, Zhao F. Identification of essential proteins based on edge features and the fusion of multiple-source biological information. BMC Bioinformatics 2023; 24:203. [PMID: 37198530 DOI: 10.1186/s12859-023-05315-y] [Citation(s) in RCA: 2] [Impact Index Per Article: 1.0] [Received: 02/10/2023] [Accepted: 04/30/2023] [Indexed: 05/19/2023]
Abstract
BACKGROUND A major current focus in the analysis of protein-protein interaction (PPI) data is how to identify essential proteins. As massive PPI data are available, efficient computing methods for identifying essential proteins are warranted. Previous studies have achieved considerable performance. However, because PPI data are noisy and structurally complex, further improving the performance of identification methods remains a challenge. METHODS This paper proposes an identification method, named CTF, which identifies essential proteins based on edge features, including h-quasi-cliques and uv-triangle graphs, and the fusion of multiple-source information. We first design an edge-weight function, named EWCT, for computing the topological scores of proteins based on quasi-cliques and triangle graphs. Then, we generate an edge-weighted PPI network using EWCT and dynamic PPI data. Finally, we compute the essentiality of proteins by the fusion of topological scores and three scores of biological information. RESULTS We evaluated the performance of the CTF method by comparison with 16 other methods, such as MON, PeC, TEGS, and LBCC. The experimental results on three datasets of Saccharomyces cerevisiae show that CTF outperforms the state-of-the-art methods. Moreover, our method indicates that the fusion of other biological information helps improve the accuracy of identification.
Affiliation(s)
- Peiqiang Liu
- School of Computer Science and Technology, Shandong Technology and Business University, Yantai, China.
- Chang Liu
- School of Computer Science and Technology, Shandong Technology and Business University, Yantai, China
- Yanyan Mao
- School of Computer Science and Technology, Shandong Technology and Business University, Yantai, China
- College of Oceanography and Space Informatics, China University of Petroleum (East China), Qingdao, China
- Junhong Guo
- School of Computer Science and Technology, Shandong Technology and Business University, Yantai, China
- Fanshu Liu
- School of Computer Science and Technology, Shandong Technology and Business University, Yantai, China
- Wangmin Cai
- School of Computer Science and Technology, Shandong Technology and Business University, Yantai, China
- Feng Zhao
- School of Computer Science and Technology, Shandong Technology and Business University, Yantai, China
4
Liu A, Li R, Zaaboul F, He M, Li X, Shi J, Liu Y, Xu YJ. Proteomic analysis reveals the mechanisms of the astaxanthin suppressed foam cell formation. Life Sci 2023; 325:121774. [PMID: 37172817 DOI: 10.1016/j.lfs.2023.121774] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 09/29/2022] [Revised: 05/08/2023] [Accepted: 05/09/2023] [Indexed: 05/15/2023]
Abstract
AIMS Lipid metabolism in macrophages plays a key role in atherosclerosis development. Excessive uptake of low-density lipoprotein by macrophages leads to foam cell formation. In this study, we aimed to investigate the effect of astaxanthin (AST) on foam cells and to use mass spectrometry-based proteomic approaches to identify changes in protein expression in foam cells. MAIN METHODS A foam cell model was built and treated with astaxanthin, and the contents of total cholesterol (TC) and free cholesterol (FC) were measured. Proteomic analysis was performed on macrophages, macrophage-derived foam cells, and macrophage-derived foam cells treated with AST. Bioinformatic analyses were then performed to annotate the functions and associated pathways of the differential proteins. Finally, western blot analysis further confirmed the differential expression of these proteins. KEY FINDINGS Total cholesterol (TC) while free cholesterol (FC) increased in foam cells treated with astaxanthin. The proteomics data set presents a global view of the critical pathways involved in lipid metabolism, including the PI3K/CDC42 and PI3K/RAC1/TGF-β1 pathways. These pathways significantly increased cholesterol efflux from foam cells and further improved foam cell-induced inflammation. SIGNIFICANCE The present findings provide new insights into the mechanism by which astaxanthin regulates lipid metabolism in macrophage foam cells.
Affiliation(s)
- Aiyang Liu
- State Key Laboratory of Food Science and Technology, School of Food Science and Technology, National Engineering Research Center for Functional Food, National Engineering Laboratory for Cereal Fermentation Technology, Collaborative Innovation Center of Food Safety and Quality Control, Jiangnan University, 1800 Lihu Road, Wuxi, Jiangsu 214122, China
- Ruizhi Li
- State Key Laboratory of Food Science and Technology, School of Food Science and Technology, National Engineering Research Center for Functional Food, National Engineering Laboratory for Cereal Fermentation Technology, Collaborative Innovation Center of Food Safety and Quality Control, Jiangnan University, 1800 Lihu Road, Wuxi, Jiangsu 214122, China
- Farah Zaaboul
- State Key Laboratory of Food Science and Technology, School of Food Science and Technology, National Engineering Research Center for Functional Food, National Engineering Laboratory for Cereal Fermentation Technology, Collaborative Innovation Center of Food Safety and Quality Control, Jiangnan University, 1800 Lihu Road, Wuxi, Jiangsu 214122, China
- Mengxue He
- State Key Laboratory of Food Science and Technology, School of Food Science and Technology, National Engineering Research Center for Functional Food, National Engineering Laboratory for Cereal Fermentation Technology, Collaborative Innovation Center of Food Safety and Quality Control, Jiangnan University, 1800 Lihu Road, Wuxi, Jiangsu 214122, China
- Xue Li
- State Key Laboratory of Food Science and Technology, School of Food Science and Technology, National Engineering Research Center for Functional Food, National Engineering Laboratory for Cereal Fermentation Technology, Collaborative Innovation Center of Food Safety and Quality Control, Jiangnan University, 1800 Lihu Road, Wuxi, Jiangsu 214122, China
- Jiachen Shi
- State Key Laboratory of Food Science and Technology, School of Food Science and Technology, National Engineering Research Center for Functional Food, National Engineering Laboratory for Cereal Fermentation Technology, Collaborative Innovation Center of Food Safety and Quality Control, Jiangnan University, 1800 Lihu Road, Wuxi, Jiangsu 214122, China
- Yuanfa Liu
- State Key Laboratory of Food Science and Technology, School of Food Science and Technology, National Engineering Research Center for Functional Food, National Engineering Laboratory for Cereal Fermentation Technology, Collaborative Innovation Center of Food Safety and Quality Control, Jiangnan University, 1800 Lihu Road, Wuxi, Jiangsu 214122, China.
- Yong-Jiang Xu
- State Key Laboratory of Food Science and Technology, School of Food Science and Technology, National Engineering Research Center for Functional Food, National Engineering Laboratory for Cereal Fermentation Technology, Collaborative Innovation Center of Food Safety and Quality Control, Jiangnan University, 1800 Lihu Road, Wuxi, Jiangsu 214122, China.
5
Payra AK, Saha B, Ghosh A. MM-CCNB: Essential protein prediction using MAX-MIN strategies and compartment of common neighboring approach. Computer Methods and Programs in Biomedicine 2023; 228:107247. [PMID: 36427433 DOI: 10.1016/j.cmpb.2022.107247] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 02/27/2022] [Revised: 10/16/2022] [Accepted: 11/14/2022] [Indexed: 06/16/2023]
Abstract
BACKGROUND AND OBJECTIVE Proteins are indispensable to the life of living organisms. Interacting protein pairs exhibit more functional activities than individual proteins, and these activities have been considered an essential measure in predicting essentiality. Neighborhood approaches have been used frequently in the prediction of essentiality scores. All paired neighbors of the essential proteins are nominated as suitable candidate seeds for prediction. Until now, the use of Jaccard's coefficient has been limited to predicting functions, homologous groups, sequence analysis, etc. This motivates us to predict essential proteins efficiently using different computational approaches. METHODS In our work, we propose a modified Jaccard's coefficient to predict essential proteins, together with a novel methodology that combines MAX-MIN strategies with the modified Jaccard's coefficient approach. RESULTS The performance of our proposed methodology has been analyzed for Saccharomyces cerevisiae datasets with an accuracy of more than 80%. It has been observed that the proposed algorithm achieves an accuracy of 0.78, 0.74, 0.79, and 0.862 for the YDIP, YMIPS, YHQ, and YMBD datasets, respectively. CONCLUSIONS There are several computational approaches in the existing state of the art of essential protein prediction. Our proposed methodology outperforms other existing models, viz. different centralities, local interaction density combined with protein complexes, the modified monkey algorithm, and ortho_sim_loc methods.
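For orientation, the standard (unmodified) Jaccard's coefficient of two proteins compares their neighbor sets: the number of shared neighbors divided by the number of neighbors of either. The sketch below shows only this textbook form; the authors' modification and the MAX-MIN strategies are not described in the abstract and are not reproduced. Function names and the toy network are hypothetical.

```python
from collections import defaultdict

def neighbor_sets(edges):
    """Build {protein: set of neighbors} from an undirected edge list."""
    neighbors = defaultdict(set)
    for u, v in edges:
        neighbors[u].add(v)
        neighbors[v].add(u)
    return neighbors

def jaccard(a, b):
    """|a & b| / |a | b|, defined as 0.0 when both sets are empty."""
    union = a | b
    return len(a & b) / len(union) if union else 0.0

# Hypothetical toy network.
nbrs = neighbor_sets([("P1", "P2"), ("P1", "P3"), ("P2", "P3"), ("P2", "P4")])
score = jaccard(nbrs["P1"], nbrs["P2"])  # shared {P3} over union {P1, P2, P3, P4}
```

Here `score` is 0.25: P1 and P2 share one neighbor out of four distinct proteins in the union of their neighborhoods.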
Affiliation(s)
- Anjan Kumar Payra
- Department of Computer Science & Engineering, Dr. Sudhir Chandra Sur Degree Engineering College, 540, Dum Dum Road, Near Dum Dum Jn. Station, Surermath, Kolkata 700074, India.
- Banani Saha
- Department of Computer Science & Engineering, University of Calcutta, Saltlake City Kolkata 700073, India
- Anupam Ghosh
- Department of Computer Science & Engineering, Netaji Subhash Engineering College, Techno City, Panchpota, Garia, Kolkata 700152, India.
6
Yue Y, Ye C, Peng PY, Zhai HX, Ahmad I, Xia C, Wu YZ, Zhang YH. A deep learning framework for identifying essential proteins based on multiple biological information. BMC Bioinformatics 2022; 23:318. [PMID: 35927611 PMCID: PMC9351218 DOI: 10.1186/s12859-022-04868-8] [Citation(s) in RCA: 11] [Impact Index Per Article: 3.7] [Received: 06/07/2022] [Accepted: 07/29/2022] [Indexed: 11/15/2022]
Abstract
Background Essential proteins are demonstrated to exert vital functions in cellular processes and are indispensable for the survival and reproduction of the organism. Traditional centrality methods perform poorly on complex protein–protein interaction (PPI) networks. Machine learning approaches based on high-throughput data do not exploit the temporal and spatial dimensions of biological information. Results We put forward a deep learning framework to predict essential proteins by integrating features obtained from the PPI network, subcellular localization, and gene expression profiles. In our model, the node2vec method is applied to learn continuous feature representations for proteins in the PPI network, which capture the diversity of connectivity patterns in the network. The concept of depthwise separable convolution is employed on gene expression profiles to extract properties and observe the trends of gene expression over time under different experimental conditions. Subcellular localization information is mapped into a long one-dimensional vector to capture its characteristics. Additionally, we use a sampling method to mitigate the impact of imbalanced learning when training the model. Experiments on Saccharomyces cerevisiae data show that our model outperforms traditional centrality methods and machine learning methods. Comparative experiments likewise show that our processing of the various biological information is preferable. Conclusions Our proposed deep learning framework effectively identifies essential proteins by integrating multiple biological data, showing that a broader selection of subcellular localization information significantly improves the prediction results and that depthwise separable convolution applied to gene expression profiles enhances performance.
Affiliation(s)
- Yi Yue
- Anhui Provincial Engineering Laboratory for Beidou Precision Agriculture Information, Anhui Agricultural University, Hefei, 230036, China
- School of Information and Computer, Anhui Agricultural University, Hefei, 230036, China
- School of Life Sciences, Anhui Agricultural University, Hefei, 230036, China
- State Key Laboratory of Tea Plant Biology and Utilization, Anhui Agricultural University, Hefei, 230036, China
- Chen Ye
- Anhui Provincial Engineering Laboratory for Beidou Precision Agriculture Information, Anhui Agricultural University, Hefei, 230036, China
- School of Information and Computer, Anhui Agricultural University, Hefei, 230036, China
- Pei-Yun Peng
- Anhui Provincial Engineering Laboratory for Beidou Precision Agriculture Information, Anhui Agricultural University, Hefei, 230036, China
- School of Information and Computer, Anhui Agricultural University, Hefei, 230036, China
- Hui-Xin Zhai
- Anhui Provincial Engineering Laboratory for Beidou Precision Agriculture Information, Anhui Agricultural University, Hefei, 230036, China
- School of Information and Computer, Anhui Agricultural University, Hefei, 230036, China
- Iftikhar Ahmad
- Anhui Provincial Engineering Laboratory for Beidou Precision Agriculture Information, Anhui Agricultural University, Hefei, 230036, China
- School of Information and Computer, Anhui Agricultural University, Hefei, 230036, China
- Chuan Xia
- Anhui Provincial Engineering Laboratory for Beidou Precision Agriculture Information, Anhui Agricultural University, Hefei, 230036, China
- School of Information and Computer, Anhui Agricultural University, Hefei, 230036, China
- Yun-Zhi Wu
- Anhui Provincial Engineering Laboratory for Beidou Precision Agriculture Information, Anhui Agricultural University, Hefei, 230036, China
- School of Information and Computer, Anhui Agricultural University, Hefei, 230036, China
- State Key Laboratory of Tea Plant Biology and Utilization, Anhui Agricultural University, Hefei, 230036, China
- You-Hua Zhang
- Anhui Provincial Engineering Laboratory for Beidou Precision Agriculture Information, Anhui Agricultural University, Hefei, 230036, China
- School of Information and Computer, Anhui Agricultural University, Hefei, 230036, China
- School of Life Sciences, Anhui Agricultural University, Hefei, 230036, China
7
Liu H, Guo S, Wang R, He Y, Shi Q, Song Z, Yang M. Pathogen of Vibrio harveyi infection and C-type lectin proteins in whiteleg shrimp (Litopenaeus vannamei). Fish & Shellfish Immunology 2021; 119:554-562. [PMID: 34718124 DOI: 10.1016/j.fsi.2021.10.040] [Citation(s) in RCA: 1] [Impact Index Per Article: 0.3] [Received: 09/10/2021] [Revised: 10/23/2021] [Accepted: 10/25/2021] [Indexed: 06/13/2023]
Abstract
Diseases caused by Vibrio harveyi have gradually become one of the most serious threats to shrimp production, while the molecular mechanisms of Vibrio harveyi infection in shrimps are still not well understood. Here, we performed proteomic sequencing of the hepatopancreas in whiteleg shrimps (Litopenaeus vannamei) infected with exogenous Vibrio harveyi, followed by functional annotation and calculation of differentially expressed proteins (DEPs). A total of 145 DEPs were obtained, of which 36 were up-regulated and 109 were down-regulated after the infection. Our results also showed that, after Vibrio harveyi infection, the expression levels of a variety of C-type lectins (CTLs) changed significantly. In-depth functional domain analysis and spatial structure prediction of these CTLs revealed that the amino acid sequences and spatial structures of the C-type lectin domain (CTLD) shared by the CTL-S and IML proteins varied, suggesting differential functions between the two CTLs. In summary, various members of the CTL family have different epidemic responses to Vibrio harveyi infection, which provides theoretical guidance for in-depth investigations of practical immunity reactions and pathogen infections in shrimps.
Affiliation(s)
- Hongtao Liu
- Hainan Provincial Key Laboratory of Tropical Maricultural Technologies, Hainan Academy of Ocean and Fisheries Sciences, Haikou, 571126, China
- Shengtao Guo
- Key Laboratory of Bio-Resources and Eco-Environment of Ministry of Education, College of Life Sciences, Sichuan University, Chengdu, 610065, China.
- Rong Wang
- Hainan Provincial Key Laboratory of Tropical Maricultural Technologies, Hainan Academy of Ocean and Fisheries Sciences, Haikou, 571126, China
- Yugui He
- Hainan Provincial Key Laboratory of Tropical Maricultural Technologies, Hainan Academy of Ocean and Fisheries Sciences, Haikou, 571126, China
- Qiong Shi
- Shenzhen Key Lab of Marine Genomics, Guangdong Provincial Key Lab of Molecular Breeding in Marine Economic Animals, BGI Academy of Marine Sciences, BGI Marine, BGI, Shenzhen, 518083, China
- Zhaobin Song
- Key Laboratory of Bio-Resources and Eco-Environment of Ministry of Education, College of Life Sciences, Sichuan University, Chengdu, 610065, China.
- Mingqiu Yang
- Hainan Provincial Key Laboratory of Tropical Maricultural Technologies, Hainan Academy of Ocean and Fisheries Sciences, Haikou, 571126, China.
8
Zhu X, He X, Kuang L, Chen Z, Lancine C. A Novel Collaborative Filtering Model-Based Method for Identifying Essential Proteins. Front Genet 2021; 12:763153. [PMID: 34745230 PMCID: PMC8566338 DOI: 10.3389/fgene.2021.763153] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 08/23/2021] [Accepted: 09/13/2021] [Indexed: 11/19/2022]
Abstract
Considering that traditional biological experiments are expensive and time consuming, it is important to develop effective computational models to infer potential essential proteins. In this manuscript, a novel collaborative filtering model-based method called CFMM is proposed. An updated protein–domain interaction (PDI) network is first constructed by applying a collaborative filtering algorithm to the original PDI network; then, by integrating topological features of PDI networks with biological features of proteins, a calculation method based on an improved PageRank algorithm is designed to infer potential essential proteins. The novelties of CFMM lie in the construction of an updated PDI network, the application of the commodity-customer-based collaborative filtering algorithm, and the introduction of the calculation method based on an improved PageRank algorithm, which together ensure that CFMM can predict essential proteins without relying entirely on known protein–domain associations. Simulation results show that CFMM achieves prediction accuracies of 92.16, 83.14, 71.37, 63.87, 55.84, and 52.43% in the top 1, 5, 10, 15, 20, and 25% of predicted candidate key proteins based on the DIP database, which are, as a whole, remarkably higher than those of 14 competitive state-of-the-art predictive models. In addition, CFMM achieves satisfactory predictive performance on different databases under various evaluation measures, which further indicates that CFMM may be a useful tool for the identification of essential proteins in the future.
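The "top 1, 5, 10, ...%" accuracies quoted here are usually computed by ranking all proteins by predicted score and measuring the fraction of known essential proteins among the top-ranked candidates. A minimal sketch of that evaluation follows; the rounding rule (ceiling, at least one candidate), the function name, and the toy data are assumptions rather than details taken from the paper.

```python
import math

def top_percent_accuracy(scores, essential, percent):
    """Fraction of known essential proteins among the top `percent`% by score."""
    # Sort by descending score; break ties by name for determinism.
    ranked = sorted(scores, key=lambda p: (-scores[p], p))
    k = max(1, math.ceil(len(ranked) * percent / 100))
    top = ranked[:k]
    return sum(1 for p in top if p in essential) / k

# Hypothetical toy data: ten proteins with scores 10, 9, ..., 1.
scores = {f"P{i}": 10 - i for i in range(10)}
essential = {"P0", "P1", "P3"}

acc_top20 = top_percent_accuracy(scores, essential, 20)  # top 2: P0, P1
acc_top50 = top_percent_accuracy(scores, essential, 50)  # top 5: P0..P4
```

In this toy case the top 20% contains only essential proteins (accuracy 1.0), while the top 50% contains three essential proteins out of five (accuracy 0.6), mirroring the way reported accuracy falls as the candidate list grows.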
Affiliation(s)
- Xianyou Zhu
- College of Computer Science and Technology, Hengyang Normal University, Hengyang, China
- Hunan Provincial Key Laboratory of Intelligent Information Processing and Application, Hengyang, China
- Xin He
- College of Computer, Xiangtan University, Xiangtan, China
- Linai Kuang
- College of Computer, Xiangtan University, Xiangtan, China
- Zhiping Chen
- College of Computer Engineering and Applied Mathematics, Changsha University, Changsha, China
- Camara Lancine
- The Social Sciences and Management University of Bamako, Bamako, Mali
9
Peng J, Kuang L, Zhang Z, Tan Y, Chen Z, Wang L. A Novel Model for Identifying Essential Proteins Based on Key Target Convergence Sets. Front Genet 2021; 12:721486. [PMID: 34394201 PMCID: PMC8358660 DOI: 10.3389/fgene.2021.721486] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 06/07/2021] [Accepted: 06/30/2021] [Indexed: 11/20/2022]
Abstract
In recent years, many computational models have been designed to detect essential proteins based on protein-protein interaction (PPI) networks. However, due to the incompleteness of PPI networks, the prediction accuracy of these models is still not satisfactory. In this manuscript, a novel key target convergence sets based prediction model (KTCSPM) is proposed to identify essential proteins. In KTCSPM, a weighted PPI network and a weighted (Domain-Domain Interaction) network are first constructed based on known PPIs and PDIs downloaded from benchmark databases. Then, by integrating these two kinds of networks, a novel weighted PDI network is built. Next, by assigning a unique key target convergence set (KTCS) to each node in the weighted PDI network, an improved method based on the random walk with restart is designed to identify essential proteins. Finally, to evaluate its predictive performance, KTCSPM is compared with 12 competitive state-of-the-art models, and experimental results show that KTCSPM achieves better prediction accuracy. Given this satisfactory predictive performance, KTCSPM might be a good supplement to future research on the prediction of essential proteins.
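The plain random walk with restart that KTCSPM builds on can be sketched as follows: at each step the walker moves to a neighbor with probability proportional to edge weight, or jumps back to the seed set with a fixed restart probability. This is only the textbook iteration; the KTCS assignment and the authors' improvements are not reproduced, and the function name, the `restart` value, and the toy graph are assumptions.

```python
def random_walk_with_restart(adj, seeds, restart=0.3, tol=1e-10, max_iter=1000):
    """adj: {node: {neighbor: weight}} (symmetric); seeds: restart node set.

    Iterates p <- (1 - restart) * W p + restart * p0 until the L1 change < tol,
    where W spreads each node's probability over its neighbors by edge weight.
    """
    nodes = sorted(adj)
    p0 = {n: (1.0 / len(seeds) if n in seeds else 0.0) for n in nodes}
    p = dict(p0)
    for _ in range(max_iter):
        nxt = {n: restart * p0[n] for n in nodes}
        for u in nodes:
            total = sum(adj[u].values())
            if total == 0:
                continue  # isolated node: its probability mass is not propagated
            for v, w in adj[u].items():
                nxt[v] += (1 - restart) * p[u] * w / total
        delta = sum(abs(nxt[n] - p[n]) for n in nodes)
        p = nxt
        if delta < tol:
            break
    return p

# Hypothetical path graph A - B - C with unit weights, seeded at A.
toy = {"A": {"B": 1.0}, "B": {"A": 1.0, "C": 1.0}, "C": {"B": 1.0}}
rwr = random_walk_with_restart(toy, seeds={"A"})
```

The stationary probabilities decrease with distance from the seed, so nodes close to the seed set in the weighted network receive higher scores.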
Affiliation(s)
- Jiaxin Peng
- College of Computer, Xiangtan University, Xiangtan, China
- College of Computer Engineering and Applied Mathematics, Changsha University, Changsha, China
- Linai Kuang
- College of Computer, Xiangtan University, Xiangtan, China
- Zhen Zhang
- College of Computer Engineering and Applied Mathematics, Changsha University, Changsha, China
- Yihong Tan
- College of Computer Engineering and Applied Mathematics, Changsha University, Changsha, China
- Zhiping Chen
- College of Computer Engineering and Applied Mathematics, Changsha University, Changsha, China
- Lei Wang
- College of Computer, Xiangtan University, Xiangtan, China
- College of Computer Engineering and Applied Mathematics, Changsha University, Changsha, China
10
He X, Kuang L, Chen Z, Tan Y, Wang L. Method for Identifying Essential Proteins by Key Features of Proteins in a Novel Protein-Domain Network. Front Genet 2021; 12:708162. [PMID: 34267785 PMCID: PMC8276041 DOI: 10.3389/fgene.2021.708162] [Citation(s) in RCA: 1] [Impact Index Per Article: 0.3] [Received: 05/11/2021] [Accepted: 05/31/2021] [Indexed: 11/21/2022]
Abstract
In recent years, due to the low accuracy and high costs of traditional biological experiments, more and more computational models have been proposed to infer potential essential proteins. In this paper, a novel prediction method called KFPM is proposed, in which a novel protein-domain heterogeneous network is first established by combining known protein-protein interactions with known associations between proteins and domains. Next, based on key topological characteristics extracted from the newly constructed protein-domain network and functional characteristics extracted from multiple biological information of proteins, a new computational method is designed to effectively integrate multiple biological features and infer potential essential proteins based on an improved PageRank algorithm. Finally, to evaluate the performance of KFPM, we compared it with 13 state-of-the-art prediction methods. Experimental results show that, among the top 1, 5, and 10% of candidate proteins predicted by KFPM, the prediction accuracy reaches 96.08, 83.14, and 70.59%, respectively, which significantly outperforms all 13 competing methods. This suggests that KFPM may be a meaningful tool for the prediction of potential essential proteins in the future.
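As background for the "improved PageRank algorithm" mentioned above, the standard PageRank power iteration on an undirected network can be sketched as below. It uses uniform teleportation and uniform initial scores; the abstract indicates that KFPM instead integrates topological and functional features, and those modifications are not reproduced here. The function name, the `damping` value, and the toy star network are assumptions.

```python
def pagerank(adj, damping=0.85, tol=1e-10, max_iter=1000):
    """adj: {node: set of neighbors}. Returns {node: PageRank score}."""
    nodes = sorted(adj)
    n = len(nodes)
    rank = {v: 1.0 / n for v in nodes}
    for _ in range(max_iter):
        # Mass held by nodes without neighbors is spread uniformly.
        dangling = sum(rank[u] for u in nodes if not adj[u])
        nxt = {v: (1 - damping) / n + damping * dangling / n for v in nodes}
        for u in nodes:
            for v in adj[u]:
                nxt[v] += damping * rank[u] / len(adj[u])
        delta = sum(abs(nxt[v] - rank[v]) for v in nodes)
        rank = nxt
        if delta < tol:
            break
    return rank

# Hypothetical star network: hub H connected to three leaves.
star = {"H": {"L1", "L2", "L3"}, "L1": {"H"}, "L2": {"H"}, "L3": {"H"}}
pr = pagerank(star)
```

The hub receives the largest score and the three leaves receive equal scores, which is the behavior a feature-weighted variant would then bias using biological evidence.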
Affiliation(s)
- Xin He
- College of Computer, Xiangtan University, Xiangtan, China
- Linai Kuang
- College of Computer, Xiangtan University, Xiangtan, China
- Zhiping Chen
- College of Computer Engineering & Applied Mathematics, Changsha University, Changsha, China
- Yihong Tan
- College of Computer Engineering & Applied Mathematics, Changsha University, Changsha, China
- Lei Wang
- College of Computer, Xiangtan University, Xiangtan, China
- College of Computer Engineering & Applied Mathematics, Changsha University, Changsha, China