Reference Citation Analysis: Find an Article, Find a Category, Find a Journal, Find a Scholar

For: Jiang JQ, McQuay LJ. Predicting protein function by multi-label correlated semi-supervised learning. IEEE/ACM Trans Comput Biol Bioinform 2012;9:1059-1069. [PMID: 22595236 DOI: 10.1109/tcbb.2011.156] [Citation(s) in RCA: 20] [Impact Index Per Article: 1.7] [Indexed: 05/31/2023]

Cited by Other Article(s)

Han M, Wu H, Chen Z, Li M, Zhang X. A survey of multi-label classification based on supervised and semi-supervised learning. Int J Mach Learn Cybern 2022. [DOI: 10.1007/s13042-022-01658-9] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 11/29/2022]

Sengupta K, Saha S, Halder AK, Chatterjee P, Nasipuri M, Basu S, Plewczynski D. PFP-GO: Integrating protein sequence, domain and protein-protein interaction information for protein function prediction using ranked GO terms. Front Genet 2022;13:969915. [PMID: 36246645 PMCID: PMC9556876 DOI: 10.3389/fgene.2022.969915] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 06/15/2022] [Accepted: 08/31/2022] [Indexed: 11/13/2022]

Mansoor M, Nauman M, Rehman HU, Omar M. Gene Ontology Capsule GAN: an improved architecture for protein function prediction. PeerJ Comput Sci 2022;8:e1014. [PMID: 36092003 PMCID: PMC9454774 DOI: 10.7717/peerj-cs.1014] [Citation(s) in RCA: 1] [Impact Index Per Article: 0.5] [Received: 04/11/2022] [Accepted: 05/31/2022] [Indexed: 06/15/2023]

Qian W, Xiong Y, Yang J, Shu W. Feature selection for label distribution learning via feature similarity and label correlation. Inf Sci (N Y) 2022. [DOI: 10.1016/j.ins.2021.08.076] [Citation(s) in RCA: 10] [Impact Index Per Article: 5.0] [Indexed: 11/30/2022]

Zhang Z, Zhao Y, Wang J, Guo M. DeepRCI: predicting ATP-binding proteins using the residue-residue contact information. IEEE J Biomed Health Inform 2021;26:2822-2829. [PMID: 34941538 DOI: 10.1109/jbhi.2021.3137840] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 11/11/2022]

Zhang F, Song H, Zeng M, Wu FX, Li Y, Pan Y, Li M. A Deep Learning Framework for Gene Ontology Annotations With Sequence- and Network-Based Information. IEEE/ACM Trans Comput Biol Bioinform 2021;18:2208-2217. [PMID: 31985440 DOI: 10.1109/tcbb.2020.2968882] [Citation(s) in RCA: 17] [Impact Index Per Article: 5.7] [Indexed: 06/10/2023]

Zhang Z, Wang J, Liu J. DeepRTCP: Predicting ATP-Binding Cassette Transporters Based on 1-Dimensional Convolutional Network. Front Cell Dev Biol 2021;8:614080. [PMID: 33598454 PMCID: PMC7882686 DOI: 10.3389/fcell.2020.614080] [Citation(s) in RCA: 3] [Impact Index Per Article: 1.0] [Received: 10/05/2020] [Accepted: 12/24/2020] [Indexed: 11/13/2022]

Mixture distribution modeling for scalable graph-based semi-supervised learning. Knowl Based Syst 2020. [DOI: 10.1016/j.knosys.2020.105974] [Citation(s) in RCA: 2] [Impact Index Per Article: 0.5] [Indexed: 11/19/2022]

Cai Y, Wang J, Deng L. SDN2GO: An Integrated Deep Learning Model for Protein Function Prediction. Front Bioeng Biotechnol 2020;8:391. [PMID: 32411695 PMCID: PMC7201018 DOI: 10.3389/fbioe.2020.00391] [Citation(s) in RCA: 19] [Impact Index Per Article: 4.8] [Received: 03/13/2020] [Accepted: 04/07/2020] [Indexed: 02/01/2023]

Bonetta R, Valentino G. Machine learning techniques for protein function prediction. Proteins 2019;88:397-413. [PMID: 31603244 DOI: 10.1002/prot.25832] [Citation(s) in RCA: 63] [Impact Index Per Article: 12.6] [Received: 03/19/2019] [Revised: 07/05/2019] [Accepted: 09/17/2019] [Indexed: 12/17/2022]

Saha S, Chatterjee P, Basu S, Nasipuri M, Plewczynski D. FunPred 3.0: improved protein function prediction using protein interaction network. PeerJ 2019;7:e6830. [PMID: 31198622 PMCID: PMC6535044 DOI: 10.7717/peerj.6830] [Citation(s) in RCA: 6] [Impact Index Per Article: 1.2] [Received: 07/20/2018] [Accepted: 03/21/2019] [Indexed: 11/23/2022]

Abstract

Proteins are the most versatile macromolecules in living systems and perform crucial biological functions. With the advent of the post-genomic era, next-generation sequencing is done routinely at the population scale for a variety of species. The challenge is to determine, at scale, the functions of proteins that have not yet been characterized by detailed experimental studies. Identifying protein functions experimentally is a laborious, time-consuming and resource-intensive task. We therefore propose an automated protein function prediction methodology using in silico algorithms trained on carefully curated experimental datasets. We present the improved protein function prediction tool FunPred 3.0, an extended version of our previous methodology FunPred 2, which exploits neighborhood properties in the protein–protein interaction network (PPIN) and physicochemical properties of amino acids. Our method is validated using the available functional annotations in the PPIN of Saccharomyces cerevisiae in the latest Munich Information Center for Protein Sequences (MIPS) dataset. The PPIN data of S. cerevisiae in the MIPS dataset include 4,554 unique proteins in 13,528 protein–protein interactions after the elimination of the self-replicating and the self-interacting protein pairs. Using FunPred 3.0, we achieve mean precision, recall and F-score values of 0.55, 0.82 and 0.66, respectively. FunPred 3.0 is then used to predict the functions of unpredicted protein pairs (incomplete and missing functional annotations) in the MIPS dataset of S. cerevisiae. The method is also capable of predicting the subcellular localization of proteins along with their corresponding functions. The code and the complete prediction results are freely available at: https://github.com/SovanSaha/FunPred-3.0.git.
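
As a quick consistency check on the figures quoted in this abstract, the sketch below computes an F-score as the harmonic mean of precision and recall. It assumes the reported F-score follows the standard F1 definition applied to the mean precision and recall; the abstract does not say how the means were aggregated, and `f_score` is an illustrative name, not part of FunPred 3.0.

```python
def f_score(precision: float, recall: float) -> float:
    """Harmonic mean of precision and recall (standard F1 definition)."""
    if precision + recall == 0:
        return 0.0
    return 2 * precision * recall / (precision + recall)


# Mean precision and recall as quoted in the abstract.
f1 = f_score(0.55, 0.82)
print(round(f1, 2))  # 0.66
```

Under that assumption, the three reported values are mutually consistent.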

Zhang F, Song H, Zeng M, Li Y, Kurgan L, Li M. DeepFunc: A Deep Learning Framework for Accurate Prediction of Protein Functions from Protein Sequences and Interactions. Proteomics 2019;19:e1900019. [PMID: 30941889 DOI: 10.1002/pmic.201900019] [Citation(s) in RCA: 52] [Impact Index Per Article: 10.4] [Received: 01/12/2019] [Revised: 03/18/2019] [Indexed: 01/06/2023]

Yu Z, Zhang Y, You J, Chen CLP, Wong HS, Han G, Zhang J. Adaptive Semi-Supervised Classifier Ensemble for High Dimensional Data Classification. IEEE Trans Cybern 2019;49:366-379. [PMID: 29989979 DOI: 10.1109/tcyb.2017.2761908] [Citation(s) in RCA: 24] [Impact Index Per Article: 4.8] [Indexed: 06/08/2023]

Kulmanov M, Khan MA, Hoehndorf R, Wren J. DeepGO: predicting protein functions from sequence and interactions using a deep ontology-aware classifier. Bioinformatics 2018;34:660-668. [PMID: 29028931 PMCID: PMC5860606 DOI: 10.1093/bioinformatics/btx624] [Citation(s) in RCA: 201] [Impact Index Per Article: 33.5] [Received: 05/10/2017] [Accepted: 09/27/2017] [Indexed: 12/29/2022]

Saha S, Prasad A, Chatterjee P, Basu S, Nasipuri M. Protein function prediction from protein-protein interaction network using gene ontology based neighborhood analysis and physico-chemical features. J Bioinform Comput Biol 2018;16:1850025. [PMID: 30400756 DOI: 10.1142/s0219720018500257] [Citation(s) in RCA: 13] [Impact Index Per Article: 2.2] [Indexed: 11/18/2022]

Abstract

Protein function prediction from the protein-protein interaction network (PPIN) and physico-chemical features using the Gene Ontology (GO) classification is very useful for assigning biological or biochemical functions to a protein. It also helps identify proteins responsible for diseases for which drugs are yet to be discovered. The prediction of GO functional terms from PPIN and sequence is therefore an important field of study. In this work, we propose a methodology, Multi Label Protein Function Prediction (ML_PFP), which is based on neighborhood analysis empowered with physico-chemical features of constituent amino acids to predict the functional group of an unannotated protein. A protein does not perform functions in isolation; rather, it performs functions in a group by interacting with others. A protein is therefore involved in many functions or, in other words, may be associated with multiple functional groups, labels or GO terms. Although the functional groups of known interacting partner proteins and their physico-chemical features provide useful information, assigning multiple labels to an unannotated protein is a very challenging task. Here, we use the Homo sapiens (human) PPIN as well as the Saccharomyces cerevisiae (yeast) PPIN along with their GO terms to predict the functional groups or GO terms of unannotated proteins. The task is challenging because both the human and yeast protein datasets are voluminous and complex, and multi-label functional group assignment adds a further dimension to the challenge. Our algorithm achieves better performance in Cellular Function, Molecular Function and Biological Process on both the yeast and human networks than other existing state-of-the-art methodologies, as discussed in detail in the results section.
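
The general idea of neighborhood-based multi-label assignment described in this abstract can be illustrated with a minimal sketch. This is not the authors' ML_PFP method: it ignores physico-chemical features entirely and simply assigns every label carried by a sufficient fraction of a protein's annotated interaction partners. The function name, the `min_fraction` threshold, the protein IDs and the label names are all hypothetical.

```python
from collections import Counter


def predict_labels(graph, annotations, protein, min_fraction=0.5):
    """Return labels carried by at least `min_fraction` of annotated neighbors."""
    # Only neighbors with known annotations can vote.
    neighbors = [n for n in graph.get(protein, ()) if n in annotations]
    if not neighbors:
        return set()
    # Count how many annotated neighbors carry each label.
    counts = Counter(label for n in neighbors for label in annotations[n])
    return {
        label
        for label, count in counts.items()
        if count / len(neighbors) >= min_fraction
    }


# Toy, made-up network: P4 is the unannotated protein.
graph = {"P4": ["P1", "P2", "P3"]}
annotations = {
    "P1": {"term_a", "term_b"},
    "P2": {"term_a"},
    "P3": {"term_a", "term_c"},
}
print(sorted(predict_labels(graph, annotations, "P4")))  # ['term_a']
```

Lowering `min_fraction` makes the vote more permissive, so a single protein can receive several labels at once, which is the multi-label aspect the abstract emphasizes.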

Manifold regularized matrix completion for multi-label learning with ADMM. Neural Netw 2018;101:57-67. [DOI: 10.1016/j.neunet.2018.01.011] [Citation(s) in RCA: 19] [Impact Index Per Article: 3.2] [Received: 05/16/2016] [Revised: 11/24/2017] [Accepted: 01/26/2018] [Indexed: 11/23/2022]

Dongliang X, Jingchang P, Bailing W. Multiple kernels learning-based biological entity relationship extraction method. J Biomed Semantics 2017;8:38. [PMID: 29297359 PMCID: PMC5763518 DOI: 10.1186/s13326-017-0138-9] [Citation(s) in RCA: 3] [Impact Index Per Article: 0.4] [Indexed: 11/12/2022]

Dai HL. Imbalanced Protein Data Classification Using Ensemble FTM-SVM. IEEE Trans Nanobioscience 2015;14:350-359. [DOI: 10.1109/tnb.2015.2431292] [Citation(s) in RCA: 13] [Impact Index Per Article: 1.4] [Indexed: 11/08/2022]

Application of gap-constraints given sequential frequent pattern mining for protein function prediction. Osong Public Health Res Perspect 2015;6:112-20. [PMID: 25938021 PMCID: PMC4411351 DOI: 10.1016/j.phrp.2015.01.006] [Citation(s) in RCA: 2] [Impact Index Per Article: 0.2] [Received: 11/03/2014] [Revised: 12/14/2014] [Accepted: 01/16/2015] [Indexed: 11/23/2022]

Zhao X, Ning Q, Chai H, Ma Z. Accurate in silico identification of protein succinylation sites using an iterative semi-supervised learning technique. J Theor Biol 2015;374:60-5. [PMID: 25843215 DOI: 10.1016/j.jtbi.2015.03.029] [Citation(s) in RCA: 33] [Impact Index Per Article: 3.7] [Received: 10/29/2014] [Revised: 03/21/2015] [Accepted: 03/24/2015] [Indexed: 01/23/2023]

Yu G, Zhu H, Domeniconi C. Predicting protein functions using incomplete hierarchical labels. BMC Bioinformatics 2015;16:1. [PMID: 25591917 PMCID: PMC4384381 DOI: 10.1186/s12859-014-0430-y] [Citation(s) in RCA: 83] [Impact Index Per Article: 9.2] [Received: 07/24/2014] [Accepted: 12/11/2014] [Indexed: 02/07/2023]

Abstract

BACKGROUND

Protein function prediction aims to assign biological or biochemical functions to proteins. It is a challenging computational problem characterized by several factors: (1) the number of function labels (annotations) is large; (2) a protein may be associated with multiple labels; (3) the function labels are structured in a hierarchy; and (4) the labels are incomplete. Current predictive models often assume that the labels of the labeled proteins are complete, i.e. no label is missing. In real scenarios, however, we may be aware of only some hierarchical labels of a protein, and we may not know whether additional ones are actually present. The scenario of incomplete hierarchical labels, a challenging and practical problem, is seldom studied in protein function prediction.

RESULTS

In this paper, we propose an algorithm to Predict protein functions using Incomplete hierarchical LabeLs (PILL in short). PILL takes into account the hierarchical and the flat taxonomy similarity between function labels, and defines a Combined Similarity (ComSim) to measure the correlation between labels. PILL estimates the missing labels for a protein based on ComSim and the known labels of the protein, and uses a regularization to exploit the interactions between proteins for function prediction. PILL is shown to outperform other related techniques in replenishing the missing labels and in predicting the functions of completely unlabeled proteins on publicly available PPI datasets annotated with MIPS Functional Catalogue and Gene Ontology labels.

CONCLUSION

The empirical study shows that it is important to consider the incomplete annotation for protein function prediction. The proposed method (PILL) can serve as a valuable tool for protein function prediction using incomplete labels. The Matlab code of PILL is available upon request.

Yu G, Rangwala H, Domeniconi C, Zhang G, Zhang Z. Predicting Protein Function Using Multiple Kernels. IEEE/ACM Trans Comput Biol Bioinform 2015;12:219-233. [PMID: 26357091 DOI: 10.1109/tcbb.2014.2351821] [Citation(s) in RCA: 14] [Impact Index Per Article: 1.6] [Indexed: 06/05/2023]

Wu Q, Ye Y, Ho SS, Zhou S. Semi-supervised multi-label collective classification ensemble for functional genomics. BMC Genomics 2014;15 Suppl 9:S17. [PMID: 25521242 PMCID: PMC4290603 DOI: 10.1186/1471-2164-15-s9-s17] [Citation(s) in RCA: 7] [Impact Index Per Article: 0.7] [Indexed: 12/30/2022]

Abstract

BACKGROUND

With the rapid accumulation of proteomic and genomic datasets in terms of genome-scale features and interaction networks through high-throughput experimental techniques, manually predicting the functional properties of proteins has become increasingly cumbersome, and computational methods to automate this annotation task are urgently needed. Most approaches to predicting functional properties of proteins require either identifying a reliable set of labeled proteins with attribute features similar to those of the unannotated proteins, or learning from a fully labeled protein interaction network with a large amount of labeled data. However, acquiring such labels can be very difficult in practice, especially for multi-label protein function prediction problems. Learning from only a few labeled examples can lead to poor performance, because only limited supervision knowledge can be obtained from similar proteins or from the connections between them. To annotate proteins effectively even when labeled data are scarce, it is important to take advantage of all data sources available in this problem setting, including interaction networks, attribute feature information, correlations of functional labels, and unlabeled data.

RESULTS

In this paper, we show that predicting functional properties of proteins from various sources of relational data is, by its underlying nature, a typical collective classification (CC) problem in machine learning. The protein function prediction task with limited annotation is then cast into a semi-supervised multi-label collective classification (SMCC) framework. We propose a novel generative-model-based SMCC algorithm, called GM-SMCC, to effectively compute the label probability distributions of unannotated protein instances and predict their functional properties. To further boost prediction performance, we extend the method into an ensemble, called EGM-SMCC, which uses multiple heterogeneous networks with various latent linkages, constructed to explicitly model the relationships among the nodes, to effectively propagate the supervision knowledge from labeled to unlabeled nodes.

CONCLUSION

Experimental results on a yeast gene dataset predicting the functions and localization of proteins demonstrate the effectiveness of the proposed method. In the comparison, we find that the performances of the proposed algorithms are better than the other compared algorithms.
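
To make the notion of propagating supervision from labeled to unlabeled nodes concrete, here is a minimal sketch of generic graph label propagation with multiple labels per node. It is a plainly swapped-in technique, not the generative-model-based GM-SMCC or EGM-SMCC algorithms described in this abstract. The function name, the `alpha` blending weight, the iteration count and the toy graph are all assumptions made for illustration.

```python
def propagate_labels(adjacency, seeds, alpha=0.8, iterations=50):
    """Spread multi-label scores from seed nodes over an undirected graph.

    adjacency: dict mapping each node to a list of its neighbors.
    seeds: dict mapping labeled nodes to a set of known labels.
    Returns a dict node -> {label: score}.
    """
    labels = sorted({label for known in seeds.values() for label in known})
    # Initial scores: 1.0 for known (node, label) pairs, 0.0 elsewhere.
    initial = {
        node: {label: 1.0 if label in seeds.get(node, ()) else 0.0 for label in labels}
        for node in adjacency
    }
    scores = initial
    for _ in range(iterations):
        updated = {}
        for node, neighbors in adjacency.items():
            updated[node] = {}
            for label in labels:
                # Average of the neighbors' current scores for this label.
                neighbor_mean = (
                    sum(scores[n][label] for n in neighbors) / len(neighbors)
                    if neighbors
                    else 0.0
                )
                # Blend propagated evidence with the original seed label.
                updated[node][label] = (
                    alpha * neighbor_mean + (1 - alpha) * initial[node][label]
                )
        scores = updated
    return scores


# Toy chain A - B - C - D with made-up labels on the two end nodes.
adjacency = {"A": ["B"], "B": ["A", "C"], "C": ["B", "D"], "D": ["C"]}
seeds = {"A": {"function_x"}, "D": {"function_y"}}
scores = propagate_labels(adjacency, seeds)
print(scores["B"])
```

In this toy chain, the unlabeled node B ends up scoring higher for `function_x` (its nearer seed) than for `function_y`, and C shows the opposite pattern, while both keep non-zero scores for both labels.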

Wu JS, Huang SJ, Zhou ZH. Genome-Wide Protein Function Prediction through Multi-Instance Multi-Label Learning. IEEE/ACM Trans Comput Biol Bioinform 2014;11:891-902. [PMID: 26356861 DOI: 10.1109/tcbb.2014.2323058] [Citation(s) in RCA: 27] [Impact Index Per Article: 2.7] [Indexed: 06/05/2023]

Yu G, Rangwala H, Domeniconi C, Zhang G, Yu Z. Protein Function Prediction with Incomplete Annotations. IEEE/ACM Trans Comput Biol Bioinform 2014;11:579-591. [PMID: 26356025 DOI: 10.1109/tcbb.2013.142] [Citation(s) in RCA: 15] [Impact Index Per Article: 1.5] [Indexed: 06/05/2023]

Liu Q, Chen YPP, Li J. k-Partite cliques of protein interactions: A novel subgraph topology for functional coherence analysis on PPI networks. J Theor Biol 2014;340:146-54. [PMID: 24056214 DOI: 10.1016/j.jtbi.2013.09.013] [Citation(s) in RCA: 9] [Impact Index Per Article: 0.9] [Received: 06/04/2013] [Revised: 08/09/2013] [Accepted: 09/10/2013] [Indexed: 01/02/2023]

Yu G, Rangwala H, Domeniconi C, Zhang G, Yu Z. Protein function prediction using multilabel ensemble classification. IEEE/ACM Trans Comput Biol Bioinform 2013;10:1045-57. [PMID: 24334396 DOI: 10.1109/tcbb.2013.111] [Citation(s) in RCA: 20] [Impact Index Per Article: 1.8] [Indexed: 06/03/2023]

Sun X, Gong D, Jin Y, Chen S. A New Surrogate-Assisted Interactive Genetic Algorithm With Weighted Semisupervised Learning. IEEE Trans Cybern 2013;43:685-698. [PMID: 23014759 DOI: 10.1109/tsmcb.2012.2214382] [Citation(s) in RCA: 21] [Impact Index Per Article: 1.9] [Indexed: 06/01/2023]