Reference Citation Analysis: Find an Article, Find a Category, Find a Journal, Find a Scholar

For: Wass MN, Barton G, Sternberg MJE. CombFunc: predicting protein function using heterogeneous data sources. Nucleic Acids Res 2012;40:W466-70. [PMID: 22641853 PMCID: PMC3394346 DOI: 10.1093/nar/gks489] [Citation(s) in RCA: 43] [Impact Index Per Article: 3.6] [Indexed: 12/19/2022]

Cited by Other Article(s)

Zhang C, Freddolino L. A large-scale assessment of sequence database search tools for homology-based protein function prediction. Brief Bioinform 2024;25:bbae349. [PMID: 39038936 DOI: 10.1093/bib/bbae349] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 03/04/2024] [Revised: 06/03/2024] [Accepted: 07/05/2024] [Indexed: 07/24/2024]

Wang J, Chen C, Yao G, Ding J, Wang L, Jiang H. Intelligent Protein Design and Molecular Characterization Techniques: A Comprehensive Review. Molecules 2023;28:7865. [PMID: 38067593 PMCID: PMC10707872 DOI: 10.3390/molecules28237865] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 10/21/2023] [Revised: 11/13/2023] [Accepted: 11/23/2023] [Indexed: 12/18/2023]

Yan TC, Yue ZX, Xu HQ, Liu YH, Hong YF, Chen GX, Tao L, Xie T. A systematic review of state-of-the-art strategies for machine learning-based protein function prediction. Comput Biol Med 2023;154:106446. [PMID: 36680931 DOI: 10.1016/j.compbiomed.2022.106446] [Citation(s) in RCA: 1] [Impact Index Per Article: 1.0] [Received: 11/22/2022] [Revised: 12/07/2022] [Accepted: 12/19/2022] [Indexed: 12/24/2022]

Targeting hydrophobicity in biofilm-associated protein (Bap) as a novel antibiofilm strategy against Staphylococcus aureus biofilm. Biophys Chem 2022;289:106860. [DOI: 10.1016/j.bpc.2022.106860] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 05/10/2022] [Revised: 07/20/2022] [Accepted: 07/20/2022] [Indexed: 11/23/2022]

Zhang F, Song H, Zeng M, Wu FX, Li Y, Pan Y, Li M. A Deep Learning Framework for Gene Ontology Annotations With Sequence- and Network-Based Information. IEEE/ACM Trans Comput Biol Bioinform 2021;18:2208-2217. [PMID: 31985440 DOI: 10.1109/tcbb.2020.2968882] [Citation(s) in RCA: 17] [Impact Index Per Article: 5.7] [Indexed: 06/10/2023]

Structure-based protein function prediction using graph convolutional networks. Nat Commun 2021;12:3168. [PMID: 34039967 PMCID: PMC8155034 DOI: 10.1038/s41467-021-23303-9] [Citation(s) in RCA: 217] [Impact Index Per Article: 72.3] [Received: 09/21/2020] [Accepted: 04/22/2021] [Indexed: 02/04/2023]

Hou Y, Zhang X, Zhou Q, Hong W, Wang Y. Hierarchical Microbial Functions Prediction by Graph Aggregated Embedding. Front Genet 2021;11:608512. [PMID: 33584804 PMCID: PMC7874084 DOI: 10.3389/fgene.2020.608512] [Citation(s) in RCA: 1] [Impact Index Per Article: 0.3] [Received: 09/21/2020] [Accepted: 11/20/2020] [Indexed: 02/01/2023]

You R, Yao S, Xiong Y, Huang X, Sun F, Mamitsuka H, Zhu S. NetGO: improving large-scale protein function prediction with massive network information. Nucleic Acids Res 2020;47:W379-W387. [PMID: 31106361 PMCID: PMC6602452 DOI: 10.1093/nar/gkz388] [Citation(s) in RCA: 60] [Impact Index Per Article: 15.0] [Received: 03/04/2019] [Revised: 04/24/2019] [Accepted: 05/01/2019] [Indexed: 01/19/2023]

Affiliation(s)

Ronghui You: School of Computer Science and Shanghai Key Lab of Intelligent Information Processing, Fudan University, Shanghai 200433, China; Institute of Science and Technology for Brain-Inspired Intelligence, Fudan University, Shanghai 200433, China; Key Laboratory of Computational Neuroscience and Brain-Inspired Intelligence (Fudan University), Ministry of Education, China
Shuwei Yao: School of Computer Science and Shanghai Key Lab of Intelligent Information Processing, Fudan University, Shanghai 200433, China; Institute of Science and Technology for Brain-Inspired Intelligence, Fudan University, Shanghai 200433, China; Key Laboratory of Computational Neuroscience and Brain-Inspired Intelligence (Fudan University), Ministry of Education, China
Yi Xiong: Department of Bioinformatics and Biostatistics, Shanghai Jiao Tong University
Xiaodi Huang: School of Computing and Mathematics, Charles Sturt University, Albury, NSW 2640, Australia
Fengzhu Sun: Institute of Science and Technology for Brain-Inspired Intelligence, Fudan University, Shanghai 200433, China; Key Laboratory of Computational Neuroscience and Brain-Inspired Intelligence (Fudan University), Ministry of Education, China; Quantitative and Computational Biology, Department of Biological Sciences, University of Southern California, Los Angeles, CA 90089, USA
Hiroshi Mamitsuka: Bioinformatics Center, Institute for Chemical Research, Kyoto University, Uji 611-0011, Japan; Department of Computer Science, Aalto University, Espoo and Helsinki, Finland
Shanfeng Zhu: School of Computer Science and Shanghai Key Lab of Intelligent Information Processing, Fudan University, Shanghai 200433, China; Institute of Science and Technology for Brain-Inspired Intelligence, Fudan University, Shanghai 200433, China; Key Laboratory of Computational Neuroscience and Brain-Inspired Intelligence (Fudan University), Ministry of Education, China

Peng J, Xue H, Wei Z, Tuncali I, Hao J, Shang X. Integrating multi-network topology for gene function prediction using deep neural networks. Brief Bioinform 2020;22:2096-2105. [PMID: 32249297 DOI: 10.1093/bib/bbaa036] [Citation(s) in RCA: 50] [Impact Index Per Article: 12.5] [Received: 12/02/2019] [Revised: 02/09/2020] [Accepted: 02/25/2020] [Indexed: 01/18/2023]

Makrodimitris S, van Ham RCHJ, Reinders MJT. Improving protein function prediction using protein sequence and GO-term similarities. Bioinformatics 2020;35:1116-1124. [PMID: 30169569 PMCID: PMC6449755 DOI: 10.1093/bioinformatics/bty751] [Citation(s) in RCA: 7] [Impact Index Per Article: 1.8] [Received: 11/25/2017] [Revised: 07/04/2018] [Accepted: 08/28/2018] [Indexed: 12/26/2022]

Zhao J, Cao Y, Zhang L. Exploring the computational methods for protein-ligand binding site prediction. Comput Struct Biotechnol J 2020;18:417-426. [PMID: 32140203 PMCID: PMC7049599 DOI: 10.1016/j.csbj.2020.02.008] [Citation(s) in RCA: 82] [Impact Index Per Article: 20.5] [Received: 10/14/2019] [Revised: 01/23/2020] [Accepted: 02/11/2020] [Indexed: 12/21/2022]

Waman VP, Blundell TL, Buchan DWA, Gough J, Jones D, Kelley L, Murzin A, Pandurangan AP, Sillitoe I, Sternberg M, Torres P, Orengo C. The Genome3D Consortium for Structural Annotations of Selected Model Organisms. Methods Mol Biol 2020;2165:27-67. [PMID: 32621218 DOI: 10.1007/978-1-0716-0708-4_3] [Citation(s) in RCA: 2] [Impact Index Per Article: 0.5] [Indexed: 06/11/2023]

Gligorijevic V, Barot M, Bonneau R. deepNF: deep network fusion for protein function prediction. Bioinformatics 2019;34:3873-3881. [PMID: 29868758 PMCID: PMC6223364 DOI: 10.1093/bioinformatics/bty440] [Citation(s) in RCA: 111] [Impact Index Per Article: 22.2] [Received: 11/24/2017] [Accepted: 05/28/2018] [Indexed: 01/10/2023]

Bhat AS, Grishin NV. Predicting Sequence Features, Function, and Structure of Proteins Using MESSA. Curr Protoc Bioinformatics 2019;67:e84. [PMID: 31524991 PMCID: PMC6750024 DOI: 10.1002/cpbi.84] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 11/09/2022]

Perlasca P, Frasca M, Ba CT, Notaro M, Petrini A, Casiraghi E, Grossi G, Gliozzo J, Valentini G, Mesiti M. UNIPred-Web: a web tool for the integration and visualization of biomolecular networks for protein function prediction. BMC Bioinformatics 2019;20:422. [PMID: 31412768 PMCID: PMC6694573 DOI: 10.1186/s12859-019-2959-2] [Citation(s) in RCA: 5] [Impact Index Per Article: 1.0] [Received: 03/08/2019] [Accepted: 06/18/2019] [Indexed: 01/06/2023]

Abstract

Background

One of the main issues in the automated protein function prediction (AFP) problem is the integration of multiple networked data sources. The UNIPred algorithm was proposed to integrate protein networks efficiently, in a function-specific fashion, by taking into account the imbalance that characterizes protein annotations, and subsequently to predict novel hypotheses about unannotated proteins. UNIPred is publicly available as R code, which may limit its use by non-expert users. Moreover, applying it requires effort to acquire and prepare the networks to be integrated. Finally, the UNIPred source code does not handle the visualization of the resulting consensus network, whereas suitable views of the network topology are necessary to explore and interpret existing protein relationships.

Results

We address the aforementioned issues by proposing UNIPred-Web, a user-friendly Web tool for the application of the UNIPred algorithm to a variety of biomolecular networks, already supplied by the system, and for the visualization and exploration of protein networks. We support different organisms and different types of networks (e.g., co-expression, shared domains and physical interaction networks). Users are supported in the different phases of the process, ranging from the selection of the networks and the protein function to be predicted, to the navigation of the integrated network. The system also supports the upload of user-defined protein networks. The vertex-centric and highly interactive approach of UNIPred-Web allows a focused exploration of specific proteins and an interactive analysis of large sub-networks with only a few mouse clicks.

Conclusions

UNIPred-Web offers practical and intuitive (visual) guidance to biologists interested in gaining insights into protein biomolecular functions. UNIPred-Web provides facilities for the integration of networks, and supplies a framework for the imbalance-aware protein network integration of nine organisms, the prediction of thousands of GO protein functions, and an easy-to-use graphical interface for the visual analysis, navigation and interpretation of the integrated networks and of the functional predictions.

Environmental conditions shape the nature of a minimal bacterial genome. Nat Commun 2019;10:3100. [PMID: 31308405 PMCID: PMC6629657 DOI: 10.1038/s41467-019-10837-2] [Citation(s) in RCA: 26] [Impact Index Per Article: 5.2] [Received: 11/14/2018] [Accepted: 06/04/2019] [Indexed: 12/16/2022]

Zhang F, Song H, Zeng M, Li Y, Kurgan L, Li M. DeepFunc: A Deep Learning Framework for Accurate Prediction of Protein Functions from Protein Sequences and Interactions. Proteomics 2019;19:e1900019. [PMID: 30941889 DOI: 10.1002/pmic.201900019] [Citation(s) in RCA: 52] [Impact Index Per Article: 10.4] [Received: 01/12/2019] [Revised: 03/18/2019] [Indexed: 01/06/2023]

Sureyya Rifaioglu A, Doğan T, Jesus Martin M, Cetin-Atalay R, Atalay V. DEEPred: Automated Protein Function Prediction with Multi-task Feed-forward Deep Neural Networks. Sci Rep 2019;9:7344. [PMID: 31089211 PMCID: PMC6517386 DOI: 10.1038/s41598-019-43708-3] [Citation(s) in RCA: 47] [Impact Index Per Article: 9.4] [Received: 08/13/2018] [Accepted: 04/27/2019] [Indexed: 01/22/2023]

Wu J, Yin Q, Zhang C, Geng J, Wu H, Hu H, Ke X, Zhang Y. Function Prediction for G Protein-Coupled Receptors through Text Mining and Induction Matrix Completion. ACS Omega 2019;4:3045-3054. [PMID: 31459527 PMCID: PMC6649004 DOI: 10.1021/acsomega.8b02454] [Citation(s) in RCA: 1] [Impact Index Per Article: 0.2] [Received: 09/30/2018] [Accepted: 01/11/2019] [Indexed: 06/10/2023]

Kulmanov M, Khan MA, Hoehndorf R, Wren J. DeepGO: predicting protein functions from sequence and interactions using a deep ontology-aware classifier. Bioinformatics 2018;34:660-668. [PMID: 29028931 PMCID: PMC5860606 DOI: 10.1093/bioinformatics/btx624] [Citation(s) in RCA: 201] [Impact Index Per Article: 33.5] [Received: 05/10/2017] [Accepted: 09/27/2017] [Indexed: 12/29/2022]

Doğan T. HPO2GO: prediction of human phenotype ontology term associations for proteins using cross ontology annotation co-occurrences. PeerJ 2018;6:e5298. [PMID: 30083448 PMCID: PMC6076985 DOI: 10.7717/peerj.5298] [Citation(s) in RCA: 20] [Impact Index Per Article: 3.3] [Received: 03/04/2018] [Accepted: 07/03/2018] [Indexed: 01/24/2023]

Abstract

Analysing the relationships between biomolecules and genetic diseases is a highly active area of research, where the aim is to identify the genes and gene products that cause a particular disease through functional changes originating from mutations. Biological ontologies are frequently employed in these studies, which provide researchers with extensive opportunities for knowledge discovery through computational data analysis. In this study, a novel approach is proposed for identifying relationships between biomedical entities by automatically mapping HPO terms, which define phenotypic abnormalities, to GO terms, which define biomolecular functions; each association indicates the occurrence of the abnormality due to the loss of the biomolecular function expressed by the corresponding GO term. The proposed HPO2GO mappings were extracted by calculating the frequency of the co-annotations of the terms on the same genes/proteins, using existing curated HPO and GO annotation sets. This was followed by filtering out the unreliable mappings that could be observed by chance, through statistical resampling of the co-occurrence similarity distributions. Furthermore, the biological relevance of the finalized mappings was discussed for selected cases, using the literature. The resulting HPO2GO mappings can be employed in different settings to predict and analyse novel relations among genes/proteins, ontology terms and diseases. As an application of the proposed approach, HPO term-protein associations (i.e., HPO2protein) were predicted. To test the predictive performance of the method quantitatively, and to compare it with the state of the art, the CAFA2 challenge HPO prediction target protein set was employed. The results of the benchmark indicated the potential of the proposed approach, as HPO2GO performance was among the best (Fmax = 0.35). The automated cross-ontology mapping approach developed in this work may be extended to other ontologies as well, to identify unexplored relation patterns at the systemic level. The datasets, results and the source code of HPO2GO are available for download at: https://github.com/cansyl/HPO2GO.
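The co-annotation counting step described in this abstract can be illustrated with a minimal sketch. It is not the HPO2GO implementation: the function names, the toy term identifiers and the Jaccard-style score are all assumptions made for illustration, since the abstract does not state the similarity formula, and the resampling-based filtering step is omitted.

```python
from collections import Counter
from itertools import product


def co_annotation_counts(hpo_annotations, go_annotations):
    """Count, for each (HPO term, GO term) pair, the genes annotated with both.

    Both arguments map a gene identifier to a set of ontology terms.
    """
    counts = Counter()
    # Only genes present in both annotation sets can contribute a pair.
    for gene in hpo_annotations.keys() & go_annotations.keys():
        for pair in product(hpo_annotations[gene], go_annotations[gene]):
            counts[pair] += 1
    return counts


def term_frequencies(annotations):
    """Number of genes annotated with each term."""
    freq = Counter()
    for terms in annotations.values():
        freq.update(terms)
    return freq


def co_occurrence_similarity(hpo_annotations, go_annotations):
    """Jaccard-style score (an assumption, not the published formula):
    genes carrying both terms divided by genes carrying either term."""
    pair_counts = co_annotation_counts(hpo_annotations, go_annotations)
    hpo_freq = term_frequencies(hpo_annotations)
    go_freq = term_frequencies(go_annotations)
    return {
        (h, g): n / (hpo_freq[h] + go_freq[g] - n)
        for (h, g), n in pair_counts.items()
    }


# Toy data with hypothetical identifiers (not real HPO or GO terms).
hpo = {"gene1": {"HP:1"}, "gene2": {"HP:1", "HP:2"}, "gene3": {"HP:2"}}
go = {"gene1": {"GO:1"}, "gene2": {"GO:1"}, "gene3": {"GO:2"}, "gene4": {"GO:1"}}

scores = co_occurrence_similarity(hpo, go)
for pair in sorted(scores):
    print(pair, round(scores[pair], 3))
```

In a fuller pipeline, each score would then be compared against a null distribution obtained by resampling, as the abstract describes, before a mapping is accepted.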

Fa R, Cozzetto D, Wan C, Jones DT. Predicting human protein function with multi-task deep neural networks. PLoS One 2018;13:e0198216. [PMID: 29889900 PMCID: PMC5995439 DOI: 10.1371/journal.pone.0198216] [Citation(s) in RCA: 44] [Impact Index Per Article: 7.3] [Received: 02/01/2018] [Accepted: 05/15/2018] [Indexed: 11/19/2022]

Abstract

Machine learning methods for protein function prediction are urgently needed, especially now that a substantial fraction of known sequences remains unannotated despite the extensive use of functional assignments based on sequence similarity. One major bottleneck supervised learning faces in protein function prediction is the structured, multi-label nature of the problem, because biological roles are represented by lists of terms from hierarchically organised controlled vocabularies such as the Gene Ontology. In this work, we build on recent developments in the area of deep learning and investigate the usefulness of multi-task deep neural networks (MTDNN), which consist of upstream shared layers upon which are stacked in parallel as many independent modules (additional hidden layers with their own output units) as the number of output GO terms (the tasks). MTDNN learns individual tasks partially using shared representations and partially from task-specific characteristics. When no close homologues with experimentally validated functions can be identified, MTDNN gives more accurate predictions than baseline methods based on annotation frequencies in public databases or homology transfers. More importantly, the results show that MTDNN binary classification accuracy is higher than that of alternative machine learning-based methods that do not exploit commonalities and differences among prediction tasks. Interestingly, compared with a single-task predictor, the performance improvement is not linearly correlated with the number of tasks in MTDNN; medium-sized models provide the most improvement in our case. One of the advantages of MTDNN is that, given a set of features, it does not require the bootstrap feature selection procedure that traditional machine learning algorithms use. Overall, the results indicate that the proposed MTDNN algorithm improves the performance of protein function prediction. On the other hand, there is still considerable room for deep learning techniques to further enhance prediction ability.

Zhang C, Zheng W, Freddolino PL, Zhang Y. MetaGO: Predicting Gene Ontology of Non-homologous Proteins Through Low-Resolution Protein Structure Prediction and Protein-Protein Network Mapping. J Mol Biol 2018. [PMID: 29534977 DOI: 10.1016/j.jmb.2018.03.004] [Citation(s) in RCA: 43] [Impact Index Per Article: 7.2] [Indexed: 11/18/2022]

Yu CY, Li XX, Yang H, Li YH, Xue WW, Chen YZ, Tao L, Zhu F. Assessing the Performances of Protein Function Prediction Algorithms from the Perspectives of Identification Accuracy and False Discovery Rate. Int J Mol Sci 2018;19:E183. [PMID: 29316706 PMCID: PMC5796132 DOI: 10.3390/ijms19010183] [Citation(s) in RCA: 27] [Impact Index Per Article: 4.5] [Received: 10/22/2017] [Revised: 12/09/2017] [Accepted: 01/04/2018] [Indexed: 12/27/2022]

Abstract

The function of a protein is of great interest in cutting-edge research on biological mechanisms, disease development and drug/target discovery. Besides experimental explorations, a variety of computational methods have been designed to predict protein function. Among these in silico methods, BLAST predictions are based on protein sequence similarity, while machine learning predictions are also based on the sequence but do not consider sequence similarity. This characteristic of machine learning makes it a good complement to BLAST and many other approaches in predicting the function of remotely related proteins and of homologous proteins with distinct functions. However, the identification accuracies of these in silico methods and their false discovery rates have not yet been assessed, which greatly limits the usage of these algorithms. Herein, a comprehensive comparison of the performances of four popular prediction algorithms (BLAST, SVM, PNN and KNN) was conducted. In particular, the performance of these methods was systematically assessed by four standard statistical indexes based on the independent test datasets of 93 functional protein families defined by UniProtKB keywords. Moreover, the false discovery rates of these algorithms were evaluated by scanning the genomes of four representative model organisms (Homo sapiens, Arabidopsis thaliana, Saccharomyces cerevisiae and Mycobacterium tuberculosis). As a result, substantially higher sensitivity was observed for SVM and BLAST than for PNN and KNN. However, the machine learning algorithms (PNN, KNN and SVM) were found capable of substantially reducing the false discovery rate (SVM < PNN < KNN). In sum, this study comprehensively assessed the performance of four popular algorithms applied to protein function prediction, which could facilitate the selection of the most appropriate method in related biomedical research.

Affiliation(s)

Chun Yan Yu: Innovative Drug Research and Bioinformatics Group, School of Pharmaceutical Sciences and Collaborative Innovation Center for Brain Science, Chongqing University, Chongqing 401331, China; Innovative Drug Research and Bioinformatics Group, College of Pharmaceutical Sciences, Zhejiang University, Hangzhou 310058, China.
Xiao Xu Li: Innovative Drug Research and Bioinformatics Group, School of Pharmaceutical Sciences and Collaborative Innovation Center for Brain Science, Chongqing University, Chongqing 401331, China; Innovative Drug Research and Bioinformatics Group, College of Pharmaceutical Sciences, Zhejiang University, Hangzhou 310058, China.
Hong Yang: Innovative Drug Research and Bioinformatics Group, School of Pharmaceutical Sciences and Collaborative Innovation Center for Brain Science, Chongqing University, Chongqing 401331, China; Innovative Drug Research and Bioinformatics Group, College of Pharmaceutical Sciences, Zhejiang University, Hangzhou 310058, China.
Ying Hong Li: Innovative Drug Research and Bioinformatics Group, School of Pharmaceutical Sciences and Collaborative Innovation Center for Brain Science, Chongqing University, Chongqing 401331, China; Innovative Drug Research and Bioinformatics Group, College of Pharmaceutical Sciences, Zhejiang University, Hangzhou 310058, China.
Wei Wei Xue: Innovative Drug Research and Bioinformatics Group, School of Pharmaceutical Sciences and Collaborative Innovation Center for Brain Science, Chongqing University, Chongqing 401331, China.
Yu Zong Chen: Bioinformatics and Drug Design Group, Department of Pharmacy, and Center for Computational Science and Engineering, National University of Singapore, Singapore 117543, Singapore.
Lin Tao: School of Medicine, Hangzhou Normal University, Hangzhou 310012, China.
Feng Zhu: Innovative Drug Research and Bioinformatics Group, School of Pharmaceutical Sciences and Collaborative Innovation Center for Brain Science, Chongqing University, Chongqing 401331, China; Innovative Drug Research and Bioinformatics Group, College of Pharmaceutical Sciences, Zhejiang University, Hangzhou 310058, China.

Rifaioglu AS, Doğan T, Saraç ÖS, Ersahin T, Saidi R, Atalay MV, Martin MJ, Cetin-Atalay R. Large-scale automated function prediction of protein sequences and an experimental case study validation on PTEN transcript variants. Proteins 2017;86:135-151. [PMID: 29098713 DOI: 10.1002/prot.25416] [Citation(s) in RCA: 10] [Impact Index Per Article: 1.4] [Received: 09/09/2017] [Revised: 10/24/2017] [Accepted: 11/01/2017] [Indexed: 12/24/2022]

Gupta A, Fuentes SM, Grove A. Redox-Sensitive MarR Homologue BifR from Burkholderia thailandensis Regulates Biofilm Formation. Biochemistry 2017;56:2315-2327. [PMID: 28406615 DOI: 10.1021/acs.biochem.7b00103] [Citation(s) in RCA: 8] [Impact Index Per Article: 1.1] [Indexed: 11/28/2022]

Bhadra P, Pal D. Pipeline for inferring protein function from dynamics using coarse-grained molecular mechanics forcefield. Comput Biol Med 2017;83:134-142. [PMID: 28279862 DOI: 10.1016/j.compbiomed.2017.02.009] [Citation(s) in RCA: 3] [Impact Index Per Article: 0.4] [Received: 12/03/2016] [Revised: 02/18/2017] [Accepted: 02/22/2017] [Indexed: 11/28/2022]

Giles TC, Emes RD. Inferring Function from Homology. Methods Mol Biol 2017;1526:23-40. [PMID: 27896734 DOI: 10.1007/978-1-4939-6613-4_2] [Citation(s) in RCA: 4] [Impact Index Per Article: 0.6] [Indexed: 06/06/2023]

Cozzetto D, Jones DT. Computational Methods for Annotation Transfers from Sequence. Methods Mol Biol 2017;1446:55-67. [PMID: 27812935 DOI: 10.1007/978-1-4939-3743-1_5] [Citation(s) in RCA: 23] [Impact Index Per Article: 3.3] [Indexed: 06/06/2023]

Cruz LM, Trefflich S, Weiss VA, Castro MAA. Protein Function Prediction. Methods Mol Biol 2017;1654:55-75. [PMID: 28986783 DOI: 10.1007/978-1-4939-7231-9_5] [Citation(s) in RCA: 19] [Impact Index Per Article: 2.7] [Indexed: 06/07/2023]

Li Z, Liu Z, Zhong W, Huang M, Wu N, Xie Y, Dai Z, Zou X. Large-scale identification of human protein function using topological features of interaction network. Sci Rep 2016;6:37179. [PMID: 27849060 PMCID: PMC5111120 DOI: 10.1038/srep37179] [Citation(s) in RCA: 6] [Impact Index Per Article: 0.8] [Received: 12/11/2015] [Accepted: 10/26/2016] [Indexed: 12/25/2022]

Mahdieh N, Rabbani B. Beta thalassemia in 31,734 cases with HBB gene mutations: Pathogenic and structural analysis of the common mutations; Iran as the crossroads of the Middle East. Blood Rev 2016;30:493-508. [DOI: 10.1016/j.blre.2016.07.001] [Citation(s) in RCA: 18] [Impact Index Per Article: 2.3] [Received: 12/19/2015] [Revised: 06/13/2016] [Accepted: 07/08/2016] [Indexed: 12/16/2022]

Li YH, Xu JY, Tao L, Li XF, Li S, Zeng X, Chen SY, Zhang P, Qin C, Zhang C, Chen Z, Zhu F, Chen YZ. SVM-Prot 2016: A Web-Server for Machine Learning Prediction of Protein Functional Families from Sequence Irrespective of Similarity. PLoS One 2016;11:e0155290. [PMID: 27525735 PMCID: PMC4985167 DOI: 10.1371/journal.pone.0155290] [Citation(s) in RCA: 80] [Impact Index Per Article: 10.0] [Received: 03/21/2016] [Accepted: 04/27/2016] [Indexed: 12/20/2022]

Affiliation(s)

Ying Hong Li: Innovative Drug Research and Bioinformatics Group, Innovative Drug Research Centre and School of Pharmaceutical Sciences, Chongqing University, Chongqing, 401331, China
Jing Yu Xu: Innovative Drug Research and Bioinformatics Group, Innovative Drug Research Centre and School of Pharmaceutical Sciences, Chongqing University, Chongqing, 401331, China; School of Mathematics and Statistics, Beijing Institute of Technology, Beijing, China
Lin Tao: Innovative Drug Research and Bioinformatics Group, Innovative Drug Research Centre and School of Pharmaceutical Sciences, Chongqing University, Chongqing, 401331, China; Bioinformatics and Drug Discovery group, Department of Pharmacy, National University of Singapore, Singapore, 117543, Singapore
Xiao Feng Li: Innovative Drug Research and Bioinformatics Group, Innovative Drug Research Centre and School of Pharmaceutical Sciences, Chongqing University, Chongqing, 401331, China
Shuang Li: Innovative Drug Research and Bioinformatics Group, Innovative Drug Research Centre and School of Pharmaceutical Sciences, Chongqing University, Chongqing, 401331, China
Xian Zeng: Bioinformatics and Drug Discovery group, Department of Pharmacy, National University of Singapore, Singapore, 117543, Singapore
Shang Ying Chen: Bioinformatics and Drug Discovery group, Department of Pharmacy, National University of Singapore, Singapore, 117543, Singapore
Peng Zhang: Bioinformatics and Drug Discovery group, Department of Pharmacy, National University of Singapore, Singapore, 117543, Singapore
Chu Qin: Bioinformatics and Drug Discovery group, Department of Pharmacy, National University of Singapore, Singapore, 117543, Singapore
Cheng Zhang: Bioinformatics and Drug Discovery group, Department of Pharmacy, National University of Singapore, Singapore, 117543, Singapore
Zhe Chen: Zhejiang Key Laboratory of Gastro-intestinal Pathophysiology, Zhejiang Hospital of Traditional Chinese Medicine, Zhejiang Chinese Medical University, Hangzhou, P. R. China
Feng Zhu: Innovative Drug Research and Bioinformatics Group, Innovative Drug Research Centre and School of Pharmaceutical Sciences, Chongqing University, Chongqing, 401331, China
Yu Zong Chen: Bioinformatics and Drug Discovery group, Department of Pharmacy, National University of Singapore, Singapore, 117543, Singapore


Vidulin V, Šmuc T, Supek F. Extensive complementarity between gene function prediction methods. Bioinformatics 2016;32:3645-3653. [PMID: 27522084 DOI: 10.1093/bioinformatics/btw532] [Citation(s) in RCA: 7] [Impact Index Per Article: 0.9] [Reference Citation Analysis] [Abstract] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Received: 04/26/2016] [Revised: 07/11/2016] [Accepted: 08/09/2016] [Indexed: 12/22/2022] Open

Abstract

MOTIVATION

The number of sequenced genomes rises steadily but we still lack the knowledge about the biological roles of many genes. Automated function prediction (AFP) is thus a necessity. We hypothesized that AFP approaches that draw on distinct genome features may be useful for predicting different types of gene functions, motivating a systematic analysis of the benefits gained by obtaining and integrating such predictions.

RESULTS

Our pipeline amalgamates 5 133 543 genes from 2071 genomes in a single massive analysis that evaluates five established genomic AFP methodologies. While 1227 Gene Ontology (GO) terms yielded reliable predictions, the majority of these functions were accessible to only one or two of the methods. Moreover, different methods tend to assign a GO term to non-overlapping sets of genes. Thus, inferences made by diverse genomic AFP methods display a striking complementarity, both gene-wise and function-wise. Because of this, a viable integration strategy is to rely on a single most-confident prediction per gene/function, rather than enforcing agreement across multiple AFP methods. Using an information-theoretic approach, we estimate that current databases contain 29.2 bits/gene of known Escherichia coli gene functions. This can be increased by up to 5.5 bits/gene using individual AFP methods or by 11 additional bits/gene upon integration, thereby providing a highly ranking predictor on the Critical Assessment of Function Annotation 2 community benchmark. Availability of more sequenced genomes boosts the predictive accuracy of AFP approaches and also the benefit from integrating them.

AVAILABILITY AND IMPLEMENTATION

The individual and integrated GO predictions for the complete set of genes are available from http://gorbi.irb.hr/

CONTACT

fran.supek@irb.hr

SUPPLEMENTARY INFORMATION

Supplementary materials are available at Bioinformatics online.
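The integration strategy named in the RESULTS of this abstract (keep the single most-confident prediction per gene/function instead of requiring agreement between methods) can be illustrated with a minimal sketch. This is not the authors' pipeline: the function name, the method labels, the gene identifiers and the confidence values below are all hypothetical, and the sketch assumes that confidence scores from different methods are already on a comparable scale.

```python
from typing import Dict, Iterable, Tuple

# (method, gene, go_term, confidence); all values used below are hypothetical.
Prediction = Tuple[str, str, str, float]


def integrate_max_confidence(
    predictions: Iterable[Prediction],
) -> Dict[Tuple[str, str], Tuple[str, float]]:
    """For every (gene, GO term) pair keep only the most confident prediction.

    Assumes confidences from different methods are directly comparable.
    """
    best: Dict[Tuple[str, str], Tuple[str, float]] = {}
    for method, gene, go_term, confidence in predictions:
        key = (gene, go_term)
        # Replace the stored prediction only if this one is more confident.
        if key not in best or confidence > best[key][1]:
            best[key] = (method, confidence)
    return best


# Hypothetical toy input: two methods disagree in confidence on geneA.
example = [
    ("method_1", "geneA", "GO:0006810", 0.62),
    ("method_2", "geneA", "GO:0006810", 0.91),
    ("method_1", "geneB", "GO:0016301", 0.75),
]
integrated = integrate_max_confidence(example)
for (gene, go_term), (method, confidence) in sorted(integrated.items()):
    print(gene, go_term, method, confidence)
```

A prediction made by only one method is retained as-is, which matches the abstract's observation that most GO terms were accessible to only one or two of the methods.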


Zhang Q, Wang C, Wan M, Wu Y, Ma Q. Streptococcus pneumoniae Genome-wide Identification and Characterization of BOX Element-binding Domains. Mol Inform 2016;34:742-52. [PMID: 27491035 DOI: 10.1002/minf.201500044] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Reference Citation Analysis] [Abstract] [Key Words] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Received: 05/24/2015] [Indexed: 11/11/2022]

Wang J, Luttrell J, Zhang N, Khan S, Shi N, Wang MX, Kang JQ, Wang Z, Xu D. Exploring Human Diseases and Biological Mechanisms by Protein Structure Prediction and Modeling. ADVANCES IN EXPERIMENTAL MEDICINE AND BIOLOGY 2016;939:39-61. [PMID: 27807743 PMCID: PMC6829626 DOI: 10.1007/978-981-10-1503-8_3] [Citation(s) in RCA: 4] [Impact Index Per Article: 0.5] [Reference Citation Analysis] [Abstract] [Key Words] [MESH Headings] [Grants] [Track Full Text] [Subscribe] [Scholar Register] [Indexed: 01/06/2023]

Parasuram R, Mills CL, Wang Z, Somasundaram S, Beuning PJ, Ondrechen MJ. Local structure based method for prediction of the biochemical function of proteins: Applications to glycoside hydrolases. Methods 2016;93:51-63. [DOI: 10.1016/j.ymeth.2015.11.010] [Citation(s) in RCA: 4] [Impact Index Per Article: 0.5] [Reference Citation Analysis] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Received: 06/01/2015] [Revised: 11/05/2015] [Accepted: 11/09/2015] [Indexed: 01/07/2023] Open

Harke MJ, Gobler CJ. Daily transcriptome changes reveal the role of nitrogen in controlling microcystin synthesis and nutrient transport in the toxic cyanobacterium, Microcystis aeruginosa. BMC Genomics 2015;16:1068. [PMID: 26673568 PMCID: PMC4681089 DOI: 10.1186/s12864-015-2275-9] [Citation(s) in RCA: 48] [Impact Index Per Article: 5.3] [Reference Citation Analysis] [Abstract] [Track Full Text] [Download PDF] [Figures] [Journal Information] [Subscribe] [Scholar Register] [Received: 09/23/2015] [Accepted: 12/03/2015] [Indexed: 11/10/2022] Open

Abstract

BACKGROUND

While transcriptomics have become a valuable tool for linking physiology and ecology in aquatic microbes, the temporal dynamics of global transcriptomic patterns in Microcystis have rarely been assessed. Furthermore, while many microbial studies have explored expression of nutrient transporter genes, few studies have concurrently measured nutrient assimilation rates. Here, we considered how the global transcriptomic patterns and physiology of the cyanobacterium, Microcystis aeruginosa, changed daily as cells were grown from replete to deficient nitrogen (N) conditions and then back to replete conditions.

RESULTS

During N deprivation, Microcystis downregulated genes involved in photosynthesis and respiration, carbon acquisition, lipid metabolism, and amino acid biosynthesis while upregulating genes involved in N acquisition and transport. With increasing N stress, both the strength of expression and number of genes being differentially expressed increased, until N was restored at which point these patterns reversed. Uptake of (15)N-labeled nitrate, ammonium and urea reflected differential expression of genes encoding transporters for these nutrients, with Microcystis appearing to preferentially increase transcription of ammonium and urea transporters and uptake of these compounds during N deprivation. Nitrate uptake and nitrate transporter expression were correlated for one set of transporters but not another, indicating these were high and low affinity nitrate transporters, respectively. Concentrations of microcystin per cell decreased during N deprivation and increased upon N restoration. However, the transcript abundance of genes involved in the synthesis of this compound was complex, as microcystin synthetase genes involved in peptide synthesis were downregulated under N deprivation while genes involved in tailoring and transport were upregulated, suggesting modification of the microcystin molecule under N stress as well as potential alternative functions for these genes and/or this toxin.

CONCLUSIONS

Collectively, this study highlights the complex choreography of gene expression, cell physiology, and toxin synthesis that dynamic N levels can elicit in this ecologically important cyanobacterium. Differing expression patterns of genes within the microcystin synthetase operon in response to changing N levels revealed the potential limitations of drawing conclusions based on only one gene in this operon.


Sahraeian SM, Luo KR, Brenner SE. SIFTER search: a web server for accurate phylogeny-based protein function prediction. Nucleic Acids Res 2015;43:W141-7. [PMID: 25979264 PMCID: PMC4489292 DOI: 10.1093/nar/gkv461] [Citation(s) in RCA: 33] [Impact Index Per Article: 3.7] [Reference Citation Analysis] [Abstract] [Track Full Text] [Download PDF] [Figures] [Journal Information] [Subscribe] [Scholar Register] [Received: 02/14/2015] [Accepted: 04/27/2015] [Indexed: 12/26/2022] Open

Wu JS, Huang SJ, Zhou ZH. Genome-Wide Protein Function Prediction through Multi-Instance Multi-Label Learning. IEEE/ACM TRANSACTIONS ON COMPUTATIONAL BIOLOGY AND BIOINFORMATICS 2014;11:891-902. [PMID: 26356861 DOI: 10.1109/tcbb.2014.2323058] [Citation(s) in RCA: 27] [Impact Index Per Article: 2.7] [Reference Citation Analysis] [Abstract] [MESH Headings] [Track Full Text] [Subscribe] [Scholar Register] [Indexed: 06/05/2023]

Teh BA, Choi SB, Musa N, Ling FL, Cun STW, Salleh AB, Najimudin N, Wahab HA, Normi YM. Structure to function prediction of hypothetical protein KPN_00953 (Ycbk) from Klebsiella pneumoniae MGH 78578 highlights possible role in cell wall metabolism. BMC STRUCTURAL BIOLOGY 2014;14:7. [PMID: 24499172 PMCID: PMC3927764 DOI: 10.1186/1472-6807-14-7] [Citation(s) in RCA: 8] [Impact Index Per Article: 0.8] [Reference Citation Analysis] [Abstract] [Track Full Text] [Download PDF] [Figures] [Subscribe] [Scholar Register] [Received: 07/11/2013] [Accepted: 02/01/2014] [Indexed: 11/10/2022]

Abstract

Background

Klebsiella pneumoniae plays a major role in causing nosocomial infection in immunocompromised patients. Infections caused by the pathogen range from respiratory and urinary tract infections to septicemia and, primarily, pneumonia. As more K. pneumoniae strains are becoming highly resistant to various antibiotics, treatment of infections caused by this bacterium has become more difficult. This situation, as a consequence, poses a threat to public health. Hence, identification of possible novel drug targets against this opportunistic pathogen needs to be undertaken. In the complete genome sequence of K. pneumoniae MGH 78578, approximately one-fourth of the genome encodes hypothetical proteins (HPs). Due to their low homology and relatedness to other known proteins, HPs may serve as potential new drug targets.

Results

Sequence analysis of the HPs of K. pneumoniae MGH 78578 revealed that a particular HP termed KPN_00953 (YcbK) contains an M15_3 peptidases superfamily conserved domain. Some members of this superfamily are metalloproteases that are involved in cell wall metabolism. A BLASTP similarity search on KPN_00953 (YcbK) revealed that the majority of the hits were hypothetical proteins, although two of the hits suggested that it may be a lipoprotein or related to the twin-arginine translocation (Tat) pathway, which is important for transport of proteins to the cell membrane and periplasmic space. As lipoproteins and other components of the cell wall are important pathogenic factors, homology modeling of KPN_00953 was attempted to predict the structure and function of this protein. A three-dimensional model of the protein showed that its secondary structure topology and active site are similar to those found among metalloproteases, where two His residues, namely His169 and His209, and an Asp residue, Asp176, in KPN_00953 were found to be Zn-chelating residues. Interestingly, induced expression of the cloned KPN_00953 gene in lipoprotein-deficient E. coli JE5505 resulted in smoother cells with flattened edges. Some cells showed deposits of film-like material under a scanning electron microscope.

Conclusions

We postulate that KPN_00953 is a Zn metalloprotease and may play a role in bacterial cell wall metabolism. Structural biology studies to understand its structure, function and mechanism of action pose the possibility of utilizing this protein as a new drug target against K. pneumoniae in the future.


Nagao C, Nagano N, Mizuguchi K. Prediction of detailed enzyme functions and identification of specificity determining residues by random forests. PLoS One 2014;9:e84623. [PMID: 24416252 PMCID: PMC3885575 DOI: 10.1371/journal.pone.0084623] [Citation(s) in RCA: 24] [Impact Index Per Article: 2.4] [Reference Citation Analysis] [Abstract] [MESH Headings] [Track Full Text] [Download PDF] [Figures] [Journal Information] [Subscribe] [Scholar Register] [Received: 06/27/2013] [Accepted: 11/15/2013] [Indexed: 12/03/2022] Open

Cao M, Zhang H, Park J, Daniels NM, Crovella ME, Cowen LJ, Hescott B. Going the distance for protein function prediction: a new distance metric for protein interaction networks. PLoS One 2013;8:e76339. [PMID: 24194834 PMCID: PMC3806810 DOI: 10.1371/journal.pone.0076339] [Citation(s) in RCA: 81] [Impact Index Per Article: 7.4] [Reference Citation Analysis] [Abstract] [MESH Headings] [Grants] [Download PDF] [Figures] [Journal Information] [Subscribe] [Scholar Register] [Received: 05/17/2013] [Accepted: 08/23/2013] [Indexed: 01/17/2023] Open

MESSA: MEta-Server for protein Sequence Analysis. BMC Biol 2012;10:82. [PMID: 23031578 PMCID: PMC3519821 DOI: 10.1186/1741-7007-10-82] [Citation(s) in RCA: 36] [Impact Index Per Article: 3.0] [Reference Citation Analysis] [Abstract] [Track Full Text] [Download PDF] [Figures] [Journal Information] [Subscribe] [Scholar Register] [Received: 09/14/2012] [Accepted: 10/02/2012] [Indexed: 11/10/2022] Open

Abstract

Background

Computational sequence analysis, that is, prediction of local sequence properties, homologs, spatial structure and function from the sequence of a protein, offers an efficient way to obtain needed information about proteins under study. Since reliable prediction is usually based on the consensus of many computer programs, meta-servers have been developed to fit such needs. Most meta-servers focus on one aspect of sequence analysis, while others incorporate more information, such as PredictProtein for local sequence feature predictions, SMART for domain architecture and sequence motif annotation, and GeneSilico for secondary and spatial structure prediction. However, as predictions of local sequence properties, three-dimensional structure and function are usually intertwined, it is beneficial to address them together.

Results

We developed a MEta-Server for protein Sequence Analysis (MESSA) to facilitate comprehensive protein sequence analysis and gather structural and functional predictions for a protein of interest. For an input sequence, the server exploits a number of select tools to predict local sequence properties, such as secondary structure, structurally disordered regions, coiled coils, signal peptides and transmembrane helices; detect homologous proteins and assign the query to a protein family; identify three-dimensional structure templates and generate structure models; and provide predictive statements about the protein's function, including functional annotations, Gene Ontology terms, enzyme classification and possible functionally associated proteins. We tested MESSA on the proteome of Candidatus Liberibacter asiaticus. Manual curation shows that three-dimensional structure models generated by MESSA covered around 75% of all the residues in this proteome and the function of 80% of all proteins could be predicted.

Availability

MESSA is free for non-commercial use at http://prodata.swmed.edu/MESSA/
