Reference Citation Analysis: Find an Article, Find a Category, Find a Journal, Find a Scholar

For: Makrodimitris S, van Ham RCHJ, Reinders MJT. Automatic Gene Function Prediction in the 2020's. Genes (Basel) 2020;11:E1264. [PMID: 33120976 PMCID: PMC7692357 DOI: 10.3390/genes11111264] [Citation(s) in RCA: 12] [Impact Index Per Article: 3.0] [Received: 09/28/2020] [Revised: 10/19/2020] [Accepted: 10/21/2020] [Indexed: 02/06/2023]



Cited by Other Article(s)

Kohyama S, Frohn BP, Babl L, Schwille P. Machine learning-aided design and screening of an emergent protein function in synthetic cells. Nat Commun 2024;15:2010. [PMID: 38443351 PMCID: PMC10914801 DOI: 10.1038/s41467-024-46203-0] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 06/27/2023] [Accepted: 02/16/2024] [Indexed: 03/07/2024]

de Crécy-Lagard V, Swairjo MA. On the necessity to include multiple types of evidence when predicting molecular function of proteins. bioRxiv 2023:2023.12.18.571875. [PMID: 38187591 PMCID: PMC10769224 DOI: 10.1101/2023.12.18.571875] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 01/09/2024]

Cai P, Liu S, Zhang D, Xing H, Han M, Liu D, Gong L, Hu QN. SynBioTools: a one-stop facility for searching and selecting synthetic biology tools. BMC Bioinformatics 2023;24:152. [PMID: 37069545 PMCID: PMC10111727 DOI: 10.1186/s12859-023-05281-5] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 01/09/2023] [Accepted: 04/11/2023] [Indexed: 04/19/2023]

Zheng Y, Young ND, Song J, Chang BC, Gasser RB. An informatic workflow for the enhanced annotation of excretory/secretory proteins of Haemonchus contortus. Comput Struct Biotechnol J 2023;21:2696-2704. [PMID: 37143762 PMCID: PMC10151223 DOI: 10.1016/j.csbj.2023.03.025] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 02/03/2023] [Revised: 03/16/2023] [Accepted: 03/16/2023] [Indexed: 03/19/2023]

Romero M, Nakano FK, Finke J, Rocha C, Vens C. Leveraging class hierarchy for detecting missing annotations on hierarchical multi-label classification. Comput Biol Med 2023;152:106423. [PMID: 36529023 DOI: 10.1016/j.compbiomed.2022.106423] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 07/12/2022] [Revised: 11/09/2022] [Accepted: 12/11/2022] [Indexed: 12/15/2022]

Escudeiro P, Henry CS, Dias RP. Functional characterization of prokaryotic dark matter: the road so far and what lies ahead. Curr Res Microb Sci 2022;3:100159. [PMID: 36561390 PMCID: PMC9764257 DOI: 10.1016/j.crmicr.2022.100159] [Citation(s) in RCA: 4] [Impact Index Per Article: 2.0] [Received: 01/11/2022] [Revised: 07/18/2022] [Accepted: 08/05/2022] [Indexed: 12/25/2022]

Merino GA, Saidi R, Milone DH, Stegmayer G, Martin MJ. Hierarchical deep learning for predicting GO annotations by integrating protein knowledge. Bioinformatics 2022;38:4488-4496. [PMID: 35929781 PMCID: PMC9524999 DOI: 10.1093/bioinformatics/btac536] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 07/13/2021] [Revised: 07/18/2022] [Indexed: 12/24/2022]

Fenoy E, Edera AA, Stegmayer G. Transfer learning in proteins: evaluating novel protein learned representations for bioinformatics tasks. Brief Bioinform 2022;23:6618242. [PMID: 35758229 DOI: 10.1093/bib/bbac232] [Citation(s) in RCA: 7] [Impact Index Per Article: 3.5] [Received: 01/25/2022] [Revised: 05/13/2022] [Accepted: 05/18/2022] [Indexed: 11/13/2022]

Abstract

A representation method is an algorithm that calculates numerical feature vectors for samples in a dataset. Such vectors, also known as embeddings, define a relatively low-dimensional space able to efficiently encode high-dimensional data. Very recently, many types of learned data representations based on machine learning have appeared and are being applied to several tasks in bioinformatics. In particular, protein representation learning methods integrate different types of protein information (sequence, domains, etc.), in supervised or unsupervised learning approaches, and provide embeddings of protein sequences that can be used for downstream tasks. One task of special interest is the automatic function prediction of the huge number of novel proteins that are now being discovered and remain totally uncharacterized. However, despite its importance, to date there has been no fair benchmark study of the predictive performance of existing proposals on the same large set of proteins and for concrete, common bioinformatics tasks. This lack of benchmark studies prevents the community from using adequate predictive methods to accelerate the functional characterization of proteins. In this study, we performed a detailed comparison of protein sequence representation learning methods, explaining each approach and comparing them with an experimental benchmark on several bioinformatics tasks: (i) determining protein sequence similarity in the embedding space; (ii) inferring protein domains and (iii) predicting ontology-based protein functions. We examine the advantages and disadvantages of each representation approach over the benchmark results. We hope the results and the discussion of this study can help the community to select the most adequate machine learning-based technique for protein representation according to the bioinformatics task at hand.
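Task (i) above, comparing proteins in the embedding space, can be illustrated with a minimal sketch. It assumes cosine similarity as the comparison measure, which the abstract does not specify; the function names and the three-dimensional toy vectors are hypothetical and are not taken from the benchmark.

```python
import math


def cosine_similarity(u, v):
    """Cosine of the angle between two equal-length embedding vectors."""
    if len(u) != len(v):
        raise ValueError("embeddings must have the same dimension")
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    if norm_u == 0.0 or norm_v == 0.0:
        raise ValueError("cosine similarity is undefined for a zero vector")
    return dot / (norm_u * norm_v)


def nearest_protein(query, embeddings):
    """Name of the protein whose embedding is most similar to the query."""
    return max(embeddings, key=lambda name: cosine_similarity(query, embeddings[name]))


# Hypothetical 3-dimensional embeddings, invented for illustration only.
# Real learned protein embeddings typically have hundreds of dimensions.
toy_embeddings = {
    "protein_a": [1.0, 0.0, 0.0],
    "protein_b": [0.0, 1.0, 0.0],
    "protein_c": [0.9, 0.1, 0.0],
}

closest = nearest_protein([1.0, 0.2, 0.0], toy_embeddings)
```

In this toy setting a nearest-neighbour lookup stands in for the similarity task; an actual evaluation would compare such rankings against a reference notion of sequence or functional similarity.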


Reijnders MJMF, Waterhouse RM. CrowdGO: Machine learning and semantic similarity guided consensus Gene Ontology annotation. PLoS Comput Biol 2022;18:e1010075. [PMID: 35560159 PMCID: PMC9132264 DOI: 10.1371/journal.pcbi.1010075] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 07/18/2021] [Revised: 05/25/2022] [Accepted: 04/04/2022] [Indexed: 11/29/2022]

Abstract

Characterising gene function for the ever-increasing number and diversity of species with annotated genomes relies almost entirely on computational prediction methods. These software tools are also numerous and diverse, each with different strengths and weaknesses as revealed through community benchmarking efforts. Meta-predictors that assess consensus and conflict from individual algorithms should deliver enhanced functional annotations. To exploit the benefits of meta-approaches, we developed CrowdGO, an open-source consensus-based Gene Ontology (GO) term meta-predictor that employs machine learning models with GO term semantic similarities and information contents. By re-evaluating each gene-term annotation, a consensus dataset is produced with high-scoring confident annotations and low-scoring rejected annotations. Applying CrowdGO to results from one deep learning-based, one sequence similarity-based, and two protein domain-based methods delivers consensus annotations with improved precision and recall. Furthermore, using standard evaluation measures, CrowdGO performance matches that of the community’s best performing individual methods. CrowdGO therefore offers a model-informed approach to leverage strengths of individual predictors and produce comprehensive and accurate gene functional annotations.

New technologies mean that we are able to read the genetic blueprints in the form of complete genome sequences from many different species. We are also able to use computational methods combined with evidence from experiments to map out the locations in the genomes of many thousands of genes and other important regions. However, discovering and characterising the biological functions of all these genes and their protein products requires considerably more experimental work. To gain insights into the possible functions of the many genes currently lacking functional information from experiments, we must therefore rely on methods that computationally predict protein functions. Many different software tools have been developed to tackle this challenge, each with its own strengths and weaknesses as shown by several community-based competitions that assess the performance of the predictors. Taking advantage of powerful modern machine learning techniques, we developed CrowdGO, a new software tool that aims to combine predictions from several tools and produce comprehensive and accurate gene functional annotations. CrowdGO is able to computationally assess agreements and conflicts amongst annotations from different predictors to then re-evaluate the results and deliver enhanced predictions of protein functions.
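The abstract mentions GO term information contents without defining them. A common definition is IC(t) = -log2 p(t), where p(t) is the fraction of annotated genes carrying term t. The sketch below illustrates that definition only; it is not CrowdGO's implementation, the term and gene names are hypothetical placeholders, and it ignores the propagation of annotations to ancestor terms that a real GO analysis would apply.

```python
import math
from collections import Counter


def information_content(annotations):
    """Map each term to -log2 of the fraction of genes annotated with it.

    `annotations` maps a gene name to an iterable of term identifiers.
    Rare terms get a high score, terms shared by every gene get zero.
    """
    n_genes = len(annotations)
    # Count each term at most once per gene.
    counts = Counter(term for terms in annotations.values() for term in set(terms))
    return {term: -math.log2(count / n_genes) for term, count in counts.items()}


# Hypothetical toy annotations, invented for illustration only.
toy_annotations = {
    "gene1": ["TERM_A", "TERM_B"],
    "gene2": ["TERM_A"],
    "gene3": ["TERM_A", "TERM_C"],
    "gene4": ["TERM_A", "TERM_B"],
}

ic = information_content(toy_annotations)
```

Here the term carried by all four genes has an information content of zero, while the term seen in a single gene scores highest, which is why such scores are useful for weighting specific annotations above generic ones.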


Törönen P, Holm L. PANNZER-A practical tool for protein function prediction. Protein Sci 2022;31:118-128. [PMID: 34562305 PMCID: PMC8740830 DOI: 10.1002/pro.4193] [Citation(s) in RCA: 40] [Impact Index Per Article: 20.0] [Received: 08/03/2021] [Revised: 09/22/2021] [Accepted: 09/22/2021] [Indexed: 01/03/2023]

Torres M, Yang H, Romero AE, Paccanaro A. Protein function prediction for newly sequenced organisms. Nat Mach Intell 2021. [DOI: 10.1038/s42256-021-00419-7] [Citation(s) in RCA: 1] [Impact Index Per Article: 0.3] [Indexed: 12/30/2022]

RNA_seq and quantitative proteomic analysis of Dictyostelium knock-out cells lacking the core autophagy proteins ATG9 and/or ATG16. BMC Genomics 2021;22:444. [PMID: 34126926 PMCID: PMC8204557 DOI: 10.1186/s12864-021-07756-2] [Citation(s) in RCA: 3] [Impact Index Per Article: 1.0] [Received: 02/22/2021] [Accepted: 05/26/2021] [Indexed: 01/09/2023]

Abstract

BACKGROUND

Autophagy is an evolutionarily ancient mechanism that sequesters substrates for degradation within autolysosomes. The process is driven by many autophagy-related (ATG) proteins, including the core members ATG9 and ATG16. However, the functions of these two core ATG proteins still need further elucidation. Here, we applied RNA_seq and tandem mass tag (TMT) proteomic approaches to identify differentially expressed genes (DEGs) and proteins (DEPs) in Dictyostelium discoideum ATG9⁻, ATG16⁻ and ATG9⁻/16⁻ strains in comparison to AX2 wild-type cells.

RESULTS

In total, we identified 332 (279 up and 53 down), 639 (487 up and 152 down) and 260 (114 up and 146 down) DEGs and 124 (83 up and 41 down), 431 (238 up and 193 down) and 677 (347 up and 330 down) DEPs in ATG9⁻, ATG16⁻ and ATG9⁻/16⁻ strains, respectively. Thus, in the single knock-out strains, the number of DEGs was higher than the number of DEPs, while in the double knock-out strain the number of DEPs was higher. Comparison of RNA_seq and proteomic data further revealed that only a small proportion of the transcriptional changes were reflected at the protein level. Gene ontology (GO) analysis revealed an enrichment of DEPs involved in lipid metabolism and oxidative phosphorylation. Furthermore, we found increased expression of the anti-oxidant enzymes glutathione reductase (gsr) and catalase A (catA) in ATG16⁻ and ATG9⁻/16⁻ cells, respectively, indicating adaptation to excess reactive oxygen species (ROS).

CONCLUSIONS

Our study provides the first combined transcriptome and proteome analysis of ATG9⁻, ATG16⁻ and ATG9⁻/16⁻ cells. Our results suggest that most changes in protein abundance were not caused by transcriptional changes but were instead due to changes in protein homeostasis. In particular, knock-out of atg9 and/or atg16 appears to cause dysregulation of lipid metabolism and oxidative phosphorylation.


MacLean F. Knowledge graphs and their applications in drug discovery. Expert Opin Drug Discov 2021;16:1057-1069. [PMID: 33843398 DOI: 10.1080/17460441.2021.1910673] [Citation(s) in RCA: 26] [Impact Index Per Article: 8.7] [Indexed: 01/09/2023]