Reference Citation Analysis: Find an Article, Find a Category, Find a Journal, Find a Scholar

For: Rembeza E, Engqvist MKM. Experimental and computational investigation of enzyme functional annotations uncovers misannotation in the EC 1.1.3.15 enzyme class. PLoS Comput Biol 2021;17:e1009446. [PMID: 34555022 DOI: 10.1371/journal.pcbi.1009446] [Citation(s) in RCA: 7] [Impact Index Per Article: 2.3] [Received: 03/17/2021] [Revised: 10/05/2021] [Accepted: 09/13/2021] [Indexed: 12/12/2022]


Cited by Other Article(s)

Price MN, Arkin AP. Interactive tools for functional annotation of bacterial genomes. Database (Oxford) 2024;2024:baae089. [PMID: 39241109 PMCID: PMC11378808 DOI: 10.1093/database/baae089] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 05/08/2024] [Revised: 07/29/2024] [Accepted: 08/09/2024] [Indexed: 09/08/2024]

de Crécy-Lagard V, Dias R, Friedberg I, Yuan Y, Swairjo MA. Limitations of Current Machine-Learning Models in Predicting Enzymatic Functions for Uncharacterized Proteins. bioRxiv 2024:2024.07.01.601547. [PMID: 39005379 PMCID: PMC11244979 DOI: 10.1101/2024.07.01.601547] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 07/16/2024]

Hou S, Kang Z, Liu Y, Lü C, Wang X, Wang Q, Ma C, Xu P, Gao C. An enzymic l-2-hydroxyglutarate biosensor based on l-2-hydroxyglutarate dehydrogenase from Azoarcus olearius. Biosens Bioelectron 2024;243:115740. [PMID: 37862756 DOI: 10.1016/j.bios.2023.115740] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 06/19/2023] [Revised: 09/21/2023] [Accepted: 10/03/2023] [Indexed: 10/22/2023]

de Oliveira SG, Kotowski N, Sampaio-Filho HR, Aguiar FHB, Dávila AMR, Jardim R. Metalloproteinases in Restorative Dentistry: An In Silico Study toward an Ideal Animal Model. Biomedicines 2023;11:3042. [PMID: 38002041 PMCID: PMC10669239 DOI: 10.3390/biomedicines11113042] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 07/09/2023] [Revised: 09/02/2023] [Accepted: 09/13/2023] [Indexed: 11/26/2023]

Vezina B, Watts SC, Hawkey J, Cooper HB, Judd LM, Jenney AWJ, Monk JM, Holt KE, Wyres KL. Bactabolize is a tool for high-throughput generation of bacterial strain-specific metabolic models. eLife 2023;12:RP87406. [PMID: 37815531 PMCID: PMC10564454 DOI: 10.7554/elife.87406] [Citation(s) in RCA: 2] [Impact Index Per Article: 2.0] [Indexed: 10/11/2023]

Davidson RB, Coletti M, Gao M, Piatkowski B, Sreedasyam A, Quadir F, Weston DJ, Schmutz J, Cheng J, Skolnick J, Parks JM, Sedova A. Predicted structural proteome of Sphagnum divinum and proteome-scale annotation. Bioinformatics 2023;39:btad511. [PMID: 37589594 PMCID: PMC10463551 DOI: 10.1093/bioinformatics/btad511] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 06/06/2023] [Revised: 08/01/2023] [Accepted: 08/16/2023] [Indexed: 08/18/2023]

Oberg N, Zallot R, Gerlt JA. EFI-EST, EFI-GNT, and EFI-CGFP: Enzyme Function Initiative (EFI) Web Resource for Genomic Enzymology Tools. J Mol Biol 2023;435:168018. [PMID: 37356897 PMCID: PMC10291204 DOI: 10.1016/j.jmb.2023.168018] [Citation(s) in RCA: 79] [Impact Index Per Article: 79.0] [Received: 11/23/2022] [Revised: 02/04/2023] [Accepted: 02/13/2023] [Indexed: 02/19/2023]

Kroll A, Ranjan S, Engqvist MKM, Lercher MJ. A general model to predict small molecule substrates of enzymes based on machine and deep learning. Nat Commun 2023;14:2787. [PMID: 37188731 DOI: 10.1038/s41467-023-38347-2] [Citation(s) in RCA: 16] [Impact Index Per Article: 16.0] [Received: 06/24/2022] [Accepted: 04/21/2023] [Indexed: 05/17/2023]

Vasina M, Kovar D, Damborsky J, Ding Y, Yang T, deMello A, Mazurenko S, Stavrakis S, Prokop Z. In-depth analysis of biocatalysts by microfluidics: An emerging source of data for machine learning. Biotechnol Adv 2023;66:108171. [PMID: 37150331 DOI: 10.1016/j.biotechadv.2023.108171] [Citation(s) in RCA: 4] [Impact Index Per Article: 4.0] [Received: 01/24/2023] [Revised: 05/04/2023] [Accepted: 05/04/2023] [Indexed: 05/09/2023]

Kress A, Poch O, Lecompte O, Thompson JD. Real or fake? Measuring the impact of protein annotation errors on estimates of domain gain and loss events. Frontiers in Bioinformatics 2023;3:1178926. [PMID: 37151482 PMCID: PMC10158824 DOI: 10.3389/fbinf.2023.1178926] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 03/03/2023] [Accepted: 04/05/2023] [Indexed: 05/09/2023]

Abstract

Protein annotation errors can have significant consequences in a wide range of fields, ranging from protein structure and function prediction to biomedical research, drug discovery, and biotechnology. By comparing the domains of different proteins, scientists can identify common domains, classify proteins based on their domain architecture, and highlight proteins that have evolved differently in one or more species or clades. However, genome-wide identification of different protein domain architectures involves a complex error-prone pipeline that includes genome sequencing, prediction of gene exon/intron structures, and inference of protein sequences and domain annotations. Here we developed an automated fact-checking approach to distinguish true domain loss/gain events from false events caused by errors that occur during the annotation process. Using genome-wide ortholog sets and taking advantage of the high-quality human and Saccharomyces cerevisiae genome annotations, we analyzed the domain gain and loss events in the predicted proteomes of 9 non-human primates (NHP) and 20 non-S. cerevisiae fungi (NSF) as annotated in the UniProt and InterPro databases. Our approach allowed us to quantify the impact of errors on estimates of protein domain gains and losses, and we show that domain losses are over-estimated ten-fold and three-fold in the NHP and NSF proteins respectively. This is in line with previous studies of gene-level losses, where issues with genome sequencing or gene annotation led to genes being falsely inferred as absent. In addition, we show that inconsistent protein domain annotations are a major factor contributing to the false events. For the first time, to our knowledge, we show that domain gains are also over-estimated by three-fold and two-fold respectively in NHP and NSF proteins. Based on our more accurate estimates, we infer that true domain losses and gains in NHP with respect to humans are observed at similar rates, while domain gains in the more divergent NSF are observed twice as frequently as domain losses with respect to S. cerevisiae. This study highlights the need to critically examine the scientific validity of protein annotations, and represents a significant step toward scalable computational fact-checking methods that may one day mitigate the propagation of wrong information in protein databases.

Yokoi Y, Kawabuchi Y, Zulmajdi AA, Tanaka R, Shibata T, Muraoka T, Mori T. Cell-Penetrating Peptide-Peptide Nucleic Acid Conjugates as a Tool for Protein Functional Elucidation in the Native Bacterium. Molecules 2022;27:molecules27248944. [PMID: 36558072 PMCID: PMC9788395 DOI: 10.3390/molecules27248944] [Citation(s) in RCA: 2] [Impact Index Per Article: 1.0] [Received: 10/03/2022] [Revised: 12/12/2022] [Accepted: 12/12/2022] [Indexed: 12/23/2022]

Tsvik L, Steiner B, Herzog P, Haltrich D, Sützl L. Flavin Mononucleotide-Dependent l-Lactate Dehydrogenases: Expanding the Toolbox of Enzymes for l-Lactate Biosensors. ACS Omega 2022;7:41480-41492. [PMID: 36406534 PMCID: PMC9670274 DOI: 10.1021/acsomega.2c05257] [Citation(s) in RCA: 3] [Impact Index Per Article: 1.5] [Received: 08/16/2022] [Accepted: 10/19/2022] [Indexed: 06/16/2023]

Goudey B, Geard N, Verspoor K, Zobel J. Propagation, detection and correction of errors using the sequence database network. Brief Bioinform 2022;23:6764545. [PMID: 36266246 PMCID: PMC9677457 DOI: 10.1093/bib/bbac416] [Citation(s) in RCA: 5] [Impact Index Per Article: 2.5] [Received: 05/10/2022] [Revised: 07/31/2022] [Accepted: 08/28/2022] [Indexed: 12/14/2022]

Abstract

Nucleotide and protein sequences stored in public databases are the cornerstone of many bioinformatics analyses. The records containing these sequences are prone to a wide range of errors, including incorrect functional annotation, sequence contamination and taxonomic misclassification. One source of information that can help to detect errors is the strong interdependency between records. Novel sequences in one database draw their annotations from existing records, may generate new records in multiple other locations and will have varying degrees of similarity with existing records across a range of attributes. A network perspective of these relationships between sequence records, within and across databases, offers new opportunities to detect, or even correct, erroneous entries and more broadly to make inferences about record quality. Here, we describe this novel perspective of sequence database records as a rich network, which we call the sequence database network, and illustrate the opportunities this perspective offers for quantification of database quality and detection of spurious entries. We provide an overview of the relevant databases and describe how the interdependencies between sequence records across these databases can be exploited by network analyses. We review the process of sequence annotation and provide a classification of sources of error, highlighting propagation as a major source. We illustrate the value of a network perspective through three case studies that use network analysis to detect errors, and explore the quality and quantity of critical relationships that would inform such network analyses. This systematic description of a network perspective of sequence database records provides a novel direction to combat the proliferation of errors within these critical bioinformatics resources.

Escudeiro P, Henry CS, Dias RP. Functional characterization of prokaryotic dark matter: the road so far and what lies ahead. Current Research in Microbial Sciences 2022;3:100159. [PMID: 36561390 PMCID: PMC9764257 DOI: 10.1016/j.crmicr.2022.100159] [Citation(s) in RCA: 4] [Impact Index Per Article: 2.0] [Received: 01/11/2022] [Revised: 07/18/2022] [Accepted: 08/05/2022] [Indexed: 12/25/2022]

Controllable protein design with language models. Nat Mach Intell 2022. [DOI: 10.1038/s42256-022-00499-z] [Citation(s) in RCA: 6] [Impact Index Per Article: 3.0] [Indexed: 12/17/2022]

Ilgisonis EV, Pogodin PV, Kiseleva OI, Tarbeeva SN, Ponomarenko EA. Evolution of Protein Functional Annotation: Text Mining Study. J Pers Med 2022;12:jpm12030479. [PMID: 35330478 PMCID: PMC8952229 DOI: 10.3390/jpm12030479] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 02/02/2022] [Revised: 03/07/2022] [Accepted: 03/08/2022] [Indexed: 11/23/2022]

Investigation and Alteration of Organic Acid Synthesis Pathways in the Mammalian Gut Symbiont Bacteroides thetaiotaomicron. Microbiol Spectr 2022;10:e0231221. [PMID: 35196806 PMCID: PMC8865466 DOI: 10.1128/spectrum.02312-21] [Citation(s) in RCA: 4] [Impact Index Per Article: 2.0] [Indexed: 11/24/2022]

Abstract

Members of the gut-dwelling Bacteroides genus have remarkable abilities in degrading a diverse set of fiber polysaccharide structures, most of which are found in the mammalian diet. As part of their metabolism, they convert these fibers to organic acids that can in turn provide energy to their host. While many studies have identified and characterized the genes and corresponding proteins involved in polysaccharide degradation, relatively little is known about Bacteroides genes involved in downstream metabolic pathways. Bacteroides thetaiotaomicron is one of the most studied species from the genus and is representative of this group in producing multiple organic acids as part of its metabolism. We focused here on several organic acid synthesis pathways in B. thetaiotaomicron, including those involved in formate, lactate, propionate, and acetate production. We identified potential genes involved in each pathway and characterized these through gene deletions coupled to growth assays and organic acid quantification. In addition, we developed and employed a Golden Gate-compatible plasmid system to simplify alteration of native gene expression levels. Our work both validates and contradicts previous bioinformatic gene annotations, and we develop a model on which to base future efforts. A clearer understanding of Bacteroides metabolic pathways can inform and facilitate efforts to employ these bacteria for improved human health or other utilization strategies.

IMPORTANCE Both humans and animals host a large community of bacteria and other microorganisms in their gastrointestinal tracts. This community breaks down dietary fiber and produces organic acids that are used as an energy source by the body and can also help the host resist infection by various pathogens. While the Bacteroides genus is one of the most common in the gut microbiota, it is only distantly related to bacteria with well-characterized metabolic pathways and it is therefore unclear whether research insights on organic acid production in those species can also be directly applied to the Bacteroides. By investigating multiple genetic pathways for organic acid production in Bacteroides thetaiotaomicron, we provide a basis for deeper understanding of these pathways. The work further enables greater understanding of Bacteroides–host relationships, as well as inter-species relationships in the microbiota, which are of importance for both human and animal gut health.

Rembeza E, Boverio A, Fraaije MW, Engqvist MKM. Discovery of Two Novel Oxidases Using a High-Throughput Activity Screen. Chembiochem 2022;23:e202100510. [PMID: 34709726 PMCID: PMC9299179 DOI: 10.1002/cbic.202100510] [Citation(s) in RCA: 1] [Impact Index Per Article: 0.5] [Received: 09/24/2021] [Revised: 10/27/2021] [Indexed: 12/17/2022]