Reference Citation Analysis: Find an Article, Find a Category, Find a Journal, Find a Scholar

For: Hanson AD, Pribat A, Waller JC, de Crécy-Lagard V. 'Unknown' proteins and 'orphan' enzymes: the missing half of the engineering parts list--and how to find it. Biochem J 2009;425:1-11. [PMID: 20001958 DOI: 10.1042/BJ20091328] [Citation(s) in RCA: 135] [Impact Index Per Article: 9.0] [Reference Citation Analysis] [What about the content of this article? (0)] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Indexed: 11/17/2022]

Number

Cited by Other Article(s)

101

Trichler SA, Bulla SC, Mahajan N, Lunsford KV, Pendarvis K, Nanduri B, McCarthy FM, Bulla C. Identification of canine platelet proteins separated by differential detergent fractionation for nonelectrophoretic proteomics analyzed by Gene Ontology and pathways analysis. VETERINARY MEDICINE-RESEARCH AND REPORTS 2014;5:1-9. [PMID: 32670841 PMCID: PMC7337207 DOI: 10.2147/vmrr.s47127] [Citation(s) in RCA: 3] [Impact Index Per Article: 0.3] [Reference Citation Analysis] [Abstract] [Key Words] [Track Full Text] [Download PDF] [Figures] [Subscribe] [Scholar Register] [Received: 03/05/2014] [Accepted: 04/23/2014] [Indexed: 01/20/2023]

102

de Crécy-Lagard V. Variations in metabolic pathways create challenges for automated metabolic reconstructions: Examples from the tetrahydrofolate synthesis pathway. Comput Struct Biotechnol J 2014;10:41-50. [PMID: 25210598 PMCID: PMC4151868 DOI: 10.1016/j.csbj.2014.05.008] [Citation(s) in RCA: 23] [Impact Index Per Article: 2.3] [Reference Citation Analysis] [Abstract] [Key Words] [Track Full Text] [Download PDF] [Figures] [Journal Information] [Subscribe] [Scholar Register] [Indexed: 11/03/2022] Open

103

Sorokina M, Stam M, Médigue C, Lespinet O, Vallenet D. Profiling the orphan enzymes. Biol Direct 2014;9:10. [PMID: 24906382 PMCID: PMC4084501 DOI: 10.1186/1745-6150-9-10] [Citation(s) in RCA: 31] [Impact Index Per Article: 3.1] [Reference Citation Analysis] [Abstract] [Track Full Text] [Download PDF] [Figures] [Journal Information] [Subscribe] [Scholar Register] [Received: 03/27/2014] [Accepted: 05/29/2014] [Indexed: 11/10/2022] Open

Abstract

The emergence of Next Generation Sequencing generates an incredible amount of sequence and great potential for new enzyme discovery. Despite this huge amount of data and the profusion of bioinformatic methods for function prediction, a large part of known enzyme activities is still lacking an associated protein sequence. These particular activities are called "orphan enzymes". The present review proposes an update of previous surveys on orphan enzymes by mining the current content of public databases. While the percentage of orphan enzyme activities has decreased from 38% to 22% in ten years, there are still more than 1,000 orphans among the 5,000 entries of the Enzyme Commission (EC) classification. Taking into account all the reactions present in metabolic databases, this proportion dramatically increases to reach nearly 50% of orphans and many of them are not associated with a known pathway. We extended our survey to "local orphan enzymes", which are activities that have no representative sequence in a given clade, but have at least one in organisms belonging to other clades. We observe an important bias in Archaea and find that in general more than 30% of the EC activities have incomplete sequence information in at least one superkingdom. To estimate if candidate proteins for local orphans could be retrieved by homology search, we applied a simple strategy based on the PRIAM software and noticed that candidates may be proposed for an important fraction of local orphan enzymes. Finally, by studying the relation between protein domains and catalyzed activities, it appears that newly discovered enzymes are mostly associated with already known enzyme domains. Thus, the exploration of the promiscuity and the multifunctional aspect of known enzyme families may solve part of the orphan enzyme issue. We conclude this review with a presentation of recent initiatives in finding proteins for orphan enzymes and in extending the enzyme world by the discovery of new activities.

104

Islam MA, Waller AS, Hug LA, Provart NJ, Edwards EA, Mahadevan R. New insights into Dehalococcoides mccartyi metabolism from a reconstructed metabolic network-based systems-level analysis of D. mccartyi transcriptomes. PLoS One 2014;9:e94808. [PMID: 24733489 PMCID: PMC3986231 DOI: 10.1371/journal.pone.0094808] [Citation(s) in RCA: 13] [Impact Index Per Article: 1.3] [Reference Citation Analysis] [Abstract] [MESH Headings] [Track Full Text] [Download PDF] [Figures] [Journal Information] [Subscribe] [Scholar Register] [Received: 11/27/2013] [Accepted: 03/19/2014] [Indexed: 12/16/2022] Open

Abstract

Organohalide respiration, mediated by Dehalococcoides mccartyi, is a useful bioremediation process that transforms ground water pollutants and known human carcinogens such as trichloroethene and vinyl chloride into benign ethenes. Successful application of this process depends on the fundamental understanding of the respiration and metabolism of D. mccartyi. Reductive dehalogenases, encoded by rdhA genes of these anaerobic bacteria, exclusively catalyze organohalide respiration and drive metabolism. To better elucidate D. mccartyi metabolism and physiology, we analyzed available transcriptomic data for a pure isolate (Dehalococcoides mccartyi strain 195) and a mixed microbial consortium (KB-1) using the previously developed pan-genome-scale reconstructed metabolic network of D. mccartyi. The transcriptomic data, together with available proteomic data, helped confirm transcription and expression of the majority of genes in D. mccartyi genomes. A composite genome of two highly similar D. mccartyi strains (KB-1 Dhc) from the KB-1 metagenome sequence was constructed, and operon prediction was conducted for this composite genome and other single genomes. This operon analysis, together with the quality threshold clustering analysis of transcriptomic data, helped generate experimentally testable hypotheses regarding the function of a number of hypothetical proteins and the poorly understood mechanism of energy conservation in D. mccartyi. We also identified functionally enriched important clusters (13 for strain 195 and 11 for KB-1 Dhc) of co-expressed metabolic genes using information from the reconstructed metabolic network. This analysis highlighted some metabolic genes and processes, including lipid metabolism, energy metabolism, and transport that potentially play important roles in organohalide respiration. Overall, this study shows the importance of an organism's metabolic reconstruction in analyzing various "omics" data to obtain improved understanding of the metabolism and physiology of the organism.

105

Siller M, Goyal S, Yoshimoto FK, Xiao Y, Wei S, Guengerich FP. Oxidation of endogenous N-arachidonoylserotonin by human cytochrome P450 2U1. J Biol Chem 2014;289:10476-10487. [PMID: 24563460 DOI: 10.1074/jbc.m114.550004] [Citation(s) in RCA: 20] [Impact Index Per Article: 2.0] [Reference Citation Analysis] [Abstract] [Key Words] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Indexed: 11/06/2022] Open

106

de la Tour CB, Passot FM, Toueille M, Mirabella B, Guérin P, Blanchard L, Servant P, de Groot A, Sommer S, Armengaud J. Comparative proteomics reveals key proteins recruited at the nucleoid of Deinococcus after irradiation-induced DNA damage. Proteomics 2013;13:3457-69. [PMID: 24307635 DOI: 10.1002/pmic.201300249] [Citation(s) in RCA: 29] [Impact Index Per Article: 2.6] [Reference Citation Analysis] [Abstract] [Key Words] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Received: 06/26/2013] [Revised: 10/19/2013] [Accepted: 10/23/2013] [Indexed: 11/09/2022]

107

Hwang WC, Bakolitsa C, Punta M, Coggill PC, Bateman A, Axelrod HL, Rawlings ND, Sedova M, Peterson SN, Eberhardt RY, Aravind L, Pascual J, Godzik A. LUD, a new protein domain associated with lactate utilization. BMC Bioinformatics 2013;14:341. [PMID: 24274019 PMCID: PMC3924224 DOI: 10.1186/1471-2105-14-341] [Citation(s) in RCA: 6] [Impact Index Per Article: 0.5] [Reference Citation Analysis] [Abstract] [Track Full Text] [Download PDF] [Figures] [Journal Information] [Subscribe] [Scholar Register] [Received: 07/03/2013] [Accepted: 11/19/2013] [Indexed: 11/24/2022] Open

108

Revealing the hidden functional diversity of an enzyme family. Nat Chem Biol 2013;10:42-9. [DOI: 10.1038/nchembio.1387] [Citation(s) in RCA: 81] [Impact Index Per Article: 7.4] [Reference Citation Analysis] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Received: 07/12/2013] [Accepted: 10/02/2013] [Indexed: 11/08/2022]

109

Tchigvintsev A, Tchigvintsev D, Flick R, Popovic A, Dong A, Xu X, Brown G, Lu W, Wu H, Cui H, Dombrowski L, Joo JC, Beloglazova N, Min J, Savchenko A, Caudy AA, Rabinowitz JD, Murzin AG, Yakunin AF. Biochemical and structural studies of conserved Maf proteins revealed nucleotide pyrophosphatases with a preference for modified nucleotides. Chem Biol 2013;20:1386-98. [PMID: 24210219 PMCID: PMC3899018 DOI: 10.1016/j.chembiol.2013.09.011] [Citation(s) in RCA: 10] [Impact Index Per Article: 0.9] [Reference Citation Analysis] [Abstract] [Figures] [Journal Information] [Subscribe] [Scholar Register] [Received: 05/24/2013] [Revised: 09/06/2013] [Accepted: 09/13/2013] [Indexed: 11/17/2022]

110

Baran R, Ivanova NN, Jose N, Garcia-Pichel F, Kyrpides NC, Gugger M, Northen TR. Functional genomics of novel secondary metabolites from diverse cyanobacteria using untargeted metabolomics. Mar Drugs 2013;11:3617-31. [PMID: 24084783 PMCID: PMC3826126 DOI: 10.3390/md11103617] [Citation(s) in RCA: 44] [Impact Index Per Article: 4.0] [Reference Citation Analysis] [Abstract] [MESH Headings] [Track Full Text] [Download PDF] [Figures] [Journal Information] [Subscribe] [Scholar Register] [Received: 08/15/2013] [Revised: 08/21/2013] [Accepted: 09/09/2013] [Indexed: 12/22/2022] Open

111

Semi-automated curation of metabolic models via flux balance analysis: a case study with Mycoplasma gallisepticum. PLoS Comput Biol 2013;9:e1003208. [PMID: 24039564 PMCID: PMC3764002 DOI: 10.1371/journal.pcbi.1003208] [Citation(s) in RCA: 14] [Impact Index Per Article: 1.3] [Reference Citation Analysis] [Abstract] [Track Full Text] [Download PDF] [Figures] [Journal Information] [Subscribe] [Scholar Register] [Received: 04/27/2012] [Accepted: 07/19/2013] [Indexed: 11/19/2022] Open

Abstract

Primarily used for metabolic engineering and synthetic biology, genome-scale metabolic modeling shows tremendous potential as a tool for fundamental research and curation of metabolism. Through a novel integration of flux balance analysis and genetic algorithms, a strategy to curate metabolic networks and facilitate identification of metabolic pathways that may not be directly inferable solely from genome annotation was developed. Specifically, metabolites involved in unknown reactions can be determined, and potentially erroneous pathways can be identified. The procedure developed allows for new fundamental insight into metabolism, as well as acting as a semi-automated curation methodology for genome-scale metabolic modeling. To validate the methodology, a genome-scale metabolic model for the bacterium Mycoplasma gallisepticum was created. Several reactions not predicted by the genome annotation were postulated and validated via the literature. The model predicted an average growth rate of 0.358±0.12, closely matching the experimentally determined growth rate of M. gallisepticum of 0.244±0.03. This work presents a powerful algorithm for facilitating the identification and curation of previously known and new metabolic pathways, as well as presenting the first genome-scale reconstruction of M. gallisepticum.

Flux balance analysis (FBA) is a powerful approach for genome-scale metabolic modeling. It provides metabolic engineers with a tool for manipulating, predicting, and optimizing metabolism for biotechnological and biomedical purposes. However, we posit that it can also be used as a tool for fundamental research in understanding and curating metabolic networks. Specifically, by using a genetic algorithm integrated with FBA, we developed a curation approach to identify missing reactions, incomplete reactions, and erroneous reactions. Additionally, it was possible to take advantage of the ensemble information from the genetic algorithm to identify the most critical reactions for curation. We tested our strategy using Mycoplasma gallisepticum as our model organism. Using the genome annotation as the basis, the preliminary genome-scale metabolic model consisted of 446 metabolites involved in 380 reactions. Carrying out our analysis, we found over 80 incorrect reactions and 16 missing reactions. Based upon the guidance of the algorithm, we were able to curate and resolve all discrepancies. The model predicted an average bacterial growth rate of 0.358±0.12 h⁻¹ compared to the experimentally observed 0.244±0.03 h⁻¹. Thus, our approach facilitated the curation of a genome-scale metabolic network and generated a high quality metabolic model.

112

A novel predicted calcium-regulated kinase family implicated in neurological disorders. PLoS One 2013;8:e66427. [PMID: 23840464 PMCID: PMC3696010 DOI: 10.1371/journal.pone.0066427] [Citation(s) in RCA: 33] [Impact Index Per Article: 3.0] [Reference Citation Analysis] [Abstract] [Track Full Text] [Download PDF] [Figures] [Journal Information] [Subscribe] [Scholar Register] [Received: 12/20/2012] [Accepted: 05/08/2013] [Indexed: 12/03/2022] Open

113

Larhlimi A, Basler G, Grimbs S, Selbig J, Nikoloski Z. Stoichiometric capacitance reveals the theoretical capabilities of metabolic networks. Bioinformatics 2013;28:i502-i508. [PMID: 22962473 PMCID: PMC3436808 DOI: 10.1093/bioinformatics/bts381] [Citation(s) in RCA: 7] [Impact Index Per Article: 0.6] [Reference Citation Analysis] [Abstract] [Track Full Text] [Download PDF] [Figures] [Journal Information] [Subscribe] [Scholar Register] [Indexed: 11/23/2022]

114

Van Schaftingen E, Rzem R, Marbaix A, Collard F, Veiga-da-Cunha M, Linster CL. Metabolite proofreading, a neglected aspect of intermediary metabolism. J Inherit Metab Dis 2013;36:427-34. [PMID: 23296366 DOI: 10.1007/s10545-012-9571-1] [Citation(s) in RCA: 56] [Impact Index Per Article: 5.1] [Reference Citation Analysis] [Abstract] [MESH Headings] [Track Full Text] [Journal Information] [Submit a Manuscript] [Subscribe] [Scholar Register] [Received: 11/06/2012] [Revised: 11/26/2012] [Accepted: 11/29/2012] [Indexed: 10/27/2022]

115

Comparative genomics approaches to understanding and manipulating plant metabolism. Curr Opin Biotechnol 2013;24:278-84. [DOI: 10.1016/j.copbio.2012.07.005] [Citation(s) in RCA: 20] [Impact Index Per Article: 1.8] [Reference Citation Analysis] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Received: 07/18/2012] [Revised: 07/29/2012] [Accepted: 07/30/2012] [Indexed: 12/11/2022]

116

Inferring the metabolism of human orphan metabolites from their metabolic network context affirms human gluconokinase activity. Biochem J 2013;449:427-35. [PMID: 23067238 DOI: 10.1042/bj20120980] [Citation(s) in RCA: 18] [Impact Index Per Article: 1.6] [Reference Citation Analysis] [Abstract] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Indexed: 11/17/2022]

117

Baran R, Bowen BP, Price MN, Arkin AP, Deutschbauer AM, Northen TR. Metabolic footprinting of mutant libraries to map metabolite utilization to genotype. ACS Chem Biol 2013;8:189-99. [PMID: 23082955 DOI: 10.1021/cb300477w] [Citation(s) in RCA: 26] [Impact Index Per Article: 2.4] [Reference Citation Analysis] [Abstract] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Indexed: 01/07/2023]

118

Blais EM, Chavali AK, Papin JA. Linking genome-scale metabolic modeling and genome annotation. Methods Mol Biol 2013;985:61-83. [PMID: 23417799 DOI: 10.1007/978-1-62703-299-5_4] [Citation(s) in RCA: 7] [Impact Index Per Article: 0.6] [Reference Citation Analysis] [Abstract] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Indexed: 12/13/2022]

119

Chen TW, Gan RCR, Wu TH, Huang PJ, Lee CY, Chen YYM, Chen CC, Tang P. FastAnnotator--an efficient transcript annotation web tool. BMC Genomics 2012;13 Suppl 7:S9. [PMID: 23281853 PMCID: PMC3521244 DOI: 10.1186/1471-2164-13-s7-s9] [Citation(s) in RCA: 49] [Impact Index Per Article: 4.1] [Reference Citation Analysis] [Abstract] [Track Full Text] [Download PDF] [Figures] [Journal Information] [Subscribe] [Scholar Register] [Indexed: 11/10/2022] Open

Abstract

BACKGROUND

Recent developments in high-throughput sequencing (HTS) technologies have made it feasible to sequence the complete transcriptomes of non-model organisms or metatranscriptomes from environmental samples. The challenge after generating hundreds of millions of sequences is to annotate these transcripts and classify the transcripts based on their putative functions. Because many biological scientists lack the knowledge to install Linux-based software packages or maintain databases used for transcript annotation, we developed an automatic annotation tool with an easy-to-use interface.

METHODS

To elucidate the potential functions of gene transcripts, we integrated well-established annotation tools: Blast2GO, PRIAM and RPS BLAST in a web-based service, FastAnnotator, which can assign Gene Ontology (GO) terms, Enzyme Commission numbers (EC numbers) and functional domains to query sequences.

RESULTS

Using six transcriptome sequence datasets as examples, we demonstrated the ability of FastAnnotator to assign functional annotations. FastAnnotator annotated 88.1% and 81.3% of the transcripts from the well-studied organisms Caenorhabditis elegans and Streptococcus parasanguinis, respectively. Furthermore, FastAnnotator annotated 62.9%, 20.4%, 53.1% and 42.0% of the sequences from the transcriptomes of sweet potato, clam, amoeba, and Trichomonas vaginalis, respectively, which lack reference genomes. We demonstrated that FastAnnotator can complete the annotation process in a reasonable amount of time and is suitable for the annotation of transcriptomes from model organisms or organisms for which annotated reference genomes are not available.

CONCLUSIONS

The sequencing process no longer represents the bottleneck in the study of genomics, and automatic annotation tools have become invaluable as the annotation procedure has become the limiting step. We present FastAnnotator, an automated annotation web tool designed to efficiently annotate sequences with their gene functions, enzyme functions or domains. FastAnnotator is useful in transcriptome studies, especially those focusing on non-model organisms or metatranscriptomes. FastAnnotator does not require local installation and is freely available at http://fastannotator.cgu.edu.tw.

120

Crécy-Lagard VD, Phillips G, Grochowski LL, Yacoubi BE, Jenney F, Adams MWW, Murzin AG, White RH. Comparative genomics guided discovery of two missing archaeal enzyme families involved in the biosynthesis of the pterin moiety of tetrahydromethanopterin and tetrahydrofolate. ACS Chem Biol 2012;7:1807-16. [PMID: 22931285 PMCID: PMC3500442 DOI: 10.1021/cb300342u] [Citation(s) in RCA: 18] [Impact Index Per Article: 1.5] [Reference Citation Analysis] [Abstract] [Track Full Text] [Download PDF] [Figures] [Journal Information] [Subscribe] [Scholar Register] [Indexed: 12/03/2022]

121

Structural analysis of hypothetical proteins from Helicobacter pylori: an approach to estimate functions of unknown or hypothetical proteins. Int J Mol Sci 2012;13:7109-7137. [PMID: 22837682 PMCID: PMC3397514 DOI: 10.3390/ijms13067109] [Citation(s) in RCA: 15] [Impact Index Per Article: 1.3] [Reference Citation Analysis] [Abstract] [Key Words] [Track Full Text] [Download PDF] [Figures] [Journal Information] [Subscribe] [Scholar Register] [Received: 03/09/2012] [Revised: 05/29/2012] [Accepted: 06/01/2012] [Indexed: 12/12/2022] Open

122

Prediction and identification of sequences coding for orphan enzymes using genomic and metagenomic neighbours. Mol Syst Biol 2012;8:581. [PMID: 22569339 PMCID: PMC3377989 DOI: 10.1038/msb.2012.13] [Citation(s) in RCA: 26] [Impact Index Per Article: 2.2] [Reference Citation Analysis] [Abstract] [Track Full Text] [Download PDF] [Figures] [Journal Information] [Subscribe] [Scholar Register] [Received: 12/21/2011] [Accepted: 03/24/2012] [Indexed: 11/09/2022] Open

Abstract

Many characterized metabolic enzymes currently lack associated gene and protein sequences. Here, pathway and genomic neighbour data are used to assign genes to these ‘orphan enzymes,’ and the predictions are validated with experimental assays and genome-scale metabolic modelling.

A computational method is developed for assigning candidate sequences to orphan enzymes. The method uses metabolic pathway, genomic neighbourhood, genomic co-occurrence, and protein domain information to predict genes that are likely to perform a particular enzymatic function.

Benchmarking of the scoring scheme based on the 4 features above revealed that some combinations of parameters yielded greater than 70% accuracy, and that high-confidence predictions could be generated for 131 orphan enzymes.

Enzyme assay experiments confirmed the predicted enzymatic activity for two of the high-confidence candidate sequences.

Predicted functions can improve the annotation of genomic and metagenomic data, and can reveal putative genes for enzymes with potential biotechnological applications.

Incorporating the predicted enzymatic reactions into genome-scale metabolic models changed the flux connectivity and improved their ability to correctly predict gene essentiality, supporting the biological relevance of these predictions.

Despite the current wealth of sequencing data, one-third of all biochemically characterized metabolic enzymes lack a corresponding gene or protein sequence, and as such can be considered orphan enzymes. They represent a major gap between our molecular and biochemical knowledge, and consequently are not amenable to modern systemic analyses. As 555 of these orphan enzymes have metabolic pathway neighbours, we developed a global framework that utilizes the pathway and (meta)genomic neighbour information to assign candidate sequences to orphan enzymes. For 131 orphan enzymes (37% of those for which (meta)genomic neighbours are available), we associate sequences to them using scoring parameters with an estimated accuracy of 70%, implying functional annotation of 16 345 gene sequences in numerous (meta)genomes. As a case in point, two of these candidate sequences were experimentally validated to encode the predicted activity. In addition, we augmented the currently available genome-scale metabolic models with these new sequence–function associations and were able to expand the models by on average 8%, with a considerable change in the flux connectivity patterns and improved essentiality prediction.

123

Basler G, Grimbs S, Nikoloski Z. Optimizing metabolic pathways by screening for feasible synthetic reactions. Biosystems 2012;109:186-91. [PMID: 22575307 DOI: 10.1016/j.biosystems.2012.04.007] [Citation(s) in RCA: 7] [Impact Index Per Article: 0.6] [Reference Citation Analysis] [Abstract] [MESH Headings] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Received: 09/16/2011] [Revised: 03/23/2012] [Accepted: 04/23/2012] [Indexed: 11/18/2022]

Abstract

BACKGROUND

Reconstruction of genome-scale metabolic networks has resulted in models capable of reproducing experimentally observed biomass yield/growth rates and predicting the effect of alterations in metabolism for biotechnological applications. The existing studies rely on modifying the metabolic network of an investigated organism by removing or inserting reactions taken either from evolutionarily similar organisms or from databases of biochemical reactions (e.g., KEGG). A potential disadvantage of these knowledge-driven approaches is that the result is biased towards known reactions, as such approaches do not account for the possibility of including novel enzymes, together with the reactions they catalyze.

RESULTS

Here, we explore the alternative of increasing biomass yield in three model organisms, namely Bacillus subtilis, Escherichia coli, and Hordeum vulgare, by applying small, chemically feasible network modifications. We use the predicted and experimentally confirmed growth rates of the wild-type networks as reference values and determine the effect of inserting mass-balanced, thermodynamically feasible reactions on predictions of growth rate by using flux balance analysis.

CONCLUSIONS

While many replacements of existing reactions naturally lead to a decrease or complete loss of biomass production ability, in all three investigated organisms we find feasible modifications which facilitate a significant increase in this biological function. We focus on modifications with feasible chemical properties and a significant increase in biomass yield. The results demonstrate that small modifications are sufficient to substantially alter biomass yield in the three organisms. The method can be used to predict the effect of targeted modifications on the yield of any set of metabolites (e.g., ethanol), thus providing a computational framework for synthetic metabolic engineering.

124

Delavat F, Phalip V, Forster A, Plewniak F, Lett MC, Lièvremont D. Amylases without known homologues discovered in an acid mine drainage: significance and impact. Sci Rep 2012;2:354. [PMID: 22482035 PMCID: PMC3319935 DOI: 10.1038/srep00354] [Citation(s) in RCA: 18] [Impact Index Per Article: 1.5] [Reference Citation Analysis] [Abstract] [Track Full Text] [Download PDF] [Figures] [Journal Information] [Subscribe] [Scholar Register] [Received: 12/20/2011] [Accepted: 03/08/2012] [Indexed: 12/25/2022] Open

125

Seaver SMD, Henry CS, Hanson AD. Frontiers in metabolic reconstruction and modeling of plant genomes. JOURNAL OF EXPERIMENTAL BOTANY 2012;63:2247-58. [PMID: 22238452 DOI: 10.1093/jxb/err371] [Citation(s) in RCA: 44] [Impact Index Per Article: 3.7] [Reference Citation Analysis] [Abstract] [MESH Headings] [Track Full Text] [Subscribe] [Scholar Register] [Indexed: 05/20/2023]

126

Fernie AR, Stitt M. On the discordance of metabolomics with proteomics and transcriptomics: coping with increasing complexity in logic, chemistry, and network interactions scientific correspondence. PLANT PHYSIOLOGY 2012;158:1139-45. [PMID: 22253257 PMCID: PMC3291261 DOI: 10.1104/pp.112.193235] [Citation(s) in RCA: 136] [Impact Index Per Article: 11.3] [Reference Citation Analysis] [MESH Headings] [Track Full Text] [Subscribe] [Scholar Register] [Indexed: 05/14/2023]

127

Belin P, Moutiez M, Lautru S, Seguin J, Pernodet JL, Gondry M. The nonribosomal synthesis of diketopiperazines in tRNA-dependent cyclodipeptide synthase pathways. Nat Prod Rep 2012;29:961-79. [DOI: 10.1039/c2np20010d] [Citation(s) in RCA: 113] [Impact Index Per Article: 9.4] [Reference Citation Analysis] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Indexed: 12/31/2022]

128

Galeazzi L, Bocci P, Amici A, Brunetti L, Ruggieri S, Romine M, Reed S, Osterman AL, Rodionov DA, Sorci L, Raffaelli N. Identification of nicotinamide mononucleotide deamidase of the bacterial pyridine nucleotide cycle reveals a novel broadly conserved amidohydrolase family. J Biol Chem 2011;286:40365-75. [PMID: 21953451 PMCID: PMC3220592 DOI: 10.1074/jbc.m111.275818] [Citation(s) in RCA: 45] [Impact Index Per Article: 3.5] [Reference Citation Analysis] [Abstract] [Key Words] [MESH Headings] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Received: 06/24/2011] [Revised: 08/29/2011] [Indexed: 11/06/2022] Open

129

Galperin MY, Koonin EV. Divergence and convergence in enzyme evolution. J Biol Chem 2011;287:21-28. [PMID: 22069324 PMCID: PMC3249071 DOI: 10.1074/jbc.r111.241976] [Citation(s) in RCA: 125] [Impact Index Per Article: 9.6] [Reference Citation Analysis] [Abstract] [Track Full Text] [Download PDF] [Figures] [Journal Information] [Subscribe] [Scholar Register] [Indexed: 12/02/2022] Open

130

von Saint Paul V, Zhang W, Kanawati B, Geist B, Faus-Keßler T, Schmitt-Kopplin P, Schäffner AR. The Arabidopsis glucosyltransferase UGT76B1 conjugates isoleucic acid and modulates plant defense and senescence. THE PLANT CELL 2011;23:4124-45. [PMID: 22080599 PMCID: PMC3246326 DOI: 10.1105/tpc.111.088443] [Citation(s) in RCA: 130] [Impact Index Per Article: 10.0] [Reference Citation Analysis] [Abstract] [MESH Headings] [Track Full Text] [Subscribe] [Scholar Register] [Received: 06/30/2011] [Revised: 09/30/2011] [Accepted: 10/24/2011] [Indexed: 05/18/2023]

131

Rolfsson O, Palsson BØ, Thiele I. The human metabolic reconstruction Recon 1 directs hypotheses of novel human metabolic functions. BMC SYSTEMS BIOLOGY 2011;5:155. [PMID: 21962087 PMCID: PMC3224382 DOI: 10.1186/1752-0509-5-155] [Citation(s) in RCA: 55] [Impact Index Per Article: 4.2] [Reference Citation Analysis] [Abstract] [Track Full Text] [Download PDF] [Figures] [Subscribe] [Scholar Register] [Received: 03/29/2011] [Accepted: 10/01/2011] [Indexed: 11/29/2022]

132

Pribat A, Blaby IK, Lara-Núñez A, Jeanguenin L, Fouquet R, Frelin O, Gregory JF, Philmus B, Begley TP, de Crécy-Lagard V, Hanson AD. A 5-formyltetrahydrofolate cycloligase paralog from all domains of life: comparative genomic and experimental evidence for a cryptic role in thiamin metabolism. Funct Integr Genomics 2011;11:467-78. [PMID: 21538139 PMCID: PMC6078417 DOI: 10.1007/s10142-011-0224-5] [Citation(s) in RCA: 16] [Impact Index Per Article: 1.2] [Reference Citation Analysis] [Abstract] [Key Words] [MESH Headings] [Grants] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Received: 01/14/2011] [Revised: 03/19/2011] [Accepted: 04/03/2011] [Indexed: 12/18/2022]

133

Guengerich FP, Cheng Q. Orphans in the human cytochrome P450 superfamily: approaches to discovering functions and relevance in pharmacology. Pharmacol Rev 2011;63:684-99. [PMID: 21737533 DOI: 10.1124/pr.110.003525] [Citation(s) in RCA: 70] [Impact Index Per Article: 5.4] [Reference Citation Analysis] [Abstract] [Journal Information] [Subscribe] [Scholar Register] [Indexed: 12/21/2022] Open

134

Boynton TO, Gerdes S, Craven SH, Neidle EL, Phillips JD, Dailey HA. Discovery of a gene involved in a third bacterial protoporphyrinogen oxidase activity through comparative genomic analysis and functional complementation. Appl Environ Microbiol 2011;77:4795-801. [PMID: 21642412 PMCID: PMC3147383 DOI: 10.1128/aem.00171-11] [Citation(s) in RCA: 25] [Impact Index Per Article: 1.9] [Reference Citation Analysis] [Abstract] [MESH Headings] [Grants] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Received: 01/25/2011] [Accepted: 05/20/2011] [Indexed: 11/20/2022] Open

135

Gerdes S, El Yacoubi B, Bailly M, Blaby IK, Blaby-Haas CE, Jeanguenin L, Lara-Núñez A, Pribat A, Waller JC, Wilke A, Overbeek R, Hanson AD, de Crécy-Lagard V. Synergistic use of plant-prokaryote comparative genomics for functional annotations. BMC Genomics 2011;12 Suppl 1:S2. [PMID: 21810204 PMCID: PMC3223725 DOI: 10.1186/1471-2164-12-s1-s2] [Citation(s) in RCA: 25] [Impact Index Per Article: 1.9] [Reference Citation Analysis] [Abstract] [Track Full Text] [Download PDF] [Figures] [Journal Information] [Subscribe] [Scholar Register] [Indexed: 11/10/2022] Open

Abstract

BACKGROUND

Identifying functions for all gene products in all sequenced organisms is a central challenge of the post-genomic era. However, at least 30-50% of the proteins encoded by any given genome are of unknown or vaguely known function, and a large number are wrongly annotated. Many of these 'unknown' proteins are common to prokaryotes and plants. We set out to predict and experimentally test the functions of such proteins. Our approach to functional prediction integrates comparative genomics based mainly on microbial genomes with functional genomic data from model microorganisms and post-genomic data from plants. This approach bridges the gap between automated homology-based annotations and the classical gene discovery efforts of experimentalists, and is more powerful than purely computational approaches to identifying gene-function associations.

RESULTS

Among Arabidopsis genes, we focused on those (2,325 in total) that (i) are unique or belong to families with no more than three members, (ii) occur in prokaryotes, and (iii) have unknown or poorly known functions. Computer-assisted selection of promising targets for deeper analysis was based on homology-independent characteristics associated in the SEED database with the prokaryotic members of each family. In-depth comparative genomic analysis was performed for 360 top candidate families. From this pool, 78 families were connected to general areas of metabolism and, of these families, specific functional predictions were made for 41. Twenty-one predicted functions have been experimentally tested or are currently under investigation by our group in at least one prokaryotic organism (nine of them have been validated, four invalidated, and eight are in progress). Ten additional predictions have been independently validated by other groups. Discovering the function of very widespread but hitherto enigmatic proteins such as the YrdC or YgfZ families illustrates the power of our approach.

CONCLUSIONS

Our approach correctly predicted functions for 19 uncharacterized protein families from plants and prokaryotes; none of these functions had previously been correctly predicted by computational methods. The resulting annotations could be propagated with confidence to over six thousand homologous proteins encoded in over 900 bacterial, archaeal, and eukaryotic genomes currently available in public databases.

136

Salamanca-Pinzón SG, Guengerich FP. A tricistronic human adrenodoxin reductase-adrenodoxin-cytochrome P450 27A1 vector system for substrate hydroxylation in Escherichia coli. Protein Expr Purif 2011;79:231-6. [PMID: 21621619 DOI: 10.1016/j.pep.2011.05.008] [Citation(s) in RCA: 12] [Impact Index Per Article: 0.9] [Received: 03/26/2011] [Revised: 05/11/2011] [Accepted: 05/12/2011] [Indexed: 01/01/2023]

137

Hanson AD, Gregory JF. Folate biosynthesis, turnover, and transport in plants. Annu Rev Plant Biol 2011;62:105-25. [PMID: 21275646 DOI: 10.1146/annurev-arplant-042110-103819] [Citation(s) in RCA: 145] [Impact Index Per Article: 11.2] [Indexed: 05/02/2023]

138

Jeanguenin L, Lara-Núñez A, Pribat A, Mageroy MH, Gregory JF, Rice KC, de Crécy-Lagard V, Hanson AD. Moonlighting glutamate formiminotransferases can functionally replace 5-formyltetrahydrofolate cycloligase. J Biol Chem 2010;285:41557-66. [PMID: 20952389 DOI: 10.1074/jbc.m110.190504] [Citation(s) in RCA: 25] [Impact Index Per Article: 1.8] [Indexed: 01/05/2023]

139

Galperin MY, Koonin EV. From complete genome sequence to 'complete' understanding? Trends Biotechnol 2010;28:398-406. [PMID: 20647113 PMCID: PMC3065831 DOI: 10.1016/j.tibtech.2010.05.006] [Citation(s) in RCA: 119] [Impact Index Per Article: 8.5] [Received: 03/29/2010] [Revised: 05/18/2010] [Accepted: 05/28/2010] [Indexed: 12/29/2022]

140

Guengerich FP, Tang Z, Salamanca-Pinzón SG, Cheng Q. Characterizing proteins of unknown function: orphan cytochrome p450 enzymes as a paradigm. Mol Interv 2010;10:153-63. [PMID: 20539034 PMCID: PMC2895278 DOI: 10.1124/mi.10.3.6] [Citation(s) in RCA: 32] [Impact Index Per Article: 2.3] [Indexed: 12/25/2022]

141

Guengerich FP, Tang Z, Cheng Q, Salamanca-Pinzón SG. Approaches to deorphanization of human and microbial cytochrome P450 enzymes. Biochim Biophys Acta 2010;1814:139-45. [PMID: 20493973 DOI: 10.1016/j.bbapap.2010.05.005] [Citation(s) in RCA: 24] [Impact Index Per Article: 1.7] [Received: 02/13/2010] [Revised: 04/30/2010] [Accepted: 05/09/2010] [Indexed: 12/30/2022]