Reference Citation Analysis: Find an Article, Find a Category, Find a Journal, Find a Scholar

For: von Mering C, Zdobnov EM, Tsoka S, Ciccarelli FD, Pereira-Leal JB, Ouzounis CA, Bork P. Genome evolution reveals biochemical networks and functional modules. Proc Natl Acad Sci U S A 2003;100:15428-33. [PMID: 14673105 PMCID: PMC307584 DOI: 10.1073/pnas.2136809100] [Citation(s) in RCA: 120] [Impact Index Per Article: 5.7] [Indexed: 12/25/2022] Open

Cited by Other Article(s)

Computational Network Inference for Bacterial Interactomics. mSystems 2022;7:e0145621. [PMID: 35353009 PMCID: PMC9040873 DOI: 10.1128/msystems.01456-21] [Citation(s) in RCA: 4] [Impact Index Per Article: 2.0] [Indexed: 11/29/2022] Open

Sahoo A, Pechmann S. Functional network motifs defined through integration of protein-protein and genetic interactions. PeerJ 2022;10:e13016. [PMID: 35223214 PMCID: PMC8877332 DOI: 10.7717/peerj.13016] [Citation(s) in RCA: 2] [Impact Index Per Article: 1.0] [Received: 11/11/2021] [Accepted: 02/06/2022] [Indexed: 01/11/2023] Open

Harrison BR, Hoffman JM, Samuelson A, Raftery D, Promislow DEL. Modular Evolution of the Drosophila Metabolome. Mol Biol Evol 2022;39:msab307. [PMID: 34662414 PMCID: PMC8760934 DOI: 10.1093/molbev/msab307] [Citation(s) in RCA: 3] [Impact Index Per Article: 1.5] [Indexed: 01/04/2023] Open

OUP accepted manuscript. Brief Funct Genomics 2022;21:243-269. [DOI: 10.1093/bfgp/elac007] [Citation(s) in RCA: 2] [Impact Index Per Article: 1.0] [Received: 11/02/2021] [Revised: 03/17/2022] [Accepted: 03/18/2022] [Indexed: 11/14/2022] Open

James K, Olson PD. The tapeworm interactome: inferring confidence scored protein-protein interactions from the proteome of Hymenolepis microstoma. BMC Genomics 2020;21:346. [PMID: 32380953 PMCID: PMC7204028 DOI: 10.1186/s12864-020-6710-1] [Citation(s) in RCA: 2] [Impact Index Per Article: 0.5] [Received: 06/17/2019] [Accepted: 03/30/2020] [Indexed: 12/14/2022] Open

Abstract

Background

Reference genome and transcriptome assemblies of helminths have reached a level of completion whereby secondary analyses that rely on accurate gene estimation or syntenic relationships can be now conducted with a high level of confidence. Recent public release of the v.3 assembly of the mouse bile-duct tapeworm, Hymenolepis microstoma, provides chromosome-level characterisation of the genome and a stabilised set of protein coding gene models underpinned by bioinformatic and empirical data. However, interactome data have not been produced. Conserved protein-protein interactions in other organisms, termed interologs, can be used to transfer interactions between species, allowing systems-level analysis in non-model organisms.
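The interolog idea described above can be illustrated with a minimal sketch. It assumes a hypothetical ortholog mapping and hypothetical protein identifiers, and it omits the confidence scoring that the cited study applies; it is not the authors' pipeline.

```python
from itertools import product


def transfer_interologs(source_interactions, orthologs):
    """Predict target-species interactions from source-species interactions.

    source_interactions: iterable of (protein_a, protein_b) pairs observed
        in a source (model) species.
    orthologs: dict mapping a source protein to the set of its orthologs in
        the target species (hypothetical mapping, for illustration only).
    Returns a set of frozensets, each holding one predicted interacting pair.
    """
    predicted = set()
    for a, b in source_interactions:
        # Every ortholog of a is paired with every ortholog of b.
        for ta, tb in product(orthologs.get(a, ()), orthologs.get(b, ())):
            if ta != tb:  # self-pairs are skipped in this sketch
                predicted.add(frozenset((ta, tb)))
    return predicted


# Hypothetical identifiers, not taken from any database.
yeast_ppi = [("YeastA", "YeastB"), ("YeastB", "YeastC")]
ortholog_map = {"YeastA": {"Hm1"}, "YeastB": {"Hm2", "Hm3"}}
predicted = transfer_interologs(yeast_ppi, ortholog_map)
```

Because YeastC has no ortholog in the mapping, the second source interaction contributes no predictions; only pairs whose both partners are conserved are transferred.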

Results

Here, we describe a probabilistic, integrated network of interologs for the H. microstoma proteome, based on conserved protein interactions found in eukaryote model species. Almost a third of the 10,139 gene models in the v.3 assembly could be assigned interaction data and assessment of the resulting network indicates that topologically-important proteins are related to essential cellular pathways, and that the network clusters into biologically meaningful components. Moreover, network parameters are similar to those of single-species interaction networks that we constructed in the same way for S. cerevisiae, C. elegans and H. sapiens, demonstrating that information-rich, system-level analyses can be conducted even on species separated by a large phylogenetic distance from the major model organisms on which most protein interaction evidence is based. Using the interolog network, we then focused on sub-networks of interactions assigned to discrete suites of genes of interest, including signalling components and transcription factors, germline multipotency genes, and genes differentially-expressed between larval and adult worms. Results show not only an expected bias toward highly-conserved proteins, such as components of intracellular signal transduction, but in some cases predicted interactions with transcription factors that aid in identifying their target genes.

Conclusions

With key helminth genomes now complete, systems-level analyses can provide an important predictive framework to guide basic and applied research on helminths and will become increasingly informative as new protein-protein interaction data accumulate.

Kaznadzey A, Shelyakin P, Belousova E, Eremina A, Shvyreva U, Bykova D, Emelianenko V, Korosteleva A, Tutukina M, Gelfand MS. The genes of the sulphoquinovose catabolism in Escherichia coli are also associated with a previously unknown pathway of lactose degradation. Sci Rep 2018;8:3177. [PMID: 29453395 PMCID: PMC5816610 DOI: 10.1038/s41598-018-21534-3] [Citation(s) in RCA: 7] [Impact Index Per Article: 1.2] [Received: 10/04/2017] [Accepted: 02/06/2018] [Indexed: 12/29/2022] Open

Kaznadzey A, Shelyakin P, Gelfand MS. Sugar Lego: gene composition of bacterial carbohydrate metabolism genomic loci. Biol Direct 2017;12:28. [PMID: 29178959 PMCID: PMC5702140 DOI: 10.1186/s13062-017-0200-7] [Citation(s) in RCA: 3] [Impact Index Per Article: 0.4] [Received: 08/15/2017] [Accepted: 11/20/2017] [Indexed: 11/25/2022] Open

Abstract

Background

Bacterial carbohydrate metabolism is extremely diverse, since carbohydrates serve as a major energy source and are involved in a variety of cellular processes. Bacterial genes belonging to the same metabolic pathway are often co-localized in the chromosome, but this is not a strict rule. Gene co-localization is linked to co-evolution and co-regulation. This study focuses on a large-scale analysis of bacterial genomic loci related to carbohydrate metabolism.

Results

We demonstrate that only 53% of 148,000 studied genes from over six hundred bacterial genomes are co-localized in bacterial genomes with other carbohydrate metabolism genes, which points to a significant role of singleton genes. Co-localized genes form cassettes, ranging in size from two to fifteen genes. Two major factors influencing the cassette-forming tendency are gene function and bacterial phylogeny. We have obtained a comprehensive picture of co-localization preferences of genes for nineteen major carbohydrate metabolism functional classes, over two hundred gene orthologous clusters, and thirty bacterial classes, and characterized the cassette variety in size and content among different species, highlighting a significant role of short cassettes. The preference towards co-localization of carbohydrate metabolism genes varies between 40 and 76% for bacterial taxa. Analysis of frequently co-localized genes yielded forty-five significant pairwise links between genes belonging to different functional classes. The number of such links per class ranges from zero to eight, demonstrating varying preferences of respective genes towards a specific chromosomal neighborhood. Genes from eleven functional classes tend to co-localize with genes from the same class, indicating an important role of clustering of genes with similar functions. Moreover, in most cases such co-localization does not originate from local duplication events.

Conclusions

Overall, we describe a complex web formed by evolutionary relationships of bacterial carbohydrate metabolism genes, manifested as co-localization patterns.

Reviewers

This article was reviewed by Daria V. Dibrova (A.N. Belozersky Institute of Physico-Chemical Biology, Lomonosov Moscow State University, Moscow, Russia), nominated by Armen Mulkidjanian (University of Osnabrück, Germany), Igor Rogozin (NCBI, NLM, NIH, USA) and Yuri Wolf (NCBI, NLM, NIH, USA).

Electronic supplementary material

The online version of this article (10.1186/s13062-017-0200-7) contains supplementary material, which is available to authorized users.

Chaiboonchoe A, Ghamsari L, Dohai B, Ng P, Khraiwesh B, Jaiswal A, Jijakli K, Koussa J, Nelson DR, Cai H, Yang X, Chang RL, Papin J, Yu H, Balaji S, Salehi-Ashtiani K. Systems level analysis of the Chlamydomonas reinhardtii metabolic network reveals variability in evolutionary co-conservation. Mol Biosyst 2017;12:2394-407. [PMID: 27357594 DOI: 10.1039/c6mb00237d] [Citation(s) in RCA: 9] [Impact Index Per Article: 1.3] [Indexed: 12/18/2022]

Abstract

Metabolic networks, which are mathematical representations of organismal metabolism, are reconstructed to provide computational platforms to guide metabolic engineering experiments and explore fundamental questions on metabolism. Systems level analyses, such as interrogation of phylogenetic relationships within the network, can provide further guidance on the modification of metabolic circuitries. Chlamydomonas reinhardtii, a biofuel relevant green alga that has retained key genes with plant, animal, and protist affinities, serves as an ideal model organism to investigate the interplay between gene function and phylogenetic affinities at multiple organizational levels. Here, using detailed topological and functional analyses, coupled with transcriptomics studies on a metabolic network that we have reconstructed for C. reinhardtii, we show that network connectivity has a significant concordance with the co-conservation of genes; however, a distinction between topological and functional relationships is observable within the network. Dynamic and static modes of co-conservation were defined and observed in a subset of gene-pairs across the network topologically. In contrast, genes with predicted synthetic interactions, or genes involved in coupled reactions, show significant enrichment for both shorter and longer phylogenetic distances. Based on our results, we propose that the metabolic network of C. reinhardtii is assembled with an architecture to minimize phylogenetic profile distances topologically, while it includes an expansion of such distances for functionally interacting genes. This arrangement may increase the robustness of C. reinhardtii's network in dealing with varied environmental challenges that the species may face. The defined evolutionary constraints within the network, which identify important pairings of genes in metabolism, may offer guidance on synthetic biology approaches to optimize the production of desirable metabolites.

Affiliation(s)

Amphun Chaiboonchoe Laboratory of Algal, Systems, and Synthetic Biology, Division of Science and Math, New York University Abu Dhabi and Center for Genomics and Systems Biology (CGSB), New York University Abu Dhabi Institute, Abu Dhabi, UAE.
Lila Ghamsari Center for Cancer Systems Biology (CCSB) and Department of Cancer Biology, Dana-Farber Cancer Institute, and Department of Genetics, Harvard Medical School, Boston, MA, USA
Bushra Dohai Laboratory of Algal, Systems, and Synthetic Biology, Division of Science and Math, New York University Abu Dhabi and Center for Genomics and Systems Biology (CGSB), New York University Abu Dhabi Institute, Abu Dhabi, UAE.
Patrick Ng Department of Biological Statistics and Computational Biology and Weill Institute for Cell and Molecular Biology, Cornell University, Ithaca, NY, USA.
Basel Khraiwesh Laboratory of Algal, Systems, and Synthetic Biology, Division of Science and Math, New York University Abu Dhabi and Center for Genomics and Systems Biology (CGSB), New York University Abu Dhabi Institute, Abu Dhabi, UAE.
Ashish Jaiswal Laboratory of Algal, Systems, and Synthetic Biology, Division of Science and Math, New York University Abu Dhabi and Center for Genomics and Systems Biology (CGSB), New York University Abu Dhabi Institute, Abu Dhabi, UAE.
Kenan Jijakli Laboratory of Algal, Systems, and Synthetic Biology, Division of Science and Math, New York University Abu Dhabi and Center for Genomics and Systems Biology (CGSB), New York University Abu Dhabi Institute, Abu Dhabi, UAE.
Joseph Koussa Laboratory of Algal, Systems, and Synthetic Biology, Division of Science and Math, New York University Abu Dhabi and Center for Genomics and Systems Biology (CGSB), New York University Abu Dhabi Institute, Abu Dhabi, UAE.
David R Nelson Laboratory of Algal, Systems, and Synthetic Biology, Division of Science and Math, New York University Abu Dhabi and Center for Genomics and Systems Biology (CGSB), New York University Abu Dhabi Institute, Abu Dhabi, UAE.
Hong Cai Laboratory of Algal, Systems, and Synthetic Biology, Division of Science and Math, New York University Abu Dhabi and Center for Genomics and Systems Biology (CGSB), New York University Abu Dhabi Institute, Abu Dhabi, UAE.
Xinping Yang Center for Cancer Systems Biology (CCSB) and Department of Cancer Biology, Dana-Farber Cancer Institute, and Department of Genetics, Harvard Medical School, Boston, MA, USA
Roger L Chang Department of Systems Biology, Harvard Medical School, Boston, MA, USA
Jason Papin Department of Biomedical Engineering, University of Virginia, Charlottesville, VA, USA.
Haiyuan Yu Department of Biological Statistics and Computational Biology and Weill Institute for Cell and Molecular Biology, Cornell University, Ithaca, NY, USA.
Santhanam Balaji Laboratory of Algal, Systems, and Synthetic Biology, Division of Science and Math, New York University Abu Dhabi and Center for Genomics and Systems Biology (CGSB), New York University Abu Dhabi Institute, Abu Dhabi, UAE. and Center for Cancer Systems Biology (CCSB) and Department of Cancer Biology, Dana-Farber Cancer Institute, and Department of Genetics, Harvard Medical School, Boston, MA, USA and MRC Laboratory of Molecular Biology, Cambridge, UK.
Kourosh Salehi-Ashtiani Laboratory of Algal, Systems, and Synthetic Biology, Division of Science and Math, New York University Abu Dhabi and Center for Genomics and Systems Biology (CGSB), New York University Abu Dhabi Institute, Abu Dhabi, UAE. and Center for Cancer Systems Biology (CCSB) and Department of Cancer Biology, Dana-Farber Cancer Institute, and Department of Genetics, Harvard Medical School, Boston, MA, USA

Poot-Hernandez AC, Rodriguez-Vazquez K, Perez-Rueda E. The alignment of enzymatic steps reveals similar metabolic pathways and probable recruitment events in Gammaproteobacteria. BMC Genomics 2015;16:957. [PMID: 26578309 PMCID: PMC4647829 DOI: 10.1186/s12864-015-2113-0] [Citation(s) in RCA: 5] [Impact Index Per Article: 0.6] [Received: 02/03/2015] [Accepted: 10/19/2015] [Indexed: 11/29/2022] Open

Abstract

Background

It is generally accepted that gene duplication followed by functional divergence is one of the main sources of metabolic diversity. In this regard, there is an increasing interest in the development of methods that allow the systematic identification of these evolutionary events in metabolism. Here, we used a method not based on biomolecular sequence analysis to compare and identify common and variable routes in the metabolism of 40 Gammaproteobacteria species.

Method

The metabolic maps deposited in the KEGG database were transformed into linear Enzymatic Step Sequences (ESS) by using the breadth-first search algorithm. These ESS represent subsequent enzymes linked to each other, where their catalytic activities are encoded in the Enzyme Commission numbers. The ESS were compared in an all-against-all (pairwise comparisons) approach by using a dynamic programming algorithm, leaving only a set of significant pairs.
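The two steps named in the Method paragraph (breadth-first linearization and dynamic-programming comparison) can be sketched as follows. This is an illustration under stated assumptions, not the authors' implementation: the toy pathway graphs, the EC-level similarity function `ec_similarity`, and the gap penalty are all hypothetical choices, and the alignment shown is a standard Needleman-Wunsch global alignment.

```python
from collections import deque


def bfs_step_sequence(graph, start):
    """Return the nodes (EC numbers) of a pathway graph in breadth-first order."""
    seen = {start}
    order = []
    queue = deque([start])
    while queue:
        node = queue.popleft()
        order.append(node)
        for nxt in graph.get(node, ()):
            if nxt not in seen:
                seen.add(nxt)
                queue.append(nxt)
    return order


def ec_similarity(ec_a, ec_b):
    """Fraction of leading EC levels shared (assumed scoring: 0, 0.25, ..., 1)."""
    shared = 0
    for x, y in zip(ec_a.split("."), ec_b.split(".")):
        if x != y:
            break
        shared += 1
    return shared / 4


def align_score(seq_a, seq_b, gap=-0.5):
    """Global alignment score of two step sequences (Needleman-Wunsch)."""
    rows, cols = len(seq_a) + 1, len(seq_b) + 1
    score = [[0.0] * cols for _ in range(rows)]
    for i in range(1, rows):
        score[i][0] = i * gap
    for j in range(1, cols):
        score[0][j] = j * gap
    for i in range(1, rows):
        for j in range(1, cols):
            score[i][j] = max(
                score[i - 1][j - 1] + ec_similarity(seq_a[i - 1], seq_b[j - 1]),
                score[i - 1][j] + gap,  # gap in seq_b
                score[i][j - 1] + gap,  # gap in seq_a
            )
    return score[-1][-1]


# Toy pathways: each key is an enzymatic step, each value its successors.
pathway_a = {"2.7.1.1": ["5.3.1.9"], "5.3.1.9": ["2.7.1.11"]}
pathway_b = {"2.7.1.2": ["5.3.1.9"], "5.3.1.9": ["2.7.1.11"]}
seq_a = bfs_step_sequence(pathway_a, "2.7.1.1")
seq_b = bfs_step_sequence(pathway_b, "2.7.1.2")
score = align_score(seq_a, seq_b)
```

Scoring by shared EC levels lets two enzymes from the same subclass count as a partial match, which is one plausible way to compare steps without using sequence data; a significance filter over all pairwise scores would follow in a full analysis.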

Results and conclusion

From these comparisons, we identified a set of functionally conserved enzymatic steps in different metabolic maps, in which cell wall components and fatty acid and lysine biosynthesis were included. In addition, we found that pathways associated with biosynthesis share a higher proportion of similar ESS than degradation pathways and secondary metabolism pathways. Also, maps associated with the metabolism of similar compounds contain a high proportion of similar ESS, such as those maps from nucleotide metabolism pathways, in particular the inosine monophosphate pathway. Furthermore, diverse ESS associated with the low part of the glycolysis pathway were identified as functionally similar to multiple metabolic pathways. In summary, our comparisons may help to identify similar reactions in different metabolic pathways and could reinforce the patchwork model in the evolution of metabolism in Gammaproteobacteria.

Electronic supplementary material

The online version of this article (doi:10.1186/s12864-015-2113-0) contains supplementary material, which is available to authorized users.

Induction of the Sugar-Phosphate Stress Response Allows Saccharomyces cerevisiae 2-Methyl-4-Amino-5-Hydroxymethylpyrimidine Phosphate Synthase To Function in Salmonella enterica. J Bacteriol 2015;197:3554-62. [PMID: 26324451 DOI: 10.1128/jb.00576-15] [Citation(s) in RCA: 3] [Impact Index Per Article: 0.3] [Received: 07/14/2015] [Accepted: 08/25/2015] [Indexed: 11/20/2022] Open

Abstract

Thiamine pyrophosphate is a required cofactor for all forms of life. The pyrimidine moiety of thiamine, 2-methyl-4-amino-5-hydroxymethylpyrimidine phosphate (HMP-P), is synthesized by different mechanisms in bacteria and plants compared to fungi. In this study, Salmonella enterica was used as a host to probe requirements for activity of the yeast HMP-P synthase, Thi5p. Thi5p synthesizes HMP-P from histidine and pyridoxal-5-phosphate and was reported to use a backbone histidine as the substrate, which would mean that it was a single-turnover enzyme. Heterologous expression of Thi5p did not complement an S. enterica HMP-P auxotroph during growth with glucose as the sole carbon source. Genetic analyses described here showed that Thi5p was activated in S. enterica by alleles of sgrR that induced the sugar-phosphate stress response. Deletion of ptsG (encodes enzyme IICB [EIICB] of the phosphotransferase system [PTS]) also allowed function of Thi5p and required sgrR but not sgrS. This result suggested that the role of sgrS in activation of Thi5p was to decrease PtsG activity. In total, the data herein supported the hypothesis that one mechanism to activate Thi5p in S. enterica grown on minimal medium containing glucose (minimal glucose medium) required decreased PtsG activity and an unidentified gene regulated by SgrR.

IMPORTANCE

This work describes a metabolic link between the sugar-phosphate stress response and the yeast thiamine biosynthetic enzyme Thi5p when heterologously expressed in Salmonella enterica during growth on minimal glucose medium. Suppressor analysis (i) identified a mutant class of the regulator SgrR that activates the sugar-phosphate stress response constitutively and (ii) determined that Thi5p is conditionally active in S. enterica. These results emphasized the power of genetic systems in model organisms to uncover enzyme function and underlying metabolic network structure.

Park JM, Niestemski LR, Deem MW. Quasispecies theory for evolution of modularity. Phys Rev E Stat Nonlin Soft Matter Phys 2015;91:012714. [PMID: 25679649 PMCID: PMC4477872 DOI: 10.1103/physreve.91.012714] [Citation(s) in RCA: 4] [Impact Index Per Article: 0.4] [Received: 08/01/2014] [Indexed: 06/04/2023]

Chen J, Ding Y, Xu W. Comparative analysis of metabolic networks in mesophilic and thermophilic archaea methanogens based on modularity. J Biol Syst 2013. [DOI: 10.1142/s0218339013500150] [Citation(s) in RCA: 1] [Impact Index Per Article: 0.1] [Indexed: 11/18/2022]

Abstract

Metabolic networks are useful representations of the metabolic capabilities of cells. A comparison of metabolic networks across species is essential to better understand how evolutionary pressures shape these networks. By comparing the set of reactions that are expected to occur in an organism with the set of reactions in reference metabolic pathways, it is possible to infer the main metabolic functions of an organism. In this paper, the metabolic networks of the mesophilic archaeon Methanosarcina acetivorans and the thermophilic archaeon Methanopyrus kandleri have been reconstructed based on the KEGG LIGAND database, followed by four topological statistical analyses of the nodes in the two networks to compare their metabolic networks. The values of average degree and characteristic path length are very small, but the clustering coefficient is relatively large. The results show that the complete metabolic networks of M. acetivorans and M. kandleri possess "small-world" network properties. We then used the Girvan–Newman modular algorithm to identify hub modules and compared hub modules with non-hub modules. The results show that the M. kandleri metabolic network has a better modular organization than the M. acetivorans network. M. acetivorans includes 39 modules, of which 25 are independent and 15 are functionally pure. M. kandleri, on the other hand, includes 30 modules, of which 20 are independent and 14 are functionally pure. These results further indicate that the present approach for identifying modules yields modules that have biologically significant functions. We also identified hub modules of the metabolic networks and found that these hub modules are carbohydrate metabolism and amino acid metabolism. The conclusions obtained from such studies provide a broad overview of the similarities and differences between organisms' metabolic networks. These will be very helpful for further research on the thermostability of methanogens.
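Two of the topological statistics mentioned in this abstract, average degree and clustering coefficient, can be computed with a few lines of code. The sketch below assumes an undirected network stored as a dictionary of neighbor sets and uses a hypothetical four-node toy graph; it does not reproduce the reconstructed archaeal networks or the Girvan–Newman module detection.

```python
def average_degree(adj):
    """Mean number of neighbors per node in an undirected graph."""
    return sum(len(nbrs) for nbrs in adj.values()) / len(adj)


def clustering_coefficient(adj):
    """Mean local clustering coefficient.

    For each node with k >= 2 neighbors, the local value is the number of
    links among its neighbors divided by k * (k - 1) / 2. Nodes with fewer
    than two neighbors contribute 0 to the mean.
    """
    total = 0.0
    for nbrs in adj.values():
        k = len(nbrs)
        if k < 2:
            continue
        nbr_list = list(nbrs)
        links = sum(
            1
            for i in range(k)
            for j in range(i + 1, k)
            if nbr_list[j] in adj[nbr_list[i]]
        )
        total += 2 * links / (k * (k - 1))
    return total / len(adj)


# Toy network: a triangle A-B-C with a pendant node D attached to C.
toy = {
    "A": {"B", "C"},
    "B": {"A", "C"},
    "C": {"A", "B", "D"},
    "D": {"C"},
}
avg_deg = average_degree(toy)
clust = clustering_coefficient(toy)
```

A network is usually called "small-world" when its clustering coefficient is much larger than that of a random graph with the same size and average degree while its characteristic path length stays comparably short, so in practice these values are compared against a randomized baseline.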

Global probabilistic annotation of metabolic networks enables enzyme discovery. Nat Chem Biol 2013;8:848-54. [PMID: 22960854 PMCID: PMC3696893 DOI: 10.1038/nchembio.1063] [Citation(s) in RCA: 48] [Impact Index Per Article: 4.4] [Received: 05/14/2012] [Accepted: 08/07/2012] [Indexed: 11/08/2022]

Enhancing community detection using a network weighting strategy. Inf Sci (N Y) 2013. [DOI: 10.1016/j.ins.2012.08.001] [Citation(s) in RCA: 72] [Impact Index Per Article: 6.5] [Indexed: 11/19/2022]

Muley VY, Ranjan A. Evaluation of physical and functional protein-protein interaction prediction methods for detecting biological pathways. PLoS One 2013;8:e54325. [PMID: 23349851 PMCID: PMC3547882 DOI: 10.1371/journal.pone.0054325] [Citation(s) in RCA: 11] [Impact Index Per Article: 1.0] [Received: 03/07/2012] [Accepted: 12/11/2012] [Indexed: 11/18/2022] Open

Abstract

BACKGROUND

Cellular activities are governed by the physical and the functional interactions among several proteins involved in various biological pathways. With the availability of sequenced genomes and high-throughput experimental data one can identify genome-wide protein-protein interactions using various computational techniques. Comparative assessments of these techniques in predicting protein interactions have been frequently reported in the literature but not their ability to elucidate a particular biological pathway.

METHODS

Towards the goal of understanding the prediction capabilities of interactions among the specific biological pathway proteins, we report the analyses of 14 biological pathways of Escherichia coli catalogued in KEGG database using five protein-protein functional linkage prediction methods. These methods are phylogenetic profiling, gene neighborhood, co-presence of orthologous genes in the same gene clusters, a mirrortree variant, and expression similarity.

CONCLUSIONS

Our results reveal that the prediction of metabolic pathway protein interactions continues to be a challenging task for all methods, which possibly reflects the flexible or independent evolutionary histories of these proteins. These methods have predicted functional associations of proteins involved in amino acids, nucleotide, glycans and vitamins & co-factors pathways slightly better than the random performance on carbohydrate, lipid and energy metabolism. We also make similar observations for interactions among the environmental information processing proteins. By contrast, genetic information processing or specialized processes, such as motility-related protein-protein linkages that occur in a subset of organisms, are predicted with comparable accuracy. Metabolic pathways are best predicted by using the neighborhood of orthologous genes, whereas phyletic pattern is good enough to reconstruct central dogma pathway protein interactions. We have also shown that the effective use of a particular prediction method depends on the pathway under investigation. If one is not focused on a specific pathway, the gene expression similarity method is the best option.

Psomopoulos FE, Mitkas PA, Ouzounis CA. Detection of genomic idiosyncrasies using fuzzy phylogenetic profiles. PLoS One 2013;8:e52854. [PMID: 23341912 PMCID: PMC3544837 DOI: 10.1371/journal.pone.0052854] [Citation(s) in RCA: 15] [Impact Index Per Article: 1.4] [Received: 02/14/2012] [Accepted: 11/22/2012] [Indexed: 11/18/2022] Open

Muley VY, Ranjan A. Effect of reference genome selection on the performance of computational methods for genome-wide protein-protein interaction prediction. PLoS One 2012;7:e42057. [PMID: 22844541 PMCID: PMC3406042 DOI: 10.1371/journal.pone.0042057] [Citation(s) in RCA: 20] [Impact Index Per Article: 1.7] [Received: 12/03/2011] [Accepted: 07/02/2012] [Indexed: 12/20/2022] Open

Abstract

Background

Recent progress in computational methods for predicting physical and functional protein-protein interactions has provided new insights into the complexity of biological processes. Most of these methods assume that functionally interacting proteins are likely to have a shared evolutionary history. This history can be traced out for the protein pairs of a query genome by correlating different evolutionary aspects of their homologs in multiple genomes known as the reference genomes. These methods include phylogenetic profiling, gene neighborhood and co-occurrence of the orthologous protein coding genes in the same cluster or operon. These are collectively known as genomic context methods. On the other hand a method called mirrortree is based on the similarity of phylogenetic trees between two interacting proteins. Comprehensive performance analyses of these methods have been frequently reported in literature. However, very few studies provide insight into the effect of reference genome selection on detection of meaningful protein interactions.
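Phylogenetic profiling, the first of the genomic context methods listed above, can be illustrated with a short sketch. It assumes binary presence/absence profiles over a hypothetical set of five reference genomes and uses the Jaccard index as the similarity measure; the cited studies may use other measures, so this choice is an assumption made for illustration.

```python
def jaccard(profile_a, profile_b):
    """Jaccard similarity of two presence/absence profiles of equal length."""
    both = sum(1 for a, b in zip(profile_a, profile_b) if a and b)
    either = sum(1 for a, b in zip(profile_a, profile_b) if a or b)
    return both / either if either else 0.0


# Hypothetical profiles: 1 means a homolog is present in that reference genome.
profiles = {
    "geneA": (1, 1, 0, 1, 0),
    "geneB": (1, 1, 0, 1, 0),
    "geneC": (0, 1, 1, 0, 1),
}

sim_ab = jaccard(profiles["geneA"], profiles["geneB"])
sim_ac = jaccard(profiles["geneA"], profiles["geneC"])
```

Here geneA and geneB are present and absent in exactly the same genomes, so they would be predicted as functionally linked, while geneA and geneC share only one genome. The choice of reference genomes determines the columns of each profile, which is why the study examines how that selection affects prediction performance.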

Methods

We analyzed the performance of four methods and their variants to understand the effect of reference genome selection on prediction efficacy. We used six sets of reference genomes, sampled in accordance with phylogenetic diversity and relationship between organisms from 565 bacteria. We used Escherichia coli as a model organism and the gold standard datasets of interacting proteins reported in DIP, EcoCyc and KEGG databases to compare the performance of the prediction methods.

Conclusions

Higher performance for predicting protein-protein interactions was achievable even with 100–150 bacterial genomes out of 565 genomes. Inclusion of archaeal genomes in the reference genome set improves performance. We find that in order to obtain a good performance, it is better to sample a few genomes of related genera of prokaryotes from the large number of available genomes. Moreover, such a sampling allows for selecting 50–100 genomes for comparable accuracy of predictions when computational resources are limited.

Doerks T, van Noort V, Minguez P, Bork P. Annotation of the M. tuberculosis hypothetical orfeome: adding functional information to more than half of the uncharacterized proteins. PLoS One 2012;7:e34302. [PMID: 22485162 PMCID: PMC3317503 DOI: 10.1371/journal.pone.0034302] [Citation(s) in RCA: 49] [Impact Index Per Article: 4.1] [Received: 12/06/2011] [Accepted: 02/26/2012] [Indexed: 11/18/2022] Open

Chae L, Lee I, Shin J, Rhee SY. Towards understanding how molecular networks evolve in plants. Curr Opin Plant Biol 2012;15:177-84. [PMID: 22280840 DOI: 10.1016/j.pbi.2012.01.006] [Citation(s) in RCA: 23] [Impact Index Per Article: 1.9] [Received: 10/14/2011] [Revised: 12/20/2011] [Accepted: 01/05/2012] [Indexed: 05/02/2023]

Judson RS, Mortensen HM, Shah I, Knudsen TB, Elloumi F. Using pathway modules as targets for assay development in xenobiotic screening. Mol Biosyst 2012;8:531-42. [DOI: 10.1039/c1mb05303e] [Citation(s) in RCA: 7] [Impact Index Per Article: 0.6] [Indexed: 12/20/2022]

Wang X, Yue J, Ren X, Wang Y, Tan M, Li B, Liang L. Modularity analysis based on predicted protein-protein interactions provides new insights into pathogenicity and cellular process of Escherichia coli O157:H7. Theor Biol Med Model 2011;8:47. [PMID: 22188601 PMCID: PMC3275473 DOI: 10.1186/1742-4682-8-47] [Citation(s) in RCA: 8] [Impact Index Per Article: 0.6] [Received: 10/27/2011] [Accepted: 12/22/2011] [Indexed: 12/19/2022] Open

Brouwers L, Iskar M, Zeller G, van Noort V, Bork P. Network neighbors of drug targets contribute to drug side-effect similarity. PLoS One 2011;6:e22187. [PMID: 21765950 PMCID: PMC3135612 DOI: 10.1371/journal.pone.0022187] [Citation(s) in RCA: 79] [Impact Index Per Article: 6.1] [Reference Citation Analysis] [Abstract] [MESH Headings] [Track Full Text] [Download PDF] [Figures] [Journal Information] [Subscribe] [Scholar Register] [Received: 01/28/2011] [Accepted: 06/19/2011] [Indexed: 12/31/2022] Open

Raes J, Letunic I, Yamada T, Jensen LJ, Bork P. Toward molecular trait-based ecology through integration of biogeochemical, geographical and metagenomic data. Mol Syst Biol 2011;7:473. [PMID: 21407210 PMCID: PMC3094067 DOI: 10.1038/msb.2011.6] [Citation(s) in RCA: 148] [Impact Index Per Article: 11.4] [Reference Citation Analysis] [Abstract] [Track Full Text] [Download PDF] [Figures] [Journal Information] [Subscribe] [Scholar Register] [Received: 05/04/2010] [Accepted: 01/25/2011] [Indexed: 11/10/2022] Open

Abstract

Using metagenomic ‘parts lists’ to study microbial ecology remains a significant challenge. This work proposes a molecular trait-based approach to biogeography by integrating metagenomic data with external metadata and using functional community composition as readout.

Climatic factors drive functional and phylogenetic composition of ocean microbial communities.

Function dispersal is controlled by environmental conditions.

Functional richness has a clear latitudinal gradient and correlates with primary production.

Metagenomic data can be used as a predictor for ecosystem processes.

To understand the relationship between community composition and environment, functional readouts are the most direct. Metagenomic data enable such trait-based ecology at the molecular level.

Metagenomics (shotgun sequencing of pooled DNA of complete microbial communities) is widely used to investigate ecosystem functioning of environmental and clinical samples. However, the nature of these data (usually a gigantic collection of gene fragments from thousands of organisms) makes it very hard to infer global patterns in the microbial ecology of the environment at hand. To address important ecological questions such as ‘How do microbial communities adapt to the environmental conditions?’, ‘What drives the functional variation across the globe and to what extent do genes disperse?’ and ‘What drives variation of CO₂ uptake across different locations and communities?’, we integrated 25 ocean metagenomes from the Global Ocean Sampling project with geographical, meteorological and geophysicochemical data. We find that climatic factors (temperature, sunlight) are the major determinants of the functional and phylogenetic composition of an environment and the main limiting factor on whether functions disperse across the planet. We find a distinct latitudinal gradient in the size and diversity of the functional repertoire of ocean microbial communities, peaking at 20°N, which correlates with oceanic CO₂ uptake. The latter can also be predicted from the molecular functional composition of an environmental sample. Together, our results show that the functional community composition derived from metagenomes can be used as a quantitative predictor for molecular trait-based biogeography and ecology.

Using metagenomic ‘parts lists’ to infer global patterns on microbial ecology remains a significant challenge. To deduce important ecological indicators such as environmental adaptation, molecular trait dispersal, diversity variation and primary production from the gene pool of an ecosystem, we integrated 25 ocean metagenomes with geographical, meteorological and geophysicochemical data. We find that climatic factors (temperature, sunlight) are the major determinants of the biomolecular repertoire of each sample and the main limiting factor on functional trait dispersal (absence of biogeographic provincialism). Molecular functional richness and diversity show a distinct latitudinal gradient peaking at 20°N and correlate with primary production. The latter can also be predicted from the molecular functional composition of an environmental sample. Together, our results show that the functional community composition derived from metagenomes is an important quantitative readout for molecular trait-based biogeography and ecology.

Konietzny SG, Dietz L, McHardy AC. Inferring functional modules of protein families with probabilistic topic models. BMC Bioinformatics 2011;12:141. [PMID: 21554720 PMCID: PMC3098182 DOI: 10.1186/1471-2105-12-141] [Citation(s) in RCA: 8] [Impact Index Per Article: 0.6] [Reference Citation Analysis] [Abstract] [Track Full Text] [Download PDF] [Figures] [Journal Information] [Subscribe] [Scholar Register] [Received: 09/24/2010] [Accepted: 05/09/2011] [Indexed: 01/15/2023] Open

Kumar M, Balaji PV. Comparative genomics analysis of completely sequenced microbial genomes reveals the ubiquity of N-linked glycosylation in prokaryotes. MOLECULAR BIOSYSTEMS 2011;7:1629-45. [PMID: 21387023 DOI: 10.1039/c0mb00259c] [Citation(s) in RCA: 8] [Impact Index Per Article: 0.6] [Reference Citation Analysis] [Abstract] [Track Full Text] [Subscribe] [Scholar Register] [Indexed: 11/21/2022]

Towards an Evolutionary Model of Animal-Associated Microbiomes. ENTROPY 2011. [DOI: 10.3390/e13030570] [Citation(s) in RCA: 39] [Impact Index Per Article: 3.0] [Reference Citation Analysis] [Track Full Text] [Subscribe] [Scholar Register] [Indexed: 12/26/2022]

The emergence of modularity in biological systems. Phys Life Rev 2011;8:129-60. [PMID: 21353651 DOI: 10.1016/j.plrev.2011.02.003] [Citation(s) in RCA: 58] [Impact Index Per Article: 4.5] [Reference Citation Analysis] [Abstract] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Received: 01/05/2011] [Accepted: 02/09/2011] [Indexed: 11/22/2022]

Xu G, Bennett L, Papageorgiou LG, Tsoka S. Module detection in complex networks using integer optimisation. Algorithms Mol Biol 2010;5:36. [PMID: 21073720 PMCID: PMC2993711 DOI: 10.1186/1748-7188-5-36] [Citation(s) in RCA: 20] [Impact Index Per Article: 1.4] [Reference Citation Analysis] [Abstract] [Track Full Text] [Download PDF] [Figures] [Journal Information] [Subscribe] [Scholar Register] [Received: 04/27/2010] [Accepted: 11/12/2010] [Indexed: 11/10/2022] Open

Circulating brain-derived neurotrophic factor and indices of metabolic and cardiovascular health: data from the Baltimore Longitudinal Study of Aging. PLoS One 2010;5:e10099. [PMID: 20404913 PMCID: PMC2852401 DOI: 10.1371/journal.pone.0010099] [Citation(s) in RCA: 144] [Impact Index Per Article: 10.3] [Reference Citation Analysis] [Abstract] [Download PDF] [Figures] [Journal Information] [Subscribe] [Scholar Register] [Received: 09/24/2009] [Accepted: 03/10/2010] [Indexed: 12/31/2022] Open

Kanapin AA, Mulder N, Kuznetsov VA. Projection of gene-protein networks to the functional space of the proteome and its application to analysis of organism complexity. BMC Genomics 2010;11 Suppl 1:S4. [PMID: 20158875 PMCID: PMC2822532 DOI: 10.1186/1471-2164-11-s1-s4] [Citation(s) in RCA: 2] [Impact Index Per Article: 0.1] [Reference Citation Analysis] [Abstract] [Track Full Text] [Figures] [Journal Information] [Subscribe] [Scholar Register] [Indexed: 11/10/2022] Open

Reid AJ, Ranea JA, Orengo CA. Comparative evolutionary analysis of protein complexes in E. coli and yeast. BMC Genomics 2010;11:79. [PMID: 20122144 PMCID: PMC2837643 DOI: 10.1186/1471-2164-11-79] [Citation(s) in RCA: 16] [Impact Index Per Article: 1.1] [Reference Citation Analysis] [Abstract] [Track Full Text] [Download PDF] [Figures] [Journal Information] [Subscribe] [Scholar Register] [Received: 04/16/2009] [Accepted: 02/01/2010] [Indexed: 11/17/2022] Open

Vallabhajosyula RR, Raval A. Computational modeling in systems biology. Methods Mol Biol 2010;662:97-120. [PMID: 20824468 DOI: 10.1007/978-1-60761-800-3_5] [Citation(s) in RCA: 10] [Impact Index Per Article: 0.7] [Reference Citation Analysis] [Abstract] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Indexed: 01/28/2023]

Progress in The Evolutionary Analysis of Protein Interaction Networks. PROG BIOCHEM BIOPHYS 2009. [DOI: 10.3724/sp.j.1206.2008.00393] [Citation(s) in RCA: 1] [Impact Index Per Article: 0.1] [Reference Citation Analysis] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Indexed: 11/25/2022]

'Unknown' proteins and 'orphan' enzymes: the missing half of the engineering parts list--and how to find it. Biochem J 2009;425:1-11. [PMID: 20001958 DOI: 10.1042/bj20091328] [Citation(s) in RCA: 135] [Impact Index Per Article: 9.0] [Reference Citation Analysis] [Abstract] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Indexed: 11/17/2022]

Evolution of biomolecular networks: lessons from metabolic and protein interactions. Nat Rev Mol Cell Biol 2009;10:791-803. [PMID: 19851337 DOI: 10.1038/nrm2787] [Citation(s) in RCA: 144] [Impact Index Per Article: 9.6] [Reference Citation Analysis] [Abstract] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Indexed: 12/14/2022]

Wilm M. Quantitative proteomics in biological research. Proteomics 2009;9:4590-605. [DOI: 10.1002/pmic.200900299] [Citation(s) in RCA: 68] [Impact Index Per Article: 4.5] [Reference Citation Analysis] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Indexed: 01/14/2023]

Song J, Singh M. How and when should interactome-derived clusters be used to predict functional modules and protein function? Bioinformatics 2009;25:3143-50. [PMID: 19770263 PMCID: PMC3167697 DOI: 10.1093/bioinformatics/btp551] [Citation(s) in RCA: 103] [Impact Index Per Article: 6.9] [Reference Citation Analysis] [Abstract] [Download PDF] [Figures] [Journal Information] [Subscribe] [Scholar Register] [Indexed: 11/14/2022]

Wagner A. Evolutionary constraints permeate large metabolic networks. BMC Evol Biol 2009;9:231. [PMID: 19747381 PMCID: PMC2753571 DOI: 10.1186/1471-2148-9-231] [Citation(s) in RCA: 26] [Impact Index Per Article: 1.7] [Reference Citation Analysis] [Abstract] [MESH Headings] [Track Full Text] [Download PDF] [Figures] [Journal Information] [Subscribe] [Scholar Register] [Received: 02/17/2009] [Accepted: 09/11/2009] [Indexed: 11/22/2022] Open

Rentzsch R, Orengo CA. Protein function prediction--the power of multiplicity. Trends Biotechnol 2009;27:210-9. [PMID: 19251332 DOI: 10.1016/j.tibtech.2009.01.002] [Citation(s) in RCA: 88] [Impact Index Per Article: 5.9] [Reference Citation Analysis] [Abstract] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Received: 11/21/2008] [Revised: 01/21/2009] [Accepted: 01/23/2009] [Indexed: 01/07/2023]

Wang H, Kakaradov B, Collins SR, Karotki L, Fiedler D, Shales M, Shokat KM, Walther TC, Krogan NJ, Koller D. A complex-based reconstruction of the Saccharomyces cerevisiae interactome. Mol Cell Proteomics 2009;8:1361-81. [PMID: 19176519 PMCID: PMC2690481 DOI: 10.1074/mcp.m800490-mcp200] [Citation(s) in RCA: 75] [Impact Index Per Article: 5.0] [Reference Citation Analysis] [Abstract] [Track Full Text] [Download PDF] [Figures] [Journal Information] [Subscribe] [Scholar Register] [Indexed: 11/30/2022] Open

Janga SC, Babu MM. Network-based approaches for linking metabolism with environment. Genome Biol 2008;9:239. [PMID: 19040774 PMCID: PMC2614483 DOI: 10.1186/gb-2008-9-11-239] [Citation(s) in RCA: 34] [Impact Index Per Article: 2.1] [Reference Citation Analysis] [Abstract] [Track Full Text] [Download PDF] [Figures] [Journal Information] [Subscribe] [Scholar Register] [Indexed: 01/12/2023] Open

Harrington ED, Jensen LJ, Bork P. Predicting biological networks from genomic data. FEBS Lett 2008;582:1251-8. [PMID: 18294967 DOI: 10.1016/j.febslet.2008.02.033] [Citation(s) in RCA: 26] [Impact Index Per Article: 1.6] [Reference Citation Analysis] [Abstract] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Received: 02/08/2008] [Accepted: 02/13/2008] [Indexed: 12/27/2022]

Kensche PR, van Noort V, Dutilh BE, Huynen MA. Practical and theoretical advances in predicting the function of a protein by its phylogenetic distribution. J R Soc Interface 2008;5:151-70. [PMID: 17535793 PMCID: PMC2405902 DOI: 10.1098/rsif.2007.1047] [Citation(s) in RCA: 76] [Impact Index Per Article: 4.8] [Reference Citation Analysis] [Abstract] [Key Words] [MESH Headings] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Indexed: 11/12/2022] Open

Martínez JL, Baquero F, Andersson DI. Predicting antibiotic resistance. Nat Rev Microbiol 2007;5:958-65. [PMID: 18007678 DOI: 10.1038/nrmicro1796] [Citation(s) in RCA: 234] [Impact Index Per Article: 13.8] [Reference Citation Analysis] [Abstract] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Indexed: 01/09/2023]

Parter M, Kashtan N, Alon U. Environmental variability and modularity of bacterial metabolic networks. BMC Evol Biol 2007;7:169. [PMID: 17888177 PMCID: PMC2151768 DOI: 10.1186/1471-2148-7-169] [Citation(s) in RCA: 121] [Impact Index Per Article: 7.1] [Reference Citation Analysis] [Abstract] [Track Full Text] [Download PDF] [Figures] [Journal Information] [Subscribe] [Scholar Register] [Received: 05/17/2007] [Accepted: 09/23/2007] [Indexed: 11/10/2022] Open

Zhao J, Ding GH, Tao L, Yu H, Yu ZH, Luo JH, Cao ZW, Li YX. Modular co-evolution of metabolic networks. BMC Bioinformatics 2007;8:311. [PMID: 17723146 PMCID: PMC2001200 DOI: 10.1186/1471-2105-8-311] [Citation(s) in RCA: 51] [Impact Index Per Article: 3.0] [Reference Citation Analysis] [Abstract] [MESH Headings] [Grants] [Track Full Text] [Download PDF] [Figures] [Journal Information] [Subscribe] [Scholar Register] [Received: 03/08/2007] [Accepted: 08/27/2007] [Indexed: 11/25/2022] Open

Harrington ED, Singh AH, Doerks T, Letunic I, von Mering C, Jensen LJ, Raes J, Bork P. Quantitative assessment of protein function prediction from metagenomics shotgun sequences. Proc Natl Acad Sci U S A 2007;104:13913-8. [PMID: 17717083 PMCID: PMC1955820 DOI: 10.1073/pnas.0702636104] [Citation(s) in RCA: 63] [Impact Index Per Article: 3.7] [Reference Citation Analysis] [Abstract] [MESH Headings] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Indexed: 11/18/2022] Open

Tsoka S. Computational methodologies for genome evolution and functional association. Comput Chem Eng 2007. [DOI: 10.1016/j.compchemeng.2006.11.011] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Reference Citation Analysis] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Indexed: 10/23/2022]

Díaz-Mejía JJ, Pérez-Rueda E, Segovia L. A network perspective on the evolution of metabolism by gene duplication. Genome Biol 2007;8:R26. [PMID: 17326820 PMCID: PMC1852415 DOI: 10.1186/gb-2007-8-2-r26] [Citation(s) in RCA: 41] [Impact Index Per Article: 2.4] [Reference Citation Analysis] [Abstract] [MESH Headings] [Track Full Text] [Download PDF] [Figures] [Journal Information] [Subscribe] [Scholar Register] [Received: 07/19/2006] [Revised: 10/23/2006] [Accepted: 02/27/2007] [Indexed: 01/16/2023] Open

Abstract

BACKGROUND

Gene duplication followed by divergence is one of the main sources of metabolic versatility. The patchwork and stepwise models of metabolic evolution help us to understand these processes, but their assumptions are relatively simplistic. We used a network-based approach to determine the influence of metabolic constraints on the retention of duplicated genes.

RESULTS

We detected duplicated genes by looking for enzymes sharing homologous domains and uncovered an increased retention of duplicates for enzymes catalyzing consecutive reactions, as illustrated by the ligases acting in the biosynthesis of peptidoglycan. As a consequence, metabolic networks show a high retention of duplicates within functional modules, and we found a preferential biochemical coupling of reactions that partially explains this bias. A similar situation was found in enzyme-enzyme interaction networks, but not in interaction networks of non-enzymatic proteins or gene transcriptional regulatory networks, suggesting that the retention of duplicates results from the biochemical rules governing substrate-enzyme-product relationships. We confirmed a high retention of duplicates between chemically similar reactions, as illustrated by fatty-acid metabolism. The retention of duplicates between chemically dissimilar reactions is, however, also greater than expected by chance. Finally, we detected a significant retention of duplicates as groups, instead of single pairs.

CONCLUSION

Our results indicate that in silico modeling of the origin and evolution of metabolism is improved by the inclusion of specific functional constraints, such as the preferential biochemical coupling of reactions. We suggest that the stepwise and patchwork models are not independent of each other: in fact, the network perspective enables us to reconcile and combine these models.

Chen L, Vitkup D. Distribution of orphan metabolic activities. Trends Biotechnol 2007;25:343-8. [PMID: 17580095 DOI: 10.1016/j.tibtech.2007.06.001] [Citation(s) in RCA: 33] [Impact Index Per Article: 1.9] [Reference Citation Analysis] [Abstract] [MESH Headings] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Received: 02/08/2007] [Revised: 04/17/2007] [Accepted: 06/01/2007] [Indexed: 10/23/2022]