Reference Citation Analysis: Find an Article, Find a Category, Find a Journal, Find a Scholar

For: Kensche PR, van Noort V, Dutilh BE, Huynen MA. Practical and theoretical advances in predicting the function of a protein by its phylogenetic distribution. J R Soc Interface 2008;5:151-70. [PMID: 17535793 PMCID: PMC2405902 DOI: 10.1098/rsif.2007.1047] [Citation(s) in RCA: 76] [Impact Index Per Article: 4.8] [Indexed: 11/12/2022]

Cited by Other Article(s)

Langschied F, Leisegang MS, Brandes RP, Ebersberger I. ncOrtho: efficient and reliable identification of miRNA orthologs. Nucleic Acids Res 2023;51:e71. [PMID: 37260093 PMCID: PMC10359484 DOI: 10.1093/nar/gkad467] [Citation(s) in RCA: 4] [Impact Index Per Article: 4.0] [Received: 12/06/2022] [Revised: 05/04/2023] [Accepted: 05/30/2023] [Indexed: 06/02/2023]

Elhabashy H, Merino F, Alva V, Kohlbacher O, Lupas AN. Exploring protein-protein interactions at the proteome level. Structure 2022;30:462-475. [DOI: 10.1016/j.str.2022.02.004] [Citation(s) in RCA: 1] [Impact Index Per Article: 0.5] [Received: 08/09/2021] [Revised: 10/26/2021] [Accepted: 02/02/2022] [Indexed: 02/08/2023]

Fukunaga T, Iwasaki W. Inverse Potts model improves accuracy of phylogenetic profiling. Bioinformatics 2022;38:1794-1800. [PMID: 35060594 PMCID: PMC8963296 DOI: 10.1093/bioinformatics/btac034] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 10/18/2021] [Revised: 01/11/2022] [Accepted: 01/13/2022] [Indexed: 02/03/2023]

Stupp D, Sharon E, Bloch I, Zitnik M, Zuk O, Tabach Y. Co-evolution based machine-learning for predicting functional interactions between human genes. Nat Commun 2021;12:6454. [PMID: 34753957 PMCID: PMC8578642 DOI: 10.1038/s41467-021-26792-w] [Citation(s) in RCA: 11] [Impact Index Per Article: 3.7] [Received: 08/26/2020] [Accepted: 10/09/2021] [Indexed: 12/20/2022]

Fukunaga T, Iwasaki W. Mirage: estimation of ancestral gene-copy numbers by considering different evolutionary patterns among gene families. Bioinformatics Advances 2021;1:vbab014. [PMID: 36700099 PMCID: PMC9710636 DOI: 10.1093/bioadv/vbab014] [Citation(s) in RCA: 3] [Impact Index Per Article: 1.0] [Received: 04/28/2021] [Revised: 07/22/2021] [Accepted: 07/28/2021] [Indexed: 01/28/2023]

Tsaban T, Stupp D, Sherill-Rofe D, Bloch I, Sharon E, Schueler-Furman O, Wiener R, Tabach Y. CladeOScope: functional interactions through the prism of clade-wise co-evolution. NAR Genom Bioinform 2021;3:lqab024. [PMID: 33928243 PMCID: PMC8057497 DOI: 10.1093/nargab/lqab024] [Citation(s) in RCA: 16] [Impact Index Per Article: 5.3] [Received: 09/10/2020] [Revised: 03/12/2021] [Accepted: 03/18/2021] [Indexed: 12/11/2022]

Matsumoto H, Mimori T, Fukunaga T. Novel metric for hyperbolic phylogenetic tree embeddings. Biol Methods Protoc 2021;6:bpab006. [PMID: 33928190 PMCID: PMC8058397 DOI: 10.1093/biomethods/bpab006] [Citation(s) in RCA: 4] [Impact Index Per Article: 1.3] [Received: 11/11/2020] [Revised: 03/19/2021] [Accepted: 03/23/2021] [Indexed: 01/09/2023]

Niu Y, Moghimyfiroozabad S, Moghimyfiroozabad A, Tierney TS, Alavian KN. The factors for the early and late development of midbrain dopaminergic neurons segregate into two distinct evolutionary clusters. Brain Disorders 2021. [DOI: 10.1016/j.dscb.2021.100002] [Citation(s) in RCA: 3] [Impact Index Per Article: 1.0] [Indexed: 11/27/2022]

Bloch I, Sherill-Rofe D, Stupp D, Unterman I, Beer H, Sharon E, Tabach Y. Optimization of co-evolution analysis through phylogenetic profiling reveals pathway-specific signals. Bioinformatics 2021;36:4116-4125. [PMID: 32353123 DOI: 10.1093/bioinformatics/btaa281] [Citation(s) in RCA: 10] [Impact Index Per Article: 3.3] [Received: 11/04/2019] [Revised: 04/17/2020] [Accepted: 04/23/2020] [Indexed: 12/11/2022]

Tremblay BJM, Lobb B, Doxey AC. PhyloCorrelate: inferring bacterial gene-gene functional associations through large-scale phylogenetic profiling. Bioinformatics 2021;37:17-22. [PMID: 33416870 DOI: 10.1093/bioinformatics/btaa1105] [Citation(s) in RCA: 5] [Impact Index Per Article: 1.7] [Received: 08/23/2020] [Revised: 12/26/2020] [Accepted: 12/29/2020] [Indexed: 11/12/2022]

Abstract

MOTIVATION

Statistical detection of co-occurring genes across genomes, known as "phylogenetic profiling", is a powerful bioinformatic technique for inferring gene-gene functional associations. However, this can be a challenging task given the size and complexity of phylogenomic databases, difficulty in accounting for phylogenetic structure, inconsistencies in genome annotation, and substantial computational requirements.

RESULTS

We introduce PhyloCorrelate, a computational framework for gene co-occurrence analysis across large phylogenomic datasets. PhyloCorrelate implements a variety of co-occurrence metrics including standard correlation metrics and model-based metrics that account for phylogenetic history. By combining multiple metrics, we developed an optimized score that exhibits a superior ability to link genes with overlapping GO terms and KEGG pathways, enabling gene function prediction. Using genomic and functional annotation data from the Genome Taxonomy Database and AnnoTree, we performed all-by-all comparisons of gene occurrence profiles across the bacterial tree of life, totaling 154,217,052 comparisons for 28,315 genes across 27,372 bacterial genomes. All predictions are available in an online database, which instantaneously returns the top correlated genes for any PFAM, TIGRFAM, or KEGG query. In total, PhyloCorrelate detected 29,762 high confidence associations between bacterial gene/protein pairs, and generated functional predictions for 834 DUFs and proteins of unknown function.

AVAILABILITY

PhyloCorrelate is available as a web-server at phylocorrelate.uwaterloo.ca as well as an R package for analysis of custom datasets. We anticipate that PhyloCorrelate will be broadly useful as a tool for predicting function and interactions for gene families.

SUPPLEMENTARY INFORMATION

Supplementary data are available at Bioinformatics online.
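
The co-occurrence idea summarized in the PhyloCorrelate abstract above can be illustrated with a minimal sketch. It is not the PhyloCorrelate code or its optimized score: the gene names, the toy presence/absence profiles, and the helper `rank_partners` are hypothetical, and only two standard metrics (Jaccard index and Pearson correlation) are shown, without any correction for phylogenetic history.

```python
from math import sqrt

def jaccard(a, b):
    """Jaccard index of two binary presence/absence profiles."""
    both = sum(1 for x, y in zip(a, b) if x and y)
    either = sum(1 for x, y in zip(a, b) if x or y)
    return both / either if either else 0.0

def pearson(a, b):
    """Pearson correlation of two equal-length profiles (0.0 if one is constant)."""
    n = len(a)
    mean_a, mean_b = sum(a) / n, sum(b) / n
    cov = sum((x - mean_a) * (y - mean_b) for x, y in zip(a, b))
    var_a = sum((x - mean_a) ** 2 for x in a)
    var_b = sum((y - mean_b) ** 2 for y in b)
    if var_a == 0 or var_b == 0:
        return 0.0
    return cov / sqrt(var_a * var_b)

def rank_partners(query, profiles):
    """Hypothetical helper: rank all other genes by Pearson correlation with the query."""
    scores = [(name, pearson(profiles[query], prof))
              for name, prof in profiles.items() if name != query]
    return sorted(scores, key=lambda pair: pair[1], reverse=True)

# Toy data (assumed): one column per genome, 1 = gene family present, 0 = absent.
profiles = {
    "geneA": [1, 1, 0, 0, 1, 0, 1, 1],
    "geneB": [1, 1, 0, 0, 1, 0, 1, 0],
    "geneC": [0, 0, 1, 1, 0, 1, 0, 0],
}

for partner, score in rank_partners("geneA", profiles):
    print(partner, round(score, 3), round(jaccard(profiles["geneA"], profiles[partner]), 3))
```

Genes with similar profiles rank first. A real analysis at the scale described in the abstract would need vectorized computation and a treatment of shared ancestry, both of which this sketch omits.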

Han Y, Cheng L, Sun W. Analysis of Protein-Protein Interaction Networks through Computational Approaches. Protein Pept Lett 2020;27:265-278. [PMID: 31692419 DOI: 10.2174/0929866526666191105142034] [Citation(s) in RCA: 4] [Impact Index Per Article: 1.0] [Received: 04/29/2019] [Revised: 05/08/2019] [Accepted: 09/26/2019] [Indexed: 01/02/2023]

Moi D, Kilchoer L, Aguilar PS, Dessimoz C. Scalable phylogenetic profiling using MinHash uncovers likely eukaryotic sexual reproduction genes. PLoS Comput Biol 2020;16:e1007553. [PMID: 32697802 PMCID: PMC7423146 DOI: 10.1371/journal.pcbi.1007553] [Citation(s) in RCA: 16] [Impact Index Per Article: 4.0] [Received: 11/20/2019] [Revised: 08/12/2020] [Accepted: 05/18/2020] [Indexed: 01/09/2023]

Abstract

Phylogenetic profiling is a computational method to predict genes involved in the same biological process by identifying protein families which tend to be jointly lost or retained across the tree of life. Phylogenetic profiling has customarily been more widely used with prokaryotes than eukaryotes, because the method is thought to require many diverse genomes. There are now many eukaryotic genomes available, but these are considerably larger, and typical phylogenetic profiling methods require at least quadratic time as a function of the number of genes. We introduce a fast, scalable phylogenetic profiling approach entitled HogProf, which leverages hierarchical orthologous groups for the construction of large profiles and locality-sensitive hashing for efficient retrieval of similar profiles. We show that the approach outperforms Enhanced Phylogenetic Tree, a phylogeny-based method, and use the tool to reconstruct networks and query for interactors of the kinetochore complex as well as conserved proteins involved in sexual reproduction: Hap2, Spo11 and Gex1. HogProf enables large-scale phylogenetic profiling across the three domains of life, and will be useful to predict biological pathways among the hundreds of thousands of eukaryotic species that will become available in the coming few years. HogProf is available at https://github.com/DessimozLab/HogProf.

Genes that are involved in the same biological process tend to co-evolve. This property is exploited by the technique of phylogenetic profiling, which identifies co-evolving (and therefore likely functionally related) genes through patterns of correlated gene retention and loss in evolution and across species. However, conventional methods for computing and clustering these correlated genes do not scale with increasing numbers of genomes. HogProf is a novel phylogenetic profiling tool built on probabilistic data structures. It allows the user to construct searchable databases containing the evolutionary history of hundreds of thousands of protein families. Such fast detection of coevolution takes advantage of the rapidly increasing amount of genomic data publicly available, and can uncover unknown biological networks and guide in-vivo research and experimentation. We have applied our tool to describe the biological networks underpinning sexual reproduction in eukaryotes.
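
The MinHash idea named in the title above can be sketched with the standard library alone. This is not the HogProf implementation: the taxon labels, the seeded-hash scheme, and the function names are assumptions made for illustration, and the locality-sensitive hashing index used for retrieval is left out. The sketch only shows how a fixed-length signature lets the Jaccard similarity of two profiles be estimated without comparing the full sets.

```python
import hashlib

def _hash(seed, item):
    """Deterministic 64-bit hash of an item under a given seed (assumed scheme)."""
    data = f"{seed}:{item}".encode("utf-8")
    return int.from_bytes(hashlib.blake2b(data, digest_size=8).digest(), "big")

def minhash_signature(items, num_hashes=256):
    """For each seeded hash function, keep the minimum hash value over the set."""
    if not items:
        raise ValueError("profile must contain at least one item")
    return [min(_hash(seed, it) for it in items) for seed in range(num_hashes)]

def estimate_jaccard(sig_a, sig_b):
    """Fraction of signature positions that agree; estimates the Jaccard index."""
    matches = sum(1 for x, y in zip(sig_a, sig_b) if x == y)
    return matches / len(sig_a)

def exact_jaccard(a, b):
    """Exact Jaccard index of two sets, for comparison with the estimate."""
    return len(a & b) / len(a | b)

# Toy profiles (assumed): each family is the set of taxa in which it is retained.
family_1 = {f"taxon_{i}" for i in range(0, 60)}
family_2 = {f"taxon_{i}" for i in range(20, 80)}
family_3 = {f"taxon_{i}" for i in range(100, 160)}

sig_1 = minhash_signature(family_1)
sig_2 = minhash_signature(family_2)
sig_3 = minhash_signature(family_3)

print("exact:", exact_jaccard(family_1, family_2))
print("estimate:", estimate_jaccard(sig_1, sig_2))
print("disjoint estimate:", estimate_jaccard(sig_1, sig_3))
```

Signatures have a fixed length regardless of how many taxa a profile covers, so comparing two families costs the same for 100 genomes as for 100,000. An index over such signatures is what would avoid the all-against-all comparison.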

Fukunaga T, Iwasaki W. Logicome Profiler: Exhaustive detection of statistically significant logic relationships from comparative omics data. PLoS One 2020;15:e0232106. [PMID: 32357172 PMCID: PMC7194410 DOI: 10.1371/journal.pone.0232106] [Citation(s) in RCA: 4] [Impact Index Per Article: 1.0] [Received: 10/07/2019] [Accepted: 04/07/2020] [Indexed: 02/01/2023]

Sánchez-Caballero L, Elurbe DM, Baertling F, Guerrero-Castillo S, van den Brand M, van Strien J, van Dam TJP, Rodenburg R, Brandt U, Huynen MA, Nijtmans LGJ. TMEM70 functions in the assembly of complexes I and V. Biochimica et Biophysica Acta - Bioenergetics 2020;1861:148202. [PMID: 32275929 DOI: 10.1016/j.bbabio.2020.148202] [Citation(s) in RCA: 30] [Impact Index Per Article: 7.5] [Received: 01/29/2020] [Revised: 03/19/2020] [Accepted: 04/02/2020] [Indexed: 10/24/2022]

Affiliation(s)

Laura Sánchez-Caballero Department of Paediatrics, Radboud Centre for Mitochondrial Medicine, Radboud University Medical Centre, Nijmegen, the Netherlands
Dei M Elurbe Centre for Molecular and Biomolecular Informatics, Radboud Institute for Molecular Life Sciences, Radboud University Medical Centre, Nijmegen, the Netherlands
Fabian Baertling Department of Paediatrics, Radboud Centre for Mitochondrial Medicine, Radboud University Medical Centre, Nijmegen, the Netherlands; Department of General Paediatrics, Neonatology and Paediatric Cardiology, University Children's Hospital Düsseldorf, Heinrich Heine University, Düsseldorf, Germany
Sergio Guerrero-Castillo Department of Paediatrics, Radboud Centre for Mitochondrial Medicine, Radboud University Medical Centre, Nijmegen, the Netherlands
Mariel van den Brand Department of Paediatrics, Radboud Centre for Mitochondrial Medicine, Radboud University Medical Centre, Nijmegen, the Netherlands
Joeri van Strien Centre for Molecular and Biomolecular Informatics, Radboud Institute for Molecular Life Sciences, Radboud University Medical Centre, Nijmegen, the Netherlands
Teunis J P van Dam Theoretical Biology and Bioinformatics, Department of Biology, Utrecht University, Utrecht, the Netherlands
Richard Rodenburg Department of Paediatrics, Radboud Centre for Mitochondrial Medicine, Radboud University Medical Centre, Nijmegen, the Netherlands
Ulrich Brandt Department of Paediatrics, Radboud Centre for Mitochondrial Medicine, Radboud University Medical Centre, Nijmegen, the Netherlands
Martijn A Huynen Centre for Molecular and Biomolecular Informatics, Radboud Institute for Molecular Life Sciences, Radboud University Medical Centre, Nijmegen, the Netherlands.
Leo G J Nijtmans Department of Paediatrics, Radboud Centre for Mitochondrial Medicine, Radboud University Medical Centre, Nijmegen, the Netherlands

Deutekom ES, Vosseberg J, van Dam TJP, Snel B. Measuring the impact of gene prediction on gene loss estimates in Eukaryotes by quantifying falsely inferred absences. PLoS Comput Biol 2019;15:e1007301. [PMID: 31461468 PMCID: PMC6736253 DOI: 10.1371/journal.pcbi.1007301] [Citation(s) in RCA: 30] [Impact Index Per Article: 6.0] [Received: 12/14/2018] [Revised: 09/10/2019] [Accepted: 08/01/2019] [Indexed: 12/25/2022]

CiliaCarta: An integrated and validated compendium of ciliary genes. PLoS One 2019;14:e0216705. [PMID: 31095607 PMCID: PMC6522010 DOI: 10.1371/journal.pone.0216705] [Citation(s) in RCA: 78] [Impact Index Per Article: 15.6] [Received: 10/17/2018] [Accepted: 04/26/2019] [Indexed: 12/25/2022]

Li Y, Ning S, Calvo SE, Mootha VK, Liu JS. Bayesian hidden Markov tree models for clustering genes with shared evolutionary history. Ann Appl Stat 2019. [DOI: 10.1214/18-aoas1208] [Citation(s) in RCA: 5] [Impact Index Per Article: 1.0] [Indexed: 11/19/2022]

Kim H, Joe A, Lee M, Yang S, Ma X, Ronald PC, Lee I. A Genome-Scale Co-Functional Network of Xanthomonas Genes Can Accurately Reconstruct Regulatory Circuits Controlled by Two-Component Signaling Systems. Mol Cells 2019;42:166-174. [PMID: 30759970 PMCID: PMC6399010 DOI: 10.14348/molcells.2018.0403] [Citation(s) in RCA: 4] [Impact Index Per Article: 0.8] [Received: 10/02/2018] [Revised: 12/09/2018] [Accepted: 12/19/2018] [Indexed: 01/24/2023]

Vidulin V, Šmuc T, Džeroski S, Supek F. The evolutionary signal in metagenome phyletic profiles predicts many gene functions. Microbiome 2018;6:129. [PMID: 29991352 PMCID: PMC6040064 DOI: 10.1186/s40168-018-0506-4] [Citation(s) in RCA: 1] [Impact Index Per Article: 0.2] [Received: 09/07/2017] [Accepted: 06/19/2018] [Indexed: 06/08/2023]

Abstract

BACKGROUND

The function of many genes is still not known even in model organisms. An increasing availability of microbiome DNA sequencing data provides an opportunity to infer gene function in a systematic manner.

RESULTS

We evaluated if the evolutionary signal contained in metagenome phyletic profiles (MPP) is predictive of a broad array of gene functions. The MPPs are an encoding of environmental DNA sequencing data that consists of relative abundances of gene families across metagenomes. We find that such MPPs can accurately predict 826 Gene Ontology functional categories, while drawing on human gut microbiomes, ocean metagenomes, and DNA sequences from various other engineered and natural environments. Overall, in this task, the MPPs are highly accurate, and moreover they provide coverage for a set of Gene Ontology terms largely complementary to standard phylogenetic profiles, derived from fully sequenced genomes. We also find that metagenomes approximated from taxon relative abundance obtained via 16S rRNA gene sequencing may provide surprisingly useful predictive models. Crucially, the MPPs derived from different types of environments can infer distinct, non-overlapping sets of gene functions and therefore complement each other. Consistently, simulations on > 5000 metagenomes indicate that the amount of data is not in itself critical for maximizing predictive accuracy, while the diversity of sampled environments appears to be the critical factor for obtaining robust models.

CONCLUSIONS

In past work, metagenomics has provided invaluable insight into ecology of various habitats, into diversity of microbial life and also into human health and disease mechanisms. We propose that environmental DNA sequencing additionally constitutes a useful tool to predict biological roles of genes, yielding inferences out of reach for existing comparative genomics approaches.

Beck C, Knoop H, Steuer R. Modules of co-occurrence in the cyanobacterial pan-genome reveal functional associations between groups of ortholog genes. PLoS Genet 2018. [PMID: 29522508 PMCID: PMC5862535 DOI: 10.1371/journal.pgen.1007239] [Citation(s) in RCA: 11] [Impact Index Per Article: 1.8] [Indexed: 12/11/2022]

Abstract

Cyanobacteria are a monophyletic phylogenetic group of global importance and have received considerable attention as potential host organisms for the renewable synthesis of chemical bulk products from atmospheric CO₂. The cyanobacterial phylum exhibits enormous metabolic diversity with respect to morphology, lifestyle and habitat. As yet, however, research has mostly focused on few model strains and cyanobacterial diversity is insufficiently understood. In this respect, the increasing availability of fully sequenced bacterial genomes opens new and unprecedented opportunities to investigate the genetic inventory of organisms in the context of their pan-genome. Here, we seek to understand cyanobacterial diversity using a comparative genome analysis of 77 fully sequenced and assembled cyanobacterial genomes. We use phylogenetic profiling to analyze the co-occurrence of clusters of likely ortholog genes (CLOGs) and reveal novel functional associations between CLOGs that are not captured by co-localization of genes. Going beyond pair-wise co-occurrences, we propose a network approach that allows us to identify modules of co-occurring CLOGs. The extracted modules exhibit a high degree of functional coherence and reveal known as well as previously unknown functional associations. We argue that the high functional coherence observed for the modules is a consequence of the similar-yet-diverse nature of cyanobacteria. Our approach highlights the importance of a multi-strain analysis to understand gene functions and environmental adaptations, with implications beyond the cyanobacterial phylum. The analysis is augmented with a simple toolbox that facilitates further analysis to investigate the co-occurrence neighborhood of specific CLOGs of interest.

Cyanobacteria are photoautotrophic prokaryotes of global importance and offer great potential as host organisms for the renewable synthesis of chemical bulk products, including biofuels, from atmospheric CO₂. As yet, however, research has mostly focussed on a small number of model strains and the genetic inventory of the cyanobacterial phylum is still insufficiently understood. The rapidly increasing availability of fully sequenced cyanobacterial genomes opens new and unprecedented possibilities to study the diversity of cyanobacterial strains in the context of the cyanobacterial pan-genome. Here, we seek to understand the genetic inventory of individual cyanobacterial strains based on the hypothesis that genes that are functionally related also co-occur within the genomes of different strains. We confirm this hypothesis by in-depth analysis of co-occurrence that goes beyond pair-wise co-occurrences. We show that co-occurrence does not imply co-localization on the genome. Our work provides a novel approach to infer gene function and highlights the importance of a multi-strain analysis, with implications beyond the analysis of the cyanobacterial phylum.

Niu Y, Moghimyfiroozabad S, Safaie S, Yang Y, Jonas EA, Alavian KN. Phylogenetic Profiling of Mitochondrial Proteins and Integration Analysis of Bacterial Transcription Units Suggest Evolution of F1Fo ATP Synthase from Multiple Modules. J Mol Evol 2017;85:219-233. [PMID: 29177973 PMCID: PMC5709465 DOI: 10.1007/s00239-017-9819-3] [Citation(s) in RCA: 9] [Impact Index Per Article: 1.3] [Received: 08/11/2017] [Accepted: 11/11/2017] [Indexed: 11/26/2022]

Sferra G, Fratini F, Ponzi M, Pizzi E. Phylo_dCor: distance correlation as a novel metric for phylogenetic profiling. BMC Bioinformatics 2017;18:396. [PMID: 28870256 PMCID: PMC5584357 DOI: 10.1186/s12859-017-1815-5] [Citation(s) in RCA: 3] [Impact Index Per Article: 0.4] [Received: 05/18/2017] [Accepted: 08/29/2017] [Indexed: 12/20/2022]

Niu Y, Liu C, Moghimyfiroozabad S, Yang Y, Alavian KN. PrePhyloPro: phylogenetic profile-based prediction of whole proteome linkages. PeerJ 2017;5:e3712. [PMID: 28875072 PMCID: PMC5578374 DOI: 10.7717/peerj.3712] [Citation(s) in RCA: 13] [Impact Index Per Article: 1.9] [Received: 04/13/2017] [Accepted: 07/28/2017] [Indexed: 02/05/2023]

van Hooff JJ, Tromer E, van Wijk LM, Snel B, Kops GJ. Evolutionary dynamics of the kinetochore network in eukaryotes as revealed by comparative genomics. EMBO Rep 2017. [PMID: 28642229 PMCID: PMC5579357 DOI: 10.15252/embr.201744102] [Citation(s) in RCA: 141] [Impact Index Per Article: 20.1] [Indexed: 12/14/2022]

van Hooff JJE, Snel B, Kops GJPL. Unique Phylogenetic Distributions of the Ska and Dam1 Complexes Support Functional Analogy and Suggest Multiple Parallel Displacements of Ska by Dam1. Genome Biol Evol 2017;9:1295-1303. [PMID: 28472331 PMCID: PMC5439489 DOI: 10.1093/gbe/evx088] [Citation(s) in RCA: 36] [Impact Index Per Article: 5.1] [Accepted: 05/03/2017] [Indexed: 12/27/2022]

Wittouck S, van Noort V. Correlated duplications and losses in the evolution of palmitoylation writer and eraser families. BMC Evol Biol 2017;17:83. [PMID: 28320309 PMCID: PMC5359973 DOI: 10.1186/s12862-017-0932-0] [Citation(s) in RCA: 3] [Impact Index Per Article: 0.4] [Received: 09/13/2016] [Accepted: 03/09/2017] [Indexed: 12/27/2022]

Abstract

Background

Protein post-translational modifications (PTMs) change protein properties. Each PTM type is associated with domain families that apply the modification (writers), remove the modification (erasers) and bind to the modified sites (readers), together called toolkit domains. Their evolutionary origin and diversification remain largely understudied, except for tyrosine phosphorylation. Protein palmitoylation entails the addition of a palmitoyl fatty acid to a cysteine residue. This PTM functions as a membrane anchor and is involved in a range of cellular processes. One writer family and two eraser families are known for protein palmitoylation.

Results

In this work we unravel the evolutionary history of these writer and eraser families. We constructed a high-quality profile hidden Markov model (HMM) of each family, searched for protein family members in fully sequenced genomes and subsequently constructed phylogenetic distributions of the families. We constructed Maximum Likelihood phylogenetic trees and, using gene tree rearrangement and tree reconciliation, inferred their evolutionary histories in terms of duplication and loss events. We identified lineages where the families expanded or contracted and found that the evolutionary histories of the families are correlated. The results show that the erasers were invented first, before the origin of the eukaryotes. The writers first arose in the eukaryotic ancestor. The writers and erasers show co-expansions in several eukaryotic ancestral lineages. These expansions often seem to be followed by contractions in some or all of the lineages further in evolution.

Conclusions

A general pattern of correlated evolution appears between writer and eraser domains. These co-evolution patterns could be used in new methods for interaction prediction based on phylogenies.

Electronic supplementary material

The online version of this article (doi:10.1186/s12862-017-0932-0) contains supplementary material, which is available to authorized users.

Shim JE, Lee T, Lee I. From sequencing data to gene functions: co-functional network approaches. Anim Cells Syst (Seoul) 2017;21:77-83. [PMID: 30460054 PMCID: PMC6138336 DOI: 10.1080/19768354.2017.1284156] [Citation(s) in RCA: 12] [Impact Index Per Article: 1.7] [Received: 01/14/2017] [Accepted: 01/15/2017] [Indexed: 01/04/2023]

Shin J, Lee I. Construction of Functional Gene Networks Using Phylogenetic Profiles. Methods Mol Biol 2017;1526:87-98. [PMID: 27896737 DOI: 10.1007/978-1-4939-6613-4_5] [Citation(s) in RCA: 7] [Impact Index Per Article: 1.0] [Indexed: 06/06/2023]

Cruz LM, Trefflich S, Weiss VA, Castro MAA. Protein Function Prediction. Methods Mol Biol 2017;1654:55-75. [PMID: 28986783 DOI: 10.1007/978-1-4939-7231-9_5] [Citation(s) in RCA: 19] [Impact Index Per Article: 2.7] [Indexed: 06/07/2023]

Adebali O, Zhulin IB. Aquerium: A web application for comparative exploration of domain-based protein occurrences on the taxonomically clustered genome tree. Proteins 2016;85:72-77. [PMID: 27802571 DOI: 10.1002/prot.25199] [Citation(s) in RCA: 13] [Impact Index Per Article: 1.6] [Received: 07/26/2016] [Accepted: 10/20/2016] [Indexed: 01/27/2023]

Vidulin V, Šmuc T, Supek F. Extensive complementarity between gene function prediction methods. Bioinformatics 2016;32:3645-3653. [PMID: 27522084 DOI: 10.1093/bioinformatics/btw532] [Citation(s) in RCA: 7] [Impact Index Per Article: 0.9] [Received: 04/26/2016] [Revised: 07/11/2016] [Accepted: 08/09/2016] [Indexed: 12/22/2022]

Abstract

MOTIVATION

The number of sequenced genomes rises steadily but we still lack the knowledge about the biological roles of many genes. Automated function prediction (AFP) is thus a necessity. We hypothesized that AFP approaches that draw on distinct genome features may be useful for predicting different types of gene functions, motivating a systematic analysis of the benefits gained by obtaining and integrating such predictions.

RESULTS

Our pipeline amalgamates 5 133 543 genes from 2071 genomes in a single massive analysis that evaluates five established genomic AFP methodologies. While 1227 Gene Ontology (GO) terms yielded reliable predictions, the majority of these functions were accessible to only one or two of the methods. Moreover, different methods tend to assign a GO term to non-overlapping sets of genes. Thus, inferences made by diverse genomic AFP methods display a striking complementarity, both gene-wise and function-wise. Because of this, a viable integration strategy is to rely on a single most-confident prediction per gene/function, rather than enforcing agreement across multiple AFP methods. Using an information-theoretic approach, we estimate that current databases contain 29.2 bits/gene of known Escherichia coli gene functions. This can be increased by up to 5.5 bits/gene using individual AFP methods or by 11 additional bits/gene upon integration, thereby providing a highly-ranking predictor on the Critical Assessment of Function Annotation 2 community benchmark. Availability of more sequenced genomes boosts the predictive accuracy of AFP approaches and also the benefit from integrating them.

AVAILABILITY AND IMPLEMENTATION

The individual and integrated GO predictions for the complete set of genes are available from http://gorbi.irb.hr/

CONTACT

fran.supek@irb.hr

SUPPLEMENTARY INFORMATION

Supplementary materials are available at Bioinformatics online.

Developing of the Computer Method for Annotation of Bacterial Genes. Adv Bioinformatics 2016;2015:635437. [PMID: 26770195 PMCID: PMC4684837 DOI: 10.1155/2015/635437] [Citation(s) in RCA: 1] [Impact Index Per Article: 0.1] [Received: 09/15/2015] [Revised: 11/16/2015] [Accepted: 11/18/2015] [Indexed: 02/07/2023]

TMEM107 recruits ciliopathy proteins to subdomains of the ciliary transition zone and causes Joubert syndrome. Nat Cell Biol 2015;18:122-31. [PMID: 26595381 DOI: 10.1038/ncb3273] [Citation(s) in RCA: 99] [Impact Index Per Article: 11.0] [Received: 08/11/2015] [Accepted: 10/20/2015] [Indexed: 01/10/2023]

Supek F. The Code of Silence: Widespread Associations Between Synonymous Codon Biases and Gene Function. J Mol Evol 2015;82:65-73. [PMID: 26538122 DOI: 10.1007/s00239-015-9714-8] [Citation(s) in RCA: 43] [Impact Index Per Article: 4.8] [Reference Citation Analysis] [Abstract] [Key Words] [MESH Headings] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Received: 08/15/2015] [Accepted: 10/30/2015] [Indexed: 02/07/2023]

Shin J, Lee I. Co-Inheritance Analysis within the Domains of Life Substantially Improves Network Inference by Phylogenetic Profiling. PLoS One 2015;10:e0139006. [PMID: 26394049 PMCID: PMC4578931 DOI: 10.1371/journal.pone.0139006] [Citation(s) in RCA: 17] [Impact Index Per Article: 1.9] [Reference Citation Analysis] [Abstract] [MESH Headings] [Track Full Text] [Download PDF] [Figures] [Journal Information] [Subscribe] [Scholar Register] [Received: 05/05/2015] [Accepted: 09/07/2015] [Indexed: 01/23/2023] Open

Abstract

Phylogenetic profiling, a network inference method based on gene inheritance profiles, has been widely used to construct functional gene networks in microbes. However, its utility for network inference in higher eukaryotes has been limited. An improved algorithm with an in-depth understanding of pathway evolution may overcome this limitation. In this study, we investigated the effects of taxonomic structures on co-inheritance analysis using 2,144 reference species in four query species: Escherichia coli, Saccharomyces cerevisiae, Arabidopsis thaliana, and Homo sapiens. We observed three clusters of reference species based on a principal component analysis of the phylogenetic profiles, which correspond to the three domains of life (Archaea, Bacteria, and Eukaryota), suggesting that pathways are inherited primarily within specific domains or lower-ranked taxonomic groups during speciation. Hence, the co-inheritance pattern within a taxonomic group may be eroded by confounding inheritance patterns from irrelevant taxonomic groups. We demonstrated that co-inheritance analysis within domains substantially improved network inference not only in microbe species but also in the higher eukaryotes, including humans. Although we observed two sub-domain clusters of reference species within Eukaryota, co-inheritance analysis within these sub-domain taxonomic groups only marginally improved network inference. Therefore, we conclude that co-inheritance analysis within domains is the optimal approach to network inference with the given reference species. The construction of a series of human gene networks with increasing sample sizes of the reference species for each domain revealed that the size of the high-accuracy networks increased as additional reference species genomes were included, suggesting that within-domain co-inheritance analysis will continue to expand human gene networks as genomes of additional species are sequenced. Taken together, we propose that co-inheritance analysis within the domains of life will greatly potentiate the use of the expected onslaught of sequenced genomes in the study of molecular pathways in higher eukaryotes.
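
The idea of scoring co-inheritance separately within each domain of life, as described in the abstract above, can be illustrated with a small sketch. Everything here is an assumption made for illustration: the abstract does not state which similarity measure the authors used, so Jaccard similarity on binary presence/absence profiles is swapped in, and the function names and toy profiles are hypothetical.

```python
def jaccard(a, b):
    """Jaccard similarity of two equal-length presence/absence profiles."""
    both = sum(1 for x, y in zip(a, b) if x and y)
    either = sum(1 for x, y in zip(a, b) if x or y)
    return both / either if either else 0.0


def within_domain_scores(profile_a, profile_b, domains):
    """Score two gene profiles separately for each domain label.

    `domains[i]` is the domain of the i-th reference species, so each
    score only uses the species belonging to that domain.
    """
    grouped = {}
    for x, y, domain in zip(profile_a, profile_b, domains):
        xs, ys = grouped.setdefault(domain, ([], []))
        xs.append(x)
        ys.append(y)
    return {domain: jaccard(xs, ys) for domain, (xs, ys) in grouped.items()}


# Toy profiles: the two genes co-occur perfectly in the bacterial
# reference species but never in the eukaryotic ones.
gene_a = [1, 1, 0, 0, 1, 0]
gene_b = [1, 1, 0, 0, 0, 1]
labels = ["Bacteria"] * 4 + ["Eukaryota"] * 2

pooled = jaccard(gene_a, gene_b)
per_domain = within_domain_scores(gene_a, gene_b, labels)
```

In this toy case the pooled score is 0.5, while the per-domain scores are 1.0 for Bacteria and 0.0 for Eukaryota, which shows how a strong within-domain signal can be diluted when species from all domains are mixed.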

Dey G, Meyer T. Phylogenetic Profiling for Probing the Modular Architecture of the Human Genome. Cell Syst 2015;1:106-15. [PMID: 27135799 DOI: 10.1016/j.cels.2015.08.006] [Citation(s) in RCA: 19] [Impact Index Per Article: 2.1] [Reference Citation Analysis] [Abstract] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Received: 05/06/2015] [Revised: 08/03/2015] [Accepted: 08/10/2015] [Indexed: 12/22/2022]

Lee T, Kim H, Lee I. Network-assisted crop systems genetics: network inference and integrative analysis. CURRENT OPINION IN PLANT BIOLOGY 2015;24:61-70. [PMID: 25698380 DOI: 10.1016/j.pbi.2015.02.001] [Citation(s) in RCA: 28] [Impact Index Per Article: 3.1] [Reference Citation Analysis] [Abstract] [MESH Headings] [Track Full Text] [Subscribe] [Scholar Register] [Received: 11/08/2014] [Revised: 01/15/2015] [Accepted: 02/02/2015] [Indexed: 05/24/2023]

Škunca N, Dessimoz C. Phylogenetic profiling: how much input data is enough? PLoS One 2015;10:e0114701. [PMID: 25679783 PMCID: PMC4332489 DOI: 10.1371/journal.pone.0114701] [Citation(s) in RCA: 28] [Impact Index Per Article: 3.1] [Reference Citation Analysis] [Abstract] [MESH Headings] [Grants] [Track Full Text] [Download PDF] [Figures] [Journal Information] [Subscribe] [Scholar Register] [Received: 07/01/2014] [Accepted: 11/10/2014] [Indexed: 12/04/2022] Open

Dey G, Jaimovich A, Collins SR, Seki A, Meyer T. Systematic Discovery of Human Gene Function and Principles of Modular Organization through Phylogenetic Profiling. Cell Rep 2015;10:993-1006. [PMID: 25683721 DOI: 10.1016/j.celrep.2015.01.025] [Citation(s) in RCA: 48] [Impact Index Per Article: 5.3] [Reference Citation Analysis] [Abstract] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Received: 09/05/2014] [Revised: 12/17/2014] [Accepted: 01/09/2015] [Indexed: 01/17/2023] Open

Haft DH. Using comparative genomics to drive new discoveries in microbiology. Curr Opin Microbiol 2015;23:189-96. [PMID: 25617609 DOI: 10.1016/j.mib.2014.11.017] [Citation(s) in RCA: 14] [Impact Index Per Article: 1.6] [Reference Citation Analysis] [Abstract] [MESH Headings] [Grants] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Received: 09/27/2014] [Revised: 11/19/2014] [Accepted: 11/20/2014] [Indexed: 01/17/2023]

Zahiri J, Mohammad-Noori M, Ebrahimpour R, Saadat S, Bozorgmehr JH, Goldberg T, Masoudi-Nejad A. LocFuse: human protein-protein interaction prediction via classifier fusion using protein localization information. Genomics 2014;104:496-503. [PMID: 25458812 DOI: 10.1016/j.ygeno.2014.10.006] [Citation(s) in RCA: 42] [Impact Index Per Article: 4.2] [Reference Citation Analysis] [Abstract] [Key Words] [MESH Headings] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Received: 04/19/2014] [Revised: 09/28/2014] [Accepted: 10/02/2014] [Indexed: 12/20/2022]

Abstract

Protein-protein interaction (PPI) detection is one of the central goals of functional genomics and systems biology. Knowledge about the nature of PPIs can help fill the widening gap between sequence information and functional annotations. Although experimental methods have produced valuable PPI data, they also suffer from significant limitations. Computational PPI prediction methods have attracted tremendous attention. Despite considerable efforts, PPI prediction is still in its infancy in complex multicellular organisms such as humans. Here, we propose a novel ensemble learning method, LocFuse, which is useful in human PPI prediction. This method uses eight different genomic and proteomic features along with four different types of classifiers. The prediction performance of this classifier selection method was found to be considerably better than that of methods employed hitherto. This confirms the complex nature of the PPI prediction problem and also the necessity of using biological information for classifier fusion. LocFuse is available at: http://lbb.ut.ac.ir/Download/LBBsoft/LocFuse.

BIOLOGICAL SIGNIFICANCE

The results revealed that if we divide proteome space according to the cellular localization of proteins, then the utility of some classifiers in PPI prediction can be improved. Therefore, to predict the interaction for any given protein pair, we can select the most accurate classifier with regard to the cellular localization information. Based on the results, the importance of different features for PPI prediction varies between differently localized proteins; in general, however, our novel features, which were extracted from position-specific scoring matrices (PSSMs), are the most important ones, and the Random Forest (RF) classifier performs best in most cases. LocFuse was developed with a user-friendly graphical interface and is freely available for Linux, Mac OSX and MS Windows operating systems.

Li Y, Calvo SE, Gutman R, Liu JS, Mootha VK. Expansion of biological pathways based on evolutionary inference. Cell 2014;158:213-25. [PMID: 24995987 DOI: 10.1016/j.cell.2014.05.034] [Citation(s) in RCA: 81] [Impact Index Per Article: 8.1] [Reference Citation Analysis] [Abstract] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Received: 10/22/2013] [Revised: 02/06/2014] [Accepted: 05/12/2014] [Indexed: 01/24/2023]

Dutilh BE, Cassman N, McNair K, Sanchez SE, Silva GGZ, Boling L, Barr JJ, Speth DR, Seguritan V, Aziz RK, Felts B, Dinsdale EA, Mokili JL, Edwards RA. A highly abundant bacteriophage discovered in the unknown sequences of human faecal metagenomes. Nat Commun 2014;5:4498. [PMID: 25058116 PMCID: PMC4111155 DOI: 10.1038/ncomms5498] [Citation(s) in RCA: 491] [Impact Index Per Article: 49.1] [Reference Citation Analysis] [Abstract] [Download PDF] [Figures] [Journal Information] [Subscribe] [Scholar Register] [Received: 04/09/2014] [Accepted: 06/25/2014] [Indexed: 01/20/2023] Open

Affiliation(s)

Bas E Dutilh [1] Centre for Molecular and Biomolecular Informatics, Radboud Institute for Molecular Life Sciences, Radboud university medical centre, Geert Grooteplein 28, 6525 GA Nijmegen, The Netherlands [2] Department of Computer Science, San Diego State University, 5500 Campanile Drive, San Diego, California 92182, USA [3] Department of Biology, San Diego State University, 5500 Campanile Drive, San Diego, California 92182, USA [4] Department of Marine Biology, Institute of Biology, Federal University of Rio de Janeiro, Av. Carlos Chagas Fo. 373, Prédio Anexo ao Bloco A do Centro de Ciências da Saúde, Ilha do Fundão, CEP 21941-902 Rio de Janeiro, Brazil
Noriko Cassman [1] Department of Biology, San Diego State University, 5500 Campanile Drive, San Diego, California 92182, USA [2]
Katelyn McNair Department of Computer Science, San Diego State University, 5500 Campanile Drive, San Diego, California 92182, USA
Savannah E Sanchez Department of Biology, San Diego State University, 5500 Campanile Drive, San Diego, California 92182, USA
Genivaldo G Z Silva Computational Science Research Center, San Diego State University, 5500 Campanile Drive, San Diego, California 92182, USA
Lance Boling Department of Biology, San Diego State University, 5500 Campanile Drive, San Diego, California 92182, USA
Jeremy J Barr Department of Biology, San Diego State University, 5500 Campanile Drive, San Diego, California 92182, USA
Daan R Speth Department of Microbiology, Institute for Water and Wetland Research, Radboud University, Heyendaalseweg 135, 6525 AJ Nijmegen, The Netherlands
Victor Seguritan Department of Biology, San Diego State University, 5500 Campanile Drive, San Diego, California 92182, USA
Ramy K Aziz [1] Department of Computer Science, San Diego State University, 5500 Campanile Drive, San Diego, California 92182, USA [2] Department of Microbiology and Immunology, Faculty of Pharmacy, Cairo University, Kasr El-Aini Street, Cairo 11562, Egypt
Ben Felts Department of Mathematics, San Diego State University, 5500 Campanile Drive, San Diego, California 92182, USA
Elizabeth A Dinsdale [1] Department of Biology, San Diego State University, 5500 Campanile Drive, San Diego, California 92182, USA [2] Computational Science Research Center, San Diego State University, 5500 Campanile Drive, San Diego, California 92182, USA
John L Mokili Department of Biology, San Diego State University, 5500 Campanile Drive, San Diego, California 92182, USA
Robert A Edwards [1] Department of Computer Science, San Diego State University, 5500 Campanile Drive, San Diego, California 92182, USA [2] Department of Marine Biology, Institute of Biology, Federal University of Rio de Janeiro, Av. Carlos Chagas Fo. 373, Prédio Anexo ao Bloco A do Centro de Ciências da Saúde, Ilha do Fundão, CEP 21941-902 Rio de Janeiro, Brazil [3] Computational Science Research Center, San Diego State University, 5500 Campanile Drive, San Diego, California 92182, USA [4] Division of Mathematics and Computer Science, Argonne National Laboratory, 9700 S Cass Ave B109, Argonne, Illinois 60439, USA

Reynolds KA. Finding a common path: predicting gene function using inferred evolutionary trees. Dev Cell 2014;30:4-5. [PMID: 25026031 DOI: 10.1016/j.devcel.2014.06.029] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Reference Citation Analysis] [Abstract] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Indexed: 10/25/2022]

Different subunits belonging to the same protein complex often exhibit discordant expression levels and evolutionary properties. Curr Opin Struct Biol 2014;26:113-20. [DOI: 10.1016/j.sbi.2014.06.001] [Citation(s) in RCA: 32] [Impact Index Per Article: 3.2] [Reference Citation Analysis] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Received: 01/24/2014] [Revised: 04/27/2014] [Accepted: 06/04/2014] [Indexed: 11/21/2022]

Lua RC, Marciano DC, Katsonis P, Adikesavan AK, Wilkins AD, Lichtarge O. Prediction and redesign of protein-protein interactions. PROGRESS IN BIOPHYSICS AND MOLECULAR BIOLOGY 2014;116:194-202. [PMID: 24878423 DOI: 10.1016/j.pbiomolbio.2014.05.004] [Citation(s) in RCA: 22] [Impact Index Per Article: 2.2] [Reference Citation Analysis] [Abstract] [Key Words] [Track Full Text] [Subscribe] [Scholar Register] [Received: 02/25/2014] [Revised: 05/02/2014] [Accepted: 05/17/2014] [Indexed: 12/14/2022]

Frolov AA, Husek D, Polyakov PY, Snasel V. New BFA method based on attractor neural network and likelihood maximization. Neurocomputing 2014. [DOI: 10.1016/j.neucom.2013.07.047] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Reference Citation Analysis] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Indexed: 10/26/2022]

Konietzny SGA, Pope PB, Weimann A, McHardy AC. Inference of phenotype-defining functional modules of protein families for microbial plant biomass degraders. BIOTECHNOLOGY FOR BIOFUELS 2014;7:124. [PMID: 25342967 PMCID: PMC4189754 DOI: 10.1186/s13068-014-0124-8] [Citation(s) in RCA: 8] [Impact Index Per Article: 0.8] [Reference Citation Analysis] [Abstract] [Key Words] [Track Full Text] [Subscribe] [Scholar Register] [Received: 04/19/2014] [Accepted: 08/05/2014] [Indexed: 05/14/2023]

Abstract

BACKGROUND

Efficient industrial processes for converting plant lignocellulosic materials into biofuels are a key to global efforts to come up with alternative energy sources to fossil fuels. Novel cellulolytic enzymes have been discovered in microbial genomes and metagenomes of microbial communities. However, the identification of relevant genes without known homologs, and the elucidation of the lignocellulolytic pathways and protein complexes for different microorganisms remain challenging.

RESULTS

We describe a new computational method for the targeted discovery of functional modules of plant biomass-degrading protein families, based on their co-occurrence patterns across genomes and metagenome datasets, and the strength of association of these modules with the genomes of known degraders. From approximately 6.4 million family annotations for 2,884 microbial genomes, and 332 taxonomic bins from 18 metagenomes, we identified 5 functional modules that are distinctive for plant biomass degraders, which we term "plant biomass degradation modules" (PDMs). These modules incorporate protein families involved in the degradation of cellulose, hemicelluloses, and pectins, structural components of the cellulosome, and additional families with potential functions in plant biomass degradation. The PDMs were linked to 81 gene clusters in genomes of known lignocellulose degraders, including previously described clusters of lignocellulolytic genes. On average, 70% of the families of each PDM were found to map to gene clusters in known degraders, which served as an additional confirmation of their functional relationships. The presence of a PDM in a genome or taxonomic metagenome bin furthermore allowed us to accurately predict the ability of any particular organism to degrade plant biomass. For 15 draft genomes of a cow rumen metagenome, we used cross-referencing to confirmed cellulolytic enzymes to validate that the PDMs identified plant biomass degraders within a complex microbial community.

CONCLUSIONS

Functional modules of protein families that are involved in different aspects of plant cell wall degradation can be inferred from co-occurrence patterns across (meta-)genomes with a probabilistic topic model. PDMs represent a new resource of protein families and candidate genes implicated in microbial plant biomass degradation. They can also be used to predict the plant biomass degradation ability for a genome or taxonomic bin. The method is also suitable for characterizing other microbial phenotypes.

Phylogenetic portrait of the Saccharomyces cerevisiae functional genome. G3-GENES GENOMES GENETICS 2013;3:1335-40. [PMID: 23749449 PMCID: PMC3737173 DOI: 10.1534/g3.113.006585] [Citation(s) in RCA: 3] [Impact Index Per Article: 0.3] [Reference Citation Analysis] [Abstract] [Key Words] [Track Full Text] [Download PDF] [Figures] [Subscribe] [Scholar Register] [Indexed: 11/18/2022]

Dutilh BE, Backus L, Edwards RA, Wels M, Bayjanov JR, van Hijum SAFT. Explaining microbial phenotypes on a genomic scale: GWAS for microbes. Brief Funct Genomics 2013;12:366-80. [PMID: 23625995 PMCID: PMC3743258 DOI: 10.1093/bfgp/elt008] [Citation(s) in RCA: 43] [Impact Index Per Article: 3.9] [Reference Citation Analysis] [Abstract] [Key Words] [Track Full Text] [Download PDF] [Figures] [Journal Information] [Subscribe] [Scholar Register] [Indexed: 12/24/2022] Open