Reference Citation Analysis

For: Huynen MA, Snel B, von Mering C, Bork P. Function prediction and protein networks. Curr Opin Cell Biol 2003;15:191-8. [PMID: 12648675 DOI: 10.1016/s0955-0674(03)00009-7] [Citation(s) in RCA: 107] [Impact Index Per Article: 5.1] [Indexed: 12/21/2022]

Cited by Other Article(s)

Sahoo A, Pechmann S. Functional network motifs defined through integration of protein-protein and genetic interactions. PeerJ 2022;10:e13016. [PMID: 35223214 PMCID: PMC8877332 DOI: 10.7717/peerj.13016] [Citation(s) in RCA: 2] [Impact Index Per Article: 1.0] [Received: 11/11/2021] [Accepted: 02/06/2022] [Indexed: 01/11/2023]

Saul M, Dinu V. Family Rank: A graphical domain knowledge informed feature ranking algorithm. Bioinformatics 2021;37:3626-3631. [PMID: 34009295 DOI: 10.1093/bioinformatics/btab387] [Citation(s) in RCA: 1] [Impact Index Per Article: 0.3] [Received: 02/13/2021] [Revised: 04/11/2021] [Accepted: 05/18/2021] [Indexed: 11/12/2022]

Sangphukieo A, Laomettachit T, Ruengjitchatchawalya M. PhotoModPlus: A web server for photosynthetic protein prediction from genome neighborhood features. PLoS One 2021;16:e0248682. [PMID: 33730083 PMCID: PMC7968678 DOI: 10.1371/journal.pone.0248682] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 06/15/2020] [Accepted: 03/03/2021] [Indexed: 11/20/2022]

Xiong E, Li Z, Zhang C, Zhang J, Liu Y, Peng T, Chen Z, Zhao Q. A study of leaf-senescence genes in rice based on a combination of genomics, proteomics and bioinformatics. Brief Bioinform 2020;22:5998850. [PMID: 33257942 DOI: 10.1093/bib/bbaa305] [Citation(s) in RCA: 6] [Impact Index Per Article: 1.5] [Received: 08/12/2020] [Revised: 09/15/2020] [Accepted: 10/10/2020] [Indexed: 12/14/2022]

Kaushik AC, Mehmood A, Dai X, Wei DQ. WeiBI (web-based platform): Enriching integrated interaction network with increased coverage and functional proteins from genome-wide experimental OMICS data. Sci Rep 2020;10:5618. [PMID: 32221380 PMCID: PMC7101429 DOI: 10.1038/s41598-020-62508-8] [Citation(s) in RCA: 3] [Impact Index Per Article: 0.8] [Received: 11/17/2019] [Accepted: 03/10/2020] [Indexed: 12/27/2022]

Spark’s GraphX-based link prediction for social communication using triangle counting. Social Network Analysis and Mining 2019. [DOI: 10.1007/s13278-019-0573-y] [Citation(s) in RCA: 3] [Impact Index Per Article: 0.6] [Indexed: 12/14/2022]

Ding Z, Kihara D. Computational Methods for Predicting Protein-Protein Interactions Using Various Protein Features. Current Protocols in Protein Science 2018;93:e62. [PMID: 29927082 PMCID: PMC6097941 DOI: 10.1002/cpps.62] [Citation(s) in RCA: 41] [Impact Index Per Article: 6.8] [Indexed: 12/24/2022]

Solar-panel and parasol strategies shape the proteorhodopsin distribution pattern in marine Flavobacteriia. ISME Journal 2018;12:1329-1343. [PMID: 29410487 PMCID: PMC5932025 DOI: 10.1038/s41396-018-0058-4] [Citation(s) in RCA: 15] [Impact Index Per Article: 2.5] [Received: 09/29/2017] [Revised: 12/17/2017] [Accepted: 01/02/2018] [Indexed: 12/30/2022]

Mitsopoulos C, Schierz AC, Workman P, Al-Lazikani B. Distinctive Behaviors of Druggable Proteins in Cellular Networks. PLoS Comput Biol 2015;11:e1004597. [PMID: 26699810 PMCID: PMC4689399 DOI: 10.1371/journal.pcbi.1004597] [Citation(s) in RCA: 33] [Impact Index Per Article: 3.7] [Received: 02/12/2015] [Accepted: 10/13/2015] [Indexed: 01/12/2023]

Abstract

The interaction environment of a protein in a cellular network is important in defining the role that the protein plays in the system as a whole, and thus its potential suitability as a drug target. Despite the importance of the network environment, it is neglected during target selection for drug discovery. Here, we present the first systematic, comprehensive computational analysis of topological, community and graphical network parameters of the human interactome and identify discriminatory network patterns that strongly distinguish drug targets from the interactome as a whole. Importantly, we identify striking differences in the network behavior of targets of cancer drugs versus targets from other therapeutic areas and explore how they may relate to successful drug combinations to overcome acquired resistance to cancer drugs. We develop, computationally validate and provide the first public domain predictive algorithm for identifying druggable neighborhoods based on network parameters. We also make available full predictions for 13,345 proteins to aid target selection for drug discovery. All target predictions are available through canSAR.icr.ac.uk. Underlying data and tools are available at https://cansar.icr.ac.uk/cansar/publications/druggable_network_neighbourhoods/.

The need for well-validated targets for drug discovery is more pressing than ever, especially in cancer in view of resistance to current therapeutics coupled with late stage drug failures. Target prioritization and selection methodologies have typically not taken the protein interaction environment into account. Here we analyze a large representation of the human interactome comprising almost 90,000 interactions between 13,345 proteins. We assess these interactions using an extensive set of topological, graphical and community parameters, and we identify behaviors that distinguish the protein interaction environments of drug targets from the general interactome. Moreover, we identify clear distinctions between the network environment of cancer-drug targets and targets from other therapeutic areas. We use these distinguishing properties to build a predictive methodology to prioritize potential drug targets based on network parameters alone and we validate our predictive models using current FDA-approved drug targets. Our models provide an objective, interactome-based target prioritization methodology to complement existing structure-based and ligand-based prioritization methods. We provide our interactome-based predictions alongside other druggability predictors within the public canSAR resource (cansar.icr.ac.uk).

Efficient Generation of Mice with Consistent Transgene Expression by FEEST. Sci Rep 2015;5:16284. [PMID: 26573149 PMCID: PMC4648098 DOI: 10.1038/srep16284] [Citation(s) in RCA: 4] [Impact Index Per Article: 0.4] [Received: 03/26/2015] [Accepted: 10/07/2015] [Indexed: 12/21/2022]

César-Razquin A, Snijder B, Frappier-Brinton T, Isserlin R, Gyimesi G, Bai X, Reithmeier RA, Hepworth D, Hediger MA, Edwards AM, Superti-Furga G. A Call for Systematic Research on Solute Carriers. Cell 2015;162:478-87. [PMID: 26232220 DOI: 10.1016/j.cell.2015.07.022] [Citation(s) in RCA: 381] [Impact Index Per Article: 42.3] [Received: 03/18/2015] [Indexed: 01/10/2023]

Maheshwari S, Brylinski M. Predicting protein interface residues using easily accessible on-line resources. Brief Bioinform 2015;16:1025-34. [PMID: 25797794 DOI: 10.1093/bib/bbv009] [Citation(s) in RCA: 35] [Impact Index Per Article: 3.9] [Received: 12/01/2014] [Indexed: 01/20/2023]

Whidden CE, DeZeeuw KG, Zorz JK, Joy AP, Barnett DA, Johnson MS, Zhaxybayeva O, Cockshutt AM. Quantitative and functional characterization of the hyper-conserved protein of Prochlorococcus and marine Synechococcus. PLoS One 2014;9:e109327. [PMID: 25360678 PMCID: PMC4215834 DOI: 10.1371/journal.pone.0109327] [Citation(s) in RCA: 4] [Impact Index Per Article: 0.4] [Received: 02/11/2014] [Accepted: 09/11/2014] [Indexed: 11/26/2022]

Szklarczyk D, Franceschini A, Wyder S, Forslund K, Heller D, Huerta-Cepas J, Simonovic M, Roth A, Santos A, Tsafou KP, Kuhn M, Bork P, Jensen LJ, von Mering C. STRING v10: protein-protein interaction networks, integrated over the tree of life. Nucleic Acids Res 2014;43:D447-52. [PMID: 25352553 PMCID: PMC4383874 DOI: 10.1093/nar/gku1003] [Citation(s) in RCA: 7123] [Impact Index Per Article: 712.3] [Indexed: 02/06/2023]

Zybailov BL, Glazko GV, Jaiswal M, Raney KD. Large Scale Chemical Cross-linking Mass Spectrometry Perspectives. J Proteomics Bioinform 2013;6:001. [PMID: 25045217 PMCID: PMC4101816 DOI: 10.4172/jpb.s2-001] [Citation(s) in RCA: 15] [Impact Index Per Article: 1.4] [Indexed: 12/27/2022]

Abstract

The spectacular heterogeneity of a complex protein mixture from biological samples becomes even more difficult to tackle when one’s attention is shifted towards different protein complex topologies, transient interactions, or localization of PPIs. Meticulous protein-by-protein affinity pull-downs and yeast-two-hybrid screens are the two approaches currently used to decipher proteome-wide interaction networks. Another method is to employ chemical cross-linking, which gives not only identities of interactors, but could also provide information on the sites of interactions and interaction interfaces. Despite significant advances in mass spectrometry instrumentation over the last decade, mapping Protein-Protein Interactions (PPIs) using chemical cross-linking remains time consuming and requires substantial expertise, even in the simplest of systems. While robust methodologies and software exist for the analysis of binary PPIs and also for the single protein structure refinement using cross-linking-derived constraints, undertaking a proteome-wide cross-linking study is highly complex. Difficulties include i) identifying cross-linkers of the right length and selectivity that could capture interactions of interest; ii) enrichment of the cross-linked species; iii) identification and validation of the cross-linked peptides and cross-linked sites.

In this review we examine existing literature aimed at large-scale protein cross-linking and discuss possible paths for improvement. We also discuss short-length cross-linkers of broad specificity such as formaldehyde and diazirine-based photo-cross-linkers. These cross-linkers could potentially capture many types of interactions, without a strict requirement for a particular amino acid to be present at a given protein-protein interface. How can these short-length, broad-specificity cross-linkers be applied to proteome-wide studies? We will suggest specific advances in methodology, instrumentation and software that are needed to make such a leap.

Armean IM, Lilley KS, Trotter MWB. Popular computational methods to assess multiprotein complexes derived from label-free affinity purification and mass spectrometry (AP-MS) experiments. Mol Cell Proteomics 2012;12:1-13. [PMID: 23071097 DOI: 10.1074/mcp.r112.019554] [Citation(s) in RCA: 41] [Impact Index Per Article: 3.4] [Indexed: 11/06/2022]

Abstract

Advances in sensitivity, resolution, mass accuracy, and throughput have considerably increased the number of protein identifications made via mass spectrometry. Despite these advances, state-of-the-art experimental methods for the study of protein-protein interactions yield more candidate interactions than may be expected biologically owing to biases and limitations in the experimental methodology. In silico methods, which distinguish between true and false interactions, have been developed and applied successfully to reduce the number of false positive results yielded by physical interaction assays. Such methods may be grouped according to: (1) the type of data used: methods based on experiment-specific measurements (e.g., spectral counts or identification scores) versus methods that extract knowledge encoded in external annotations (e.g., public interaction and functional categorisation databases); (2) the type of algorithm applied: the statistical description and estimation of physical protein properties versus predictive supervised machine learning or text-mining algorithms; (3) the type of protein relation evaluated: direct (binary) interaction of two proteins in a co-complex versus probability of any functional relationship between two proteins (e.g., co-occurrence in a pathway, subcellular compartment); and (4) initial motivation: elucidation of experimental data by evaluation versus prediction of novel protein-protein interaction, to be experimentally validated a posteriori. This work reviews several popular computational scoring methods and software platforms for protein-protein interaction evaluation according to their methodology, comparative strengths and weaknesses, data representation, accessibility, and availability. The scoring methods and platforms described include: CompPASS, SAINT, Decontaminator, MINT, IntAct, STRING, and FunCoup. References to related work are provided throughout in order to provide a concise but thorough introduction to a rapidly growing interdisciplinary field of investigation.

Prediction and identification of sequences coding for orphan enzymes using genomic and metagenomic neighbours. Mol Syst Biol 2012;8:581. [PMID: 22569339 PMCID: PMC3377989 DOI: 10.1038/msb.2012.13] [Citation(s) in RCA: 26] [Impact Index Per Article: 2.2] [Received: 12/21/2011] [Accepted: 03/24/2012] [Indexed: 11/09/2022]

Abstract

Many characterized metabolic enzymes currently lack associated gene and protein sequences. Here, pathway and genomic neighbour data are used to assign genes to these ‘orphan enzymes’, and the predictions are validated with experimental assays and genome-scale metabolic modelling.

A computational method is developed for assigning candidate sequences to orphan enzymes. The method uses metabolic pathway, genomic neighbourhood, genomic co-occurrence, and protein domain information to predict genes that are likely to perform a particular enzymatic function.

Benchmarking of the scoring scheme based on the 4 features above revealed that some combinations of parameters yielded greater than 70% accuracy, and that high-confidence predictions could be generated for 131 orphan enzymes.

Enzyme assay experiments confirmed the predicted enzymatic activity for two of the high-confidence candidate sequences.

Predicted functions can improve the annotation of genomic and metagenomic data, and can reveal putative genes for enzymes with potential biotechnological applications.

Incorporating the predicted enzymatic reactions into genome-scale metabolic models changed the flux connectivity and improved their ability to correctly predict gene essentiality, supporting the biological relevance of these predictions.

Despite the current wealth of sequencing data, one-third of all biochemically characterized metabolic enzymes lack a corresponding gene or protein sequence, and as such can be considered orphan enzymes. They represent a major gap between our molecular and biochemical knowledge, and consequently are not amenable to modern systemic analyses. As 555 of these orphan enzymes have metabolic pathway neighbours, we developed a global framework that utilizes the pathway and (meta)genomic neighbour information to assign candidate sequences to orphan enzymes. For 131 orphan enzymes (37% of those for which (meta)genomic neighbours are available), we associate sequences to them using scoring parameters with an estimated accuracy of 70%, implying functional annotation of 16 345 gene sequences in numerous (meta)genomes. As a case in point, two of these candidate sequences were experimentally validated to encode the predicted activity. In addition, we augmented the currently available genome-scale metabolic models with these new sequence–function associations and were able to expand the models by on average 8%, with a considerable change in the flux connectivity patterns and improved essentiality prediction.

Raftery AE, Niu X, Hoff PD, Yeung KY. Fast Inference for the Latent Space Network Model Using a Case-Control Approximate Likelihood. J Comput Graph Stat 2012;21:901-919. [PMID: 27570438 DOI: 10.1080/10618600.2012.679240] [Citation(s) in RCA: 37] [Impact Index Per Article: 3.1] [Indexed: 10/28/2022]

Doerks T, van Noort V, Minguez P, Bork P. Annotation of the M. tuberculosis hypothetical orfeome: adding functional information to more than half of the uncharacterized proteins. PLoS One 2012;7:e34302. [PMID: 22485162 PMCID: PMC3317503 DOI: 10.1371/journal.pone.0034302] [Citation(s) in RCA: 49] [Impact Index Per Article: 4.1] [Received: 12/06/2011] [Accepted: 02/26/2012] [Indexed: 11/18/2022]

Powell S, Szklarczyk D, Trachana K, Roth A, Kuhn M, Muller J, Arnold R, Rattei T, Letunic I, Doerks T, Jensen LJ, von Mering C, Bork P. eggNOG v3.0: orthologous groups covering 1133 organisms at 41 different taxonomic ranges. Nucleic Acids Res 2012;40:D284-9. [PMID: 22096231 PMCID: PMC3245133 DOI: 10.1093/nar/gkr1060] [Citation(s) in RCA: 386] [Impact Index Per Article: 32.2] [Received: 09/15/2011] [Revised: 10/26/2011] [Accepted: 10/26/2011] [Indexed: 11/28/2022]

Affiliation(s)

Sean Powell, Damian Szklarczyk, Kalliopi Trachana, Alexander Roth, Michael Kuhn, Jean Muller, Roland Arnold, Thomas Rattei, Ivica Letunic, Tobias Doerks, Lars J. Jensen, Christian von Mering, Peer Bork

European Molecular Biology Laboratory, Meyerhofstrasse 1, 69117 Heidelberg, Germany; Novo Nordisk Foundation Center for Protein Research, Faculty of Health Sciences, University of Copenhagen, 2200 Copenhagen N, Denmark; University of Zurich and Swiss Institute of Bioinformatics, Winterthurerstrasse 190, 8057 Zurich, Switzerland; Biotechnology Center, TU Dresden, 01062 Dresden, Germany; Institute of Genetics and Molecular and Cellular Biology, CNRS, INSERM, University of Strasbourg, Genetic Diagnostics Laboratory, CHU Strasbourg Nouvel Hôpital Civil, Strasbourg, France; Terrence Donnelly Centre for Cellular and Biomolecular Research, University of Toronto, 160 College Street, Toronto, Ontario M5S 3E1, Canada; University of Vienna, Department of Computational Systems Biology, Althanstrasse 14, 1090 Vienna, Austria; and Max-Delbrück-Centre for Molecular Medicine, Robert-Rössle-Strasse 10, 13092 Berlin, Germany


Ng SK, Tan SH. Discovering protein–protein interactions. J Bioinform Comput Biol 2011;1:711-41. [PMID: 15290761 DOI: 10.1142/s0219720004000600] [Citation(s) in RCA: 9] [Impact Index Per Article: 0.7] [Reference Citation Analysis] [Abstract] [MESH Headings] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Received: 10/10/2003] [Revised: 12/12/2003] [Accepted: 12/13/2003] [Indexed: 11/18/2022]

De Las Rivas J, de Luis A. Interactome data and databases: different types of protein interaction. Comp Funct Genomics 2011;5:173-8. [PMID: 18629062 PMCID: PMC2447346 DOI: 10.1002/cfg.377] [Citation(s) in RCA: 29] [Impact Index Per Article: 2.2] [Reference Citation Analysis] [Abstract] [Track Full Text] [Download PDF] [Journal Information] [Subscribe] [Scholar Register] [Received: 11/12/2003] [Revised: 12/10/2003] [Accepted: 12/18/2003] [Indexed: 11/29/2022] Open

Assessing the biological significance of gene expression signatures and co-expression modules by studying their network properties. PLoS One 2011;6:e17474. [PMID: 21408226 PMCID: PMC3049771 DOI: 10.1371/journal.pone.0017474] [Citation(s) in RCA: 23] [Impact Index Per Article: 1.8] [Reference Citation Analysis] [Abstract] [Track Full Text] [Download PDF] [Figures] [Journal Information] [Subscribe] [Scholar Register] [Received: 10/13/2010] [Accepted: 02/03/2011] [Indexed: 12/23/2022] Open

Abstract

Microarray experiments have been extensively used to define signatures, which are sets of genes that can be considered markers of experimental conditions (typically diseases). Paradoxically, in spite of the apparent functional role that might be attributed to such gene sets, signatures do not seem to be reproducible across experiments. Given the close relationship between function and protein interaction, network properties can be used to study to what extent signatures are composed of genes whose resulting proteins show a considerable level of interaction (and consequently a putative common functional role).

We have analysed 618 signatures and 507 modules of co-expression in cancer looking for significant values of four main protein-protein interaction (PPI) network parameters: connection degree, clustering coefficient, betweenness and number of components. A total of 3904 gene ontology (GO) modules, 146 KEGG pathways, and 263 Biocarta pathways have been used as functional modules of reference.

Co-expression modules found in microarray experiments display a high level of connectivity, similar to the one shown by conventional modules based on functional definitions (GO, KEGG and Biocarta). A general observation for all the classes studied is that the networks formed by the modules improve their topological parameters when an external protein is allowed to be introduced within the paths (up to 70% of GO modules show network parameters beyond the random expectation). This fact suggests that functional definitions are incomplete and some genes might still be missing. Conversely, signatures are clearly not capturing the altered functions in the corresponding studies. This is probably because the way in which the genes have been selected in the signatures is too conservative.

These results suggest that gene selection methods which take into account relationships among genes should be superior to methods that assume independence among genes outside their functional contexts.
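
As an illustration of three of the four network parameters named in the abstract above (connection degree, clustering coefficient and number of components), the following is a minimal Python sketch on a toy network. The function names, the toy edge list and the toy gene set are hypothetical and are not taken from the paper; betweenness and the comparison against random gene sets are left out for brevity.

```python
from itertools import combinations


def induced_subgraph(edges, genes):
    """Adjacency sets for the PPI edges whose endpoints are both in `genes`."""
    genes = set(genes)
    adj = {g: set() for g in genes}
    for a, b in edges:
        if a in genes and b in genes and a != b:
            adj[a].add(b)
            adj[b].add(a)
    return adj


def mean_degree(adj):
    """Average number of interaction partners per gene inside the set."""
    return sum(len(nbrs) for nbrs in adj.values()) / len(adj)


def mean_clustering(adj):
    """Average local clustering coefficient (nodes with < 2 neighbours count as 0)."""
    total = 0.0
    for nbrs in adj.values():
        k = len(nbrs)
        if k < 2:
            continue
        links = sum(1 for u, v in combinations(nbrs, 2) if v in adj[u])
        total += 2 * links / (k * (k - 1))
    return total / len(adj)


def n_components(adj):
    """Number of connected components, counted with an iterative depth-first search."""
    seen, count = set(), 0
    for start in adj:
        if start in seen:
            continue
        count += 1
        seen.add(start)
        stack = [start]
        while stack:
            node = stack.pop()
            for nb in adj[node]:
                if nb not in seen:
                    seen.add(nb)
                    stack.append(nb)
    return count


# Hypothetical toy interactome and gene signature (illustration only).
toy_edges = [("A", "B"), ("B", "C"), ("A", "C"), ("C", "D"), ("E", "F")]
toy_signature = ["A", "B", "C", "D", "E"]
adj = induced_subgraph(toy_edges, toy_signature)
print(mean_degree(adj), mean_clustering(adj), n_components(adj))
```

A signature whose genes form few components with high degree and clustering would, in the spirit of the abstract, be a candidate for a functionally coherent gene set; judging significance would still require comparing these values with those of random gene sets of the same size.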


Szklarczyk D, Franceschini A, Kuhn M, Simonovic M, Roth A, Minguez P, Doerks T, Stark M, Muller J, Bork P, Jensen LJ, von Mering C. The STRING database in 2011: functional interaction networks of proteins, globally integrated and scored. Nucleic Acids Res 2011;39:D561-8. [PMID: 21045058 PMCID: PMC3013807 DOI: 10.1093/nar/gkq973] [Citation(s) in RCA: 2547] [Impact Index Per Article: 195.9] [Reference Citation Analysis] [Abstract] [MESH Headings] [Track Full Text] [Download PDF] [Figures] [Journal Information] [Subscribe] [Scholar Register] [Received: 09/07/2010] [Accepted: 10/03/2010] [Indexed: 12/12/2022] Open

Affiliation(s)

Damian Szklarczyk, Andrea Franceschini, Michael Kuhn, Milan Simonovic, Alexander Roth, Pablo Minguez, Tobias Doerks, Manuel Stark, Jean Muller, Peer Bork, Lars J. Jensen, Christian von Mering
Faculty of Health Sciences, Novo Nordisk Foundation Centre for Protein Research, University of Copenhagen, Denmark, Faculty of Science, Institute of Molecular Life Sciences and Swiss Institute of Bioinformatics, University of Zurich, Switzerland, Biotechnology Center, Technical University Dresden, Structural and Computational Biology Unit, European Molecular Biology Laboratory, Heidelberg, Germany, Institute of Genetics and Molecular and Cellular Biology, CNRS, INSERM, University of Strasbourg, Genetic Diagnostics Laboratory, CHU Strasbourg Nouvel Hôpital Civil, Strasbourg, France and Max-Delbrück-Centre for Molecular Medicine, Berlin, Germany


Jaeger S, Sers CT, Leser U. Combining modularity, conservation, and interactions of proteins significantly increases precision and coverage of protein function prediction. BMC Genomics 2010;11:717. [PMID: 21171995 PMCID: PMC3017542 DOI: 10.1186/1471-2164-11-717] [Citation(s) in RCA: 9] [Impact Index Per Article: 0.6] [Reference Citation Analysis] [Abstract] [Track Full Text] [Download PDF] [Figures] [Journal Information] [Subscribe] [Scholar Register] [Received: 04/06/2010] [Accepted: 12/20/2010] [Indexed: 11/10/2022] Open

Schliep K, Lopez P, Lapointe FJ, Bapteste E. Harvesting evolutionary signals in a forest of prokaryotic gene trees. Mol Biol Evol 2010;28:1393-405. [PMID: 21172835 DOI: 10.1093/molbev/msq323] [Citation(s) in RCA: 31] [Impact Index Per Article: 2.2] [Reference Citation Analysis] [Abstract] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Indexed: 11/13/2022] Open

Hawkins T, Chitale M, Kihara D. Functional enrichment analyses and construction of functional similarity networks with high confidence function prediction by PFP. BMC Bioinformatics 2010;11:265. [PMID: 20482861 PMCID: PMC2882935 DOI: 10.1186/1471-2105-11-265] [Citation(s) in RCA: 16] [Impact Index Per Article: 1.1] [Reference Citation Analysis] [Abstract] [Track Full Text] [Download PDF] [Figures] [Journal Information] [Subscribe] [Scholar Register] [Received: 10/15/2009] [Accepted: 05/19/2010] [Indexed: 11/10/2022] Open

Abstract

BACKGROUND

A new paradigm of biological investigation takes advantage of technologies that produce large high throughput datasets, including genome sequences, interactions of proteins, and gene expression. The ability of biologists to analyze and interpret such data relies on functional annotation of the included proteins, but even in highly characterized organisms many proteins can lack the functional evidence necessary to infer their biological relevance.

RESULTS

Here we have applied high confidence function predictions from our automated prediction system, PFP, to three genome sequences, Escherichia coli, Saccharomyces cerevisiae, and Plasmodium falciparum (malaria). The number of annotated genes is increased by PFP to over 90% for all of the genomes. Using the large coverage of the function annotation, we introduced the functional similarity networks which represent the functional space of the proteomes. Four different functional similarity networks are constructed for each proteome, one each by considering similarity in a single Gene Ontology (GO) category, i.e. Biological Process, Cellular Component, and Molecular Function, and another one by considering overall similarity with the funSim score. The functional similarity networks are shown to have higher modularity than the protein-protein interaction network. Moreover, the funSim score network is distinct from the single GO-score networks by showing a higher clustering degree exponent value and thus has a higher tendency to be hierarchical. In addition, examining function assignments to the protein-protein interaction network and local regions of genomes has identified numerous cases where subnetworks or local regions have functionally coherent proteins. These results will help in interpreting interactions of proteins and gene orders in a genome. Several examples of both analyses are highlighted.

CONCLUSION

The analyses demonstrate that applying high confidence predictions from PFP can have a significant impact on a researcher's ability to interpret the immense biological data that are being generated today. The newly introduced functional similarity networks of the three organisms show different network properties as compared with the protein-protein interaction networks.


Rodriguez-Soca Y, Munteanu CR, Dorado J, Pazos A, Prado-Prado FJ, González-Díaz H. Trypano-PPI: A Web Server for Prediction of Unique Targets in Trypanosome Proteome by using Electrostatic Parameters of Protein−protein Interactions. J Proteome Res 2009;9:1182-90. [DOI: 10.1021/pr900827b] [Citation(s) in RCA: 34] [Impact Index Per Article: 2.3] [Reference Citation Analysis] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Indexed: 01/08/2023]

Affiliation(s)

Yamilet Rodriguez-Soca, Cristian R. Munteanu, Julián Dorado, Alejandro Pazos, Francisco J. Prado-Prado, Humberto González-Díaz
Department of Microbiology & Parasitology, Faculty of Pharmacy, University of Santiago de Compostela, 15782, Santiago de Compostela, Spain, and Department of Information and Communication Technologies, Computer Science Faculty, University of A Coruña, Campus de Elviña, 15071, A Coruña, Spain


Bertini I, Cavallaro G. Bioinformatics in bioinorganic chemistry. Metallomics 2009;2:39-51. [PMID: 21072373 DOI: 10.1039/b912156k] [Citation(s) in RCA: 17] [Impact Index Per Article: 1.1] [Reference Citation Analysis] [Abstract] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Indexed: 12/13/2022]

Li X, Chen H, Li J, Zhang Z. Gene function prediction with gene interaction networks: a context graph kernel approach. IEEE Trans Inf Technol Biomed 2009;14:119-28. [PMID: 19789115 DOI: 10.1109/titb.2009.2033116] [Citation(s) in RCA: 19] [Impact Index Per Article: 1.3] [Reference Citation Analysis] [Abstract] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Indexed: 11/10/2022]

Babu M, Musso G, Díaz-Mejía JJ, Butland G, Greenblatt JF, Emili A. Systems-level approaches for identifying and analyzing genetic interaction networks in Escherichia coli and extensions to other prokaryotes. Mol Biosyst 2009;5:1439-55. [PMID: 19763343 DOI: 10.1039/b907407d] [Citation(s) in RCA: 26] [Impact Index Per Article: 1.7] [Reference Citation Analysis] [Abstract] [Track Full Text] [Subscribe] [Scholar Register] [Indexed: 12/24/2022]

Kushwaha SK, Shakya M. PINAT1.0: protein interaction network analysis tool. Bioinformation 2009;3:419-21. [PMID: 19759862 PMCID: PMC2737494 DOI: 10.6026/97320630003419] [Citation(s) in RCA: 1] [Impact Index Per Article: 0.1] [Reference Citation Analysis] [Abstract] [Key Words] [Track Full Text] [Download PDF] [Figures] [Journal Information] [Subscribe] [Scholar Register] [Received: 03/01/2009] [Revised: 04/01/2009] [Accepted: 04/08/2009] [Indexed: 11/28/2022] Open

Dotan-Cohen D, Letovsky S, Melkman AA, Kasif S. Biological process linkage networks. PLoS One 2009;4:e5313. [PMID: 19390589 PMCID: PMC2669181 DOI: 10.1371/journal.pone.0005313] [Citation(s) in RCA: 31] [Impact Index Per Article: 2.1] [Reference Citation Analysis] [Abstract] [MESH Headings] [Grants] [Track Full Text] [Download PDF] [Figures] [Journal Information] [Subscribe] [Scholar Register] [Received: 08/04/2008] [Accepted: 03/24/2009] [Indexed: 12/21/2022] Open

Abstract

Background

The traditional approach to studying complex biological networks is based on the identification of interactions between internal components of signaling or metabolic pathways. By comparison, little is known about interactions between higher order biological systems, such as biological pathways and processes.

We propose a methodology for gleaning patterns of interactions between biological processes by analyzing protein-protein interactions, transcriptional co-expression and genetic interactions. At the heart of the methodology are the concept of Linked Processes and the resultant network of biological processes, the Process Linkage Network (PLN).

Results

We construct, catalogue, and analyze different types of PLNs derived from different data sources and different species. When applied to the Gene Ontology, many of the resulting links connect processes that are distant from each other in the hierarchy, even though the connection makes eminent sense biologically. Some others, however, carry an element of surprise and may reflect mechanisms that are unique to the organism under investigation. In this aspect our method complements the link structure between processes inherent in the Gene Ontology, which by its very nature is species-independent.

As a practical application of the linkage of processes we demonstrate that it can be effectively used in protein function prediction, having the power to increase both the coverage and the accuracy of predictions, when carefully integrated into prediction methods.

Conclusions

Our approach constitutes a promising new direction towards understanding the higher levels of organization of the cell as a system which should help current efforts to re-engineer ontologies and improve our ability to predict which proteins are involved in specific biological processes.


Hawkins T, Chitale M, Luban S, Kihara D. PFP: Automated prediction of gene ontology functional annotations with confidence scores using protein sequence data. Proteins 2009;74:566-82. [PMID: 18655063 DOI: 10.1002/prot.22172] [Citation(s) in RCA: 79] [Impact Index Per Article: 5.3] [Reference Citation Analysis] [Abstract] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Indexed: 11/12/2022]

Abstract

Protein function prediction is a central problem in bioinformatics, increasing in importance recently due to the rapid accumulation of biological data awaiting interpretation. Sequence data represents the bulk of this new stock and is the obvious target for consideration as input, as newly sequenced organisms often lack any other type of biological characterization. We have previously introduced PFP (Protein Function Prediction) as our sequence-based predictor of Gene Ontology (GO) functional terms. PFP interprets the results of a PSI-BLAST search by extracting and scoring individual functional attributes, searching a wide range of E-value sequence matches, and utilizing conventional data mining techniques to fill in missing information. We have shown it to be effective in predicting both specific and low-resolution functional attributes when sufficient data is unavailable. Here we describe (1) significant improvements to the PFP infrastructure, including the addition of prediction significance and confidence scores, (2) a thorough benchmark of performance and comparisons to other related prediction methods, and (3) applications of PFP predictions to genome-scale data. We applied PFP predictions to uncharacterized protein sequences from 15 organisms. Among these sequences, 60-90% could be annotated with a GO molecular function term at high confidence (≥80%). We also applied our predictions to the protein-protein interaction network of the Malaria plasmodium (Plasmodium falciparum). High confidence GO biological process predictions (≥90%) from PFP increased the number of fully enriched interactions in this dataset from 23% of interactions to 94%. Our benchmark comparison shows significant performance improvement of PFP relative to GOtcha, InterProScan, and PSI-BLAST predictions. This is consistent with the performance of PFP as the overall best predictor in both the AFP-SIG '05 and CASP7 function (FN) assessments.
PFP is available as a web service at http://dragon.bio.purdue.edu/pfp/.


Bernthaler A, Mühlberger I, Fechete R, Perco P, Lukas A, Mayer B. A dependency graph approach for the analysis of differential gene expression profiles. Mol Biosyst 2009;5:1720-31. [PMID: 19585005 DOI: 10.1039/b903109j] [Citation(s) in RCA: 26] [Impact Index Per Article: 1.7] [Reference Citation Analysis] [MESH Headings] [Track Full Text] [Subscribe] [Scholar Register] [Indexed: 11/21/2022]

Jaeger S, Gaudan S, Leser U, Rebholz-Schuhmann D. Integrating protein-protein interactions and text mining for protein function prediction. BMC Bioinformatics 2008;9 Suppl 8:S2. [PMID: 18673526 PMCID: PMC2500093 DOI: 10.1186/1471-2105-9-s8-s2] [Citation(s) in RCA: 30] [Impact Index Per Article: 1.9] [Reference Citation Analysis] [Abstract] [Track Full Text] [Download PDF] [Figures] [Journal Information] [Subscribe] [Scholar Register] [Indexed: 12/12/2022] Open

Dea-Ayuela MA, Pérez-Castillo Y, Meneses-Marcel A, Ubeira FM, Bolas-Fernández F, Chou KC, González-Díaz H. HP-Lattice QSAR for dynein proteins: experimental proteomics (2D-electrophoresis, mass spectrometry) and theoretic study of a Leishmania infantum sequence. Bioorg Med Chem 2008;16:7770-6. [PMID: 18662882 DOI: 10.1016/j.bmc.2008.07.023] [Citation(s) in RCA: 48] [Impact Index Per Article: 3.0] [Reference Citation Analysis] [Abstract] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Received: 01/25/2008] [Revised: 06/23/2008] [Accepted: 07/02/2008] [Indexed: 10/21/2022]

Abstract

The toxicity and inefficacy of current organic drugs against Leishmaniosis justify research projects to find new molecular targets in Leishmania species including Leishmania infantum (L. infantum) and Leishmania major (L. major), both important pathogens. In this sense, quantitative structure-activity relationship (QSAR) methods, which are very useful in Bioorganic and Medicinal Chemistry to discover small-sized drugs, may help to identify not only new drugs but also new drug targets, if we apply them to proteins. Dyneins are important proteins of these parasites governing fundamental processes such as cilia and flagella motion, nuclear migration, organization of the mitotic spindle, and chromosome separation during mitosis. However, despite the interest in them as potential drug targets, so far there has been no report whatsoever on dyneins with QSAR techniques. To the best of our knowledge, we report here the first QSAR for dynein proteins. We used as input the Spectral Moments of a Markov matrix associated with the HP-Lattice Network of the protein sequence. The data contain 411 protein sequences of different species selected by ClustalX to develop a QSAR that correctly discriminates on average between 92.75% and 92.51% of dyneins and other proteins in four different train and cross-validation datasets. We also report a combined experimental and theoretic study of a new dynein sequence in order to illustrate the utility of the model to search for potential drug targets with a practical example. First, we carried out a 2D-electrophoresis analysis of L. infantum biological samples. Next, we excised from 2D-E gels one spot of interest belonging to an unknown protein or protein fragment in the region M<20,200 and pI<4. We used the MASCOT search engine to find proteins in the L. major database with the highest similarity score to the MS of the protein isolated from L. infantum.
We used the QSAR model to predict the new sequence as dynein with probability of 99.99% without relying upon alignment. In order to confirm the previous function annotation we predicted the sequences as dynein with BLAST and the omniBLAST tools (96% alignment similarity to dyneins of other species). Using this combined strategy, we have successfully identified L. infantum protein containing dynein heavy chain, and illustrated the potential use of the QSAR model as a complement to alignment tools.


Aguilar D, Oliva B. Topological comparison of methods for predicting transcriptional cooperativity in yeast. BMC Genomics 2008;9:137. [PMID: 18366726 PMCID: PMC2315657 DOI: 10.1186/1471-2164-9-137] [Citation(s) in RCA: 12] [Impact Index Per Article: 0.8] [Reference Citation Analysis] [Abstract] [Track Full Text] [Download PDF] [Figures] [Journal Information] [Subscribe] [Scholar Register] [Received: 10/01/2007] [Accepted: 03/25/2008] [Indexed: 11/10/2022] Open

Groth P, Weiss B, Pohlenz HD, Leser U. Mining phenotypes for gene function prediction. BMC Bioinformatics 2008;9:136. [PMID: 18315868 PMCID: PMC2311305 DOI: 10.1186/1471-2105-9-136] [Citation(s) in RCA: 37] [Impact Index Per Article: 2.3] [Reference Citation Analysis] [Abstract] [Track Full Text] [Download PDF] [Figures] [Journal Information] [Subscribe] [Scholar Register] [Received: 11/08/2007] [Accepted: 03/03/2008] [Indexed: 01/29/2023] Open

Abstract

Background

Health and disease of organisms are reflected in their phenotypes. Often, a genetic component to a disease is discovered only after clearly defining its phenotype. In the past years, many technologies to systematically generate phenotypes in a high-throughput manner, such as RNA interference or gene knock-out, have been developed and used to decipher functions for genes. However, there have been relatively few efforts to make use of phenotype data beyond the single genotype-phenotype relationships.

Results

We present results on a study where we use a large set of phenotype data – in textual form – to predict gene annotation. To this end, we use text clustering to group genes based on their phenotype descriptions. We show that these clusters correlate well with several indicators for biological coherence in gene groups, such as functional annotations from the Gene Ontology (GO) and protein-protein interactions. We exploit these clusters for predicting gene function by carrying over annotations from well-annotated genes to other, less-characterized genes in the same cluster. For a subset of groups selected by applying objective criteria, we can predict GO-term annotations from the biological process sub-ontology with up to 72.6% precision and 16.7% recall, as evaluated by cross-validation. We manually verified some of these clusters and found them to exhibit high biological coherence, e.g. a group containing all available antennal Drosophila odorant receptors despite inconsistent GO-annotations.

Conclusion

The intrinsic nature of phenotypes to visibly reflect genetic activity underlines their usefulness in inferring new gene functions. Thus, systematically analyzing these data on a large scale offers many possibilities for inferring functional annotation of genes. We show that text clustering can play an important role in this process.
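
The annotation transfer step described in the Results above (carrying terms over from well-annotated genes to less-characterized genes in the same cluster) can be sketched roughly as follows. This is a minimal Python illustration under stated assumptions: the function name, the `min_fraction` threshold and the toy gene and term identifiers are hypothetical and do not come from the paper, and the text clustering itself is assumed to have been done beforehand.

```python
from collections import Counter


def transfer_annotations(clusters, annotations, min_fraction=0.5):
    """Predict terms for unannotated genes from annotated genes in the same cluster.

    clusters:     list of gene lists (assumed output of a prior clustering step)
    annotations:  dict gene -> set of terms (missing or empty means unannotated)
    min_fraction: hypothetical threshold; a term is transferred only if at least
                  this fraction of the annotated cluster members carry it
    """
    predictions = {}
    for members in clusters:
        annotated = [g for g in members if annotations.get(g)]
        if not annotated:
            continue  # nothing to carry over in this cluster
        counts = Counter(t for g in annotated for t in set(annotations[g]))
        shared = {t for t, c in counts.items() if c / len(annotated) >= min_fraction}
        if not shared:
            continue
        for g in members:
            if not annotations.get(g):
                predictions[g] = set(shared)
    return predictions


# Hypothetical toy data: two clusters, placeholder term names.
toy_clusters = [["g1", "g2", "g3"], ["g4", "g5"]]
toy_annotations = {
    "g1": {"term_A"},
    "g2": {"term_A", "term_B"},
    "g4": set(),
}
predicted = transfer_annotations(toy_clusters, toy_annotations)
print(predicted)
```

Precision and recall of such a scheme could then be estimated by hiding the annotations of one gene at a time and checking the transferred terms against the hidden ones, in line with the cross-validation mentioned in the abstract.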


González-Díaz H, González-Díaz Y, Santana L, Ubeira FM, Uriarte E. Proteomics, networks and connectivity indices. Proteomics 2008;8:750-78. [DOI: 10.1002/pmic.200700638] [Citation(s) in RCA: 170] [Impact Index Per Article: 10.6] [Reference Citation Analysis] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Indexed: 12/19/2022]

Kensche PR, van Noort V, Dutilh BE, Huynen MA. Practical and theoretical advances in predicting the function of a protein by its phylogenetic distribution. J R Soc Interface 2008;5:151-70. [PMID: 17535793 PMCID: PMC2405902 DOI: 10.1098/rsif.2007.1047] [Citation(s) in RCA: 76] [Impact Index Per Article: 4.8] [Reference Citation Analysis] [Abstract] [Key Words] [MESH Headings] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Indexed: 11/12/2022] Open

Gimona M. Protein Linguistics and the Modular Code of the Cytoskeleton. BIOSEMIOTICS 2008:189-206. [DOI: 10.1007/978-1-4020-6340-4_8] [Citation(s) in RCA: 5] [Impact Index Per Article: 0.3] [Indexed: 09/02/2023]

Sun J, Sun Y, Ding G, Liu Q, Wang C, He Y, Shi T, Li Y, Zhao Z. InPrePPI: an integrated evaluation method based on genomic context for predicting protein-protein interactions in prokaryotic genomes. BMC Bioinformatics 2007;8:414. [PMID: 17963500 PMCID: PMC2238723 DOI: 10.1186/1471-2105-8-414] [Citation(s) in RCA: 25] [Impact Index Per Article: 1.5] [Received: 03/30/2007] [Accepted: 10/26/2007] [Indexed: 01/04/2023] Open

Abstract

Background

Although many genomic features have been used to predict protein-protein interactions (PPIs), a computational method frequently uses only one of them. Having recognized the limited predictive power of a single genomic feature, investigators are now moving toward integration. So far, there have been few integration studies for PPI prediction; one failed to yield an appreciable improvement in prediction, and the others did not compare performance. It remains unclear whether integrating multiple genomic features can improve PPI prediction and, if it can, how these features should be integrated.

Results

In this study, we first performed a systematic evaluation of PPI prediction in Escherichia coli (E. coli) by four genomic-context-based methods: the phylogenetic profile method, the gene cluster method, the gene fusion method, and the gene neighbor method. The number of predicted PPIs and the average degree in the predicted PPI networks varied greatly among the four methods. Further, no method outperformed the others when we tested them on three well-defined positive datasets from the KEGG, EcoCyc, and DIP databases. Based on these comparisons, we developed a novel integrated method, named InPrePPI. InPrePPI first normalizes the AC value (an integrated value of the accuracy and coverage) of each method using the three positive datasets, then calculates a weight for each method, and finally uses the weights to calculate an integrated score for each protein pair predicted by the four genomic-context-based methods. We demonstrate that InPrePPI outperforms each of the four individual methods and, in general, two existing integrated methods: the joint observation method and the integrated prediction method in STRING. The four individual methods and InPrePPI are implemented in a user-friendly web interface.

Conclusion

This study evaluates PPI prediction by four genomic-context-based methods and presents an integrated evaluation method that shows better performance in E. coli.
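The three-step integration outlined in the Results above (normalize each method's AC value, derive a weight per method, score each predicted pair) can be sketched as follows. This is only an illustration under stated assumptions: the abstract does not give the formulas, so the per-dataset normalization, the mean-based weight, and the additive score used here are guesses, and the AC values, function names, and protein identifiers are hypothetical.

```python
from collections import defaultdict


def method_weights(ac_values):
    """ac_values: {method: [AC on dataset 1, AC on dataset 2, ...]}.

    Assumed scheme: divide each AC value by the sum over methods for that
    dataset, then average the normalized values across datasets.
    """
    n_datasets = len(next(iter(ac_values.values())))
    totals = [sum(v[i] for v in ac_values.values()) for i in range(n_datasets)]
    return {
        method: sum(v[i] / totals[i] for i in range(n_datasets)) / n_datasets
        for method, v in ac_values.items()
    }


def integrated_scores(predictions, weights):
    """predictions: {method: iterable of (protein_a, protein_b)}.

    Assumed scheme: a pair's score is the sum of the weights of the
    methods that predict it. Pairs are treated as unordered.
    """
    scores = defaultdict(float)
    for method, pairs in predictions.items():
        for a, b in pairs:
            scores[frozenset((a, b))] += weights[method]
    return dict(scores)


# Hypothetical AC values on three positive datasets (e.g. KEGG, EcoCyc, DIP).
ac = {
    "phylogenetic_profile": [0.2, 0.3, 0.1],
    "gene_cluster": [0.4, 0.3, 0.3],
    "gene_fusion": [0.1, 0.1, 0.2],
    "gene_neighbor": [0.3, 0.3, 0.4],
}
weights = method_weights(ac)

# Hypothetical predictions; "protA" etc. are placeholder identifiers.
predicted = {
    "gene_cluster": [("protA", "protB")],
    "gene_neighbor": [("protB", "protA"), ("protA", "protC")],
}
scores = integrated_scores(predicted, weights)
for pair, score in scores.items():
    print(sorted(pair), round(score, 3))
```

Under this assumed scheme the weights sum to one, so a pair supported by several methods scores higher than a pair supported by one, and a pair predicted by all four methods would score 1.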

Raes J, Foerstner KU, Bork P. Get the most out of your metagenome: computational analysis of environmental sequence data. Curr Opin Microbiol 2007;10:490-8. [DOI: 10.1016/j.mib.2007.09.001] [Citation(s) in RCA: 130] [Impact Index Per Article: 7.6] [Received: 07/02/2007] [Revised: 08/27/2007] [Accepted: 09/03/2007] [Indexed: 11/28/2022]

Harrington ED, Singh AH, Doerks T, Letunic I, von Mering C, Jensen LJ, Raes J, Bork P. Quantitative assessment of protein function prediction from metagenomics shotgun sequences. Proc Natl Acad Sci U S A 2007;104:13913-8. [PMID: 17717083 PMCID: PMC1955820 DOI: 10.1073/pnas.0702636104] [Citation(s) in RCA: 63] [Impact Index Per Article: 3.7] [Indexed: 11/18/2022] Open

Lu LJ, Sboner A, Huang YJ, Lu HX, Gianoulis TA, Yip KY, Kim PM, Montelione GT, Gerstein MB. Comparing classical pathways and modern networks: towards the development of an edge ontology. Trends Biochem Sci 2007;32:320-31. [PMID: 17583513 DOI: 10.1016/j.tibs.2007.06.003] [Citation(s) in RCA: 48] [Impact Index Per Article: 2.8] [Received: 11/23/2006] [Revised: 05/02/2007] [Accepted: 06/06/2007] [Indexed: 02/04/2023]

Raes J, Harrington ED, Singh AH, Bork P. Protein function space: viewing the limits or limited by our view? Curr Opin Struct Biol 2007;17:362-9. [PMID: 17574832 DOI: 10.1016/j.sbi.2007.05.010] [Citation(s) in RCA: 27] [Impact Index Per Article: 1.6] [Received: 04/25/2007] [Revised: 04/25/2007] [Accepted: 05/31/2007] [Indexed: 12/13/2022]

Gene function prediction based on genomic context clustering and discriminative learning: an application to bacteriophages. BMC Bioinformatics 2007;8 Suppl 4:S6. [PMID: 17570149 PMCID: PMC1892085 DOI: 10.1186/1471-2105-8-s4-s6] [Citation(s) in RCA: 9] [Impact Index Per Article: 0.5] [Indexed: 11/10/2022] Open

Szilágyi A, Grimm V, Arakaki AK, Skolnick J. Prediction of physical protein-protein interactions. Phys Biol 2007;2:S1-16. [PMID: 16204844 DOI: 10.1088/1478-3975/2/2/s01] [Citation(s) in RCA: 63] [Impact Index Per Article: 3.7] [Indexed: 01/25/2023]

Pachkov M, Dandekar T, Korbel J, Bork P, Schuster S. Use of pathway analysis and genome context methods for functional genomics of Mycoplasma pneumoniae nucleotide metabolism. Gene 2007;396:215-25. [PMID: 17467928 DOI: 10.1016/j.gene.2007.02.033] [Citation(s) in RCA: 12] [Impact Index Per Article: 0.7] [Received: 07/08/2005] [Revised: 11/26/2006] [Accepted: 02/21/2007] [Indexed: 11/27/2022]