Reference Citation Analysis: Find an Article, Find a Category, Find a Journal, Find a Scholar

For: Li H, Li J, Wong L. Discovering motif pairs at interaction sites from protein sequences on a proteome-wide scale. Bioinformatics 2006;22:989-96. [PMID: 16446278 DOI: 10.1093/bioinformatics/btl020] [Citation(s) in RCA: 42] [Impact Index Per Article: 2.3] [Reference Citation Analysis] [What about the content of this article? (0)] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Indexed: 11/13/2022] Open

For:	Li H, Li J, Wong L. Discovering motif pairs at interaction sites from protein sequences on a proteome-wide scale. Bioinformatics 2006;22:989-96. [PMID: 16446278 DOI: 10.1093/bioinformatics/btl020] [Citation(s) in RCA: 42] [Impact Index Per Article: 2.3] [Reference Citation Analysis] [What about the content of this article? (0)] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Indexed: 11/13/2022] Open

Number

Cited by Other Article(s)

Chen S, Mira A, Onnela JP. Flexible model selection for mechanistic network models. JOURNAL OF COMPLEX NETWORKS 2020;8:cnz024. [PMID: 32765880 PMCID: PMC7391990 DOI: 10.1093/comnet/cnz024] [Citation(s) in RCA: 4] [Impact Index Per Article: 1.0] [Reference Citation Analysis] [Abstract] [Key Words] [Grants] [Track Full Text] [Subscribe] [Scholar Register] [Received: 03/25/2019] [Accepted: 06/24/2019] [Indexed: 05/25/2023]

Boyen P, Neven F, van Dyck D, Valentim FL, van Dijk ADJ. Mining minimal motif pair sets maximally covering interactions in a protein-protein interaction network. IEEE/ACM TRANSACTIONS ON COMPUTATIONAL BIOLOGY AND BIOINFORMATICS 2013;10:73-86. [PMID: 23702545 DOI: 10.1109/tcbb.2012.165] [Citation(s) in RCA: 1] [Impact Index Per Article: 0.1] [Reference Citation Analysis] [Abstract] [MESH Headings] [Track Full Text] [Subscribe] [Scholar Register] [Indexed: 06/02/2023]

Guo F, Wang L. Computing the protein binding sites. BMC Bioinformatics 2012;13 Suppl 10:S2. [PMID: 22759425 PMCID: PMC3382440 DOI: 10.1186/1471-2105-13-s10-s2] [Citation(s) in RCA: 3] [Impact Index Per Article: 0.3] [Reference Citation Analysis] [Abstract] [Track Full Text] [Download PDF] [Figures] [Journal Information] [Subscribe] [Scholar Register] [Indexed: 11/10/2022] Open

Chang WC, Vakati S, Krause R, Eulenstein O. Exploring biological interaction networks with tailored weighted quasi-bicliques. BMC Bioinformatics 2012;13 Suppl 10:S16. [PMID: 22759421 PMCID: PMC3314588 DOI: 10.1186/1471-2105-13-s10-s16] [Citation(s) in RCA: 9] [Impact Index Per Article: 0.8] [Reference Citation Analysis] [Abstract] [MESH Headings] [Track Full Text] [Download PDF] [Figures] [Journal Information] [Subscribe] [Scholar Register] [Indexed: 11/17/2022] Open

Goh WWB, Lee YH, Chung M, Wong L. How advancement in biological network analysis methods empowers proteomics. Proteomics 2012;12:550-63. [PMID: 22247042 DOI: 10.1002/pmic.201100321] [Citation(s) in RCA: 49] [Impact Index Per Article: 4.1] [Reference Citation Analysis] [Abstract] [Journal Information] [Subscribe] [Scholar Register] [Received: 06/21/2011] [Revised: 09/05/2011] [Accepted: 09/13/2011] [Indexed: 12/23/2022]

Wong L. Using Biological Networks in Protein Function Prediction and Gene Expression Analysis. ACTA ACUST UNITED AC 2011. [DOI: 10.1080/15427951.2011.604561] [Citation(s) in RCA: 2] [Impact Index Per Article: 0.2] [Reference Citation Analysis] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Indexed: 10/15/2022]

Guo F, Li SC, Wang L. Protein-protein binding sites prediction by 3D structural similarities. J Chem Inf Model 2011;51:3287-94. [PMID: 22077765 DOI: 10.1021/ci200206n] [Citation(s) in RCA: 10] [Impact Index Per Article: 0.8] [Reference Citation Analysis] [Abstract] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Indexed: 11/28/2022]

LEUNG HENRYCHIMING, SIU MANHUNG, YIU SIUMING, CHIN FRANCISYUKLUN, SUNG KENWINGKIN. CLUSTERING-BASED APPROACH FOR PREDICTING MOTIF PAIRS FROM PROTEIN INTERACTION DATA. J Bioinform Comput Biol 2011;7:701-16. [DOI: 10.1142/s0219720009004266] [Citation(s) in RCA: 4] [Impact Index Per Article: 0.3] [Reference Citation Analysis] [Abstract] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Received: 10/09/2008] [Revised: 12/14/2008] [Accepted: 02/22/2009] [Indexed: 11/18/2022]

Schweiger R, Linial M, Linial N. Generative probabilistic models for protein-protein interaction networks--the biclique perspective. ACTA ACUST UNITED AC 2011;27:i142-8. [PMID: 21685063 PMCID: PMC3117378 DOI: 10.1093/bioinformatics/btr201] [Citation(s) in RCA: 21] [Impact Index Per Article: 1.6] [Reference Citation Analysis] [Abstract] [Track Full Text] [Figures] [Journal Information] [Subscribe] [Scholar Register] [Indexed: 11/16/2022]

Abstract

Motivation: Much of the large-scale molecular data from living cells can be represented in terms of networks. Such networks occupy a central position in cellular systems biology. In the protein–protein interaction (PPI) network, nodes represent proteins and edges represent connections between them, based on experimental evidence. As PPI networks are rich and complex, a mathematical model is sought to capture their properties and shed light on PPI evolution. The mathematical literature contains various generative models of random graphs. It is a major, still largely open question, which of these models (if any) can properly reproduce various biologically interesting networks. Here, we consider this problem where the graph at hand is the PPI network of Saccharomyces cerevisiae. We are trying to distinguishing between a model family which performs a process of copying neighbors, represented by the duplication–divergence (DD) model, and models which do not copy neighbors, with the Barabási–Albert (BA) preferential attachment model as a leading example.

Results: The observed property of the network is the distribution of maximal bicliques in the graph. This is a novel criterion to distinguish between models in this area. It is particularly appropriate for this purpose, since it reflects the graph's growth pattern under either model. This test clearly favors the DD model. In particular, for the BA model, the vast majority (92.9%) of the bicliques with both sides ≥4 must be already embedded in the model's seed graph, whereas the corresponding figure for the DD model is only 5.1%. Our results, based on the biclique perspective, conclusively show that a naïve unmodified DD model can capture a key aspect of PPI networks.

Contact:regevs01@cs.huji.ac.il; michall@cc.huji.ac.il; nati@cs.huji.ac.il

Supplementary information:Supplementary data are available at Bioinformatics online.

Collapse

Boyen P, Van Dyck D, Neven F, van Ham RCHJ, van Dijk ADJ. SLIDER: a generic metaheuristic for the discovery of correlated motifs in protein-protein interaction networks. IEEE/ACM TRANSACTIONS ON COMPUTATIONAL BIOLOGY AND BIOINFORMATICS 2011;8:1344-1357. [PMID: 21282865 DOI: 10.1109/tcbb.2011.17] [Citation(s) in RCA: 4] [Impact Index Per Article: 0.3] [Reference Citation Analysis] [Abstract] [MESH Headings] [Track Full Text] [Subscribe] [Scholar Register] [Indexed: 05/30/2023]

Sarmady M, Dampier W, Tozeren A. HIV protein sequence hotspots for crosstalk with host hub proteins. PLoS One 2011;6:e23293. [PMID: 21858059 PMCID: PMC3156123 DOI: 10.1371/journal.pone.0023293] [Citation(s) in RCA: 17] [Impact Index Per Article: 1.3] [Reference Citation Analysis] [Abstract] [MESH Headings] [Grants] [Track Full Text] [Download PDF] [Figures] [Journal Information] [Subscribe] [Scholar Register] [Received: 04/19/2011] [Accepted: 07/12/2011] [Indexed: 11/18/2022] Open

Sequence- and interactome-based prediction of viral protein hotspots targeting host proteins: a case study for HIV Nef. PLoS One 2011;6:e20735. [PMID: 21738584 PMCID: PMC3125164 DOI: 10.1371/journal.pone.0020735] [Citation(s) in RCA: 10] [Impact Index Per Article: 0.8] [Reference Citation Analysis] [Abstract] [Track Full Text] [Download PDF] [Figures] [Journal Information] [Subscribe] [Scholar Register] [Received: 04/19/2011] [Accepted: 05/08/2011] [Indexed: 01/03/2023] Open

Liu X, Li J, Wang L. Modeling protein interacting groups by quasi-bicliques: complexity, algorithm, and application. IEEE/ACM TRANSACTIONS ON COMPUTATIONAL BIOLOGY AND BIOINFORMATICS 2010;7:354-364. [PMID: 20431154 DOI: 10.1109/tcbb.2008.61] [Citation(s) in RCA: 7] [Impact Index Per Article: 0.5] [Reference Citation Analysis] [Abstract] [MESH Headings] [Track Full Text] [Subscribe] [Scholar Register] [Indexed: 05/29/2023]

Hugo W, Song F, Aung Z, Ng SK, Sung WK. SLiM on Diet: finding short linear motifs on domain interaction interfaces in Protein Data Bank. Bioinformatics 2010;26:1036-42. [DOI: 10.1093/bioinformatics/btq065] [Citation(s) in RCA: 13] [Impact Index Per Article: 0.9] [Reference Citation Analysis] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Indexed: 11/14/2022] Open

Park Y. Critical assessment of sequence-based protein-protein interaction prediction methods that do not require homologous protein sequences. BMC Bioinformatics 2009;10:419. [PMID: 20003442 PMCID: PMC2803199 DOI: 10.1186/1471-2105-10-419] [Citation(s) in RCA: 40] [Impact Index Per Article: 2.7] [Reference Citation Analysis] [Abstract] [Track Full Text] [Download PDF] [Figures] [Journal Information] [Subscribe] [Scholar Register] [Received: 07/01/2009] [Accepted: 12/14/2009] [Indexed: 11/10/2022] Open

Abstract

Background

Protein-protein interactions underlie many important biological processes. Computational prediction methods can nicely complement experimental approaches for identifying protein-protein interactions. Recently, a unique category of sequence-based prediction methods has been put forward - unique in the sense that it does not require homologous protein sequences. This enables it to be universally applicable to all protein sequences unlike many of previous sequence-based prediction methods. If effective as claimed, these new sequence-based, universally applicable prediction methods would have far-reaching utilities in many areas of biology research.

Results

Upon close survey, I realized that many of these new methods were ill-tested. In addition, newer methods were often published without performance comparison with previous ones. Thus, it is not clear how good they are and whether there are significant performance differences among them. In this study, I have implemented and thoroughly tested 4 different methods on large-scale, non-redundant data sets. It reveals several important points. First, significant performance differences are noted among different methods. Second, data sets typically used for training prediction methods appear significantly biased, limiting the general applicability of prediction methods trained with them. Third, there is still ample room for further developments. In addition, my analysis illustrates the importance of complementary performance measures coupled with right-sized data sets for meaningful benchmark tests.

Conclusions

The current study reveals the potentials and limits of the new category of sequence-based protein-protein interaction prediction methods, which in turn provides a firm ground for future endeavours in this important area of contemporary bioinformatics.

Collapse

Sim K, Li J, Gopalkrishnan V, Liu G. Mining maximal quasi-bicliques: Novel algorithm and applications in the stock market and protein networks. Stat Anal Data Min 2009. [DOI: 10.1002/sam.10051] [Citation(s) in RCA: 26] [Impact Index Per Article: 1.7] [Reference Citation Analysis] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Indexed: 11/07/2022]

Andreopoulos B, Winter C, Labudde D, Schroeder M. Triangle network motifs predict complexes by complementing high-error interactomes with structural information. BMC Bioinformatics 2009;10:196. [PMID: 19558694 PMCID: PMC2714575 DOI: 10.1186/1471-2105-10-196] [Citation(s) in RCA: 6] [Impact Index Per Article: 0.4] [Reference Citation Analysis] [Abstract] [Track Full Text] [Download PDF] [Figures] [Journal Information] [Subscribe] [Scholar Register] [Received: 01/14/2009] [Accepted: 06/27/2009] [Indexed: 11/30/2022] Open

Abstract

Background

A lot of high-throughput studies produce protein-protein interaction networks (PPINs) with many errors and missing information. Even for genome-wide approaches, there is often a low overlap between PPINs produced by different studies. Second-level neighbors separated by two protein-protein interactions (PPIs) were previously used for predicting protein function and finding complexes in high-error PPINs. We retrieve second level neighbors in PPINs, and complement these with structural domain-domain interactions (SDDIs) representing binding evidence on proteins, forming PPI-SDDI-PPI triangles.

Results

We find low overlap between PPINs, SDDIs and known complexes, all well below 10%. We evaluate the overlap of PPI-SDDI-PPI triangles with known complexes from Munich Information center for Protein Sequences (MIPS). PPI-SDDI-PPI triangles have ~20 times higher overlap with MIPS complexes than using second-level neighbors in PPINs without SDDIs. The biological interpretation for triangles is that a SDDI causes two proteins to be observed with common interaction partners in high-throughput experiments. The relatively few SDDIs overlapping with PPINs are part of highly connected SDDI components, and are more likely to be detected in experimental studies. We demonstrate the utility of PPI-SDDI-PPI triangles by reconstructing myosin-actin processes in the nucleus, cytoplasm, and cytoskeleton, which were not obvious in the original PPIN. Using other complementary datatypes in place of SDDIs to form triangles, such as PubMed co-occurrences or threading information, results in a similar ability to find protein complexes.

Conclusion

Given high-error PPINs with missing information, triangles of mixed datatypes are a promising direction for finding protein complexes. Integrating PPINs with SDDIs improves finding complexes. Structural SDDIs partially explain the high functional similarity of second-level neighbors in PPINs. We estimate that relatively little structural information would be sufficient for finding complexes involving most of the proteins and interactions in a typical PPIN.

Collapse

Chua HN, Hugo W, Liu G, Li X, Wong L, Ng SK. A Probabilistic Graph-Theoretic Approach to Integrate Multiple Predictions for the Protein-Protein Subnetwork Prediction Challenge. Ann N Y Acad Sci 2009;1158:224-33. [DOI: 10.1111/j.1749-6632.2008.03760.x] [Citation(s) in RCA: 6] [Impact Index Per Article: 0.4] [Reference Citation Analysis] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Indexed: 12/12/2022]

Li J, Liu Q. 'Double water exclusion': a hypothesis refining the O-ring theory for the hot spots at protein interfaces. Bioinformatics 2009;25:743-50. [PMID: 19179356 PMCID: PMC2654803 DOI: 10.1093/bioinformatics/btp058] [Citation(s) in RCA: 49] [Impact Index Per Article: 3.3] [Reference Citation Analysis] [Abstract] [Track Full Text] [Download PDF] [Figures] [Journal Information] [Subscribe] [Scholar Register] [Indexed: 11/17/2022] Open

Abstract

Motivation: The O-ring theory reveals that the binding hot spot at a protein interface is surrounded by a ring of residues that are energetically less important than the residues in the hot spot. As this ring of residues is served to occlude water molecules from the hot spot, the O-ring theory is also called ‘water exclusion’ hypothesis. We propose a ‘double water exclusion’ hypothesis to refine the O-ring theory by assuming the hot spot itself is water-free. To computationally model a water-free hot spot, we use a biclique pattern that is defined as two maximal groups of residues from two chains in a protein complex holding the property that every residue contacts with all residues in the other group.

Methods and Results: Given a chain pair A and B of a protein complex from the Protein Data Bank (PDB), we calculate the interatomic distance of all possible pairs of atoms between A and B. We then represent A and B as a bipartite graph based on these distance information. Maximal biclique subgraphs are subsequently identified from all of the bipartite graphs to locate biclique patterns at the interfaces. We address two properties of biclique patterns: a non-redundant occurrence in PDB, and a correspondence with hot spots when the solvent-accessible surface area (SASA) of a biclique pattern in the complex form is small. A total of 1293 biclique patterns are discovered which have a non-redundant occurrence of at least five, and which each have a minimum two and four residues at the two sides. Through extensive queries to the HotSprint and ASEdb databases, we verified that biclique patterns are rich of true hot residues. Our algorithm and results provide a new way to identify hot spots by examining proteins' structural data.

Availability: The biclique mining algorithm is available at http://www.ntu.edu.sg/home/jyli/dwe.html.

Contact:jyli@ntu.edu.sg

Supplementary information:Supplementary data are available at Bioinformatics online.

Collapse

Chua HN, Wong L. Increasing the reliability of protein interactomes. Drug Discov Today 2008;13:652-8. [DOI: 10.1016/j.drudis.2008.05.004] [Citation(s) in RCA: 54] [Impact Index Per Article: 3.4] [Reference Citation Analysis] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Received: 02/14/2008] [Revised: 05/09/2008] [Accepted: 05/13/2008] [Indexed: 11/28/2022]

Royer L, Reimann M, Andreopoulos B, Schroeder M. Unraveling protein networks with power graph analysis. PLoS Comput Biol 2008;4:e1000108. [PMID: 18617988 PMCID: PMC2424176 DOI: 10.1371/journal.pcbi.1000108] [Citation(s) in RCA: 96] [Impact Index Per Article: 6.0] [Reference Citation Analysis] [Abstract] [Track Full Text] [Download PDF] [Figures] [Journal Information] [Subscribe] [Scholar Register] [Received: 10/19/2007] [Accepted: 05/29/2008] [Indexed: 11/28/2022] Open

Abstract

Networks play a crucial role in computational biology, yet their analysis and representation is still an open problem. Power Graph Analysis is a lossless transformation of biological networks into a compact, less redundant representation, exploiting the abundance of cliques and bicliques as elementary topological motifs. We demonstrate with five examples the advantages of Power Graph Analysis. Investigating protein-protein interaction networks, we show how the catalytic subunits of the casein kinase II complex are distinguishable from the regulatory subunits, how interaction profiles and sequence phylogeny of SH3 domains correlate, and how false positive interactions among high-throughput interactions are spotted. Additionally, we demonstrate the generality of Power Graph Analysis by applying it to two other types of networks. We show how power graphs induce a clustering of both transcription factors and target genes in bipartite transcription networks, and how the erosion of a phosphatase domain in type 22 non-receptor tyrosine phosphatases is detected. We apply Power Graph Analysis to high-throughput protein interaction networks and show that up to 85% (56% on average) of the information is redundant. Experimental networks are more compressible than rewired ones of same degree distribution, indicating that experimental networks are rich in cliques and bicliques. Power Graphs are a novel representation of networks, which reduces network complexity by explicitly representing re-occurring network motifs. Power Graphs compress up to 85% of the edges in protein interaction networks and are applicable to all types of networks such as protein interactions, regulatory networks, or homology networks.

Networks play a crucial role in biology and are often used as a way to represent experimental results. Yet, their analysis and representation is still an open problem. Recent experimental and computational progress yields networks of increased size and complexity. There are, for example, small- and large-scale interaction networks, regulatory networks, genetic networks, protein-ligand interaction networks, and homology networks analyzed and published regularly. A common way to access the information in a network is though direct visualization, but this fails as it often just results in “fur balls” from which little insight can be gathered. On the other hand, clustering techniques manage to avoid the problems caused by the large number of nodes and even larger number of edges by coarse-graining the networks and thus abstracting details. But these also fail, since, in fact, much of the biology lies in the details. This work presents a novel methodology for analyzing and representing networks. Power Graphs are a lossless representation of networks, which reduces network complexity by explicitly representing re-occurring network motifs. Moreover, power graphs can be clearly visualized: they compress up to 90% of the edges in biological networks and are applicable to all types of networks such as protein interaction, regulatory networks, or homology networks.

Collapse

Guo J, Wu X, Zhang DY, Lin K. Genome-wide inference of protein interaction sites: lessons from the yeast high-quality negative protein-protein interaction dataset. Nucleic Acids Res 2008;36:2002-11. [PMID: 18281313 PMCID: PMC2346601 DOI: 10.1093/nar/gkn016] [Citation(s) in RCA: 14] [Impact Index Per Article: 0.9] [Reference Citation Analysis] [Abstract] [Track Full Text] [Download PDF] [Figures] [Journal Information] [Subscribe] [Scholar Register] [Indexed: 11/12/2022] Open

van Dijk ADJ, ter Braak CJF, Immink RG, Angenent GC, van Ham RCHJ. Predicting and understanding transcription factor interactions based on sequence level determinants of combinatorial control. ACTA ACUST UNITED AC 2007;24:26-33. [PMID: 18024974 DOI: 10.1093/bioinformatics/btm539] [Citation(s) in RCA: 15] [Impact Index Per Article: 0.9] [Reference Citation Analysis] [Abstract] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Indexed: 11/13/2022]

Devos D, Russell RB. A more complete, complexed and structured interactome. Curr Opin Struct Biol 2007;17:370-7. [PMID: 17574831 DOI: 10.1016/j.sbi.2007.05.011] [Citation(s) in RCA: 44] [Impact Index Per Article: 2.6] [Reference Citation Analysis] [Abstract] [MESH Headings] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Received: 04/18/2007] [Revised: 04/18/2007] [Accepted: 05/31/2007] [Indexed: 11/16/2022]

Andreopoulos B, An A, Wang X, Faloutsos M, Schroeder M. Clustering by common friends finds locally significant proteins mediating modules. Bioinformatics 2007;23:1124-31. [PMID: 17314122 DOI: 10.1093/bioinformatics/btm064] [Citation(s) in RCA: 26] [Impact Index Per Article: 1.5] [Reference Citation Analysis] [Abstract] [MESH Headings] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Indexed: 11/14/2022] Open

Sim K, Li J, Gopalkrishnan V, Liu G. Mining Maximal Quasi-Bicliques to Co-Cluster Stocks and Financial Ratios for Value Investment. ACTA ACUST UNITED AC 2006. [DOI: 10.1109/icdm.2006.111] [Citation(s) in RCA: 26] [Impact Index Per Article: 1.4] [Reference Citation Analysis] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Indexed: 11/06/2022]