1
Jing F, Zhang SW, Zhang S. Brief Survey of Biological Network Alignment and a Variant with Incorporation of Functional Annotations. Curr Bioinform 2018. [DOI: 10.2174/1574893612666171020103747] [Citation(s) in RCA: 2] [Impact Index Per Article: 0.3] [Indexed: 11/22/2022]
Abstract
Background: Biological network alignment has been widely studied in the context of protein-protein interaction (PPI) networks, metabolic networks and others in bioinformatics. The topological structure of networks and genomic sequence are generally used by existing methods for achieving this task. Objective and Method: Here we briefly survey the methods generally used for this task and introduce a variant with incorporation of functional annotations based on similarity in Gene Ontology (GO). Making full use of GO information is beneficial to provide insights into precise biological network alignment. Results and Conclusion: We analyze the effect of incorporation of GO information to network alignment. Finally, we make a brief summary and discuss future directions about this topic.
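As a rough illustration of how GO annotations might enter an alignment score alongside sequence and topology, the sketch below blends three per-pair similarities into one value. Everything in it is an assumption made for illustration: the Jaccard index over GO term sets stands in for whatever GO-based similarity the authors use, the weights `alpha` and `beta` are hypothetical, and the GO identifiers and input scores are toy values.

```python
def go_jaccard(terms_a, terms_b):
    """Jaccard overlap between two collections of GO term IDs.

    Returns 0.0 when both proteins are unannotated (assumed convention).
    """
    a, b = set(terms_a), set(terms_b)
    union = a | b
    return len(a & b) / len(union) if union else 0.0


def node_similarity(seq_sim, topo_sim, go_sim, alpha=0.4, beta=0.3):
    """Convex combination of sequence, topology and GO similarity.

    All three inputs are expected in [0, 1]; alpha and beta are
    hypothetical weights, and the GO term receives the remainder.
    """
    if alpha < 0.0 or beta < 0.0 or alpha + beta > 1.0:
        raise ValueError("alpha and beta must be non-negative and sum to at most 1")
    gamma = 1.0 - alpha - beta
    return alpha * seq_sim + beta * topo_sim + gamma * go_sim


# Toy protein pair: weak sequence similarity, moderate topological
# similarity, and two of four distinct GO terms shared.
go_a = {"GO:0006412", "GO:0005840", "GO:0003735"}
go_b = {"GO:0006412", "GO:0005840", "GO:0019843"}
go_sim = go_jaccard(go_a, go_b)
score = node_similarity(seq_sim=0.2, topo_sim=0.6, go_sim=go_sim)
print(f"GO similarity = {go_sim:.2f}, combined score = {score:.2f}")
```

A convex combination keeps the score in [0, 1] and makes the contribution of GO information easy to vary, which is one simple way to examine its effect on an alignment.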
Affiliation(s)
- Fang Jing
- Key Laboratory of Information Fusion Technology of Ministry of Education, College of Automation, Northwestern Polytechnical University, Xi'an 710072, China
- Shao-Wu Zhang
- Key Laboratory of Information Fusion Technology of Ministry of Education, College of Automation, Northwestern Polytechnical University, Xi'an 710072, China
- Shihua Zhang
- NCMIS, CEMS, RCSDS, Academy of Mathematics and Systems Science, Chinese Academy of Sciences, Beijing 100190, China
2
Wu L, Shen Y, Li M, Wu FX. Network output controllability-based method for drug target identification. IEEE Trans Nanobioscience 2015; 14:184-91. [PMID: 25643411 DOI: 10.1109/tnb.2015.2391175] [Citation(s) in RCA: 30] [Impact Index Per Article: 3.3] [Indexed: 11/09/2022]
Abstract
Biomolecules do not perform their functions alone, but interactively with one another to form so-called biomolecular networks. It is well known that a complex disease stems from the malfunctions of corresponding biomolecular networks. Therefore, one important task is to identify drug targets from biomolecular networks. In this study, drug target identification is formulated as a problem of finding steering nodes in biomolecular networks, and the concept of network output controllability is applied to this problem. By applying control signals to these steering nodes, the biomolecular networks are expected to transition from one state to another. A graph-theoretic algorithm has been proposed to find a minimum set of steering nodes in biomolecular networks, which can be a potential set of drug targets. Application results of the method to real biomolecular networks show that identified potential drug targets are in agreement with existing research results. This indicates that the method can generate testable predictions and provide insights into experimental design of drug discovery.
3
Panni S, Rombo SE. Searching for repetitions in biological networks: methods, resources and tools. Brief Bioinform 2013; 16:118-36. [PMID: 24300112 DOI: 10.1093/bib/bbt084] [Citation(s) in RCA: 23] [Impact Index Per Article: 2.1] [Indexed: 01/11/2023] Open
Abstract
We present here a compact overview of the data, models and methods proposed for the analysis of biological networks based on the search for significant repetitions. In particular, we concentrate on three problems widely studied in the literature: 'network alignment', 'network querying' and 'network motif extraction'. We provide (i) details of the experimental techniques used to obtain the main types of interaction data, (ii) descriptions of the models and approaches introduced to solve such problems and (iii) pointers to both the available databases and software tools. The intent is to lay out a useful roadmap for identifying suitable strategies to analyse cellular data, possibly based on the joint use of different interaction data types or analysis techniques.
4
Csermely P, Korcsmáros T, Kiss HJM, London G, Nussinov R. Structure and dynamics of molecular networks: a novel paradigm of drug discovery: a comprehensive review. Pharmacol Ther 2013; 138:333-408. [PMID: 23384594 PMCID: PMC3647006 DOI: 10.1016/j.pharmthera.2013.01.016] [Citation(s) in RCA: 512] [Impact Index Per Article: 46.5] [Received: 01/22/2013] [Accepted: 01/22/2013] [Indexed: 02/02/2023]
Abstract
Despite considerable progress in genome- and proteome-based high-throughput screening methods and in rational drug design, the increase in approved drugs in the past decade did not match the increase of drug development costs. Network description and analysis not only give a systems-level understanding of drug action and disease complexity, but can also help to improve the efficiency of drug design. We give a comprehensive assessment of the analytical tools of network topology and dynamics. The state-of-the-art use of chemical similarity, protein structure, protein-protein interaction, signaling, genetic interaction and metabolic networks in the discovery of drug targets is summarized. We propose that network targeting follows two basic strategies. The "central hit strategy" selectively targets central nodes/edges of the flexible networks of infectious agents or cancer cells to kill them. The "network influence strategy" works against other diseases, where an efficient reconfiguration of rigid networks needs to be achieved by targeting the neighbors of central nodes/edges. It is shown how network techniques can help in the identification of single-target, edgetic, multi-target and allo-network drug target candidates. We review the recent boom in network methods helping hit identification, lead selection optimizing drug efficacy, as well as minimizing side-effects and drug toxicity. Successful network-based drug development strategies are shown through the examples of infections, cancer, metabolic diseases, neurodegenerative diseases and aging. Summarizing >1200 references we suggest an optimized protocol of network-aided drug development, and provide a list of systems-level hallmarks of drug quality. Finally, we highlight network-related drug development trends helping to achieve these hallmarks by a cohesive, global approach.
Affiliation(s)
- Peter Csermely
- Department of Medical Chemistry, Semmelweis University, P.O. Box 260, H-1444 Budapest 8, Hungary.
5
Pache RA, Aloy P. A novel framework for the comparative analysis of biological networks. PLoS One 2012; 7:e31220. [PMID: 22363585 PMCID: PMC3283617 DOI: 10.1371/journal.pone.0031220] [Citation(s) in RCA: 31] [Impact Index Per Article: 2.6] [Received: 11/14/2011] [Accepted: 01/04/2012] [Indexed: 11/19/2022] Open
Abstract
Genome sequencing projects provide nearly complete lists of the individual components present in an organism, but reveal little about how they work together. Follow-up initiatives have deciphered thousands of dynamic and context-dependent interrelationships between gene products that need to be analyzed with novel bioinformatics approaches able to capture their complex emerging properties. Here, we present a novel framework for the alignment and comparative analysis of biological networks of arbitrary topology. Our strategy includes the prediction of likely conserved interactions, based on evolutionary distances, to counter the high number of missing interactions in the current interactome networks, and a fast assessment of the statistical significance of individual alignment solutions, which vastly increases its performance with respect to existing tools. Finally, we illustrate the biological significance of the results through the identification of novel complex components and potential cases of cross-talk between pathways and alternative signaling routes.
Affiliation(s)
- Roland A. Pache
- Joint BSC-IRB Program in Computational Biology, Institute for Research in Biomedicine, Barcelona, Spain
- Patrick Aloy
- Joint BSC-IRB Program in Computational Biology, Institute for Research in Biomedicine, Barcelona, Spain
- Institució Catalana de Recerca i Estudis Avançats (ICREA), Barcelona, Spain
6
Ali W, Deane CM. Evolutionary analysis reveals low coverage as the major challenge for protein interaction network alignment. Mol Biosyst 2010; 6:2296-304. [PMID: 20740252 DOI: 10.1039/c004430j] [Citation(s) in RCA: 9] [Impact Index Per Article: 0.6] [Indexed: 01/06/2023]
Abstract
Local alignments of protein interaction networks have found little conservation among several species. While this could be a consequence of the incompleteness of interaction data-sets and presence of error, an intriguing prospect is that the process of network evolution is sufficient to erase any evidence of conservation. Here, we aim to test this hypothesis using models of network evolution and also investigate the role of error in the results of network alignment. We devised a distance metric based on summary statistics to assess the fit between experimental and simulated network alignments. Our results indicate that network evolution alone is unlikely to account for the poor quality alignments given by real data. Alignments of simulated networks undergoing evolution are considerably (4 to 5 times) larger than real alignments. We compare several error models in their ability to explain this discrepancy. Our estimates of false negative rates vary from 20 to 60% dependent on whether incomplete proteome sampling is taken into account or not. We also find that false positives appear to affect network alignments little compared to false negatives indicating that incompleteness, not spurious links, is the major challenge for interactome-level comparisons.
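To give a feel for why missing interactions can shrink alignments so sharply, the sketch below works through a back-of-the-envelope model. It is an assumption made here, not the error model fitted in the paper: each truly conserved interaction is taken to be missed independently in each of the two networks, so it can only be aligned if it survives in both. The function name and the chosen rates are illustrative.

```python
def observed_conserved_fraction(fn_rate_a, fn_rate_b=None):
    """Expected fraction of truly conserved edges seen in BOTH networks.

    Assumes each edge is lost independently with the given false
    negative rate in network A and in network B (same rate if only
    one is supplied).
    """
    if fn_rate_b is None:
        fn_rate_b = fn_rate_a
    for rate in (fn_rate_a, fn_rate_b):
        if not 0.0 <= rate <= 1.0:
            raise ValueError("false negative rates must lie in [0, 1]")
    return (1.0 - fn_rate_a) * (1.0 - fn_rate_b)


# Rates spanning the 20 to 60% range quoted in the abstract.
for rate in (0.2, 0.4, 0.6):
    kept = observed_conserved_fraction(rate)
    print(f"false negative rate {rate:.0%}: {kept:.0%} of conserved edges remain alignable")
```

Because the loss compounds across the two networks, even moderate per-network incompleteness removes most of the conserved signal under this simplified model.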
Affiliation(s)
- Waqar Ali
- Department of Statistics, 1 South Parks Road, Oxford, UK OX1 3TG.
7
Drabovich AP, Diamandis EP. Combinatorial peptide libraries facilitate development of multiple reaction monitoring assays for low-abundance proteins. J Proteome Res 2010; 9:1236-45. [PMID: 20070123 DOI: 10.1021/pr900729g] [Citation(s) in RCA: 56] [Impact Index Per Article: 4.0] [Indexed: 12/24/2022]
Abstract
Low-abundance proteins present in biological fluids are often considered an attractive source of new disease biomarkers. Since such proteins are poorly observed in proteome-scale discovery experiments due to an overwhelming mass of high-abundance proteins, the development of quantitative multiple reaction monitoring (MRM) assays for low-abundance proteins is a challenging task. Here, we present a strategy that facilitates the development of MRM assays for large numbers of unpurified low-abundance proteins. Our discovery strategy is based on the reduction of the dynamic range of protein concentrations in biological fluids by means of one-bead one-compound combinatorial peptide libraries (CPL). Our 2D-LC-MS/MS approach allowed us to identify a total of 484 unique proteins in ovarian cancer ascites, and 216 proteins were assigned as low-abundance ones. Interestingly, 74 of those proteins have never been previously described in ascites fluid. Treatment with CPL allowed identification of a significantly higher number of unique peptides for low-abundance proteins and provided important empirical fragmentation information for development of MRM assays. Finally, we confirmed that MRM assays worked for 30 low-abundance proteins in the unfractionated ascites digest. Using a multiplexed MRM method, relative amounts of five proteins (kallikrein 6, metalloproteinase inhibitor 1, macrophage migration inhibitory factor, follistatin-related protein, and mesothelin) were determined in a set of ovarian cancer ascites. Multiplexed MRM assays targeting large numbers of proteins can be used to develop comprehensive panels of biomarkers with high sensitivity and selectivity, and to study complex protein networks.
Affiliation(s)
- Andrei P Drabovich
- Samuel Lunenfeld Research Institute, Mount Sinai Hospital, University of Toronto, Toronto, Ontario, Canada
8
Baggs JE, Hughes ME, Hogenesch JB. The network as the target. Wiley Interdiscip Rev Syst Biol Med 2010; 2:127-133. [DOI: 10.1002/wsbm.57] [Citation(s) in RCA: 10] [Impact Index Per Article: 0.7] [Indexed: 11/12/2022]
Affiliation(s)
- Julie E. Baggs
- Institution for Translational Medicine and Therapeutics, School of Medicine, University of Pennsylvania, Philadelphia, PA, USA
- Michael E. Hughes
- Institution for Translational Medicine and Therapeutics, School of Medicine, University of Pennsylvania, Philadelphia, PA, USA
- John B. Hogenesch
- Institution for Translational Medicine and Therapeutics, School of Medicine, University of Pennsylvania, Philadelphia, PA, USA
9
10
Liu LYD, Chen CY, Chen MJM, Tsai MS, Lee CHS, Phang TL, Chang LY, Kuo WH, Hwa HL, Lien HC, Jung SM, Lin YS, Chang KJ, Hsieh FJ. Statistical identification of gene association by CID in application of constructing ER regulatory network. BMC Bioinformatics 2009; 10:85. [PMID: 19292896 PMCID: PMC2679734 DOI: 10.1186/1471-2105-10-85] [Citation(s) in RCA: 10] [Impact Index Per Article: 0.7] [Received: 09/01/2008] [Accepted: 03/17/2009] [Indexed: 02/01/2023] Open
Abstract
Background: A variety of high-throughput techniques are now available for constructing comprehensive gene regulatory networks in systems biology. In this study, we report a new statistical approach for facilitating in silico inference of regulatory network structure. The new measure of association, coefficient of intrinsic dependence (CID), is model-free and can be applied to both continuous and categorical distributions. When given two variables X and Y, CID answers whether Y is dependent on X by examining the conditional distribution of Y given X. In this paper, we apply CID to analyze the regulatory relationships between transcription factors (TFs) (X) and their downstream genes (Y) based on clinical data. More specifically, we use estrogen receptor α (ERα) as the variable X, and the analyses are based on 48 clinical breast cancer gene expression arrays (48A). Results: The analytical utility of CID was evaluated in comparison with four commonly used statistical methods, Galton-Pearson's correlation coefficient (GPCC), Student's t-test (STT), coefficient of determination (CoD), and mutual information (MI). When compared to GPCC, CoD, and MI, CID reveals its preferential ability to discover the regulatory association where the distribution of the mRNA expression levels on X and Y does not fit linear models. On the other hand, when CID is used to measure the association of a continuous variable (Y) against a discrete variable (X), it shows similar performance as compared to STT, and appears to outperform CoD and MI. In addition, this study established a two-layer transcriptional regulatory network to exemplify the usage of CID, in combination with GPCC, in deciphering gene networks based on gene expression profiles from patient arrays. Conclusion: CID is shown to provide useful information for identifying associations between genes and transcription factors of interest in patient arrays. When coupled with the relationships detected by GPCC, the associations predicted by CID are applicable to the construction of transcriptional regulatory networks. This study shows how information from different data sources and learning algorithms can be integrated to investigate whether relevant regulatory mechanisms identified in cell models can also be partially re-identified in clinical samples of breast cancers. Availability: The implementation of CID in R code can be freely downloaded from .
Affiliation(s)
- Li-Yu D Liu
- Department of Agronomy, Biometry Division, National Taiwan University, Taipei, Taiwan.
11
Giancarlo R, Scaturro D, Utro F. Textual data compression in computational biology: a synopsis. Bioinformatics 2009; 25:1575-86. [DOI: 10.1093/bioinformatics/btp117] [Citation(s) in RCA: 63] [Impact Index Per Article: 4.2] [Indexed: 11/15/2022] Open
12
Paananen J, Wong G. FORG3D: force-directed 3D graph editor for visualization of integrated genome scale data. BMC Syst Biol 2009; 3:26. [PMID: 19239683 PMCID: PMC2651117 DOI: 10.1186/1752-0509-3-26] [Citation(s) in RCA: 9] [Impact Index Per Article: 0.6] [Received: 07/01/2008] [Accepted: 02/24/2009] [Indexed: 11/13/2022]
Abstract
Background: Genomics research produces vast amounts of experimental data that needs to be integrated in order to understand, model, and interpret the underlying biological phenomena. Interpreting these large and complex data sets is challenging and different visualization methods are needed to help produce knowledge from the data. Results: To help researchers to visualize and interpret integrated genomics data, we present a novel visualization method and bioinformatics software tool called FORG3D that is based on real-time three-dimensional force-directed graphs. FORG3D can be used to visualize integrated networks of genome scale data such as interactions between genes or gene products, signaling transduction, metabolic pathways, functional interactions and evolutionary relationships. Furthermore, we demonstrate its utility by exploring gene network relationships using integrated data sets from a Caenorhabditis elegans Parkinson's disease model. Conclusion: We have created an open source software tool called FORG3D that can be used for visualizing and exploring integrated genome scale data.
Affiliation(s)
- Jussi Paananen
- A.I. Virtanen Institute of Molecular Sciences, University of Kuopio, Kuopio, Finland.
13
Kolár M, Lässig M, Berg J. From protein interactions to functional annotation: graph alignment in Herpes. BMC Syst Biol 2008; 2:90. [PMID: 18957106 PMCID: PMC2607256 DOI: 10.1186/1752-0509-2-90] [Citation(s) in RCA: 15] [Impact Index Per Article: 0.9] [Received: 04/24/2008] [Accepted: 10/28/2008] [Indexed: 12/20/2022]
Abstract
Background: Sequence alignment is a prolific basis of functional annotation, but remains a challenging problem in the 'twilight zone' of high sequence divergence or short gene length. Here we demonstrate how information on gene interactions can help to resolve ambiguous sequence alignments. We compare two distant Herpes viruses by constructing a graph alignment, which is based jointly on the similarity of their protein interaction networks and on sequence similarity. This hybrid method provides functional associations between proteins of the two organisms that cannot be obtained from sequence or interaction data alone. Results: We find proteins where interaction similarity and sequence similarity are individually weak, but together provide significant evidence of orthology. There are also proteins with high interaction similarity but without any detectable sequence similarity, providing evidence of functional association beyond sequence homology. The functional predictions derived from our alignment are consistent with genomic position and gene expression data. Conclusion: Our approach shows that evolutionary conservation is a powerful filter to make protein interaction data informative about functional similarities between the interacting proteins, and it establishes graph alignment as a powerful tool for the comparative analysis of data from highly diverged species.
Affiliation(s)
- Michal Kolár
- Institut für Theoretische Physik, Universität zu Köln, Zülpicher Strasse 77, 50937 Köln, Germany.