51
Hinow P, Rietman EA, Omar SI, Tuszyński JA. Algebraic and topological indices of molecular pathway networks in human cancers. Mathematical Biosciences and Engineering: MBE 2015; 12:1289-1302. [PMID: 26775864 DOI: 10.3934/mbe.2015.12.1289] [Citation(s) in RCA: 8] [Impact Index Per Article: 0.9] [Indexed: 06/05/2023]
Abstract
Protein-protein interaction networks associated with diseases have gained prominence as an area of research. We investigate algebraic and topological indices for protein-protein interaction networks of 11 human cancers derived from the Kyoto Encyclopedia of Genes and Genomes (KEGG) database. We find a strong correlation between relative automorphism group sizes and topological network complexities on the one hand and five-year survival probabilities on the other hand. Moreover, we identify several protein families (e.g. PIK, ITG, AKT families) that are repeated motifs in many of the cancer pathways. Interestingly, these sources of symmetry are often central rather than peripheral. Our results can aid in the identification of promising targets for anti-cancer drugs. Beyond that, we provide a unifying framework to study protein-protein interaction networks of families of related diseases (e.g. neurodegenerative diseases, viral diseases, substance abuse disorders).
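As a rough illustration of what an automorphism group size measures, the sketch below counts the node relabelings that leave a small undirected graph unchanged. It is a brute-force toy under stated assumptions: the function name `count_automorphisms` and the example graphs are hypothetical, it only scales to a handful of nodes, and it does not reproduce the "relative" normalization or the KEGG networks used in the paper.

```python
from itertools import permutations


def count_automorphisms(nodes, edges):
    """Count node permutations that map the edge set onto itself (brute force)."""
    nodes = list(nodes)
    edge_set = {frozenset(e) for e in edges}
    count = 0
    for perm in permutations(nodes):
        mapping = dict(zip(nodes, perm))
        mapped = {frozenset((mapping[u], mapping[v])) for u, v in edges}
        if mapped == edge_set:
            count += 1
    return count


# Hypothetical toy graph: a hub "H" bound by three interchangeable proteins.
star_nodes = ["H", "A", "B", "C"]
star_edges = [("H", "A"), ("H", "B"), ("H", "C")]
star_count = count_automorphisms(star_nodes, star_edges)

# A three-node path admits only the identity and the end-to-end flip.
path_count = count_automorphisms([1, 2, 3], [(1, 2), (2, 3)])
```

In the star graph the three leaves can be permuted freely, which is the kind of symmetry a family of interchangeable proteins attached to the same partner would contribute.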
Affiliation(s)
- Peter Hinow
- Department of Mathematical Sciences, University of Wisconsin - Milwaukee, P.O. Box 413, Milwaukee, WI 53201-0413, United States
52
Gligorijević V, Pržulj N. Methods for biological data integration: perspectives and challenges. J R Soc Interface 2015; 12:20150571. [PMID: 26490630 PMCID: PMC4685837 DOI: 10.1098/rsif.2015.0571] [Citation(s) in RCA: 157] [Impact Index Per Article: 17.4] [Received: 06/26/2015] [Accepted: 09/25/2015] [Indexed: 12/17/2022] Open
Abstract
Rapid technological advances have led to the production of different types of biological data and enabled construction of complex networks with various types of interactions between diverse biological entities. Standard network data analysis methods were shown to be limited in dealing with such heterogeneous networked data; consequently, new methods for integrative data analyses have been proposed. The integrative methods can collectively mine multiple types of biological data and produce more holistic, systems-level biological insights. We survey recent methods for collective mining (integration) of various types of networked biological data. We compare different state-of-the-art methods for data integration and highlight their advantages and disadvantages in addressing important biological problems. We identify the important computational challenges of these methods and provide a general guideline for which methods are suited to specific biological problems or specific data types. Moreover, we propose that recent non-negative matrix factorization-based approaches may become the integration methodology of choice, as they are well suited and accurate in dealing with heterogeneous data and have many opportunities for further development.
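For readers unfamiliar with non-negative matrix factorization, the following minimal sketch factors a single non-negative matrix with the classic multiplicative update rules. It is a generic illustration only: the function `nmf`, its parameters, and the random toy data are assumptions introduced here, and it does not implement any of the multi-dataset integration methods the survey compares.

```python
import numpy as np


def nmf(X, rank, n_iter=200, eps=1e-9, seed=0):
    """Factor a non-negative matrix X (n x m) as W @ H with W, H >= 0."""
    rng = np.random.default_rng(seed)
    n, m = X.shape
    W = rng.random((n, rank)) + eps
    H = rng.random((rank, m)) + eps
    for _ in range(n_iter):
        # Multiplicative updates keep every entry non-negative.
        H *= (W.T @ X) / (W.T @ W @ H + eps)
        W *= (X @ H.T) / (W @ H @ H.T + eps)
    return W, H


# Hypothetical toy data: 6 "genes" x 5 "samples" built from 2 hidden factors.
rng = np.random.default_rng(1)
X = rng.random((6, 2)) @ rng.random((2, 5))
W, H = nmf(X, rank=2)
rel_error = np.linalg.norm(X - W @ H) / np.linalg.norm(X)
```

Integration-oriented variants typically share one of the factors across several data matrices; that extension is not shown here.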
Affiliation(s)
- Nataša Pržulj
- Department of Computing, Imperial College London, London SW7 2AZ, UK
53
van Eeuwijk F. How to dissect complex traits and how to choose suitable mapping resources for system genetics? Phys Life Rev 2015; 13:186-9. [DOI: 10.1016/j.plrev.2015.04.035] [Citation(s) in RCA: 3] [Impact Index Per Article: 0.3] [Received: 04/23/2015] [Accepted: 04/23/2015] [Indexed: 12/21/2022]
54
Babaei S, Mahfouz A, Hulsman M, Lelieveldt BPF, de Ridder J, Reinders M. Hi-C Chromatin Interaction Networks Predict Co-expression in the Mouse Cortex. PLoS Comput Biol 2015; 11:e1004221. [PMID: 25965262 PMCID: PMC4429121 DOI: 10.1371/journal.pcbi.1004221] [Citation(s) in RCA: 34] [Impact Index Per Article: 3.8] [Received: 09/25/2014] [Accepted: 03/03/2015] [Indexed: 01/08/2023] Open
Abstract
The three-dimensional conformation of the genome in the cell nucleus influences important biological processes such as gene expression regulation. Recent studies have shown a strong correlation between chromatin interactions and gene co-expression. However, predicting gene co-expression from frequent long-range chromatin interactions remains challenging. We address this by characterizing the topology of the cortical chromatin interaction network using scale-aware topological measures. We demonstrate that based on these characterizations it is possible to accurately predict spatial co-expression between genes in the mouse cortex. Consistent with previous findings, we find that the chromatin interaction profile of a gene pair is a good predictor of their spatial co-expression. However, the accuracy of the prediction can be substantially improved when chromatin interactions are described using scale-aware topological measures of the multi-resolution chromatin interaction network. We conclude that, for co-expression prediction, it is necessary to take into account different levels of chromatin interactions ranging from direct interaction between genes (i.e. small-scale) to chromatin compartment interactions (i.e. large-scale).
Regulatory elements can target genes over large genomic distances through long-range chromatin interactions. These interactions arise as a result of the three-dimensional (3D) conformation of chromosomes in the cell nucleus. This 3D conformation can also result in the co-localization of co-regulated genes. To investigate this, we asked whether genome-wide chromatin interactions can predict co-expression patterns of genes. To address this question, we characterized 3D interactions between genes, captured by Hi-C measurements, by a network, termed chromatin interaction network (CIN). We applied scale-aware topological measures to the network to comprehensively characterize the chromatin interactions at different scales, ranging from direct interaction between gene pairs to chromatin compartment interactions. We then used multi-scale chromatin interactions to predict spatial co-expression patterns in the mouse cortex. The results show that the prediction performance improves when scale-aware topological measures of the multi-resolution chromatin interaction network are used.
Affiliation(s)
- Sepideh Babaei
- Delft Bioinformatics Lab, Delft University of Technology, Delft, The Netherlands
- Ahmed Mahfouz
- Delft Bioinformatics Lab, Delft University of Technology, Delft, The Netherlands
- Division of Image Processing, Department of Radiology, Leiden University Medical Center, Leiden, The Netherlands
- Marc Hulsman
- Delft Bioinformatics Lab, Delft University of Technology, Delft, The Netherlands
- Department of Clinical Genetics, VU University Medical Center, Amsterdam, The Netherlands
- Boudewijn P. F. Lelieveldt
- Division of Image Processing, Department of Radiology, Leiden University Medical Center, Leiden, The Netherlands
- Department of Intelligent Systems, Delft University of Technology, Delft, The Netherlands
- Jeroen de Ridder
- Delft Bioinformatics Lab, Delft University of Technology, Delft, The Netherlands
- * E-mail: (JDR); (MR)
- Marcel Reinders
- Delft Bioinformatics Lab, Delft University of Technology, Delft, The Netherlands
- * E-mail: (JDR); (MR)
55
Quantitative Analysis of Robustness of Dynamic Response and Signal Transfer in Insulin mediated PI3K/AKT Pathway. Comput Chem Eng 2014; 71:715-727. [PMID: 25506104 DOI: 10.1016/j.compchemeng.2014.07.018] [Citation(s) in RCA: 10] [Impact Index Per Article: 1.0] [Indexed: 01/04/2023]
Abstract
Robustness is a critical feature of signaling pathways, ensuring signal propagation with high fidelity in the event of perturbations. Here we present a detailed quantitative analysis of robustness in the insulin-mediated PI3K/AKT pathway, a critical signaling pathway maintaining self-renewal in human embryonic stem cells. Using global sensitivity analysis, we identified robustness-promoting mechanisms that ensure (1) maintenance of first-order or overshoot dynamics of the self-renewal molecule p-AKT and (2) robust transfer of signals from an oscillatory insulin stimulus to p-AKT in the presence of noise. Our results indicate that negative feedback controls the robustness to most perturbations. Faithful transfer of signal from the stimulating ligand to p-AKT occurs even in the presence of noise, albeit with signal attenuation and high-frequency cut-off. Negative feedback contributes to signal attenuation, while positive regulators upstream of PIP3 contribute to signal amplification. These results establish precise mechanisms to modulate self-renewal molecules like p-AKT.
56
Bosque G, Folch-Fortuny A, Picó J, Ferrer A, Elena SF. Topology analysis and visualization of Potyvirus protein-protein interaction network. BMC Systems Biology 2014; 8:129. [PMID: 25409737 PMCID: PMC4251984 DOI: 10.1186/s12918-014-0129-8] [Citation(s) in RCA: 23] [Impact Index Per Article: 2.3] [Received: 05/28/2014] [Accepted: 11/05/2014] [Indexed: 11/25/2022]
Abstract
Background: One of the central interests of virology is the identification of host factors that contribute to virus infection. Despite tremendous efforts, the list of factors identified remains limited. With omics techniques, the focus has changed from identifying and thoroughly characterizing individual host factors to the simultaneous analysis of thousands of interactions, framing them in the context of protein-protein interaction networks and of transcriptional regulatory networks. This new perspective is allowing the identification of direct and indirect viral targets. Such information is available for several members of the Potyviridae family, one of the largest and most important families of plant viruses.
Results: After collecting information on virus protein-protein interactions from different potyviruses, we have processed it and used it for inferring a protein-protein interaction network. All proteins are connected into a single network component. Some proteins show a high degree and are highly connected while others are much less connected, with the network showing a significant degree of disassortativity. We have attempted to integrate this virus protein-protein interaction network into the largest protein-protein interaction network of Arabidopsis thaliana, a susceptible laboratory host. To make the interpretation of data and results easier, we have developed a new approach for visualizing and analyzing the dynamic spread on the host network of the local perturbations induced by viral proteins. We found that local perturbations can reach the entire host protein-protein interaction network, although the efficiency of this spread depends on the particular viral proteins. By comparing the spread dynamics among viral proteins, we found that some proteins spread their effects fast and efficiently by attacking hubs in the host network while other proteins exert more local effects.
Conclusions: Our findings confirm that potyvirus protein-protein interaction networks are highly connected, with some proteins playing the role of hubs. Several topological parameters depend linearly on the protein degree. Some viral proteins focus their effect only on host hubs while others spread their effect among several proteins at the first step. Future new data will help to refine our model and to improve our predictions.
Electronic supplementary material: The online version of this article (doi:10.1186/s12918-014-0129-8) contains supplementary material, which is available to authorized users.
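The disassortative tendency mentioned in the Results (hubs linking preferentially to low-degree proteins) can be quantified with a degree assortativity coefficient. The sketch below computes it as the Pearson correlation of the degrees at the two ends of each edge. The function name and the toy edge lists are hypothetical, and this is not necessarily the estimator the authors used.

```python
from math import sqrt


def degree_assortativity(edges):
    """Pearson correlation of the degrees found at the two ends of each edge.

    Undefined (division by zero) when every node has the same degree.
    """
    degree = {}
    for u, v in edges:
        degree[u] = degree.get(u, 0) + 1
        degree[v] = degree.get(v, 0) + 1
    # Count every undirected edge in both directions so the measure is symmetric.
    xs, ys = [], []
    for u, v in edges:
        xs += [degree[u], degree[v]]
        ys += [degree[v], degree[u]]
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    return cov / sqrt(var_x * var_y)


# Hypothetical toy network: one hub protein bound by three low-degree partners.
star = [("HUB", "P1"), ("HUB", "P2"), ("HUB", "P3")]
r_star = degree_assortativity(star)

# A four-node path is less extreme but still disassortative.
r_path = degree_assortativity([(1, 2), (2, 3), (3, 4)])
```

Negative values indicate disassortative mixing; the pure hub-and-spoke toy graph sits at the extreme of that range.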
Affiliation(s)
- Gabriel Bosque
- Institut Universitari d'Automàtica i Informàtica Industrial, Universitat Politècnica de València, Camí de Vera s/n, 46022, València, Spain.
- Abel Folch-Fortuny
- Departamento de Estadística e Investigación Operativa Aplicadas y Calidad, Universitat Politècnica de València, Camí de Vera, s/n, Edificio 7A, 46022, València, Spain.
- Jesús Picó
- Institut Universitari d'Automàtica i Informàtica Industrial, Universitat Politècnica de València, Camí de Vera s/n, 46022, València, Spain.
- Alberto Ferrer
- Departamento de Estadística e Investigación Operativa Aplicadas y Calidad, Universitat Politècnica de València, Camí de Vera, s/n, Edificio 7A, 46022, València, Spain.
- Santiago F Elena
- Instituto de Biología Molecular y Celular de Plantas, Consejo Superior de Investigaciones Científicas-Universitat Politècnica de València, Campus UPV CPI 8E, Ingeniero Fausto Elio s/n, 46022, València, Spain; The Santa Fe Institute, Santa Fe, NM, 87501, USA.
57
Ay A, Gong D, Kahveci T. Network-based Prediction of Cancer under Genetic Storm. Cancer Inform 2014; 13:15-31. [PMID: 25368507 PMCID: PMC4214593 DOI: 10.4137/cin.s14025] [Citation(s) in RCA: 4] [Impact Index Per Article: 0.4] [Received: 05/16/2014] [Revised: 06/17/2014] [Accepted: 06/24/2014] [Indexed: 01/01/2023] Open
Abstract
Classification of cancer patients using traditional methods is a challenging task in medical practice. Owing to rapid advances in microarray technologies, expression levels of thousands of genes from individual cancer patients can now be measured. The classification of cancer patients by supervised statistical learning algorithms using gene expression datasets provides an alternative to the traditional methods. Here we present a new network-based supervised classification technique, namely the NBC method. We compare NBC to five traditional classification techniques (support vector machines (SVM), k-nearest neighbor (kNN), naïve Bayes (NB), C4.5, and random forest (RF)) using 50–300 genes selected by five feature selection methods. Our results on five large cancer datasets demonstrate that the NBC method outperforms traditional classification techniques. Our analysis suggests that using the symmetrical uncertainty (SU) feature selection method with the NBC method provides the most accurate classification strategy. Finally, in-depth analysis of the correlation-based co-expression networks chosen by our network-based classifier in different cancer classes shows that there are drastic changes in the network models of different cancer types.
Affiliation(s)
- Ahmet Ay
- Department of Mathematics, Colgate University, Hamilton, NY, USA; Department of Biology, Colgate University, Hamilton, NY, USA
- Dihong Gong
- Department of Computer and Information Science and Engineering, University of Florida, Gainesville, FL, USA
- Tamer Kahveci
- Department of Computer and Information Science and Engineering, University of Florida, Gainesville, FL, USA
58
Enhancing the functional content of eukaryotic protein interaction networks. PLoS One 2014; 9:e109130. [PMID: 25275489 PMCID: PMC4183583 DOI: 10.1371/journal.pone.0109130] [Citation(s) in RCA: 4] [Impact Index Per Article: 0.4] [Received: 12/30/2013] [Accepted: 09/08/2014] [Indexed: 12/26/2022] Open
Abstract
Protein interaction networks are a promising type of data for studying complex biological systems. However, despite the rich information embedded in these networks, they face important data quality challenges of noise and incompleteness that adversely affect the results obtained from their analysis. Here, we apply a robust measure of local network structure called common neighborhood similarity (CNS) to address these challenges. Although several CNS measures have been proposed in the literature, an understanding of their relative efficacies for the analysis of interaction networks has been lacking. We follow the framework of graph transformation to convert the given interaction network into a transformed network corresponding to each of the CNS measures evaluated. The effectiveness of each measure is then estimated by comparing the quality of protein function predictions obtained from its corresponding transformed network with those from the original network. Using a large set of human and fly protein interactions, and a set of over 100 GO terms for both, we find that several of the transformed networks produce more accurate predictions than those obtained from the original network. In particular, the HC.cont measure and other continuous CNS measures perform well on this task, especially for large networks. Further investigation reveals that the two major factors contributing to this improvement are the abilities of CNS measures to prune out noisy edges and enhance functional coherence in the transformed networks.
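The graph transformation idea can be sketched with one simple neighborhood-overlap score. The example below reweights each existing edge by the Jaccard similarity of its endpoints' neighborhoods and drops edges at or below a threshold. Jaccard similarity is used here purely as an illustrative stand-in: the function name, the choice to include each node in its own neighborhood, and the threshold are assumptions, and the HC.cont measure from the paper is not reproduced.

```python
def jaccard_transform(edges, threshold=0.0):
    """Reweight each edge by the Jaccard similarity of its endpoints' neighborhoods."""
    neighbors = {}
    for u, v in edges:
        neighbors.setdefault(u, set()).add(v)
        neighbors.setdefault(v, set()).add(u)
    weighted = {}
    for u, v in edges:
        # Assumption: each endpoint counts as a member of its own neighborhood,
        # so directly linked proteins always share at least two members.
        nu = neighbors[u] | {u}
        nv = neighbors[v] | {v}
        score = len(nu & nv) / len(nu | nv)
        if score > threshold:
            weighted[(u, v)] = score
    return weighted


# Hypothetical toy network: a tightly knit triangle A-B-C plus a pendant edge C-D.
toy_edges = [("A", "B"), ("B", "C"), ("A", "C"), ("C", "D")]
all_scores = jaccard_transform(toy_edges)
pruned = jaccard_transform(toy_edges, threshold=0.6)
```

Edges inside the triangle share most of their neighbors and score highly, while the pendant edge scores lowest and is the one removed by the threshold, which mirrors the noise-pruning effect described in the abstract.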
59
Hulsman M, Dimitrakopoulos C, de Ridder J. Scale-space measures for graph topology link protein network architecture to function. Bioinformatics 2014; 30:i237-45. [PMID: 24931989 PMCID: PMC4058939 DOI: 10.1093/bioinformatics/btu283] [Citation(s) in RCA: 12] [Impact Index Per Article: 1.2] [Indexed: 01/17/2023]
Abstract
Motivation: The network architecture of physical protein interactions is an important determinant for the molecular functions that are carried out within each cell. To study this relation, the network architecture can be characterized by graph topological characteristics such as shortest paths and network hubs. These characteristics have an important shortcoming: they do not take into account that interactions occur across different scales. This is important because some cellular functions may involve a single direct protein interaction (small scale), whereas others require more and/or indirect interactions, such as protein complexes (medium scale) and interactions between large modules of proteins (large scale).
Results: In this work, we derive generalized scale-aware versions of known graph topological measures based on diffusion kernels. We apply these to characterize the topology of networks across all scales simultaneously, generating a so-called graph topological scale-space. The comprehensive physical interaction network in yeast is used to show that scale-space based measures consistently give superior performance when distinguishing protein functional categories and three major types of functional interactions: genetic interaction, co-expression and perturbation interactions. Moreover, we demonstrate that graph topological scale spaces capture biologically meaningful features that provide new insights into the link between function and protein network architecture.
Availability and implementation: Matlab code to calculate the scale-aware topological measures (STMs) is available at http://bioinformatics.tudelft.nl/TSSA
Contact: j.deridder@tudelft.nl
Supplementary information: Supplementary data are available at Bioinformatics online.
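To give a feel for how a diffusion kernel introduces scale, the sketch below evaluates the heat kernel exp(-beta * L) of a small graph Laplacian at several values of beta. This is a generic illustration in Python rather than the authors' Matlab code: the function name, the beta values, and the toy path graph are assumptions, and the paper's specific scale-aware topological measures are not reproduced.

```python
import numpy as np


def diffusion_kernel(adjacency, beta):
    """Return exp(-beta * L) for the graph Laplacian L = D - A."""
    A = np.asarray(adjacency, dtype=float)
    L = np.diag(A.sum(axis=1)) - A
    # L is symmetric, so the matrix exponential follows from its eigendecomposition.
    eigvals, eigvecs = np.linalg.eigh(L)
    return (eigvecs * np.exp(-beta * eigvals)) @ eigvecs.T


# Hypothetical toy network: a path of four proteins 0-1-2-3.
A = np.array([
    [0, 1, 0, 0],
    [1, 0, 1, 0],
    [0, 1, 0, 1],
    [0, 0, 1, 0],
])

# Small beta stays close to each node; large beta spreads over the whole graph.
kernels = {beta: diffusion_kernel(A, beta) for beta in (0.1, 1.0, 10.0)}
```

Comparing the same pair of nodes across the kernels shows how a distant pair (here nodes 0 and 3) is nearly unrelated at a small scale and becomes strongly coupled at a large one.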
Affiliation(s)
- Marc Hulsman
- Delft Bioinformatics Lab, Faculty of Electrical Engineering, Mathematics and Computer Science, Delft University of Technology, 2628CD Delft, The Netherlands
- Christos Dimitrakopoulos
- Delft Bioinformatics Lab, Faculty of Electrical Engineering, Mathematics and Computer Science, Delft University of Technology, 2628CD Delft, The Netherlands
- Jeroen de Ridder
- Delft Bioinformatics Lab, Faculty of Electrical Engineering, Mathematics and Computer Science, Delft University of Technology, 2628CD Delft, The Netherlands
60
Takemoto K. Metabolic networks are almost nonfractal: a comprehensive evaluation. Physical Review E, Statistical, Nonlinear, and Soft Matter Physics 2014; 90:022802. [PMID: 25215776 DOI: 10.1103/physreve.90.022802] [Citation(s) in RCA: 4] [Impact Index Per Article: 0.4] [Received: 04/23/2014] [Indexed: 06/03/2023]
Abstract
Network self-similarity, or fractality, is widely accepted as an important topological property of metabolic networks; however, recent studies cast doubt on the reality of self-similarity in the networks. Therefore, we perform a comprehensive evaluation of metabolic network fractality using a box-covering method with an earlier version and the latest version of metabolic networks and demonstrate that the latest metabolic networks are almost self-dissimilar, while the earlier ones are fractal, as reported in a number of previous studies. This result may arise because the networks were effectively randomized by an increase in network density due to database updates, suggesting that the previously observed network fractality was due to a lack of available data on metabolic reactions. This finding may not entirely discount the importance of self-similarity of metabolic networks. Rather, it highlights the need for a more suitable definition of network fractality and a more careful examination of self-similarity of metabolic networks.
Affiliation(s)
- Kazuhiro Takemoto
- Department of Bioscience and Bioinformatics, Kyushu Institute of Technology, Iizuka, Fukuoka 820-8502, Japan
61
Supervised clustering based on DPClusO: prediction of plant-disease relations using Jamu formulas of KNApSAcK database. BioMed Research International 2014; 2014:831751. [PMID: 24804251 PMCID: PMC3997850 DOI: 10.1155/2014/831751] [Citation(s) in RCA: 18] [Impact Index Per Article: 1.8] [Received: 11/30/2013] [Accepted: 02/18/2014] [Indexed: 02/06/2023]
Abstract
Indonesia has the largest medicinal plant species in the world and these plants are used as Jamu medicines. Jamu medicines are popular traditional medicines from Indonesia, and we need to systemize the formulation of Jamu and develop basic scientific principles of Jamu to meet the requirements of the Indonesian healthcare system. We propose a new approach to predict the relation between plant and disease using network analysis and supervised clustering. As a preliminary step, we assigned 3138 Jamu formulas to 116 diseases of the International Classification of Diseases (ver. 10), which belong to 18 classes of disease from the National Center for Biotechnology Information. The correlation measures between Jamu pairs were determined based on their ingredient similarity. Networks were constructed and analyzed by selecting highly correlated Jamu pairs. Clusters were then generated by using the network clustering algorithm DPClusO. Using the matching score of a cluster, the dominant disease and the high-frequency plant associated with the cluster were determined. The plant-to-disease relations predicted by our method were evaluated in the context of previously published results and were found to produce around 90% successful predictions.