For: Hulsman M, Dimitrakopoulos C, de Ridder J. Scale-space measures for graph topology link protein network architecture to function. Bioinformatics 2014;30:i237-45. [PMID: 24931989 PMCID: PMC4058939 DOI: 10.1093/bioinformatics/btu283] [Citation(s) in RCA: 12] [Impact Index Per Article: 1.2] [Indexed: 01/17/2023]

Cited by Other Article(s)

Wang J, Yang B, Leier A, Marquez-Lago TT, Hayashida M, Rocker A, Zhang Y, Akutsu T, Chou KC, Strugnell RA, Song J, Lithgow T. Bastion6: a bioinformatics approach for accurate prediction of type VI secreted effectors. Bioinformatics 2019;34:2546-2555. [PMID: 29547915 DOI: 10.1093/bioinformatics/bty155] [Citation(s) in RCA: 85] [Impact Index Per Article: 17.0] [Received: 10/23/2017] [Accepted: 03/09/2018] [Indexed: 12/28/2022]

Abstract

Motivation

Many Gram-negative bacteria use type VI secretion systems (T6SS) to export effector proteins into adjacent target cells. These secreted effectors (T6SEs) play vital roles in competitive survival within bacterial populations, as well as in bacterial pathogenesis. Although various computational analyses have previously been applied to identify effectors secreted by certain bacterial species, there is no universal method available to accurately predict T6SS effector proteins from the growing tide of bacterial genome sequence data.

Results

We extracted a wide range of features from T6SE protein sequences and comprehensively analyzed the prediction performance of these features through unsupervised and supervised learning. By integrating these features, we subsequently developed a two-layer SVM-based ensemble model with fine-grain optimized parameters, to identify potential T6SEs. We further validated the predictive model using an independent dataset, which showed that the proposed model achieved an impressive performance in terms of ACC (0.943), F-value (0.946), MCC (0.892) and AUC (0.976). To demonstrate applicability, we employed this method to correctly identify two very recently validated T6SE proteins, which represent challenging prediction targets because they significantly differed from previously known T6SEs in terms of their sequence similarity and cellular function. Furthermore, a genome-wide prediction across 12 bacterial species, involving in total 54 212 protein sequences, was carried out to distinguish 94 putative T6SE candidates. We envisage both this information and our publicly accessible web server will facilitate future discoveries of novel T6SEs.
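The four scores reported above are standard binary-classification measures. As a minimal sketch (not the Bastion6 implementation), the first three can be computed from confusion-matrix counts as shown here. The function name `binary_metrics` is hypothetical, and "F-value" is assumed to mean the F1 score. AUC is left out because it needs ranked prediction scores rather than counts.

```python
import math


def binary_metrics(tp: int, fp: int, tn: int, fn: int) -> dict:
    """Compute ACC, F1 and MCC from confusion-matrix counts.

    Hypothetical helper for illustration only; "F-value" in the
    abstract is assumed to correspond to the F1 score.
    """
    total = tp + fp + tn + fn
    acc = (tp + tn) / total if total else 0.0

    # Precision and recall, guarding against empty denominators.
    precision = tp / (tp + fp) if (tp + fp) else 0.0
    recall = tp / (tp + fn) if (tp + fn) else 0.0
    f1 = (
        2 * precision * recall / (precision + recall)
        if (precision + recall)
        else 0.0
    )

    # Matthews correlation coefficient; defined as 0 when any margin is empty.
    denom = math.sqrt((tp + fp) * (tp + fn) * (tn + fp) * (tn + fn))
    mcc = (tp * tn - fp * fn) / denom if denom else 0.0

    return {"acc": acc, "f1": f1, "mcc": mcc}


# Toy counts, not data from the paper.
scores = binary_metrics(tp=45, fp=5, tn=45, fn=5)
```

With the toy counts above, accuracy and F1 are both 0.9 and MCC is 0.8, which shows how MCC can sit noticeably below accuracy on the same predictions.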

Availability and implementation

http://bastion6.erc.monash.edu/.

Supplementary information

Supplementary data are available at Bioinformatics online.

Affiliation(s)

Jiawei Wang: Biomedicine Discovery Institute and Department of Microbiology, Monash University, Clayton, VIC, Australia
Bingjiao Yang: Bioinformatics Group, School of Computer Science and Information Security, Guilin University of Electronic Technology, Guilin, China
André Leier: Department of Genetics, School of Medicine, University of Alabama at Birmingham, Birmingham, AL, USA
Tatiana T Marquez-Lago: Department of Genetics, School of Medicine, University of Alabama at Birmingham, Birmingham, AL, USA
Morihiro Hayashida: National Institute of Technology, Matsue College, Matsue, Shimane, Japan
Andrea Rocker: Biomedicine Discovery Institute and Department of Microbiology, Monash University, Clayton, VIC, Australia
Yanju Zhang: Bioinformatics Group, School of Computer Science and Information Security, Guilin University of Electronic Technology, Guilin, China
Tatsuya Akutsu: Bioinformatics Center, Institute for Chemical Research, Kyoto University, Uji, Kyoto, Japan
Kuo-Chen Chou: Gordon Life Science Institute, Boston, MA, USA; Center for Informational Biology, University of Electronic Science and Technology of China, Chengdu, China; Center of Excellence in Genomic Medicine Research (CEGMR), King Abdulaziz University, Jeddah, Saudi Arabia
Richard A Strugnell: Department of Microbiology and Immunology and Peter Doherty Institute for Infection and Immunity, The University of Melbourne, Parkville, VIC, Australia
Jiangning Song: Biomedicine Discovery Institute and Department of Biochemistry and Molecular Biology; Monash Centre for Data Science, Faculty of Information Technology, Monash University, Clayton, VIC, Australia; ARC Centre of Excellence for Advanced Molecular Imaging, Monash University, Clayton, VIC, Australia
Trevor Lithgow: Biomedicine Discovery Institute and Department of Microbiology, Monash University, Clayton, VIC, Australia

Allahyar A, Ubels J, de Ridder J. A data-driven interactome of synergistic genes improves network-based cancer outcome prediction. PLoS Comput Biol 2019;15:e1006657. [PMID: 30726216 PMCID: PMC6380593 DOI: 10.1371/journal.pcbi.1006657] [Citation(s) in RCA: 7] [Impact Index Per Article: 1.4] [Received: 05/13/2018] [Revised: 02/19/2019] [Accepted: 11/20/2018] [Indexed: 12/13/2022]

Abstract

Robustly predicting outcome for cancer patients from gene expression is an important challenge on the road to better personalized treatment. Network-based outcome predictors (NOPs), which consider the cellular wiring diagram in the classification, hold much promise to improve performance, stability and interpretability of identified marker genes. Problematically, reports on the efficacy of NOPs are conflicting and, for instance, suggest that utilizing random networks performs on par with networks that describe biologically relevant interactions. In this paper we turn the prediction problem around: instead of using a given biological network in the NOP, we aim to identify the network of genes that truly improves outcome prediction. To this end, we propose SyNet, a gene network constructed ab initio from synergistic gene pairs derived from survival-labelled gene expression data. To obtain SyNet, we evaluate synergy for all 69 million pairwise combinations of genes, resulting in a network that is specific to the dataset and phenotype under study and can be used in a NOP model. We evaluated SyNet and 11 other networks on a compendium dataset of >4000 survival-labelled breast cancer samples. For this purpose, we used cross-study validation, which more closely emulates real-world application of these outcome predictors. We find that SyNet is the only network that truly improves performance, stability and interpretability in several existing NOPs. We show that SyNet overlaps significantly with existing gene networks, and can be confidently predicted (~85% AUC) from graph-topological descriptions of these networks, in particular the breast tissue-specific network. Due to its data-driven nature, SyNet is not biased to well-studied genes and thus facilitates post-hoc interpretation. We find that SyNet is highly enriched for known breast cancer genes and genes related to e.g. histological grade and tamoxifen resistance, suggestive of a role in determining breast cancer outcome.
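The figure of 69 million pairwise combinations follows from the quadratic growth of unordered pairs, n(n-1)/2 for n genes. The sketch below illustrates that arithmetic only; the abstract does not state the gene count, so the value of roughly 11,750 genes is a back-calculated assumption, and both function names are hypothetical.

```python
import math


def pair_count(n_genes: int) -> int:
    """Number of unordered gene pairs: n * (n - 1) / 2."""
    return n_genes * (n_genes - 1) // 2


def genes_for_pairs(n_pairs: int) -> int:
    """Smallest gene count whose number of unordered pairs reaches n_pairs.

    Solves n * (n - 1) / 2 >= n_pairs using an integer square root,
    then steps up if the rounded-down estimate falls short.
    """
    n = (1 + math.isqrt(1 + 8 * n_pairs)) // 2
    while pair_count(n) < n_pairs:
        n += 1
    return n


# Assumption: "69 million" is treated as exactly 69,000,000 for illustration.
approx_genes = genes_for_pairs(69_000_000)
```

Under that assumption, about 11,750 genes are enough to produce 69 million pairs, which gives a sense of why an exhaustive pairwise synergy evaluation is computationally demanding.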

Boldi P, Frasca M, Malchiodi D. Evaluating the impact of topological protein features on the negative examples selection. BMC Bioinformatics 2018;19:417. [PMID: 30453879 PMCID: PMC6245585 DOI: 10.1186/s12859-018-2385-x] [Citation(s) in RCA: 2] [Impact Index Per Article: 0.3] [Indexed: 11/10/2022]

Abstract

BACKGROUND

Supervised machine learning methods, when applied to the problem of automated protein-function prediction (AFP), require the availability of both positive examples (i.e., proteins which are known to possess a given protein function) and negative examples (corresponding to proteins not associated with that function). Unfortunately, publicly available proteome and genome data sources such as the Gene Ontology rarely store the functions not possessed by a protein. Thus negative selection, which consists of identifying informative negative examples, is currently a central and challenging problem in AFP. Several heuristics have been proposed over the years to solve this problem; nevertheless, despite their effectiveness, to the best of our knowledge no previous work has studied which protein features are most relevant to this task, that is, which protein features best discriminate between reliable and unreliable negatives.

RESULTS

The present work analyses the impact of several features on the selection of negative proteins for the Gene Ontology (GO) terms. The analysis is network-based: it exploits the fact that proteins can be naturally structured in a network, considering the pairwise relationships coming from several sources of data, such as protein-protein and genetic interactions. Overall, the proposed protein features, including local and global graph centrality measures and protein multifunctionality, can be term-aware (i.e., depending on the GO term) and term-unaware (i.e., invariant across the GO terms). We validated the informativeness of each feature utilizing a temporal holdout in three different experiments on yeast, mouse and human proteomes: (i) feature selection to detect which protein features are more helpful for the negative selection; (ii) protein function prediction to verify whether the features considered are also useful to predict GO terms; (iii) negative selection by applying two different negative selection algorithms on proteins represented through the proposed features.
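The distinction between term-unaware and term-aware features can be made concrete with a toy network. The sketch below is an illustration under stated assumptions, not the authors' feature definitions: node degree stands in for a term-unaware centrality measure, and the count of neighbours annotated with a given GO term stands in for a term-aware "positive neighborhood" feature. The protein names, the annotation set and both function names are hypothetical.

```python
from typing import Dict, Set


def degree(adjacency: Dict[str, Set[str]]) -> Dict[str, int]:
    """Term-unaware feature: number of neighbours, the same for every GO term."""
    return {node: len(nbrs) for node, nbrs in adjacency.items()}


def positive_neighbors(
    adjacency: Dict[str, Set[str]], positives: Set[str]
) -> Dict[str, int]:
    """Term-aware feature: neighbours annotated with one specific GO term.

    Assumed definition for illustration; it changes whenever the set of
    positive proteins (i.e., the GO term) changes.
    """
    return {
        node: sum(1 for nb in nbrs if nb in positives)
        for node, nbrs in adjacency.items()
    }


# Hypothetical undirected protein network and annotations for one GO term.
network = {
    "P1": {"P2", "P3"},
    "P2": {"P1", "P3", "P4"},
    "P3": {"P1", "P2"},
    "P4": {"P2"},
}
term_positives = {"P1", "P4"}

deg = degree(network)
pos = positive_neighbors(network, term_positives)
```

Here P2 has degree 3 regardless of the term, while its positive-neighbour count of 2 depends on which proteins are annotated with the chosen term.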

CONCLUSIONS

Term-aware features (with some exceptions) proved more informative for problem (i), together with node betweenness, which is the most relevant among the term-unaware features. The node positive neighborhood is instead the most predictive feature for the AFP problem, while experiment (iii) showed that the proposed features allow negative selection algorithms to effectively select negative instances in the temporal holdout setting, with better results when nonlinear combinations of features are also exploited.

Mahfouz A, Huisman SMH, Lelieveldt BPF, Reinders MJT. Brain transcriptome atlases: a computational perspective. Brain Struct Funct 2017;222:1557-1580. [PMID: 27909802 PMCID: PMC5406417 DOI: 10.1007/s00429-016-1338-2] [Citation(s) in RCA: 15] [Impact Index Per Article: 2.1] [Received: 05/25/2016] [Accepted: 11/15/2016] [Indexed: 01/31/2023]

Li Z, Liu Z, Zhong W, Huang M, Wu N, Xie Y, Dai Z, Zou X. Large-scale identification of human protein function using topological features of interaction network. Sci Rep 2016;6:37179. [PMID: 27849060 PMCID: PMC5111120 DOI: 10.1038/srep37179] [Citation(s) in RCA: 6] [Impact Index Per Article: 0.8] [Received: 12/11/2015] [Accepted: 10/26/2016] [Indexed: 12/25/2022]

Huang CH, Chen TH, Ng KL. Graph theory and stability analysis of protein complex interaction networks. IET Syst Biol 2016;10:64-75. [PMID: 26997661 DOI: 10.1049/iet-syb.2015.0007] [Citation(s) in RCA: 13] [Impact Index Per Article: 1.6] [Indexed: 11/19/2022]

Morlot JB, Mozziconacci J, Lesne A. Network concepts for analyzing 3D genome structure from chromosomal contact maps. EPJ Nonlinear Biomedical Physics 2016. [DOI: 10.1140/epjnbp/s40366-016-0029-5] [Citation(s) in RCA: 18] [Impact Index Per Article: 2.3] [Indexed: 12/24/2022]

GoFDR: A sequence alignment based method for predicting protein functions. Methods 2016;93:3-14. [DOI: 10.1016/j.ymeth.2015.08.009] [Citation(s) in RCA: 42] [Impact Index Per Article: 5.3] [Received: 06/01/2015] [Revised: 07/27/2015] [Accepted: 08/11/2015] [Indexed: 01/01/2023]

Sahraeian SM, Luo KR, Brenner SE. SIFTER search: a web server for accurate phylogeny-based protein function prediction. Nucleic Acids Res 2015;43:W141-7. [PMID: 25979264 PMCID: PMC4489292 DOI: 10.1093/nar/gkv461] [Citation(s) in RCA: 33] [Impact Index Per Article: 3.7] [Received: 02/14/2015] [Accepted: 04/27/2015] [Indexed: 12/26/2022]

Babaei S, Mahfouz A, Hulsman M, Lelieveldt BPF, de Ridder J, Reinders M. Hi-C Chromatin Interaction Networks Predict Co-expression in the Mouse Cortex. PLoS Comput Biol 2015;11:e1004221. [PMID: 25965262 PMCID: PMC4429121 DOI: 10.1371/journal.pcbi.1004221] [Citation(s) in RCA: 34] [Impact Index Per Article: 3.8] [Received: 09/25/2014] [Accepted: 03/03/2015] [Indexed: 01/08/2023]

Abstract

The three-dimensional conformation of the genome in the cell nucleus influences important biological processes such as gene expression regulation. Recent studies have shown a strong correlation between chromatin interactions and gene co-expression. However, predicting gene co-expression from frequent long-range chromatin interactions remains challenging. We address this by characterizing the topology of the cortical chromatin interaction network using scale-aware topological measures. We demonstrate that, based on these characterizations, it is possible to accurately predict spatial co-expression between genes in the mouse cortex. Consistent with previous findings, we find that the chromatin interaction profile of a gene pair is a good predictor of its spatial co-expression. However, the accuracy of the prediction can be substantially improved when chromatin interactions are described using scale-aware topological measures of the multi-resolution chromatin interaction network. We conclude that, for co-expression prediction, it is necessary to take into account different levels of chromatin interactions ranging from direct interaction between genes (i.e. small-scale) to chromatin compartment interactions (i.e. large-scale).

Regulatory elements can target genes over large genomic distances through long-range chromatin interactions. These interactions arise as a result of the three-dimensional (3D) conformation of chromosomes in the cell nucleus. This 3D conformation can also result in the co-localization of co-regulated genes. To investigate this, we asked whether genome-wide chromatin interactions can predict co-expression patterns of genes. To address this question, we characterized 3D interactions between genes, captured by Hi-C measurements, by a network, termed chromatin interaction network (CIN). We applied scale-aware topological measures to the network to comprehensively characterize the chromatin interactions at different scales, ranging from direct interaction between gene pairs to chromatin compartment interactions. We then used multi-scale chromatin interactions to predict spatial co-expression patterns in the mouse cortex. The results show that the prediction performance improves when scale-aware topological measures of the multi-resolution chromatin interaction network are used.
