Reference Citation Analysis: Find an Article, Find a Category, Find a Journal, Find a Scholar

For: Ning K, Ng HK, Srihari S, Leong HW, Nesvizhskii AI. Examination of the relationship between essential genes in PPI network and hub proteins in reverse nearest neighbor topology. BMC Bioinformatics 2010;11:505. [PMID: 20939873 PMCID: PMC3098085 DOI: 10.1186/1471-2105-11-505] [Citation(s) in RCA: 56] [Impact Index Per Article: 4.0] [Received: 01/29/2010] [Accepted: 10/12/2010] [Indexed: 12/03/2022] Open

Cited by Other Article(s)

Avecilla G, Spealman P, Matthews J, Caudal E, Schacherer J, Gresham D. Copy number variation alters local and global mutational tolerance. Genome Res 2023;33:1340-1353. [PMID: 37652668 PMCID: PMC10547251 DOI: 10.1101/gr.277625.122] [Citation(s) in RCA: 3] [Impact Index Per Article: 3.0] [Received: 12/22/2022] [Accepted: 07/07/2023] [Indexed: 09/02/2023]

SMOTE-RkNN: A hybrid re-sampling method based on SMOTE and reverse k-nearest neighbors. Inf Sci (N Y) 2022. [DOI: 10.1016/j.ins.2022.02.038] [Citation(s) in RCA: 2] [Impact Index Per Article: 1.0] [Indexed: 11/20/2022]

Predicting Essential Proteins Based on Integration of Local Fuzzy Fractal Dimension and Subcellular Location Information. Genes (Basel) 2022;13:genes13020173. [PMID: 35205217 PMCID: PMC8872415 DOI: 10.3390/genes13020173] [Citation(s) in RCA: 2] [Impact Index Per Article: 1.0] [Received: 11/18/2021] [Revised: 01/08/2022] [Accepted: 01/12/2022] [Indexed: 11/17/2022] Open

Protein Integrated Network Analysis to Reveal Potential Drug Targets Against Extended Drug-Resistant Mycobacterium tuberculosis XDR1219. Mol Biotechnol 2021;63:1252-1267. [PMID: 34382159 DOI: 10.1007/s12033-021-00377-w] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 06/01/2021] [Accepted: 07/30/2021] [Indexed: 10/20/2022]

Wang Y, Li Z, Zhang Y, Ma Y, Huang Q, Chen X, Dai Z, Zou X. Performance improvement for a 2D convolutional neural network by using SSC encoding on protein-protein interaction tasks. BMC Bioinformatics 2021;22:184. [PMID: 33845759 PMCID: PMC8042949 DOI: 10.1186/s12859-021-04111-w] [Citation(s) in RCA: 6] [Impact Index Per Article: 2.0] [Received: 08/22/2020] [Accepted: 03/30/2021] [Indexed: 11/18/2022] Open

Abstract

BACKGROUND

The interactions of proteins are determined by their sequences and affect the regulation of the cell cycle, signal transduction and metabolism, which is of extraordinary significance to modern proteomics research. Despite advances in experimental technology, it is still expensive, laborious, and time-consuming to determine protein-protein interactions (PPIs), and there is a strong demand for effective bioinformatics approaches to identify potential PPIs. Considering the large amount of PPI data, a high-performance processor can be utilized to enhance the capability of the deep learning method and directly predict protein sequences.

RESULTS

We propose the Sequence-Statistics-Content protein sequence encoding format (SSC) based on information extraction from the original sequence for further performance improvement of the convolutional neural network. The original protein sequences are encoded in the three-channel format by introducing statistical information (the second channel) and bigram encoding information (the third channel), which can increase the unique sequence features to enhance the performance of the deep learning model. On protein-protein interaction prediction tasks, the results using the 2D convolutional neural network (2D CNN) with the SSC encoding method are better than those of the 1D CNN with one-hot encoding. The independent validation of new interactions from the HIPPIE database (version 2.1 published on July 18, 2017) and the validation of directly predicted results by applying a molecular docking tool indicate the effectiveness of the proposed protein encoding improvement in the CNN model.

CONCLUSION

The proposed protein sequence encoding method is efficient at improving the capability of the CNN model on protein sequence-related tasks and may also be effective at enhancing the capability of other machine learning or deep learning methods. Prediction accuracy and molecular docking validation showed considerable improvement compared to the existing one-hot encoding method, indicating that the SSC encoding method may be useful for analyzing protein sequence-related tasks. The source code of the proposed methods is freely available for academic research at https://github.com/wangy496/SSC-format/ .
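
To make the idea of a three-channel sequence encoding concrete, the following is a minimal sketch only. The abstract does not specify how the channels are computed, so every detail here is an assumption: channel 1 is taken to be an integer residue index, channel 2 the relative frequency of the residue within the sequence, and channel 3 an index for the bigram starting at each position. The function name `encode_three_channel` is hypothetical; the authors' actual SSC format is defined in their repository and may differ.

```python
from collections import Counter

# 20 standard amino acids; index 0 is reserved for unknown residues or "no next residue".
AMINO_ACIDS = "ACDEFGHIKLMNPQRSTVWY"
AA_INDEX = {aa: i + 1 for i, aa in enumerate(AMINO_ACIDS)}


def encode_three_channel(seq):
    """Return one (residue_id, residue_frequency, bigram_id) tuple per position.

    Illustrative assumption, not the published SSC definition.
    """
    seq = seq.upper()
    counts = Counter(seq)
    n = len(seq)
    encoded = []
    for i, aa in enumerate(seq):
        residue_id = AA_INDEX.get(aa, 0)           # channel 1: residue identity
        frequency = counts[aa] / n                 # channel 2: simple sequence statistic
        next_id = AA_INDEX.get(seq[i + 1], 0) if i + 1 < n else 0
        bigram_id = residue_id * 21 + next_id      # channel 3: index of the (aa, next aa) pair
        encoded.append((residue_id, frequency, bigram_id))
    return encoded


if __name__ == "__main__":
    for row in encode_three_channel("ACA"):
        print(row)
```

Stacking the three values per position gives an array with one row per residue and three columns, which is the kind of multi-channel input a 2D CNN can consume after padding to a fixed length.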

Li Z, Jiang H, Kong L, Chen Y, Lang K, Fan X, Zhang L, Pian C. Deep6mA: A deep learning framework for exploring similar patterns in DNA N6-methyladenine sites across different species. PLoS Comput Biol 2021;17:e1008767. [PMID: 33600435 PMCID: PMC7924747 DOI: 10.1371/journal.pcbi.1008767] [Citation(s) in RCA: 25] [Impact Index Per Article: 8.3] [Received: 03/11/2020] [Revised: 03/02/2021] [Accepted: 02/03/2021] [Indexed: 12/25/2022] Open

Abstract

N6-methyladenine (6mA) is an important DNA modification form associated with a wide range of biological processes. Accurately identifying 6mA sites on a genomic scale is crucial for understanding 6mA’s biological functions. However, the existing experimental techniques for detecting 6mA sites are cost-ineffective, which implies a great need for new computational methods for this problem. In this paper, we developed, without requiring any prior knowledge of 6mA or manually crafted sequence features, a deep learning framework named Deep6mA to identify DNA 6mA sites, and its performance is superior to other DNA 6mA prediction tools. Specifically, the 5-fold cross-validation on a benchmark dataset of rice gives the sensitivity and specificity of Deep6mA as 92.96% and 95.06%, respectively, and the overall prediction accuracy is 94%. Importantly, we find that the sequences with 6mA sites share similar patterns across different species. The model trained with rice data predicts well the 6mA sites of three other species, Arabidopsis thaliana, Fragaria vesca and Rosa chinensis, with a prediction accuracy over 90%. In addition, we find that (1) 6mA tends to occur at GAGG motifs, which means the sequence near the 6mA site may be conserved; (2) 6mA is enriched in the TATA box of the promoter, which may be the main source of its regulating downstream gene expression.

DNA N6 methyladenine (6mA) is a newly recognized methylation modification in eukaryotes. It exists widely and is conserved across organisms, and its modification level changes dynamically throughout the life cycle. This study proposes an algorithm based on a deep learning framework including LSTM and CNN to predict 6mA sites. The results showed that our method could accurately predict the 6mA sites in different species, which means DNA sub-sequences containing 6mA sites are conserved to some degree among species. Importantly, we found that 6mA methylation in most species is more likely to occur on the GAGG motif. In addition, we also found that 6mA is enriched in the promoter’s TATA box, which may be a mechanism of regulating downstream gene expression.
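
For readers unfamiliar with the metrics quoted above, the sketch below shows how sensitivity, specificity and accuracy are conventionally derived from confusion-matrix counts. The function name `confusion_metrics` and the example counts are illustrative only and are not taken from the Deep6mA paper.

```python
def confusion_metrics(tp, fn, tn, fp):
    """Return (sensitivity, specificity, accuracy) from confusion-matrix counts.

    tp/fn: true 6mA sites predicted as positive/negative.
    tn/fp: non-6mA sites predicted as negative/positive.
    """
    sensitivity = tp / (tp + fn)                 # fraction of real sites recovered
    specificity = tn / (tn + fp)                 # fraction of non-sites correctly rejected
    accuracy = (tp + tn) / (tp + fn + tn + fp)   # fraction of all calls that are correct
    return sensitivity, specificity, accuracy


if __name__ == "__main__":
    # Hypothetical counts chosen only to exercise the function.
    print(confusion_metrics(tp=90, fn=10, tn=80, fp=20))
```

If the benchmark contained equal numbers of positive and negative samples (an assumption, not stated in the abstract), accuracy would equal the mean of sensitivity and specificity: (92.96% + 95.06%) / 2 = 94.01%, which is consistent with the reported 94%.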

Identifying patient-specific flow of signal transduction perturbed by multiple single-nucleotide alterations. QUANTITATIVE BIOLOGY 2020. [DOI: 10.1007/s40484-020-0227-0] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 11/26/2022]

Lu WC, Xie H, Yuan C, Li JJ, Li ZY, Wu AH. Genomic landscape of the immune microenvironments of brain metastases in breast cancer. J Transl Med 2020;18:327. [PMID: 32867782 PMCID: PMC7461335 DOI: 10.1186/s12967-020-02503-9] [Citation(s) in RCA: 13] [Impact Index Per Article: 3.3] [Received: 04/26/2020] [Accepted: 08/26/2020] [Indexed: 01/19/2023] Open

Li M, Ni P, Chen X, Wang J, Wu FX, Pan Y. Construction of Refined Protein Interaction Network for Predicting Essential Proteins. IEEE/ACM TRANSACTIONS ON COMPUTATIONAL BIOLOGY AND BIOINFORMATICS 2019;16:1386-1397. [PMID: 28186903 DOI: 10.1109/tcbb.2017.2665482] [Citation(s) in RCA: 24] [Impact Index Per Article: 4.8] [Indexed: 06/06/2023]

Sadhukhan P, Palit S. Reverse-nearest neighborhood based oversampling for imbalanced, multi-label datasets. Pattern Recognit Lett 2019. [DOI: 10.1016/j.patrec.2019.08.009] [Citation(s) in RCA: 5] [Impact Index Per Article: 1.0] [Indexed: 10/26/2022]

Zhao B, Zhao Y, Zhang X, Zhang Z, Zhang F, Wang L. An iteration method for identifying yeast essential proteins from heterogeneous network. BMC Bioinformatics 2019;20:355. [PMID: 31234779 PMCID: PMC6591974 DOI: 10.1186/s12859-019-2930-2] [Citation(s) in RCA: 15] [Impact Index Per Article: 3.0] [Received: 02/19/2019] [Accepted: 06/04/2019] [Indexed: 02/02/2023] Open

Abstract

BACKGROUND

Essential proteins are distinctly important for an organism's survival and development and crucial to disease analysis and drug design as well. Large-scale protein-protein interaction (PPI) data sets exist in Saccharomyces cerevisiae, which provides us with a valuable opportunity to identify essential proteins from PPI networks. Many network topology-based computational methods have been designed to detect essential proteins. However, these methods are limited by the completeness of available PPI data. To break out of these restraints, some computational methods have been proposed by integrating PPI networks and multi-source biological data. Despite the progress in the research of multiple data fusion, it is still challenging to improve the prediction accuracy of the computational methods.

RESULTS

In this paper, we design a novel iterative model for essential protein prediction, named Randomly Walking in the Heterogeneous Network (RWHN). In RWHN, a weighted protein-protein interaction network and a domain-domain association network are first constructed from the original PPI network and the known protein-domain association network. We then establish a new heterogeneous matrix by combining the two constructed networks with the protein-domain association network. Based on the heterogeneous matrix, a transition probability matrix is established by normalization. Finally, an improved PageRank algorithm is applied to the heterogeneous network for essential protein prediction. In order to eliminate the influence of false negatives, information on orthologous proteins and the subcellular localization of proteins is integrated to initialize the score vector of proteins. In RWHN, the topological, conservation and functional features of essential proteins are all taken into account in the prediction process. The experimental results show that RWHN clearly outperforms ten competing methods in predicting essential proteins.

CONCLUSIONS

We demonstrated that integrating multi-source data into a heterogeneous network can preserve the complex relationship among multiple biological data and improve the prediction accuracy of essential proteins. RWHN, our proposed method, is effective for the prediction of essential proteins.
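
The abstract above describes normalizing a matrix into transition probabilities and then iterating a PageRank-style update from an informed initial score vector. The following is a generic sketch of that family of methods (a random walk with restart), not the RWHN algorithm itself: the damping value `alpha`, the function names, and the use of the prior as the restart distribution are all assumptions made for illustration.

```python
def column_normalize(matrix):
    """Scale each column to sum to 1 so entry [i][j] is P(step from j to i)."""
    n = len(matrix)
    col_sums = [sum(matrix[i][j] for i in range(n)) for j in range(n)]
    return [
        [matrix[i][j] / col_sums[j] if col_sums[j] else 0.0 for j in range(n)]
        for i in range(n)
    ]


def random_walk_scores(adjacency, prior, alpha=0.85, tol=1e-9, max_iter=1000):
    """Iterate s <- alpha * T s + (1 - alpha) * r until the L1 change is below tol.

    adjacency: square list-of-lists of non-negative edge weights.
    prior: non-negative per-node prior (for example, derived from orthology
           or localization evidence); it is normalized and used as the
           restart vector r. This choice is an assumption of the sketch.
    """
    transition = column_normalize(adjacency)
    total = sum(prior)
    restart = [p / total for p in prior]
    scores = restart[:]
    n = len(scores)
    for _ in range(max_iter):
        updated = [
            alpha * sum(transition[i][j] * scores[j] for j in range(n))
            + (1 - alpha) * restart[i]
            for i in range(n)
        ]
        change = sum(abs(a - b) for a, b in zip(updated, scores))
        scores = updated
        if change < tol:
            break
    return scores


if __name__ == "__main__":
    # Star graph: node 0 is connected to nodes 1, 2 and 3.
    star = [[0, 1, 1, 1], [1, 0, 0, 0], [1, 0, 0, 0], [1, 0, 0, 0]]
    print(random_walk_scores(star, prior=[1, 1, 1, 1]))
```

Ranking nodes by the returned scores gives a candidate ordering; in the toy star graph the central node scores highest, as expected for a hub.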

Alshabi AM, Vastrad B, Shaikh IA, Vastrad C. Identification of Crucial Candidate Genes and Pathways in Glioblastoma Multiform by Bioinformatics Analysis. Biomolecules 2019;9:biom9050201. [PMID: 31137733 PMCID: PMC6571969 DOI: 10.3390/biom9050201] [Citation(s) in RCA: 29] [Impact Index Per Article: 5.8] [Received: 04/09/2019] [Revised: 05/17/2019] [Accepted: 05/23/2019] [Indexed: 02/07/2023] Open

Abstract

The present study aimed to investigate the molecular mechanisms underlying glioblastoma multiform (GBM) and its biomarkers. The differentially expressed genes (DEGs) were identified using the limma software package. ToppGene (ToppFun) was used to perform pathway and Gene Ontology (GO) enrichment analysis of the DEGs. Protein-protein interaction (PPI) networks, extracted modules, a miRNA-target gene regulatory network and a TF-target gene regulatory network were used to obtain insight into the actions of DEGs. Survival analysis for DEGs was carried out. A total of 590 DEGs, including 243 up-regulated and 347 down-regulated genes, were identified between scrambled shRNA expression and Lin7A knockdown samples. The up-regulated genes were enriched in ribosome, mitochondrial translation termination, translation, and peptide biosynthetic process. The down-regulated genes were enriched in focal adhesion, VEGFR3 signaling in lymphatic endothelium, extracellular matrix organization, and extracellular matrix. The current study screened the genes with higher degrees in the PPI network, extracted modules, miRNA-target gene regulatory network, and TF-target gene regulatory network as hub genes, which included NPM1, CUL4A, YIPF1, SHC1, AKT1, VLDLR, RPL14, P3H2, DTNA, FAM126B, RPL34, and MYL5. Survival analysis indicated that high expression of RPL36A and MRPL35 predicted longer survival in GBM, while high expression of AP1S1 and AKAP12 predicted shorter survival in GBM. High expression of RPL36A and AP1S1 was associated with pathogenesis of GBM, while low expression of ALPL was associated with pathogenesis of GBM. In conclusion, the current study identified DEGs between scrambled shRNA expression and Lin7A knockdown samples, which could improve our understanding of the molecular mechanisms in the progression of GBM, and these crucial and novel diagnostic markers might be used as therapeutic targets for GBM.

Joshi H, Vastrad B, Vastrad C. Identification of Important Invasion-Related Genes in Non-functional Pituitary Adenomas. J Mol Neurosci 2019;68:565-589. [PMID: 30982163 DOI: 10.1007/s12031-019-01318-8] [Citation(s) in RCA: 5] [Impact Index Per Article: 1.0] [Received: 11/15/2018] [Accepted: 03/29/2019] [Indexed: 12/18/2022]

Abstract

Non-functioning pituitary adenomas (NFPAs) are locally invasive with high morbidity. The objective of this study was to identify important genes and pathways related to the invasiveness of NFPAs and gain more insights into the underlying molecular mechanisms of NFPAs. The gene expression profiles of GSE51618 were downloaded from the Gene Expression Omnibus database with 4 non-invasive NFPA samples, 3 invasive NFPA samples, and 3 normal pituitary gland samples. Differentially expressed genes (DEGs) were screened between invasive NFPA samples and normal pituitary gland samples, followed by pathway and Gene Ontology (GO) enrichment analyses. Subsequently, a protein-protein interaction (PPI) network was constructed and analyzed for these DEGs, and module analysis was performed. In addition, a target gene-miRNA network and target gene-TF (transcription factor) network were analyzed for these DEGs. A total of 879 DEGs were obtained. Among them, 439 genes were upregulated and 440 genes were downregulated. Pathway enrichment analysis indicated that the upregulated genes were significantly enriched in cysteine biosynthesis/homocysteine degradation (trans-sulfuration) and PI3K-Akt signaling pathway, while the downregulated genes were mainly associated with docosahexaenoate biosynthesis III (mammals) and chemokine signaling pathway. GO enrichment analysis indicated that the upregulated genes were significantly enriched in animal organ morphogenesis, extracellular matrix, and hormone activity, while the downregulated genes were mainly associated with leukocyte chemotaxis, dendrites, and RAGE receptor binding. Subsequently, ESR1, SOX2, TTN, GFAP, WIF1, TTR, XIST, SPAG5, PPBP, AR, IL1R2, and HIST1H1C were identified as the top hub genes in the upregulated and downregulated PPI networks and modules. In addition, HS3ST1, GPC4, CCND2, and SCD were identified as the top hub genes in the upregulated and downregulated target gene-miRNA networks, while CISH, ISLR, UBE2E3, and CCNG2 were identified as the top hub genes in the upregulated and downregulated target gene-TF networks. The new important DEGs and pathways identified in this study may serve key roles in the invasiveness of NFPAs and indicate more molecular targets for the treatment of NFPAs.

Fang M, Lei X, Guo L. A Survey on Computational Methods for Essential Proteins and Genes Prediction. Curr Bioinform 2019. [DOI: 10.2174/1574893613666181112150422] [Citation(s) in RCA: 6] [Impact Index Per Article: 1.2] [Indexed: 11/22/2022]

Azhagesan K, Ravindran B, Raman K. Network-based features enable prediction of essential genes across diverse organisms. PLoS One 2018;13:e0208722. [PMID: 30543651 PMCID: PMC6292609 DOI: 10.1371/journal.pone.0208722] [Citation(s) in RCA: 23] [Impact Index Per Article: 3.8] [Received: 04/18/2018] [Accepted: 11/21/2018] [Indexed: 12/19/2022] Open

Abstract

Machine learning approaches to predict essential genes have gained a lot of traction in recent years. These approaches predominantly make use of sequence and network-based features to predict essential genes. However, the scope of network-based features used by the existing approaches is very narrow. Further, many of these studies focus on predicting essential genes within the same organism, which cannot be readily used to predict essential genes across organisms. Therefore, there is clearly a need for a method that is able to predict essential genes across organisms, by leveraging network-based features. In this study, we extract several sets of network-based features from protein-protein association networks available from the STRING database. Our network features include some common measures of centrality, and also some novel recursive measures recently proposed in social network literature. We extract hundreds of network-based features from networks of 27 diverse organisms to predict the essentiality of 87000+ genes. Our results show that network-based features are statistically significantly better at classifying essential genes across diverse bacterial species, compared to the current state-of-the-art methods, which use mostly sequence and a few 'conventional' network-based features. Our diverse set of network properties gave an AUROC of 0.847 and a precision of 0.320 across 27 organisms. When we augmented the complete set of network features with sequence-derived features, we achieved an improved AUROC of 0.857 and a precision of 0.335. We also constructed a reduced set of 100 sequence and network features, which gave a comparable performance. Further, we show that our features are useful for predicting essential genes in new organisms by using leave-one-species-out validation. Our network features capture the local, global and neighbourhood properties of the network and are hence effective for prediction of essential genes across diverse organisms, even in the absence of other complex biological knowledge. Our approach can be readily exploited to predict essentiality for organisms in interactome databases such as STRING, where both network and sequence are readily available. All code is available at https://github.com/RamanLab/nbfpeg.
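
As an illustration of what "common measures of centrality" can look like as per-gene features, the sketch below computes degree and closeness centrality for an unweighted, undirected edge list using breadth-first search. It is a minimal example under stated assumptions: the function name `degree_and_closeness` is hypothetical, closeness is computed within each node's connected component, and the study's actual feature set (including its recursive measures) is not reproduced here.

```python
from collections import deque


def degree_and_closeness(edges):
    """Return {node: (degree, closeness)} for an undirected, unweighted graph.

    Closeness = (reachable nodes - 1) / (sum of shortest-path lengths),
    evaluated within the node's own connected component.
    """
    graph = {}
    for u, v in edges:
        graph.setdefault(u, set()).add(v)
        graph.setdefault(v, set()).add(u)

    features = {}
    for node in graph:
        # Breadth-first search gives shortest-path lengths from `node`.
        dist = {node: 0}
        queue = deque([node])
        while queue:
            current = queue.popleft()
            for neighbour in graph[current]:
                if neighbour not in dist:
                    dist[neighbour] = dist[current] + 1
                    queue.append(neighbour)
        total = sum(dist.values())
        closeness = (len(dist) - 1) / total if total else 0.0
        features[node] = (len(graph[node]), closeness)
    return features


if __name__ == "__main__":
    # Toy path graph A - B - C; B is the most central node.
    for name, feats in sorted(degree_and_closeness([("A", "B"), ("B", "C")]).items()):
        print(name, feats)
```

Each node's tuple can be used as a row of a feature matrix for a downstream classifier; real interactomes would typically also carry edge confidence weights, which this sketch ignores.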

Shokri-Gharelo R, Noparvar PM. Molecular response of canola to salt stress: insights on tolerance mechanisms. PeerJ 2018;6:e4822. [PMID: 29844974 PMCID: PMC5969047 DOI: 10.7717/peerj.4822] [Citation(s) in RCA: 25] [Impact Index Per Article: 4.2] [Received: 03/09/2018] [Accepted: 05/02/2018] [Indexed: 01/16/2023] Open

Chen L, Zhang YH, Wang S, Zhang Y, Huang T, Cai YD. Prediction and analysis of essential genes using the enrichments of gene ontology and KEGG pathways. PLoS One 2017;12:e0184129. [PMID: 28873455 PMCID: PMC5584762 DOI: 10.1371/journal.pone.0184129] [Citation(s) in RCA: 191] [Impact Index Per Article: 27.3] [Received: 06/16/2017] [Accepted: 08/18/2017] [Indexed: 12/20/2022] Open

Suratanee A, Plaimas K. Reverse Nearest Neighbor Search on a Protein-Protein Interaction Network to Infer Protein-Disease Associations. Bioinform Biol Insights 2017;11:1177932217720405. [PMID: 28757797 PMCID: PMC5513527 DOI: 10.1177/1177932217720405] [Citation(s) in RCA: 9] [Impact Index Per Article: 1.3] [Received: 03/30/2017] [Accepted: 06/18/2017] [Indexed: 12/17/2022] Open

A New Method for Human Mental Fatigue Detection with Several EEG Channels. J Med Biol Eng 2017. [DOI: 10.1007/s40846-017-0224-6] [Citation(s) in RCA: 8] [Impact Index Per Article: 1.1] [Indexed: 10/19/2022]

Zhang X, Xiao W, Acencio ML, Lemke N, Wang X. An ensemble framework for identifying essential proteins. BMC Bioinformatics 2016;17:322. [PMID: 27557880 PMCID: PMC4997703 DOI: 10.1186/s12859-016-1166-7] [Citation(s) in RCA: 10] [Impact Index Per Article: 1.3] [Received: 03/03/2016] [Accepted: 08/09/2016] [Indexed: 11/10/2022] Open

Grazziotin AL, Vidal NM, Venancio TM. Uncovering major genomic features of essential genes in Bacteria and a methanogenic Archaea. FEBS J 2015;282:3395-3411. [PMID: 26084810 DOI: 10.1111/febs.13350] [Citation(s) in RCA: 28] [Impact Index Per Article: 3.1] [Received: 02/02/2015] [Revised: 06/02/2015] [Accepted: 06/15/2015] [Indexed: 12/19/2022]

Abstract

Identification of essential genes is critical to understanding the physiology of a species, proposing novel drug targets and uncovering minimal gene sets required for life. Although essential gene sets of several organisms have been determined using large-scale mutagenesis techniques, systematic studies addressing their conservation, genomic context and functions remain scant. Here we integrate 17 essential gene sets from genome-wide in vitro screenings and three gene collections required for growth in vivo, encompassing 15 Bacteria and one Archaea. We refine and generalize important theories proposed using Escherichia coli. Essential genes are typically monogenic and more conserved than nonessential genes. Genes required in vivo are less conserved than those essential in vitro, suggesting that more divergent strategies are deployed when the organism is stressed by the host immune system and unstable nutrient availability. We identified essential analogous pathways that would probably be missed by orthology-based essentiality prediction strategies. For example, Streptococcus sanguinis carries horizontally transferred isoprenoid biosynthesis genes that are widespread in Archaea. Genes specifically essential in Mycobacterium tuberculosis and Burkholderia pseudomallei are reported as potential drug targets. Moreover, essential genes are not only preferentially located in operons, but also occupy the first position therein, supporting the influence of their regulatory regions in driving transcription of whole operons. Finally, these important genomic features are shared between Bacteria and at least one Archaea, suggesting that high order properties of gene essentiality and genome architecture were probably present in the last universal common ancestor or evolved independently in the prokaryotic domains.

Musungu B, Bhatnagar D, Brown RL, Fakhoury AM, Geisler M. A predicted protein interactome identifies conserved global networks and disease resistance subnetworks in maize. Front Genet 2015;6:201. [PMID: 26089837 PMCID: PMC4454876 DOI: 10.3389/fgene.2015.00201] [Citation(s) in RCA: 25] [Impact Index Per Article: 2.8] [Received: 02/23/2015] [Accepted: 05/21/2015] [Indexed: 12/30/2022] Open

Srihari S, Yong CH, Patil A, Wong L. Methods for protein complex prediction and their contributions towards understanding the organisation, function and dynamics of complexes. FEBS Lett 2015;589:2590-602. [PMID: 25913176 DOI: 10.1016/j.febslet.2015.04.026] [Citation(s) in RCA: 53] [Impact Index Per Article: 5.9] [Received: 03/19/2015] [Revised: 04/14/2015] [Accepted: 04/14/2015] [Indexed: 12/30/2022]

Srihari S, Madhamshettiwar PB, Song S, Liu C, Simpson PT, Khanna KK, Ragan MA. Complex-based analysis of dysregulated cellular processes in cancer. BMC SYSTEMS BIOLOGY 2014;8 Suppl 4:S1. [PMID: 25521701 PMCID: PMC4290683 DOI: 10.1186/1752-0509-8-s4-s1] [Citation(s) in RCA: 7] [Impact Index Per Article: 0.7] [Indexed: 01/17/2023]

Abstract

Background

Differential expression analysis of (individual) genes is often used to study their roles in diseases. However, diseases such as cancer are a result of the combined effect of multiple genes. Gene products such as proteins seldom act in isolation, but instead constitute stable multi-protein complexes performing dedicated functions. Therefore, complexes aggregate the effect of individual genes (proteins) and can be used to gain a better understanding of cancer mechanisms. Here, we observe that complexes show considerable changes in their expression, in turn directed by the concerted action of transcription factors (TFs), across cancer conditions. We seek to gain novel insights into cancer mechanisms through a systematic analysis of complexes and their transcriptional regulation.

Results

We integrated large-scale protein-interaction (PPI) and gene-expression datasets to identify complexes that exhibit significant changes in their expression across different conditions in cancer. We devised a log-linear model to relate these changes to the differential regulation of complexes by TFs. The application of our model on two case studies involving pancreatic and familial breast tumour conditions revealed: (i) complexes in core cellular processes, especially those responsible for maintaining genome stability and cell proliferation (e.g. DNA damage repair and cell cycle) show considerable changes in expression; (ii) these changes include decrease and countering increase for different sets of complexes indicative of compensatory mechanisms coming into play in tumours; and (iii) TFs work in cooperative and counteractive ways to regulate these mechanisms. Such aberrant complexes and their regulating TFs play vital roles in the initiation and progression of cancer.

Conclusions

Complexes in core cellular processes display considerable decreases and countering increases in expression, strongly reflective of compensatory mechanisms in cancer. These changes are directed by the concerted action of cooperative and counteractive TFs. Our study highlights the roles of these complexes and TFs and presents several case studies of compensatory processes, thus providing novel insights into cancer mechanisms.

Suratanee A, Plaimas K. Identification of inflammatory bowel disease-related proteins using a reverse k-nearest neighbor search. J Bioinform Comput Biol 2014;12:1450017. [DOI: 10.1142/s0219720014500176] [Citation(s) in RCA: 13] [Impact Index Per Article: 1.3] [Indexed: 12/21/2022]

Effective identification of essential proteins based on priori knowledge, network topology and gene expressions. Methods 2014;67:325-33. [DOI: 10.1016/j.ymeth.2014.02.016] [Citation(s) in RCA: 80] [Impact Index Per Article: 8.0] [Received: 10/31/2013] [Revised: 01/16/2014] [Accepted: 02/11/2014] [Indexed: 11/23/2022] Open

Rhee SY, Mutwil M. Towards revealing the functions of all genes in plants. TRENDS IN PLANT SCIENCE 2014;19:212-21. [PMID: 24231067 DOI: 10.1016/j.tplants.2013.10.006] [Citation(s) in RCA: 146] [Impact Index Per Article: 14.6] [Received: 08/06/2013] [Revised: 10/10/2013] [Accepted: 10/16/2013] [Indexed: 05/19/2023]

Tang X, Wang J, Zhong J, Pan Y. Predicting Essential Proteins Based on Weighted Degree Centrality. IEEE/ACM TRANSACTIONS ON COMPUTATIONAL BIOLOGY AND BIOINFORMATICS 2014;11:407-18. [PMID: 26355787 DOI: 10.1109/tcbb.2013.2295318] [Citation(s) in RCA: 93] [Impact Index Per Article: 9.3] [Indexed: 05/06/2023]

Srihari S, Raman V, Leong HW, Ragan MA. Evolution and Controllability of Cancer Networks: A Boolean Perspective. IEEE/ACM TRANSACTIONS ON COMPUTATIONAL BIOLOGY AND BIOINFORMATICS 2014;11:83-94. [PMID: 26355510 DOI: 10.1109/tcbb.2013.128] [Citation(s) in RCA: 10] [Impact Index Per Article: 1.0] [Indexed: 06/05/2023]

Abstract

Cancer forms a robust system capable of maintaining stable functioning (cell sustenance and proliferation) despite perturbations. Cancer progresses in stages over time, typically with increasing aggressiveness and worsening prognosis. Characterizing these stages and identifying the genes driving transitions between them is critical to understanding cancer progression and to developing effective anti-cancer therapies. In this work, we propose a novel model for the "cancer system" as a Boolean state space in which a Boolean network, built from protein-interaction and gene-expression data from different stages of cancer, transits between Boolean satisfiability states by "editing" interactions and "flipping" genes. Edits reflect rewiring of the PPI network, while flipping of genes reflects activation or silencing of genes between stages. We formulate a minimization problem, min flip, to identify the genes driving these transitions. The application of our model (called BoolSpace) to three case studies, namely pancreatic and breast tumours in human and post spinal-cord injury (SCI) in rats, reveals valuable insights into the phenomenon of cancer progression: (i) interactions involved in core cell-cycle and DNA-damage repair pathways are significantly rewired in tumours, indicating significant impact on key genome-stabilizing mechanisms; (ii) several of the genes flipped are serine/threonine kinases which act as biological switches, reflecting cellular switching mechanisms between stages; and (iii) different sets of genes are flipped during the initial and final stages, indicating a pattern to tumour progression. Based on these results, we hypothesize that the robustness of cancer partly stems from "passing of the baton" between genes at different stages: genes from different biological processes and/or cellular components are involved in different stages of tumour progression, thereby allowing tumour cells to evade targeted therapy, and therefore an effective therapy should target a "cover set" of these genes. A C/C++ implementation of BoolSpace is freely available at: http://www.bioinformatics.org.au/tools-data.

Raman K, Damaraju N, Joshi GK. The organisational structure of protein networks: revisiting the centrality-lethality hypothesis. Systems and Synthetic Biology 2013;8:73-81. [PMID: 24592293 DOI: 10.1007/s11693-013-9123-5] [Citation(s) in RCA: 71] [Impact Index Per Article: 6.5] [Received: 06/03/2013] [Revised: 08/05/2013] [Accepted: 08/12/2013] [Indexed: 01/09/2023]

Zhang X, Xu J, Xiao WX. A new method for the discovery of essential proteins. PLoS One 2013;8:e58763. [PMID: 23555595 PMCID: PMC3605424 DOI: 10.1371/journal.pone.0058763] [Citation(s) in RCA: 47] [Impact Index Per Article: 4.3] [Received: 12/25/2012] [Accepted: 02/06/2013] [Indexed: 11/18/2022] Open

Abstract

BACKGROUND

Experimental methods for the identification of essential proteins are always costly, time-consuming, and laborious. It is a challenging task to determine protein essentiality through experiments alone. With the development of high-throughput technologies, a vast number of protein-protein interactions are available, which enables the identification of essential proteins at the network level. Many computational methods for this task have been proposed based on the topological properties of protein-protein interaction (PPI) networks. However, the currently available PPI networks for each species are incomplete (false negatives) and very noisy (many false positives), and network topology-based centrality measures are often very sensitive to such noise. Therefore, exploring robust methods for identifying essential proteins would be of great value.

METHOD

In this paper, a new essential protein discovery method, named CoEWC (Co-Expression Weighted by Clustering coefficient), is proposed. CoEWC is based on the integration of the topological properties of the PPI network and the co-expression of interacting proteins. The aim of CoEWC is to capture the common features of essential proteins in both date hubs and party hubs. The performance of CoEWC is validated on the PPI network of Saccharomyces cerevisiae. Experimental results show that CoEWC significantly outperforms the classical centrality measures, and that it also outperforms PeC, a recently proposed essential protein discovery method which itself outperforms 15 other centrality measures on the PPI network of Saccharomyces cerevisiae. In particular, when predicting no more than 500 proteins, CoEWC improves by more than 50% on degree centrality (DC), one of the better centrality measures for identifying protein essentiality.

CONCLUSIONS

We demonstrate that a more robust essential protein discovery method can be developed by integrating the topological properties of the PPI network and the co-expression of interacting proteins. The proposed centrality measure, CoEWC, is effective for the discovery of essential proteins.

Wang J, Peng W, Wu FX. Computational approaches to predicting essential proteins: A survey. Proteomics Clin Appl 2013;7:181-92. [DOI: 10.1002/prca.201200068] [Citation(s) in RCA: 52] [Impact Index Per Article: 4.7] [Received: 07/27/2012] [Revised: 09/12/2012] [Accepted: 11/06/2012] [Indexed: 12/13/2022]

Schrum AG, Gil D. Robustness and Specificity in Signal Transduction via Physiologic Protein Interaction Networks. Clinical & Experimental Pharmacology 2012;2:S3.001. [PMID: 24535485 PMCID: PMC3923534 DOI: 10.4172/2161-1459.s3-001] [Citation(s) in RCA: 4] [Impact Index Per Article: 0.3] [Indexed: 12/05/2022]

Wang J, Li M, Wang H, Pan Y. Identification of essential proteins based on edge clustering coefficient. IEEE/ACM Transactions on Computational Biology and Bioinformatics 2012;9:1070-1080. [PMID: 22084147 DOI: 10.1109/tcbb.2011.147] [Citation(s) in RCA: 149] [Impact Index Per Article: 12.4] [Indexed: 05/31/2023]

Li M, Zhang H, Wang JX, Pan Y. A new essential protein discovery method based on the integration of protein-protein interaction and gene expression data. BMC Systems Biology 2012;6:15. [PMID: 22405054 PMCID: PMC3325894 DOI: 10.1186/1752-0509-6-15] [Citation(s) in RCA: 133] [Impact Index Per Article: 11.1] [Received: 10/24/2011] [Accepted: 03/10/2012] [Indexed: 01/09/2023]

Abstract

BACKGROUND

Identification of essential proteins is always a challenging task since it requires experimental approaches that are time-consuming and laborious. With the advances in high-throughput technologies, a large number of protein-protein interactions are available, which has produced unprecedented opportunities for detecting protein essentiality at the network level. A series of computational approaches have been proposed for predicting essential proteins based on network topologies. However, the network topology-based centrality measures are very sensitive to the robustness of the network. Therefore, a new robust essential protein discovery method would be of great value.

RESULTS

In this paper, we propose a new centrality measure, named PeC, based on the integration of protein-protein interaction and gene expression data. The performance of PeC is validated on the protein-protein interaction network of Saccharomyces cerevisiae. The experimental results show that the prediction precision of PeC clearly exceeds that of fifteen previously proposed centrality measures: Degree Centrality (DC), Betweenness Centrality (BC), Closeness Centrality (CC), Subgraph Centrality (SC), Eigenvector Centrality (EC), Information Centrality (IC), Bottle Neck (BN), Density of Maximum Neighborhood Component (DMNC), Local Average Connectivity-based method (LAC), Sum of ECC (SoECC), Range-Limited Centrality (RL), L-index (LI), Leader Rank (LR), Normalized α-Centrality (NC), and Moduland-Centrality (MC). In particular, the improvement of PeC over the classic centrality measures (BC, CC, SC, EC, and BN) is more than 50% when predicting no more than 500 proteins.

CONCLUSIONS

We demonstrate that the integration of protein-protein interaction networks and gene expression data can help improve the precision of predicting essential proteins. The new centrality measure, PeC, is an effective essential protein discovery method.
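
The kind of evaluation this abstract refers to (precision when predicting no more than a fixed number of proteins) can be illustrated with the simplest baseline it names, Degree Centrality. The following is only a minimal sketch under stated assumptions: the function names `degree_centrality` and `precision_at_k`, the toy network and the toy essential set are all hypothetical and are not taken from the paper, and PeC itself, which additionally uses gene expression data, is not reproduced here.

```python
from collections import defaultdict


def degree_centrality(edges):
    """Return {protein: number of distinct interaction partners}."""
    neighbours = defaultdict(set)
    for a, b in edges:
        if a != b:  # ignore self-interactions
            neighbours[a].add(b)
            neighbours[b].add(a)
    return {protein: len(partners) for protein, partners in neighbours.items()}


def precision_at_k(scores, essential, k):
    """Fraction of the k top-scoring proteins that are known to be essential."""
    if k <= 0:
        raise ValueError("k must be positive")
    # Sort by descending score; ties are broken by protein name for determinism.
    ranked = sorted(scores, key=lambda p: (-scores[p], p))
    top = ranked[:k]
    return sum(p in essential for p in top) / len(top)


# Hypothetical toy PPI network and essential-protein set (illustration only).
toy_edges = [("A", "B"), ("A", "C"), ("A", "D"), ("B", "C"), ("D", "E")]
toy_essential = {"A", "C"}

scores = degree_centrality(toy_edges)
print(precision_at_k(scores, toy_essential, 2))
```

Any other centrality measure could be substituted for `degree_centrality` as long as it returns one score per protein; the ranking and precision step would stay the same.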

Srihari S, Ning K, Leong HW. MCL-CAw: a refinement of MCL for detecting yeast complexes from weighted PPI networks by incorporating core-attachment structure. BMC Bioinformatics 2010;11:504. [PMID: 20939868 PMCID: PMC2965181 DOI: 10.1186/1471-2105-11-504] [Citation(s) in RCA: 40] [Impact Index Per Article: 2.9] [Received: 05/03/2010] [Accepted: 10/12/2010] [Indexed: 01/23/2023] Open

Abstract

BACKGROUND

The reconstruction of protein complexes from the physical interactome of organisms serves as a building block towards understanding the higher-level organization of the cell. Over the past few years, several independent high-throughput experiments have helped to catalogue an enormous amount of physical protein interaction data from organisms such as yeast. However, these individual datasets show a lack of correlation with each other and also contain a substantial number of false positives (noise). Over the same period, several affinity scoring schemes have also been devised to improve the quality of these datasets. Therefore, the challenge now is to detect meaningful as well as novel complexes from protein interaction (PPI) networks derived by combining datasets from multiple sources and by making use of these affinity scoring schemes. In attempts to tackle this challenge, the Markov Clustering algorithm (MCL) has proved to be a popular and reasonably successful method, mainly due to its scalability, robustness, and ability to work on scored (weighted) networks. However, MCL produces many noisy clusters, which either do not match known complexes or have additional proteins that reduce the accuracy of correctly predicted complexes.

RESULTS

Inspired by recent experimental observations by Gavin and colleagues on the modularity structure in yeast complexes and the distinctive properties of "core" and "attachment" proteins, we develop a core-attachment based refinement method coupled to MCL for reconstruction of yeast complexes from scored (weighted) PPI networks. We combine physical interactions from two recent "pull-down" experiments to generate an unscored PPI network. We then score this network using available affinity scoring schemes to generate multiple scored PPI networks. The evaluation of our method (called MCL-CAw) on these networks shows that: (i) MCL-CAw derives a larger number of yeast complexes, and with better accuracy, than MCL, particularly in the presence of natural noise; (ii) affinity scoring can effectively reduce the impact of noise on MCL-CAw and thereby improve the quality (precision and recall) of its predicted complexes; (iii) MCL-CAw responds well to most available scoring schemes. We discuss several instances where MCL-CAw was successful in deriving meaningful complexes, and where it missed a few proteins or whole complexes due to affinity scoring of the networks. We compare MCL-CAw with several recent complex detection algorithms on unscored and scored networks, and assess the relative performance of the algorithms on these networks. Further, we study the impact of augmenting physical datasets with computationally inferred interactions for complex detection. Finally, we analyse the essentiality of proteins within predicted complexes to understand a possible correlation between protein essentiality and their ability to form complexes.

CONCLUSIONS

We demonstrate that core-attachment based refinement in MCL-CAw improves the predictions of MCL on yeast PPI networks. We show that affinity scoring improves the performance of MCL-CAw.
