Reference Citation Analysis: Find an Article, Find a Category, Find a Journal, Find a Scholar

For: Song K, Tong T, Wu F. Predicting essential genes in prokaryotic genomes using a linear method: ZUPLS. Integr Biol (Camb) 2014;6:460-9. [PMID: 24603751 DOI: 10.1039/c3ib40241j] [Citation(s) in RCA: 16] [Impact Index Per Article: 1.6] [Reference Citation Analysis] [What about the content of this article? (0)] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Indexed: 11/21/2022]

For:	Song K, Tong T, Wu F. Predicting essential genes in prokaryotic genomes using a linear method: ZUPLS. Integr Biol (Camb) 2014;6:460-9. [PMID: 24603751 DOI: 10.1039/c3ib40241j] [Citation(s) in RCA: 16] [Impact Index Per Article: 1.6] [Reference Citation Analysis] [What about the content of this article? (0)] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Indexed: 11/21/2022]

Number

Cited by Other Article(s)

Hu W, Li M, Xiao H, Guan L. Essential genes identification model based on sequence feature map and graph convolutional neural network. BMC Genomics 2024;25:47. [PMID: 38200437 PMCID: PMC10777564 DOI: 10.1186/s12864-024-09958-w] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Reference Citation Analysis] [Abstract] [Key Words] [MESH Headings] [Grants] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Received: 06/18/2023] [Accepted: 01/01/2024] [Indexed: 01/12/2024] Open

Abstract

BACKGROUND

Essential genes encode functions that play a vital role in the life activities of organisms, encompassing growth, development, immune system functioning, and cell structure maintenance. Conventional experimental techniques for identifying essential genes are resource-intensive and time-consuming, and the accuracy of current machine learning models needs further enhancement. Therefore, it is crucial to develop a robust computational model to accurately predict essential genes.

RESULTS

In this study, we introduce GCNN-SFM, a computational model for identifying essential genes in organisms, based on graph convolutional neural networks (GCNN). GCNN-SFM integrates a graph convolutional layer, a convolutional layer, and a fully connected layer to model and extract features from gene sequences of essential genes. Initially, the gene sequence is transformed into a feature map using coding techniques. Subsequently, a multi-layer GCN is employed to perform graph convolution operations, effectively capturing both local and global features of the gene sequence. Further feature extraction is performed, followed by integrating convolution and fully-connected layers to generate prediction results for essential genes. The gradient descent algorithm is utilized to iteratively update the cross-entropy loss function, thereby enhancing the accuracy of the prediction results. Meanwhile, model parameters are tuned to determine the optimal parameter combination that yields the best prediction performance during training.

CONCLUSIONS

Experimental evaluation demonstrates that GCNN-SFM surpasses various advanced essential gene prediction models and achieves an average accuracy of 94.53%. This study presents a novel and effective approach for identifying essential genes, which has significant implications for biology and genomics research.

Collapse

Beder T, Aromolaran O, Dönitz J, Tapanelli S, Adedeji E, Adebiyi E, Bucher G, Koenig R. Identifying essential genes across eukaryotes by machine learning. NAR Genom Bioinform 2021;3:lqab110. [PMID: 34859210 PMCID: PMC8634067 DOI: 10.1093/nargab/lqab110] [Citation(s) in RCA: 8] [Impact Index Per Article: 2.7] [Reference Citation Analysis] [Abstract] [Grants] [Track Full Text] [Download PDF] [Figures] [Journal Information] [Subscribe] [Scholar Register] [Received: 04/21/2021] [Revised: 10/09/2021] [Accepted: 11/29/2021] [Indexed: 02/07/2023] Open

Senthamizhan V, Ravindran B, Raman K. NetGenes: A Database of Essential Genes Predicted Using Features From Interaction Networks. Front Genet 2021;12:722198. [PMID: 34630517 PMCID: PMC8495214 DOI: 10.3389/fgene.2021.722198] [Citation(s) in RCA: 1] [Impact Index Per Article: 0.3] [Reference Citation Analysis] [Abstract] [Key Words] [Track Full Text] [Download PDF] [Figures] [Journal Information] [Subscribe] [Scholar Register] [Received: 07/08/2021] [Accepted: 08/16/2021] [Indexed: 11/26/2022] Open

DELEAT: gene essentiality prediction and deletion design for bacterial genome reduction. BMC Bioinformatics 2021;22:444. [PMID: 34537011 PMCID: PMC8449488 DOI: 10.1186/s12859-021-04348-5] [Citation(s) in RCA: 1] [Impact Index Per Article: 0.3] [Reference Citation Analysis] [Abstract] [Key Words] [Track Full Text] [Download PDF] [Figures] [Journal Information] [Subscribe] [Scholar Register] [Received: 05/12/2021] [Accepted: 08/26/2021] [Indexed: 11/10/2022] Open

Campos TL, Korhonen PK, Hofmann A, Gasser RB, Young ND. Harnessing model organism genomics to underpin the machine learning-based prediction of essential genes in eukaryotes - Biotechnological implications. Biotechnol Adv 2021;54:107822. [PMID: 34461202 DOI: 10.1016/j.biotechadv.2021.107822] [Citation(s) in RCA: 8] [Impact Index Per Article: 2.7] [Reference Citation Analysis] [Abstract] [Key Words] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Received: 04/09/2021] [Revised: 08/17/2021] [Accepted: 08/24/2021] [Indexed: 12/17/2022]

Abstract

The availability of high-quality genomes and advances in functional genomics have enabled large-scale studies of essential genes in model eukaryotes, including the 'elegant worm' (Caenorhabditis elegans; Nematoda) and the 'vinegar fly' (Drosophila melanogaster; Arthropoda). However, this is not the case for other, much less-studied organisms, such as socioeconomically important parasites, for which functional genomic platforms usually do not exist. Thus, there is a need to develop innovative techniques or approaches for the prediction, identification and investigation of essential genes. A key approach that could enable the prediction of such genes is machine learning (ML). Here, we undertake an historical review of experimental and computational approaches employed for the characterisation of essential genes in eukaryotes, with a particular focus on model ecdysozoans (C. elegans and D. melanogaster), and discuss the possible applicability of ML-approaches to organisms such as socioeconomically important parasites. We highlight some recent results showing that high-performance ML, combined with feature engineering, allows a reliable prediction of essential genes from extensive, publicly available 'omic data sets, with major potential to prioritise such genes (with statistical confidence) for subsequent functional genomic validation. These findings could 'open the door' to fundamental and applied research areas. Evidence of some commonality in the essential gene-complement between these two organisms indicates that an ML-engineering approach could find broader applicability to ecdysozoans such as parasitic nematodes or arthropods, provided that suitably large and informative data sets become/are available for proper feature engineering, and for the robust training and validation of algorithms. This area warrants detailed exploration to, for example, facilitate the identification and characterisation of essential molecules as novel targets for drugs and vaccines against parasitic diseases. This focus is particularly important, given the substantial impact that such diseases have worldwide, and the current challenges associated with their prevention and control and with drug resistance in parasite populations.

Collapse

Le NQK, Do DT, Hung TNK, Lam LHT, Huynh TT, Nguyen NTK. A Computational Framework Based on Ensemble Deep Neural Networks for Essential Genes Identification. Int J Mol Sci 2020;21:E9070. [PMID: 33260643 PMCID: PMC7730808 DOI: 10.3390/ijms21239070] [Citation(s) in RCA: 37] [Impact Index Per Article: 9.3] [Reference Citation Analysis] [Abstract] [Key Words] [MESH Headings] [Grants] [Track Full Text] [Download PDF] [Figures] [Journal Information] [Subscribe] [Scholar Register] [Received: 10/04/2020] [Revised: 11/25/2020] [Accepted: 11/26/2020] [Indexed: 01/13/2023] Open

Hasan MA, Lonardi S. DeeplyEssential: a deep neural network for predicting essential genes in microbes. BMC Bioinformatics 2020;21:367. [PMID: 32998698 PMCID: PMC7525945 DOI: 10.1186/s12859-020-03688-y] [Citation(s) in RCA: 10] [Impact Index Per Article: 2.5] [Reference Citation Analysis] [Abstract] [Key Words] [MESH Headings] [Track Full Text] [Download PDF] [Figures] [Journal Information] [Subscribe] [Scholar Register] [Indexed: 12/31/2022] Open

Xu L, Guo Z, Liu X. Prediction of essential genes in prokaryote based on artificial neural network. Genes Genomics 2019;42:97-106. [DOI: 10.1007/s13258-019-00884-w] [Citation(s) in RCA: 3] [Impact Index Per Article: 0.6] [Reference Citation Analysis] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Received: 05/25/2019] [Accepted: 10/30/2019] [Indexed: 12/12/2022]

Azhagesan K, Ravindran B, Raman K. Network-based features enable prediction of essential genes across diverse organisms. PLoS One 2018;13:e0208722. [PMID: 30543651 PMCID: PMC6292609 DOI: 10.1371/journal.pone.0208722] [Citation(s) in RCA: 23] [Impact Index Per Article: 3.8] [Reference Citation Analysis] [Abstract] [MESH Headings] [Grants] [Track Full Text] [Download PDF] [Figures] [Journal Information] [Subscribe] [Scholar Register] [Received: 04/18/2018] [Accepted: 11/21/2018] [Indexed: 12/19/2022] Open

Abstract

Machine learning approaches to predict essential genes have gained a lot of traction in recent years. These approaches predominantly make use of sequence and network-based features to predict essential genes. However, the scope of network-based features used by the existing approaches is very narrow. Further, many of these studies focus on predicting essential genes within the same organism, which cannot be readily used to predict essential genes across organisms. Therefore, there is clearly a need for a method that is able to predict essential genes across organisms, by leveraging network-based features. In this study, we extract several sets of network-based features from protein-protein association networks available from the STRING database. Our network features include some common measures of centrality, and also some novel recursive measures recently proposed in social network literature. We extract hundreds of network-based features from networks of 27 diverse organisms to predict the essentiality of 87000+ genes. Our results show that network-based features are statistically significantly better at classifying essential genes across diverse bacterial species, compared to the current state-of-the-art methods, which use mostly sequence and a few 'conventional' network-based features. Our diverse set of network properties gave an AUROC of 0.847 and a precision of 0.320 across 27 organisms. When we augmented the complete set of network features with sequence-derived features, we achieved an improved AUROC of 0.857 and a precision of 0.335. We also constructed a reduced set of 100 sequence and network features, which gave a comparable performance. Further, we show that our features are useful for predicting essential genes in new organisms by using leave-one-species-out validation. Our network features capture the local, global and neighbourhood properties of the network and are hence effective for prediction of essential genes across diverse organisms, even in the absence of other complex biological knowledge. Our approach can be readily exploited to predict essentiality for organisms in interactome databases such as the STRING, where both network and sequence are readily available. All codes are available at https://github.com/RamanLab/nbfpeg.

Collapse

Dong C, Jin YT, Hua HL, Wen QF, Luo S, Zheng WX, Guo FB. Comprehensive review of the identification of essential genes using computational methods: focusing on feature implementation and assessment. Brief Bioinform 2018;21:171-181. [PMID: 30496347 DOI: 10.1093/bib/bby116] [Citation(s) in RCA: 19] [Impact Index Per Article: 3.2] [Reference Citation Analysis] [Abstract] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Received: 06/13/2018] [Revised: 11/01/2018] [Accepted: 11/02/2018] [Indexed: 02/06/2023] Open

Peng C, Lin Y, Luo H, Gao F. A Comprehensive Overview of Online Resources to Identify and Predict Bacterial Essential Genes. Front Microbiol 2017;8:2331. [PMID: 29230204 PMCID: PMC5711816 DOI: 10.3389/fmicb.2017.02331] [Citation(s) in RCA: 32] [Impact Index Per Article: 4.6] [Reference Citation Analysis] [Abstract] [Key Words] [Grants] [Track Full Text] [Download PDF] [Figures] [Journal Information] [Subscribe] [Scholar Register] [Received: 07/20/2017] [Accepted: 11/13/2017] [Indexed: 12/15/2022] Open

Khan S, Somvanshi P, Bhardwaj T, Mandal RK, Dar SA, Wahid M, Jawed A, Lohani M, Khan M, Areeshi MY, Haque S. Aspartate‐β‐semialdeyhyde dehydrogenase as a potential therapeutic target of Mycobacterium tuberculosis H37Rv: Evidence from in silico elementary mode analysis of biological network model. J Cell Biochem 2017;119:2832-2842. [DOI: 10.1002/jcb.26458] [Citation(s) in RCA: 9] [Impact Index Per Article: 1.3] [Reference Citation Analysis] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Received: 08/17/2017] [Accepted: 10/24/2017] [Indexed: 02/06/2023]

Nigatu D, Sobetzko P, Yousef M, Henkel W. Sequence-based information-theoretic features for gene essentiality prediction. BMC Bioinformatics 2017;18:473. [PMID: 29121868 PMCID: PMC5679510 DOI: 10.1186/s12859-017-1884-5] [Citation(s) in RCA: 28] [Impact Index Per Article: 4.0] [Reference Citation Analysis] [Abstract] [Key Words] [Track Full Text] [Download PDF] [Figures] [Journal Information] [Subscribe] [Scholar Register] [Received: 06/22/2017] [Accepted: 10/26/2017] [Indexed: 11/10/2022] Open

Liu X, Wang BJ, Xu L, Tang HL, Xu GQ. Selection of key sequence-based features for prediction of essential genes in 31 diverse bacterial species. PLoS One 2017;12:e0174638. [PMID: 28358836 PMCID: PMC5373589 DOI: 10.1371/journal.pone.0174638] [Citation(s) in RCA: 24] [Impact Index Per Article: 3.4] [Reference Citation Analysis] [Abstract] [MESH Headings] [Grants] [Track Full Text] [Download PDF] [Figures] [Journal Information] [Subscribe] [Scholar Register] [Received: 08/20/2016] [Accepted: 03/12/2017] [Indexed: 01/02/2023] Open

Nandi S, Subramanian A, Sarkar RR. An integrative machine learning strategy for improved prediction of essential genes in Escherichia coli metabolism using flux-coupled features. MOLECULAR BIOSYSTEMS 2017;13:1584-1596. [DOI: 10.1039/c7mb00234c] [Citation(s) in RCA: 21] [Impact Index Per Article: 3.0] [Reference Citation Analysis] [Abstract] [Track Full Text] [Subscribe] [Scholar Register] [Indexed: 01/24/2023]

An Approach for Predicting Essential Genes Using Multiple Homology Mapping and Machine Learning Algorithms. BIOMED RESEARCH INTERNATIONAL 2016;2016:7639397. [PMID: 27660763 PMCID: PMC5021884 DOI: 10.1155/2016/7639397] [Citation(s) in RCA: 12] [Impact Index Per Article: 1.5] [Reference Citation Analysis] [Abstract] [Track Full Text] [Download PDF] [Figures] [Subscribe] [Scholar Register] [Received: 05/30/2016] [Revised: 07/25/2016] [Accepted: 08/04/2016] [Indexed: 11/17/2022]