Reference Citation Analysis: Find an Article, Find a Category, Find a Journal, Find a Scholar

For: Do DT, Le TQT, Le NQK. Using deep neural networks and biological subwords to detect protein S-sulfenylation sites. Brief Bioinform 2020;22:5866114. [PMID: 32613242 DOI: 10.1093/bib/bbaa128] [Citation(s) in RCA: 40] [Impact Index Per Article: 10.0] [Received: 03/12/2020] [Revised: 05/11/2020] [Accepted: 05/26/2020] [Indexed: 12/11/2022] Open

Cited by Other Article(s)

Zhelyazkova M, Yordanova R, Mihaylov I, Tsonev S, Vassilev D. In silico discovering relationship between bacteriophages and antimicrobial resistance. Biotechnol Biotec Eq 2023. [DOI: 10.1080/13102818.2022.2151378] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 01/05/2023] Open

He S, Gao B, Sabnis R, Sun Q. Nucleic Transformer: Classifying DNA Sequences with Self-Attention and Convolutions. ACS Synth Biol 2023;12:3205-3214. [PMID: 37916871 PMCID: PMC10863451 DOI: 10.1021/acssynbio.3c00154] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 03/13/2023] [Revised: 10/04/2023] [Accepted: 10/06/2023] [Indexed: 11/03/2023]

Bischoff E, Lang L, Zimmermann J, Luczak M, Kiefer AM, Niedner-Schatteburg G, Manolikakes G, Morgan B, Deponte M. Glutathione kinetically outcompetes reactions between dimedone and a cyclic sulfenamide or physiological sulfenic acids. Free Radic Biol Med 2023;208:165-177. [PMID: 37541455 DOI: 10.1016/j.freeradbiomed.2023.08.005] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 07/07/2023] [Revised: 07/31/2023] [Accepted: 08/01/2023] [Indexed: 08/06/2023]

Le NQK, Li W, Cao Y. Sequence-based prediction model of protein crystallization propensity using machine learning and two-level feature selection. Brief Bioinform 2023;24:bbad319. [PMID: 37649385 DOI: 10.1093/bib/bbad319] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 03/09/2023] [Revised: 07/09/2023] [Accepted: 08/16/2023] [Indexed: 09/01/2023] Open

Zhang T, Jia J, Chen C, Zhang Y, Yu B. BiGRUD-SA: Protein S-sulfenylation sites prediction based on BiGRU and self-attention. Comput Biol Med 2023;163:107145. [PMID: 37336062 DOI: 10.1016/j.compbiomed.2023.107145] [Citation(s) in RCA: 1] [Impact Index Per Article: 1.0] [Received: 03/13/2023] [Revised: 05/18/2023] [Accepted: 06/06/2023] [Indexed: 06/21/2023]

Palangi V. Identification of Ruminal Fermentation Curves of Some Legume Forages Using Particle Swarm Optimization. Animals (Basel) 2023;13:ani13081339. [PMID: 37106901 PMCID: PMC10135319 DOI: 10.3390/ani13081339] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 03/03/2023] [Revised: 04/11/2023] [Accepted: 04/12/2023] [Indexed: 04/29/2023] Open

Luo H, Shan W, Chen C, Ding P, Luo L. Improving language model of human genome for DNA-protein binding prediction based on task-specific pre-training. Interdiscip Sci 2023;15:32-43. [PMID: 36136096 DOI: 10.1007/s12539-022-00537-9] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 03/31/2022] [Revised: 08/30/2022] [Accepted: 09/07/2022] [Indexed: 11/27/2022]

Watanabe N, Yamamoto M, Murata M, Vavricka CJ, Ogino C, Kondo A, Araki M. Comprehensive Machine Learning Prediction of Extensive Enzymatic Reactions. J Phys Chem B 2022;126:6762-6770. [PMID: 36053051 DOI: 10.1021/acs.jpcb.2c03287] [Citation(s) in RCA: 1] [Impact Index Per Article: 0.5] [Indexed: 01/04/2023]

Jiang Z, Lu Y, Liu Z, Wu W, Xu X, Dinnyés A, Yu Z, Chen L, Sun Q. Drug resistance prediction and resistance genes identification in Mycobacterium tuberculosis based on a hierarchical attentive neural network utilizing genome-wide variants. Brief Bioinform 2022;23:6553603. [PMID: 35325021 DOI: 10.1093/bib/bbac041] [Citation(s) in RCA: 1] [Impact Index Per Article: 0.5] [Received: 12/01/2021] [Revised: 01/18/2022] [Accepted: 01/27/2022] [Indexed: 01/25/2023] Open

Guo H, Song Y, Tang H, Zhao J. An ensemble deep neural network approach for predicting TOC concentration in lakes along the middle-lower reaches of Yangtze River. Journal of Intelligent & Fuzzy Systems 2022. [DOI: 10.3233/jifs-210708] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 11/15/2022]

Yao M, Fu L, Liu X, Zheng D. In-Silico Multi-Omics Analysis of the Functional Significance of Calmodulin 1 in Multiple Cancers. Front Genet 2022;12:793508. [PMID: 35096010 PMCID: PMC8790318 DOI: 10.3389/fgene.2021.793508] [Citation(s) in RCA: 3] [Impact Index Per Article: 1.5] [Received: 10/12/2021] [Accepted: 12/23/2021] [Indexed: 01/14/2023] Open

Abstract

Aberrant activation of calmodulin 1 (CALM1) has been reported in human cancers. However, a comprehensive understanding of the role of CALM1 in most cancer types is still lacking. We systematically analyzed the expression landscape, DNA methylation, gene alteration, immune infiltration, clinical relevance, and molecular pathway of CALM1 in multiple cancers using various online tools, including The Cancer Genome Atlas, cBioPortal and the Human Protein Atlas databases. Kaplan–Meier and receiver operating characteristic (ROC) curves were plotted to explore the prognostic and diagnostic potential of CALM1 expression. Multivariate analyses were used to evaluate whether CALM1 expression could be an independent risk factor. A nomogram predicting the overall survival (OS) of patients was developed, evaluated, and compared with the traditional Tumor-Node-Metastasis (TNM) model using decision curve analysis. R language was employed as the main tool for analysis and visualization. Results revealed CALM1 to be highly expressed in most cancers, its expression being regulated by DNA methylation in multiple cancers. CALM1 had a low mutation frequency (within 3%) and was associated with immune infiltration. We observed a substantial positive correlation between CALM1 expression and macrophage and neutrophil infiltration levels in multiple cancers. Different mutational forms of CALM1 hampered immune cell infiltration. Additionally, CALM1 expression had high diagnostic and prognostic potential. Multivariate analyses revealed CALM1 expression to be an independent risk factor for OS. Therefore, our newly developed nomogram had a higher clinical value than the TNM model. The concordance index, calibration curve, and time-dependent ROC curves of the nomogram exhibited excellent performance in terms of predicting the survival rate of patients. Moreover, elevated CALM1 expression contributes to the activation of cancer-related pathways, such as the WNT and MAPK pathways. Overall, our findings improved our understanding of the function of CALM1 in human cancers.

Recognition of mRNA N4 Acetylcytidine (ac4C) by Using Non-Deep vs. Deep Learning. Applied Sciences-Basel 2022. [DOI: 10.3390/app12031344] [Citation(s) in RCA: 5] [Impact Index Per Article: 2.5] [Indexed: 01/04/2023]

Yang Y, Lin L, Qiao L. Deep learning approaches for data-independent acquisition proteomics. Expert Rev Proteomics 2021;18:1031-1043. [PMID: 34918987 DOI: 10.1080/14789450.2021.2020654] [Citation(s) in RCA: 8] [Impact Index Per Article: 2.7] [Indexed: 01/30/2023]

Le NQK, Ho QT. Deep transformers and convolutional neural network in identifying DNA N6-methyladenine sites in cross-species genomes. Methods 2021;204:199-206. [PMID: 34915158 DOI: 10.1016/j.ymeth.2021.12.004] [Citation(s) in RCA: 28] [Impact Index Per Article: 9.3] [Received: 11/13/2021] [Revised: 11/30/2021] [Accepted: 12/09/2021] [Indexed: 12/19/2022] Open

Pakhrin SC, Aoki-Kinoshita KF, Caragea D, KC DB. DeepNGlyPred: A Deep Neural Network-Based Approach for Human N-Linked Glycosylation Site Prediction. Molecules 2021;26:molecules26237314. [PMID: 34885895 PMCID: PMC8658957 DOI: 10.3390/molecules26237314] [Citation(s) in RCA: 13] [Impact Index Per Article: 4.3] [Received: 11/01/2021] [Revised: 11/22/2021] [Accepted: 11/26/2021] [Indexed: 12/21/2022] Open

Predicting Three-Dimensional Dose Distribution of Prostate Volumetric Modulated Arc Therapy Using Deep Learning. Life (Basel) 2021;11:life11121305. [PMID: 34947836 PMCID: PMC8706736 DOI: 10.3390/life11121305] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 10/22/2021] [Revised: 11/19/2021] [Accepted: 11/23/2021] [Indexed: 11/21/2022] Open

Huang S, Liu Y, Sun X, Li J. Application of Artificial Neural Network Based on Traditional Detection and GC-MS in Prediction of Free Radicals in Thermal Oxidation of Vegetable Oil. Molecules 2021;26:6717. [PMID: 34771126 PMCID: PMC8586939 DOI: 10.3390/molecules26216717] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 10/02/2021] [Revised: 11/01/2021] [Accepted: 11/02/2021] [Indexed: 11/30/2022] Open

Xu Z, Luo M, Lin W, Xue G, Wang P, Jin X, Xu C, Zhou W, Cai Y, Yang W, Nie H, Jiang Q. DLpTCR: an ensemble deep learning framework for predicting immunogenic peptide recognized by T cell receptor. Brief Bioinform 2021;22:6355415. [PMID: 34415016 DOI: 10.1093/bib/bbab335] [Citation(s) in RCA: 36] [Impact Index Per Article: 12.0] [Received: 05/20/2021] [Revised: 07/25/2021] [Accepted: 07/28/2021] [Indexed: 12/30/2022] Open

Ali H, Iqbal K, Mujtaba G, Fayyaz A, Bulbul MF, Karam FW, Zahir A. Urdu text in natural scene images: a new dataset and preliminary text detection. PeerJ Comput Sci 2021;7:e717. [PMID: 34616893 PMCID: PMC8459794 DOI: 10.7717/peerj-cs.717] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 11/16/2020] [Accepted: 08/25/2021] [Indexed: 06/13/2023]

Jia Y, Liu Y, Han Z, Tian R. Identification of potential gene signatures associated with osteosarcoma by integrated bioinformatics analysis. PeerJ 2021;9:e11496. [PMID: 34123594 PMCID: PMC8164836 DOI: 10.7717/peerj.11496] [Citation(s) in RCA: 13] [Impact Index Per Article: 4.3] [Received: 07/01/2020] [Accepted: 04/30/2021] [Indexed: 12/21/2022] Open

Abstract

Background

Osteosarcoma (OS) is the most common primary malignant bone cancer in children and adolescents and has a high mortality rate. This work aims to screen novel potential gene signatures associated with OS by integrated microarray analysis of the Gene Expression Omnibus (GEO) database.

Material and Methods

The OS microarray datasets were searched and downloaded from the GEO database to identify differentially expressed genes (DEGs) between OS and normal samples. Afterwards, functional enrichment analysis, protein–protein interaction (PPI) network analysis and transcription factor (TF)-target gene regulatory network analysis were applied to uncover the biological function of the DEGs. Finally, two published OS datasets (GSE39262 and GSE126209) were obtained from the GEO database to evaluate the expression levels and diagnostic value of key genes.

Results

In total 1,059 DEGs (569 up-regulated DEGs and 490 down-regulated DEGs) between OS and normal samples were screened. Functional analysis showed that these DEGs were markedly enriched in 214 GO terms and 54 KEGG pathways such as pathways in cancer. Five genes (CAMP, METTL7A, TCN1, LTF and CXCL12) acted as hub genes in PPI network. Besides, METTL7A, CYP4F3, TCN1, LTF and NETO2 were key genes in TF-gene network. Moreover, Pax-6 regulated four key genes (TCN1, CYP4F3, NETO2 and CXCL12). The expression levels of four genes (METTL7A, TCN1, CXCL12 and NETO2) in GSE39262 set were consistent with our integration analysis. The expression levels of two genes (CXCL12 and NETO2) in GSE126209 set were consistent with our integration analysis. ROC analysis of GSE39262 set revealed that CYP4F3, CXCL12, METTL7A, TCN1 and NETO2 had good diagnostic values for OS patients. ROC analysis of GSE126209 set revealed that CXCL12, METTL7A, TCN1 and NETO2 had good diagnostic values for OS patients.

Mosquera Navarro R, Castrillón OD, Parra Osorio L, Oliveira T, Novais P, Valencia JF. Improving classification based on physical surface tension-neural net for the prediction of psychosocial-risk level in public school teachers. PeerJ Comput Sci 2021;7:e511. [PMID: 34141875 PMCID: PMC8176537 DOI: 10.7717/peerj-cs.511] [Citation(s) in RCA: 1] [Impact Index Per Article: 0.3] [Received: 06/29/2020] [Accepted: 04/06/2021] [Indexed: 06/12/2023]

Abstract

BACKGROUND

Psychosocial risks, also present in educational processes, are stress factors particularly critical in state-schools, affecting the efficacy, stress, and job satisfaction of the teachers. This study proposes an intelligent algorithm to improve the prediction of psychosocial risk, as a tool for the generation of health and risk prevention assistance programs.

METHODS

The proposed approach, Physical Surface Tension-Neural Net (PST-NN), applied the theory of surface tension in liquids to an artificial neural network (ANN) in order to model four risk levels (low, medium, high and very high psychosocial risk). The model was trained and tested using the results of tests measuring the psychosocial risk levels of 5,443 teachers. Psychosocial factors, along with physiological and musculoskeletal symptoms, were included as inputs of the model. The classification efficiency of the PST-NN approach was evaluated using the sensitivity, specificity, accuracy and ROC curve metrics, and compared against other techniques such as the Decision Tree model, Naïve Bayes, ANN, Support Vector Machines, Robust Linear Regression and the Logistic Regression Model.

RESULTS

The modification of the ANN model, by the adaptation of a layer that includes concepts related to the theory of physical surface tension, improved the separation of the subjects according to the risk level group, as a function of the mass and perimeter outputs. Indeed, the PST-NN model performed better in classifying psychosocial risk levels among state-school teachers than the linear, probabilistic and logistic models included in this study, obtaining an average accuracy of 97.31%.

CONCLUSIONS

The introduction of physical models, such as physical surface tension, can improve the classification performance of ANNs. In particular, the PST-NN model can be used to predict and classify psychosocial risk levels among state-school teachers at work. This model could help with the early identification of psychosocial risk and with the development of programs to prevent it.

Makarov I, Makarov M, Kiselev D. Fusion of text and graph information for machine learning problems on networks. PeerJ Comput Sci 2021;7:e526. [PMID: 34084929 PMCID: PMC8157042 DOI: 10.7717/peerj-cs.526] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 11/12/2020] [Accepted: 04/14/2021] [Indexed: 06/12/2023]

Shafiq S, Azim T. Introspective analysis of convolutional neural networks for improving discrimination performance and feature visualisation. PeerJ Comput Sci 2021;7:e497. [PMID: 34013030 PMCID: PMC8114803 DOI: 10.7717/peerj-cs.497] [Citation(s) in RCA: 1] [Impact Index Per Article: 0.3] [Received: 10/28/2020] [Accepted: 03/30/2021] [Indexed: 06/12/2023]

Zhelyazkova M, Yordanova R, Mihaylov I, Kirov S, Tsonev S, Danko D, Mason C, Vassilev D. Origin Sample Prediction and Spatial Modeling of Antimicrobial Resistance in Metagenomic Sequencing Data. Front Genet 2021;12:642991. [PMID: 33763122 PMCID: PMC7983949 DOI: 10.3389/fgene.2021.642991] [Citation(s) in RCA: 2] [Impact Index Per Article: 0.7] [Received: 12/17/2020] [Accepted: 02/02/2021] [Indexed: 12/18/2022] Open

Xiao J, Wang R, Cai X, Ye Z. Coupling of Co-expression Network Analysis and Machine Learning Validation Unearthed Potential Key Genes Involved in Rheumatoid Arthritis. Front Genet 2021;12:604714. [PMID: 33643380 PMCID: PMC7905311 DOI: 10.3389/fgene.2021.604714] [Citation(s) in RCA: 5] [Impact Index Per Article: 1.7] [Received: 09/10/2020] [Accepted: 01/04/2021] [Indexed: 12/21/2022] Open

Wang X, Li BB. Deep Learning in Head and Neck Tumor Multiomics Diagnosis and Analysis: Review of the Literature. Front Genet 2021;12:624820. [PMID: 33643386 PMCID: PMC7902873 DOI: 10.3389/fgene.2021.624820] [Citation(s) in RCA: 8] [Impact Index Per Article: 2.7] [Received: 11/01/2020] [Accepted: 01/07/2021] [Indexed: 12/24/2022] Open

Bai R, Jiang S, Sun H, Yang Y, Li G. Deep Neural Network-Based Semantic Segmentation of Microvascular Decompression Images. Sensors 2021;21:s21041167. [PMID: 33562275 PMCID: PMC7915571 DOI: 10.3390/s21041167] [Citation(s) in RCA: 12] [Impact Index Per Article: 4.0] [Received: 12/31/2020] [Revised: 01/26/2021] [Accepted: 02/02/2021] [Indexed: 11/30/2022]

Le NQK, Ho QT, Nguyen TTD, Ou YY. A transformer architecture based on BERT and 2D convolutional neural network to identify DNA enhancers from sequence information. Brief Bioinform 2021;22:6128847. [PMID: 33539511 DOI: 10.1093/bib/bbab005] [Citation(s) in RCA: 75] [Impact Index Per Article: 25.0] [Received: 12/18/2020] [Revised: 01/01/2021] [Accepted: 01/03/2021] [Indexed: 01/11/2023] Open

Explainable AI Framework for Multivariate Hydrochemical Time Series. Machine Learning and Knowledge Extraction 2021. [DOI: 10.3390/make3010009] [Citation(s) in RCA: 8] [Impact Index Per Article: 2.7] [Indexed: 01/09/2023]

Makarov I, Kiselev D, Nikitinsky N, Subelj L. Survey on graph embeddings and their applications to machine learning problems on graphs. PeerJ Comput Sci 2021;7:e357. [PMID: 33817007 PMCID: PMC7959646 DOI: 10.7717/peerj-cs.357] [Citation(s) in RCA: 11] [Impact Index Per Article: 3.7] [Received: 07/21/2020] [Accepted: 12/18/2020] [Indexed: 05/13/2023]

Abstract

Dealing with relational data has always required significant computational resources, domain expertise and task-dependent feature engineering to incorporate structural information into a predictive model. Nowadays, a family of automated graph feature engineering techniques has been proposed in different streams of literature. So-called graph embeddings provide a powerful tool to construct vectorized feature spaces for graphs and their components, such as nodes, edges and subgraphs, while preserving inner graph properties. Using the constructed feature spaces, many machine learning problems on graphs can be solved via standard frameworks suitable for vectorized feature representation. Our survey aims to describe the core concepts of graph embeddings and provide several taxonomies for their description. First, we start with the methodological approach and extract three types of graph embedding models based on matrix factorization, random-walks and deep learning approaches. Next, we describe how different types of networks impact the ability of models to incorporate structural and attributed data into a unified embedding. Going further, we perform a thorough evaluation of graph embedding applications to machine learning problems on graphs, among which are node classification, link prediction, clustering, visualization, compression, and a family of whole-graph embedding algorithms suitable for graph classification, similarity and alignment problems. Finally, we overview the existing applications of graph embeddings to computer science domains, formulate open problems and provide experimental results explaining how different network properties affect graph embedding quality in four classic machine learning problems on graphs: node classification, link prediction, clustering and graph visualization. As a result, our survey covers a new, rapidly growing field of network feature engineering, presents an in-depth analysis of models based on network types, and overviews a wide range of applications to machine learning problems on graphs.

SSnet: A Deep Learning Approach for Protein-Ligand Interaction Prediction. Int J Mol Sci 2021;22:ijms22031392. [PMID: 33573266 PMCID: PMC7869013 DOI: 10.3390/ijms22031392] [Citation(s) in RCA: 20] [Impact Index Per Article: 6.7] [Received: 12/28/2020] [Revised: 01/24/2021] [Accepted: 01/27/2021] [Indexed: 12/15/2022] Open

Wu F, Yang R, Zhang C, Zhang L. A deep learning framework combined with word embedding to identify DNA replication origins. Sci Rep 2021;11:844. [PMID: 33436981 PMCID: PMC7804333 DOI: 10.1038/s41598-020-80670-x] [Citation(s) in RCA: 5] [Impact Index Per Article: 1.7] [Received: 10/23/2020] [Accepted: 12/24/2020] [Indexed: 01/29/2023] Open

Abstract

DNA replication influences the inheritance of genetic information in the DNA life cycle. As the distribution of replication origins (ORIs) is the major determinant in precisely regulating the replication process, the correct identification of ORIs is significant in giving an insightful understanding of DNA replication mechanisms and the regulatory mechanisms of gene expression. For eukaryotes in particular, multiple ORIs exist in each of their gene sequences to complete the replication in a reasonable period of time. To simplify the identification of eukaryotic ORIs, most existing methods rely on traditional machine learning algorithms and target gene sequences of a fixed length. Consequently, the identification results are not satisfactory, and there is still great room for improvement. To overcome the limitations of previous studies, this paper develops sequence segmentation methods and employs the word embedding technique 'Word2vec' to convert gene sequences into word vectors, thereby capturing the inner correlations of gene sequences with different lengths. Then, a deep learning framework to perform the ORI identification task is constructed from a convolutional neural network with an embedding layer. On the basis of the analysis of the dimensionality-reduction similarity diagram, Word2vec can effectively transform the inner relationships among words into numerical features. For the four species in this study, the best models are obtained with overall accuracies of 0.975, 0.765, 0.885, 0.967, Matthews correlation coefficients of 0.940, 0.530, 0.771, 0.934, and AUCs of 0.975, 0.800, 0.888, 0.981, which indicates that the proposed predictor is stable and classifies both ORIs and non-ORIs with high confidence. Compared with state-of-the-art methods, the proposed predictor achieves ORI identification with significant improvement. It is therefore reasonable to anticipate that the proposed method will make a useful high-throughput tool for genome analysis.

Wang P, Zhang Q, Li S, Cheng B, Xue H, Wei Z, Shao T, Liu ZX, Cheng H, Wang Z. iCysMod: an integrative database for protein cysteine modifications in eukaryotes. Brief Bioinform 2021;22:6066620. [PMID: 33406221 DOI: 10.1093/bib/bbaa400] [Citation(s) in RCA: 7] [Impact Index Per Article: 2.3] [Received: 09/25/2020] [Revised: 11/23/2020] [Accepted: 12/07/2020] [Indexed: 01/06/2023] Open

Zhao X, He M. Comprehensive pathway-related genes signature for prognosis and recurrence of ovarian cancer. PeerJ 2020;8:e10437. [PMID: 33344083 PMCID: PMC7718801 DOI: 10.7717/peerj.10437] [Citation(s) in RCA: 1] [Impact Index Per Article: 0.3] [Received: 07/13/2020] [Accepted: 11/06/2020] [Indexed: 12/14/2022] Open

Le NQK, Do DT, Hung TNK, Lam LHT, Huynh TT, Nguyen NTK. A Computational Framework Based on Ensemble Deep Neural Networks for Essential Genes Identification. Int J Mol Sci 2020;21:E9070. [PMID: 33260643 PMCID: PMC7730808 DOI: 10.3390/ijms21239070] [Citation(s) in RCA: 37] [Impact Index Per Article: 9.3] [Received: 10/04/2020] [Revised: 11/25/2020] [Accepted: 11/26/2020] [Indexed: 01/13/2023] Open

Liu W, Juhas M, Zhang Y. Fine-Grained Breast Cancer Classification With Bilinear Convolutional Neural Networks (BCNNs). Front Genet 2020;11:547327. [PMID: 33101377 PMCID: PMC7500315 DOI: 10.3389/fgene.2020.547327] [Citation(s) in RCA: 9] [Impact Index Per Article: 2.3] [Received: 04/02/2020] [Accepted: 08/17/2020] [Indexed: 12/24/2022] Open

Machine Learning Model for Identifying Antioxidant Proteins Using Features Calculated from Primary Sequences. Biology 2020;9:biology9100325. [PMID: 33036150 PMCID: PMC7599600 DOI: 10.3390/biology9100325] [Citation(s) in RCA: 39] [Impact Index Per Article: 9.8] [Received: 08/27/2020] [Revised: 10/03/2020] [Accepted: 10/04/2020] [Indexed: 12/15/2022]

Le NQK, Do DT, Chiu FY, Yapp EKY, Yeh HY, Chen CY. XGBoost Improves Classification of MGMT Promoter Methylation Status in IDH1 Wildtype Glioblastoma. J Pers Med 2020;10:jpm10030128. [PMID: 32942564 PMCID: PMC7563334 DOI: 10.3390/jpm10030128] [Citation(s) in RCA: 57] [Impact Index Per Article: 14.3] [Received: 08/22/2020] [Revised: 09/03/2020] [Accepted: 09/09/2020] [Indexed: 02/07/2023] Open