Reference Citation Analysis: Find an Article, Find a Category, Find a Journal, Find a Scholar

For: Zhang Z, Zhao Y, Liao X, Shi W, Li K, Zou Q, Peng S. Deep learning in omics: a survey and guideline. Brief Funct Genomics 2020;18:41-57. [PMID: 30265280 DOI: 10.1093/bfgp/ely030] [Citation(s) in RCA: 80] [Impact Index Per Article: 20.0] [Reference Citation Analysis] [What about the content of this article? (0)] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Received: 03/04/2018] [Revised: 07/31/2018] [Accepted: 08/30/2018] [Indexed: 01/17/2023] Open

For:	Zhang Z, Zhao Y, Liao X, Shi W, Li K, Zou Q, Peng S. Deep learning in omics: a survey and guideline. Brief Funct Genomics 2020;18:41-57. [PMID: 30265280 DOI: 10.1093/bfgp/ely030] [Citation(s) in RCA: 80] [Impact Index Per Article: 20.0] [Reference Citation Analysis] [What about the content of this article? (0)] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Received: 03/04/2018] [Revised: 07/31/2018] [Accepted: 08/30/2018] [Indexed: 01/17/2023] Open

Number

Cited by Other Article(s)

Kang Q, Meng J, Shi W, Luan Y. Ensemble Deep Learning Based on Multi-level Information Enhancement and Greedy Fuzzy Decision for Plant miRNA-lncRNA Interaction Prediction. Interdiscip Sci 2021;13:603-614. [PMID: 33900552 DOI: 10.1007/s12539-021-00434-7] [Citation(s) in RCA: 10] [Impact Index Per Article: 3.3] [Reference Citation Analysis] [Abstract] [Key Words] [MESH Headings] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Received: 01/13/2021] [Revised: 04/01/2021] [Accepted: 04/16/2021] [Indexed: 12/18/2022]

Performance Comparison of Deep Learning Autoencoders for Cancer Subtype Detection Using Multi-Omics Data. Cancers (Basel) 2021;13:cancers13092013. [PMID: 33921978 PMCID: PMC8122584 DOI: 10.3390/cancers13092013] [Citation(s) in RCA: 20] [Impact Index Per Article: 6.7] [Reference Citation Analysis] [Abstract] [Key Words] [Track Full Text] [Download PDF] [Figures] [Journal Information] [Subscribe] [Scholar Register] [Received: 02/15/2021] [Revised: 03/29/2021] [Accepted: 04/06/2021] [Indexed: 12/14/2022] Open

Gu T, Zhao X, Barbazuk WB, Lee JH. miTAR: a hybrid deep learning-based approach for predicting miRNA targets. BMC Bioinformatics 2021;22:96. [PMID: 33639834 PMCID: PMC7912887 DOI: 10.1186/s12859-021-04026-6] [Citation(s) in RCA: 6] [Impact Index Per Article: 2.0] [Reference Citation Analysis] [Abstract] [Key Words] [Download PDF] [Figures] [Journal Information] [Subscribe] [Scholar Register] [Received: 10/17/2020] [Accepted: 02/14/2021] [Indexed: 02/08/2023] Open

Abstract

BACKGROUND

microRNAs (miRNAs) have been shown to play essential roles in a wide range of biological processes. Many computational methods have been developed to identify targets of miRNAs. However, the majority of these methods depend on pre-defined features that require considerable efforts and resources to compute and often prove suboptimal at predicting miRNA targets.

RESULTS

We developed a novel hybrid deep learning-based (DL-based) approach that is capable of predicting miRNA targets at a higher accuracy. This approach integrates convolutional neural networks (CNNs) that excel in learning spatial features and recurrent neural networks (RNNs) that discern sequential features. Therefore, our approach has the advantages of learning both the intrinsic spatial and sequential features of miRNA:target. The inputs for our approach are raw sequences of miRNAs and genes that can be obtained effortlessly. We applied our approach on two human datasets from recently miRNA target prediction studies and trained two models. We demonstrated that the two models consistently outperform the previous methods according to evaluation metrics on test datasets. Comparing our approach with currently available alternatives on independent datasets shows that our approach delivers substantial improvements in performance. We also show with multiple evidences that our approach is more robust than other methods on small datasets. Our study is the first study to perform comparisons across multiple existing DL-based approaches on miRNA target prediction. Furthermore, we examined the contribution of a Max pooling layer in between the CNN and RNN and demonstrated that it improves the performance of all our models. Finally, a unified model was developed that is robust on fitting different input datasets.

CONCLUSIONS

We present a new DL-based approach for predicting miRNA targets and demonstrate that our approach outperforms the current alternatives. We supplied an easy-to-use tool, miTAR, at https://github.com/tjgu/miTAR . Furthermore, our analysis results support that Max Pooling generally benefits the hybrid models and potentially prevents overfitting for hybrid models.

Collapse

Wörheide MA, Krumsiek J, Kastenmüller G, Arnold M. Multi-omics integration in biomedical research - A metabolomics-centric review. Anal Chim Acta 2021;1141:144-162. [PMID: 33248648 PMCID: PMC7701361 DOI: 10.1016/j.aca.2020.10.038] [Citation(s) in RCA: 101] [Impact Index Per Article: 33.7] [Reference Citation Analysis] [Abstract] [Key Words] [MESH Headings] [Grants] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Received: 07/15/2020] [Revised: 10/09/2020] [Accepted: 10/19/2020] [Indexed: 02/07/2023]

"Omics" in traumatic brain injury: novel approaches to a complex disease. Acta Neurochir (Wien) 2021;163:2581-2594. [PMID: 34273044 PMCID: PMC8357753 DOI: 10.1007/s00701-021-04928-7] [Citation(s) in RCA: 4] [Impact Index Per Article: 1.3] [Reference Citation Analysis] [Abstract] [Key Words] [Track Full Text] [Download PDF] [Figures] [Journal Information] [Subscribe] [Scholar Register] [Received: 03/15/2021] [Accepted: 06/23/2021] [Indexed: 11/12/2022]

Lv Z, Ding H, Wang L, Zou Q. A Convolutional Neural Network Using Dinucleotide One-hot Encoder for identifying DNA N6-Methyladenine Sites in the Rice Genome. Neurocomputing 2021. [DOI: 10.1016/j.neucom.2020.09.056] [Citation(s) in RCA: 24] [Impact Index Per Article: 8.0] [Reference Citation Analysis] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Indexed: 12/13/2022]

Wang XF, Gao P, Liu YF, Li HF, Lu F. Predicting Thermophilic Proteins by Machine Learning. Curr Bioinform 2020. [DOI: 10.2174/1574893615666200207094357] [Citation(s) in RCA: 83] [Impact Index Per Article: 20.8] [Reference Citation Analysis] [Abstract] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Indexed: 01/05/2023]

Fu Y, Xu J, Tang Z, Wang L, Yin D, Fan Y, Zhang D, Deng F, Zhang Y, Zhang H, Wang H, Xing W, Yin L, Zhu S, Zhu M, Yu M, Li X, Liu X, Yuan X, Zhao S. A gene prioritization method based on a swine multi-omics knowledgebase and a deep learning model. Commun Biol 2020;3:502. [PMID: 32913254 PMCID: PMC7483748 DOI: 10.1038/s42003-020-01233-4] [Citation(s) in RCA: 31] [Impact Index Per Article: 7.8] [Reference Citation Analysis] [Abstract] [Track Full Text] [Download PDF] [Figures] [Journal Information] [Subscribe] [Scholar Register] [Received: 03/18/2020] [Accepted: 08/07/2020] [Indexed: 12/27/2022] Open

Affiliation(s)

Yuhua Fu Key Laboratory of Agricultural Animal Genetics, Breeding and Reproduction, Ministry of Education, Key Laboratory of Swine Genetics and Breeding, Ministry of Agriculture, College of Animal Science and Technology, Huazhong Agricultural University, 430070, Wuhan, Hubei, P.R. China.,School of Computer Science and Technology, Wuhan University of Technology, 430070, Wuhan, Hubei, P.R. China
Jingya Xu Key Laboratory of Agricultural Animal Genetics, Breeding and Reproduction, Ministry of Education, Key Laboratory of Swine Genetics and Breeding, Ministry of Agriculture, College of Animal Science and Technology, Huazhong Agricultural University, 430070, Wuhan, Hubei, P.R. China
Zhenshuang Tang Key Laboratory of Agricultural Animal Genetics, Breeding and Reproduction, Ministry of Education, Key Laboratory of Swine Genetics and Breeding, Ministry of Agriculture, College of Animal Science and Technology, Huazhong Agricultural University, 430070, Wuhan, Hubei, P.R. China
Lu Wang Key Laboratory of Agricultural Animal Genetics, Breeding and Reproduction, Ministry of Education, Key Laboratory of Swine Genetics and Breeding, Ministry of Agriculture, College of Animal Science and Technology, Huazhong Agricultural University, 430070, Wuhan, Hubei, P.R. China
Dong Yin Key Laboratory of Agricultural Animal Genetics, Breeding and Reproduction, Ministry of Education, Key Laboratory of Swine Genetics and Breeding, Ministry of Agriculture, College of Animal Science and Technology, Huazhong Agricultural University, 430070, Wuhan, Hubei, P.R. China
Yu Fan Key Laboratory of Agricultural Animal Genetics, Breeding and Reproduction, Ministry of Education, Key Laboratory of Swine Genetics and Breeding, Ministry of Agriculture, College of Animal Science and Technology, Huazhong Agricultural University, 430070, Wuhan, Hubei, P.R. China
Dongdong Zhang School of Computer Science and Technology, Wuhan University of Technology, 430070, Wuhan, Hubei, P.R. China
Fei Deng School of Computer Science and Technology, Wuhan University of Technology, 430070, Wuhan, Hubei, P.R. China
Yanping Zhang School of Computer Science and Technology, Wuhan University of Technology, 430070, Wuhan, Hubei, P.R. China
Haohao Zhang School of Computer Science and Technology, Wuhan University of Technology, 430070, Wuhan, Hubei, P.R. China
Haiyan Wang Key Laboratory of Agricultural Animal Genetics, Breeding and Reproduction, Ministry of Education, Key Laboratory of Swine Genetics and Breeding, Ministry of Agriculture, College of Animal Science and Technology, Huazhong Agricultural University, 430070, Wuhan, Hubei, P.R. China
Wenhui Xing School of Computer Science and Technology, Wuhan University of Technology, 430070, Wuhan, Hubei, P.R. China
Lilin Yin Key Laboratory of Agricultural Animal Genetics, Breeding and Reproduction, Ministry of Education, Key Laboratory of Swine Genetics and Breeding, Ministry of Agriculture, College of Animal Science and Technology, Huazhong Agricultural University, 430070, Wuhan, Hubei, P.R. China
Shilin Zhu Key Laboratory of Agricultural Animal Genetics, Breeding and Reproduction, Ministry of Education, Key Laboratory of Swine Genetics and Breeding, Ministry of Agriculture, College of Animal Science and Technology, Huazhong Agricultural University, 430070, Wuhan, Hubei, P.R. China
Mengjin Zhu Key Laboratory of Agricultural Animal Genetics, Breeding and Reproduction, Ministry of Education, Key Laboratory of Swine Genetics and Breeding, Ministry of Agriculture, College of Animal Science and Technology, Huazhong Agricultural University, 430070, Wuhan, Hubei, P.R. China
Mei Yu Key Laboratory of Agricultural Animal Genetics, Breeding and Reproduction, Ministry of Education, Key Laboratory of Swine Genetics and Breeding, Ministry of Agriculture, College of Animal Science and Technology, Huazhong Agricultural University, 430070, Wuhan, Hubei, P.R. China
Xinyun Li Key Laboratory of Agricultural Animal Genetics, Breeding and Reproduction, Ministry of Education, Key Laboratory of Swine Genetics and Breeding, Ministry of Agriculture, College of Animal Science and Technology, Huazhong Agricultural University, 430070, Wuhan, Hubei, P.R. China
Xiaolei Liu Key Laboratory of Agricultural Animal Genetics, Breeding and Reproduction, Ministry of Education, Key Laboratory of Swine Genetics and Breeding, Ministry of Agriculture, College of Animal Science and Technology, Huazhong Agricultural University, 430070, Wuhan, Hubei, P.R. China.
Xiaohui Yuan School of Computer Science and Technology, Wuhan University of Technology, 430070, Wuhan, Hubei, P.R. China.
Shuhong Zhao Key Laboratory of Agricultural Animal Genetics, Breeding and Reproduction, Ministry of Education, Key Laboratory of Swine Genetics and Breeding, Ministry of Agriculture, College of Animal Science and Technology, Huazhong Agricultural University, 430070, Wuhan, Hubei, P.R. China.

Collapse

Meijering E. A bird's-eye view of deep learning in bioimage analysis. Comput Struct Biotechnol J 2020;18:2312-2325. [PMID: 32994890 PMCID: PMC7494605 DOI: 10.1016/j.csbj.2020.08.003] [Citation(s) in RCA: 58] [Impact Index Per Article: 14.5] [Reference Citation Analysis] [Abstract] [Key Words] [Track Full Text] [Download PDF] [Figures] [Journal Information] [Subscribe] [Scholar Register] [Received: 05/28/2020] [Revised: 07/26/2020] [Accepted: 08/01/2020] [Indexed: 02/07/2023] Open

Li H, Du H, Wang X, Gao P, Liu Y, Lin W. Remarks on Computational Method for Identifying Acid and Alkaline Enzymes. Curr Pharm Des 2020;26:3105-3114. [PMID: 32552636 DOI: 10.2174/1381612826666200617170826] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Reference Citation Analysis] [Abstract] [Key Words] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Received: 01/01/2020] [Accepted: 05/07/2020] [Indexed: 11/22/2022]

Adam G, Rampášek L, Safikhani Z, Smirnov P, Haibe-Kains B, Goldenberg A. Machine learning approaches to drug response prediction: challenges and recent progress. NPJ Precis Oncol 2020;4:19. [PMID: 32566759 PMCID: PMC7296033 DOI: 10.1038/s41698-020-0122-1] [Citation(s) in RCA: 120] [Impact Index Per Article: 30.0] [Reference Citation Analysis] [Abstract] [Key Words] [Track Full Text] [Download PDF] [Figures] [Journal Information] [Subscribe] [Scholar Register] [Received: 08/23/2019] [Accepted: 04/17/2020] [Indexed: 12/24/2022] Open

Wilentzik Müller R, Gat-Viks I. Exploring Neural Networks and Related Visualization Techniques in Gene Expression Data. Front Genet 2020;11:402. [PMID: 32499810 PMCID: PMC7243731 DOI: 10.3389/fgene.2020.00402] [Citation(s) in RCA: 2] [Impact Index Per Article: 0.5] [Reference Citation Analysis] [Abstract] [Key Words] [Track Full Text] [Download PDF] [Figures] [Journal Information] [Subscribe] [Scholar Register] [Received: 12/13/2019] [Accepted: 03/30/2020] [Indexed: 12/04/2022] Open

Dou L, Li X, Ding H, Xu L, Xiang H. Is There Any Sequence Feature in the RNA Pseudouridine Modification Prediction Problem? MOLECULAR THERAPY. NUCLEIC ACIDS 2020;19:293-303. [PMID: 31865116 PMCID: PMC6931122 DOI: 10.1016/j.omtn.2019.11.014] [Citation(s) in RCA: 14] [Impact Index Per Article: 3.5] [Reference Citation Analysis] [Abstract] [Key Words] [Track Full Text] [Download PDF] [Figures] [Subscribe] [Scholar Register] [Received: 09/08/2019] [Revised: 10/29/2019] [Accepted: 11/11/2019] [Indexed: 01/01/2023]

Abstract

Pseudouridine (Ψ) is the most abundant RNA modification and has been found in many kinds of RNAs, including snRNA, rRNA, tRNA, mRNA, and snoRNA. Thus, Ψ sites play a significant role in basic research and drug development. Although some experimental techniques have been developed to identify Ψ sites, they are expensive and time consuming, especially in the post-genomic era with the explosive growth of known RNA sequences. Thus, highly accurate computational methods are urgently required to quickly detect the Ψ sites on uncharacterized RNA sequences. Several predictors have been proposed using multifarious features, but their evaluated performances are still unsatisfactory. In this study, we first identified Ψ sites for H. sapiens, S. cerevisiae, and M. musculus using the sequence features from the bi-profile Bayes (BPB) method based on the random forest (RF) and support vector machine (SVM) algorithms, where the performances were evaluated using 5-fold cross-validation and independent tests. It was found that the SVM-based accuracies were 3.55% and 5.09% lower than the iPseU-CUU predictor for the H_990 and S_628 datasets, respectively. Almost the same-level results were obtained for M_994 and an independent H_200 dataset, even showing a 5.0% improvement for S_200. Then, three different kinds of features, including basic Kmer, general parallel correlation pseudo-dinucleotide composition (PC-PseDNC-General), and nucleotide chemical property (NCP) and nucleotide density (ND) from the iRNA-PseU method, were combined with BPB to show their comprehensive performances, where the effective features are selected by the max-relevance-max-distance (MRMD) method. The best evaluated accuracies of the combined features for the S_628 and M_994 datasets were achieved at 70.54% and 72.45%, which were 2.39% and 0.65% higher than iPseU-CUU. For the S_200 dataset, it was also improved 8% from 69% to 77%. However, there was no obvious improvement for H. sapiens, which was evaluated as approximately 63.23% and 72.0% for the H_990 and H_200 datasets, respectively. The overall performances for Ψ identification using BPB features as well as the combined features were not obviously improved. Although some kinds of feature extraction methods based on the RNA sequence information have been applied to construct the predictors in previous studies, the corresponding accuracies are generally in the range of 60%-70%. Thus, researchers need to reconsider whether there is any sequence feature in the RNA Ψ modification prediction problem.

Collapse

Peng L, Liu F, Yang J, Liu X, Meng Y, Deng X, Peng C, Tian G, Zhou L. Probing lncRNA-Protein Interactions: Data Repositories, Models, and Algorithms. Front Genet 2020;10:1346. [PMID: 32082358 PMCID: PMC7005249 DOI: 10.3389/fgene.2019.01346] [Citation(s) in RCA: 26] [Impact Index Per Article: 6.5] [Reference Citation Analysis] [Abstract] [Key Words] [Track Full Text] [Download PDF] [Figures] [Journal Information] [Subscribe] [Scholar Register] [Received: 09/27/2019] [Accepted: 12/09/2019] [Indexed: 12/31/2022] Open

Padi S, Manescu P, Schaub N, Hotaling N, Simon C, Bharti K, Bajcsy P. Comparison of Artificial Intelligence based approaches to cell function prediction. INFORMATICS IN MEDICINE UNLOCKED 2020. [DOI: 10.1016/j.imu.2019.100270] [Citation(s) in RCA: 1] [Impact Index Per Article: 0.3] [Reference Citation Analysis] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Indexed: 12/31/2022] Open

Le NQK, Ho QT, Yapp EKY, Ou YY, Yeh HY. DeepETC: A deep convolutional neural network architecture for investigating and classifying electron transport chain's complexes. Neurocomputing 2020. [DOI: 10.1016/j.neucom.2019.09.070] [Citation(s) in RCA: 18] [Impact Index Per Article: 4.5] [Reference Citation Analysis] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Indexed: 02/06/2023]

Guo L, Wang S, Li M, Cao Z. Accurate classification of membrane protein types based on sequence and evolutionary information using deep learning. BMC Bioinformatics 2019;20:700. [PMID: 31874615 PMCID: PMC6929490 DOI: 10.1186/s12859-019-3275-6] [Citation(s) in RCA: 17] [Impact Index Per Article: 3.4] [Reference Citation Analysis] [Abstract] [Key Words] [Track Full Text] [Download PDF] [Figures] [Journal Information] [Subscribe] [Scholar Register] [Indexed: 12/18/2022] Open

Liu B, Zhu Y, Yan K. Fold-LTR-TCP: protein fold recognition based on triadic closure principle. Brief Bioinform 2019;21:2185-2193. [DOI: 10.1093/bib/bbz139] [Citation(s) in RCA: 50] [Impact Index Per Article: 10.0] [Reference Citation Analysis] [Abstract] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Received: 09/03/2019] [Revised: 10/01/2019] [Accepted: 10/09/2019] [Indexed: 11/13/2022] Open

Hong J, Luo Y, Mou M, Fu J, Zhang Y, Xue W, Xie T, Tao L, Lou Y, Zhu F. Convolutional neural network-based annotation of bacterial type IV secretion system effectors with enhanced accuracy and reduced false discovery. Brief Bioinform 2019;21:1825-1836. [PMID: 31860715 DOI: 10.1093/bib/bbz120] [Citation(s) in RCA: 87] [Impact Index Per Article: 17.4] [Reference Citation Analysis] [Abstract] [Key Words] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Received: 06/23/2019] [Revised: 08/12/2019] [Accepted: 08/21/2019] [Indexed: 12/20/2022] Open

Abstract

The type IV bacterial secretion system (SS) is reported to be one of the most ubiquitous SSs in nature and can induce serious conditions by secreting type IV SS effectors (T4SEs) into the host cells. Recent studies mainly focus on annotating new T4SE from the huge amount of sequencing data, and various computational tools are therefore developed to accelerate T4SE annotation. However, these tools are reported as heavily dependent on the selected methods and their annotation performance need to be further enhanced. Herein, a convolution neural network (CNN) technique was used to annotate T4SEs by integrating multiple protein encoding strategies. First, the annotation accuracies of nine encoding strategies integrated with CNN were assessed and compared with that of the popular T4SE annotation tools based on independent benchmark. Second, false discovery rates of various models were systematically evaluated by (1) scanning the genome of Legionella pneumophila subsp. ATCC 33152 and (2) predicting the real-world non-T4SEs validated using published experiments. Based on the above analyses, the encoding strategies, (a) position-specific scoring matrix (PSSM), (b) protein secondary structure & solvent accessibility (PSSSA) and (c) one-hot encoding scheme (Onehot), were identified as well-performing when integrated with CNN. Finally, a novel strategy that collectively considers the three well-performing models (CNN-PSSM, CNN-PSSSA and CNN-Onehot) was proposed, and a new tool (CNN-T4SE, https://idrblab.org/cnnt4se/) was constructed to facilitate T4SE annotation. All in all, this study conducted a comprehensive analysis on the performance of a collection of encoding strategies when integrated with CNN, which could facilitate the suppression of T4SS in infection and limit the spread of antimicrobial resistance.

Collapse

Yu H, Dai Z. SNNRice6mA: A Deep Learning Method for Predicting DNA N6-Methyladenine Sites in Rice Genome. Front Genet 2019;10:1071. [PMID: 31681441 PMCID: PMC6797597 DOI: 10.3389/fgene.2019.01071] [Citation(s) in RCA: 48] [Impact Index Per Article: 9.6] [Reference Citation Analysis] [Abstract] [Key Words] [Track Full Text] [Download PDF] [Figures] [Journal Information] [Subscribe] [Scholar Register] [Received: 07/15/2019] [Accepted: 10/04/2019] [Indexed: 01/08/2023] Open

Meng C, Jin S, Wang L, Guo F, Zou Q. AOPs-SVM: A Sequence-Based Classifier of Antioxidant Proteins Using a Support Vector Machine. Front Bioeng Biotechnol 2019;7:224. [PMID: 31620433 PMCID: PMC6759716 DOI: 10.3389/fbioe.2019.00224] [Citation(s) in RCA: 49] [Impact Index Per Article: 9.8] [Reference Citation Analysis] [Abstract] [Key Words] [Track Full Text] [Download PDF] [Figures] [Journal Information] [Subscribe] [Scholar Register] [Received: 07/21/2019] [Accepted: 09/03/2019] [Indexed: 01/03/2023] Open

Lv Z, Jin S, Ding H, Zou Q. A Random Forest Sub-Golgi Protein Classifier Optimized via Dipeptide and Amino Acid Composition Features. Front Bioeng Biotechnol 2019;7:215. [PMID: 31552241 PMCID: PMC6737778 DOI: 10.3389/fbioe.2019.00215] [Citation(s) in RCA: 80] [Impact Index Per Article: 16.0] [Reference Citation Analysis] [Abstract] [Key Words] [Track Full Text] [Download PDF] [Figures] [Journal Information] [Subscribe] [Scholar Register] [Received: 07/23/2019] [Accepted: 08/22/2019] [Indexed: 02/01/2023] Open

Abstract

To gain insight into the malfunction of the Golgi apparatus and its relationship to various genetic and neurodegenerative diseases, the identification of sub-Golgi proteins, both cis-Golgi and trans-Golgi proteins, is of great significance. In this study, a state-of-art random forests sub-Golgi protein classifier, rfGPT, was developed. The rfGPT used 2-gap dipeptide and split amino acid composition for the feature vectors and was combined with the synthetic minority over-sampling technique (SMOTE) and an analysis of variance (ANOVA) feature selection method. The rfGPT was trained on a sub-Golgi protein sequence data set (137 sequences), with sequence identity less than 25%. For the optimal rfGPT classifier with 93 features, the accuracy (ACC) was 90.5%; the Matthews correlation coefficient (MCC) was 0.811; the sensitivity (Sn) was 92.6%; and the specificity (Sp) was 88.4%. The independent testing scores for the rfGPT were ACC = 90.6%; MCC = 0.696; Sn = 96.1%; and Sp = 69.2%. Although the independent testing accuracy was 4.4% lower than that for the best reported sub-Golgi classifier trained on a data set with 40% sequence identity (304 sequences), the rfGPT is currently the top sub-Golgi protein predictor utilizing feature vectors without any position-specific scoring matrix and its derivative features. Therefore, the rfGPT is a more practical tool, because no sequence alignment is required with tens of millions of protein sequences. To date, the rfGPT is the Golgi classifier with the best independent testing scores, optimized by training on smaller benchmark data sets. Feature importance analysis proves that the non-polar and aliphatic residues composition, the (aromatic residues) + (non-polar, aliphatic residues) dipeptide and aromatic residues composition between NH2-termial and COOH-terminal of protein sequences are the three top biological features for distinguishing the sub-Golgi proteins.

Collapse

Hong J, Luo Y, Zhang Y, Ying J, Xue W, Xie T, Tao L, Zhu F. Protein functional annotation of simultaneously improved stability, accuracy and false discovery rate achieved by a sequence-based deep learning. Brief Bioinform 2019;21:1437-1447. [PMID: 31504150 PMCID: PMC7412958 DOI: 10.1093/bib/bbz081] [Citation(s) in RCA: 90] [Impact Index Per Article: 18.0] [Reference Citation Analysis] [Abstract] [Key Words] [Track Full Text] [Download PDF] [Figures] [Journal Information] [Subscribe] [Scholar Register] [Received: 05/06/2019] [Revised: 05/27/2019] [Accepted: 06/10/2019] [Indexed: 11/12/2022] Open

Deep Splicing Code: Classifying Alternative Splicing Events Using Deep Learning. Genes (Basel) 2019;10:genes10080587. [PMID: 31374967 PMCID: PMC6722613 DOI: 10.3390/genes10080587] [Citation(s) in RCA: 19] [Impact Index Per Article: 3.8] [Reference Citation Analysis] [Abstract] [Key Words] [Track Full Text] [Download PDF] [Figures] [Journal Information] [Subscribe] [Scholar Register] [Received: 05/31/2019] [Revised: 07/20/2019] [Accepted: 07/30/2019] [Indexed: 12/11/2022] Open

Lai HY, Zhang ZY, Su ZD, Su W, Ding H, Chen W, Lin H. iProEP: A Computational Predictor for Predicting Promoter. MOLECULAR THERAPY. NUCLEIC ACIDS 2019;17:337-346. [PMID: 31299595 PMCID: PMC6616480 DOI: 10.1016/j.omtn.2019.05.028] [Citation(s) in RCA: 99] [Impact Index Per Article: 19.8] [Reference Citation Analysis] [Abstract] [Key Words] [Track Full Text] [Download PDF] [Figures] [Subscribe] [Scholar Register] [Received: 02/23/2019] [Revised: 05/18/2019] [Accepted: 05/19/2019] [Indexed: 11/29/2022]

Affiliation(s)

Hong-Yan Lai Key Laboratory for NeuroInformation of Ministry of Education, School of Life Science and Technology and Center for Informational Biology, University of Electronic Science and Technology of China, Chengdu 610054, China
Zhao-Yue Zhang Key Laboratory for NeuroInformation of Ministry of Education, School of Life Science and Technology and Center for Informational Biology, University of Electronic Science and Technology of China, Chengdu 610054, China
Zhen-Dong Su Key Laboratory for NeuroInformation of Ministry of Education, School of Life Science and Technology and Center for Informational Biology, University of Electronic Science and Technology of China, Chengdu 610054, China
Wei Su Key Laboratory for NeuroInformation of Ministry of Education, School of Life Science and Technology and Center for Informational Biology, University of Electronic Science and Technology of China, Chengdu 610054, China
Hui Ding Key Laboratory for NeuroInformation of Ministry of Education, School of Life Science and Technology and Center for Informational Biology, University of Electronic Science and Technology of China, Chengdu 610054, China
Wei Chen Key Laboratory for NeuroInformation of Ministry of Education, School of Life Science and Technology and Center for Informational Biology, University of Electronic Science and Technology of China, Chengdu 610054, China; Innovative Institute of Chinese Medicine and Pharmacy, Chengdu University of Traditional Chinese Medicine, Chengdu 611730, China; Center for Genomics and Computational Biology, School of Life Sciences, North China University of Science and Technology, Tangshan 063000, China.
Hao Lin Key Laboratory for NeuroInformation of Ministry of Education, School of Life Science and Technology and Center for Informational Biology, University of Electronic Science and Technology of China, Chengdu 610054, China.

Collapse

Ho DSW, Schierding W, Wake M, Saffery R, O’Sullivan J. Machine Learning SNP Based Prediction for Precision Medicine. Front Genet 2019;10:267. [PMID: 30972108 PMCID: PMC6445847 DOI: 10.3389/fgene.2019.00267] [Citation(s) in RCA: 102] [Impact Index Per Article: 20.4] [Reference Citation Analysis] [Abstract] [Key Words] [Track Full Text] [Download PDF] [Figures] [Journal Information] [Subscribe] [Scholar Register] [Received: 10/15/2018] [Accepted: 03/11/2019] [Indexed: 12/17/2022] Open

Huang Z, Zhan X, Xiang S, Johnson TS, Helm B, Yu CY, Zhang J, Salama P, Rizkalla M, Han Z, Huang K. SALMON: Survival Analysis Learning With Multi-Omics Neural Networks on Breast Cancer. Front Genet 2019;10:166. [PMID: 30906311 PMCID: PMC6419526 DOI: 10.3389/fgene.2019.00166] [Citation(s) in RCA: 114] [Impact Index Per Article: 22.8] [Reference Citation Analysis] [Abstract] [Key Words] [Track Full Text] [Download PDF] [Figures] [Journal Information] [Subscribe] [Scholar Register] [Received: 12/01/2018] [Accepted: 02/14/2019] [Indexed: 12/22/2022] Open

Affiliation(s)

Zhi Huang School of Electrical and Computer Engineering, Purdue University, West Lafayette, IN, United States.,Department of Medicine, Indiana University School of Medicine, Indianapolis, IN, United States.,Department of Electrical and Computer Engineering, Indiana University-Purdue University Indianapolis, Indianapolis, IN, United States
Xiaohui Zhan Department of Medicine, Indiana University School of Medicine, Indianapolis, IN, United States.,National-Regional Key Technology Engineering Laboratory for Medical Ultrasound, Guangdong Key Laboratory for Biomedical Measurements and Ultrasound Imaging, School of Biomedical Engineering, Health Science Center, Shenzhen University, Shenzhen, China
Shunian Xiang National-Regional Key Technology Engineering Laboratory for Medical Ultrasound, Guangdong Key Laboratory for Biomedical Measurements and Ultrasound Imaging, School of Biomedical Engineering, Health Science Center, Shenzhen University, Shenzhen, China.,Department of Medical and Molecular Genetics, Indiana University School of Medicine, Indianapolis, IN, United States
Travis S Johnson Department of Medicine, Indiana University School of Medicine, Indianapolis, IN, United States.,Department of Biomedical Informatics, The Ohio State University, Columbus, OH, United States
Bryan Helm Department of Medicine, Indiana University School of Medicine, Indianapolis, IN, United States
Christina Y Yu Department of Medicine, Indiana University School of Medicine, Indianapolis, IN, United States.,Department of Biomedical Informatics, The Ohio State University, Columbus, OH, United States
Jie Zhang Department of Medical and Molecular Genetics, Indiana University School of Medicine, Indianapolis, IN, United States
Paul Salama Department of Electrical and Computer Engineering, Indiana University-Purdue University Indianapolis, Indianapolis, IN, United States
Maher Rizkalla Department of Electrical and Computer Engineering, Indiana University-Purdue University Indianapolis, Indianapolis, IN, United States
Zhi Han Department of Medicine, Indiana University School of Medicine, Indianapolis, IN, United States.,Regenstrief Institute, Indianapolis, IN, United States
Kun Huang Department of Medicine, Indiana University School of Medicine, Indianapolis, IN, United States.,Department of Electrical and Computer Engineering, Indiana University-Purdue University Indianapolis, Indianapolis, IN, United States.,Regenstrief Institute, Indianapolis, IN, United States

Collapse

Guo M, Zou Q. Perspectives of Bioinformatics in Big Data Era. Curr Genomics 2019;20:79-80. [PMID: 31555058 PMCID: PMC6728898] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Reference Citation Analysis] [Download PDF] [Journal Information] [Subscribe] [Scholar Register] [Indexed: 12/03/2022] Open

Zou Q, Xing P, Wei L, Liu B. Gene2vec: gene subsequence embedding for prediction of mammalian N⁶-methyladenosine sites from mRNA. RNA (NEW YORK, N.Y.) 2019;25:205-218. [PMID: 30425123 PMCID: PMC6348985 DOI: 10.1261/rna.069112.118] [Citation(s) in RCA: 311] [Impact Index Per Article: 62.2] [Reference Citation Analysis] [Abstract] [Key Words] [MESH Headings] [Grants] [Track Full Text] [Subscribe] [Scholar Register] [Received: 10/03/2018] [Accepted: 11/01/2018] [Indexed: 05/20/2023]

Dao FY, Lv H, Wang F, Ding H. Recent Advances on the Machine Learning Methods in Identifying DNA Replication Origins in Eukaryotic Genomics. Front Genet 2018;9:613. [PMID: 30619452 PMCID: PMC6295579 DOI: 10.3389/fgene.2018.00613] [Citation(s) in RCA: 10] [Impact Index Per Article: 1.7] [Reference Citation Analysis] [Abstract] [Key Words] [Track Full Text] [Download PDF] [Figures] [Journal Information] [Subscribe] [Scholar Register] [Received: 10/21/2018] [Accepted: 11/21/2018] [Indexed: 01/01/2023] Open