Reference Citation Analysis: Find an Article, Find a Category, Find a Journal, Find a Scholar

For: Wang L, Yan X, Liu ML, Song KJ, Sun XF, Pan WW. Prediction of RNA-protein interactions by combining deep convolutional neural network with feature selection ensemble method. J Theor Biol 2018;461:230-238. [PMID: 30321541 DOI: 10.1016/j.jtbi.2018.10.029] [Citation(s) in RCA: 26] [Impact Index Per Article: 4.3] [Reference Citation Analysis] [What about the content of this article? (0)] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Received: 12/01/2017] [Revised: 03/22/2018] [Accepted: 10/11/2018] [Indexed: 01/01/2023]

For:	Wang L, Yan X, Liu ML, Song KJ, Sun XF, Pan WW. Prediction of RNA-protein interactions by combining deep convolutional neural network with feature selection ensemble method. J Theor Biol 2018;461:230-238. [PMID: 30321541 DOI: 10.1016/j.jtbi.2018.10.029] [Citation(s) in RCA: 26] [Impact Index Per Article: 4.3] [Reference Citation Analysis] [What about the content of this article? (0)] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Received: 12/01/2017] [Revised: 03/22/2018] [Accepted: 10/11/2018] [Indexed: 01/01/2023]

Number

Cited by Other Article(s)

Espinosa R, Jimenez F, Palma J. Surrogate-Assisted and Filter-Based Multiobjective Evolutionary Feature Selection for Deep Learning. IEEE TRANSACTIONS ON NEURAL NETWORKS AND LEARNING SYSTEMS 2024;35:9591-9605. [PMID: 37018667 DOI: 10.1109/tnnls.2023.3234629] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Reference Citation Analysis] [Abstract] [Track Full Text] [Subscribe] [Scholar Register] [Indexed: 06/19/2023]

Wu J, Liu B, Zhang J, Wang Z, Li J. DL-PPI: a method on prediction of sequenced protein-protein interaction based on deep learning. BMC Bioinformatics 2023;24:473. [PMID: 38097937 PMCID: PMC10722729 DOI: 10.1186/s12859-023-05594-5] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Reference Citation Analysis] [Abstract] [Key Words] [MESH Headings] [Grants] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Received: 08/08/2023] [Accepted: 12/01/2023] [Indexed: 12/17/2023] Open

Abstract

PURPOSE

Sequenced Protein-Protein Interaction (PPI) prediction represents a pivotal area of study in biology, playing a crucial role in elucidating the mechanistic underpinnings of diseases and facilitating the design of novel therapeutic interventions. Conventional methods for extracting features through experimental processes have proven to be both costly and exceedingly complex. In light of these challenges, the scientific community has turned to computational approaches, particularly those grounded in deep learning methodologies. Despite the progress achieved by current deep learning technologies, their effectiveness diminishes when applied to larger, unfamiliar datasets.

RESULTS

In this study, the paper introduces a novel deep learning framework, termed DL-PPI, for predicting PPIs based on sequence data. The proposed framework comprises two key components aimed at improving the accuracy of feature extraction from individual protein sequences and capturing relationships between proteins in unfamiliar datasets. 1. Protein Node Feature Extraction Module: To enhance the accuracy of feature extraction from individual protein sequences and facilitate the understanding of relationships between proteins in unknown datasets, the paper devised a novel protein node feature extraction module utilizing the Inception method. This module efficiently captures relevant patterns and representations within protein sequences, enabling more informative feature extraction. 2. Feature-Relational Reasoning Network (FRN): In the Global Feature Extraction module of our model, the paper developed a novel FRN that leveraged Graph Neural Networks to determine interactions between pairs of input proteins. The FRN effectively captures the underlying relational information between proteins, contributing to improved PPI predictions. DL-PPI framework demonstrates state-of-the-art performance in the realm of sequence-based PPI prediction.

Collapse

Zhang F, Zhang Y, Zhu X, Chen X, Lu F, Zhang X. DeepSG2PPI: A Protein-Protein Interaction Prediction Method Based on Deep Learning. IEEE/ACM TRANSACTIONS ON COMPUTATIONAL BIOLOGY AND BIOINFORMATICS 2023;20:2907-2919. [PMID: 37079417 DOI: 10.1109/tcbb.2023.3268661] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Reference Citation Analysis] [Abstract] [Track Full Text] [Subscribe] [Scholar Register] [Indexed: 05/03/2023]

Han Y, Zhang SW. Docsubty: FLAncRPI-LGAT: Prediction of ncRNA-Protein Interactions with Line Graph Attention Network Framework. Comput Struct Biotechnol J 2023;21:2286-2295. [PMID: 37035546 PMCID: PMC10073990 DOI: 10.1016/j.csbj.2023.03.027] [Citation(s) in RCA: 1] [Impact Index Per Article: 1.0] [Reference Citation Analysis] [Abstract] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Received: 09/21/2022] [Revised: 03/11/2023] [Accepted: 03/16/2023] [Indexed: 03/19/2023] Open

Abstract

Identification of ncRNA-protein interactions (ncRPIs) through wet experiments is still time-consuming and highly-costly. Although several computational approaches have been developed to predict ncRPIs using the structure and sequence information of ncRNAs and proteins, the prediction accuracy needs to be improved, and the results lack interpretability. In this work, we proposed a novel computational method (called ncRPI-LGAT) to predict the ncRNA-Protein Interactions by transforming the link prediction (i.e., subgraph classification) task into a node classification task in the line network, and introducing a Line Graph ATtention network framework. ncRPI-LGAT first extracts the ncRNA/protein attributes using node2vec, and then generates the local enclosing subgraph of a target ncRNA-protein pair with SEAL. Because using the pooling operations in local enclosing subgraphs to learn a fixed-size feature vector for representing ncRNAs/proteins will cause the information loss, ncRPI-LGAT converts the local enclosing subgraphs into their corresponding line graphs, in which the node corresponds to the edge (i.e., ncRNA-protein pair) of the local enclosing subgraphs. Then, the attention mechanism-based graph neural network GATv2 is used on these line graphs to efficiently learn the embedding features of the target nodes (i.e., ncRNA-protein pairs) by focusing on learning the significance of one ncRNA-protein pair to another ncRNA-protein pair. These embedding features of one ncRNA-protein pair obtained from multi-head attention are concatenated in series and then fed them into a fully connected network to predict ncRPIs. Compared with other state-of-the-art methods in the 5CV test, ncRPI-LGAT shows superior performance on three benchmark datasets, demonstrating the effectiveness of our ncRPI-LGAT method in predicting ncRNA-protein interactions.

Collapse

Arora V, Sanguinetti G. De novo prediction of RNA-protein interactions with graph neural networks. RNA (NEW YORK, N.Y.) 2022;28:1469-1480. [PMID: 36008134 PMCID: PMC9745830 DOI: 10.1261/rna.079365.122] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Reference Citation Analysis] [Abstract] [Key Words] [MESH Headings] [Track Full Text] [Subscribe] [Scholar Register] [Received: 07/15/2022] [Accepted: 08/17/2022] [Indexed: 06/15/2023]

Pepe G, Appierdo R, Carrino C, Ballesio F, Helmer-Citterich M, Gherardini PF. Artificial intelligence methods enhance the discovery of RNA interactions. Front Mol Biosci 2022;9:1000205. [PMID: 36275611 PMCID: PMC9585310 DOI: 10.3389/fmolb.2022.1000205] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Reference Citation Analysis] [Abstract] [Grants] [Track Full Text] [Download PDF] [Figures] [Journal Information] [Subscribe] [Scholar Register] [Received: 07/21/2022] [Accepted: 09/20/2022] [Indexed: 11/13/2022] Open

Wang Y, Wang LL, Wong L, Li Y, Wang L, You ZH. SIPGCN: A Novel Deep Learning Model for Predicting Self-Interacting Proteins from Sequence Information Using Graph Convolutional Networks. Biomedicines 2022;10:biomedicines10071543. [PMID: 35884848 PMCID: PMC9313220 DOI: 10.3390/biomedicines10071543] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Reference Citation Analysis] [Abstract] [Track Full Text] [Download PDF] [Figures] [Journal Information] [Subscribe] [Scholar Register] [Received: 05/26/2022] [Revised: 06/24/2022] [Accepted: 06/24/2022] [Indexed: 11/16/2022] Open

Huang X, Shi Y, Yan J, Qu W, Li X, Tan J. LPI-CSFFR: Combining serial fusion with feature reuse for predicting LncRNA-protein interactions. Comput Biol Chem 2022;99:107718. [PMID: 35785626 DOI: 10.1016/j.compbiolchem.2022.107718] [Citation(s) in RCA: 2] [Impact Index Per Article: 1.0] [Reference Citation Analysis] [Abstract] [Key Words] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Received: 02/28/2022] [Revised: 05/24/2022] [Accepted: 06/22/2022] [Indexed: 11/03/2022]

Nallasamy V, Seshiah M. Protein Structure Prediction Using Quantile Dragonfly and Structural Class-Based Deep Learning. INT J PATTERN RECOGN 2022. [DOI: 10.1142/s021800142250015x] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Reference Citation Analysis] [Abstract] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Indexed: 11/18/2022]

Abstract Predicting three-dimensional structure of a protein in the field of computational molecular biology has received greater attention. Most of the recent research works aimed at exploring search space, however with the increasing nature and size of data, protein structure identification and prediction are still in the preliminary stage. This work is aimed at exploring search space to tackle protein structure prediction with minimum execution time and maximum accuracy by means of quantile regressive dragonfly and structural class homolog-based deep learning (QRD-SCHDL). The proposed QRD-SCHDL method consists of two distinct steps. They are protein structure identification and prediction. In the first step, protein structure identification is performed by means of QRD optimization model to identify protein structure with minimum error. Here the protein structure identification is first performed as the raw database contains sequence information and does not contain structural information. An optimization model is designed to obtain the structural information from the database. However, protein structure gives much more insight than its sequence. Therefore, to perform computational prediction of protein structure from its sequence, actual protein structure prediction is made. The second step involves the actual protein structure prediction via structural class and homolog-based deep learning. For each protein structure prediction, a scoring matrix is obtained by utilizing structural class maximum correlation coefficient. Finally, the proposed method is tested on a set of different unique numbers of protein data and compared to the state-of-the-art methods. The obtained results showed the potentiality of the proposed method in terms of metrics, error rate, protein structure prediction time, protein structure prediction accuracy, precision, specificity, recall, ROC, Kappa coefficient and [Formula: see text]-measure, respectively. It also shows that the proposed QRD-SCHDL method attains comparable results and outperformed in certain cases, thereby signifying the efficiency of the proposed work. Collapse

Yu B, Wang X, Zhang Y, Gao H, Wang Y, Liu Y, Gao X. RPI-MDLStack: Predicting RNA-protein interactions through deep learning with stacking strategy and LASSO. Appl Soft Comput 2022. [DOI: 10.1016/j.asoc.2022.108676] [Citation(s) in RCA: 3] [Impact Index Per Article: 1.5] [Reference Citation Analysis] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Indexed: 01/05/2023]

Wang L, You ZH, Li JQ, Huang YA. IMS-CDA: Prediction of CircRNA-Disease Associations From the Integration of Multisource Similarity Information With Deep Stacked Autoencoder Model. IEEE TRANSACTIONS ON CYBERNETICS 2021;51:5522-5531. [PMID: 33027025 DOI: 10.1109/tcyb.2020.3022852] [Citation(s) in RCA: 15] [Impact Index Per Article: 5.0] [Reference Citation Analysis] [Abstract] [MESH Headings] [Track Full Text] [Subscribe] [Scholar Register] [Indexed: 06/11/2023]

Zheng K, You ZH, Wang L, Li YR, Zhou JR, Zeng HT. MISSIM: An Incremental Learning-Based Model With Applications to the Prediction of miRNA-Disease Association. IEEE/ACM TRANSACTIONS ON COMPUTATIONAL BIOLOGY AND BIOINFORMATICS 2021;18:1733-1742. [PMID: 32749964 DOI: 10.1109/tcbb.2020.3013837] [Citation(s) in RCA: 5] [Impact Index Per Article: 1.7] [Reference Citation Analysis] [Abstract] [MESH Headings] [Track Full Text] [Subscribe] [Scholar Register] [Indexed: 06/11/2023]

Dong XH, Dai D, Yang ZD, Yu XO, Li H, Kang H. S100 calcium binding protein A6 and associated long noncoding ribonucleic acids as biomarkers in the diagnosis and staging of primary biliary cholangitis. World J Gastroenterol 2021;27:1973-1992. [PMID: 34007134 PMCID: PMC8108032 DOI: 10.3748/wjg.v27.i17.1973] [Citation(s) in RCA: 7] [Impact Index Per Article: 2.3] [Reference Citation Analysis] [Abstract] [Key Words] [MESH Headings] [Track Full Text] [Download PDF] [Journal Information] [Submit a Manuscript] [Subscribe] [Scholar Register] [Received: 12/07/2020] [Revised: 01/23/2021] [Accepted: 03/10/2021] [Indexed: 02/06/2023] Open

Abstract

BACKGROUND

Primary biliary cholangitis (PBC) is a chronic and slowly progressing cholestatic disease, which causes damage to the small intrahepatic bile duct by immuno-regulation, and may lead to cholestasis, liver fibrosis, cirrhosis and, eventually, liver failure.

AIM

To explore the potential diagnosis and staging value of plasma S100 calcium binding protein A6 (S100A6) messenger ribonucleic acid (mRNA), LINC00312, LINC00472, and LINC01257 in primary biliary cholangitis.

METHODS

A total of 145 PBC patients and 110 healthy controls (HCs) were enrolled. Among them, 80 PBC patients and 60 HCs were used as the training set, and 65 PBC patients and 50 HCs were used as the validation set. The relative expression levels of plasma S100A6 mRNA, long noncoding ribonucleic acids LINC00312, LINC00472 and LINC01257 were analyzed using quantitative reverse transcription-polymerase chain reaction. The bile duct ligation (BDL) mouse model was used to simulate PBC. Then double immunofluorescence was conducted to verify the overexpression of S100A6 protein in intrahepatic bile duct cells of BDL mice. Human intrahepatic biliary epithelial cells were treated with glycochenodeoxycholate to simulate the cholestatic environment of intrahepatic biliary epithelial cells in PBC.

RESULTS

The expression of S100A6 protein in intrahepatic bile duct cells was up-regulated in the BDL mouse model compared with sham mice. The relative expression levels of plasma S100A6 mRNA, log10 LINC00472 and LINC01257 were up-regulated while LINC00312 was down-regulated in plasma of PBC patients compared with HCs (3.01 ± 1.04 vs 2.09 ± 0.87, P < 0.0001; 2.46 ± 1.03 vs 1.77 ± 0.84, P < 0.0001; 3.49 ± 1.64 vs 2.37 ± 0.96, P < 0.0001; 1.70 ± 0.33 vs 2.07 ± 0.53, P < 0.0001, respectively). The relative expression levels of S100A6 mRNA, LINC00472 and LINC01257 were up-regulated and LINC00312 was down-regulated in human intrahepatic biliary epithelial cells treated with glycochenodeoxycholate compared with control (2.97 ± 0.43 vs 1.09 ± 0.08, P = 0.0018; 2.70 ± 0.26 vs 1.10 ± 0.10, P = 0.0006; 2.23 ± 0.21 vs 1.10 ± 0.10, P = 0.0011; 1.20 ± 0.04 vs 3.03 ± 0.15, P < 0.0001, respectively). The mean expression of S100A6 in the advanced stage (III and IV) of PBC was up-regulated compared to that in HCs and the early stage (II) (3.38 ± 0.71 vs 2.09 ± 0.87, P < 0.0001; 3.38 ± 0.71 vs 2.57 ± 1.21, P = 0.0003, respectively); and in the early stage (II), it was higher than that in HCs (2.57 ± 1.21 vs 2.09 ± 0.87, P = 0.03). The mean expression of LINC00312 in the advanced stage was lower than that in the early stage and HCs (1.39 ± 0.29 vs 1.56 ± 0.33, P = 0.01; 1.39 ± 0.29 vs 2.07 ± 0.53, P < 0.0001, respectively); in addition, the mean expression of LINC00312 in the early stage was lower than that in HCs (1.56 ± 0.33 vs 2.07 ± 0.53, P < 0.0001). The mean expression of log10 LINC00472 in the advanced stage was higher than those in the early stage and HCs (2.99 ± 0.87 vs 1.81 ± 0.83, P < 0.0001; 2.99 ± 0.87 vs 1.77 ± 0.84, P < 0.0001, respectively). The mean expression of LINC01257 in both the early stage and advanced stage were up-regulated compared with HCs (3.88 ± 1.55 vs 2.37 ± 0.96, P < 0.0001; 3.57 ± 1.79 vs 2.37 ± 0.96, P < 0.0001, respectively). The areas under the curves (AUC) for S100A6, LINC00312, log10 LINC00472 and LINC01257 in PBC diagnosis were 0.759, 0.7292, 0.6942 and 0.7158, respectively. Furthermore, the AUC for these four genes in PBC staging were 0.666, 0.661, 0.839 and 0.5549, respectively. The expression levels of S100A6 mRNA, log10 LINC00472, and LINC01257 in plasma of PBC patients were decreased (2.35 ± 1.02 vs 3.06 ± 1.04, P = 0.0018; 1.99 ± 0.83 vs 2.33 ± 0.96, P = 0.036; 2.84 ± 0.92 vs 3.69 ± 1.54, P = 0.0006), and the expression level of LINC00312 was increased (1.95 ± 0.35 vs 1.73 ± 0.32, P = 0.0007) after treatment compared with before treatment using the paired t-test. Relative expression of S100A6 mRNA was positively correlated with log10 LINC00472 (r = 0.683, P < 0.0001); serum level of collagen type IV was positively correlated with the relative expression of log10 LINC00472 (r = 0.482, P < 0.0001); relative expression of S100A6 mRNA was positively correlated with the serum level of collagen type IV (r = 0.732, P < 0.0001). The AUC for the four biomarkers obtained in the validation set were close to the training set.

CONCLUSION

These four genes may potentially act as novel biomarkers for the diagnosis of PBC. Moreover, LINC00472 acts as a potential biomarker for staging in PBC.

Collapse

Wang J, Zhao Y, Gong W, Liu Y, Wang M, Huang X, Tan J. EDLMFC: an ensemble deep learning framework with multi-scale features combination for ncRNA-protein interaction prediction. BMC Bioinformatics 2021;22:133. [PMID: 33740884 PMCID: PMC7980572 DOI: 10.1186/s12859-021-04069-9] [Citation(s) in RCA: 14] [Impact Index Per Article: 4.7] [Reference Citation Analysis] [Abstract] [Key Words] [Download PDF] [Figures] [Journal Information] [Subscribe] [Scholar Register] [Received: 01/23/2021] [Accepted: 03/05/2021] [Indexed: 11/29/2022] Open

Abstract

Background

Non-coding RNA (ncRNA) and protein interactions play essential roles in various physiological and pathological processes. The experimental methods used for predicting ncRNA–protein interactions are time-consuming and labor-intensive. Therefore, there is an increasing demand for computational methods to accurately and efficiently predict ncRNA–protein interactions.

Results

In this work, we presented an ensemble deep learning-based method, EDLMFC, to predict ncRNA–protein interactions using the combination of multi-scale features, including primary sequence features, secondary structure sequence features, and tertiary structure features. Conjoint k-mer was used to extract protein/ncRNA sequence features, integrating tertiary structure features, then fed into an ensemble deep learning model, which combined convolutional neural network (CNN) to learn dominating biological information with bi-directional long short-term memory network (BLSTM) to capture long-range dependencies among the features identified by the CNN. Compared with other state-of-the-art methods under five-fold cross-validation, EDLMFC shows the best performance with accuracy of 93.8%, 89.7%, and 86.1% on RPI1807, NPInter v2.0, and RPI488 datasets, respectively. The results of the independent test demonstrated that EDLMFC can effectively predict potential ncRNA–protein interactions from different organisms. Furtherly, EDLMFC is also shown to predict hub ncRNAs and proteins presented in ncRNA–protein networks of Mus musculus successfully.

Conclusions

In general, our proposed method EDLMFC improved the accuracy of ncRNA–protein interaction predictions and anticipated providing some helpful guidance on ncRNA functions research. The source code of EDLMFC and the datasets used in this work are available at https://github.com/JingjingWang-87/EDLMFC.

Supplementary Information

The online version contains supplementary material available at 10.1186/s12859-021-04069-9.

Collapse

Affiliation(s)

Jingjing Wang Department of Biomedical Engineering, Faculty of Environment and Life, Beijing International Science and Technology Cooperation Base for Intelligent Physiological Measurement and Clinical Transformation, Beijing University of Technology, Beijing, 100124, China
Yanpeng Zhao Department of Biomedical Engineering, Faculty of Environment and Life, Beijing International Science and Technology Cooperation Base for Intelligent Physiological Measurement and Clinical Transformation, Beijing University of Technology, Beijing, 100124, China
Weikang Gong Department of Biomedical Engineering, Faculty of Environment and Life, Beijing International Science and Technology Cooperation Base for Intelligent Physiological Measurement and Clinical Transformation, Beijing University of Technology, Beijing, 100124, China
Yang Liu Department of Biomedical Engineering, Faculty of Environment and Life, Beijing International Science and Technology Cooperation Base for Intelligent Physiological Measurement and Clinical Transformation, Beijing University of Technology, Beijing, 100124, China
Mei Wang Department of Biomedical Engineering, Faculty of Environment and Life, Beijing International Science and Technology Cooperation Base for Intelligent Physiological Measurement and Clinical Transformation, Beijing University of Technology, Beijing, 100124, China
Xiaoqian Huang Department of Biomedical Engineering, Faculty of Environment and Life, Beijing International Science and Technology Cooperation Base for Intelligent Physiological Measurement and Clinical Transformation, Beijing University of Technology, Beijing, 100124, China
Jianjun Tan Department of Biomedical Engineering, Faculty of Environment and Life, Beijing International Science and Technology Cooperation Base for Intelligent Physiological Measurement and Clinical Transformation, Beijing University of Technology, Beijing, 100124, China.

Collapse

Shaw D, Chen H, Xie M, Jiang T. DeepLPI: a multimodal deep learning method for predicting the interactions between lncRNAs and protein isoforms. BMC Bioinformatics 2021;22:24. [PMID: 33461501 PMCID: PMC7814738 DOI: 10.1186/s12859-020-03914-7] [Citation(s) in RCA: 6] [Impact Index Per Article: 2.0] [Reference Citation Analysis] [Abstract] [MESH Headings] [Grants] [Track Full Text] [Download PDF] [Figures] [Journal Information] [Subscribe] [Scholar Register] [Received: 07/15/2020] [Accepted: 11/30/2020] [Indexed: 12/14/2022] Open

Abstract

BACKGROUND

Long non-coding RNAs (lncRNAs) regulate diverse biological processes via interactions with proteins. Since the experimental methods to identify these interactions are expensive and time-consuming, many computational methods have been proposed. Although these computational methods have achieved promising prediction performance, they neglect the fact that a gene may encode multiple protein isoforms and different isoforms of the same gene may interact differently with the same lncRNA.

RESULTS

In this study, we propose a novel method, DeepLPI, for predicting the interactions between lncRNAs and protein isoforms. Our method uses sequence and structure data to extract intrinsic features and expression data to extract topological features. To combine these different data, we adopt a hybrid framework by integrating a multimodal deep learning neural network and a conditional random field. To overcome the lack of known interactions between lncRNAs and protein isoforms, we apply a multiple instance learning (MIL) approach. In our experiment concerning the human lncRNA-protein interactions in the NPInter v3.0 database, DeepLPI improved the prediction performance by 4.7% in term of AUC and 5.9% in term of AUPRC over the state-of-the-art methods. Our further correlation analyses between interactive lncRNAs and protein isoforms also illustrated that their co-expression information helped predict the interactions. Finally, we give some examples where DeepLPI was able to outperform the other methods in predicting mouse lncRNA-protein interactions and novel human lncRNA-protein interactions.

CONCLUSION

Our results demonstrated that the use of isoforms and MIL contributed significantly to the improvement of performance in predicting lncRNA and protein interactions. We believe that such an approach would find more applications in predicting other functional roles of RNAs and proteins.

Collapse

Jia LN, Yan X, You ZH, Zhou X, Li LP, Wang L, Song KJ. NLPEI: A Novel Self-Interacting Protein Prediction Model Based on Natural Language Processing and Evolutionary Information. Evol Bioinform Online 2020;16:1176934320984171. [PMID: 33488064 PMCID: PMC7768313 DOI: 10.1177/1176934320984171] [Citation(s) in RCA: 3] [Impact Index Per Article: 0.8] [Reference Citation Analysis] [Abstract] [Key Words] [Track Full Text] [Download PDF] [Figures] [Journal Information] [Subscribe] [Scholar Register] [Received: 07/09/2020] [Accepted: 12/01/2020] [Indexed: 12/13/2022] Open

Ensembles of feature selectors for dealing with class-imbalanced datasets: A proposal and comparative study. Inf Sci (N Y) 2020. [DOI: 10.1016/j.ins.2020.05.077] [Citation(s) in RCA: 13] [Impact Index Per Article: 3.3] [Reference Citation Analysis] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Indexed: 11/24/2022]

RNA-Seq-Based Breast Cancer Subtypes Classification Using Machine Learning Approaches. COMPUTATIONAL INTELLIGENCE AND NEUROSCIENCE 2020;2020:4737969. [PMID: 33178256 PMCID: PMC7644310 DOI: 10.1155/2020/4737969] [Citation(s) in RCA: 5] [Impact Index Per Article: 1.3] [Reference Citation Analysis] [Abstract] [Track Full Text] [Download PDF] [Figures] [Subscribe] [Scholar Register] [Received: 10/21/2019] [Revised: 05/31/2020] [Accepted: 10/09/2020] [Indexed: 12/20/2022]

Abstract

Background

Breast invasive carcinoma (BRCA) is not a single disease as each subtype has a distinct morphology structure. Although several computational methods have been proposed to conduct breast cancer subtype identification, the specific interaction mechanisms of genes involved in the subtypes are still incomplete. To identify and explore the corresponding interaction mechanisms of genes for each subtype of breast cancer can impose an important impact on the personalized treatment for different patients.

Methods

We integrate the biological importance of genes from the gene regulatory networks to the differential expression analysis and then obtain the weighted differentially expressed genes (weighted DEGs). A gene with a high weight means it regulates more target genes and thus holds more biological importance. Besides, we constructed gene coexpression networks for control and experiment groups, and the significantly differentially interacting structures encouraged us to design the corresponding Gene Ontology (GO) enrichment based on gene coexpression networks (GOEGCN). The GOEGCN considers the two-side distinction analysis between gene coexpression networks for control and experiment groups. The method allows us to study how the modulated coexpressed gene couples impact biological functions at a GO level.

Results

We modeled the binary classification with weighted DEGs for each subtype. The binary classifier could make a good prediction for an unseen sample, and the experimental results validated the effectiveness of our proposed approaches. The novel enriched GO terms based on GOEGCN for control and experiment groups of each subtype explain the specific biological function changes according to the two-side distinction of coexpression network structures to some extent.

Conclusion

The weighted DEGs contain biological importance derived from the gene regulatory network. Based on the weighted DEGs, five binary classifiers were learned and showed good performance concerning the “Sensitivity,” “Specificity,” “Accuracy,” “F1,” and “AUC” metrics. The GOEGCN with weighted DEGs for control and experiment groups presented a novel GO enrichment analysis results and the novel enriched GO terms would further unveil the changes of specific biological functions among all the BRCA subtypes to some extent. The R code in this research is available at https://github.com/yxchspring/GOEGCN_BRCA_Subtypes.

Collapse

Russo DP, Yan X, Shende S, Huang H, Yan B, Zhu H. Virtual Molecular Projections and Convolutional Neural Networks for the End-to-End Modeling of Nanoparticle Activities and Properties. Anal Chem 2020;92:13971-13979. [PMID: 32970421 DOI: 10.1021/acs.analchem.0c02878] [Citation(s) in RCA: 11] [Impact Index Per Article: 2.8] [Reference Citation Analysis] [Abstract] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Indexed: 12/17/2022]

Zhang SW, Zhang XX, Fan XN, Li WN. LPI-CNNCP: Prediction of lncRNA-protein interactions by using convolutional neural network with the copy-padding trick. Anal Biochem 2020;601:113767. [PMID: 32454029 DOI: 10.1016/j.ab.2020.113767] [Citation(s) in RCA: 20] [Impact Index Per Article: 5.0] [Reference Citation Analysis] [Abstract] [Key Words] [Journal Information] [Subscribe] [Scholar Register] [Received: 02/23/2020] [Revised: 04/27/2020] [Accepted: 05/01/2020] [Indexed: 11/17/2022]

Zheng K, You ZH, Li JQ, Wang L, Guo ZH, Huang YA. iCDA-CGR: Identification of circRNA-disease associations based on Chaos Game Representation. PLoS Comput Biol 2020;16:e1007872. [PMID: 32421715 PMCID: PMC7259804 DOI: 10.1371/journal.pcbi.1007872] [Citation(s) in RCA: 40] [Impact Index Per Article: 10.0] [Reference Citation Analysis] [Abstract] [MESH Headings] [Grants] [Track Full Text] [Download PDF] [Figures] [Journal Information] [Subscribe] [Scholar Register] [Received: 09/11/2019] [Revised: 05/29/2020] [Accepted: 04/13/2020] [Indexed: 12/14/2022] Open

Abstract

Found in recent research, tumor cell invasion, proliferation, or other biological processes are controlled by circular RNA. Understanding the association between circRNAs and diseases is an important way to explore the pathogenesis of complex diseases and promote disease-targeted therapy. Most methods, such as k-mer and PSSM, based on the analysis of high-throughput expression data have the tendency to think functionally similar nucleic acid lack direct linear homology regardless of positional information and only quantify nonlinear sequence relationships. However, in many complex diseases, the sequence nonlinear relationship between the pathogenic nucleic acid and ordinary nucleic acid is not much different. Therefore, the analysis of positional information expression can help to predict the complex associations between circRNA and disease. To fill up this gap, we propose a new method, named iCDA-CGR, to predict the circRNA-disease associations. In particular, we introduce circRNA sequence information and quantifies the sequence nonlinear relationship of circRNA by Chaos Game Representation (CGR) technology based on the biological sequence position information for the first time in the circRNA-disease prediction model. In the cross-validation experiment, our method achieved 0.8533 AUC, which was significantly higher than other existing methods. In the validation of independent data sets including circ2Disease, circRNADisease and CRDD, the prediction accuracy of iCDA-CGR reached 95.18%, 90.64% and 95.89%. Moreover, in the case studies, 19 of the top 30 circRNA-disease associations predicted by iCDA-CGR on circRDisease dataset were confirmed by newly published literature. These results demonstrated that iCDA-CGR has outstanding robustness and stability, and can provide highly credible candidates for biological experiments.

Understanding the association between circRNAs and diseases is an important step to explore the pathogenesis of complex diseases and promote disease-targeted therapy. Computational methods contribute to discovering the potential disease-related circRNAs. Based on the analysis of the location information expression of biological sequences, the model of iCDA-CGR is proposed to predict the circRNA-disease associations by integrates multi-source information, including circRNA sequence information, gene-circRNA associations information, circRNA-disease associations information and the disease semantic information. In particular, the location information of circRNA sequences was first introduced into the circRNA-disease associations prediction model. The promising results on cross-validation and independent data sets demonstrated the effectiveness of the proposed model. We further implemented case studies, and 19 of the top 30 predicted scores of the proposed model were confirmed by recent experimental reports. The results show that iCDA-CGR model can effectively predict the potential circRNA-disease associations and provide highly reliable candidates for biological experiments, thus helping to further understand the complex disease mechanism.

Collapse

Torkamanian-Afshar M, Lanjanian H, Nematzadeh S, Tabarzad M, Najafi A, Kiani F, Masoudi-Nejad A. RPINBASE: An online toolbox to extract features for predicting RNA-protein interactions. Genomics 2020;112:2623-2632. [DOI: 10.1016/j.ygeno.2020.02.013] [Citation(s) in RCA: 8] [Impact Index Per Article: 2.0] [Reference Citation Analysis] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Received: 11/05/2019] [Revised: 01/04/2020] [Accepted: 02/13/2020] [Indexed: 12/12/2022]

Emami N, Pakchin PS, Ferdousi R. Computational predictive approaches for interaction and structure of aptamers. J Theor Biol 2020;497:110268. [PMID: 32311376 DOI: 10.1016/j.jtbi.2020.110268] [Citation(s) in RCA: 13] [Impact Index Per Article: 3.3] [Reference Citation Analysis] [Abstract] [Key Words] [Journal Information] [Subscribe] [Scholar Register] [Received: 02/10/2020] [Revised: 03/27/2020] [Accepted: 04/02/2020] [Indexed: 02/07/2023]

Shi Q, Chen W, Huang S, Wang Y, Xue Z. Deep learning for mining protein data. Brief Bioinform 2019;22:194-218. [PMID: 31867611 DOI: 10.1093/bib/bbz156] [Citation(s) in RCA: 31] [Impact Index Per Article: 6.2] [Reference Citation Analysis] [Abstract] [Key Words] [Journal Information] [Subscribe] [Scholar Register] [Received: 08/16/2019] [Revised: 10/21/2019] [Accepted: 11/07/2019] [Indexed: 01/16/2023] Open

LPI-BLS: Predicting lncRNA–protein interactions with a broad learning system-based stacked ensemble classifier. Neurocomputing 2019. [DOI: 10.1016/j.neucom.2019.08.084] [Citation(s) in RCA: 32] [Impact Index Per Article: 6.4] [Reference Citation Analysis] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Indexed: 01/20/2023]

Shi C, Chen J, Kang X, Zhao G, Lao X, Zheng H. Deep Learning in the Study of Protein-Related Interactions. Protein Pept Lett 2019;27:359-369. [PMID: 31538879 DOI: 10.2174/0929866526666190723114142] [Citation(s) in RCA: 2] [Impact Index Per Article: 0.4] [Reference Citation Analysis] [Abstract] [Key Words] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Received: 01/05/2019] [Revised: 03/13/2019] [Accepted: 04/05/2019] [Indexed: 11/22/2022]

Pan X, Yang Y, Xia C, Mirza AH, Shen H. Recent methodology progress of deep learning for RNA–protein interaction prediction. WILEY INTERDISCIPLINARY REVIEWS-RNA 2019;10:e1544. [DOI: 10.1002/wrna.1544] [Citation(s) in RCA: 35] [Impact Index Per Article: 7.0] [Reference Citation Analysis] [Track Full Text] [Subscribe] [Scholar Register] [Received: 02/28/2019] [Revised: 04/07/2019] [Accepted: 04/11/2019] [Indexed: 12/17/2022]