Reference Citation Analysis: Find an Article, Find a Category, Find a Journal, Find a Scholar

For: You ZH, Li X, Chan KCC. An improved sequence-based prediction protocol for protein-protein interactions using amino acids substitution matrix and rotation forest ensemble classifiers. Neurocomputing 2017. [DOI: 10.1016/j.neucom.2016.10.042] [Citation(s) in RCA: 9] [Impact Index Per Article: 1.3] [Indexed: 10/20/2022]

Cited by Other Article(s)

Bao W, Liu Y, Chen B. Oral_voting_transfer: classification of oral microorganisms' function proteins with voting transfer model. Front Microbiol 2024;14:1277121. [PMID: 38384719 PMCID: PMC10879614 DOI: 10.3389/fmicb.2023.1277121] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 08/14/2023] [Accepted: 12/19/2023] [Indexed: 02/23/2024] Open

Zandi F, Mansouri P, Goodarzi M. Global protein-protein interaction networks in yeast Saccharomyces cerevisiae and Helicobacter pylori. Talanta 2023;265:124836. [PMID: 37393709 DOI: 10.1016/j.talanta.2023.124836] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 04/29/2023] [Revised: 06/04/2023] [Accepted: 06/17/2023] [Indexed: 07/04/2023]

Wang XW, Madeddu L, Spirohn K, Martini L, Fazzone A, Becchetti L, Wytock TP, Kovács IA, Balogh OM, Benczik B, Pétervári M, Ágg B, Ferdinandy P, Vulliard L, Menche J, Colonnese S, Petti M, Scarano G, Cuomo F, Hao T, Laval F, Willems L, Twizere JC, Vidal M, Calderwood MA, Petrillo E, Barabási AL, Silverman EK, Loscalzo J, Velardi P, Liu YY. Assessment of community efforts to advance network-based prediction of protein-protein interactions. Nat Commun 2023;14:1582. [PMID: 36949045 PMCID: PMC10033937 DOI: 10.1038/s41467-023-37079-7] [Citation(s) in RCA: 6] [Impact Index Per Article: 6.0] [Received: 02/24/2022] [Accepted: 03/02/2023] [Indexed: 03/24/2023] Open

Affiliation(s)

Xu-Wen Wang: Channing Division of Network Medicine, Department of Medicine, Brigham and Women's Hospital and Harvard Medical School, Boston, MA, 02115, USA
Lorenzo Madeddu: Translational and Precision Medicine Department, Sapienza University of Rome, Rome, Italy
Kerstin Spirohn: Center for Cancer Systems Biology (CCSB), Dana-Farber Cancer Institute, Boston, MA, 02215, USA; Department of Genetics, Blavatnik Institute, Harvard Medical School, Boston, MA, 02115, USA; Department of Cancer Biology, Dana-Farber Cancer Institute, Boston, MA, 02215, USA
Leonardo Martini: Department of Computer, Control, and Management Engineering "Antonio Ruberti", Sapienza University of Rome, Rome, Italy
Adriano Fazzone: CENTAI Institute, Turin, Italy
Luca Becchetti: Department of Computer, Control, and Management Engineering "Antonio Ruberti", Sapienza University of Rome, Rome, Italy
Thomas P Wytock: Department of Physics and Astronomy, Northwestern University, Evanston, IL, 60208, USA
István A Kovács: Department of Physics and Astronomy, Northwestern University, Evanston, IL, 60208, USA; Northwestern Institute on Complex Systems, Northwestern University, Evanston, IL, 60208, USA
Olivér M Balogh: Cardiometabolic and MTA-SE System Pharmacology Research Group, Department of Pharmacology and Pharmacotherapy, Semmelweis University, Budapest, Hungary
Bettina Benczik: Cardiometabolic and MTA-SE System Pharmacology Research Group, Department of Pharmacology and Pharmacotherapy, Semmelweis University, Budapest, Hungary; Pharmahungary Group, 6722, Szeged, Hungary
Mátyás Pétervári: Cardiometabolic and MTA-SE System Pharmacology Research Group, Department of Pharmacology and Pharmacotherapy, Semmelweis University, Budapest, Hungary
Bence Ágg: Cardiometabolic and MTA-SE System Pharmacology Research Group, Department of Pharmacology and Pharmacotherapy, Semmelweis University, Budapest, Hungary; Pharmahungary Group, 6722, Szeged, Hungary
Péter Ferdinandy: Cardiometabolic and MTA-SE System Pharmacology Research Group, Department of Pharmacology and Pharmacotherapy, Semmelweis University, Budapest, Hungary; Pharmahungary Group, 6722, Szeged, Hungary
Loan Vulliard: CeMM Research Center for Molecular Medicine of the Austrian Academy of Sciences, Vienna, Austria; Department of Structural and Computational Biology, Max Perutz Labs, University of Vienna, Vienna, Austria
Jörg Menche: CeMM Research Center for Molecular Medicine of the Austrian Academy of Sciences, Vienna, Austria; Department of Structural and Computational Biology, Max Perutz Labs, University of Vienna, Vienna, Austria; Faculty of Mathematics, University of Vienna, Vienna, Austria
Stefania Colonnese: Department of Information Engineering, Electronics, and Telecommunications (DIET), University of Rome "Sapienza", Rome, Italy
Manuela Petti: Department of Computer, Control, and Management Engineering "Antonio Ruberti", Sapienza University of Rome, Rome, Italy
Gaetano Scarano: Department of Information Engineering, Electronics, and Telecommunications (DIET), University of Rome "Sapienza", Rome, Italy
Francesca Cuomo: Department of Information Engineering, Electronics, and Telecommunications (DIET), University of Rome "Sapienza", Rome, Italy
Tong Hao: Center for Cancer Systems Biology (CCSB), Dana-Farber Cancer Institute, Boston, MA, 02215, USA; Department of Genetics, Blavatnik Institute, Harvard Medical School, Boston, MA, 02115, USA; Department of Cancer Biology, Dana-Farber Cancer Institute, Boston, MA, 02215, USA
Florent Laval: Center for Cancer Systems Biology (CCSB), Dana-Farber Cancer Institute, Boston, MA, 02215, USA; Department of Genetics, Blavatnik Institute, Harvard Medical School, Boston, MA, 02115, USA; Department of Cancer Biology, Dana-Farber Cancer Institute, Boston, MA, 02215, USA; Laboratory of Molecular and Cellular Epigenetic, GIGA Institute, University of Liège, Liège, Belgium; Laboratory of Viral Interactomes, GIGA Institute, University of Liège, Liège, Belgium; TERRA Teaching and Research Centre, University of Liège, Gembloux, Belgium
Luc Willems: Laboratory of Molecular and Cellular Epigenetic, GIGA Institute, University of Liège, Liège, Belgium; TERRA Teaching and Research Centre, University of Liège, Gembloux, Belgium
Jean-Claude Twizere: Laboratory of Viral Interactomes, GIGA Institute, University of Liège, Liège, Belgium; TERRA Teaching and Research Centre, University of Liège, Gembloux, Belgium
Marc Vidal: Center for Cancer Systems Biology (CCSB), Dana-Farber Cancer Institute, Boston, MA, 02215, USA; Department of Genetics, Blavatnik Institute, Harvard Medical School, Boston, MA, 02115, USA
Michael A Calderwood: Center for Cancer Systems Biology (CCSB), Dana-Farber Cancer Institute, Boston, MA, 02215, USA; Department of Genetics, Blavatnik Institute, Harvard Medical School, Boston, MA, 02115, USA; Department of Cancer Biology, Dana-Farber Cancer Institute, Boston, MA, 02215, USA
Enrico Petrillo: Department of Medicine, Brigham and Women's Hospital and Harvard Medical School, Boston, MA, 02115, USA; Department of General Internal Medicine and Primary Care, Brigham and Women's Hospital, Boston, MA, 02115, USA
Albert-László Barabási: Department of Medicine, Brigham and Women's Hospital and Harvard Medical School, Boston, MA, 02115, USA; Network Science Institute and Department of Physics, Northeastern University, Boston, MA, 02115, USA; Department of Network and Data Science, Central European University, Budapest, H-1051, Hungary
Edwin K Silverman: Channing Division of Network Medicine, Department of Medicine, Brigham and Women's Hospital and Harvard Medical School, Boston, MA, 02115, USA
Joseph Loscalzo: Department of Medicine, Brigham and Women's Hospital and Harvard Medical School, Boston, MA, 02115, USA
Paola Velardi: Translational and Precision Medicine Department, Sapienza University of Rome, Rome, Italy
Yang-Yu Liu: Channing Division of Network Medicine, Department of Medicine, Brigham and Women's Hospital and Harvard Medical School, Boston, MA, 02115, USA; Center for Artificial Intelligence and Modeling, The Carl R. Woese Institute for Genomic Biology, University of Illinois at Urbana-Champaign, Champaign, IL, 61801, USA

Zhang Y, Li Z. RF_phage virion: Classification of phage virion proteins with a random forest model. Front Genet 2023;13:1103783. [PMID: 36846294 PMCID: PMC9945117 DOI: 10.3389/fgene.2022.1103783] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 11/21/2022] [Accepted: 12/30/2022] [Indexed: 02/10/2023] Open

DeepCF-PPI: improved prediction of protein-protein interactions by combining learned and handcrafted features based on attention mechanisms. Appl Intell 2023. [DOI: 10.1007/s10489-022-04387-2] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 01/18/2023]

Ahmed F, Dehzangi I, Hasan MM, Shatabda S. Accurately predicting microbial phosphorylation sites using evolutionary and structural features. Gene 2023;851:146993. [DOI: 10.1016/j.gene.2022.146993] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 07/28/2022] [Revised: 10/05/2022] [Accepted: 10/14/2022] [Indexed: 11/27/2022]

MFIDMA: A Multiple Information Integration Model for the Prediction of Drug-miRNA Associations. Biology 2022;12:biology12010041. [PMID: 36671734 PMCID: PMC9855084 DOI: 10.3390/biology12010041] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 11/20/2022] [Revised: 12/19/2022] [Accepted: 12/22/2022] [Indexed: 12/28/2022]

Zheng K, Zhang XL, Wang L, You ZH, Zhan ZH, Li HY. Line graph attention networks for predicting disease-associated Piwi-interacting RNAs. Brief Bioinform 2022;23:6748487. [PMID: 36198846 DOI: 10.1093/bib/bbac393] [Citation(s) in RCA: 3] [Impact Index Per Article: 1.5] [Received: 05/06/2022] [Revised: 08/08/2022] [Accepted: 08/12/2022] [Indexed: 12/14/2022] Open

Predicting Protein-Protein Interactions via Random Ferns with Evolutionary Matrix Representation. Computational and Mathematical Methods in Medicine 2022;2022:7191684. [PMID: 35242211 PMCID: PMC8888042 DOI: 10.1155/2022/7191684] [Citation(s) in RCA: 1] [Impact Index Per Article: 0.5] [Received: 08/03/2021] [Revised: 01/15/2022] [Accepted: 01/18/2022] [Indexed: 11/27/2022]

Abstract

Protein-protein interactions (PPIs) play a crucial role in understanding disease pathogenesis and genetic mechanisms, in guiding drug design, and in other biochemical processes; thus, the identification of PPIs is of great importance. With the rapid development of high-throughput sequencing technology, a large amount of PPI sequence data has been accumulated. Researchers have designed many experimental methods to detect PPIs by using these sequence data; hence, the prediction of PPIs has become a research hotspot in proteomics. However, since traditional experimental methods are both time-consuming and costly, it is difficult to analyze and predict the massive amount of PPI data quickly and accurately. To address these issues, many computational systems employing machine learning have been applied to PPI prediction, thereby improving the overall recognition rate. In this paper, a novel and efficient computational technique is presented to implement a protein interaction prediction system using only protein sequence information. First, the Position-Specific Iterated Basic Local Alignment Search Tool (PSI-BLAST) was employed to generate a position-specific scoring matrix (PSSM) containing protein evolutionary information from the initial protein sequence. Second, we used a novel feature representation scheme, MatFLDA, to extract the essential information of the PSSM for protein sequences and obtained five training and five testing datasets by adopting a five-fold cross-validation method. Finally, the random fern (RFs) classifier was employed to infer the interactions among proteins, and a model called MatFLDA_RFs was developed. The proposed MatFLDA_RFs model achieved good prediction performance, with 95.03% average accuracy on the Yeast dataset and 85.35% average accuracy on the H. pylori dataset, outperforming other existing computational methods.
The experimental results indicate that the proposed method yields better PPI prediction results, providing an effective tool for the detection of new PPIs and the in-depth study of proteomics. Finally, we also developed a web server for the proposed model to predict protein-protein interactions, which is freely accessible online at http://120.77.11.78:5001/webserver/MatFLDA_RFs.
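
The five-fold cross-validation step mentioned in this abstract can be illustrated with a minimal sketch. The abstract does not say how the folds were drawn, so the shuffling, the seed, and the function name `k_fold_splits` are assumptions made for illustration only; this is not the authors' code.

```python
import random

def k_fold_splits(n_samples, k=5, seed=0):
    """Yield (train_idx, test_idx) index lists for k-fold cross-validation.

    Hypothetical helper: shuffles sample indices once, then uses each of
    the k contiguous chunks as the test set exactly once.
    """
    indices = list(range(n_samples))
    random.Random(seed).shuffle(indices)
    # Spread samples as evenly as possible: the first n_samples % k folds
    # receive one extra sample.
    fold_sizes = [n_samples // k + (1 if i < n_samples % k else 0)
                  for i in range(k)]
    start = 0
    for size in fold_sizes:
        test_idx = indices[start:start + size]
        train_idx = indices[:start] + indices[start + size:]
        yield train_idx, test_idx
        start += size

# Example: 23 protein pairs split into five train/test partitions.
splits = list(k_fold_splits(n_samples=23, k=5))
for fold, (train_idx, test_idx) in enumerate(splits, start=1):
    print(f"fold {fold}: {len(train_idx)} train, {len(test_idx)} test")
```

Every sample lands in exactly one test fold, which is what makes the averaged accuracy over the five folds an estimate based on the whole dataset.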

Li LP, Zhang B, Cheng L. CPIELA: Computational Prediction of Plant Protein–Protein Interactions by Ensemble Learning Approach From Protein Sequences and Evolutionary Information. Front Genet 2022;13:857839. [PMID: 35360876 PMCID: PMC8963800 DOI: 10.3389/fgene.2022.857839] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 01/19/2022] [Accepted: 02/10/2022] [Indexed: 11/22/2022] Open

Mahapatra S, Gupta VR, Sahu SS, Panda G. Deep Neural Network and Extreme Gradient Boosting Based Hybrid Classifier for Improved Prediction of Protein-Protein Interaction. IEEE/ACM Transactions on Computational Biology and Bioinformatics 2022;19:155-165. [PMID: 33621179 DOI: 10.1109/tcbb.2021.3061300] [Citation(s) in RCA: 4] [Impact Index Per Article: 2.0] [Indexed: 06/12/2023]

Ma Y, Li Q, Hu N, Li L. SeBioGraph: Semi-supervised Deep Learning for the Graph via Sustainable Knowledge Transfer. Front Neurorobot 2021;15:665055. [PMID: 33867966 PMCID: PMC8047129 DOI: 10.3389/fnbot.2021.665055] [Citation(s) in RCA: 2] [Impact Index Per Article: 0.7] [Received: 02/07/2021] [Accepted: 03/09/2021] [Indexed: 11/17/2022] Open

GTB-PPI: Predict Protein-protein Interactions Based on L1-regularized Logistic Regression and Gradient Tree Boosting. Genomics Proteomics & Bioinformatics 2021;18:582-592. [PMID: 33515750 PMCID: PMC8377384 DOI: 10.1016/j.gpb.2021.01.001] [Citation(s) in RCA: 12] [Impact Index Per Article: 4.0] [Received: 09/05/2018] [Revised: 12/21/2019] [Accepted: 05/12/2020] [Indexed: 11/20/2022]

Peng J, Lu G, Shang X. A Survey of Network Representation Learning Methods for Link Prediction in Biological Network. Curr Pharm Des 2021;26:3076-3084. [PMID: 31951161 DOI: 10.2174/1381612826666200116145057] [Citation(s) in RCA: 4] [Impact Index Per Article: 1.3] [Received: 12/12/2019] [Accepted: 01/09/2020] [Indexed: 11/22/2022]

Varrone M, Nanni L, Ciriello G, Ceri S. Exploring chromatin conformation and gene co-expression through graph embedding. Bioinformatics 2020;36:i700-i708. [PMID: 33381846 DOI: 10.1093/bioinformatics/btaa803] [Citation(s) in RCA: 5] [Impact Index Per Article: 1.3] [Indexed: 11/12/2022] Open

Abstract

MOTIVATION

The relationship between gene co-expression and chromatin conformation is of great biological interest. Thanks to high-throughput chromosome conformation capture technologies (Hi-C), researchers are gaining insights into the three-dimensional organization of the genome. Given the high complexity of Hi-C data and the difficulty of defining gene co-expression networks, the development of proper computational tools to investigate this relationship is rapidly gaining the interest of researchers. One of the most fascinating questions in this context is how chromatin topology correlates with gene co-expression and which physical interaction patterns are most predictive of co-expression relationships.

RESULTS

To address these questions, we developed a computational framework for the prediction of co-expression networks from chromatin conformation data. We first define a gene chromatin interaction network where each gene is associated with its physical interaction profile; then, we apply two graph embedding techniques to extract a low-dimensional vector representation of each gene from the interaction network; finally, we train a classifier on gene embedding pairs to predict whether they are co-expressed. Both graph embedding techniques outperform previous methods based on manually designed topological features, highlighting the need for more advanced strategies to encode chromatin information. We also establish that the most recent technique, based on random walks, is superior. Overall, our results demonstrate that chromatin conformation and gene regulation share a non-linear relationship and that gene topological embeddings encode relevant information, which could also be used for downstream analysis.

AVAILABILITY AND IMPLEMENTATION

The source code for the analysis is available at: https://github.com/marcovarrone/gene-expression-chromatin.

SUPPLEMENTARY INFORMATION

Supplementary data are available at Bioinformatics online.
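
The last step of the pipeline in the RESULTS paragraph, training a classifier on pairs of gene embeddings, can be sketched roughly as follows. The abstract names neither the operator that combines two embeddings nor the classifier, so the element-wise (Hadamard) product and the small logistic regression below are stand-in assumptions, and the gene names, vectors, and labels are toy values rather than data from the paper.

```python
import math

def hadamard(u, v):
    """Combine two gene embeddings into one pair feature (assumed operator)."""
    return [a * b for a, b in zip(u, v)]

def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))

def train_logreg(X, y, lr=0.5, epochs=200):
    """Plain stochastic gradient descent on the logistic loss."""
    w, b = [0.0] * len(X[0]), 0.0
    for _ in range(epochs):
        for x, target in zip(X, y):
            p = sigmoid(sum(wi * xi for wi, xi in zip(w, x)) + b)
            grad = p - target
            w = [wi - lr * grad * xi for wi, xi in zip(w, x)]
            b -= lr * grad
    return w, b

def predict_proba(w, b, x):
    return sigmoid(sum(wi * xi for wi, xi in zip(w, x)) + b)

# Toy 2-D embeddings (hypothetical): g1/g2 and g3/g4 lie close together.
emb = {"g1": [1.0, 0.9], "g2": [0.9, 1.1], "g3": [-1.0, 1.0], "g4": [-1.1, 0.8]}
# Toy labels: 1 = co-expressed pair, 0 = not co-expressed.
pairs = [("g1", "g2", 1), ("g3", "g4", 1), ("g1", "g3", 0), ("g2", "g4", 0)]

X = [hadamard(emb[a], emb[b]) for a, b, _ in pairs]
y = [label for _, _, label in pairs]
w, b = train_logreg(X, y)

for (a, c, label), x in zip(pairs, X):
    print(a, c, "label:", label, "p(co-expressed):", round(predict_proba(w, b, x), 3))
```

In practice the embeddings would come from the graph embedding step and the classifier would be evaluated on held-out gene pairs; the authors' repository linked above is the reference for what was actually done.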

Dipta SR, Taherzadeh G, Ahmad MW, Arafat ME, Shatabda S, Dehzangi A. SEMal: Accurate protein malonylation site predictor using structural and evolutionary information. Comput Biol Med 2020;125:104022. [PMID: 33022522 DOI: 10.1016/j.compbiomed.2020.104022] [Citation(s) in RCA: 8] [Impact Index Per Article: 2.0] [Received: 06/01/2020] [Revised: 09/24/2020] [Accepted: 09/25/2020] [Indexed: 10/23/2022]

Chen Y, Wang W, Liu J, Feng J, Gong X. Protein Interface Complementarity and Gene Duplication Improve Link Prediction of Protein-Protein Interaction Network. Front Genet 2020;11:291. [PMID: 32300358 PMCID: PMC7142252 DOI: 10.3389/fgene.2020.00291] [Citation(s) in RCA: 12] [Impact Index Per Article: 3.0] [Received: 01/24/2020] [Accepted: 03/10/2020] [Indexed: 12/20/2022] Open

Yue X, Wang Z, Huang J, Parthasarathy S, Moosavinasab S, Huang Y, Lin SM, Zhang W, Zhang P, Sun H. Graph embedding on biomedical networks: methods, applications and evaluations. Bioinformatics 2020;36:1241-1251. [PMID: 31584634 PMCID: PMC7703771 DOI: 10.1093/bioinformatics/btz718] [Citation(s) in RCA: 102] [Impact Index Per Article: 25.5] [Received: 04/09/2019] [Revised: 08/25/2019] [Accepted: 09/26/2019] [Indexed: 01/12/2023] Open

Abstract

MOTIVATION

Graph embedding learning, which aims to automatically learn low-dimensional node representations, has drawn increasing attention in recent years. To date, most recent graph embedding methods have been evaluated on social and information networks and have not been comprehensively studied on biomedical networks under systematic experiments and analyses. On the other hand, for a variety of biomedical network analysis tasks, traditional techniques such as matrix factorization (which can be seen as a type of graph embedding method) have shown promising results, and hence there is a need to systematically evaluate the more recent graph embedding methods (e.g. random walk-based and neural network-based) in terms of their usability and potential to further the state-of-the-art.

RESULTS

We select 11 representative graph embedding methods and conduct a systematic comparison on 3 important biomedical link prediction tasks: drug-disease association (DDA) prediction, drug-drug interaction (DDI) prediction, and protein-protein interaction (PPI) prediction; and on 2 node classification tasks: medical term semantic type classification and protein function prediction. Our experimental results demonstrate that the recent graph embedding methods achieve promising results and deserve more attention in future biomedical graph analysis. Compared with three state-of-the-art methods for DDA, DDI and protein function prediction, the recent graph embedding methods achieve competitive performance without using any biological features, and the learned embeddings can be treated as complementary representations for the biological features. By summarizing the experimental results, we provide general guidelines for properly selecting graph embedding methods and setting their hyper-parameters for different biomedical tasks.

AVAILABILITY AND IMPLEMENTATION

As part of our contributions in the paper, we develop an easy-to-use Python package with detailed instructions, BioNEV, available at: https://github.com/xiangyue9607/BioNEV, including all source code and datasets, to facilitate studying various graph embedding methods on biomedical tasks.

SUPPLEMENTARY INFORMATION

Supplementary data are available at Bioinformatics online.

Chen ZH, You ZH, Li LP, Wang YB, Qiu Y, Hu PW. Identification of self-interacting proteins by integrating random projection classifier and finite impulse response filter. BMC Genomics 2019;20:928. [PMID: 31881833 PMCID: PMC6933882 DOI: 10.1186/s12864-019-6301-1] [Citation(s) in RCA: 2] [Impact Index Per Article: 0.4] [Indexed: 01/02/2023] Open

Abstract

Background

Identification of protein-protein interactions (PPIs) is crucial for understanding biological processes and investigating the cellular functions of genes. Self-interacting proteins (SIPs) are those in which more than two identical proteins can interact with each other; they are a specific type of PPI. More and more researchers are paying attention to SIP detection, and several prediction models have been proposed, but some problems remain. Hence, there is an urgent need to explore an efficient computational model for SIP prediction.

Results

In this study, we developed an effective model to predict SIPs, called RP-FIRF, which merges the Random Projection (RP) classifier and the Finite Impulse Response Filter (FIRF). More specifically, each protein sequence was first transformed into a Position Specific Scoring Matrix (PSSM) by exploiting Position Specific Iterated BLAST (PSI-BLAST). Then, to effectively extract discriminative SIP features and improve the performance of SIP prediction, a FIRF method was applied to the PSSM. The RP classifier was then used to perform the classification and predict novel SIPs. We evaluated the performance of the proposed RP-FIRF model and compared it with the state-of-the-art support vector machine (SVM) on human and yeast datasets, respectively. The proposed model achieved high average accuracies of 97.89% and 97.35% using five-fold cross-validation. To further evaluate the performance of the proposed method, we also compared it with six other existing methods; the experimental results demonstrated that the capacity of our model surpasses that of the previous approaches.

Conclusion

Experimental results show that self-interacting proteins are accurately predicted by the proposed model on human and yeast datasets, respectively. This shows that the proposed model can predict SIPs effectively. Thus, the RP-FIRF model is an automatic decision support method which should provide useful insights into the recognition of SIPs.

Li Z, Nie R, You Z, Cao C, Li J. Using discriminative vector machine model with 2DPCA to predict interactions among proteins. BMC Bioinformatics 2019;20:694. [PMID: 31874626 PMCID: PMC6929273 DOI: 10.1186/s12859-019-3268-5] [Citation(s) in RCA: 6] [Impact Index Per Article: 1.2] [Indexed: 11/15/2022] Open

Bustamam A, Musti MIS, Hartomo S, Aprilia S, Tampubolon PP, Lestari D. Performance of rotation forest ensemble classifier and feature extractor in predicting protein interactions using amino acid sequences. BMC Genomics 2019;20:950. [PMID: 31874636 PMCID: PMC6929266 DOI: 10.1186/s12864-019-6304-y] [Citation(s) in RCA: 4] [Impact Index Per Article: 0.8] [Received: 11/12/2019] [Accepted: 11/18/2019] [Indexed: 01/08/2023] Open

Abstract

Background

There are two significant problems associated with predicting protein-protein interactions using the sequences of amino acids. The first problem is representing each sequence as a feature vector, and the second is designing a model that can identify the protein interactions. Thus, effective feature extraction methods can lead to improved model performance. In this study, we used two types of feature extraction methods, global encoding and pseudo-substitution matrix representation (PseudoSMR), to represent the sequences of amino acids in human proteins and Human Immunodeficiency Virus type 1 (HIV-1) to address the classification problem of predicting protein-protein interactions. We also compared principal component analysis (PCA) with independent principal component analysis (IPCA) as the transformation method used within Rotation Forest.

Results

The results show that using global encoding and PseudoSMR as a feature extraction method successfully represents the amino acid sequence for the Rotation Forest classifier with PCA or with IPCA. This can be seen from the comparison of the results of evaluation metrics, which were >73% across the six different parameters. The accuracy of both methods was >74%. The results for the other model performance criteria, such as sensitivity, specificity, precision, and F1-score, were all >73%. The data used in this study can be accessed using the following link: https://www.dsc.ui.ac.id/research/amino-acid-pred/.

Conclusions

Both global encoding and PseudoSMR can successfully represent the sequences of amino acids. Rotation Forest (PCA) performed better than Rotation Forest (IPCA) in terms of predicting protein-protein interactions between HIV-1 and human proteins. Both the Rotation Forest (PCA) classifier and the Rotation Forest (IPCA) classifier performed better than other classifiers, such as Gradient Boosting, K-Nearest Neighbor, Logistic Regression, Random Forest, and Support Vector Machine (SVM). Rotation Forest (PCA) and Rotation Forest (IPCA) have accuracy, sensitivity, specificity, precision, and F1-score values >70%, while the other classifiers have values <70%.
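
For readers unfamiliar with the evaluation metrics named in this abstract, the sketch below computes them from the four cells of a binary confusion matrix using their standard definitions. The counts are invented for illustration and are not results from the paper; the function name `binary_metrics` is likewise hypothetical.

```python
def binary_metrics(tp, fp, tn, fn):
    """Standard binary classification metrics from confusion-matrix counts.

    Assumes every denominator is non-zero (at least one positive and one
    negative sample, and at least one positive prediction).
    """
    accuracy = (tp + tn) / (tp + fp + tn + fn)
    sensitivity = tp / (tp + fn)   # recall, true positive rate
    specificity = tn / (tn + fp)   # true negative rate
    precision = tp / (tp + fp)
    f1 = 2 * precision * sensitivity / (precision + sensitivity)
    return {
        "accuracy": accuracy,
        "sensitivity": sensitivity,
        "specificity": specificity,
        "precision": precision,
        "f1": f1,
    }

# Illustrative counts only: 105 interacting and 95 non-interacting pairs.
m = binary_metrics(tp=80, fp=20, tn=75, fn=25)
for name, value in m.items():
    print(f"{name}: {value:.3f}")
```

Reporting sensitivity and specificity alongside accuracy matters for interaction datasets, where one class can dominate and accuracy alone can look better than the classifier really is.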

An JY, Zhou Y, Zhao YJ, Yan ZJ. An Efficient Feature Extraction Technique Based on Local Coding PSSM and Multifeatures Fusion for Predicting Protein-Protein Interactions. Evol Bioinform Online 2019;15:1176934319879920. [PMID: 31619921 PMCID: PMC6777060 DOI: 10.1177/1176934319879920] [Citation(s) in RCA: 7] [Impact Index Per Article: 1.4] [Received: 09/04/2019] [Accepted: 09/11/2019] [Indexed: 12/20/2022] Open

Abstract

Background

Increasing evidence has indicated that protein-protein interactions (PPIs) play important roles in various aspects of the structural and functional organization of a cell. Thus, continuing to uncover potential PPIs is an important topic in the biomedical domain. Although various feature extraction methods combined with machine learning approaches have enhanced the prediction of PPIs, there remains room for improvement by developing novel and effective feature extraction methods and classifier approaches to identify PPIs.

Method

In this study, we proposed a sequence-based feature extraction method called LCPSSMMF, which combines a local coding position-specific scoring matrix (PSSM) with multifeatures fusion. First, we used a novel local coding method based on the PSSM to build a new PSSM (CPSSM); the advantage of this method is that it incorporates global and local feature extraction, which can account for the interactions between residues in both continuous and discontinuous regions of amino acid sequences. Second, we adopted 2 different feature extraction methods (Local Average Group [LAG] and Bigram Probability [BP]) to capture multiple kinds of key feature information by employing the evolutionary information embedded in the CPSSM matrix. Finally, feature vectors were acquired by using a multifeatures fusion method.

Result

To evaluate the performance of the proposed feature extraction approach, we employed a support vector machine (SVM) as the prediction classifier and applied this method to yeast and human PPI datasets. The prediction accuracies of LCPSSMMF were 93.43% and 90.41% on the yeast and human datasets, respectively. Moreover, we also compared the proposed method with previous sequence-based approaches on the yeast datasets by using the same SVM classifier. The experimental results indicated that the performance of LCPSSMMF significantly exceeded that of several other state-of-the-art methods. It is proven that the LCPSSMMF approach can capture more local and global discriminatory information than almost all previous methods and can function remarkably well in identifying PPIs. To facilitate extensive research in future proteomics studies, we developed an LCPSSMMFSVM server, which is freely available for academic use at http://219.219.62.123:8888/LCPSSMMFSVM.
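
As a rough illustration of the Bigram Probability (BP) feature mentioned in the Method paragraph, the sketch below uses the commonly cited bigram formulation over a PSSM, in which entry (m, n) sums the products of column m at position i and column n at position i + 1. Whether LCPSSMMF uses exactly this form, and how the CPSSM values are normalized beforehand, is not stated in the abstract, so both are assumptions; the function name `bigram_features` and the toy matrix are hypothetical.

```python
def bigram_features(pssm):
    """Flattened bigram matrix B with B[m][n] = sum_i P[i][m] * P[i+1][n].

    `pssm` is a list of rows, one per sequence position, each with the same
    number of columns (20 for a real PSSM, giving 400 features).
    """
    n_cols = len(pssm[0])
    bigram = [[0.0] * n_cols for _ in range(n_cols)]
    # Walk over consecutive position pairs (i, i + 1).
    for row, next_row in zip(pssm, pssm[1:]):
        for m in range(n_cols):
            for n in range(n_cols):
                bigram[m][n] += row[m] * next_row[n]
    # Flatten row by row into a fixed-length feature vector.
    return [value for bigram_row in bigram for value in bigram_row]

# Toy 3-position, 2-column matrix standing in for a PSSM.
toy_pssm = [[1, 0],
            [0, 1],
            [1, 1]]
features = bigram_features(toy_pssm)
print(features)
```

The useful property is that the output length depends only on the number of columns, so proteins of different lengths map to feature vectors of the same size.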

Chen ZH, Li LP, He Z, Zhou JR, Li Y, Wong L. An Improved Deep Forest Model for Predicting Self-Interacting Proteins From Protein Sequence Using Wavelet Transformation. Front Genet 2019;10:90. [PMID: 30881376 PMCID: PMC6405691 DOI: 10.3389/fgene.2019.00090] [Citation(s) in RCA: 30] [Impact Index Per Article: 6.0] [Received: 10/11/2018] [Accepted: 01/29/2019] [Indexed: 12/23/2022] Open

Bharti P, Mittal D, Ananthasivan R. Preliminary Study of Chronic Liver Classification on Ultrasound Images Using an Ensemble Model. ULTRASONIC IMAGING 2018;40:357-379. [PMID: 30015593 DOI: 10.1177/0161734618787447] [Citation(s) in RCA: 18] [Impact Index Per Article: 3.0] [Reference Citation Analysis] [Abstract] [Key Words] [MESH Headings] [Track Full Text] [Subscribe] [Scholar Register] [Indexed: 05/21/2023]

Abstract

Chronic liver diseases are the fifth leading cause of fatality in developing countries. Their early diagnosis is extremely important for timely treatment and for saving lives. To examine abnormalities of the liver, ultrasound imaging is the most frequently used modality. However, the visual differentiation between chronic liver and cirrhosis, and the presence of hepatocellular carcinoma (HCC) evolved over cirrhotic liver, is difficult, as they appear almost similar in ultrasound images. In this paper, to deal with this difficult visualization problem, a method has been developed for classifying four liver stages, that is, normal, chronic, cirrhosis, and HCC evolved over cirrhosis. The method is formulated with a selected set of "handcrafted" texture features obtained after hierarchical feature fusion. These multiresolution and higher order features, which are able to characterize echotexture and roughness of the liver surface, are extracted by using ranklet, gray-level difference matrix, and gray-level co-occurrence matrix methods. Thereafter, these features are fed to the proposed ensemble classifier, which is designed with a voting algorithm in conjunction with three classifiers, namely, k-nearest neighbor (k-NN), support vector machine (SVM), and rotation forest. The experiments are conducted to evaluate the (a) effectiveness of "handcrafted" texture features, (b) performance of the proposed ensemble model, (c) effectiveness of the proposed ensemble strategy, (d) performance of different classifiers, and (e) performance of the proposed ensemble model based on Convolutional Neural Networks (CNN) features to differentiate four liver stages. These experiments are carried out on a database of 754 segmented regions of interest formed from clinically acquired ultrasound images. The results show that a classification accuracy of 96.6% is obtained by use of the proposed classifier model.
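
The voting step described in this abstract can be pictured with a minimal sketch of hard (majority) voting over the label outputs of three classifiers. The abstract does not say whether hard or weighted voting was used, so plain majority voting is an assumption here, and the function name `majority_vote` and the example predictions are hypothetical.

```python
from collections import Counter


def majority_vote(predictions):
    """Combine per-classifier label lists by majority vote.

    predictions is a list with one entry per classifier; each entry is a
    list of predicted labels, one per sample, all of the same length.
    When every label for a sample is different, the label from the
    earliest classifier in the list is kept.
    """
    voted = []
    # zip(*predictions) groups the labels that all classifiers gave one sample.
    for labels in zip(*predictions):
        voted.append(Counter(labels).most_common(1)[0][0])
    return voted


# Hypothetical outputs of three classifiers on three regions of interest.
knn_labels = ["normal", "cirrhosis", "HCC"]
svm_labels = ["normal", "chronic", "HCC"]
rotation_forest_labels = ["chronic", "chronic", "cirrhosis"]

combined = majority_vote([knn_labels, svm_labels, rotation_forest_labels])
```

With three voters and four classes, a three-way disagreement is possible, which is why the tie rule is stated explicitly; a weighted scheme would be one alternative.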


Liu Y, Wang X, Liu B. IDP-CRF: Intrinsically Disordered Protein/Region Identification Based on Conditional Random Fields. Int J Mol Sci 2018;19:E2483. [PMID: 30135358 PMCID: PMC6164615 DOI: 10.3390/ijms19092483] [Citation(s) in RCA: 16] [Impact Index Per Article: 2.7] [Reference Citation Analysis] [Abstract] [Key Words] [MESH Headings] [Track Full Text] [Download PDF] [Figures] [Journal Information] [Subscribe] [Scholar Register] [Received: 07/21/2018] [Revised: 08/14/2018] [Accepted: 08/18/2018] [Indexed: 12/16/2022] Open

Zhu HJ, You ZH, Zhu ZX, Shi WL, Chen X, Cheng L. DroidDet: Effective and robust detection of android malware using static analysis along with rotation forest model. Neurocomputing 2018. [DOI: 10.1016/j.neucom.2017.07.030] [Citation(s) in RCA: 109] [Impact Index Per Article: 18.2] [Reference Citation Analysis] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Indexed: 11/15/2022]

Zhan ZH, You ZH, Zhou Y, Li LP, Li ZW. Efficient Framework for Predicting ncRNA-Protein Interactions Based on Sequence Information by Deep Learning. INTELLIGENT COMPUTING THEORIES AND APPLICATION 2018:337-344. [DOI: 10.1007/978-3-319-95933-7_41] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Reference Citation Analysis] [Track Full Text] [Subscribe] [Scholar Register] [Indexed: 08/30/2023]

Li Y, Ilie L. SPRINT: ultrafast protein-protein interaction prediction of the entire human interactome. BMC Bioinformatics 2017;18:485. [PMID: 29141584 PMCID: PMC5688644 DOI: 10.1186/s12859-017-1871-x] [Citation(s) in RCA: 40] [Impact Index Per Article: 5.7] [Reference Citation Analysis] [Abstract] [Key Words] [Track Full Text] [Download PDF] [Figures] [Journal Information] [Subscribe] [Scholar Register] [Received: 05/19/2017] [Accepted: 10/17/2017] [Indexed: 12/30/2022] Open