Reference Citation Analysis

For: Zhang S, Zhu F, Yu Q, Zhu X. Identifying DNA-binding proteins based on multi-features and LASSO feature selection. Biopolymers 2021;112:e23419. [PMID: 33476047 DOI: 10.1002/bip.23419] [Citation(s) in RCA: 13] [Impact Index Per Article: 4.3] [Received: 01/08/2021] [Revised: 01/08/2021] [Accepted: 01/08/2021] [Indexed: 01/22/2023]

Cited by Other Article(s)

Qian J, Jin P, Yang Y, Ma N, Yang Z, Zhang X. Protein function annotation and virulence factor identification of Klebsiella pneumoniae genome by multiple machine learning models. Microb Pathog 2024;193:106727. [PMID: 38851362 DOI: 10.1016/j.micpath.2024.106727] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 12/10/2023] [Revised: 05/27/2024] [Accepted: 06/03/2024] [Indexed: 06/10/2024]

Zou H. iDPPIV-SI: identifying dipeptidyl peptidase IV inhibitory peptides by using multiple sequence information. J Biomol Struct Dyn 2024;42:2144-2152. [PMID: 37125813 DOI: 10.1080/07391102.2023.2203257] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 10/25/2022] [Accepted: 04/10/2023] [Indexed: 05/02/2023]

Xiao S, Liu F, Yu L, Li X, Ye X, Gong X. Development and validation of a nomogram for blood transfusion during intracranial aneurysm clamping surgery: a retrospective analysis. BMC Med Inform Decis Mak 2023;23:71. [PMID: 37076865 PMCID: PMC10114399 DOI: 10.1186/s12911-023-02157-9] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 12/12/2022] [Accepted: 03/17/2023] [Indexed: 04/21/2023]

Liu Z, Zhang T, Lin L, Long F, Guo H, Han L. Applications of radiomics-based analysis pipeline for predicting epidermal growth factor receptor mutation status. Biomed Eng Online 2023;22:17. [PMID: 36810090 PMCID: PMC9945395 DOI: 10.1186/s12938-022-01049-9] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 07/02/2022] [Accepted: 11/04/2022] [Indexed: 02/24/2023]

Robust and accurate prediction of self-interacting proteins from protein sequence information by exploiting weighted sparse representation based classifier. BMC Bioinformatics 2022;23:518. [PMID: 36457083 PMCID: PMC9713954 DOI: 10.1186/s12859-022-04880-y] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 07/26/2022] [Accepted: 08/03/2022] [Indexed: 12/04/2022]

Abstract

BACKGROUND

Self-interacting proteins (SIPs), in which two or more copies of a protein expressed by one gene interact with each other, play a central role in the regulation of most living cells and cellular functions. Although high-throughput experimental techniques can provide large amounts of SIP data, experimental identification of SIPs remains time-consuming, costly, inefficient, and inherently prone to high false-positive rates. It is therefore increasingly important to develop efficient and accurate automatic approaches that supplement experimental methods and accelerate the prediction of SIPs from protein sequence information.

RESULTS

In this paper, we present a novel framework, termed GLCM-WSRC (gray level co-occurrence matrix-weighted sparse representation based classification), for predicting SIPs automatically based on protein evolutionary information from protein primary sequences. More specifically, we first convert the protein sequence into a Position Specific Scoring Matrix (PSSM) containing protein sequence evolutionary information, using the Position Specific Iterated BLAST (PSI-BLAST) tool. Second, using an efficient feature extraction approach, i.e., GLCM, we extract salient and invariant feature vectors from the PSSM, and then apply the adaptive synthetic (ADASYN) technique as a pre-processing step to balance the SIPs dataset and generate new feature vectors for classification. Finally, we employ an efficient and reliable WSRC model to identify SIPs according to the known information of self-interacting and non-interacting proteins.

CONCLUSIONS

Extensive experimental results show that the proposed approach achieves high prediction performance, with 98.10% accuracy on the yeast dataset and 91.51% accuracy on the human dataset, which suggests that the proposed model could be a useful tool for large-scale self-interacting protein prediction and other bioinformatics detection tasks in the future.
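
The GLCM feature-extraction step described in this abstract can be illustrated with a small sketch. It is not the authors' implementation: the number of gray levels, the horizontal pixel offset, the three texture descriptors, and the toy 4 x 4 matrix standing in for a PSSM are all assumptions chosen for illustration. The ADASYN balancing and WSRC classification steps are not shown.

```python
def quantize(matrix, levels):
    """Map real-valued scores to integer gray levels 0..levels-1 (min-max scaling)."""
    flat = [v for row in matrix for v in row]
    lo, hi = min(flat), max(flat)
    span = (hi - lo) or 1.0  # avoid division by zero on a constant matrix
    return [[min(levels - 1, int((v - lo) / span * levels)) for v in row]
            for row in matrix]


def glcm(gray, levels, dr=0, dc=1):
    """Normalized co-occurrence matrix for cell pairs separated by offset (dr, dc)."""
    counts = [[0] * levels for _ in range(levels)]
    rows, cols = len(gray), len(gray[0])
    total = 0
    for r in range(rows):
        for c in range(cols):
            r2, c2 = r + dr, c + dc
            if 0 <= r2 < rows and 0 <= c2 < cols:
                counts[gray[r][c]][gray[r2][c2]] += 1
                total += 1
    # Assumes at least one valid pair exists for the chosen offset.
    return [[n / total for n in row] for row in counts]


def glcm_features(p):
    """Three common texture descriptors: contrast, energy, homogeneity."""
    idx = range(len(p))
    contrast = sum(p[i][j] * (i - j) ** 2 for i in idx for j in idx)
    energy = sum(p[i][j] ** 2 for i in idx for j in idx)
    homogeneity = sum(p[i][j] / (1 + abs(i - j)) for i in idx for j in idx)
    return [contrast, energy, homogeneity]


# Hypothetical toy "PSSM"; a real one has one row per residue and 20 columns.
toy_pssm = [
    [-3, -3, 2, 2],
    [-3, -3, 2, 2],
    [-3, 2, 2, 5],
    [2, 2, 5, 5],
]
gray = quantize(toy_pssm, levels=4)
p = glcm(gray, levels=4)
features = glcm_features(p)
```

The resulting fixed-length vector does not depend on the protein's length, which is what makes a GLCM summary convenient as classifier input.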

Li Y, Cheng P, Liang L, Dong H, Liu H, Shen W, Zhou W. Abnormal resting-state functional connectome in methamphetamine-dependent patients and its application in machine-learning-based classification. Front Neurosci 2022;16:1014539. [DOI: 10.3389/fnins.2022.1014539] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 08/09/2022] [Accepted: 11/04/2022] [Indexed: 11/18/2022]

Abstract

Brain resting-state functional connectivity (rsFC) has been widely analyzed in substance use disorders (SUDs), including methamphetamine (MA) dependence. Most of these studies utilized Pearson correlation analysis to assess rsFC, which cannot determine whether two brain regions are connected by direct or indirect pathways. Moreover, few studies have reported the application of rsFC-based graph theory in MA dependence. We evaluated alterations in Tikhonov regularization-based rsFC and rsFC-based topological attributes in 46 MA-dependent patients, as well as the correlations between topological attributes and clinical variables. Moreover, the topological attributes selected by least absolute shrinkage and selection operator (LASSO) were used to construct a support vector machine (SVM)-based classifier for MA dependence. The MA group presented a subnetwork with increased rsFC, indicating overactivation of the reward circuit that makes patients very sensitive to drug-related visual cues, and a subnetwork with decreased rsFC suggesting aberrant synchronized spontaneous activity in subregions within the orbitofrontal cortex (OFC) system. The MA group demonstrated a significantly decreased area under the curve (AUC) for the clustering coefficient (Cp) (Pperm < 0.001), shortest path length (Lp) (Pperm = 0.007), modularity (Pperm = 0.006), and small-worldness (σ, Pperm = 0.004), as well as an increased AUC for global efficiency (E.glob) (Pperm = 0.009), network strength (Sp) (Pperm = 0.009), and small-worldness (ω, Pperm < 0.001), implying a shift toward random networks. MA-related increased nodal efficiency (E.nodal) and altered betweenness centrality were also discovered in several brain regions. The AUC for ω was significantly positively associated with psychiatric symptoms. An SVM classifier trained on 36 features selected by LASSO from all topological attributes achieved excellent performance, with a cross-validated area under the receiver operating characteristic curve, accuracy, sensitivity, specificity, and kappa of 99.03 ± 1.79, 94.00 ± 5.78, 93.46 ± 8.82, 94.52 ± 8.11, and 87.99 ± 11.57%, respectively (Pperm < 0.001), indicating that rsFC-based topological attributes can provide promising features for constructing a high-efficacy classifier for MA dependence.
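
LASSO feature selection, which this abstract and the indexed article both rely on, can be sketched with a short coordinate-descent solver. This is a minimal illustration under assumptions, not the method used in either study: the data are a made-up toy design with centered, equally scaled columns, there is no intercept, and the penalty `alpha` is chosen by hand. The SVM trained on the selected features is omitted.

```python
def soft_threshold(value, threshold):
    """Shrink value toward zero by threshold; values inside the band become exactly 0."""
    if value > threshold:
        return value - threshold
    if value < -threshold:
        return value + threshold
    return 0.0


def lasso_coordinate_descent(X, y, alpha, n_iter=50):
    """Minimize (1/(2n)) * ||y - Xw||^2 + alpha * ||w||_1 by cyclic coordinate descent."""
    n, d = len(X), len(X[0])
    w = [0.0] * d
    for _ in range(n_iter):
        for j in range(d):
            rho = 0.0  # correlation of feature j with the residual that ignores feature j
            z = 0.0    # squared norm of feature j
            for i in range(n):
                pred_without_j = sum(X[i][k] * w[k] for k in range(d) if k != j)
                rho += X[i][j] * (y[i] - pred_without_j)
                z += X[i][j] ** 2
            w[j] = soft_threshold(rho / n, alpha) / (z / n) if z > 0 else 0.0
    return w


# Hypothetical toy data: y = 2.0*x0 + 0.05*x1 - 1.5*x2, so x1 is nearly irrelevant.
X = [
    [1, 1, 1],
    [1, -1, -1],
    [-1, 1, -1],
    [-1, -1, 1],
]
y = [0.55, 3.45, -0.45, -3.55]

w = lasso_coordinate_descent(X, y, alpha=0.1)
selected = [j for j, coef in enumerate(w) if coef != 0.0]  # indices of retained features
```

Because the L1 penalty sets weak coefficients to exactly zero, the retained indices in `selected` can be passed directly to a downstream classifier as the reduced feature set.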

Nguyen Q, Tran HV, Nguyen BP, Do TTT. Identifying Transcription Factors That Prefer Binding to Methylated DNA Using Reduced G-Gap Dipeptide Composition. ACS Omega 2022;7:32322-32330. [PMID: 36119976 PMCID: PMC9475634 DOI: 10.1021/acsomega.2c03696] [Citation(s) in RCA: 3] [Impact Index Per Article: 1.5] [Received: 06/14/2022] [Accepted: 08/23/2022] [Indexed: 06/15/2023]

Identification of DNA-binding proteins via Multi-view LSSVM with independence criterion. Methods 2022;207:29-37. [PMID: 36087888 DOI: 10.1016/j.ymeth.2022.08.015] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 06/16/2022] [Revised: 08/06/2022] [Accepted: 08/25/2022] [Indexed: 11/24/2022]

Feature Selection Based on Adaptive Particle Swarm Optimization with Leadership Learning. Computational Intelligence and Neuroscience 2022;2022:1825341. [PMID: 36072739 PMCID: PMC9441366 DOI: 10.1155/2022/1825341] [Citation(s) in RCA: 1] [Impact Index Per Article: 0.5] [Received: 06/29/2022] [Revised: 08/07/2022] [Accepted: 08/09/2022] [Indexed: 12/02/2022]

Sun CK, Tang YX, Liu TC, Lu CJ. An Integrated Machine Learning Scheme for Predicting Mammographic Anomalies in High-Risk Individuals Using Questionnaire-Based Predictors. International Journal of Environmental Research and Public Health 2022;19:ijerph19159756. [PMID: 35955112 PMCID: PMC9368335 DOI: 10.3390/ijerph19159756] [Citation(s) in RCA: 2] [Impact Index Per Article: 1.0] [Received: 06/30/2022] [Revised: 08/02/2022] [Accepted: 08/06/2022] [Indexed: 05/09/2023]

Qin Y, Li C, Shi X, Wang W. MLP-Based Regression Prediction Model For Compound Bioactivity. Front Bioeng Biotechnol 2022;10:946329. [PMID: 35910022 PMCID: PMC9326362 DOI: 10.3389/fbioe.2022.946329] [Citation(s) in RCA: 1] [Impact Index Per Article: 0.5] [Received: 05/17/2022] [Accepted: 06/23/2022] [Indexed: 11/13/2022]

Comparative Analysis on Alignment-Based and Pretrained Feature Representations for the Identification of DNA-Binding Proteins. Computational and Mathematical Methods in Medicine 2022;2022:5847242. [PMID: 35799660 PMCID: PMC9256349 DOI: 10.1155/2022/5847242] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 05/10/2022] [Accepted: 06/07/2022] [Indexed: 11/17/2022]

Yao Y, Zhang S, Xue T. Integrating LASSO Feature Selection and Soft Voting Classifier to Identify Origins of Replication Sites. Curr Genomics 2022;23:83-93. [PMID: 36778978 PMCID: PMC9878833 DOI: 10.2174/1389202923666220214122506] [Citation(s) in RCA: 1] [Impact Index Per Article: 0.5] [Received: 09/30/2021] [Revised: 12/11/2021] [Accepted: 01/18/2022] [Indexed: 11/22/2022]

iDHS-DT: Identifying DNase I hypersensitive sites by integrating DNA dinucleotide and trinucleotide information. Biophys Chem 2021;281:106717. [PMID: 34798459 DOI: 10.1016/j.bpc.2021.106717] [Citation(s) in RCA: 1] [Impact Index Per Article: 0.3] [Received: 10/15/2021] [Revised: 11/10/2021] [Accepted: 11/10/2021] [Indexed: 01/02/2023]

Zou Y, Ding Y, Peng L, Zou Q. FTWSVM-SR: DNA-Binding Proteins Identification via Fuzzy Twin Support Vector Machines on Self-Representation. Interdiscip Sci 2021;14:372-384. [PMID: 34743286 DOI: 10.1007/s12539-021-00489-6] [Citation(s) in RCA: 3] [Impact Index Per Article: 1.0] [Received: 06/05/2021] [Revised: 10/11/2021] [Accepted: 10/24/2021] [Indexed: 12/01/2022]

Zou H, Yang F, Yin Z. Identifying N7-methylguanosine sites by integrating multiple features. Biopolymers 2021;113:e23480. [PMID: 34709657 DOI: 10.1002/bip.23480] [Citation(s) in RCA: 2] [Impact Index Per Article: 0.7] [Received: 08/27/2021] [Revised: 10/12/2021] [Accepted: 10/14/2021] [Indexed: 11/10/2022]

Zou H, Yin Z. m7G-DPP: Identifying N7-methylguanosine sites based on dinucleotide physicochemical properties of RNA. Biophys Chem 2021;279:106697. [PMID: 34628276 DOI: 10.1016/j.bpc.2021.106697] [Citation(s) in RCA: 1] [Impact Index Per Article: 0.3] [Received: 07/18/2021] [Revised: 10/01/2021] [Accepted: 10/02/2021] [Indexed: 11/17/2022]

Shen Z, Liu T, Xu T. Accurate Identification of Antioxidant Proteins Based on a Combination of Machine Learning Techniques and Hidden Markov Model Profiles. Computational and Mathematical Methods in Medicine 2021;2021:5770981. [PMID: 34413898 PMCID: PMC8369162 DOI: 10.1155/2021/5770981] [Citation(s) in RCA: 1] [Impact Index Per Article: 0.3] [Received: 05/21/2021] [Revised: 07/15/2021] [Accepted: 07/26/2021] [Indexed: 01/19/2023]

i6mA-VC: A Multi-Classifier Voting Method for the Computational Identification of DNA N6-methyladenine Sites. Interdiscip Sci 2021;13:413-425. [PMID: 33834381 DOI: 10.1007/s12539-021-00429-4] [Citation(s) in RCA: 7] [Impact Index Per Article: 2.3] [Received: 12/19/2020] [Revised: 03/26/2021] [Accepted: 03/29/2021] [Indexed: 12/14/2022]

Abstract

DNA N6-methyladenine (6mA), an essential component of epigenetic modification, cannot be neglected in genetic regulation mechanisms. The efficient and accurate prediction of 6mA sites is beneficial to the development of biological genetics. Biochemical experimental methods are considered to be time-consuming and laborious. Most of the established machine learning methods are built on a single dataset. Although some of them have achieved cross-species prediction, their results are not satisfactory. Therefore, we designed a novel statistical model called i6mA-VC to improve the accuracy of 6mA site prediction. On the one hand, kmer and binary encoding are applied to extract features, and then the gradient boosting decision tree (GBDT) embedded method is applied as the feature selection strategy. On the other hand, DNA sequences are represented by vectors through the feature extraction method of ring-function-hydrogen-chemical properties (RFHCP) and the feature selection strategy of ExtraTree. After fusing the two optimal feature sets, a voting classifier based on gradient boosting decision tree (GBDT), light gradient boosting machine (LightGBM) and multilayer perceptron classifier (MLPC) is constructed for final classification and prediction. The accuracies on the Rice and M. musculus datasets under five-fold cross-validation are 0.888 and 0.967, respectively. The cross-species dataset is selected as the independent testing dataset, and the accuracy reaches 0.848. Through rigorous experiments, it is demonstrated that the proposed predictor is convincing and applicable. The i6mA-VC predictor offers an effective way to recognize N6-methyladenine sites, and it will also help biological geneticists to further study gene expression and DNA modification. In addition, an accessible web-server for i6mA-VC is available from http://www.zhanglab.site/.
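
Two of the feature encodings named in this abstract (kmer and binary encoding) and the final voting step can be sketched as follows. This is an illustrative sketch, not the i6mA-VC code: the abstract does not say whether voting is hard or soft, so averaging predicted probabilities (soft voting) is an assumption, and the three probabilities below are hypothetical stand-ins for the outputs of the GBDT, LightGBM, and MLPC models. RFHCP encoding and the feature selection steps are not shown.

```python
from itertools import product

ALPHABET = "ACGT"


def kmer_frequencies(seq, k):
    """Relative frequency of every length-k substring, in lexicographic k-mer order."""
    kmers = ["".join(p) for p in product(ALPHABET, repeat=k)]
    counts = dict.fromkeys(kmers, 0)
    windows = len(seq) - k + 1
    for i in range(windows):
        counts[seq[i:i + k]] += 1  # assumes the sequence contains only A, C, G, T
    return [counts[m] / windows for m in kmers]


def binary_encoding(seq):
    """One-hot encode each base as four bits and concatenate along the sequence."""
    one_hot = {
        "A": [1, 0, 0, 0],
        "C": [0, 1, 0, 0],
        "G": [0, 0, 1, 0],
        "T": [0, 0, 0, 1],
    }
    return [bit for base in seq for bit in one_hot[base]]


def soft_vote(probabilities, threshold=0.5):
    """Average the positive-class probabilities and threshold the mean."""
    mean = sum(probabilities) / len(probabilities)
    return (1 if mean >= threshold else 0), mean


seq = "ACGTAC"                       # hypothetical toy sequence
kmer_vec = kmer_frequencies(seq, k=2)
binary_vec = binary_encoding(seq)
fused = kmer_vec + binary_vec        # simple concatenation of the two feature views

# Hypothetical positive-class probabilities from three already-trained models.
label, mean_prob = soft_vote([0.9, 0.6, 0.3])
```

With k = 2 the k-mer vector has 16 entries regardless of sequence length, while the binary encoding grows by four entries per base, so the latter requires fixed-length input windows.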
