Reference Citation Analysis

For: Cai YD, Ricardo PW, Jen CH, Chou KC. Application of SVM to predict membrane protein types. J Theor Biol 2004;226:373-6. [PMID: 14759643 DOI: 10.1016/j.jtbi.2003.08.015] [Citation(s) in RCA: 115] [Impact Index Per Article: 5.8] [Received: 05/29/2003] [Revised: 08/22/2003] [Accepted: 08/28/2003] [Indexed: 11/28/2022]

Cited by Other Article(s)

Singh L, Singh S, Singh DD. A Machine Learning Approach to Identify C Type Lectin Domain (CTLD) Containing Proteins. Protein J 2024;43:718-725. [PMID: 39068630 DOI: 10.1007/s10930-024-10224-x] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Accepted: 07/07/2024] [Indexed: 07/30/2024]

Abstract

Lectins are sugar-interacting proteins that bind specific glycans reversibly and are present in all forms of life. They have diverse biological functions, such as cell signaling and molecular recognition. C-type lectins (CTL) are a group of proteins from the lectin family that have been studied extensively in animals and are reported to be involved in immune functions, carcinogenesis, and cell signaling. The carbohydrate recognition domain (CRD) in CTL has a highly variable protein sequence, and proteins carrying this domain are also referred to as C-type lectin domain containing proteins (CTLD). Because of this low sequence homology, identifying CTLD among hypothetical proteins in sequenced genomes with homology-based programs has limitations. Machine learning (ML) tools use characteristic features to identify homologous sequences, and ML has been used here to develop a tool for identifying CTLD. Initially, 500 sequences of well-annotated CTLD and 500 sequences of non-CTLD were used to develop the machine learning model. The LinearSVC classifier from the scikit-learn library for Python was used, and characteristic features of CTLD sequences, such as dipeptide and tripeptide composition, were used as training attributes in various classifiers. Precision, recall, and Matthews correlation coefficient (MCC) values of 0.92, 0.91, and 0.82, respectively, were obtained on an external test set. After fine-tuning parameters such as kernel, C value, gamma, and degree, and increasing the number of non-CTLD sequences, precision, recall, and MCC improved to 0.99, 0.99, and 0.96, respectively. New CTLD have also been identified in the hypothetical segment of the human genome using the trained model. The tool is available on our local server for interested users.
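The abstract above names dipeptide composition as a training feature and reports precision, recall, and MCC. The following is a minimal sketch of both ideas using only the Python standard library; it is not the authors' code. The function names are hypothetical, and the handling of non-standard residues (they are skipped) is an assumption, since the abstract does not say how they were treated.

```python
from itertools import product
from math import sqrt

AMINO_ACIDS = "ACDEFGHIKLMNPQRSTVWY"
# All 20 x 20 = 400 ordered residue pairs, in a fixed order.
DIPEPTIDES = ["".join(pair) for pair in product(AMINO_ACIDS, repeat=2)]


def dipeptide_composition(sequence):
    """Return the 400-dimensional dipeptide frequency vector of a sequence."""
    sequence = sequence.upper()
    counts = dict.fromkeys(DIPEPTIDES, 0)
    total = 0
    for i in range(len(sequence) - 1):
        pair = sequence[i:i + 2]
        if pair in counts:  # assumption: pairs with non-standard residues are skipped
            counts[pair] += 1
            total += 1
    if total == 0:
        return [0.0] * len(DIPEPTIDES)
    return [counts[d] / total for d in DIPEPTIDES]


def precision_recall_mcc(tp, fp, tn, fn):
    """Compute precision, recall, and Matthews correlation coefficient
    from binary confusion-matrix counts."""
    precision = tp / (tp + fp) if tp + fp else 0.0
    recall = tp / (tp + fn) if tp + fn else 0.0
    denom = sqrt((tp + fp) * (tp + fn) * (tn + fp) * (tn + fn))
    mcc = (tp * tn - fp * fn) / denom if denom else 0.0
    return precision, recall, mcc
```

A vector produced by `dipeptide_composition` could be passed to any classifier that accepts fixed-length numeric features; the counts given to `precision_recall_mcc` would come from comparing that classifier's predictions with the labels of a held-out test set.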

Qian Y, Ding Y, Zou Q, Guo F. Multi-View Kernel Sparse Representation for Identification of Membrane Protein Types. IEEE/ACM Transactions on Computational Biology and Bioinformatics 2023;20:1234-1245. [PMID: 35857734 DOI: 10.1109/tcbb.2022.3191325] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 05/04/2023]

Gogoi CR, Rahman A, Saikia B, Baruah A. Protein Dihedral Angle Prediction: The State of the Art. ChemistrySelect 2023. [DOI: 10.1002/slct.202203427] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 02/05/2023]

Moosaei H, Ganaie M, Hladík M, Tanveer M. Inverse free reduced universum twin support vector machine for imbalanced data classification. Neural Netw 2023;157:125-135. [DOI: 10.1016/j.neunet.2022.10.003] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 03/08/2022] [Revised: 10/04/2022] [Accepted: 10/04/2022] [Indexed: 11/09/2022]

A Review on Data-Driven Quality Prediction in the Production Process with Machine Learning for Industry 4.0. Processes (Basel) 2022. [DOI: 10.3390/pr10101966] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 11/16/2022] Open

Waqas S, Harun NY, Sambudi NS, Arshad U, Nordin NAHM, Bilad MR, Saeed AAH, Malik AA. SVM and ANN Modelling Approach for the Optimization of Membrane Permeability of a Membrane Rotating Biological Contactor for Wastewater Treatment. Membranes 2022;12:membranes12090821. [PMID: 36135840 PMCID: PMC9504877 DOI: 10.3390/membranes12090821] [Citation(s) in RCA: 10] [Impact Index Per Article: 5.0] [Received: 08/02/2022] [Revised: 08/15/2022] [Accepted: 08/17/2022] [Indexed: 05/31/2023]

Lu W, Shen J, Zhang Y, Wu H, Qian Y, Chen X, Fu Q. Identifying Membrane Protein Types Based on Lifelong Learning With Dynamically Scalable Networks. Front Genet 2022;12:834488. [PMID: 35371189 PMCID: PMC8964460 DOI: 10.3389/fgene.2021.834488] [Citation(s) in RCA: 2] [Impact Index Per Article: 1.0] [Received: 12/13/2021] [Accepted: 12/21/2021] [Indexed: 11/13/2022] Open

Zhang Z, Wang L. Using Chou's 5-steps rule to identify N⁶-methyladenine sites by ensemble learning combined with multiple feature extraction methods. J Biomol Struct Dyn 2022;40:796-806. [PMID: 32948102 DOI: 10.1080/07391102.2020.1821778] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 10/23/2022]

The Limitations in Current Studies of Organic Fouling and Future Prospects. Membranes 2021;11:membranes11120922. [PMID: 34940423 PMCID: PMC8708778 DOI: 10.3390/membranes11120922] [Citation(s) in RCA: 2] [Impact Index Per Article: 0.7] [Received: 10/27/2021] [Revised: 11/22/2021] [Accepted: 11/24/2021] [Indexed: 11/16/2022]

iMPT-FDNPL: Identification of Membrane Protein Types with Functional Domains and a Natural Language Processing Approach. Computational and Mathematical Methods in Medicine 2021;2021:7681497. [PMID: 34671418 PMCID: PMC8523280 DOI: 10.1155/2021/7681497] [Citation(s) in RCA: 29] [Impact Index Per Article: 9.7] [Received: 08/22/2021] [Revised: 09/15/2021] [Accepted: 09/27/2021] [Indexed: 12/20/2022]

Cao Y, Yu C, Huang S, Wang S, Zuo Y, Yang L. Characterization and Prediction of Presynaptic and Postsynaptic Neurotoxins Based on Reduced Amino Acids and Biological Properties. Curr Bioinform 2021. [DOI: 10.2174/1574893615999200707150512] [Citation(s) in RCA: 3] [Impact Index Per Article: 1.0] [Indexed: 11/22/2022]

Prediction of Maize Yield at the City Level in China Using Multi-Source Data. Remote Sensing 2021. [DOI: 10.3390/rs13010146] [Citation(s) in RCA: 7] [Impact Index Per Article: 2.3] [Indexed: 01/13/2023]

Abstract

Maize is a widely grown crop in China, and the relationships between agroclimatic parameters and maize yield are complicated; hence, accurate and timely yield prediction is challenging. Here, climate data, satellite data, and meteorological indices were integrated to predict maize yield at the city level in China from 2000 to 2015 using four machine learning approaches: Cubist, random forest (RF), extreme gradient boosting (Xgboost), and support vector machine (SVM). The climate variables included the diffuse flux of photosynthetic active radiation (PDf), the diffuse flux of shortwave radiation (SDf), the direct flux of shortwave radiation (SDr), minimum temperature (Tmn), potential evapotranspiration (Pet), vapor pressure deficit (Vpd), vapor pressure (Vap), and wet day frequency (Wet). Satellite data, including the enhanced vegetation index (EVI), normalized difference vegetation index (NDVI), and soil-adjusted vegetation index (SAVI) from the Moderate Resolution Imaging Spectroradiometer (MODIS), were used. Meteorological indices, including growing degree day (GDD), extreme degree day (EDD), and the Standardized Precipitation Evapotranspiration Index (SPEI), were used. The results showed that integrating all climate data, satellite data, and meteorological indices achieved the highest accuracy. The highest estimated correlation coefficient (R) values for the Cubist, RF, SVM, and Xgboost methods were 0.828, 0.806, 0.742, and 0.758, respectively. The climate, satellite data, and meteorological index inputs from all growth stages were essential for maize yield prediction, especially those from the late growth stages. Across the four machine learning algorithms, R improved by about 0.126, 0.117, and 0.143 when climate data from the early, peak, and late stages, respectively, were added to satellite data and meteorological indices from all stages. R increased by 0.016, 0.016, and 0.017 when satellite data from the early, peak, and late stages, respectively, were added to climate data and meteorological indices from all stages. R increased by 0.003, 0.032, and 0.042 when meteorological indices from the early, peak, and late stages, respectively, were added to climate and satellite data from all stages. The analysis found that the spatial divergences were large, and the R value in the Northwest region reached 0.942, 0.904, 0.934, and 0.850 for Cubist, RF, SVM, and Xgboost, respectively. This study highlights the advantages of using climate data, satellite data, and meteorological indices for large-scale maize yield estimation with machine learning algorithms.
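The abstract above compares models and feature sets through the correlation coefficient R between predicted and observed yield. As a rough illustration of that measure only, the sketch below computes Pearson's R in plain Python. The function name and the sample yield values are hypothetical and are not taken from the study; whether the authors used exactly this estimator is an assumption.

```python
from math import sqrt


def pearson_r(observed, predicted):
    """Pearson correlation coefficient between two equal-length sequences."""
    n = len(observed)
    if n != len(predicted) or n < 2:
        raise ValueError("need two sequences of equal length with at least 2 values")
    mean_o = sum(observed) / n
    mean_p = sum(predicted) / n
    cov = sum((o - mean_o) * (p - mean_p) for o, p in zip(observed, predicted))
    var_o = sum((o - mean_o) ** 2 for o in observed)
    var_p = sum((p - mean_p) ** 2 for p in predicted)
    if var_o == 0 or var_p == 0:
        raise ValueError("correlation is undefined for a constant sequence")
    return cov / sqrt(var_o * var_p)


# Made-up yields (t/ha) for five hypothetical cities, for illustration only.
observed_yield = [5.1, 6.3, 4.8, 7.0, 5.9]
predicted_yield = [5.4, 6.0, 5.0, 6.6, 6.1]
r_value = pearson_r(observed_yield, predicted_yield)
```

The gain from adding a feature group, as reported in the abstract, would then be the difference between the R obtained with and without that group on the same evaluation data.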

Zhang S, Duan Z, Yang W, Qian C, You Y. iDHS-DASTS: identifying DNase I hypersensitive sites based on LASSO and stacking learning. Mol Omics 2021;17:130-141. [PMID: 33295914 DOI: 10.1039/d0mo00115e] [Citation(s) in RCA: 5] [Impact Index Per Article: 1.7] [Indexed: 11/21/2022]

Zhang S, Qiao H. KD-KLNMF: Identification of lncRNAs subcellular localization with multiple features and nonnegative matrix factorization. Anal Biochem 2020;610:113995. [PMID: 33080214 DOI: 10.1016/j.ab.2020.113995] [Citation(s) in RCA: 10] [Impact Index Per Article: 2.5] [Received: 06/28/2020] [Revised: 09/07/2020] [Accepted: 10/12/2020] [Indexed: 12/18/2022]

Alphonse AS, Mary NAB, Starvin MS. Classification of membrane protein using Tetra Peptide Pattern. Anal Biochem 2020;606:113845. [PMID: 32739352 DOI: 10.1016/j.ab.2020.113845] [Citation(s) in RCA: 2] [Impact Index Per Article: 0.5] [Received: 04/16/2020] [Revised: 06/17/2020] [Accepted: 06/22/2020] [Indexed: 11/29/2022]

Zhang X, Chen L. Prediction of membrane protein types by fusing protein-protein interaction and protein sequence information. Biochimica et Biophysica Acta - Proteins and Proteomics 2020;1868:140524. [PMID: 32858174 DOI: 10.1016/j.bbapap.2020.140524] [Citation(s) in RCA: 5] [Impact Index Per Article: 1.3] [Received: 05/26/2020] [Revised: 07/17/2020] [Accepted: 07/30/2020] [Indexed: 11/30/2022]

Identification of membrane protein types via multivariate information fusion with Hilbert–Schmidt Independence Criterion. Neurocomputing 2020. [DOI: 10.1016/j.neucom.2019.11.103] [Citation(s) in RCA: 88] [Impact Index Per Article: 22.0] [Indexed: 11/24/2022]

Xu ZC, Xiao X, Qiu WR, Wang P, Fang XZ. iAI-DSAE: A Computational Method for Adenosine to Inosine Editing Site Prediction. Lett Org Chem 2019. [DOI: 10.2174/1570178615666181016112546] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 11/22/2022]

Abstract

As an important post-transcriptional modification, adenosine-to-inosine RNA editing generally occurs in both coding and noncoding RNA transcripts, in which adenosines are converted to inosines. This modification can therefore diversify the transcriptome. Accurately identifying adenosine-to-inosine editing sites is important for further understanding their biological functions. Currently, adenosine-to-inosine editing sites are determined by experimental methods, which can be costly and time-consuming. Furthermore, only a few computational prediction models exist in this field. Therefore, this study sets out to develop further computational methods to address these problems. Given an uncharacterized RNA sequence that contains many adenosine residues, can we identify which of them can be converted to inosine and which cannot? To deal with this problem, a novel predictor called iAI-DSAE is proposed in the current study. There are two key issues to address: one is ‘what feature extraction methods should be adopted to formulate the given sample sequence?’ The other is ‘what classification algorithms should be used to construct the classification model?’ For the former, a 540-dimensional feature vector is extracted to formulate the sample sequence by dinucleotide-based auto-cross covariance, pseudo dinucleotide composition, and nucleotide density methods. For the latter, we use a currently popular method, the deep sparse autoencoder, to construct the classification model. Generally, ACC and MCC are considered two of the most important performance indicators of a predictor. In this study, compared with those of the predictor PAI, they are up 2.46% and 4.14%, respectively. The two other indicators, Sn and Sp, also rise to a certain degree. This indicates that our predictor can serve as an important complementary tool for identifying adenosine-to-inosine RNA editing sites. For the convenience of most experimental scientists, an easy-to-use web server for identifying adenosine-to-inosine editing sites has been established at: http://www.jci-bioinfo.cn/iAI-DSAE, by which users can easily obtain their desired results without the need to go through the complicated mathematical equations involved. It is important to identify adenosine-to-inosine editing sites in RNA sequences for the intensive study of RNA function and the development of new medicines. In the current study, a novel predictor, called iAI-DSAE, was proposed by using three feature extraction methods: dinucleotide-based auto-cross covariance, pseudo dinucleotide composition, and nucleotide density. The jackknife test results of the iAI-DSAE predictor, based on the deep sparse autoencoder model, show that our predictor is more stable and reliable. It has not escaped our notice that the methods proposed in the current paper can be used to solve many other problems in genome analysis.
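The abstract above evaluates the predictor with a jackknife test and reports ACC, MCC, Sn, and Sp. The sketch below shows one common reading of those terms: leave-one-out evaluation followed by the four metrics computed from the resulting confusion counts. It is an illustration under stated assumptions, not the iAI-DSAE implementation. All function names are hypothetical, and `nearest_mean_classifier` is a toy stand-in on one-dimensional features, used only so the example runs without external libraries.

```python
from math import sqrt


def jackknife_counts(samples, labels, fit_predict):
    """Leave-one-out test: hold out each sample once, train on the rest,
    predict the held-out sample, and tally the confusion counts."""
    tp = fp = tn = fn = 0
    for i in range(len(samples)):
        train_x = samples[:i] + samples[i + 1:]
        train_y = labels[:i] + labels[i + 1:]
        predicted = fit_predict(train_x, train_y, samples[i])
        if labels[i] == 1:
            if predicted == 1:
                tp += 1
            else:
                fn += 1
        else:
            if predicted == 1:
                fp += 1
            else:
                tn += 1
    return tp, fp, tn, fn


def nearest_mean_classifier(train_x, train_y, query):
    """Toy stand-in classifier: assign the class whose training mean is closer."""
    pos = [x for x, y in zip(train_x, train_y) if y == 1]
    neg = [x for x, y in zip(train_x, train_y) if y == 0]
    pos_mean = sum(pos) / len(pos)
    neg_mean = sum(neg) / len(neg)
    return 1 if abs(query - pos_mean) < abs(query - neg_mean) else 0


def summarize(tp, fp, tn, fn):
    """Return sensitivity (Sn), specificity (Sp), accuracy (ACC), and MCC."""
    sn = tp / (tp + fn) if tp + fn else 0.0
    sp = tn / (tn + fp) if tn + fp else 0.0
    acc = (tp + tn) / (tp + fp + tn + fn)
    denom = sqrt((tp + fp) * (tp + fn) * (tn + fp) * (tn + fn))
    mcc = (tp * tn - fp * fn) / denom if denom else 0.0
    return {"Sn": sn, "Sp": sp, "ACC": acc, "MCC": mcc}


# Made-up one-dimensional features and labels, for illustration only.
features = [0.1, 0.2, 0.3, 0.9, 1.0, 1.1]
classes = [0, 0, 0, 1, 1, 1]
metrics = summarize(*jackknife_counts(features, classes, nearest_mean_classifier))
```

Because every sample is held out exactly once, the jackknife result does not depend on a random split, which is one reason this test is often preferred for small benchmark datasets.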

Jayapriya K, Mary NAB. Employing a novel 2-gram subgroup intra pattern (2GSIP) with stacked auto encoder for membrane protein classification. Mol Biol Rep 2019;46:2259-2272. [PMID: 30778923 DOI: 10.1007/s11033-019-04680-3] [Citation(s) in RCA: 14] [Impact Index Per Article: 2.8] [Received: 10/29/2018] [Accepted: 02/07/2019] [Indexed: 12/01/2022]

Prediction of membrane protein types by exploring local discriminative information from evolutionary profiles. Anal Biochem 2019;564-565:123-132. [DOI: 10.1016/j.ab.2018.10.027] [Citation(s) in RCA: 12] [Impact Index Per Article: 2.4] [Received: 08/25/2018] [Revised: 10/23/2018] [Accepted: 10/25/2018] [Indexed: 11/17/2022]

Sankari ES, Manimegalai D. Predicting membrane protein types by incorporating a novel feature set into Chou's general PseAAC. J Theor Biol 2018;455:319-328. [DOI: 10.1016/j.jtbi.2018.07.032] [Citation(s) in RCA: 36] [Impact Index Per Article: 6.0] [Received: 03/22/2018] [Revised: 06/27/2018] [Accepted: 07/23/2018] [Indexed: 10/28/2022]

Huang G, Li J, Zhao C. Computational Prediction and Analysis of Associations between Small Molecules and Binding-Associated S-Nitrosylation Sites. Molecules 2018;23:molecules23040954. [PMID: 29671802 PMCID: PMC6017196 DOI: 10.3390/molecules23040954] [Citation(s) in RCA: 4] [Impact Index Per Article: 0.7] [Received: 02/24/2018] [Revised: 03/30/2018] [Accepted: 04/09/2018] [Indexed: 01/12/2023] Open

iMem-2LSAAC: A two-level model for discrimination of membrane proteins and their types by extending the notion of SAAC into Chou's pseudo amino acid composition. J Theor Biol 2018;442:11-21. [DOI: 10.1016/j.jtbi.2018.01.008] [Citation(s) in RCA: 83] [Impact Index Per Article: 13.8] [Received: 10/02/2017] [Revised: 12/23/2017] [Accepted: 01/10/2018] [Indexed: 02/08/2023]

Sankari ES, Manimegalai D. Predicting membrane protein types using various decision tree classifiers based on various modes of general PseAAC for imbalanced datasets. J Theor Biol 2017;435:208-217. [PMID: 28941868 DOI: 10.1016/j.jtbi.2017.09.018] [Citation(s) in RCA: 24] [Impact Index Per Article: 3.4] [Received: 04/13/2017] [Revised: 09/15/2017] [Accepted: 09/18/2017] [Indexed: 12/19/2022]

Xu ZC, Wang P, Qiu WR, Xiao X. iSS-PC: Identifying Splicing Sites via Physical-Chemical Properties Using Deep Sparse Auto-Encoder. Sci Rep 2017;7:8222. [PMID: 28811565 PMCID: PMC5557945 DOI: 10.1038/s41598-017-08523-8] [Citation(s) in RCA: 20] [Impact Index Per Article: 2.9] [Received: 05/15/2017] [Accepted: 07/10/2017] [Indexed: 12/13/2022] Open

Butt AH, Rasool N, Khan YD. A Treatise to Computational Approaches Towards Prediction of Membrane Protein and Its Subtypes. J Membr Biol 2016;250:55-76. [DOI: 10.1007/s00232-016-9937-7] [Citation(s) in RCA: 34] [Impact Index Per Article: 4.3] [Received: 04/24/2016] [Accepted: 11/02/2016] [Indexed: 10/20/2022]

Xiao X, Hui M, Liu Z. iAFP-Ense: An Ensemble Classifier for Identifying Antifreeze Protein by Incorporating Grey Model and PSSM into PseAAC. J Membr Biol 2016;249:845-854. [DOI: 10.1007/s00232-016-9935-9] [Citation(s) in RCA: 19] [Impact Index Per Article: 2.4] [Received: 07/26/2016] [Accepted: 10/24/2016] [Indexed: 12/12/2022]

A Prediction Model for Membrane Proteins Using Moments Based Features. BioMed Research International 2016;2016:8370132. [PMID: 26966690 PMCID: PMC4761391 DOI: 10.1155/2016/8370132] [Citation(s) in RCA: 39] [Impact Index Per Article: 4.9] [Received: 11/27/2015] [Accepted: 01/12/2016] [Indexed: 01/29/2023]

Wu CY, Li QZ, Feng ZX. Non-coding RNA identification based on topology secondary structure and reading frame in organelle genome level. Genomics 2015;107:9-15. [PMID: 26697761 DOI: 10.1016/j.ygeno.2015.12.002] [Citation(s) in RCA: 7] [Impact Index Per Article: 0.8] [Received: 10/16/2015] [Revised: 12/08/2015] [Accepted: 12/12/2015] [Indexed: 10/22/2022]

Tripathi V, Tripathi P, Gupta D. Statistical approach for lysosomal membrane proteins (LMPs) identification. Systems and Synthetic Biology 2014;8:313-9. [PMID: 26396655 PMCID: PMC4571724 DOI: 10.1007/s11693-014-9153-7] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 03/11/2014] [Revised: 06/11/2014] [Accepted: 07/26/2014] [Indexed: 10/25/2022]

Chou's pseudo amino acid composition improves sequence-based antifreeze protein prediction. J Theor Biol 2014;356:30-5. [DOI: 10.1016/j.jtbi.2014.04.006] [Citation(s) in RCA: 116] [Impact Index Per Article: 11.6] [Received: 02/25/2014] [Revised: 03/28/2014] [Accepted: 04/02/2014] [Indexed: 11/22/2022]

A Multi-label Classifier for Prediction Membrane Protein Functional Types in Animal. J Membr Biol 2014;247:1141-8. [DOI: 10.1007/s00232-014-9708-2] [Citation(s) in RCA: 9] [Impact Index Per Article: 0.9] [Received: 04/14/2014] [Accepted: 07/14/2014] [Indexed: 11/26/2022]

Piao H, Froula J, Du C, Kim TW, Hawley ER, Bauer S, Wang Z, Ivanova N, Clark DS, Klenk HP, Hess M. Identification of novel biomass-degrading enzymes from genomic dark matter: Populating genomic sequence space with functional annotation. Biotechnol Bioeng 2014;111:1550-65. [PMID: 24728961 DOI: 10.1002/bit.25250] [Citation(s) in RCA: 20] [Impact Index Per Article: 2.0] [Received: 11/23/2013] [Revised: 02/21/2014] [Accepted: 03/24/2014] [Indexed: 11/06/2022]

Prediction of multi-type membrane proteins in human by an integrated approach. PLoS One 2014;9:e93553. [PMID: 24676214 PMCID: PMC3968155 DOI: 10.1371/journal.pone.0093553] [Citation(s) in RCA: 15] [Impact Index Per Article: 1.5] [Received: 10/23/2013] [Accepted: 03/05/2014] [Indexed: 11/29/2022] Open

Han GS, Yu ZG, Anh V. A two-stage SVM method to predict membrane protein types by incorporating amino acid classifications and physicochemical properties into a general form of Chou's PseAAC. J Theor Biol 2013;344:31-9. [PMID: 24316387 DOI: 10.1016/j.jtbi.2013.11.017] [Citation(s) in RCA: 49] [Impact Index Per Article: 4.5] [Received: 08/16/2013] [Revised: 10/16/2013] [Accepted: 11/24/2013] [Indexed: 01/12/2023]

Fan GL, Li QZ. Discriminating bioluminescent proteins by incorporating average chemical shift and evolutionary information into the general form of Chou's pseudo amino acid composition. J Theor Biol 2013;334:45-51. [DOI: 10.1016/j.jtbi.2013.06.003] [Citation(s) in RCA: 39] [Impact Index Per Article: 3.5] [Received: 04/24/2013] [Revised: 05/30/2013] [Accepted: 06/03/2013] [Indexed: 01/22/2023]

Xiaohui N, Nana L, Jingbo X, Dingyan C, Yuehua P, Yang X, Weiquan W, Dongming W, Zengzhen W. Using the concept of Chou's pseudo amino acid composition to predict protein solubility: An approach with entropies in information theory. J Theor Biol 2013;332:211-7. [DOI: 10.1016/j.jtbi.2013.03.010] [Citation(s) in RCA: 34] [Impact Index Per Article: 3.1] [Received: 08/18/2012] [Revised: 03/10/2013] [Accepted: 03/11/2013] [Indexed: 11/15/2022]

Tripathi V, Gupta DK. Discriminating lysosomal membrane protein types using dynamic neural network. J Biomol Struct Dyn 2013;32:1575-82. [PMID: 23968467 DOI: 10.1080/07391102.2013.827133] [Citation(s) in RCA: 6] [Impact Index Per Article: 0.5] [Indexed: 10/26/2022]

Liu B, Wang X, Zou Q, Dong Q, Chen Q. Protein Remote Homology Detection by Combining Chou’s Pseudo Amino Acid Composition and Profile-Based Protein Representation. Mol Inform 2013;32:775-82. [DOI: 10.1002/minf.201300084] [Citation(s) in RCA: 97] [Impact Index Per Article: 8.8] [Received: 05/04/2013] [Accepted: 06/11/2013] [Indexed: 11/12/2022]

Predicting acidic and alkaline enzymes by incorporating the average chemical shift and gene ontology informations into the general form of Chou's PseAAC. Process Biochem 2013. [DOI: 10.1016/j.procbio.2013.05.012] [Citation(s) in RCA: 36] [Impact Index Per Article: 3.3] [Indexed: 11/20/2022]

Yu L, Luo J, Guo Y, Li Y, Pu X, Li M. In silico identification of Gram-negative bacterial secreted proteins from primary sequence. Comput Biol Med 2013;43:1177-81. [PMID: 23930811 DOI: 10.1016/j.compbiomed.2013.06.001] [Citation(s) in RCA: 11] [Impact Index Per Article: 1.0] [Received: 01/10/2013] [Revised: 05/30/2013] [Accepted: 06/04/2013] [Indexed: 11/26/2022]

Li T, Li QZ, Liu S, Fan GL, Zuo YC, Peng Y. PreDNA: accurate prediction of DNA-binding sites in proteins by integrating sequence and geometric structure information. Bioinformatics 2013;29:678-85. [PMID: 23335013 DOI: 10.1093/bioinformatics/btt029] [Citation(s) in RCA: 29] [Impact Index Per Article: 2.6] [Indexed: 02/03/2023]

Lei JB, Yin JB, Shen HB. GFO: A data driven approach for optimizing the Gaussian function based similarity metric in computational biology. Neurocomputing 2013. [DOI: 10.1016/j.neucom.2012.07.003] [Citation(s) in RCA: 10] [Impact Index Per Article: 0.9] [Indexed: 11/30/2022]

Ding C, Yuan LF, Guo SH, Lin H, Chen W. Identification of mycobacterial membrane proteins and their types using over-represented tripeptide compositions. J Proteomics 2012;77:321-8. [DOI: 10.1016/j.jprot.2012.09.006] [Citation(s) in RCA: 82] [Impact Index Per Article: 6.8] [Received: 05/11/2012] [Revised: 08/18/2012] [Accepted: 09/08/2012] [Indexed: 11/25/2022]

Chen YK, Li KB. Predicting membrane protein types by incorporating protein topology, domains, signal peptides, and physicochemical properties into the general form of Chou's pseudo amino acid composition. J Theor Biol 2012;318:1-12. [PMID: 23137835 DOI: 10.1016/j.jtbi.2012.10.033] [Citation(s) in RCA: 98] [Impact Index Per Article: 8.2] [Received: 09/21/2012] [Revised: 10/25/2012] [Accepted: 10/26/2012] [Indexed: 01/04/2023]

Li T, Li QZ. Annotating the protein-RNA interaction sites in proteins using evolutionary information and protein backbone structure. J Theor Biol 2012;312:55-64. [PMID: 22874580 DOI: 10.1016/j.jtbi.2012.07.020] [Citation(s) in RCA: 5] [Impact Index Per Article: 0.4] [Received: 01/16/2012] [Revised: 07/19/2012] [Accepted: 07/21/2012] [Indexed: 12/11/2022]

Predict mycobacterial proteins subcellular locations by incorporating pseudo-average chemical shift into the general form of Chou’s pseudo amino acid composition. J Theor Biol 2012;304:88-95. [DOI: 10.1016/j.jtbi.2012.03.017] [Citation(s) in RCA: 89] [Impact Index Per Article: 7.4] [Received: 12/17/2011] [Revised: 03/13/2012] [Accepted: 03/14/2012] [Indexed: 11/18/2022]

Hayat M, Khan A. Mem-PHybrid: hybrid features-based prediction system for classifying membrane protein types. Anal Biochem 2012;424:35-44. [PMID: 22342883 DOI: 10.1016/j.ab.2012.02.007] [Citation(s) in RCA: 38] [Impact Index Per Article: 3.2] [Received: 08/27/2011] [Revised: 02/04/2012] [Accepted: 02/06/2012] [Indexed: 11/29/2022]

Su CH, Pal NR, Lin KL, Chung IF. Identification of amino acid propensities that are strong determinants of linear B-cell epitope using neural networks. PLoS One 2012;7:e30617. [PMID: 22347389 PMCID: PMC3275595 DOI: 10.1371/journal.pone.0030617] [Citation(s) in RCA: 14] [Impact Index Per Article: 1.2] [Received: 08/23/2011] [Accepted: 12/22/2011] [Indexed: 11/18/2022] Open

Abstract

BACKGROUND

Identifying amino acid propensities that are strong determinants of linear B-cell epitopes is very important for enriching our knowledge about epitopes. It can also help to obtain better epitope prediction. Typical linear B-cell epitope prediction methods combine various propensities in different ways to improve prediction accuracy. However, fewer but better features may yield better prediction. Moreover, for a given propensity, a sequence of length k yields k values, which should be treated as a single unit for feature selection; hence the usual feature selection methods will not work. Here we use a novel Group Feature Selecting Multilayered Perceptron, GFSMLP, which treats a group of related information as a single entity, selects useful propensities related to linear B-cell epitopes, and uses them to predict epitopes.

METHODOLOGY/PRINCIPAL FINDINGS

We use eight widely known propensities and four data sets. We use GFSMLP to rank propensities by the frequency with which they are selected. We find that Chou's beta-turn and Ponnuswamy's polarity are better features for prediction of linear B-cell epitope. We examine the individual and combined discriminating power of the selected propensities and analyze the correlation between paired propensities. Our results show that the selected propensities are indeed good features, which also cooperate with other propensities to enhance the discriminating power for predicting epitopes. We find that individually polarity is not the best predictor, but it collaborates with others to yield good prediction. Usual feature selection methods cannot provide such information.

CONCLUSIONS/SIGNIFICANCE

Our results confirm the effectiveness of active (group) feature selection by GFSMLP over the traditional passive approaches of evaluating various combinations of propensities. The GFSMLP-based feature selection can be extended to more than 500 remaining propensities to enhance our biological knowledge about epitopes and to obtain better prediction. A graphical-user-interface version of GFSMLP is available at: http://bio.classcloud.org/GFSMLP/.
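The abstract above rests on the idea that one propensity applied to a peptide of length k yields k values that must be kept or dropped together. The sketch below illustrates only that grouping idea; it does not implement GFSMLP. The function names are hypothetical. The Kyte-Doolittle hydropathy scale is used as a stand-in because the abstract's own scales (Chou's beta-turn, Ponnuswamy's polarity) are not reproduced in this listing, and `CHARGED` is a made-up toy scale.

```python
# Kyte-Doolittle hydropathy values, used here only as a stand-in propensity scale.
HYDROPATHY = {
    "A": 1.8, "R": -4.5, "N": -3.5, "D": -3.5, "C": 2.5,
    "Q": -3.5, "E": -3.5, "G": -0.4, "H": -3.2, "I": 4.5,
    "L": 3.8, "K": -3.9, "M": 1.9, "F": 2.8, "P": -1.6,
    "S": -0.8, "T": -0.7, "W": -0.9, "Y": -1.3, "V": 4.2,
}
# Hypothetical toy scale: 1.0 for D, E, K, R, H and 0.0 for all other residues.
CHARGED = {aa: (1.0 if aa in "DEKRH" else 0.0) for aa in HYDROPATHY}


def grouped_features(peptide, scales):
    """Map a length-k peptide to one group of k values per propensity scale."""
    return {name: [scale[aa] for aa in peptide] for name, scale in scales.items()}


def apply_group_mask(groups, selected):
    """Keep or drop each propensity as a whole unit, then flatten the
    surviving groups (in name order) into a single feature vector."""
    vector = []
    for name in sorted(groups):
        if name in selected:
            vector.extend(groups[name])
    return vector


groups = grouped_features("AKL", {"hydropathy": HYDROPATHY, "charged": CHARGED})
only_hydropathy = apply_group_mask(groups, {"hydropathy"})
```

An ordinary feature selector would score each of the k positions separately and could keep, say, position 2 of one scale while discarding positions 1 and 3; selecting at the group level avoids that and answers the question the abstract actually asks, namely which propensities matter.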

Identification of voltage-gated potassium channel subfamilies from sequence information using support vector machine. Comput Biol Med 2012;42:504-7. [PMID: 22297432 DOI: 10.1016/j.compbiomed.2012.01.003] [Citation(s) in RCA: 40] [Impact Index Per Article: 3.3] [Received: 12/31/2010] [Revised: 10/16/2011] [Accepted: 01/12/2012] [Indexed: 02/05/2023]