Reference Citation Analysis: Find an Article, Find a Category, Find a Journal, Find a Scholar

For: Yang Y, Tantoso E, Li KB. Remote protein homology detection using recurrence quantification analysis and amino acid physicochemical properties. J Theor Biol 2008;252:145-54. [PMID: 18342336 DOI: 10.1016/j.jtbi.2008.01.028] [Citation(s) in RCA: 25] [Impact Index Per Article: 1.6] [Received: 10/12/2007] [Revised: 11/29/2007] [Accepted: 01/26/2008] [Indexed: 11/29/2022]

Cited by Other Article(s)

Ahmad HI, Ijaz N, Afzal G, Asif AR, ur Rehman A, Rahman A, Ahmed I, Yousaf M, Elokil A, Muhammad SA, Albogami SM, Alotaibi SS. Computational Insights into the Structural and Functional Impacts of nsSNPs of Bone Morphogenetic Proteins. BioMed Research International 2022;2022:4013729. [PMID: 35832847 PMCID: PMC9273450 DOI: 10.1155/2022/4013729] [Citation(s) in RCA: 4] [Impact Index Per Article: 2.0] [Received: 04/23/2022] [Accepted: 06/15/2022] [Indexed: 12/12/2022]

Abstract

BMPs (bone morphogenetic proteins) are multipurpose, secreted cytokines of the transforming growth factor (TGF) superfamily. These glycoproteins, acting as disulfide-linked homo- or heterodimers, are highly potent regulators of bone and cartilage production and repair, cell proliferation throughout embryonic development, and bone homeostasis in adults. Because genetic variation might influence structural functions, this study aimed to determine the pathogenic effect of nonsynonymous single-nucleotide polymorphisms (nsSNPs) in BMP genes. The implications of these variations, investigated using computational analysis and molecular models of the mature TGF-β domain, revealed the impact of modifications on the function of BMP protein. Three-dimensional (3D) structure analysis was performed on the nsSNPs Y316S, V386G, E387G, C389G, and C391G in the TGF-β domain of chicken BMP2 and on the nsSNPs H344P, S347P, and V357A in the TGF-β domain of chicken BMP4, all of which were predicted to be harmful and of high risk. The ability of proteins to perform a variety of tasks and to interact with other molecules depends on their tertiary structural composition. The current analysis revealed the four most damaging variants (Y316S, V386G, E387G, C389G, and C391G), which are highly conserved and functional and are located in the TGF-β domain of BMP2 and BMP4. The amino acid substitutions E387G, C389G, and C391G are located in the binding region. Mutations in the TGF-β domain caused significant changes in its structural organization, including the substrate binding sites. These findings will assist future research on the role of these variants in BMP loss of function and in skeletal disorders, and may help to develop practical strategies for treating bone-related conditions.

Liu B, Li S. ProtDet-CCH: Protein Remote Homology Detection by Combining Long Short-Term Memory and Ranking Methods. IEEE/ACM Transactions on Computational Biology and Bioinformatics 2019;16:1203-1210. [PMID: 29993950 DOI: 10.1109/tcbb.2018.2789880] [Citation(s) in RCA: 21] [Impact Index Per Article: 4.2] [Indexed: 06/08/2023]

Brick TR, Gray AL, Staples AD. Recurrence Quantification for the Analysis of Coupled Processes in Aging. J Gerontol B Psychol Sci Soc Sci 2017;73:134-147. [PMID: 28958046 DOI: 10.1093/geronb/gbx018] [Citation(s) in RCA: 7] [Impact Index Per Article: 1.0] [Received: 07/07/2016] [Accepted: 02/11/2017] [Indexed: 11/14/2022]

Karain WI. Detecting transitions in protein dynamics using a recurrence quantification analysis based bootstrap method. BMC Bioinformatics 2017;18:525. [PMID: 29179670 PMCID: PMC5704401 DOI: 10.1186/s12859-017-1943-y] [Citation(s) in RCA: 6] [Impact Index Per Article: 0.9] [Received: 08/11/2017] [Accepted: 11/15/2017] [Indexed: 11/17/2022]

Li S, Chen J, Liu B. Protein remote homology detection based on bidirectional long short-term memory. BMC Bioinformatics 2017;18:443. [PMID: 29017445 PMCID: PMC5634958 DOI: 10.1186/s12859-017-1842-2] [Citation(s) in RCA: 41] [Impact Index Per Article: 5.9] [Received: 07/19/2017] [Accepted: 09/21/2017] [Indexed: 01/05/2023]

Konopka BM, Marciniak M, Dyrka W. Quantiprot - a Python package for quantitative analysis of protein sequences. BMC Bioinformatics 2017;18:339. [PMID: 28716000 PMCID: PMC5512976 DOI: 10.1186/s12859-017-1751-4] [Citation(s) in RCA: 7] [Impact Index Per Article: 1.0] [Received: 03/29/2017] [Accepted: 07/05/2017] [Indexed: 11/17/2022]

Chen J, Guo M, Wang X, Liu B. A comprehensive review and comparison of different computational methods for protein remote homology detection. Brief Bioinform 2016;19:231-244. [DOI: 10.1093/bib/bbw108] [Citation(s) in RCA: 81] [Impact Index Per Article: 10.1] [Received: 08/03/2016] [Indexed: 01/02/2023]

Koyano H, Hayashida M, Akutsu T. Maximum margin classifier working in a set of strings. Proc Math Phys Eng Sci 2016;472:20150551. [PMID: 27118908 PMCID: PMC4841474 DOI: 10.1098/rspa.2015.0551] [Citation(s) in RCA: 1] [Impact Index Per Article: 0.1] [Received: 08/10/2015] [Accepted: 02/02/2016] [Indexed: 11/12/2022]

Oh Brother, Where Art Thou? Finding Orthologs in the Twilight and Midnight Zones of Sequence Similarity. Evol Biol 2016. [DOI: 10.1007/978-3-319-41324-2_22] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 10/21/2022]

Liu B, Chen J, Wang X. Protein remote homology detection by combining Chou’s distance-pair pseudo amino acid composition and principal component analysis. Mol Genet Genomics 2015;290:1919-31. [DOI: 10.1007/s00438-015-1044-4] [Citation(s) in RCA: 61] [Impact Index Per Article: 6.8] [Received: 03/05/2015] [Accepted: 04/06/2015] [Indexed: 02/07/2023]

Bedoya O, Tischer I. Reducing dimensionality in remote homology detection using predicted contact maps. Comput Biol Med 2015;59:64-72. [DOI: 10.1016/j.compbiomed.2015.01.020] [Citation(s) in RCA: 4] [Impact Index Per Article: 0.4] [Received: 10/22/2014] [Revised: 01/05/2015] [Accepted: 01/22/2015] [Indexed: 11/28/2022]

Detecting protein atom correlations using correlation of probability of recurrence. Proteins 2014;82:2180-9. [DOI: 10.1002/prot.24574] [Citation(s) in RCA: 11] [Impact Index Per Article: 1.1] [Received: 03/11/2014] [Accepted: 03/29/2014] [Indexed: 11/07/2022]

Remote homology detection incorporating the context of physicochemical properties. Comput Biol Med 2014;45:43-50. [PMID: 24480162 DOI: 10.1016/j.compbiomed.2013.11.012] [Citation(s) in RCA: 3] [Impact Index Per Article: 0.3] [Received: 07/21/2013] [Revised: 11/10/2013] [Accepted: 11/18/2013] [Indexed: 11/22/2022]

Liu B, Xu J, Zou Q, Xu R, Wang X, Chen Q. Using distances between Top-n-gram and residue pairs for protein remote homology detection. BMC Bioinformatics 2014;15 Suppl 2:S3. [PMID: 24564580 PMCID: PMC4015815 DOI: 10.1186/1471-2105-15-s2-s3] [Citation(s) in RCA: 42] [Impact Index Per Article: 4.2] [Indexed: 11/22/2022]

Abstract

Background

Protein remote homology detection is one of the central problems in bioinformatics, which is important for both basic research and practical application. Currently, discriminative methods based on Support Vector Machines (SVMs) achieve the state-of-the-art performance. Exploring feature vectors incorporating the position information of amino acids or other protein building blocks is a key step to improve the performance of the SVM-based methods.

Results

Two new methods for protein remote homology detection were proposed, called SVM-DR and SVM-DT. SVM-DR is a sequence-based method in which the feature vector representation of a protein is based on the distances between residue pairs. SVM-DT is a profile-based method that considers the distances between Top-n-gram pairs. A Top-n-gram can be viewed as a profile-based building block of proteins, calculated from the frequency profiles. Both methods are position-dependent approaches that incorporate the sequence-order information of protein sequences. Various experiments were conducted on a benchmark dataset containing 54 families and 23 superfamilies. Experimental results showed that these two new methods are very promising: compared with position-independent methods, the performance improvement is obvious. Furthermore, the proposed methods can also provide useful insights for studying the features of protein families.

Conclusion

The better performance of the proposed methods demonstrates that position-dependent approaches are effective for protein remote homology detection. Another advantage of our methods arises from the explicit feature space representation, which can be used to analyze the characteristic features of protein families. The source code of SVM-DT and SVM-DR is available at http://bioinformatics.hitsz.edu.cn/DistanceSVM/index.jsp
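The idea of a feature vector built from distances between residue pairs can be illustrated with a short sketch. This is not the published SVM-DR implementation; it assumes that a feature is simply the count of each ordered residue pair observed at each separation from 1 to a hypothetical `max_distance`, and the function name and parameters are invented for illustration.

```python
# Illustrative sketch only: counts ordered residue pairs at fixed separations.
# The exact SVM-DR feature definition may differ; names here are hypothetical.
AMINO_ACIDS = "ACDEFGHIKLMNPQRSTVWY"
AA_INDEX = {aa: i for i, aa in enumerate(AMINO_ACIDS)}


def residue_pair_distance_features(sequence, max_distance=3):
    """Return a fixed-length vector of pair counts, one slot per
    (distance, first residue, second residue) combination."""
    n = len(AMINO_ACIDS)
    features = [0] * (max_distance * n * n)
    for d in range(1, max_distance + 1):
        for i in range(len(sequence) - d):
            first, second = sequence[i], sequence[i + d]
            # Skip non-standard residue codes such as 'X'.
            if first not in AA_INDEX or second not in AA_INDEX:
                continue
            slot = (d - 1) * n * n + AA_INDEX[first] * n + AA_INDEX[second]
            features[slot] += 1
    return features
```

Because every sequence maps to a vector of the same length regardless of its own length, vectors of this kind can be passed directly to an SVM, and each slot stays interpretable as a specific residue pair at a specific distance.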

Kuksa PP. Biological sequence classification with multivariate string kernels. IEEE/ACM Transactions on Computational Biology and Bioinformatics 2013;10:1201-1210. [PMID: 24384708 DOI: 10.1109/tcbb.2013.15] [Citation(s) in RCA: 4] [Impact Index Per Article: 0.4] [Indexed: 06/03/2023]

Huang CH, Chou SY, Ng KL. Improving protein complex classification accuracy using amino acid composition profile. Comput Biol Med 2013;43:1196-204. [PMID: 23930814 DOI: 10.1016/j.compbiomed.2013.05.026] [Citation(s) in RCA: 6] [Impact Index Per Article: 0.5] [Received: 10/28/2012] [Revised: 05/29/2013] [Accepted: 05/30/2013] [Indexed: 11/18/2022]

Han GS, Yu ZG, Anh V, Krishnajith APD, Tian YC. An ensemble method for predicting subnuclear localizations from primary protein structures. PLoS One 2013;8:e57225. [PMID: 23460833 PMCID: PMC3584121 DOI: 10.1371/journal.pone.0057225] [Citation(s) in RCA: 33] [Impact Index Per Article: 3.0] [Received: 07/18/2012] [Accepted: 01/18/2013] [Indexed: 12/04/2022]

Abstract

Background

Predicting protein subnuclear localization is a challenging problem. Previous works based on non-sequence information, including Gene Ontology annotations and kernel fusion, each have their own limitations. The aim of this work is twofold: first, to propose a novel individual feature extraction method; second, to develop an ensemble method that improves prediction performance using comprehensive information represented as a high-dimensional feature vector obtained by 11 feature extraction methods.

Methodology/Principal Findings

A novel two-stage multiclass support vector machine is proposed to predict protein subnuclear localizations. It only considers those feature extraction methods based on amino acid classifications and physicochemical properties. In order to speed up our system, an automatic search method for the kernel parameter is used. The prediction performance of our method is evaluated on four datasets: Lei dataset, multi-localization dataset, SNL9 dataset and a new independent dataset. The overall accuracy of prediction for 6 localizations on Lei dataset is 75.2% and that for 9 localizations on SNL9 dataset is 72.1% in the leave-one-out cross validation, 71.7% for the multi-localization dataset and 69.8% for the new independent dataset, respectively. Comparisons with those existing methods show that our method performs better for both single-localization and multi-localization proteins and achieves more balanced sensitivities and specificities on large-size and small-size subcellular localizations. The overall accuracy improvements are 4.0% and 4.7% for single-localization proteins and 6.5% for multi-localization proteins. The reliability and stability of our classification model are further confirmed by permutation analysis.

Conclusions

It can be concluded that our method is effective and valuable for predicting protein subnuclear localizations. A web server has been designed to implement the proposed method. It is freely available at http://bioinformatics.awowshop.com/snlpred_page.php.

Liu B, Wang X, Chen Q, Dong Q, Lan X. Using amino acid physicochemical distance transformation for fast protein remote homology detection. PLoS One 2012;7:e46633. [PMID: 23029559 PMCID: PMC3460876 DOI: 10.1371/journal.pone.0046633] [Citation(s) in RCA: 81] [Impact Index Per Article: 6.8] [Received: 04/18/2012] [Accepted: 09/03/2012] [Indexed: 11/18/2022]

Abstract

Protein remote homology detection is one of the most important problems in bioinformatics. Discriminative methods such as support vector machines (SVM) have shown superior performance. However, the performance of SVM-based methods depends on the vector representations of the protein sequences. Prior works have demonstrated that sequence-order effects are relevant for discrimination, but little work has explored how to incorporate the sequence-order information along with the amino acid physicochemical properties into the prediction. In order to incorporate the sequence-order effects into protein remote homology detection, the physicochemical distance transformation (PDT) method is proposed. Each protein sequence is converted into a series of numbers by using the physicochemical property scores in the amino acid index (AAIndex), and then the sequence is converted into a fixed-length vector by PDT. With this approach, the sequence-order information can be included in the feature vector at little computational cost. Finally, the feature vectors are input into a support vector machine classifier to detect protein remote homologies. Our experiments on a well-known benchmark show that the proposed method, SVM-PDT, achieves performance superior or comparable to current state-of-the-art methods, and its computational cost is considerably lower than that of other methods. When the evolutionary information extracted from the frequency profiles is combined with the PDT method, the profile-based PDT approach can improve the performance by 3.4% and 11.4% in terms of ROC score and ROC50 score, respectively. The local sequence-order information of the protein can be efficiently captured by the proposed PDT, and the physicochemical properties extracted from the amino acid index are incorporated into the prediction. The physicochemical distance transformation provides a general framework, which would be a valuable tool for protein-level study.
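The two steps described in this abstract, mapping residues to property scores and then reducing the numeric series to a fixed-length vector, can be sketched as follows. The sketch makes assumptions the abstract does not state: it uses the Kyte-Doolittle hydropathy scale as a stand-in for an AAIndex entry, and it takes the transformation to be the mean squared score difference between residues separated by each lag from 1 to a hypothetical `max_lag`. The function name is invented for illustration.

```python
# Illustrative sketch only: one property scale, one feature per lag.
# Kyte-Doolittle hydropathy values, used here as an example property scale.
KYTE_DOOLITTLE = {
    "A": 1.8, "R": -4.5, "N": -3.5, "D": -3.5, "C": 2.5,
    "Q": -3.5, "E": -3.5, "G": -0.4, "H": -3.2, "I": 4.5,
    "L": 3.8, "K": -3.9, "M": 1.9, "F": 2.8, "P": -1.6,
    "S": -0.8, "T": -0.7, "W": -0.9, "Y": -1.3, "V": 4.2,
}


def distance_transform_features(sequence, scale=KYTE_DOOLITTLE, max_lag=3):
    """Map residues to property scores, then return the mean squared
    difference between residues separated by lag 1..max_lag."""
    values = [scale[residue] for residue in sequence]
    features = []
    for lag in range(1, max_lag + 1):
        pairs = len(values) - lag
        if pairs <= 0:
            # Sequence too short for this lag: emit a neutral value.
            features.append(0.0)
            continue
        total = sum((values[i] - values[i + lag]) ** 2 for i in range(pairs))
        features.append(total / pairs)
    return features
```

The output length depends only on `max_lag` (and, with several property scales, on the number of scales), not on the sequence length, which is what allows sequences of different lengths to share one SVM input space.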

Liu X, Zhao L, Dong Q. Protein remote homology detection based on auto-cross covariance transformation. Comput Biol Med 2011;41:640-7. [DOI: 10.1016/j.compbiomed.2011.05.015] [Citation(s) in RCA: 19] [Impact Index Per Article: 1.5] [Received: 02/27/2010] [Revised: 05/03/2011] [Accepted: 05/24/2011] [Indexed: 11/26/2022]

Shah AR, Agarwal K, Baker ES, Singhal M, Mayampurath AM, Ibrahim YM, Kangas LJ, Monroe ME, Zhao R, Belov ME, Anderson GA, Smith RD. Machine learning based prediction for peptide drift times in ion mobility spectrometry. Bioinformatics 2010;26:1601-7. [PMID: 20495001 PMCID: PMC2913656 DOI: 10.1093/bioinformatics/btq245] [Citation(s) in RCA: 34] [Impact Index Per Article: 2.4] [Received: 01/06/2010] [Revised: 04/18/2010] [Accepted: 05/02/2010] [Indexed: 11/14/2022]

Webb-Robertson BJM, Ratuiste KG, Oehmen CS. Physicochemical property distributions for accurate and rapid pairwise protein homology detection. BMC Bioinformatics 2010;11:145. [PMID: 20302613 PMCID: PMC2851606 DOI: 10.1186/1471-2105-11-145] [Citation(s) in RCA: 17] [Impact Index Per Article: 1.2] [Received: 10/24/2009] [Accepted: 03/19/2010] [Indexed: 11/10/2022]

Bruni R, Costantino A, Tritarelli E, Marcantonio C, Ciccozzi M, Rapicetta M, El Sawaf G, Giuliani A, Ciccaglione AR. A computational approach identifies two regions of Hepatitis C Virus E1 protein as interacting domains involved in viral fusion process. BMC Structural Biology 2009;9:48. [PMID: 19640267 PMCID: PMC2732612 DOI: 10.1186/1472-6807-9-48] [Citation(s) in RCA: 9] [Impact Index Per Article: 0.6] [Received: 10/22/2008] [Accepted: 07/29/2009] [Indexed: 01/01/2023]

Abstract

Background

The E1 protein of Hepatitis C Virus (HCV) can be dissected into two distinct hydrophobic regions: a central domain containing a hypothetical fusion peptide (FP), and a C-terminal domain (CT) comprising two segments, a pre-anchor and a trans-membrane (TM) region. In the currently accepted model of the viral fusion process, the FP and the TM regions are considered to be closely juxtaposed in the post-fusion structure and their physical interaction cannot be excluded. In the present study, we took advantage of the natural sequence variability present among HCV strains to test, by purely sequence-based computational tools, the hypothesis that in this virus the fusion process involves the physical interaction of the FP and CT regions of E1.

Results

Two computational approaches were applied. The first is based on the co-evolution paradigm of interacting peptides, and consequently on the correlation between the distance matrices generated by the sequence alignment method applied to the FP and CT primary structures, respectively. In spite of the relatively low random genetic drift between genotypes, co-evolution analysis of sequences from five HCV genotypes revealed a greater correlation between the FP and CT domains than with a control HCV sequence from the Core protein, thus giving clear, albeit still inconclusive, support to the physical interaction hypothesis.

The second approach relies upon a non-linear signal analysis method widely used in protein science called Recurrence Quantification Analysis (RQA). This method allows a direct comparison of domains for the presence of common hydrophobicity patterns, on which the physical interaction is based. RQA greatly strengthened the reliability of the hypothesis by scoring many cross-recurrences between the hydrophobicity patterns of the FP and CT peptides, largely outnumbering chance expectations and pointing to putative interaction sites. Intriguingly, mutations in the CT region of E1 that reduce the fusion process in vitro strongly reduced the amount of cross-recurrence, further supporting an interaction between this region and FP.

Conclusion

Our results support a fusion model for HCV in which the FP and the C-terminal region of E1 are juxtaposed and interact in the post-fusion structure. These findings have general implications for viruses, as any visualization of the post-fusion FP-TM complex has been precluded by the impossibility of obtaining crystallised viral fusion proteins containing the trans-membrane region. This limitation gives sequence-based modelling efforts a crucial role in sketching a molecular interpretation of the fusion process. Moreover, our data also have a more general relevance for cell biology, as the mechanism of intracellular fusion shows remarkable similarities with viral fusion.
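The cross-recurrence scoring of hydrophobicity patterns mentioned in the Results can be illustrated with a minimal sketch. It does not reproduce the authors' RQA settings: the Kyte-Doolittle scale, the embedding dimension `dim`, the Euclidean distance, and the `radius` threshold are all assumptions chosen for illustration, as is the function naming.

```python
# Illustrative sketch only: cross-recurrence rate between two peptides.
import math

# Kyte-Doolittle hydropathy values, used here as an example hydrophobicity scale.
KYTE_DOOLITTLE = {
    "A": 1.8, "R": -4.5, "N": -3.5, "D": -3.5, "C": 2.5,
    "Q": -3.5, "E": -3.5, "G": -0.4, "H": -3.2, "I": 4.5,
    "L": 3.8, "K": -3.9, "M": 1.9, "F": 2.8, "P": -1.6,
    "S": -0.8, "T": -0.7, "W": -0.9, "Y": -1.3, "V": 4.2,
}


def embed(sequence, dim):
    """Slide a window of `dim` residues along the sequence and return
    each window as a vector of hydrophobicity values."""
    values = [KYTE_DOOLITTLE[residue] for residue in sequence]
    return [values[i:i + dim] for i in range(len(values) - dim + 1)]


def cross_recurrence_rate(seq_a, seq_b, dim=3, radius=1.0):
    """Fraction of window pairs (one from each sequence) whose
    Euclidean distance is at most `radius`."""
    windows_a, windows_b = embed(seq_a, dim), embed(seq_b, dim)
    if not windows_a or not windows_b:
        return 0.0
    hits = sum(
        1
        for u in windows_a
        for v in windows_b
        if math.dist(u, v) <= radius
    )
    return hits / (len(windows_a) * len(windows_b))
```

A rate well above what shuffled versions of the same peptides produce would correspond to the "outnumbering chance expectations" comparison in the abstract; the shuffling control itself is left out of this sketch.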
