Reference Citation Analysis: Find an Article, Find a Category, Find a Journal, Find a Scholar

For: Kumar A, Cowen L. Augmented training of hidden Markov models to recognize remote homologs via simulated evolution. Bioinformatics 2009;25:1602-8. [PMID: 19389731 PMCID: PMC2732314 DOI: 10.1093/bioinformatics/btp265] [Citation(s) in RCA: 26] [Impact Index Per Article: 1.7] [Indexed: 11/13/2022] Open

Cited by Other Article(s)

Orientation algorithm for PPI networks based on network propagation approach. J Biosci 2022. [DOI: 10.1007/s12038-022-00284-5] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 10/17/2022]

Sandhya S, Mudgal R, Kumar G, Sowdhamini R, Srinivasan N. Protein sequence design and its applications. Curr Opin Struct Biol 2016;37:71-80. [PMID: 26773478 DOI: 10.1016/j.sbi.2015.12.004] [Citation(s) in RCA: 6] [Impact Index Per Article: 0.8] [Received: 09/28/2015] [Revised: 12/07/2015] [Accepted: 12/15/2015] [Indexed: 01/14/2023]

Oh Brother, Where Art Thou? Finding Orthologs in the Twilight and Midnight Zones of Sequence Similarity. Evol Biol 2016. [DOI: 10.1007/978-3-319-41324-2_22] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 10/21/2022]

Song T, Bu X, Gu H. Combining intrinsic disorder prediction and augmented training of hidden Markov models improves discriminative motif discovery. Chem Phys Lett 2015. [DOI: 10.1016/j.cplett.2015.06.030] [Citation(s) in RCA: 1] [Impact Index Per Article: 0.1] [Indexed: 11/26/2022]

Daniels NM, Gallant A, Ramsey N, Cowen LJ. MRFy: Remote Homology Detection for Beta-Structural Proteins Using Markov Random Fields and Stochastic Search. IEEE/ACM Trans Comput Biol Bioinform 2015;12:4-16. [PMID: 26357074 DOI: 10.1109/tcbb.2014.2344682] [Citation(s) in RCA: 2] [Impact Index Per Article: 0.2] [Indexed: 06/05/2023]

Du N, Knecht MR, Swihart MT, Tang Z, Walsh TR, Zhang A. Identifying Affinity Classes of Inorganic Materials Binding Sequences via a Graph-Based Model. IEEE/ACM Trans Comput Biol Bioinform 2015;12:193-204. [PMID: 26357089 DOI: 10.1109/tcbb.2014.2321158] [Citation(s) in RCA: 4] [Impact Index Per Article: 0.4] [Indexed: 06/05/2023]

Mudgal R, Sandhya S, Kumar G, Sowdhamini R, Chandra NR, Srinivasan N. NrichD database: sequence databases enriched with computationally designed protein-like sequences aid in remote homology detection. Nucleic Acids Res 2014;43:D300-5. [PMID: 25262355 PMCID: PMC4384005 DOI: 10.1093/nar/gku888] [Citation(s) in RCA: 11] [Impact Index Per Article: 1.1] [Indexed: 11/16/2022] Open

Song T, Gu H. Discriminative motif discovery via simulated evolution and random under-sampling. PLoS One 2014;9:e87670. [PMID: 24551063 PMCID: PMC3923751 DOI: 10.1371/journal.pone.0087670] [Citation(s) in RCA: 3] [Impact Index Per Article: 0.3] [Received: 07/07/2013] [Accepted: 12/29/2013] [Indexed: 11/22/2022] Open

Mudgal R, Sowdhamini R, Chandra N, Srinivasan N, Sandhya S. Filling-in void and sparse regions in protein sequence space by protein-like artificial sequences enables remarkable enhancement in remote homology detection capability. J Mol Biol 2013;426:962-79. [PMID: 24316367 DOI: 10.1016/j.jmb.2013.11.026] [Citation(s) in RCA: 13] [Impact Index Per Article: 1.2] [Received: 07/17/2013] [Revised: 11/23/2013] [Accepted: 11/26/2013] [Indexed: 12/11/2022]

Abstract

Protein functional annotation relies on the identification of accurate relationships, with sequence divergence being a key factor. This is especially evident when distant protein relationships can be demonstrated only with three-dimensional structures. To address this challenge, we describe a computational approach that purposefully bridges gaps between related protein families through directed design of protein-like "linker" sequences. For this, we represented SCOP domain families, integrated with sequence homologues, as multiple profiles and performed HMM-HMM alignments between related domain families. Where convincing alignments were achieved, we applied a roulette wheel-based method to design 3,611,010 protein-like sequences corresponding to 374 SCOP folds. To analyze their ability to link proteins in homology searches, we used 3024 queries to search two databases, one containing only natural sequences and another additionally containing designed sequences. Augmented database searches showed up to 30% improvement in fold coverage for over 74% of the folds, with 52 folds achieving all theoretically possible connections. Although sequences could not be designed between some families, the availability of designed sequences between other families within the fold established the sequence continuum needed to demonstrate 373 difficult relationships. Ultimately, as a practical and realistic extension, we demonstrate that such protein-like sequences can be "plugged into" routine and generic sequence database searches to empower not only remote homology detection but also fold recognition. Our statistically supported findings show that complementary searches in both databases will increase the effectiveness of sequence-based searches in recognizing all homologues sharing a common fold.
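The abstract above names a "roulette wheel-based method" for sequence design but does not spell it out. As a rough illustration only, the sketch below shows generic roulette-wheel (probability-proportional) sampling of one residue per profile column. The names `roulette_pick`, `design_sequence`, and `toy_profile`, and the representation of a profile as a list of residue-to-probability dictionaries, are assumptions made for this example and are not taken from the cited paper.

```python
import random
from bisect import bisect_right
from itertools import accumulate


def roulette_pick(weights, rng):
    """Return an index chosen with probability proportional to its weight."""
    cumulative = list(accumulate(weights))
    total = cumulative[-1]
    if total <= 0:
        raise ValueError("weights must sum to a positive value")
    spin = rng.random() * total  # a point on the wheel, in [0, total)
    # The first cumulative value strictly greater than the spin owns that point.
    return min(bisect_right(cumulative, spin), len(weights) - 1)


def design_sequence(profile, rng):
    """Sample one residue per column of a (hypothetical) probability profile."""
    residues_out = []
    for column in profile:
        residues = sorted(column)  # fixed order so runs are reproducible
        picked = roulette_pick([column[r] for r in residues], rng)
        residues_out.append(residues[picked])
    return "".join(residues_out)


# Toy three-column profile; the probabilities are invented for illustration.
toy_profile = [
    {"A": 0.7, "G": 0.2, "S": 0.1},
    {"L": 0.5, "I": 0.3, "V": 0.2},
    {"K": 0.6, "R": 0.4},
]

if __name__ == "__main__":
    rng = random.Random(42)  # seeded so repeated runs give the same sequences
    for _ in range(3):
        print(design_sequence(toy_profile, rng))
```

Residues with larger column probabilities occupy a larger slice of the wheel and are drawn more often, while zero-probability residues are never drawn. How the cited work builds its profiles from HMM-HMM alignments is not covered by this sketch.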

Daniels NM, Gallant A, Peng J, Cowen LJ, Baym M, Berger B. Compressive genomics for protein databases. Bioinformatics 2013;29:i283-90. [PMID: 23812995 PMCID: PMC3851851 DOI: 10.1093/bioinformatics/btt214] [Citation(s) in RCA: 37] [Impact Index Per Article: 3.4] [Indexed: 11/16/2022] Open

Ma J, Peng J, Wang S, Xu J. A conditional neural fields model for protein threading. Bioinformatics 2012;28:i59-66. [PMID: 22689779 PMCID: PMC3371845 DOI: 10.1093/bioinformatics/bts213] [Citation(s) in RCA: 73] [Impact Index Per Article: 6.6] [Indexed: 11/29/2022]

A computational framework for boosting confidence in high-throughput protein-protein interaction datasets. Genome Biol 2012;13:R76. [PMID: 22937800 PMCID: PMC4053744 DOI: 10.1186/gb-2012-13-8-r76] [Citation(s) in RCA: 42] [Impact Index Per Article: 3.5] [Received: 05/03/2012] [Accepted: 08/31/2012] [Indexed: 12/28/2022] Open

Terrapon N, Gascuel O, Maréchal E, Bréhélin L. Fitting hidden Markov models of protein domains to a target species: application to Plasmodium falciparum. BMC Bioinformatics 2012;13:67. [PMID: 22548871 PMCID: PMC3434054 DOI: 10.1186/1471-2105-13-67] [Citation(s) in RCA: 11] [Impact Index Per Article: 0.9] [Received: 10/24/2011] [Accepted: 05/01/2012] [Indexed: 01/12/2023] Open

Abstract

BACKGROUND

Hidden Markov Models (HMMs) are a powerful tool for protein domain identification. The Pfam database notably provides a large collection of HMMs that are widely used for the annotation of proteins in newly sequenced organisms. In Pfam, each domain family is represented by a curated multiple sequence alignment from which a profile HMM is built. In spite of their high specificity, HMMs may lack sensitivity when searching for domains in divergent organisms. This is particularly the case for species with a biased amino-acid composition, such as P. falciparum, the main causal agent of human malaria. In this context, fitting HMMs to the specificities of the target proteome can help identify additional domains.

RESULTS

Using P. falciparum as an example, we compare approaches that have been proposed for this problem, and present two alternative methods. Because previous attempts strongly rely on known domain occurrences in the target species or its close relatives, they mainly improve the detection of domains which belong to already identified families. Our methods learn global correction rules that adjust amino-acid distributions associated with the match states of HMMs. These rules are applied to all match states of the whole HMM library, thus enabling the detection of domains from previously absent families. Additionally, we propose a procedure to estimate the proportion of false positives among the newly discovered domains. Starting with the Pfam standard library, we build several new libraries with the different HMM-fitting approaches. These libraries are first used to detect new domain occurrences with low E-values. Second, by applying the Co-Occurrence Domain Discovery (CODD) procedure we have recently proposed, the libraries are further used to identify likely occurrences among potential domains with higher E-values.

CONCLUSION

We show that the new approaches allow identification of several domain families previously absent in the P. falciparum proteome and the Apicomplexa phylum, and identify many domains that are not detected by previous approaches. In terms of the number of newly discovered domains, the new approaches outperform the previous ones when no close species are available or when they are used to identify likely occurrences among potential domains with high E-values. All predictions on P. falciparum have been integrated into a dedicated website which pools all known/new annotations of protein domains and functions for this organism. Software implementing the two proposed approaches is available at the same address: http://www.lirmm.fr/~terrapon/HMMfit/

Daniels NM, Hosur R, Berger B, Cowen LJ. SMURFLite: combining simplified Markov random fields with simulated evolution improves remote homology detection for beta-structural proteins into the twilight zone. Bioinformatics 2012;28:1216-22. [PMID: 22408192 PMCID: PMC3338012 DOI: 10.1093/bioinformatics/bts110] [Citation(s) in RCA: 22] [Impact Index Per Article: 1.8] [Indexed: 11/14/2022] Open

Peng J, Xu J. RaptorX: exploiting structure information for protein alignment by statistical inference. Proteins 2011;79 Suppl 10:161-71. [PMID: 21987485 DOI: 10.1002/prot.23175] [Citation(s) in RCA: 241] [Impact Index Per Article: 18.5] [Received: 04/06/2011] [Revised: 07/25/2011] [Accepted: 08/19/2011] [Indexed: 12/13/2022]

Kumar A, Cowen L. Recognition of beta-structural motifs using hidden Markov models trained with simulated evolution. Bioinformatics 2010;26:i287-93. [PMID: 20529918 PMCID: PMC2881384 DOI: 10.1093/bioinformatics/btq199] [Citation(s) in RCA: 12] [Impact Index Per Article: 0.9] [Indexed: 11/25/2022] Open

Webb-Robertson BJM, Ratuiste KG, Oehmen CS. Physicochemical property distributions for accurate and rapid pairwise protein homology detection. BMC Bioinformatics 2010;11:145. [PMID: 20302613 PMCID: PMC2851606 DOI: 10.1186/1471-2105-11-145] [Citation(s) in RCA: 17] [Impact Index Per Article: 1.2] [Received: 10/24/2009] [Accepted: 03/19/2010] [Indexed: 11/10/2022] Open