Reference Citation Analysis: Find an Article, Find a Category, Find a Journal, Find a Scholar

For: Kim JT, Martinetz T, Polani D. Bioinformatic principles underlying the information content of transcription factor binding sites. J Theor Biol 2003;220:529-44. [PMID: 12623284 DOI: 10.1006/jtbi.2003.3153] [Citation(s) in RCA: 22] [Impact Index Per Article: 1.0] [Reference Citation Analysis] [What about the content of this article? (0)] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Indexed: 11/22/2022]

For:	Kim JT, Martinetz T, Polani D. Bioinformatic principles underlying the information content of transcription factor binding sites. J Theor Biol 2003;220:529-44. [PMID: 12623284 DOI: 10.1006/jtbi.2003.3153] [Citation(s) in RCA: 22] [Impact Index Per Article: 1.0] [Reference Citation Analysis] [What about the content of this article? (0)] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Indexed: 11/22/2022]

Number

Cited by Other Article(s)

A search for the physical basis of the genetic code. Biosystems 2020;195:104148. [DOI: 10.1016/j.biosystems.2020.104148] [Citation(s) in RCA: 3] [Impact Index Per Article: 0.8] [Reference Citation Analysis] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Received: 02/28/2020] [Revised: 04/09/2020] [Accepted: 04/09/2020] [Indexed: 01/01/2023]

O'Neill PK, Erill I. Parametric bootstrapping for biological sequence motifs. BMC Bioinformatics 2016;17:406. [PMID: 27716039 PMCID: PMC5052923 DOI: 10.1186/s12859-016-1246-8] [Citation(s) in RCA: 2] [Impact Index Per Article: 0.3] [Reference Citation Analysis] [Abstract] [Key Words] [Track Full Text] [Download PDF] [Figures] [Journal Information] [Subscribe] [Scholar Register] [Received: 02/20/2016] [Accepted: 09/08/2016] [Indexed: 11/10/2022] Open

Abstract

Background

Biological sequence motifs drive the specific interactions of proteins and nucleic acids. Accordingly, the effective computational discovery and analysis of such motifs is a central theme in bioinformatics. Many practical questions about the properties of motifs can be recast as random sampling problems. In this light, the task is to determine for a given motif whether a certain feature of interest is statistically unusual among relevantly similar alternatives. Despite the generality of this framework, its use has been frustrated by the difficulties of defining an appropriate reference class of motifs for comparison and of sampling from it effectively.

Results

We define two distributions over the space of all motifs of given dimension. The first is the maximum entropy distribution subject to mean information content, and the second is the truncated uniform distribution over all motifs having information content within a given interval. We derive exact sampling algorithms for each. As a proof of concept, we employ these sampling methods to analyze a broad collection of prokaryotic and eukaryotic transcription factor binding site motifs. In addition to positional information content, we consider the informational Gini coefficient of the motif, a measure of the degree to which information is evenly distributed throughout a motif’s positions. We find that both prokaryotic and eukaryotic motifs tend to exhibit higher informational Gini coefficients (IGC) than would be expected by chance under either reference distribution. As a second application, we apply maximum entropy sampling to the motif p-value problem and use it to give elementary derivations of two new estimators.

Conclusions

Despite the historical centrality of biological sequence motif analysis, this study constitutes to our knowledge the first use of principled null hypotheses for sequence motifs given information content. Through their use, we are able to characterize for the first time differerences in global motif statistics between biological motifs and their null distributions. In particular, we observe that biological sequence motifs show an unusual distribution of IGC, presumably due to biochemical constraints on the mechanisms of direct read-out.

Electronic supplementary material

The online version of this article (doi:10.1186/s12859-016-1246-8) contains supplementary material, which is available to authorized users.

Collapse

Zhang Y, He Y, Zheng G, Wei C. MOST+: A de novo motif finding approach combining genomic sequence and heterogeneous genome-wide signatures. BMC Genomics 2015;16 Suppl 7:S13. [PMID: 26099518 PMCID: PMC4474412 DOI: 10.1186/1471-2164-16-s7-s13] [Citation(s) in RCA: 6] [Impact Index Per Article: 0.7] [Reference Citation Analysis] [Abstract] [Track Full Text] [Download PDF] [Figures] [Journal Information] [Subscribe] [Scholar Register] [Indexed: 01/09/2023] Open

O'Neill PK, Forder R, Erill I. Informational requirements for transcriptional regulation. J Comput Biol 2014;21:373-84. [PMID: 24689750 DOI: 10.1089/cmb.2014.0032] [Citation(s) in RCA: 3] [Impact Index Per Article: 0.3] [Reference Citation Analysis] [Abstract] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Indexed: 11/12/2022] Open

Davies PC, Rieper E, Tuszynski JA. Self-organization and entropy reduction in a living cell. Biosystems 2013;111:1-10. [PMID: 23159919 PMCID: PMC3712629 DOI: 10.1016/j.biosystems.2012.10.005] [Citation(s) in RCA: 49] [Impact Index Per Article: 4.5] [Reference Citation Analysis] [Abstract] [Key Words] [MESH Headings] [Grants] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Received: 03/22/2012] [Revised: 10/22/2012] [Accepted: 10/22/2012] [Indexed: 12/19/2022]

Mahdevar G, Sadeghi M, Nowzari-Dalini A. Transcription factor binding sites detection by using alignment-based approach. J Theor Biol 2012;304:96-102. [PMID: 22504445 DOI: 10.1016/j.jtbi.2012.03.039] [Citation(s) in RCA: 3] [Impact Index Per Article: 0.3] [Reference Citation Analysis] [Abstract] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Received: 09/06/2011] [Revised: 03/27/2012] [Accepted: 03/29/2012] [Indexed: 11/25/2022]

Batista MV, Ferreira TA, Freitas AC, Balbino VQ. An entropy-based approach for the identification of phylogenetically informative genomic regions of Papillomavirus. INFECTION GENETICS AND EVOLUTION 2011;11:2026-33. [DOI: 10.1016/j.meegid.2011.09.013] [Citation(s) in RCA: 19] [Impact Index Per Article: 1.5] [Reference Citation Analysis] [Track Full Text] [Subscribe] [Scholar Register] [Received: 06/14/2011] [Revised: 09/09/2011] [Accepted: 09/14/2011] [Indexed: 11/17/2022]

Kim JT, Gewehr JE, Martinetz T. BINDING MATRIX: A NOVEL APPROACH FOR BINDING SITE RECOGNITION. J Bioinform Comput Biol 2011;2:289-307. [PMID: 15297983 DOI: 10.1142/s0219720004000569] [Citation(s) in RCA: 10] [Impact Index Per Article: 0.8] [Reference Citation Analysis] [Abstract] [MESH Headings] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Received: 09/03/2003] [Revised: 12/19/2003] [Accepted: 12/19/2003] [Indexed: 11/18/2022]

Stochastic oscillations in genetic regulatory networks: application to microarray experiments. EURASIP JOURNAL ON BIOINFORMATICS & SYSTEMS BIOLOGY 2010:59526. [PMID: 18427584 DOI: 10.1155/bsb/2006/59526] [Citation(s) in RCA: 10] [Impact Index Per Article: 0.7] [Reference Citation Analysis] [Abstract] [Track Full Text] [Subscribe] [Scholar Register] [Received: 01/19/2006] [Revised: 06/26/2006] [Accepted: 06/27/2006] [Indexed: 11/17/2022]

Biomolecular information gained through in vitro evolution. Biophys Rev 2010;2:1-11. [DOI: 10.1007/s12551-009-0021-8] [Citation(s) in RCA: 2] [Impact Index Per Article: 0.1] [Reference Citation Analysis] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Received: 04/26/2009] [Accepted: 10/22/2009] [Indexed: 11/30/2022] Open

Information content based model for the topological properties of the gene regulatory network of Escherichia coli. J Theor Biol 2009;263:281-94. [PMID: 19962388 DOI: 10.1016/j.jtbi.2009.11.017] [Citation(s) in RCA: 3] [Impact Index Per Article: 0.2] [Reference Citation Analysis] [Abstract] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Received: 06/03/2009] [Revised: 11/21/2009] [Accepted: 11/23/2009] [Indexed: 11/22/2022]

van Hijum SAFT, Medema MH, Kuipers OP. Mechanisms and evolution of control logic in prokaryotic transcriptional regulation. Microbiol Mol Biol Rev 2009;73:481-509, Table of Contents. [PMID: 19721087 PMCID: PMC2738135 DOI: 10.1128/mmbr.00037-08] [Citation(s) in RCA: 96] [Impact Index Per Article: 6.4] [Reference Citation Analysis] [Abstract] [MESH Headings] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Indexed: 11/20/2022] Open

Erill I, O'Neill MC. A reexamination of information theory-based methods for DNA-binding site identification. BMC Bioinformatics 2009;10:57. [PMID: 19210776 PMCID: PMC2680408 DOI: 10.1186/1471-2105-10-57] [Citation(s) in RCA: 29] [Impact Index Per Article: 1.9] [Reference Citation Analysis] [Abstract] [Track Full Text] [Download PDF] [Figures] [Journal Information] [Subscribe] [Scholar Register] [Received: 10/14/2008] [Accepted: 02/11/2009] [Indexed: 11/10/2022] Open

Abstract

Background

Searching for transcription factor binding sites in genome sequences is still an open problem in bioinformatics. Despite substantial progress, search methods based on information theory remain a standard in the field, even though the full validity of their underlying assumptions has only been tested in artificial settings. Here we use newly available data on transcription factors from different bacterial genomes to make a more thorough assessment of information theory-based search methods.

Results

Our results reveal that conventional benchmarking against artificial sequence data leads frequently to overestimation of search efficiency. In addition, we find that sequence information by itself is often inadequate and therefore must be complemented by other cues, such as curvature, in real genomes. Furthermore, results on skewed genomes show that methods integrating skew information, such as Relative Entropy, are not effective because their assumptions may not hold in real genomes. The evidence suggests that binding sites tend to evolve towards genomic skew, rather than against it, and to maintain their information content through increased conservation. Based on these results, we identify several misconceptions on information theory as applied to binding sites, such as negative entropy, and we propose a revised paradigm to explain the observed results.

Conclusion

We conclude that, among information theory-based methods, the most unassuming search methods perform, on average, better than any other alternatives, since heuristic corrections to these methods are prone to fail when working on real data. A reexamination of information content in binding sites reveals that information content is a compound measure of search and binding affinity requirements, a fact that has important repercussions for our understanding of binding site evolution.

Collapse

Vyhlidal CA, Rogan PK, Leeder JS. Development and refinement of pregnane X receptor (PXR) DNA binding site model using information theory: insights into PXR-mediated gene regulation. J Biol Chem 2004;279:46779-86. [PMID: 15316010 DOI: 10.1074/jbc.m408395200] [Citation(s) in RCA: 32] [Impact Index Per Article: 1.6] [Reference Citation Analysis] [Abstract] [MESH Headings] [Grants] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Indexed: 01/13/2023] Open