Meng H, Banerjee A, Zhou L. BLISS: binding site level identification of shared signal-modules in DNA regulatory sequences. BMC Bioinformatics 2006; 7:287. [PMID: 16756683; PMCID: PMC1526457; DOI: 10.1186/1471-2105-7-287]
[Received: 02/08/2006] [Accepted: 06/07/2006]
Abstract
Background
Regulatory modules are segments of the DNA that control particular aspects of gene expression. Their identification is therefore of great importance to the field of molecular genetics. Each module is composed of a distinct set of binding sites for specific transcription factors. Since experimental identification of regulatory modules is an arduous process, accurate computational techniques that supplement this process can be very beneficial. Functional modules are under selective pressure to be evolutionarily conserved. Most current approaches therefore attempt to detect conserved regulatory modules through similarity comparisons at the DNA sequence level. However, some regulatory modules, despite the conservation of their responsible binding sites, are embedded in sequences that have little overall similarity.
Results
In this study, we present a novel approach that detects conserved regulatory modules via comparisons at the binding site level. The technique compares the binding site profiles of orthologs and identifies those segments that have similar (not necessarily identical) profiles. The similarity measure is based on the inner product of transformed profiles, which takes into consideration the p-values of binding sites as well as the potential shift of binding site positions. We tested this approach on simulated sequence pairs as well as real-world examples. In both cases our technique was able to identify regulatory modules that could not be identified using sequence-similarity-based approaches such as rVista 2.0 and BLAST.
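The similarity measure described above can be illustrated with a minimal sketch. The abstract does not specify the transformation, so everything concrete here is an assumption made for illustration: each site is weighted by -log10 of its p-value, a Gaussian kernel of width `sigma` spreads each site over neighbouring positions to tolerate positional shift, and the inner product is normalised to the range 0 to 1. The names `smooth` and `similarity` are hypothetical and are not taken from BLISS.

```python
import math


def smooth(sites, length, sigma):
    """Turn one factor's sites [(position, p_value), ...] into a smoothed track.

    Assumption: weight = -log10(p), spread with a Gaussian kernel so that
    sites at slightly different positions still overlap.
    """
    track = [0.0] * length
    for pos, p in sites:
        weight = -math.log10(p)  # smaller p-value gives a larger weight
        for x in range(length):
            track[x] += weight * math.exp(-((x - pos) ** 2) / (2 * sigma ** 2))
    return track


def similarity(profile_a, profile_b, length, sigma=5.0):
    """Normalised inner product of two transformed binding site profiles.

    Each profile maps a transcription factor name to its list of
    (position, p_value) sites within a segment of the given length.
    """
    dot = norm_a = norm_b = 0.0
    for factor in set(profile_a) | set(profile_b):
        a = smooth(profile_a.get(factor, []), length, sigma)
        b = smooth(profile_b.get(factor, []), length, sigma)
        dot += sum(x * y for x, y in zip(a, b))
        norm_a += sum(x * x for x in a)
        norm_b += sum(y * y for y in b)
    denom = math.sqrt(norm_a * norm_b)
    return dot / denom if denom > 0 else 0.0


# Two segments sharing a site for the same factor, shifted by 3 positions.
seg_a = {"TF1": [(20, 1e-4)]}
seg_b = {"TF1": [(23, 1e-3)]}
print(similarity(seg_a, seg_b, length=100))
```

Under these assumptions, two segments score highly when they carry sites for the same factors at roughly the same relative positions, regardless of how similar the underlying DNA sequences are; sites for different factors contribute nothing to the score.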
Conclusion
The results of our experiments demonstrate that, for sequences with little overall similarity at the DNA sequence level, it is still possible to identify conserved regulatory modules based solely on binding site profiles.