Reference Citation Analysis: Find an Article, Find a Category, Find a Journal, Find a Scholar

For: Gordân R, Narlikar L, Hartemink AJ. Finding regulatory DNA motifs using alignment-free evolutionary conservation information. Nucleic Acids Res 2010;38:e90. [PMID: 20047961 PMCID: PMC2847231 DOI: 10.1093/nar/gkp1166] [Citation(s) in RCA: 34] [Impact Index Per Article: 2.4] [Reference Citation Analysis] [What about the content of this article? (0)] [Track Full Text] [Download PDF] [Figures] [Journal Information] [Subscribe] [Scholar Register] [Received: 08/16/2009] [Revised: 10/30/2009] [Accepted: 11/23/2009] [Indexed: 01/01/2023] Open



Cited by Other Article(s)

Karollus A, Hingerl J, Gankin D, Grosshauser M, Klemon K, Gagneur J. Species-aware DNA language models capture regulatory elements and their evolution. Genome Biol 2024;25:83. [PMID: 38566111 PMCID: PMC10985990 DOI: 10.1186/s13059-024-03221-x] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Reference Citation Analysis] [Abstract] [MESH Headings] [Grants] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Received: 08/14/2023] [Accepted: 03/20/2024] [Indexed: 04/04/2024] Open

Lieberman-Lazarovich M, Yahav C, Israeli A, Efroni I. Deep Conservation of cis-Element Variants Regulating Plant Hormonal Responses. Plant Cell 2019;31:2559-2572. [PMID: 31467248 PMCID: PMC6881130 DOI: 10.1105/tpc.19.00129] [Citation(s) in RCA: 8] [Impact Index Per Article: 1.6] [Reference Citation Analysis] [Abstract] [MESH Headings] [Grants] [Track Full Text] [Subscribe] [Scholar Register] [Received: 02/25/2019] [Accepted: 08/27/2019] [Indexed: 05/14/2023]

Lu J, Cao X, Zhong S. A likelihood approach to testing hypotheses on the co-evolution of epigenome and genome. PLoS Comput Biol 2018;14:e1006673. [PMID: 30586383 PMCID: PMC6324829 DOI: 10.1371/journal.pcbi.1006673] [Citation(s) in RCA: 3] [Impact Index Per Article: 0.5] [Reference Citation Analysis] [Abstract] [MESH Headings] [Grants] [Track Full Text] [Download PDF] [Figures] [Journal Information] [Subscribe] [Scholar Register] [Received: 05/26/2018] [Revised: 01/08/2019] [Accepted: 11/26/2018] [Indexed: 01/03/2023] Open

Abstract

Central questions in epigenome evolution include whether interspecies changes in histone modifications are independent of evolutionary changes in DNA and, if they are not, whether they depend on specific types of DNA sequence change. Here, we present a likelihood approach for testing hypotheses on the co-evolution of genome and histone modifications. The gist of this approach is to convert evolutionary biology hypotheses into probabilistic forms by explicitly expressing the joint probability of multispecies DNA sequences and histone modifications, which we refer to as a class of Joint Evolutionary Model for the Genome and the Epigenome (JEMGE). JEMGE can be summarized as a mixture model of four components representing four evolutionary hypotheses, namely dependence and independence of interspecies epigenomic variations on underlying sequence substitutions and on underlying sequence insertions and deletions (indels). We implemented a maximum likelihood method to fit the models to the data. By comparing likelihoods, we inferred whether interspecies epigenomic variations depended on substitutions or indels in local genomic sequences, using DNase hypersensitivity and spermatid H3K4me3 ChIP-seq data from human and rhesus macaque. Approximately 5.5% of homologous regions in the genomes exhibited H3K4me3 modification in either species, of which approximately 67% exhibited local-sequence-dependent interspecies H3K4me3 variations. Substitutions accounted for fewer local-sequence-dependent H3K4me3 variations than indels did. Among transposon-mediated indels, ERV1 insertions and L1 insertions were most strongly associated with H3K4me3 gains and losses, respectively. By initiating a probabilistic formulation of the co-evolution of genomes and epigenomes, JEMGE helps to bring evolutionary biology principles to comparative epigenomic studies.

Epigenetic modifications play a significant role in gene regulation and thus heavily influence phenotypic outcomes. Whereas cross-species epigenomic comparisons have been fruitful in revealing the function of epigenetic modifications, it remains unclear how the epigenome changes across species. A central question in epigenome evolution studies is whether interspecies epigenomic variations rely on genomic changes in cis and, if they do at least in part, whether different genomic changes have distinct impacts. To tackle this question, we initiated a likelihood-based approach in which different hypotheses related to the co-evolution of the genome and the epigenome could be converted into probabilistic models. By fitting the models to actual data, each model yielded a likelihood, and the hypothesis corresponding to the largest likelihood was selected as the one most supported by the observed data. In this work, we focused on the influence of two types of underlying sequence changes: substitutions, and insertions and deletions (indels). We quantitatively assessed the dependence of H3K4me3 variations on substitutions and indels between human and rhesus, and separated their relative impacts within each genomic region with H3K4me3. The methodology presented here provides a framework for modeling the epigenome together with the genome and a quantitative approach to test different evolutionary hypotheses.
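The selection step described above (each hypothesis yields a likelihood, and the one with the largest likelihood is chosen) can be illustrated with a minimal sketch. This is not the JEMGE model: the two hypothesis names, their fixed Bernoulli parameters, and the toy observations are all assumptions made up for illustration, and no parameters are fitted here.

```python
import math

def bernoulli_loglik(observations, p):
    """Log-likelihood of 0/1 observations under a Bernoulli(p) model."""
    return sum(math.log(p if x else 1.0 - p) for x in observations)

def select_hypothesis(logliks):
    """Return the name of the hypothesis with the largest log-likelihood."""
    return max(logliks, key=logliks.get)

# Toy data (hypothetical): 1 means the mark differs between species
# in a region that also carries an indel, 0 means it does not differ.
observations = [1, 1, 0, 1, 1, 0, 1, 1]

# Hypothetical hypotheses, each reduced to one fixed probability of a difference.
hypotheses = {"independent": 0.2, "indel_dependent": 0.7}

logliks = {name: bernoulli_loglik(observations, p) for name, p in hypotheses.items()}
best = select_hypothesis(logliks)
print(best, logliks)
```

With fixed parameters the comparison is a plain argmax over log-likelihoods. When parameters are fitted and models are nested, a richer model never scores lower, so a real analysis would need to account for model complexity in some way.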


Medina EM, Turner JJ, Gordân R, Skotheim JM, Buchler NE. Punctuated evolution and transitional hybrid network in an ancestral cell cycle of fungi. eLife 2016;5. [PMID: 27162172 PMCID: PMC4862756 DOI: 10.7554/elife.09492] [Citation(s) in RCA: 41] [Impact Index Per Article: 5.1] [Reference Citation Analysis] [Abstract] [Key Words] [Track Full Text] [Download PDF] [Figures] [Journal Information] [Subscribe] [Scholar Register] [Received: 06/17/2015] [Accepted: 04/07/2016] [Indexed: 12/12/2022] Open

Abstract

Although cell cycle control is an ancient, conserved, and essential process, some core animal and fungal cell cycle regulators share no more sequence identity than non-homologous proteins. Here, we show that evolution along the fungal lineage was punctuated by the early acquisition and entrainment of the SBF transcription factor through horizontal gene transfer. Cell cycle evolution in the fungal ancestor then proceeded through a hybrid network containing both SBF and its ancestral animal counterpart E2F, which is still maintained in many basal fungi. We hypothesize that a virally-derived SBF may have initially hijacked cell cycle control by activating transcription via the cis-regulatory elements targeted by the ancestral cell cycle regulator E2F, much like extant viral oncogenes. Consistent with this hypothesis, we show that SBF can regulate promoters with E2F binding sites in budding yeast.

DOI:http://dx.doi.org/10.7554/eLife.09492.001

Living cells grow and divide with remarkable precision to ensure that their genetic material is faithfully duplicated and distributed equally to the newly formed daughter cells. This precision is achieved through a series of steps known as the cell cycle. The cell cycle is ancient and conserved across all Eukaryotes, including plants, animals and fungi. However, some of the core proteins present in animals and fungi are unrelated. This raises the question as to how a drastic change could have occurred and been tolerated over evolution.

In animals and plants, a protein called E2F controls the expression of genes that are needed to begin the cell cycle. In most fungi, an equivalent protein called SBF performs the same role as E2F, but the two proteins are very different and do not appear to share a common ancestor. This is unexpected given that fungi and animals are more closely related to one another than either is to plants.

Medina et al. searched the genomes of many animals, fungi, plants, algae, and their closest relatives for genes that encoded proteins like E2F and SBF. SBF-like proteins were only found in fungi, yet some fungal groups had cell cycle regulators like those found in animals. Zoosporic fungi, which diverged early from the fungal ancestor, had both SBF- and E2F-like proteins, while many fungi later lost E2F during evolution.

So how did fungi acquire SBF? Medina et al. observed that part of the SBF protein is similar to proteins found in many viruses. The broad distribution of these viral SBF-like proteins suggests that they arose first in viruses, and a fungal ancestor acquired one such protein during a viral infection. As SBF and E2F bind similar DNA sequences, Medina et al. hypothesized that this viral SBF hijacked control of the cell cycle in the fungal ancestor by controlling expression of genes that were originally controlled only by E2F. In support of this idea, experiments showed that many E2F binding sites in modern genes are also SBF binding sites, and that E2F sites can substitute for SBF sites in SBF-controlled genes. Future experiments in zoosporic fungi, which have animal-like and fungal-like features, would provide a glimpse of how a fungal ancestor may have used both SBF and E2F. These experiments may also reveal why most fungi have retained the newer SBF but lost the ancestral and widely conserved E2F protein.

DOI:http://dx.doi.org/10.7554/eLife.09492.002


Thompson D, Regev A, Roy S. Comparative analysis of gene regulatory networks: from network reconstruction to evolution. Annu Rev Cell Dev Biol 2015;31:399-428. [PMID: 26355593 DOI: 10.1146/annurev-cellbio-100913-012908] [Citation(s) in RCA: 95] [Impact Index Per Article: 10.6] [Reference Citation Analysis] [Abstract] [Key Words] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Indexed: 11/09/2022]

De Witte D, Van de Velde J, Decap D, Van Bel M, Audenaert P, Demeester P, Dhoedt B, Vandepoele K, Fostier J. BLSSpeller: exhaustive comparative discovery of conserved cis-regulatory elements. Bioinformatics 2015;31:3758-66. [PMID: 26254488 PMCID: PMC4653392 DOI: 10.1093/bioinformatics/btv466] [Citation(s) in RCA: 10] [Impact Index Per Article: 1.1] [Reference Citation Analysis] [Abstract] [Track Full Text] [Download PDF] [Figures] [Journal Information] [Subscribe] [Scholar Register] [Received: 09/09/2014] [Accepted: 08/03/2015] [Indexed: 11/14/2022] Open

Taher L, Narlikar L, Ovcharenko I. Identification and computational analysis of gene regulatory elements. Cold Spring Harb Protoc 2015;2015:pdb.top083642. [PMID: 25561628 PMCID: PMC5885252 DOI: 10.1101/pdb.top083642] [Citation(s) in RCA: 6] [Impact Index Per Article: 0.7] [Reference Citation Analysis] [Abstract] [MESH Headings] [Grants] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Indexed: 06/04/2023]

Siggers T, Gilmore TD, Barron B, Penvose A. Characterizing the DNA binding site specificity of NF-κB with protein-binding microarrays (PBMs). Methods Mol Biol 2015;1280:609-30. [PMID: 25736775 DOI: 10.1007/978-1-4939-2422-6_36] [Citation(s) in RCA: 11] [Impact Index Per Article: 1.2] [Reference Citation Analysis] [Abstract] [MESH Headings] [Grants] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Indexed: 01/01/2023]

Ballester B, Medina-Rivera A, Schmidt D, Gonzàlez-Porta M, Carlucci M, Chen X, Chessman K, Faure AJ, Funnell APW, Goncalves A, Kutter C, Lukk M, Menon S, McLaren WM, Stefflova K, Watt S, Weirauch MT, Crossley M, Marioni JC, Odom DT, Flicek P, Wilson MD. Multi-species, multi-transcription factor binding highlights conserved control of tissue-specific biological pathways. eLife 2014;3:e02626. [PMID: 25279814 PMCID: PMC4359374 DOI: 10.7554/elife.02626] [Citation(s) in RCA: 66] [Impact Index Per Article: 6.6] [Reference Citation Analysis] [Abstract] [Key Words] [MESH Headings] [Grants] [Track Full Text] [Download PDF] [Figures] [Journal Information] [Subscribe] [Scholar Register] [Received: 02/23/2014] [Accepted: 09/02/2014] [Indexed: 12/20/2022] Open

Affiliation(s)

Benoit Ballester: European Molecular Biology Laboratory, European Bioinformatics Institute, Wellcome Trust Genome Campus, Cambridge, United Kingdom; Aix-Marseille Université, UMR1090 TAGC, Marseille, France; INSERM, UMR1090 TAGC, Marseille, France
Alejandra Medina-Rivera: Genetics and Genome Biology Program, SickKids Research Institute, Toronto, Canada
Dominic Schmidt: Cancer Research UK–Cambridge Institute, University of Cambridge, Cambridge, United Kingdom
Mar Gonzàlez-Porta: European Molecular Biology Laboratory, European Bioinformatics Institute, Wellcome Trust Genome Campus, Cambridge, United Kingdom
Matthew Carlucci: Genetics and Genome Biology Program, SickKids Research Institute, Toronto, Canada
Xiaoting Chen: School of Electronic and Computing Systems, University of Cincinnati, Cincinnati, United States
Kyle Chessman: Genetics and Genome Biology Program, SickKids Research Institute, Toronto, Canada
Andre J Faure: European Molecular Biology Laboratory, European Bioinformatics Institute, Wellcome Trust Genome Campus, Cambridge, United Kingdom
Alister PW Funnell: School of Biotechnology and Biomolecular Sciences, University of New South Wales, Kensington, Australia
Angela Goncalves: European Molecular Biology Laboratory, European Bioinformatics Institute, Wellcome Trust Genome Campus, Cambridge, United Kingdom
Claudia Kutter: Cancer Research UK–Cambridge Institute, University of Cambridge, Cambridge, United Kingdom
Margus Lukk: Cancer Research UK–Cambridge Institute, University of Cambridge, Cambridge, United Kingdom
Suraj Menon: Cancer Research UK–Cambridge Institute, University of Cambridge, Cambridge, United Kingdom
William M McLaren: European Molecular Biology Laboratory, European Bioinformatics Institute, Wellcome Trust Genome Campus, Cambridge, United Kingdom
Klara Stefflova: Cancer Research UK–Cambridge Institute, University of Cambridge, Cambridge, United Kingdom
Stephen Watt: Cancer Research UK–Cambridge Institute, University of Cambridge, Cambridge, United Kingdom
Matthew T Weirauch: Center for Autoimmune Genomics and Etiology, Cincinnati Children's Hospital Medical Center, Cincinnati, United States; Divisions of Biomedical Informatics and Developmental Biology, Cincinnati Children's Hospital Medical Center, Cincinnati, United States
Merlin Crossley: School of Biotechnology and Biomolecular Sciences, University of New South Wales, Kensington, Australia
John C Marioni: European Molecular Biology Laboratory, European Bioinformatics Institute, Wellcome Trust Genome Campus, Cambridge, United Kingdom
Duncan T Odom: Cancer Research UK–Cambridge Institute, University of Cambridge, Cambridge, United Kingdom; Wellcome Trust Sanger Institute, Wellcome Trust Genome Campus, Cambridge, United Kingdom
Paul Flicek: European Molecular Biology Laboratory, European Bioinformatics Institute, Wellcome Trust Genome Campus, Cambridge, United Kingdom; Wellcome Trust Sanger Institute, Wellcome Trust Genome Campus, Cambridge, United Kingdom
Michael D Wilson: Genetics and Genome Biology Program, SickKids Research Institute, Toronto, Canada; Cancer Research UK–Cambridge Institute, University of Cambridge, Cambridge, United Kingdom; Department of Molecular Genetics, University of Toronto, Toronto, Canada


Glenwinkel L, Wu D, Minevich G, Hobert O. TargetOrtho: a phylogenetic footprinting tool to identify transcription factor targets. Genetics 2014;197:61-76. [PMID: 24558259 PMCID: PMC4012501 DOI: 10.1534/genetics.113.160721] [Citation(s) in RCA: 10] [Impact Index Per Article: 1.0] [Reference Citation Analysis] [Abstract] [Key Words] [MESH Headings] [Grants] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Received: 01/23/2014] [Accepted: 02/09/2014] [Indexed: 11/18/2022] Open

Roccaro M, Ahmadinejad N, Colby T, Somssich IE. Identification of functional cis-regulatory elements by sequential enrichment from a randomized synthetic DNA library. BMC Plant Biol 2013;13:164. [PMID: 24138055 PMCID: PMC3923269 DOI: 10.1186/1471-2229-13-164] [Citation(s) in RCA: 3] [Impact Index Per Article: 0.3] [Reference Citation Analysis] [Abstract] [Key Words] [MESH Headings] [Track Full Text] [Subscribe] [Scholar Register] [Received: 07/08/2013] [Accepted: 10/08/2013] [Indexed: 06/01/2023]

Klepper K, Drabløs F. MotifLab: a tools and data integration workbench for motif discovery and regulatory sequence analysis. BMC Bioinformatics 2013;14:9. [PMID: 23323883 PMCID: PMC3556059 DOI: 10.1186/1471-2105-14-9] [Citation(s) in RCA: 23] [Impact Index Per Article: 2.1] [Reference Citation Analysis] [Abstract] [Track Full Text] [Download PDF] [Figures] [Journal Information] [Subscribe] [Scholar Register] [Received: 11/04/2012] [Accepted: 01/10/2013] [Indexed: 12/19/2022] Open

Abstract

Background

Traditional methods for computational motif discovery often suffer from poor performance. In particular, methods that search for sequence matches to known binding motifs tend to predict many non-functional binding sites because they fail to take into consideration the biological state of the cell. In recent years, genome-wide studies have generated a lot of data that has the potential to improve our ability to identify functional motifs and binding sites, such as information about chromatin accessibility and epigenetic states in different cell types. However, it is not always trivial to make use of this data in combination with existing motif discovery tools, especially for researchers who are not skilled in bioinformatics programming.

Results

Here we present MotifLab, a general workbench for analysing regulatory sequence regions and discovering transcription factor binding sites and cis-regulatory modules. MotifLab supports comprehensive motif discovery and analysis by allowing users to integrate several popular motif discovery tools as well as different kinds of additional information, including phylogenetic conservation, epigenetic marks, DNase hypersensitive sites, ChIP-Seq data, positional binding preferences of transcription factors, transcription factor interactions and gene expression. MotifLab offers several data-processing operations that can be used to create, manipulate and analyse data objects, and complete analysis workflows can be constructed and automatically executed within MotifLab, including graphical presentation of the results.

Conclusions

We have developed MotifLab as a flexible workbench for motif analysis in a genomic context. The flexibility and effectiveness of this workbench has been demonstrated on selected test cases, in particular two previously published benchmark data sets for single motifs and modules, and a realistic example of genes responding to treatment with forskolin. MotifLab is freely available at http://www.motiflab.org.


Narlikar L, Mehta N, Galande S, Arjunwadkar M. One size does not fit all: on how Markov model order dictates performance of genomic sequence analyses. Nucleic Acids Res 2012;41:1416-24. [PMID: 23267010 PMCID: PMC3562003 DOI: 10.1093/nar/gks1285] [Citation(s) in RCA: 21] [Impact Index Per Article: 1.8] [Reference Citation Analysis] [Abstract] [Track Full Text] [Download PDF] [Figures] [Journal Information] [Subscribe] [Scholar Register] [Indexed: 02/01/2023] Open

Narlikar L. MuMoD: a Bayesian approach to detect multiple modes of protein-DNA binding from genome-wide ChIP data. Nucleic Acids Res 2012;41:21-32. [PMID: 23093591 PMCID: PMC3592440 DOI: 10.1093/nar/gks950] [Citation(s) in RCA: 16] [Impact Index Per Article: 1.3] [Reference Citation Analysis] [Abstract] [MESH Headings] [Grants] [Track Full Text] [Download PDF] [Figures] [Journal Information] [Subscribe] [Scholar Register] [Indexed: 01/15/2023] Open

Hartmann H, Guthöhrlein EW, Siebert M, Luehr S, Söding J. P-value-based regulatory motif discovery using positional weight matrices. Genome Res 2012;23:181-94. [PMID: 22990209 PMCID: PMC3530678 DOI: 10.1101/gr.139881.112] [Citation(s) in RCA: 45] [Impact Index Per Article: 3.8] [Reference Citation Analysis] [Abstract] [Track Full Text] [Download PDF] [Figures] [Journal Information] [Subscribe] [Scholar Register] [Indexed: 11/25/2022]

Luehr S, Hartmann H, Söding J. The XXmotif web server for eXhaustive, weight matriX-based motif discovery in nucleotide sequences. Nucleic Acids Res 2012;40:W104-9. [PMID: 22693218 PMCID: PMC3394272 DOI: 10.1093/nar/gks602] [Citation(s) in RCA: 36] [Impact Index Per Article: 3.0] [Reference Citation Analysis] [Abstract] [Track Full Text] [Download PDF] [Figures] [Journal Information] [Subscribe] [Scholar Register] [Indexed: 12/04/2022] Open

Busser BW, Shokri L, Jaeger SA, Gisselbrecht SS, Singhania A, Berger MF, Zhou B, Bulyk ML, Michelson AM. Molecular mechanism underlying the regulatory specificity of a Drosophila homeodomain protein that specifies myoblast identity. Development 2012;139:1164-74. [PMID: 22296846 PMCID: PMC3283125 DOI: 10.1242/dev.077362] [Citation(s) in RCA: 29] [Impact Index Per Article: 2.4] [Reference Citation Analysis] [Abstract] [Track Full Text] [Figures] [Journal Information] [Subscribe] [Scholar Register] [Indexed: 01/25/2023]

Göke J, Schulz MH, Lasserre J, Vingron M. Estimation of pairwise sequence similarity of mammalian enhancers with word neighbourhood counts. Bioinformatics 2012;28:656-63. [PMID: 22247280 PMCID: PMC3289921 DOI: 10.1093/bioinformatics/bts028] [Citation(s) in RCA: 39] [Impact Index Per Article: 3.3] [Reference Citation Analysis] [Abstract] [Track Full Text] [Download PDF] [Figures] [Journal Information] [Subscribe] [Scholar Register] [Indexed: 11/14/2022]

Mahmood K, Webb GI, Song J, Whisstock JC, Konagurthu AS. Efficient large-scale protein sequence comparison and gene matching to identify orthologs and co-orthologs. Nucleic Acids Res 2011;40:e44. [PMID: 22210858 PMCID: PMC3315314 DOI: 10.1093/nar/gkr1261] [Citation(s) in RCA: 16] [Impact Index Per Article: 1.2] [Reference Citation Analysis] [Abstract] [Track Full Text] [Download PDF] [Figures] [Journal Information] [Subscribe] [Scholar Register] [Indexed: 11/12/2022] Open

Principles of dimer-specific gene regulation revealed by a comprehensive characterization of NF-κB family DNA binding. Nat Immunol 2011;13:95-102. [PMID: 22101729 PMCID: PMC3242931 DOI: 10.1038/ni.2151] [Citation(s) in RCA: 165] [Impact Index Per Article: 12.7] [Reference Citation Analysis] [Abstract] [Track Full Text] [Download PDF] [Figures] [Journal Information] [Subscribe] [Scholar Register] [Received: 06/03/2011] [Accepted: 09/26/2011] [Indexed: 12/14/2022]

Cuellar-Partida G, Buske FA, McLeay RC, Whitington T, Noble WS, Bailey TL. Epigenetic priors for identifying active transcription factor binding sites. Bioinformatics 2011;28:56-62. [PMID: 22072382 DOI: 10.1093/bioinformatics/btr614] [Citation(s) in RCA: 76] [Impact Index Per Article: 5.8] [Reference Citation Analysis] [Abstract] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Indexed: 11/14/2022]

Zhang S, Li S, Niu M, Pham PT, Su Z. MotifClick: prediction of cis-regulatory binding sites via merging cliques. BMC Bioinformatics 2011;12:238. [PMID: 21679436 PMCID: PMC3225181 DOI: 10.1186/1471-2105-12-238] [Citation(s) in RCA: 9] [Impact Index Per Article: 0.7] [Reference Citation Analysis] [Abstract] [Track Full Text] [Download PDF] [Figures] [Journal Information] [Subscribe] [Scholar Register] [Received: 12/23/2010] [Accepted: 06/16/2011] [Indexed: 11/21/2022] Open

Carvalho AM, Oliveira AL. GRISOTTO: A greedy approach to improve combinatorial algorithms for motif discovery with prior knowledge. Algorithms Mol Biol 2011;6:13. [PMID: 21513505 PMCID: PMC3112114 DOI: 10.1186/1748-7188-6-13] [Citation(s) in RCA: 2] [Impact Index Per Article: 0.2] [Reference Citation Analysis] [Abstract] [Track Full Text] [Download PDF] [Figures] [Journal Information] [Subscribe] [Scholar Register] [Received: 11/10/2010] [Accepted: 04/22/2011] [Indexed: 11/30/2022] Open

When needles look like hay: how to find tissue-specific enhancers in model organism genomes. Dev Biol 2010;350:239-54. [PMID: 21130761 DOI: 10.1016/j.ydbio.2010.11.026] [Citation(s) in RCA: 25] [Impact Index Per Article: 1.8] [Reference Citation Analysis] [Abstract] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Received: 04/14/2010] [Revised: 11/11/2010] [Accepted: 11/22/2010] [Indexed: 01/22/2023]

Garcia-Alcalde F, Blanco A, Shepherd AJ. An intuitionistic approach to scoring DNA sequences against transcription factor binding site motifs. BMC Bioinformatics 2010;11:551. [PMID: 21059262 PMCID: PMC3098096 DOI: 10.1186/1471-2105-11-551] [Citation(s) in RCA: 5] [Impact Index Per Article: 0.4] [Reference Citation Analysis] [Abstract] [Track Full Text] [Download PDF] [Figures] [Journal Information] [Subscribe] [Scholar Register] [Received: 04/27/2010] [Accepted: 11/08/2010] [Indexed: 02/04/2023] Open

Abstract

Background

Transcription factors (TFs) control transcription by binding to specific regions of DNA called transcription factor binding sites (TFBSs). The identification of TFBSs is a crucial problem in computational biology and includes the subtask of predicting the location of known TFBS motifs in a given DNA sequence. It has previously been shown that, when scoring matches to known TFBS motifs, interdependencies between positions within a motif should be taken into account. However, this remains a challenging task owing to the fact that sequences similar to those of known TFBSs can occur by chance with a relatively high frequency. Here we present a new method for matching sequences to TFBS motifs based on intuitionistic fuzzy sets (IFS) theory, an approach that has been shown to be particularly appropriate for tackling problems that embody a high degree of uncertainty.

Results

We propose SC_intuit, a new scoring method for measuring sequence-motif affinity based on IFS theory. Unlike existing methods that consider dependencies between positions, SC_intuit is designed to prevent overestimation of less conserved positions of TFBSs. For a given pair of bases, SC_intuit is computed not only as a function of their combined probability of occurrence, but also taking into account the individual importance of each single base at its corresponding position. We used SC_intuit to identify known TFBSs in DNA sequences. Our method provides excellent results when dealing with both synthetic and real data, outperforming the sensitivity and the specificity of two existing methods in all the experiments we performed.

Conclusions

The results show that SC_intuit improves the prediction quality for TFs of the existing approaches without compromising sensitivity. In addition, we show how SC_intuit can be successfully applied to real research problems. In this study the reliability of the IFS theory for motif discovery tasks is proven.
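For orientation, the subtask this abstract addresses (scoring windows of a DNA sequence against a known TFBS motif) can be sketched with a plain position weight matrix. This is the standard position-independent log-odds baseline, not SC_intuit, which additionally models dependencies between positions. The three-column motif, the uniform background of 0.25, and the test sequence are hypothetical values chosen for illustration.

```python
import math

def pwm_log_odds(site, pwm, background=0.25):
    """Sum of per-position log2 odds of a site under the motif vs. a uniform background."""
    return sum(math.log2(pwm[i][base] / background) for i, base in enumerate(site))

def best_match(sequence, pwm):
    """Score every window of motif width and return (score, offset) of the best one."""
    width = len(pwm)
    scored = [
        (pwm_log_odds(sequence[i:i + width], pwm), i)
        for i in range(len(sequence) - width + 1)
    ]
    return max(scored)

# Hypothetical 3-column motif that prefers the site "ACG".
toy_pwm = [
    {"A": 0.7, "C": 0.1, "G": 0.1, "T": 0.1},
    {"A": 0.1, "C": 0.7, "G": 0.1, "T": 0.1},
    {"A": 0.1, "C": 0.1, "G": 0.7, "T": 0.1},
]

score, offset = best_match("TTACGTT", toy_pwm)
print(offset, round(score, 3))
```

Because each column is scored on its own, a weakly conserved position contributes as freely as a strongly conserved one, which is the kind of overestimation the abstract says SC_intuit is designed to prevent.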


Schmidt D, Wilson MD, Ballester B, Schwalie PC, Brown GD, Marshall A, Kutter C, Watt S, Martinez-Jimenez CP, Mackay S, Talianidis I, Flicek P, Odom DT. Five-vertebrate ChIP-seq reveals the evolutionary dynamics of transcription factor binding. Science 2010;328:1036-40. [PMID: 20378774 PMCID: PMC3008766 DOI: 10.1126/science.1186176] [Citation(s) in RCA: 539] [Impact Index Per Article: 38.5] [Reference Citation Analysis] [Abstract] [MESH Headings] [Grants] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Indexed: 12/12/2022]