Reference Citation Analysis: Find an Article, Find a Category, Find a Journal, Find a Scholar

For: Sandve GK, Drabløs F. A survey of motif discovery methods in an integrated framework. Biol Direct 2006;1:11. [PMID: 16600018 PMCID: PMC1479319 DOI: 10.1186/1745-6150-1-11] [Citation(s) in RCA: 109] [Impact Index Per Article: 6.1] [Reference Citation Analysis] [What about the content of this article? (0)] [Track Full Text] [Download PDF] [Figures] [Journal Information] [Subscribe] [Scholar Register] [Received: 03/29/2006] [Accepted: 04/06/2006] [Indexed: 11/10/2022] Open



Cited by Other Article(s)

Maseko NN, Steenkamp ET, Wingfield BD, Wilken PM. An in Silico Approach to Identifying TF Binding Sites: Analysis of the Regulatory Regions of BUSCO Genes from Fungal Species in the Ceratocystidaceae Family. Genes (Basel) 2023;14:genes14040848. [PMID: 37107606 PMCID: PMC10137650 DOI: 10.3390/genes14040848] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Reference Citation Analysis] [Abstract] [MESH Headings] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Received: 02/27/2023] [Revised: 03/26/2023] [Accepted: 03/27/2023] [Indexed: 04/03/2023] Open

Sequence graph transform (SGT): a feature embedding function for sequence data mining. Data Min Knowl Discov 2022. [DOI: 10.1007/s10618-021-00813-0] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Reference Citation Analysis] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Indexed: 11/03/2022]

Vahed M, Vahed M, Garmire LX. BML: a versatile web server for bipartite motif discovery. Brief Bioinform 2021;23:6490318. [PMID: 34974623 PMCID: PMC8769915 DOI: 10.1093/bib/bbab536] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Reference Citation Analysis] [Abstract] [Key Words] [Track Full Text] [Download PDF] [Figures] [Journal Information] [Subscribe] [Scholar Register] [Received: 06/18/2021] [Revised: 11/18/2021] [Accepted: 11/19/2021] [Indexed: 11/28/2022] Open

Menzel M, Hurka S, Glasenhardt S, Gogol-Döring A. NoPeak: k-mer-based motif discovery in ChIP-Seq data without peak calling. Bioinformatics 2021;37:596-602. [PMID: 32991679 DOI: 10.1093/bioinformatics/btaa845] [Citation(s) in RCA: 3] [Impact Index Per Article: 1.0] [Reference Citation Analysis] [Abstract] [Journal Information] [Subscribe] [Scholar Register] [Received: 02/05/2020] [Accepted: 09/14/2020] [Indexed: 01/30/2023] Open

He Y, Shen Z, Zhang Q, Wang S, Huang DS. A survey on deep learning in DNA/RNA motif mining. Brief Bioinform 2020;22:5916939. [PMID: 33005921 PMCID: PMC8293829 DOI: 10.1093/bib/bbaa229] [Citation(s) in RCA: 34] [Impact Index Per Article: 8.5] [Reference Citation Analysis] [Abstract] [Key Words] [Download PDF] [Figures] [Journal Information] [Subscribe] [Scholar Register] [Received: 07/18/2020] [Revised: 08/19/2020] [Accepted: 08/24/2020] [Indexed: 01/18/2023] Open

Sultan I, Fromion V, Schbath S, Nicolas P. Statistical modelling of bacterial promoter sequences for regulatory motif discovery with the help of transcriptome data: application to Listeria monocytogenes. J R Soc Interface 2020;17:20200600. [PMID: 33023397 PMCID: PMC7653377 DOI: 10.1098/rsif.2020.0600] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Reference Citation Analysis] [Abstract] [Key Words] [MESH Headings] [Grants] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Received: 07/26/2020] [Accepted: 09/10/2020] [Indexed: 11/12/2022] Open

Chahal G, Tyagi S, Ramialison M. Navigating the non-coding genome in heart development and Congenital Heart Disease. Differentiation 2019;107:11-23. [PMID: 31102825 DOI: 10.1016/j.diff.2019.05.001] [Citation(s) in RCA: 11] [Impact Index Per Article: 2.2] [Reference Citation Analysis] [Abstract] [Key Words] [MESH Headings] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Received: 10/03/2018] [Revised: 01/14/2019] [Accepted: 05/06/2019] [Indexed: 12/12/2022]

Lebatteux D, Remita AM, Diallo AB. Toward an Alignment-Free Method for Feature Extraction and Accurate Classification of Viral Sequences. J Comput Biol 2019;26:519-535. [PMID: 31050550 DOI: 10.1089/cmb.2018.0239] [Citation(s) in RCA: 7] [Impact Index Per Article: 1.4] [Reference Citation Analysis] [Abstract] [Key Words] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Indexed: 12/14/2022] Open

Abstract

Classifying pathogens among emerging and re-emerging viruses is of major interest in taxonomic studies, functional genomics, host-pathogen interplay, prevention, and disease treatment. It consists of assigning a given sequence to its related group of known sequences sharing similar characteristics and traits. The challenges of such classification are associated with several virus properties, including recombination, mutation rate, multiplicity of motifs, and diversity. In domains such as pathogen monitoring and surveillance, it is important to detect and quantify known and novel taxa without relying on full and accurate alignments or virus family profiles. In this study, we propose an alignment-free method, CASTOR-KRFE, to detect discriminating subsequences within known pathogen sequences in order to accurately classify unknown pathogen sequences. This method includes three major steps: (1) vectorization of known viral genomic sequences based on k-mers to constitute the potential features, (2) efficient pattern extraction and evaluation that maximizes classification performance, and (3) prediction of the minimal set of features fitting a given criterion (threshold of performance metric and maximum number of features). We assessed this method through jackknife data partitioning on a dozen virus data sets covering the seven major virus groups and including influenza virus, Ebola virus, human immunodeficiency virus 1, hepatitis C virus, hepatitis B virus, and human papillomavirus. CASTOR-KRFE provides a weighted average F-measure >0.96 over a wide range of viruses. Our method also shows better performance on complex virus data sets than multiple subsequences extractor for classification (MISSEL), a subsequence extraction method, and the discriminative mode of the MEME pattern extraction tool.
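Step (1) of the abstract above, k-mer vectorization, can be illustrated with a minimal sketch. This is not the CASTOR-KRFE implementation: the function name `kmer_vector`, the DNA alphabet, and the default value of `k` are assumptions chosen purely for illustration.

```python
from collections import Counter
from itertools import product


def kmer_vector(sequence, k=3, alphabet="ACGT"):
    """Count overlapping k-mers and return them as a fixed-order vector.

    Hypothetical helper: the vector has one entry per word in alphabet^k,
    in lexicographic order, so vectors from different sequences are
    directly comparable. Windows containing symbols outside the alphabet
    (for example N) are simply not counted.
    """
    sequence = sequence.upper()
    counts = Counter(sequence[i:i + k] for i in range(len(sequence) - k + 1))
    vocabulary = ["".join(word) for word in product(alphabet, repeat=k)]
    return [counts.get(word, 0) for word in vocabulary]


# Usage: a 6-base sequence yields five overlapping 2-mers.
vec = kmer_vector("ACGTAC", k=2)
```

Each sequence becomes a vector of length 4^k, which a downstream feature-selection or classification step could then consume.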


Djordjevic M, Rodic A, Graovac S. From biophysics to 'omics and systems biology. European Biophysics Journal 2019;48:413-424. [PMID: 30972433 DOI: 10.1007/s00249-019-01366-3] [Citation(s) in RCA: 8] [Impact Index Per Article: 1.6] [Reference Citation Analysis] [Abstract] [Key Words] [Subscribe] [Scholar Register] [Received: 10/12/2018] [Revised: 02/12/2019] [Accepted: 04/03/2019] [Indexed: 01/03/2023]

Bioinformatics Approaches to Gain Insights into cis-Regulatory Motifs Involved in mRNA Localization. Advances in Experimental Medicine and Biology 2019;1203:165-194. [PMID: 31811635 DOI: 10.1007/978-3-030-31434-7_7] [Citation(s) in RCA: 1] [Impact Index Per Article: 0.2] [Reference Citation Analysis] [Abstract] [Key Words] [Track Full Text] [Subscribe] [Scholar Register] [Indexed: 12/21/2022]

Saad C, Noé L, Richard H, Leclerc J, Buisine MP, Touzet H, Figeac M. DiNAMO: highly sensitive DNA motif discovery in high-throughput sequencing data. BMC Bioinformatics 2018;19:223. [PMID: 29890948 PMCID: PMC5996464 DOI: 10.1186/s12859-018-2215-1] [Citation(s) in RCA: 3] [Impact Index Per Article: 0.5] [Reference Citation Analysis] [Abstract] [Key Words] [Track Full Text] [Download PDF] [Figures] [Journal Information] [Subscribe] [Scholar Register] [Received: 05/31/2017] [Accepted: 05/21/2018] [Indexed: 12/30/2022] Open

Caldonazzo Garbelini JM, Kashiwabara AY, Sanches DS. Sequence motif finder using memetic algorithm. BMC Bioinformatics 2018;19:4. [PMID: 29298679 PMCID: PMC5751424 DOI: 10.1186/s12859-017-2005-1] [Citation(s) in RCA: 10] [Impact Index Per Article: 1.7] [Reference Citation Analysis] [Abstract] [Key Words] [Track Full Text] [Download PDF] [Figures] [Journal Information] [Subscribe] [Scholar Register] [Received: 08/08/2017] [Accepted: 12/18/2017] [Indexed: 11/10/2022] Open

Triska M, Ivliev A, Nikolsky Y, Tatarinova TV. Analysis of cis-Regulatory Elements in Gene Co-expression Networks in Cancer. Methods Mol Biol 2017;1613:291-310. [PMID: 28849565 DOI: 10.1007/978-1-4939-7027-8_11] [Citation(s) in RCA: 8] [Impact Index Per Article: 1.1] [Reference Citation Analysis] [Abstract] [Key Words] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Indexed: 01/22/2023]

Abstract

Analysis of gene co-expression networks is a powerful "data-driven" tool, invaluable for understanding cancer biology and mechanisms of tumor development. Yet, despite the completion of thousands of studies on cancer gene expression, there have been few attempts to normalize and integrate co-expression data from scattered sources in a concise "meta-analysis" framework. Here we describe an integrated approach to cancer expression meta-analysis, which combines generation of "data-driven" co-expression networks with detailed statistical detection of promoter sequence motifs within the co-expression clusters. First, we applied the Weighted Gene Co-Expression Network Analysis (WGCNA) workflow and Pearson's correlation to generate a comprehensive set of over 3000 co-expression clusters in 82 normalized microarray datasets from nine cancers of different origin. Next, we designed a genome-wide statistical approach to the detection of specific DNA sequence motifs based on similarities between the promoters of similarly expressed genes. The approach, realized as the cisExpress software module, was specifically designed for analysis of very large data sets such as those generated by publicly accessible whole genome and transcriptome projects. cisExpress uses a task farming algorithm to exploit all available computational cores within a shared memory node. We discovered that although co-expression modules are populated with different sets of genes, they share distinct stable patterns of co-regulation based on promoter sequence analysis. The number of motifs per co-expression cluster varies widely in accordance with cancer tissue of origin, with the largest number in colon (68 motifs) and the lowest in ovary (18 motifs). The top scored motifs are typically shared between several tissues; they define sets of target genes responsible for certain functionality of carcinogenesis. Both the co-expression modules and a database of precalculated motifs are publicly available and accessible for further studies.


Czeizler E, Hirvola T, Karhu K. A graph-theoretical approach for motif discovery in protein sequences. IEEE/ACM Transactions on Computational Biology and Bioinformatics 2017;14:121-130. [PMID: 28055896 DOI: 10.1109/tcbb.2015.2511750] [Citation(s) in RCA: 1] [Impact Index Per Article: 0.1] [Reference Citation Analysis] [Abstract] [Track Full Text] [Subscribe] [Scholar Register] [Indexed: 06/06/2023]

Jayaram N, Usvyat D, R Martin AC. Evaluating tools for transcription factor binding site prediction. BMC Bioinformatics 2016;17:547. [PMID: 27806697 PMCID: PMC6889335 DOI: 10.1186/s12859-016-1298-9] [Citation(s) in RCA: 58] [Impact Index Per Article: 7.3] [Reference Citation Analysis] [Abstract] [Key Words] [Track Full Text] [Download PDF] [Figures] [Journal Information] [Subscribe] [Scholar Register] [Received: 06/19/2016] [Accepted: 10/20/2016] [Indexed: 12/21/2022] Open

Tangirala K, Herndon N, Caragea D. A Comparative Analysis Between k-Mers and Community Detection-Based Features for the Task of Protein Classification. IEEE Trans Nanobioscience 2016;15:84-92. [PMID: 26863669 PMCID: PMC6245644 DOI: 10.1109/tnb.2016.2523501] [Citation(s) in RCA: 1] [Impact Index Per Article: 0.1] [Reference Citation Analysis] [Abstract] [Key Words] [MESH Headings] [Grants] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Indexed: 11/07/2022]

Kibet CK, Machanick P. Transcription factor motif quality assessment requires systematic comparative analysis. F1000Res 2015;4:ISCB Comm J-1429. [PMID: 27092243 PMCID: PMC4821295 DOI: 10.12688/f1000research.7408.2] [Citation(s) in RCA: 7] [Impact Index Per Article: 0.8] [Reference Citation Analysis] [Abstract] [Key Words] [Track Full Text] [Download PDF] [Figures] [Journal Information] [Submit a Manuscript] [Subscribe] [Scholar Register] [Accepted: 02/29/2016] [Indexed: 11/22/2022] Open

Kibet CK, Machanick P. Transcription factor motif quality assessment requires systematic comparative analysis. F1000Res 2015;4:ISCB Comm J-1429. [PMID: 27092243 DOI: 10.12688/f1000research.7408.1] [Citation(s) in RCA: 1] [Impact Index Per Article: 0.1] [Reference Citation Analysis] [Abstract] [Key Words] [Track Full Text] [Journal Information] [Submit a Manuscript] [Subscribe] [Scholar Register] [Accepted: 11/19/2015] [Indexed: 03/26/2024] Open

Maynou J, Pairó E, Marco S, Perera A. Sequence information gain based motif analysis. BMC Bioinformatics 2015;16:377. [PMID: 26553056 PMCID: PMC4640167 DOI: 10.1186/s12859-015-0811-x] [Citation(s) in RCA: 1] [Impact Index Per Article: 0.1] [Reference Citation Analysis] [Abstract] [MESH Headings] [Track Full Text] [Download PDF] [Figures] [Journal Information] [Subscribe] [Scholar Register] [Received: 11/22/2014] [Accepted: 10/30/2015] [Indexed: 11/23/2022] Open

Zhang Y, He Y, Zheng G, Wei C. MOST+: A de novo motif finding approach combining genomic sequence and heterogeneous genome-wide signatures. BMC Genomics 2015;16 Suppl 7:S13. [PMID: 26099518 PMCID: PMC4474412 DOI: 10.1186/1471-2164-16-s7-s13] [Citation(s) in RCA: 6] [Impact Index Per Article: 0.7] [Reference Citation Analysis] [Abstract] [Track Full Text] [Download PDF] [Figures] [Journal Information] [Subscribe] [Scholar Register] [Indexed: 01/09/2023] Open

Slattery M, Zhou T, Yang L, Dantas Machado AC, Gordân R, Rohs R. Absence of a simple code: how transcription factors read the genome. Trends Biochem Sci 2014;39:381-99. [PMID: 25129887 DOI: 10.1016/j.tibs.2014.07.002] [Citation(s) in RCA: 332] [Impact Index Per Article: 33.2] [Reference Citation Analysis] [Abstract] [Key Words] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Received: 06/03/2014] [Revised: 07/11/2014] [Accepted: 07/15/2014] [Indexed: 12/21/2022]

Wong AKC, Lee ESA. Aligning and Clustering Patterns to Reveal the Protein Functionality of Sequences. IEEE/ACM Transactions on Computational Biology and Bioinformatics 2014;11:548-560. [PMID: 26356022 DOI: 10.1109/tcbb.2014.2306840] [Citation(s) in RCA: 14] [Impact Index Per Article: 1.4] [Reference Citation Analysis] [Abstract] [MESH Headings] [Track Full Text] [Subscribe] [Scholar Register] [Indexed: 06/05/2023]

Carvalho L. Bayesian centroid estimation for motif discovery. PLoS One 2013;8:e80511. [PMID: 24324603 PMCID: PMC3855595 DOI: 10.1371/journal.pone.0080511] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Reference Citation Analysis] [Abstract] [MESH Headings] [Track Full Text] [Download PDF] [Figures] [Journal Information] [Subscribe] [Scholar Register] [Received: 06/04/2013] [Accepted: 10/03/2013] [Indexed: 11/29/2022] Open

Transcription of Tnfaip3 is regulated by NF-κB and p38 via C/EBPβ in activated macrophages. PLoS One 2013;8:e73153. [PMID: 24023826 PMCID: PMC3759409 DOI: 10.1371/journal.pone.0073153] [Citation(s) in RCA: 26] [Impact Index Per Article: 2.4] [Reference Citation Analysis] [Abstract] [Track Full Text] [Download PDF] [Figures] [Journal Information] [Subscribe] [Scholar Register] [Received: 02/19/2013] [Accepted: 07/17/2013] [Indexed: 11/19/2022] Open

Triska M, Grocutt D, Southern J, Murphy DJ, Tatarinova T. cisExpress: motif detection in DNA sequences. Bioinformatics 2013;29:2203-5. [PMID: 23793750 DOI: 10.1093/bioinformatics/btt366] [Citation(s) in RCA: 16] [Impact Index Per Article: 1.5] [Reference Citation Analysis] [Abstract] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Indexed: 11/14/2022]

Leibovich L, Paz I, Yakhini Z, Mandel-Gutfreund Y. DRIMust: a web server for discovering rank imbalanced motifs using suffix trees. Nucleic Acids Res 2013;41:W174-9. [PMID: 23685432 PMCID: PMC3692051 DOI: 10.1093/nar/gkt407] [Citation(s) in RCA: 46] [Impact Index Per Article: 4.2] [Reference Citation Analysis] [Abstract] [Track Full Text] [Download PDF] [Figures] [Journal Information] [Subscribe] [Scholar Register] [Indexed: 11/12/2022] Open

Orenstein Y, Linhart C, Shamir R. Assessment of algorithms for inferring positional weight matrix motifs of transcription factor binding sites using protein binding microarray data. PLoS One 2012;7:e46145. [PMID: 23029415 PMCID: PMC3460961 DOI: 10.1371/journal.pone.0046145] [Citation(s) in RCA: 20] [Impact Index Per Article: 1.7] [Reference Citation Analysis] [Abstract] [MESH Headings] [Track Full Text] [Download PDF] [Figures] [Journal Information] [Subscribe] [Scholar Register] [Received: 07/23/2012] [Accepted: 08/27/2012] [Indexed: 01/05/2023] Open

Lee C, Huang CH. Searching for transcription factor binding sites in vector spaces. BMC Bioinformatics 2012;13:215. [PMID: 23244338 PMCID: PMC3543194 DOI: 10.1186/1471-2105-13-215] [Citation(s) in RCA: 11] [Impact Index Per Article: 0.9] [Reference Citation Analysis] [Abstract] [MESH Headings] [Track Full Text] [Download PDF] [Figures] [Journal Information] [Subscribe] [Scholar Register] [Received: 01/29/2012] [Accepted: 08/16/2012] [Indexed: 11/10/2022] Open

Abstract

BACKGROUND

Computational approaches to transcription factor binding site identification have been actively researched in the past decade. Learning from known binding sites, new binding sites of a transcription factor in unannotated sequences can be identified. A number of search methods have been introduced over the years. However, one can rarely find one single method that performs the best on all the transcription factors. Instead, to identify the best method for a particular transcription factor, one usually has to compare a handful of methods. Hence, it is highly desirable for a method to perform automatic optimization for individual transcription factors.

RESULTS

We proposed to search for transcription factor binding sites in vector spaces. This framework allows us to identify the best method for each individual transcription factor. We further introduced two novel methods, the negative-to-positive vector (NPV) and optimal discriminating vector (ODV) methods, to construct query vectors to search for binding sites in vector spaces. Extensive cross-validation experiments showed that the proposed methods significantly outperformed the ungapped likelihood under positional background method, a state-of-the-art method, and the widely-used position-specific scoring matrix method. We further demonstrated that motif subtypes of a TF can be readily identified in this framework and two variants called the kNPV and kODV methods benefited significantly from motif subtype identification. Finally, independent validation on ChIP-seq data showed that the ODV and NPV methods significantly outperformed the other compared methods.

CONCLUSIONS

We conclude that the proposed framework is highly flexible. It enables the two novel methods to automatically identify a TF-specific subspace to search for binding sites. Implementations are available as source code at: http://biogrid.engr.uconn.edu/tfbs_search/.


Prosperi MCF, Prosperi L, Gray RR, Salemi M. On counting the frequency distribution of string motifs in molecular sequences. Int J Biomath 2012. [DOI: 10.1142/s1793524512500556] [Citation(s) in RCA: 4] [Impact Index Per Article: 0.3] [Reference Citation Analysis] [Abstract] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Indexed: 11/18/2022]

Deyneko IV, Weiss S, Leschner S. An integrative computational approach to effectively guide experimental identification of regulatory elements in promoters. BMC Bioinformatics 2012;13:202. [PMID: 22897887 PMCID: PMC3465240 DOI: 10.1186/1471-2105-13-202] [Citation(s) in RCA: 5] [Impact Index Per Article: 0.4] [Reference Citation Analysis] [Abstract] [Track Full Text] [Download PDF] [Figures] [Journal Information] [Subscribe] [Scholar Register] [Received: 01/23/2012] [Accepted: 08/01/2012] [Indexed: 01/22/2023] Open

Abstract

Background

Transcriptional activity of genes depends on many factors, such as DNA motifs, conformational characteristics of DNA, and melting, and there are computational approaches for their identification. However, in real applications, the number of predicted elements (for example, DNA motifs) may be very large. When several computational programs are applied, systematic experimental knockout of each potential element becomes unproductive. Hence, one needs an approach that can integrate many heterogeneous computational methods and, on that basis, suggest selected regulatory elements for experimental verification.

Results

Here, we present an integrative bioinformatic approach aimed at the discovery of regulatory modules that can be effectively verified experimentally. It is based on combinatorial analysis of known and novel binding motifs, as well as of any other known features of promoters. The goal of this method is the identification of a collection of modules that are specific for an established dataset and at the same time are optimal for experimental verification. The method is particularly effective on small datasets, where most statistical approaches fail. We apply it to promoters that drive tumor-specific gene expression in tumor-colonizing Gram-negative bacteria. The method successfully identified a number of potential modules, which required only a few experiments to be verified. The resulting minimal functional bacterial promoter exhibited high specificity of expression in cancerous tissue.

Conclusions

Experimental analysis of promoter structures guided by bioinformatics has proved to be efficient. The developed computational method is able to include heterogeneous features of promoters and suggest combinatorial modules for experimental testing. Expansibility and robustness of the methodology implemented in the approach ensures good results for a wide range of problems.


Cserháti M, Turóczy Z, Dudits D, Györgyey J. The rice word landscape: a detailed catalogue of the rice motif content in the non-coding regions. OMICS: A Journal of Integrative Biology 2012;16:334-42. [PMID: 22702246 DOI: 10.1089/omi.2011.0056] [Citation(s) in RCA: 1] [Impact Index Per Article: 0.1] [Reference Citation Analysis] [Abstract] [Track Full Text] [Subscribe] [Scholar Register] [Indexed: 11/13/2022]

Cserháti M, Turóczy Z, Dudits D, Györgyey J. The rice word landscape--a detailed catalog of the rice motif content in the noncoding regions. OMICS: A Journal of Integrative Biology 2012;15:819-28. [PMID: 22122670 DOI: 10.1089/omi.2011.0132] [Citation(s) in RCA: 2] [Impact Index Per Article: 0.2] [Reference Citation Analysis] [Abstract] [Track Full Text] [Subscribe] [Scholar Register] [Indexed: 12/20/2022]

Zambelli F, Pesole G, Pavesi G. Motif discovery and transcription factor binding sites before and after the next-generation sequencing era. Brief Bioinform 2012;14:225-37. [PMID: 22517426 PMCID: PMC3603212 DOI: 10.1093/bib/bbs016] [Citation(s) in RCA: 93] [Impact Index Per Article: 7.8] [Reference Citation Analysis] [Abstract] [Track Full Text] [Download PDF] [Figures] [Journal Information] [Subscribe] [Scholar Register] [Indexed: 02/07/2023] Open

Pairó E, Maynou J, Marco S, Perera A. A subspace method for the detection of transcription factor binding sites. Bioinformatics 2012;28:1328-35. [DOI: 10.1093/bioinformatics/bts147] [Citation(s) in RCA: 6] [Impact Index Per Article: 0.5] [Reference Citation Analysis] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Indexed: 11/13/2022] Open

Leibovich L, Yakhini Z. Efficient motif search in ranked lists and applications to variable gap motifs. Nucleic Acids Res 2012;40:5832-47. [PMID: 22416066 PMCID: PMC3401424 DOI: 10.1093/nar/gks206] [Citation(s) in RCA: 17] [Impact Index Per Article: 1.4] [Reference Citation Analysis] [Abstract] [Track Full Text] [Download PDF] [Figures] [Journal Information] [Subscribe] [Scholar Register] [Indexed: 11/23/2022] Open

Wang D, Do HT. Computational localization of transcription factor binding sites using extreme learning machines. Soft comput 2012. [DOI: 10.1007/s00500-012-0820-x] [Citation(s) in RCA: 1] [Impact Index Per Article: 0.1] [Reference Citation Analysis] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Indexed: 01/27/2023]

Vijayvargiya S, Shukla P. A niched Pareto genetic algorithm for finding variable length regulatory motifs in DNA sequences. 3 Biotech 2011. [PMCID: PMC3376862 DOI: 10.1007/s13205-011-0040-6] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Reference Citation Analysis] [Abstract] [Track Full Text] [Download PDF] [Figures] [Journal Information] [Subscribe] [Scholar Register] [Indexed: 11/08/2022] Open

Ichinose N, Yada T, Gotoh O. Large-scale motif discovery using DNA Gray code and equiprobable oligomers. Bioinformatics 2011;28:25-31. [PMID: 22057160 PMCID: PMC3244767 DOI: 10.1093/bioinformatics/btr606] [Citation(s) in RCA: 13] [Impact Index Per Article: 1.0] [Reference Citation Analysis] [Abstract] [Track Full Text] [Download PDF] [Figures] [Journal Information] [Subscribe] [Scholar Register] [Indexed: 11/25/2022]

Linhart C, Halperin Y, Darom A, Kidron S, Broday L, Shamir R. A novel candidate cis-regulatory motif pair in the promoters of germline and oogenesis genes in C. elegans. Genome Res 2011;22:76-83. [PMID: 21930893 DOI: 10.1101/gr.115626.110] [Citation(s) in RCA: 4] [Impact Index Per Article: 0.3] [Reference Citation Analysis] [Abstract] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Indexed: 02/03/2023]

Tree-based position weight matrix approach to model transcription factor binding site profiles. PLoS One 2011;6:e24210. [PMID: 21912677 PMCID: PMC3166302 DOI: 10.1371/journal.pone.0024210] [Citation(s) in RCA: 16] [Impact Index Per Article: 1.2] [Reference Citation Analysis] [Abstract] [Track Full Text] [Download PDF] [Figures] [Journal Information] [Subscribe] [Scholar Register] [Received: 03/22/2011] [Accepted: 08/02/2011] [Indexed: 11/30/2022] Open

Zhang S, Li S, Niu M, Pham PT, Su Z. MotifClick: prediction of cis-regulatory binding sites via merging cliques. BMC Bioinformatics 2011;12:238. [PMID: 21679436 PMCID: PMC3225181 DOI: 10.1186/1471-2105-12-238] [Citation(s) in RCA: 9] [Impact Index Per Article: 0.7] [Reference Citation Analysis] [Abstract] [Track Full Text] [Download PDF] [Figures] [Journal Information] [Subscribe] [Scholar Register] [Received: 12/23/2010] [Accepted: 06/16/2011] [Indexed: 11/21/2022] Open

Zheng X, Liu T, Yang Z, Wang J. Large cliques in Arabidopsis gene coexpression network and motif discovery. Journal of Plant Physiology 2011;168:611-618. [PMID: 21044807 DOI: 10.1016/j.jplph.2010.09.010] [Citation(s) in RCA: 4] [Impact Index Per Article: 0.3] [Reference Citation Analysis] [Abstract] [MESH Headings] [Track Full Text] [Subscribe] [Scholar Register] [Received: 05/29/2010] [Revised: 08/31/2010] [Accepted: 09/06/2010] [Indexed: 05/30/2023]

Cserháti M, Turóczy Z, Zombori Z, Cserzo M, Dudits D, Pongor S, Györgyey J. Prediction of new abiotic stress genes in Arabidopsis thaliana and Oryza sativa according to enumeration-based statistical analysis. Mol Genet Genomics 2011;285:375-91. [PMID: 21437642 DOI: 10.1007/s00438-011-0605-4] [Citation(s) in RCA: 1] [Impact Index Per Article: 0.1] [Reference Citation Analysis] [Abstract] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Received: 12/03/2010] [Accepted: 01/31/2011] [Indexed: 10/18/2022]

Chen RM, Hou MT, Chang NW, Chen YT, Tsai JJP. Cumulative spectral repeat finder (CSRF): a spectral approach for identifying the length of repeats in DNA sequences. Int J Artif Intell T 2011. [DOI: 10.1142/s0218213011000073] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Reference Citation Analysis] [Abstract] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Indexed: 11/18/2022]

When needles look like hay: how to find tissue-specific enhancers in model organism genomes. Dev Biol 2010;350:239-54. [PMID: 21130761 DOI: 10.1016/j.ydbio.2010.11.026] [Citation(s) in RCA: 25] [Impact Index Per Article: 1.8] [Reference Citation Analysis] [Abstract] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Received: 04/14/2010] [Revised: 11/11/2010] [Accepted: 11/22/2010] [Indexed: 01/22/2023]

Oberto J. FITBAR: a web tool for the robust prediction of prokaryotic regulons. BMC Bioinformatics 2010;11:554. [PMID: 21070640 PMCID: PMC3098098 DOI: 10.1186/1471-2105-11-554] [Citation(s) in RCA: 24] [Impact Index Per Article: 1.7] [Reference Citation Analysis] [Abstract] [Track Full Text] [Download PDF] [Figures] [Journal Information] [Subscribe] [Scholar Register] [Received: 06/29/2010] [Accepted: 11/11/2010] [Indexed: 11/24/2022] Open

Abstract

Background

The binding of regulatory proteins to their specific DNA targets determines the accurate expression of the neighboring genes. The in silico prediction of new binding sites in completely sequenced genomes is a key aspect in the deeper understanding of gene regulatory networks. Several algorithms have been described to discriminate against false-positives in the prediction of new binding targets; however none of them has been implemented so far to assist the detection of binding sites at the genomic scale.

Results

FITBAR (Fast Investigation Tool for Bacterial and Archaeal Regulons) is a web service designed to identify new protein binding sites on fully sequenced prokaryotic genomes. This tool consists of a workbench where the significance of the predictions can be compared using different statistical methods, a feature not found in existing resources. The Local Markov Model and the Compound Importance Sampling algorithms have been implemented to compute the P-value of newly discovered binding sites. In addition, FITBAR provides two optimized genomic scanning algorithms using either log-odds or entropy-weighted position-specific scoring matrices. Other significant features include the production of a detailed genomic context map for each detected binding site and the export of the search results in spreadsheet and portable document formats. The FITBAR discovery of a high-affinity Escherichia coli NagC binding site was validated experimentally in vitro as well as in vivo and published.

Conclusions

FITBAR was developed in order to allow fast, accurate and statistically robust predictions of prokaryotic regulons. This feature constitutes the main advantage of this web tool over other matrix search programs and does not impair its performance. The web service is available at http://archaea.u-psud.fr/fitbar.
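The abstract above mentions scanning a genome with log-odds position-specific scoring matrices. The general technique can be sketched as follows. This is a generic illustration, not FITBAR's code: the function names, the uniform background, and the pseudocount of 1 are all assumptions.

```python
import math


def log_odds_pssm(sites, pseudocount=1.0, alphabet="ACGT"):
    """Build a log2-odds matrix from equal-length aligned binding sites.

    Assumes a uniform background and an additive pseudocount
    (both illustrative choices, not taken from the cited paper).
    """
    background = 1.0 / len(alphabet)
    total = len(sites) + pseudocount * len(alphabet)
    pssm = []
    for pos in range(len(sites[0])):
        column = [site[pos] for site in sites]
        pssm.append({
            base: math.log2(((column.count(base) + pseudocount) / total) / background)
            for base in alphabet
        })
    return pssm


def scan(sequence, pssm):
    """Score every window of the sequence; returns (position, score) pairs.

    The sequence must contain only symbols from the matrix alphabet.
    """
    width = len(pssm)
    return [
        (i, sum(pssm[j][sequence[i + j]] for j in range(width)))
        for i in range(len(sequence) - width + 1)
    ]


def best_hit(sequence, pssm):
    """Return the highest-scoring (position, score) pair."""
    return max(scan(sequence, pssm), key=lambda hit: hit[1])


# Usage: three toy sites, then locate the best match in a short sequence.
matrix = log_odds_pssm(["TATA", "TATA", "TACA"])
position, score = best_hit("GGTATAGG", matrix)
```

A real tool would additionally attach a P-value to each score, which is the role the abstract assigns to the Local Markov Model and Compound Importance Sampling algorithms; that step is not attempted here.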


Li G, Chan TM, Leung KS, Lee KH. A cluster refinement algorithm for motif discovery. IEEE/ACM Transactions on Computational Biology and Bioinformatics 2010;7:654-668. [PMID: 21030733 DOI: 10.1109/tcbb.2009.25] [Citation(s) in RCA: 6] [Impact Index Per Article: 0.4] [Reference Citation Analysis] [Abstract] [MESH Headings] [Track Full Text] [Subscribe] [Scholar Register] [Indexed: 05/30/2023]

Mason MJ, Plath K, Zhou Q. Identification of context-dependent motifs by contrasting ChIP binding data. Bioinformatics 2010;26:2826-32. [PMID: 20870645 DOI: 10.1093/bioinformatics/btq546] [Citation(s) in RCA: 34] [Impact Index Per Article: 2.4] [Reference Citation Analysis] [Abstract] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Indexed: 11/12/2022]

Klepper K, Drabløs F. PriorsEditor: a tool for the creation and use of positional priors in motif discovery. Bioinformatics 2010;26:2195-7. [PMID: 20628076 PMCID: PMC2922893 DOI: 10.1093/bioinformatics/btq357] [Citation(s) in RCA: 7] [Impact Index Per Article: 0.5] [Reference Citation Analysis] [Abstract] [Track Full Text] [Download PDF] [Figures] [Journal Information] [Subscribe] [Scholar Register] [Indexed: 11/29/2022] Open

Ma PC, Chan KC. Discovering Interesting Motif-Sets for Multi-Class Protein Sequence Classification. J Comput Biol 2010;17:733-43. [DOI: 10.1089/cmb.2008.0213] [Citation(s) in RCA: 1] [Impact Index Per Article: 0.1] [Reference Citation Analysis] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Indexed: 11/13/2022] Open