Reference Citation Analysis: Find an Article, Find a Category, Find a Journal, Find a Scholar

For: Day WH, McMorris FR. Critical comparison of consensus methods for molecular sequences. Nucleic Acids Res 1992;20:1093-9. [PMID: 1549472 PMCID: PMC312096 DOI: 10.1093/nar/20.5.1093] [Citation(s) in RCA: 57] [Impact Index Per Article: 1.8] [Indexed: 12/27/2022] Open

Cited by Other Article(s)

Tognon M, Giugno R, Pinello L. A survey on algorithms to characterize transcription factor binding sites. Brief Bioinform 2023;24:bbad156. [PMID: 37099664 PMCID: PMC10422928 DOI: 10.1093/bib/bbad156] [Citation(s) in RCA: 5] [Impact Index Per Article: 5.0] [Received: 01/13/2023] [Revised: 03/27/2023] [Accepted: 04/01/2023] [Indexed: 04/28/2023] Open

Mohanty S, Pattnaik PK, Al-Absi AA, Kang DK. A Review on Planted (l, d) Motif Discovery Algorithms for Medical Diagnose. Sensors (Basel) 2022;22:1204. [PMID: 35161949 PMCID: PMC8838483 DOI: 10.3390/s22031204] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 12/01/2021] [Revised: 01/19/2022] [Accepted: 01/31/2022] [Indexed: 11/16/2022]

Quan L, Mei J, He R, Sun X, Nie L, Li K, Lyu Q. Quantifying Intensities of Transcription Factor-DNA Binding by Learning From an Ensemble of Protein Binding Microarrays. IEEE J Biomed Health Inform 2021;25:2811-2819. [PMID: 33571101 DOI: 10.1109/jbhi.2021.3058518] [Citation(s) in RCA: 1] [Impact Index Per Article: 0.3] [Indexed: 11/05/2022]

RNAdemocracy: an ensemble method for RNA secondary structure prediction using consensus scoring. Comput Biol Chem 2019;83:107151. [DOI: 10.1016/j.compbiolchem.2019.107151] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 09/30/2018] [Revised: 06/05/2019] [Accepted: 10/15/2019] [Indexed: 11/18/2022]

Rogozin IB, Pavlov YI, Goncearenco A, De S, Lada AG, Poliakov E, Panchenko AR, Cooper DN. Mutational signatures and mutable motifs in cancer genomes. Brief Bioinform 2019;19:1085-1101. [PMID: 28498882 DOI: 10.1093/bib/bbx049] [Citation(s) in RCA: 23] [Impact Index Per Article: 4.6] [Received: 01/25/2017] [Indexed: 12/22/2022] Open

Lee NK, Li X, Wang D. A comprehensive survey on genetic algorithms for DNA motif prediction. Inf Sci (N Y) 2018. [DOI: 10.1016/j.ins.2018.07.004] [Citation(s) in RCA: 13] [Impact Index Per Article: 2.2] [Indexed: 12/25/2022]

Raghunath A, Nagarajan R, Sundarraj K, Panneerselvam L, Perumal E. Genome-wide identification and analysis of Nrf2 binding sites - Antioxidant response elements in zebrafish. Toxicol Appl Pharmacol 2018;360:236-248. [PMID: 30243843 DOI: 10.1016/j.taap.2018.09.013] [Citation(s) in RCA: 6] [Impact Index Per Article: 1.0] [Received: 04/28/2018] [Revised: 09/08/2018] [Accepted: 09/13/2018] [Indexed: 12/30/2022]

Lee C, Moroldo M, Perdomo-Sabogal A, Mach N, Marthey S, Lecardonnel J, Wahlberg P, Chong AY, Estellé J, Ho SYW, Rogel-Gaillard C, Gongora J. Inferring the evolution of the major histocompatibility complex of wild pigs and peccaries using hybridisation DNA capture-based sequencing. Immunogenetics 2017;70:401-417. [PMID: 29256177 DOI: 10.1007/s00251-017-1048-9] [Citation(s) in RCA: 6] [Impact Index Per Article: 0.9] [Received: 06/19/2017] [Accepted: 11/25/2017] [Indexed: 12/20/2022]

Wolff JG. A scaleable technique for best-match retrieval of sequential information using metrics-guided search. J Inf Sci 2016. [DOI: 10.1177/016555159402000103] [Citation(s) in RCA: 7] [Impact Index Per Article: 0.9] [Indexed: 11/16/2022]

Hosseinpour B, Bakhtiarizadeh MR, Khosravi P, Ebrahimie E. Predicting distinct organization of transcription factor binding sites on the promoter regions: a new genome-based approach to expand human embryonic stem cell regulatory network. Gene 2013;531:212-9. [DOI: 10.1016/j.gene.2013.09.011] [Citation(s) in RCA: 24] [Impact Index Per Article: 2.2] [Received: 03/25/2013] [Revised: 09/01/2013] [Accepted: 09/04/2013] [Indexed: 12/23/2022]

Quader S, Huang CH. Effect of positional dependence and alignment strategy on modeling transcription factor binding sites. BMC Res Notes 2012;5:340. [PMID: 22748199 PMCID: PMC3465234 DOI: 10.1186/1756-0500-5-340] [Citation(s) in RCA: 1] [Impact Index Per Article: 0.1] [Received: 04/14/2012] [Accepted: 06/07/2012] [Indexed: 11/29/2022] Open

Intron identification approaches based on weighted features and fuzzy decision trees. Comput Biol Med 2011;42:112-22. [PMID: 22099702 DOI: 10.1016/j.compbiomed.2011.10.015] [Citation(s) in RCA: 3] [Impact Index Per Article: 0.2] [Received: 08/25/2010] [Revised: 04/11/2011] [Accepted: 10/13/2011] [Indexed: 11/22/2022]

Reid JE, Evans KJ, Dyer N, Wernisch L, Ott S. Variable structure motifs for transcription factor binding sites. BMC Genomics 2010;11:30. [PMID: 20074339 PMCID: PMC2824720 DOI: 10.1186/1471-2164-11-30] [Citation(s) in RCA: 16] [Impact Index Per Article: 1.1] [Received: 08/11/2009] [Accepted: 01/14/2010] [Indexed: 02/06/2023] Open

Abstract

Background

Classically, models of DNA-transcription factor binding sites (TFBSs) have been based on relatively few known instances and have treated them as sites of fixed length using position weight matrices (PWMs). Various extensions to this model have been proposed, most of which take account of dependencies between the bases in the binding sites. However, some transcription factors are known to exhibit some flexibility and bind to DNA in more than one possible physical configuration. In some cases this variation is known to affect the function of binding sites. With the increasing volume of ChIP-seq data available it is now possible to investigate models that incorporate this flexibility. Previous work on variable length models has been constrained by: a focus on specific zinc finger proteins in yeast using restrictive models; a reliance on hand-crafted models for just one transcription factor at a time; and a lack of evaluation on realistically sized data sets.

Results

We re-analysed binding sites from the TRANSFAC database and found motivating examples where our new variable length model provides a better fit. We analysed several ChIP-seq data sets with a novel motif search algorithm and compared the results to one of the best standard PWM finders and a recently developed alternative method for finding motifs of variable structure. All the methods performed comparably in held-out cross validation tests. Known motifs of variable structure were recovered for p53, Stat5a and Stat5b. In addition our method recovered a novel generalised version of an existing PWM for Sp1 that allows for variable length binding. This motif improved classification performance.

Conclusions

We have presented a new gapped PWM model for variable length DNA binding sites that is neither too restrictive nor over-parameterised. Our comparison with existing tools shows that on average it does not have better predictive accuracy than existing methods. However, it does provide more interpretable models of motifs of variable structure that are suitable for follow-up structural studies. To our knowledge, we are the first to apply variable length motif models to eukaryotic ChIP-seq data sets and consequently the first to show their value in this domain. The results include a novel motif for the ubiquitous transcription factor Sp1.

Sereno PC. Comparative cladistics. Cladistics 2009;25:624-659. [DOI: 10.1111/j.1096-0031.2009.00265.x] [Citation(s) in RCA: 34] [Impact Index Per Article: 2.3] [Indexed: 11/30/2022] Open

van Hijum SAFT, Medema MH, Kuipers OP. Mechanisms and evolution of control logic in prokaryotic transcriptional regulation. Microbiol Mol Biol Rev 2009;73:481-509, Table of Contents. [PMID: 19721087 PMCID: PMC2738135 DOI: 10.1128/mmbr.00037-08] [Citation(s) in RCA: 96] [Impact Index Per Article: 6.4] [Indexed: 11/20/2022] Open

Hou L, Qian MP, Zhu YP, Deng MH. Advances on bioinformatic research in transcription factor binding sites. Yi Chuan 2009;31:365-73. [DOI: 10.3724/sp.j.1005.2009.00365] [Citation(s) in RCA: 4] [Impact Index Per Article: 0.3] [Indexed: 11/25/2022]

Zare-Mirakabad F, Ahrabian H, Sadeghi M, Nowzari-Dalini A, Goliaei B. New scoring schema for finding motifs in DNA Sequences. BMC Bioinformatics 2009;10:93. [PMID: 19302709 PMCID: PMC2679735 DOI: 10.1186/1471-2105-10-93] [Citation(s) in RCA: 9] [Impact Index Per Article: 0.6] [Received: 09/14/2008] [Accepted: 03/20/2009] [Indexed: 11/10/2022] Open

Abstract

BACKGROUND

Pattern discovery in DNA sequences is one of the most fundamental problems in molecular biology with important applications in finding regulatory signals and transcription factor binding sites. An important task in this problem is to search for (or predict) known binding sites in a new DNA sequence. To this end, all subsequences of the given DNA sequence are scored with a scoring function and the prediction is made by selecting the best score. Most of the available tools for known binding site prediction are designed under the assumption of no dependency between binding site base positions. Recently Tomovic and Oakeley investigated the statistical basis for either a claim of dependence or independence, to determine whether such a claim is generally true, and they presented a scoring function for binding site prediction based on the dependency between binding site base positions. Our primary objective is to investigate the scoring functions which can be used in known binding site prediction based on the assumption of dependence or independence between binding site base positions.
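
The search task described in this background (score every subsequence, keep the best) can be sketched under the position-independence assumption as a log-odds position weight matrix scan. This is only a minimal illustration and not the scoring function of any paper listed here; the function names `build_log_odds` and `best_window`, the pseudocount of 1, and the uniform background frequency of 0.25 are assumptions made for the example.

```python
import math

def build_log_odds(sites, pseudocount=1.0, background=0.25):
    """Build a log-odds matrix from equal-length aligned sites (assumed A/C/G/T only)."""
    matrix = []
    for pos in range(len(sites[0])):
        # Start every base at the pseudocount so unseen bases get a finite score.
        counts = {base: pseudocount for base in "ACGT"}
        for site in sites:
            counts[site[pos]] += 1
        total = sum(counts.values())
        matrix.append({b: math.log2(counts[b] / total / background) for b in "ACGT"})
    return matrix

def best_window(sequence, matrix):
    """Score every window of the sequence and return (best score, start index)."""
    width = len(matrix)
    scored = [
        (sum(matrix[i][sequence[start + i]] for i in range(width)), start)
        for start in range(len(sequence) - width + 1)
    ]
    return max(scored)

# Toy data, invented for illustration only.
known_sites = ["TATA", "TATA", "TATT"]
score, start = best_window("GGCTATAGG", build_log_odds(known_sites))
```

Because each position contributes its own term to the sum, this scheme cannot reward or penalise combinations of bases at different positions, which is the limitation that dependency-aware scoring functions aim to address.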

RESULTS

We propose a new scoring function based on the dependency between all positions in the binding site. This scoring function uses joint information content and mutual information as a measure of dependency between positions in a transcription factor binding site. Our method for modeling dependencies is simply an extension of position independence methods. We evaluate our new scoring function on real data sets extracted from the JASPAR and TRANSFAC databases, and compare the obtained results with two other well-known scoring functions.
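
As a rough illustration of mutual information as a dependency measure between two binding site positions, the sketch below estimates it from empirical base frequencies in a set of aligned sites. It is the textbook definition, not the authors' scoring function; the name `mutual_information`, the absence of pseudocounts, and the toy inputs are assumptions.

```python
import math
from collections import Counter

def mutual_information(sites, i, j):
    """Mutual information (in bits) between columns i and j of aligned sites."""
    n = len(sites)
    count_i = Counter(site[i] for site in sites)
    count_j = Counter(site[j] for site in sites)
    count_ij = Counter((site[i], site[j]) for site in sites)
    mi = 0.0
    for (a, b), c in count_ij.items():
        # p(a,b) * log2( p(a,b) / (p(a) * p(b)) ), written with raw counts.
        mi += (c / n) * math.log2(c * n / (count_i[a] * count_j[b]))
    return mi

# Toy alignments, invented for illustration only.
dependent = ["AC", "AC", "GT", "GT"]    # second base fully determined by the first
independent = ["AC", "AT", "GC", "GT"]  # every combination equally frequent
mi_dependent = mutual_information(dependent, 0, 1)
mi_independent = mutual_information(independent, 0, 1)
```

A value near zero suggests the two positions can be modeled independently, while larger values indicate that a position-independent model discards information; with few sites the raw estimate is biased upward, so real applications typically need a correction or significance test.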

CONCLUSION

The results demonstrate that the new approach improves known binding site discovery and show that the joint information content and mutual information provide a better and more general criterion for investigating the relationships between positions in the TFBS. Our scoring function is formulated by simple mathematical calculations. Applying our method to several biological data sets indicates that it performs better than methods that do not consider dependencies.

Liu S, Song Q, Cao A, Yang X, Wu Y. Robust mixture model clustering of DNA binding sites. Conf Proc IEEE Eng Med Biol Soc 2008;2006:2032-5. [PMID: 17946928 DOI: 10.1109/iembs.2006.260414] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 11/10/2022]

Schug J. Using TESS to Predict Transcription Factor Binding Sites in DNA Sequence. Curr Protoc Bioinformatics 2008;Chapter 2:Unit 2.6. [DOI: 10.1002/0471250953.bi0206s21] [Citation(s) in RCA: 182] [Impact Index Per Article: 11.4] [Indexed: 11/09/2022]

Habib N, Kaplan T, Margalit H, Friedman N. A novel Bayesian DNA motif comparison method for clustering and retrieval. PLoS Comput Biol 2008;4:e1000010. [PMID: 18463706 PMCID: PMC2265534 DOI: 10.1371/journal.pcbi.1000010] [Citation(s) in RCA: 39] [Impact Index Per Article: 2.4] [Received: 07/19/2007] [Accepted: 01/24/2008] [Indexed: 11/17/2022] Open

Abstract

Characterizing the DNA-binding specificities of transcription factors is a key problem in computational biology that has been addressed by multiple algorithms. These usually take as input sequences that are putatively bound by the same factor and output one or more DNA motifs. A common practice is to apply several such algorithms simultaneously to improve coverage at the price of redundancy. In interpreting such results, two tasks are crucial: clustering of redundant motifs, and attributing the motifs to transcription factors by retrieval of similar motifs from previously characterized motif libraries. Both tasks inherently involve motif comparison. Here we present a novel method for comparing and merging motifs, based on Bayesian probabilistic principles. This method takes into account both the similarity in positional nucleotide distributions of the two motifs and their dissimilarity to the background distribution. We demonstrate the use of the new comparison method as a basis for motif clustering and retrieval procedures, and compare it to several commonly used alternatives. Our results show that the new method outperforms other available methods in accuracy and sensitivity. We incorporated the resulting motif clustering and retrieval procedures in a large-scale automated pipeline for analyzing DNA motifs. This pipeline integrates the results of various DNA motif discovery algorithms and automatically merges redundant motifs from multiple training sets into a coherent annotated library of motifs. Application of this pipeline to recent genome-wide transcription factor location data in S. cerevisiae successfully identified DNA motifs in a manner that is as good as semi-automated analysis reported in the literature. Moreover, we show how this analysis elucidates the mechanisms of condition-specific preferences of transcription factors.

Regulation of gene expression plays a central role in the activity of living cells and in their response to internal (e.g., cell division) or external (e.g., stress) stimuli. Key players in determining gene-specific regulation are transcription factors that bind sequence-specific sites on the DNA, modulating the expression of nearby genes. To understand the regulatory program of the cell, we need to identify these transcription factors, when they act, and on which genes. Transcription regulatory maps can be assembled by computational analysis of experimental data, by discovering the DNA recognition sequences (motifs) of transcription factors and their occurrences along the genome. Such an analysis usually results in a large number of overlapping motifs. To reconstruct regulatory maps, it is crucial to combine similar motifs and to relate them to transcription factors. To this end we developed an accurate fully-automated method, termed BLiC, based upon an improved similarity measure for comparing DNA motifs. By applying it to genome-wide data in yeast, we identified the DNA motifs of transcription factors and their putative target genes. Finally, we analyze motifs of transcription factors that alter their target genes under different conditions, and show how cells adjust their regulatory program in response to environmental changes.

Harris SR, Pisani D, Gower DJ, Wilkinson M. Investigating stagnation in morphological phylogenetics using consensus data. Syst Biol 2007;56:125-9. [PMID: 17366142 DOI: 10.1080/10635150601115624] [Citation(s) in RCA: 13] [Impact Index Per Article: 0.8] [Indexed: 10/23/2022] Open

Tomovic A, Oakeley EJ. Position dependencies in transcription factor binding sites. Bioinformatics 2007;23:933-41. [PMID: 17308339 DOI: 10.1093/bioinformatics/btm055] [Citation(s) in RCA: 56] [Impact Index Per Article: 3.3] [Indexed: 11/12/2022] Open

Bhardwaj N, Lu H. Residue-level prediction of DNA-binding sites and its application on DNA-binding protein predictions. FEBS Lett 2007;581:1058-66. [PMID: 17316627 PMCID: PMC1993824 DOI: 10.1016/j.febslet.2007.01.086] [Citation(s) in RCA: 52] [Impact Index Per Article: 3.1] [Received: 10/09/2006] [Revised: 12/11/2006] [Accepted: 01/25/2007] [Indexed: 11/19/2022]

Abstract

Protein-DNA interactions are crucial to many cellular activities such as expression-control and DNA-repair. These interactions between amino acids and nucleotides are highly specific and any aberrance at the binding site can render the interaction completely incompetent. In this study, we have three aims focusing on DNA-binding residues on the protein surface: to develop an automated approach for fast and reliable recognition of DNA-binding sites; to improve the prediction by distance-dependent refinement; and to use these predictions to identify DNA-binding proteins. We use a support vector machine (SVM)-based approach to harness the features of the DNA-binding residues to distinguish them from non-binding residues. Features used for distinction include the residue's identity, charge, solvent accessibility, average potential, the secondary structure it is embedded in, neighboring residues, and location in a cationic patch. These features collected from 50 proteins are used to train the SVM. Testing is then performed on another set of 37 proteins, much larger than any testing set used in previous studies. The testing set has no more than 20% sequence identity not only among its pairs, but also with the proteins in the training set, thus removing any undesired redundancy due to homology. This set also has proteins with an unseen DNA-binding structural class not present in the training set. With the above features, an accuracy of 66% with balanced sensitivity and specificity is achieved without relying on homology or evolutionary information. We then develop a post-processing scheme to improve the prediction using the relative location of the predicted residues. Balanced success is then achieved with average sensitivity, specificity and accuracy pegged at 71.3%, 69.3% and 70.5%, respectively. Average net prediction is also around 70%. Finally, we show that the number of predicted DNA-binding residues can be used to differentiate DNA-binding proteins from non-DNA-binding proteins with an accuracy of 78%. Results presented here demonstrate that machine-learning can be applied to automated identification of DNA-binding residues and that the success rate can be improved as more features are added. Such functional site prediction protocols can be useful in guiding subsequent work such as site-directed mutagenesis and macromolecular docking.

MacIsaac KD, Fraenkel E. Practical strategies for discovering regulatory DNA sequence motifs. PLoS Comput Biol 2006;2:e36. [PMID: 16683017 PMCID: PMC1447654 DOI: 10.1371/journal.pcbi.0020036] [Citation(s) in RCA: 116] [Impact Index Per Article: 6.4] [Indexed: 11/18/2022] Open

Mahony S, Benos PV, Smith TJ, Golden A. Self-organizing neural networks to support the discovery of DNA-binding motifs. Neural Netw 2006;19:950-62. [PMID: 16839740 DOI: 10.1016/j.neunet.2006.05.023] [Citation(s) in RCA: 7] [Impact Index Per Article: 0.4] [Indexed: 10/24/2022]

Hu J, Li B, Kihara D. Limitations and potentials of current motif discovery algorithms. Nucleic Acids Res 2005;33:4899-913. [PMID: 16284194 PMCID: PMC1199555 DOI: 10.1093/nar/gki791] [Citation(s) in RCA: 112] [Impact Index Per Article: 5.9] [Indexed: 11/14/2022] Open

Styczynski MP, Stephanopoulos G. Overview of computational methods for the inference of gene regulatory networks. Comput Chem Eng 2005. [DOI: 10.1016/j.compchemeng.2004.08.029] [Citation(s) in RCA: 41] [Impact Index Per Article: 2.2] [Indexed: 11/25/2022]

Cowell LG, Davila M, Ramsden D, Kelsoe G. Computational tools for understanding sequence variability in recombination signals. Immunol Rev 2004;200:57-69. [PMID: 15242396 DOI: 10.1111/j.0105-2896.2004.00171.x] [Citation(s) in RCA: 22] [Impact Index Per Article: 1.1] [Indexed: 11/28/2022]

Rogozin IB, Pavlov YI. Theoretical analysis of mutation hotspots and their DNA sequence context specificity. Mutat Res 2003;544:65-85. [PMID: 12888108 DOI: 10.1016/s1383-5742(03)00032-2] [Citation(s) in RCA: 123] [Impact Index Per Article: 5.9] [Indexed: 10/27/2022]

Abstract

Mutation frequencies vary significantly along nucleotide sequences such that mutations often concentrate at certain positions called hotspots. Mutation hotspots in DNA reflect intrinsic properties of the mutation process, such as sequence specificity, that manifests itself at the level of interaction between mutagens, DNA, and the action of the repair and replication machineries. The hotspots might also reflect structural and functional features of the respective DNA sequences. When mutations in a gene are identified using a particular experimental system, resulting hotspots could reflect the properties of the gene product and the mutant selection scheme. Analysis of the nucleotide sequence context of hotspots can provide information on the molecular mechanisms of mutagenesis. However, the determinants of mutation frequency and specificity are complex, and there are many analytical methods for their study. Here we review computational approaches for analyzing mutation spectra (distribution of mutations along the target genes) that include many mutable (detectable) positions. The following methods are reviewed: derivation of a consensus sequence, application of regression approaches to correlate nucleotide sequence features with mutation frequency, mutation hotspot prediction, analysis of oligonucleotide composition of regions containing mutations, pairwise comparison of mutation spectra, analysis of multiple spectra, and analysis of "context-free" characteristics. The advantages and pitfalls of these methods are discussed and illustrated by examples from the literature. The most reliable analyses were obtained when several methods were combined and information from theoretical analysis and experimental observations was considered simultaneously. Simple, robust approaches should be used with small samples of mutations, whereas combinations of simple and complex approaches may be required for large samples. We discuss several well-documented studies where analysis of mutation spectra has substantially contributed to the current understanding of molecular mechanisms of mutagenesis. The nucleotide sequence context of mutational hotspots is a fingerprint of interactions between DNA and DNA repair, replication, and modification enzymes, and the analysis of hotspot context provides evidence of such interactions.

Sosinsky A, Bonin CP, Mann RS, Honig B. Target Explorer: An automated tool for the identification of new target genes for a specified set of transcription factors. Nucleic Acids Res 2003;31:3589-92. [PMID: 12824372 PMCID: PMC168951 DOI: 10.1093/nar/gkg544] [Citation(s) in RCA: 82] [Impact Index Per Article: 3.9] [Indexed: 11/12/2022] Open

Schug J. Using TESS to Predict Transcription Factor Binding Sites in DNA Sequence. Curr Protoc Bioinformatics 2003. [DOI: 10.1002/0471250953.bi0206s00] [Citation(s) in RCA: 12] [Impact Index Per Article: 0.6] [Indexed: 01/22/2023]

Cowell LG, Davila M, Kepler TB, Kelsoe G. Identification and utilization of arbitrary correlations in models of recombination signal sequences. Genome Biol 2002;3:RESEARCH0072. [PMID: 12537561 PMCID: PMC151174 DOI: 10.1186/gb-2002-3-12-research0072] [Citation(s) in RCA: 47] [Impact Index Per Article: 2.1] [Received: 06/24/2002] [Revised: 09/04/2002] [Accepted: 10/10/2002] [Indexed: 01/26/2023] Open

Schneider TD. Consensus sequence Zen. Appl Bioinformatics 2002;1:111-9. [PMID: 15130839 PMCID: PMC1852464] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 04/29/2023]

Gelfand MS. Recognition of regulatory sites by genomic comparison. Res Microbiol 1999;150:755-71. [PMID: 10673013 DOI: 10.1016/s0923-2508(99)00117-5] [Citation(s) in RCA: 32] [Impact Index Per Article: 1.3] [Indexed: 12/26/2022]

Torra V, Cortes U. Towards an automatic consensus generator tool: EGAC. ACTA ACUST UNITED AC 1995. [DOI: 10.1109/21.376503] [Citation(s) in RCA: 22] [Impact Index Per Article: 0.8] [Indexed: 11/06/2022]

Ramsden DA, Baetz K, Wu GE. Conservation of sequence in recombination signal sequence spacers. Nucleic Acids Res 1994;22:1785-96. [PMID: 8208601 PMCID: PMC308075 DOI: 10.1093/nar/22.10.1785] [Citation(s) in RCA: 122] [Impact Index Per Article: 4.1] [Indexed: 01/29/2023] Open

Mirkin B, Roberts FS. Consensus functions and patterns in molecular sequences. Bull Math Biol 1993;55:695-713. [PMID: 8318927 DOI: 10.1007/bf02460669] [Citation(s) in RCA: 8] [Impact Index Per Article: 0.3] [Indexed: 01/29/2023]

Discovering Consensus Molecular Sequences. ACTA ACUST UNITED AC 1993. [DOI: 10.1007/978-3-642-50974-2_40] [Citation(s) in RCA: 2] [Impact Index Per Article: 0.1]

Day WH, McMorris FR. Threshold consensus methods for molecular sequences. J Theor Biol 1992;159:481-9. [PMID: 1296100 DOI: 10.1016/s0022-5193(05)80692-7] [Citation(s) in RCA: 11] [Impact Index Per Article: 0.3] [Indexed: 12/26/2022]