Reference Citation Analysis: Find an Article, Find a Category, Find a Journal, Find a Scholar

For: Pesole G, Prunella N, Liuni S, Attimonelli M, Saccone C. WORDUP: an efficient algorithm for discovering statistically significant patterns in DNA sequences. Nucleic Acids Res 1992;20:2871-5. [PMID: 1614873 PMCID: PMC336935 DOI: 10.1093/nar/20.11.2871] [Citation(s) in RCA: 37] [Impact Index Per Article: 1.2] [Indexed: 12/27/2022] Open

Cited by Other Article(s)

Tong H, Schliekelman P, Mrázek J. Unsupervised statistical discovery of spaced motifs in prokaryotic genomes. BMC Genomics 2017;18:27. [PMID: 28056763 PMCID: PMC5217627 DOI: 10.1186/s12864-016-3400-0] [Citation(s) in RCA: 2] [Impact Index Per Article: 0.3] [Received: 09/02/2016] [Accepted: 12/09/2016] [Indexed: 12/23/2022] Open

Abstract

BACKGROUND

DNA sequences contain repetitive motifs which have various functions in the physiology of the organism. A number of methods have been developed for discovery of such sequence motifs, with a primary focus on detection of regulatory motifs and particularly transcription factor binding sites. Most motif-finding methods apply probabilistic models to detect motifs characterized by an unusually high number of copies of the motif in the analyzed sequences.

RESULTS

We present a novel method for detection of pairs of motifs separated by spacers of variable nucleotide sequence but conserved length. Unlike existing methods for motif discovery, the motifs themselves are not required to occur at unusually high frequency but only to exhibit a significant preference to occur at a specific distance from each other. In the present implementation of the method, motifs are represented by pentamers and all pairs of pentamers are evaluated for statistically significant preference for a specific distance. An important step of the algorithm eliminates motif pairs where the spacers separating the two motifs exhibit a high degree of sequence similarity; such motif pairs likely arise from duplications of the whole segment including the motifs and the spacer rather than due to selective constraints indicative of a functional importance of the motif pair. The method was used to scan 569 complete prokaryotic genomes for novel sequence motifs. Some motifs detected were previously known but other motifs found in the search appear to be novel. Selected motif pairs were subjected to further investigation and in some cases their possible biological functions were proposed.

CONCLUSIONS

We present a new motif-finding technique that is applicable to scanning complete genomes for sequence motifs. The results from analysis of 569 genomes suggest that the method detects previously known motifs that are expected to be found as well as new motifs that are unlikely to be discovered by traditional motif-finding methods. We conclude that our approach to detection of significant motif pairs can complement existing motif-finding techniques in discovery of novel functional sequence motifs in complete genomes.
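The core idea in the abstract above, testing whether two pentamers prefer one particular separation, can be illustrated with a small sketch. This is not the authors' implementation: the function names (`find_all`, `spacer_histogram`, `peak_spacer`), the `max_spacer` parameter, and the simple z-score against the other spacer lengths are all assumptions made for illustration, and the spacer-similarity filter described in the abstract is omitted.

```python
from collections import Counter
from statistics import mean, pstdev


def find_all(seq, motif):
    """Return start positions of every (possibly overlapping) occurrence of motif."""
    positions = []
    start = seq.find(motif)
    while start != -1:
        positions.append(start)
        start = seq.find(motif, start + 1)
    return positions


def spacer_histogram(seq, motif_a, motif_b, max_spacer):
    """Count spacer lengths (0..max_spacer) between motif_a and a downstream motif_b."""
    starts_b = set(find_all(seq, motif_b))
    counts = Counter()
    for a in find_all(seq, motif_a):
        end_a = a + len(motif_a)
        for spacer in range(max_spacer + 1):
            if end_a + spacer in starts_b:
                counts[spacer] += 1
    return counts


def peak_spacer(counts, max_spacer):
    """Return (spacer, count, z) for the most frequent spacer length.

    z compares the peak count with the counts at all other spacer lengths;
    it is a hypothetical stand-in for a proper significance test.
    """
    values = [counts.get(s, 0) for s in range(max_spacer + 1)]
    best = max(range(max_spacer + 1), key=lambda s: values[s])
    others = values[:best] + values[best + 1:]
    spread = pstdev(others)
    z = (values[best] - mean(others)) / spread if spread > 0 else float("inf")
    return best, values[best], z


# Toy sequence: AAAAA and CCCCC separated by three different 3-nt spacers.
seq = "AAAAATTTCCCCCGG" + "AAAAAGCACCCCCTG" + "AAAAACAGCCCCCGT"
counts = spacer_histogram(seq, "AAAAA", "CCCCC", max_spacer=20)
print(peak_spacer(counts, max_spacer=20))
```

In the toy sequence the spacers differ in sequence but share a length, which is the situation the abstract describes as indicative of a functional motif pair rather than a duplicated segment.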

Misas E, Muñoz JF, Gallo JE, McEwen JG, Clay OK. From NGS assembly challenges to instability of fungal mitochondrial genomes: A case study in genome complexity. Comput Biol Chem 2016;61:258-69. [PMID: 26970210 DOI: 10.1016/j.compbiolchem.2016.02.016] [Citation(s) in RCA: 5] [Impact Index Per Article: 0.6] [Received: 07/31/2015] [Revised: 02/03/2016] [Accepted: 02/16/2016] [Indexed: 01/26/2023]

Bi C. SEAM: a stochastic EM-type algorithm for motif-finding in biopolymer sequences. J Bioinform Comput Biol 2011;5:47-77. [PMID: 17477491 DOI: 10.1142/s0219720007002527] [Citation(s) in RCA: 15] [Impact Index Per Article: 1.2] [Received: 06/09/2006] [Revised: 08/22/2006] [Accepted: 10/14/2006] [Indexed: 12/21/2022]

Bi C. A Monte Carlo EM algorithm for de novo motif discovery in biomolecular sequences. IEEE/ACM Transactions on Computational Biology and Bioinformatics 2009;6:370-386. [PMID: 19644166 DOI: 10.1109/tcbb.2008.103] [Citation(s) in RCA: 12] [Impact Index Per Article: 0.8] [Indexed: 05/28/2023]

Jiang Y, Cukic B, Adjeroh DA, Skinner HD, Lin J, Shen QJ, Jiang BH. An algorithm for identifying novel targets of transcription factor families: application to hypoxia-inducible factor 1 targets. Cancer Inform 2009;7:75-89. [PMID: 19352460 PMCID: PMC2664698 DOI: 10.4137/cin.s1054] [Citation(s) in RCA: 7] [Impact Index Per Article: 0.5] [Indexed: 12/27/2022] Open

Cho YS, Lee SY, Kim KY, Bang IC, Kim DS, Nam YK. Gene structure and expression of metallothionein during metal exposures in Hemibarbus mylodon. Ecotoxicology and Environmental Safety 2008;71:125-37. [PMID: 17889936 DOI: 10.1016/j.ecoenv.2007.08.005] [Citation(s) in RCA: 14] [Impact Index Per Article: 0.9] [Received: 01/24/2007] [Revised: 06/28/2007] [Accepted: 08/02/2007] [Indexed: 05/17/2023]

Lascaro D, Castellana S, Gasparre G, Romeo G, Saccone C, Attimonelli M. The RHNumtS compilation: features and bioinformatics approaches to locate and quantify Human NumtS. BMC Genomics 2008;9:267. [PMID: 18522722 PMCID: PMC2447851 DOI: 10.1186/1471-2164-9-267] [Citation(s) in RCA: 33] [Impact Index Per Article: 2.1] [Received: 09/14/2007] [Accepted: 06/03/2008] [Indexed: 11/21/2022] Open

Abstract

Background

To a greater or lesser extent, eukaryotic nuclear genomes contain fragments of their mitochondrial genome counterpart, deriving from the random insertion of damaged mtDNA fragments. NumtS (Nuclear mt Sequences) are not equally abundant in all species, and are redundant and polymorphic in terms of copy number. In population and clinical genetics, it is important to have a complete overview of NumtS quantity and location. Searching PubMed for NumtS or Mitochondrial pseudo-genes yields hundreds of papers reporting Human NumtS compilations produced by in silico or wet-lab approaches. A comparison of published compilations clearly shows significant discrepancies among data, due both to unwise application of Bioinformatics methods and to a not yet correctly assembled nuclear genome. To optimize quantification and location of NumtS, we produced a consensus compilation of Human NumtS by applying various bioinformatics approaches.

Results

Location and quantification of NumtS may be achieved by applying database similarity searching methods: we have applied various methods such as Blastn, MegaBlast and BLAT, changing both parameters and database; the results were compared, further analysed and checked against the already published compilations, thus producing the Reference Human Numt Sequences (RHNumtS) compilation. The resulting NumtS total 190.

Conclusion

The RHNumtS compilation represents a highly reliable reference basis, which may allow designing a lab protocol to test the actual existence of each NumtS. Here we report preliminary results based on PCR amplification and sequencing on 41 NumtS selected from RHNumtS among those with lower score. In parallel, we are currently designing the RHNumtS database structure for implementation in the HmtDB resource. In the future, the same database will host NumtS compilations from other organisms, but these will be generated only when the nuclear genome of a specific organism has reached a high-quality level of assembly.

Mrázek J, Xie S, Guo X, Srivastava A. AIMIE: a web-based environment for detection and interpretation of significant sequence motifs in prokaryotic genomes. Bioinformatics 2008;24:1041-8. [PMID: 18304933 DOI: 10.1093/bioinformatics/btn077] [Citation(s) in RCA: 13] [Impact Index Per Article: 0.8] [Indexed: 11/13/2022] Open

Chin F, Leung HCM. DNA motif representation with nucleotide dependency. IEEE/ACM Transactions on Computational Biology and Bioinformatics 2008;5:110-119. [PMID: 18245880 DOI: 10.1109/tcbb.2007.70220] [Citation(s) in RCA: 5] [Impact Index Per Article: 0.3] [Indexed: 05/25/2023]

Das MK, Dai HK. A survey of DNA motif finding algorithms. BMC Bioinformatics 2007;8 Suppl 7:S21. [PMID: 18047721 PMCID: PMC2099490 DOI: 10.1186/1471-2105-8-s7-s21] [Citation(s) in RCA: 275] [Impact Index Per Article: 16.2] [Indexed: 11/16/2022] Open

Abstract

BACKGROUND

Unraveling the mechanisms that regulate gene expression is a major challenge in biology. An important task in this challenge is to identify regulatory elements, especially the binding sites in deoxyribonucleic acid (DNA) for transcription factors. These binding sites are short DNA segments that are called motifs. Recent advances in genome sequence availability and in high-throughput gene expression analysis technologies have allowed for the development of computational methods for motif finding. As a result, a large number of motif finding algorithms have been implemented and applied to various motif models over the past decade. This survey reviews the latest developments in DNA motif finding algorithms.

RESULTS

Earlier algorithms use promoter sequences of coregulated genes from a single genome and search for statistically overrepresented motifs. Recent algorithms are designed to use phylogenetic footprinting or orthologous sequences, and also an integrated approach in which promoter sequences of coregulated genes and phylogenetic footprinting are used together. All the algorithms studied have been reported to correctly detect motifs that had previously been detected by laboratory experimental approaches, and some algorithms were able to find novel motifs. However, most of these motif finding algorithms have been shown to work successfully in yeast and other lower organisms, but perform significantly worse in higher organisms.

CONCLUSION

Despite considerable efforts to date, DNA motif finding remains a complex challenge for biologists and computer scientists. Researchers have taken many different approaches in developing motif discovery tools, and the progress made in this area of research is very encouraging. Comparing the performance of different motif finding tools and identifying the best ones has proven difficult, because the tools are based on diverse and complex algorithms and motif models, and because our incomplete understanding of the biology of regulatory mechanisms does not always allow adequate evaluation of the underlying algorithms over motif models.

GuhaThakurta D. Computational identification of transcriptional regulatory elements in DNA sequence. Nucleic Acids Res 2006;34:3585-98. [PMID: 16855295 PMCID: PMC1524905 DOI: 10.1093/nar/gkl372] [Citation(s) in RCA: 98] [Impact Index Per Article: 5.4] [Indexed: 01/13/2023] Open

Leung HCM, Chin FYL. Algorithms for challenging motif problems. J Bioinform Comput Biol 2006;4:43-58. [PMID: 16568541 DOI: 10.1142/s0219720006001692] [Citation(s) in RCA: 8] [Impact Index Per Article: 0.4] [Received: 04/18/2005] [Revised: 08/01/2005] [Accepted: 08/01/2005] [Indexed: 11/18/2022]

Fadiel A, Lithwick S, Ganji G, Scherer SW. Remarkable sequence signatures in archaeal genomes. Archaea-An International Microbiological Journal 2005;1:185-90. [PMID: 15803664 PMCID: PMC2685567 DOI: 10.1155/2003/458235] [Citation(s) in RCA: 9] [Impact Index Per Article: 0.5] [Indexed: 12/27/2022]

Betel D, Hogue CWV. Kangaroo--a pattern-matching program for biological sequences. BMC Bioinformatics 2002;3:20. [PMID: 12150718 PMCID: PMC119856 DOI: 10.1186/1471-2105-3-20] [Citation(s) in RCA: 10] [Impact Index Per Article: 0.5] [Received: 07/05/2002] [Accepted: 07/31/2002] [Indexed: 11/16/2022] Open

Papatsenko DA, Makeev VJ, Lifanov AP, Régnier M, Nazina AG, Desplan C. Extraction of functional binding sites from unique regulatory regions: the Drosophila early developmental enhancers. Genome Res 2002;12:470-81. [PMID: 11875036 PMCID: PMC155290 DOI: 10.1101/gr.212502] [Citation(s) in RCA: 60] [Impact Index Per Article: 2.7] [Indexed: 11/25/2022]

Pesole G, Mignone F, Gissi C, Grillo G, Licciulli F, Liuni S. Structural and functional features of eukaryotic mRNA untranslated regions. Gene 2001;276:73-81. [PMID: 11591473 DOI: 10.1016/s0378-1119(01)00674-6] [Citation(s) in RCA: 292] [Impact Index Per Article: 12.7] [Indexed: 10/18/2022]

Medici N, Abbondanza C, Nigro V, Rossi V, Piluso G, Belsito A, Gallo L, Roscigno A, Bontempo P, Puca AA, Molinari AM, Moncharmont B, Puca GA. Identification of a DNA binding protein cooperating with estrogen receptor as RIZ (retinoblastoma interacting zinc finger protein). Biochem Biophys Res Commun 1999;264:983-9. [PMID: 10544042 DOI: 10.1006/bbrc.1999.1604] [Citation(s) in RCA: 30] [Impact Index Per Article: 1.2] [Indexed: 11/22/2022]

Vanet A, Marsan L, Sagot MF. Promoter sequences and algorithmical methods for identifying them. Res Microbiol 1999;150:779-99. [PMID: 10673015 DOI: 10.1016/s0923-2508(99)00115-1] [Citation(s) in RCA: 42] [Impact Index Per Article: 1.7] [Indexed: 11/25/2022]

Ostergaard L, Pedersen AG, Jespersen HM, Brunak S, Welinder KG. Computational analyses and annotations of the Arabidopsis peroxidase gene family. FEBS Lett 1998;433:98-102. [PMID: 9738941 DOI: 10.1016/s0014-5793(98)00849-7] [Citation(s) in RCA: 32] [Impact Index Per Article: 1.2] [Indexed: 11/25/2022]

Pesole G, Attimonelli M, Saccone C. Linguistic analysis of nucleotide sequences: algorithms for pattern recognition and analysis of codon strategy. Methods Enzymol 1996;266:281-94. [PMID: 8743690 DOI: 10.1016/s0076-6879(96)66019-4] [Citation(s) in RCA: 12] [Impact Index Per Article: 0.4] [Indexed: 02/01/2023]

Larsen NI, Engelbrecht J, Brunak S. Analysis of eukaryotic promoter sequences reveals a systematically occurring CT-signal. Nucleic Acids Res 1995;23:1223-30. [PMID: 7739901 PMCID: PMC306835 DOI: 10.1093/nar/23.7.1223] [Citation(s) in RCA: 26] [Impact Index Per Article: 0.9] [Indexed: 01/26/2023] Open

Pesole G, Attimonelli M, Saccone C. Linguistic approaches to the analysis of sequence information. Trends Biotechnol 1994;12:401-8. [PMID: 7765386 DOI: 10.1016/0167-7799(94)90028-0] [Citation(s) in RCA: 12] [Impact Index Per Article: 0.4] [Indexed: 01/27/2023]

Scherer S, McPeek MS, Speed TP. Atypical regions in large genomic DNA sequences. Proc Natl Acad Sci U S A 1994;91:7134-8. [PMID: 8041759 PMCID: PMC44353 DOI: 10.1073/pnas.91.15.7134] [Citation(s) in RCA: 12] [Impact Index Per Article: 0.4] [Indexed: 01/28/2023] Open

Pesole G, Fiormarino G, Saccone C. Sequence analysis and compositional properties of untranslated regions of human mRNAs. Gene 1994;140:219-25. [PMID: 8144029 DOI: 10.1016/0378-1119(94)90547-9] [Citation(s) in RCA: 19] [Impact Index Per Article: 0.6] [Indexed: 01/29/2023]

de Zamaroczy M, Bernardi G. The mosaic organization of the mitochondrial introns of Saccharomyces cerevisiae: features and evolutionary origins. Gene 1992;122:91-9. [PMID: 1452043 DOI: 10.1016/0378-1119(92)90036-o] [Citation(s) in RCA: 1] [Impact Index Per Article: 0.0] [Indexed: 12/27/2022]