Reference Citation Analysis: Find an Article, Find a Category, Find a Journal, Find a Scholar

For: Pellegrini M, Renda ME, Vecchio A. TRStalker: an efficient heuristic for finding fuzzy tandem repeats. Bioinformatics 2010;26:i358-66. [PMID: 20529928 PMCID: PMC2881393 DOI: 10.1093/bioinformatics/btq209] [Citation(s) in RCA: 25] [Impact Index Per Article: 1.8] [Indexed: 01/16/2023]

Cited by Other Article(s)

Chaisson MJP, Sulovari A, Valdmanis PN, Miller DE, Eichler EE. Advances in the discovery and analyses of human tandem repeats. Emerg Top Life Sci 2023;7:361-381. [PMID: 37905568 PMCID: PMC10806765 DOI: 10.1042/etls20230074] [Citation(s) in RCA: 1] [Impact Index Per Article: 1.0] [Received: 06/12/2023] [Revised: 10/18/2023] [Accepted: 10/18/2023] [Indexed: 11/02/2023]

Orlov YL, Orlova NG. Bioinformatics tools for the sequence complexity estimates. Biophys Rev 2023;15:1367-1378. [PMID: 37974990 PMCID: PMC10643780 DOI: 10.1007/s12551-023-01140-y] [Citation(s) in RCA: 2] [Impact Index Per Article: 2.0] [Received: 08/15/2023] [Accepted: 09/01/2023] [Indexed: 11/19/2023]

Abstract

We review current methods and bioinformatics tools for text complexity estimates (information and entropy measures). The search for DNA regions with extreme statistical characteristics, such as low complexity regions, is important for biophysical models of chromosome function and gene transcription regulation at genome scale. We discuss complexity profiling for segmentation and delineation of genome sequences, the search for genome repeats and transposable elements, and applications to next-generation sequencing reads. We review the complexity methods and new application fields: analysis of mutation hotspot loci, analysis of short sequencing reads with quality control, and alignment-free genome comparisons. Algorithms implementing various numerical measures of text complexity, including combinatorial and linguistic measures, were developed before the genome sequencing era. A series of tools estimate sequence complexity using compression approaches, mainly modifications of Lempel-Ziv compression. Most of the tools are available online, providing large-scale service for whole genome analysis. Novel machine learning applications for classification of complete genome sequences also include sequence compression and complexity algorithms. We present a comparison of the complexity methods on different sequence sets and their applications to the analysis of gene transcription regulatory regions. Furthermore, we discuss approaches to and applications of sequence complexity for proteins. The complexity measures for amino acid sequences can be calculated by the same entropy and compression-based algorithms, but the functional and evolutionary roles of low complexity regions in proteins have specific features differing from DNA. The tools for protein sequence complexity are aimed at protein structural constraints. It has been shown that low complexity regions in protein sequences are conserved in evolution and have important biological and structural functions.
Finally, we summarize recent findings in large scale genome complexity comparison and applications for coronavirus genome analysis.

Korotkov E, Zaytsev K, Fedorov A. Use of 6 Nucleotide Length Words to Study the Complexity of Gene Sequences from Different Organisms. Entropy (Basel) 2022;24:632. [PMID: 35626518 PMCID: PMC9141341 DOI: 10.3390/e24050632] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 03/15/2022] [Revised: 04/23/2022] [Accepted: 04/27/2022] [Indexed: 12/02/2022]

Morishita S, Ichikawa K, Myers EW. Finding long tandem repeats in long noisy reads. Bioinformatics 2021;37:612-621. [PMID: 33031558 PMCID: PMC8097686 DOI: 10.1093/bioinformatics/btaa865] [Citation(s) in RCA: 1] [Impact Index Per Article: 0.3] [Received: 04/24/2020] [Revised: 09/07/2020] [Accepted: 09/23/2020] [Indexed: 11/13/2022]

Korotkov EV, Kamionskya AM, Korotkova MA. Detection of Highly Divergent Tandem Repeats in the Rice Genome. Genes (Basel) 2021;12:473. [PMID: 33806152 PMCID: PMC8064497 DOI: 10.3390/genes12040473] [Citation(s) in RCA: 4] [Impact Index Per Article: 1.3] [Received: 02/17/2021] [Revised: 03/11/2021] [Accepted: 03/23/2021] [Indexed: 11/25/2022]

Merski M, Młynarczyk K, Ludwiczak J, Skrzeczkowski J, Dunin-Horkawicz S, Górna MW. Self-analysis of repeat proteins reveals evolutionarily conserved patterns. BMC Bioinformatics 2020;21:179. [PMID: 32381046 PMCID: PMC7204011 DOI: 10.1186/s12859-020-3493-y] [Citation(s) in RCA: 4] [Impact Index Per Article: 1.0] [Received: 09/25/2019] [Accepted: 04/15/2020] [Indexed: 11/26/2022]

Genovese LM, Mosca MM, Pellegrini M, Geraci F. Dot2dot: accurate whole-genome tandem repeats discovery. Bioinformatics 2019;35:914-922. [PMID: 30165507 PMCID: PMC6419916 DOI: 10.1093/bioinformatics/bty747] [Citation(s) in RCA: 11] [Impact Index Per Article: 2.2] [Received: 03/30/2018] [Revised: 08/03/2018] [Accepted: 08/24/2018] [Indexed: 01/18/2023]

Gao Y, Liu B, Wang Y, Xing Y. TideHunter: efficient and sensitive tandem repeat detection from noisy long-reads using seed-and-chain. Bioinformatics 2019;35:i200-i207. [PMID: 31510677 PMCID: PMC6612900 DOI: 10.1093/bioinformatics/btz376] [Citation(s) in RCA: 22] [Impact Index Per Article: 4.4] [Indexed: 01/08/2023]

Genovese LM, Geraci F, Corrado L, Mangano E, D'Aurizio R, Bordoni R, Severgnini M, Manzini G, De Bellis G, D'Alfonso S, Pellegrini M. A Census of Tandemly Repeated Polymorphic Loci in Genic Regions Through the Comparative Integration of Human Genome Assemblies. Front Genet 2018;9:155. [PMID: 29770143 PMCID: PMC5941971 DOI: 10.3389/fgene.2018.00155] [Citation(s) in RCA: 7] [Impact Index Per Article: 1.2] [Received: 10/18/2017] [Accepted: 04/13/2018] [Indexed: 11/29/2022]

Abstract

Polymorphic Tandem Repeat (PTR) is a common form of polymorphism in the human genome. A PTR consists of a variation, found in an individual (or in a population), in the number of repeating units of a Tandem Repeat (TR) locus of the genome with respect to the reference genome. Several phenotypic traits and diseases have been discovered to be strongly associated with or caused by specific PTR loci. PTR are further divided into two main classes: Short Tandem Repeats (STR), when the repeating unit has size up to 6 base pairs, and Variable Number Tandem Repeats (VNTR), for repeating units of size above 6 base pairs. As larger and larger populations are screened via high throughput sequencing projects, it becomes technically feasible and desirable to explore the association between PTR and a panoply of such traits and conditions. In order to facilitate these studies, we have devised a method for compiling catalogs of PTR from assembled genomes, and we have produced a catalog of PTR for genic regions (exons, introns, UTR and adjacent regions) of the human genome (GRCh38). We applied four different TR discovery software tools to uncover in the first phase 55,223,485 TR (after duplicate removal) in GRCh38, of which 373,173 were determined to be PTR in the second phase by comparison with five assembled human genomes. Of these, 263,266 are not included in state-of-the-art PTR catalogs. The new methodology is mainly based on a hierarchical and systematic application of alignment-based sequence comparisons to identify and measure the polymorphism of TR. While previous catalogs focus on the class of STR of small total size, we remove any size restrictions, aiming at the more general class of PTR, and we also target fuzzy TR by using specific detection tools. Like previous catalogs of human polymorphic loci, we focus our catalog toward applications in the discovery of disease-associated loci.
Validation by cross-referencing with existing catalogs on common clinically-relevant loci shows good concordance. Overall, this proposed census of human PTR in genic regions is a shared resource (web accessible), complementary to existing catalogs, facilitating future genome-wide studies involving PTR.

Database of Periodic DNA Regions in Major Genomes. Biomed Res Int 2017;2017:7949287. [PMID: 28182099 PMCID: PMC5274682 DOI: 10.1155/2017/7949287] [Citation(s) in RCA: 8] [Impact Index Per Article: 1.1] [Received: 08/09/2016] [Revised: 12/07/2016] [Accepted: 12/21/2016] [Indexed: 12/11/2022]

Fungtammasan A, Ananda G, Hile SE, Su MSW, Sun C, Harris R, Medvedev P, Eckert K, Makova KD. Accurate typing of short tandem repeats from genome-wide sequencing data and its applications. Genome Res 2015;25:736-49. [PMID: 25823460 PMCID: PMC4417121 DOI: 10.1101/gr.185892.114] [Citation(s) in RCA: 68] [Impact Index Per Article: 7.6] [Received: 10/15/2014] [Accepted: 03/16/2015] [Indexed: 11/24/2022]

Affiliation(s)

Arkarachai Fungtammasan: Integrative Biosciences, Bioinformatics and Genomics Option, Pennsylvania State University, University Park, Pennsylvania 16802, USA; Department of Biology, Pennsylvania State University, University Park, Pennsylvania 16802, USA; Center for Medical Genomics, Pennsylvania State University, University Park, Pennsylvania 16802, USA; The Genome Science Institute at the Huck Institutes of Life Sciences, Pennsylvania State University, University Park, Pennsylvania 16802, USA
Guruprasad Ananda: Integrative Biosciences, Bioinformatics and Genomics Option, Pennsylvania State University, University Park, Pennsylvania 16802, USA; Center for Medical Genomics, Pennsylvania State University, University Park, Pennsylvania 16802, USA; The Genome Science Institute at the Huck Institutes of Life Sciences, Pennsylvania State University, University Park, Pennsylvania 16802, USA; Department of Biochemistry and Molecular Biology, Pennsylvania State University, Pennsylvania 16802, USA
Suzanne E Hile: Center for Medical Genomics, Pennsylvania State University, University Park, Pennsylvania 16802, USA; Department of Pathology, The Jake Gittlen Laboratories for Cancer Research, Pennsylvania State University College of Medicine, Hershey, Pennsylvania 17033, USA
Marcia Shu-Wei Su: Department of Biology, Pennsylvania State University, University Park, Pennsylvania 16802, USA; Center for Medical Genomics, Pennsylvania State University, University Park, Pennsylvania 16802, USA
Chen Sun: Department of Computer Science and Engineering, Pennsylvania State University, University Park, Pennsylvania 16802, USA
Robert Harris: Department of Biology, Pennsylvania State University, University Park, Pennsylvania 16802, USA
Paul Medvedev: Center for Medical Genomics, Pennsylvania State University, University Park, Pennsylvania 16802, USA; The Genome Science Institute at the Huck Institutes of Life Sciences, Pennsylvania State University, University Park, Pennsylvania 16802, USA; Department of Biochemistry and Molecular Biology, Pennsylvania State University, Pennsylvania 16802, USA; Department of Computer Science and Engineering, Pennsylvania State University, University Park, Pennsylvania 16802, USA
Kristin Eckert: Center for Medical Genomics, Pennsylvania State University, University Park, Pennsylvania 16802, USA; Department of Pathology, The Jake Gittlen Laboratories for Cancer Research, Pennsylvania State University College of Medicine, Hershey, Pennsylvania 17033, USA
Kateryna D Makova: Department of Biology, Pennsylvania State University, University Park, Pennsylvania 16802, USA; Center for Medical Genomics, Pennsylvania State University, University Park, Pennsylvania 16802, USA; The Genome Science Institute at the Huck Institutes of Life Sciences, Pennsylvania State University, University Park, Pennsylvania 16802, USA

Chaley M, Kutyrkin V, Tulbasheva G, Teplukhina E, Nazipova N. HeteroGenome: database of genome periodicity. Database (Oxford) 2014;2014:bau040. [PMID: 24857969 PMCID: PMC4038257 DOI: 10.1093/database/bau040] [Citation(s) in RCA: 6] [Impact Index Per Article: 0.6] [Indexed: 11/16/2022]

Doi K, Monjo T, Hoang PH, Yoshimura J, Yurino H, Mitsui J, Ishiura H, Takahashi Y, Ichikawa Y, Goto J, Tsuji S, Morishita S. Rapid detection of expanded short tandem repeats in personal genomics using hybrid sequencing. Bioinformatics 2013;30:815-22. [PMID: 24215022 PMCID: PMC3957077 DOI: 10.1093/bioinformatics/btt647] [Citation(s) in RCA: 53] [Impact Index Per Article: 4.8] [Indexed: 11/20/2022]

Churbanov A, Ryan R, Hasan N, Bailey D, Chen H, Milligan B, Houde P. HighSSR: high-throughput SSR characterization and locus development from next-gen sequencing data. Bioinformatics 2012;28:2797-803. [PMID: 22954626 DOI: 10.1093/bioinformatics/bts524] [Citation(s) in RCA: 16] [Impact Index Per Article: 1.3] [Indexed: 01/05/2023]

Simeonova I, Lejour V, Bardot B, Bouarich-Bourimi R, Morin A, Fang M, Charbonnier L, Toledo F. Fuzzy tandem repeats containing p53 response elements may define species-specific p53 target genes. PLoS Genet 2012;8:e1002731. [PMID: 22761580 PMCID: PMC3386156 DOI: 10.1371/journal.pgen.1002731] [Citation(s) in RCA: 17] [Impact Index Per Article: 1.4] [Received: 11/23/2011] [Accepted: 04/11/2012] [Indexed: 12/21/2022]

Abstract

Evolutionary forces that shape regulatory networks remain poorly understood. In mammals, the Rb pathway is a classic example of species-specific gene regulation, as a germline mutation in one Rb allele promotes retinoblastoma in humans, but not in mice. Here we show that p53 transactivates the Retinoblastoma-like 2 (Rbl2) gene to produce p130 in murine, but not human, cells. We found intronic fuzzy tandem repeats containing perfect p53 response elements to be important for this regulation. We next identified two other murine genes regulated by p53 via fuzzy tandem repeats: Ncoa1 and Klhl26. The repeats are poorly conserved in evolution, and the p53-dependent regulation of the murine genes is lost in humans. Our results indicate a role for the rapid evolution of tandem repeats in shaping differences in p53 regulatory networks between mammalian species.

TP53, the gene encoding p53, is mutated in more than half of human cancers. Consequently, p53 is one of the most studied transcription factors, shown to directly regulate more than 150 genes. The mouse is a model of choice to study p53 mutants and cancer. However, differences were found between tumorigenesis in mice and humans, and these should be investigated to improve the relevance of mouse models. The distinct mutational events required to initiate retinoblastomas in these species constitute a classic example of such differences. Here we show that p53 regulates the Retinoblastoma-like 2 (Rbl2) gene, encoding tumor suppressor p130, in murine but not human cells. The p53-dependent regulation of murine Rbl2/p130 relies on clustered p53 response elements, located within tandem repeats poorly conserved in evolution. A similar situation was found for two other genes, also p53 targets in mice but not in humans. Thus, tandem repeats may shape differences in mammalian p53 regulatory networks. By uncovering differences in p53 target gene repertoires between mice and humans, our findings may help to improve mice as models of human disease. In addition, the role of tandem repeats in shaping the target gene repertoires of other mammalian transcription factors should be considered.

Lim KG, Kwoh CK, Hsu LY, Wirawan A. Review of tandem repeat search tools: a systematic approach to evaluating algorithmic performance. Brief Bioinform 2012;14:67-81. [PMID: 22648964 DOI: 10.1093/bib/bbs023] [Citation(s) in RCA: 68] [Impact Index Per Article: 5.7] [Indexed: 01/31/2023]

Pellegrini M, Renda ME, Vecchio A. Tandem repeats discovery service (TReaDS) applied to finding novel cis-acting factors in repeat expansion diseases. BMC Bioinformatics 2012;13 Suppl 4:S3. [PMID: 22536970 PMCID: PMC3303744 DOI: 10.1186/1471-2105-13-s4-s3] [Citation(s) in RCA: 9] [Impact Index Per Article: 0.8] [Indexed: 12/19/2022]

Abstract

Background

Tandem repeats are multiple duplications of substrings in the DNA that occur contiguously, or at a short distance, and may involve some mutations (such as substitutions, insertions, and deletions). Tandem repeats have also been extensively studied for their association with the class of repeat expansion diseases (mostly affecting the nervous system). Comparative studies of the output of different tools for finding tandem repeats have highlighted significant differences among the sets of detected tandem repeats, and many authors have pointed out how critical the right choice of parameters is.

Results

In this paper we present TReaDS (Tandem Repeats Discovery Service), a tandem repeat meta search engine. TReaDS forwards user requests to several state-of-the-art tools for finding tandem repeats and merges their outcomes into a single report, providing a global, synthetic, and comparative view of the results. In particular, TReaDS allows the user to (i) simultaneously run different algorithms on the same data set, (ii) choose a different parameter setting for each algorithm, and (iii) obtain a report that can be downloaded for further off-line investigation. We used TReaDS to investigate sequences associated with repeat expansion diseases.

Conclusions

Using TReaDS, we discover that, for 27 repeat expansion diseases out of a currently known set of 29, long fuzzy tandem repeats cover the expansion loci. Tests with control sets confirm the specificity of this association. This finding suggests that long fuzzy tandem repeats can be a new class of cis-acting elements involved in the mechanisms leading to expansion instability.

We strongly believe that biologists will be interested in a tool that not only lets them use multiple search algorithms at the same time, with the same effort as using just one of the systems, but also eases the burden of comparing and merging the results, thus expanding our capability to detect important phenomena related to tandem repeats.

Pellegrini M, Renda ME, Vecchio A. Ab initio detection of fuzzy amino acid tandem repeats in protein sequences. BMC Bioinformatics 2012;13 Suppl 3:S8. [PMID: 22536906 PMCID: PMC3402919 DOI: 10.1186/1471-2105-13-s3-s8] [Citation(s) in RCA: 14] [Impact Index Per Article: 1.2] [Indexed: 11/10/2022]