Reference Citation Analysis: Find an Article, Find a Category, Find a Journal, Find a Scholar

For: Jia C, Carson MB, Yu J. A fast weak motif-finding algorithm based on community detection in graphs. BMC Bioinformatics 2013;14:227. [PMID: 23865838 PMCID: PMC3726413 DOI: 10.1186/1471-2105-14-227] [Citation(s) in RCA: 11] [Impact Index Per Article: 1.0] [Received: 10/22/2012] [Accepted: 07/12/2013] [Indexed: 12/02/2022] Open

Cited by Other Article(s)

Computational discovery and modeling of novel gene expression rules encoded in the mRNA. Biochem Soc Trans 2020;48:1519-1528. [PMID: 32662820 DOI: 10.1042/bst20191048] [Citation(s) in RCA: 1] [Impact Index Per Article: 0.3] [Received: 03/18/2020] [Revised: 06/15/2020] [Accepted: 06/17/2020] [Indexed: 11/17/2022]

Drini S, Criscuolo A, Lechat P, Imamura H, Skalický T, Rachidi N, Lukeš J, Dujardin JC, Späth GF. Species- and Strain-Specific Adaptation of the HSP70 Super Family in Pathogenic Trypanosomatids. Genome Biol Evol 2016;8:1980-95. [PMID: 27371955 PMCID: PMC4943205 DOI: 10.1093/gbe/evw140] [Citation(s) in RCA: 17] [Impact Index Per Article: 2.1] [Indexed: 02/06/2023] Open

Abstract

All eukaryotic genomes encode multiple members of the heat shock protein 70 (HSP70) family, which evolved distinctive structural and functional features in response to specific environmental constraints. Phylogenetic analysis of this protein family thus can inform on genetic and molecular mechanisms that drive species-specific environmental adaptation. Here we use the eukaryotic pathogen Leishmania spp. as a model system to investigate the evolution of the HSP70 protein family in an early-branching eukaryote that is prone to gene amplification and adapts to cytotoxic host environments by stress-induced and chaperone-dependent stage differentiation. Combining phylogenetic and comparative analyses of trypanosomatid genomes, draft genome of Paratrypanosoma and recently published genome sequences of 204 L. donovani field isolates, we gained unique insight into the evolutionary dynamics of the Leishmania HSP70 protein family. We provide evidence for (i) significant evolutionary expansion of this protein family in Leishmania through gene amplification and functional specialization of highly conserved canonical HSP70 members, (ii) evolution of trypanosomatid-specific, non-canonical family members that likely gained ATPase-independent functions, and (iii) loss of one atypical HSP70 member in the Trypanosoma genus. Finally, we reveal considerable copy number variation of canonical cytoplasmic HSP70 in highly related L. donovani field isolates, thus identifying this locus as a potential hot spot of environment–genotype interaction. Our data draw a complex picture of the genetic history of HSP70 in trypanosomatids that is driven by the remarkable plasticity of the Leishmania genome to undergo massive intra-chromosomal gene amplification to compensate for the absence of regulated transcriptional control in these parasites.

Tangirala K, Herndon N, Caragea D. A Comparative Analysis Between k-Mers and Community Detection-Based Features for the Task of Protein Classification. IEEE Trans Nanobioscience 2016;15:84-92. [PMID: 26863669 PMCID: PMC6245644 DOI: 10.1109/tnb.2016.2523501] [Citation(s) in RCA: 1] [Impact Index Per Article: 0.1] [Indexed: 11/07/2022]

Lihu A, Holban T. A review of ensemble methods for de novo motif discovery in ChIP-Seq data. Brief Bioinform 2015;16:964-73. [DOI: 10.1093/bib/bbv022] [Citation(s) in RCA: 21] [Impact Index Per Article: 2.3] [Received: 01/15/2015] [Indexed: 01/17/2023] Open

Pissis SP. MoTeX-II: structured MoTif eXtraction from large-scale datasets. BMC Bioinformatics 2014;15:235. [PMID: 25004797 PMCID: PMC4227134 DOI: 10.1186/1471-2105-15-235] [Citation(s) in RCA: 18] [Impact Index Per Article: 1.8] [Received: 11/27/2013] [Accepted: 06/04/2014] [Indexed: 11/23/2022] Open

Abstract

BACKGROUND

Identifying repeated factors that occur in a string of letters or common factors that occur in a set of strings represents an important task in computer science and biology. Such patterns are called motifs, and the process of identifying them is called motif extraction. In biology, motif extraction constitutes a fundamental step in understanding regulation of gene expression. State-of-the-art tools for motif extraction have their own constraints. Most of these tools are only designed for single motif extraction; structured motifs additionally allow for distance intervals between their single motif components. Moreover, motif extraction from large-scale datasets (for instance, large-scale ChIP-Seq datasets) cannot be performed by current tools. Other constraints include high time and/or space complexity for identifying long motifs with higher error thresholds.

RESULTS

In this article, we introduce MoTeX-II, a word-based high-performance computing tool for structured MoTif eXtraction from large-scale datasets. Similar to its predecessor for single motif extraction, it uses state-of-the-art algorithms for solving the fixed-length approximate string matching problem. It produces similar and partially identical results to state-of-the-art tools for structured motif extraction with respect to accuracy as quantified by statistical significance measures. Moreover, we show that it matches or outperforms these tools in terms of runtime efficiency by merging single motif occurrences efficiently. MoTeX-II comes in three flavors: a standard CPU version; an OpenMP-based version; and an MPI-based version. For instance, the MPI-based version of MoTeX-II requires only a couple of hours to process all human genes for structured motif extraction on 1056 processors, while current sequential tools require more than a week for this task. Finally, we show that MoTeX-II is successful in extracting known composite transcription factor binding sites from real datasets.

CONCLUSIONS

Use of MoTeX-II in biological frameworks may enable deriving reliable and important information since real full-length datasets can now be processed with almost any set of input parameters for both single and structured motif extraction in a reasonable amount of time. The open-source code of MoTeX-II is freely available at http://www.inf.kcl.ac.uk/research/projects/motex/.

Bailey T, Krajewski P, Ladunga I, Lefebvre C, Li Q, Liu T, Madrigal P, Taslim C, Zhang J. Practical guidelines for the comprehensive analysis of ChIP-seq data. PLoS Comput Biol 2013;9:e1003326. [PMID: 24244136 PMCID: PMC3828144 DOI: 10.1371/journal.pcbi.1003326] [Citation(s) in RCA: 164] [Impact Index Per Article: 14.9] [Indexed: 12/25/2022] Open