Reference Citation Analysis

For: Bergeron A, Corteel S, Raffinot M. The Algorithmic of Gene Teams. Lecture Notes in Computer Science 2002. [DOI: 10.1007/3-540-45784-4_36] [Citation(s) in RCA: 41] [Impact Index Per Article: 1.9] [Indexed: 01/20/2023]

Cited by Other Article(s)

Ozeri E, Zehavi M, Ziv-Ukelson M. New algorithms for structure informed genome rearrangement. Algorithms Mol Biol 2023;18:17. [PMID: 38037088 PMCID: PMC10691145 DOI: 10.1186/s13015-023-00239-x] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 03/30/2023] [Accepted: 08/17/2023] [Indexed: 12/02/2023]

Abstract

We define two new computational problems in the domain of perfect genome rearrangements, and propose three algorithms to solve them. The rearrangement scenarios modeled by the problems consider Reversal and Block Interchange operations, and a PQ-tree is utilized to guide the allowed operations and to compute their weights. In the first problem, [Formula: see text] ([Formula: see text]), we define the basic structure-informed rearrangement measure. Here, we assume that the gene order members of the gene cluster from which the PQ-tree is constructed are permutations. The PQ-tree representing the gene cluster is ordered such that the series of gene IDs spelled by its leaves is equivalent to that of the reference gene order. Then, a structure-informed genome rearrangement distance is computed between the ordered PQ-tree and the target gene order. The second problem, [Formula: see text] ([Formula: see text]), generalizes [Formula: see text], where the gene order members are not necessarily permutations and the structure informed rearrangement measure is extended to also consider up to [Formula: see text] and [Formula: see text] gene insertion and deletion operations, respectively, when modelling the PQ-tree informed divergence process from the reference gene order to the target gene order. The first algorithm solves [Formula: see text] in [Formula: see text] time and [Formula: see text] space, where [Formula: see text] is the maximum number of children of a node, n is the length of the string and the number of leaves in the tree, and [Formula: see text] and [Formula: see text] are the number of P-nodes and Q-nodes in the tree, respectively. If one of the penalties of [Formula: see text] is 0, then the algorithm runs in [Formula: see text] time and [Formula: see text] space. 
The second algorithm solves [Formula: see text] in [Formula: see text] time and [Formula: see text] space, where [Formula: see text] is the maximum number of children of a node, n is the length of the string, m is the number of leaves in the tree, [Formula: see text] and [Formula: see text] are the numbers of P-nodes and Q-nodes in the tree, respectively, and up to [Formula: see text] deletions from the tree and up to [Formula: see text] deletions from the string are allowed. The third algorithm is intended to reduce the space complexity of the second algorithm. It solves a variant of the problem (where one of the penalties of [Formula: see text] is 0) in [Formula: see text] time and [Formula: see text] space. The algorithm is implemented as a software tool, denoted MEM-Rearrange, and applied to the comparative and evolutionary analysis of 59 chromosomal gene clusters extracted from a dataset of 1487 prokaryotic genomes.

Zimerman GR, Svetlitsky D, Zehavi M, Ziv-Ukelson M. Approximate search for known gene clusters in new genomes using PQ-trees. Algorithms Mol Biol 2021;16:16. [PMID: 34243815 PMCID: PMC8272295 DOI: 10.1186/s13015-021-00190-9] [Citation(s) in RCA: 1] [Impact Index Per Article: 0.3] [Received: 01/31/2021] [Accepted: 06/05/2021] [Indexed: 11/13/2022]

Abstract

Gene clusters are groups of genes that are co-locally conserved across various genomes, not necessarily in the same order. Their discovery and analysis are valuable in tasks such as gene annotation and prediction of gene interactions, and in the study of genome organization and evolution. The discovery of conserved gene clusters in a given set of genomes is a well-studied problem, but the rapid sequencing of prokaryotic genomes inspires a new one: given an already known gene cluster that was discovered and studied in one genomic dataset, identify all the instances of the gene cluster in a given new genomic sequence. Thus, we define a new problem in comparative genomics, denoted PQ-Tree Search, that takes as input a PQ-tree T representing the known gene orders of a gene cluster of interest, a gene-to-gene substitution scoring function h, integer arguments $d_T$ and $d_S$, and a new sequence of genes S.
The objective is to identify in S approximate new instances of the gene cluster; these instances could vary from the known gene orders by genome rearrangements that are constrained by T, by gene substitutions that are governed by h, and by gene deletions and insertions that are bounded from above by $d_T$ and $d_S$, respectively. We prove that PQ-Tree Search is NP-hard and propose a parameterized algorithm that solves the optimization variant of PQ-Tree Search in $O^*(2^{\gamma})$ time, where $\gamma$ is the maximum degree of a node in T and $O^*$ is used to hide factors polynomial in the input size.
The algorithm is implemented as a search tool, denoted PQFinder, and applied to search for instances of chromosomal gene clusters in plasmids, within a dataset of 1,487 prokaryotic genomes. We report on 29 chromosomal gene clusters that are rearranged in plasmids, where the rearrangements are guided by the corresponding PQ-trees. One of these results, coding for a heavy metal efflux pump, is further analysed to exemplify how PQFinder can be harnessed to reveal interesting new structural variants of known gene clusters.
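To make the phrase "rearrangements that are constrained by T" concrete, the following minimal sketch enumerates the leaf orders (frontiers) that a small PQ-tree can spell, using the textbook PQ-tree rules: children of a P-node may appear in any order, children of a Q-node only in the given order or its reverse. The tuple encoding of the tree and the function name `frontiers` are assumptions made for illustration; this is not the PQFinder algorithm, and it ignores substitutions, insertions and deletions.

```python
from itertools import permutations, product

def frontiers(node):
    """Yield every leaf order that a PQ-tree node can spell.

    Hypothetical encoding: a leaf is a string; an internal node is a
    tuple (kind, children) with kind "P" or "Q".
    """
    if isinstance(node, str):  # leaf: a single gene
        yield (node,)
        return
    kind, children = node
    if kind == "P":    # P-node: children may be permuted freely
        orders = permutations(children)
    elif kind == "Q":  # Q-node: given order or its reversal only
        orders = [tuple(children), tuple(reversed(children))]
    else:
        raise ValueError(f"unknown node kind: {kind!r}")
    seen = set()
    for order in orders:
        # Combine one frontier from each child, in the chosen child order.
        for parts in product(*(list(frontiers(child)) for child in order)):
            frontier = tuple(gene for part in parts for gene in part)
            if frontier not in seen:
                seen.add(frontier)
                yield frontier

# A Q-node over gene a, a P-node {b, c}, and gene d.
tree = ("Q", ["a", ("P", ["b", "c"]), "d"])
result = {"".join(f) for f in frontiers(tree)}
print(sorted(result))  # ['abcd', 'acbd', 'dbca', 'dcba']
```

Exhaustive enumeration grows exponentially with the size of the tree, so it is only useful for small examples like this one.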

Benshahar A, Chalifa-Caspi V, Hermelin D, Ziv-Ukelson M. A Biclique Approach to Reference-Anchored Gene Blocks and Its Applications to Genomic Islands. J Comput Biol 2018;25:214-235. [DOI: 10.1089/cmb.2017.0108] [Citation(s) in RCA: 1] [Impact Index Per Article: 0.2] [Indexed: 01/06/2023]

Winter S, Jahn K, Wehner S, Kuchenbecker L, Marz M, Stoye J, Böcker S. Finding approximate gene clusters with Gecko 3. Nucleic Acids Res 2016;44:9600-9610. [PMID: 27679480 PMCID: PMC5175365 DOI: 10.1093/nar/gkw843] [Citation(s) in RCA: 13] [Impact Index Per Article: 1.6] [Received: 12/10/2015] [Revised: 09/06/2016] [Accepted: 09/12/2016] [Indexed: 12/15/2022]

Heydari M, Marashi SA, Tusserkani R, Sadeghi M. Reconstruction of phylogenetic trees of prokaryotes using maximal common intervals. Biosystems 2014;124:86-94. [PMID: 25195150 DOI: 10.1016/j.biosystems.2014.09.002] [Citation(s) in RCA: 1] [Impact Index Per Article: 0.1] [Received: 12/28/2012] [Revised: 08/13/2014] [Accepted: 09/01/2014] [Indexed: 11/15/2022]

Abstract

One of the fundamental problems in bioinformatics is phylogenetic tree reconstruction, which can be used for classifying living organisms into different taxonomic clades. The classical approach to this problem is based on a marker such as 16S ribosomal RNA. Since evolutionary events like genomic rearrangements are not included in reconstructions of phylogenetic trees based on single genes, much effort has been made to find other characteristics for phylogenetic reconstruction in recent years. With the increasing availability of completely sequenced genomes, gene order can be considered as a new solution for this problem. In the present work, we applied maximal common intervals (MCIs) in two or more genomes to infer their distance and to reconstruct their evolutionary relationship. Additionally, measures based on uncommon segments (UCS's), i.e., those genomic segments which are not detected as part of any of the MCIs, are also used for phylogenetic tree reconstruction. We applied these two types of measures for reconstructing the phylogenetic tree of 63 prokaryotes with known COG (clusters of orthologous groups) families. Similarity between the MCI-based (resp. UCS-based) reconstructed phylogenetic trees and the phylogenetic tree obtained from NCBI taxonomy browser is as high as 93.1% (resp. 94.9%). We show that in the case of this diverse dataset of prokaryotes, tree reconstruction based on MCI and UCS outperforms most of the currently available methods based on gene orders, including breakpoint distance and DCJ. We additionally tested our new measures on a dataset of 13 closely-related bacteria from the genus Prochlorococcus. In this case, distances like rearrangement distance, breakpoint distance and DCJ proved to be useful, while our new measures are still appropriate for phylogenetic reconstruction.
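For readers unfamiliar with common intervals, the sketch below lists the common intervals of two gene orders under the standard textbook definition: a set of at least two genes that occupies a contiguous block in both orders. It assumes both orders are permutations of the same gene set, and the function name `common_intervals` is hypothetical. It is a simple quadratic scan for illustration only; the abstract does not spell out how maximal common intervals or the derived distances are computed, so nothing here should be read as the authors' method.

```python
def common_intervals(order1, order2):
    """Return the common intervals (size >= 2) of two permutations.

    Assumption: order1 and order2 contain exactly the same genes,
    each gene once. Quadratic-time scan, for illustration only.
    """
    position = {gene: i for i, gene in enumerate(order1)}
    ranks = [position[gene] for gene in order2]  # order2 in order1 coordinates
    found = []
    for i in range(len(ranks)):
        lo = hi = ranks[i]
        for j in range(i + 1, len(ranks)):
            lo = min(lo, ranks[j])
            hi = max(hi, ranks[j])
            # The block order2[i..j] is contiguous in order1 exactly when
            # its ranks span j - i + 1 consecutive positions.
            if hi - lo == j - i:
                found.append(frozenset(order2[i:j + 1]))
    return found

intervals = common_intervals([1, 2, 3, 4, 5], [3, 1, 2, 5, 4])
print([sorted(s) for s in intervals])
# [[1, 2, 3], [1, 2, 3, 4, 5], [1, 2], [4, 5]]
```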

Lechner M, Hernandez-Rosales M, Doerr D, Wieseke N, Thévenin A, Stoye J, Hartmann RK, Prohaska SJ, Stadler PF. Orthology detection combining clustering and synteny for very large datasets. PLoS One 2014;9:e105015. [PMID: 25137074 PMCID: PMC4138177 DOI: 10.1371/journal.pone.0105015] [Citation(s) in RCA: 73] [Impact Index Per Article: 7.3] [Received: 04/24/2014] [Accepted: 07/14/2014] [Indexed: 11/18/2022]

Affiliation(s)

Marcus Lechner (corresponding author): Institut für Pharmazeutische Chemie, Philipps-Universität Marburg, Marburg, Germany
Maribel Hernandez-Rosales: Bioinformatics Group, Department of Computer Science, Universität Leipzig, Leipzig, Germany; Interdisciplinary Center for Bioinformatics, Universität Leipzig, Leipzig, Germany; Max Planck Institute for Mathematics in the Sciences, Leipzig, Germany; Departamento de Ciência da Computação, Instituto de Ciências Exatas, Universidade de Brasília, Brasília, Brasil
Daniel Doerr: Genome Informatics, Faculty of Technology, Bielefeld University, Bielefeld, Germany; Institute for Bioinformatics, Center for Biotechnology, Bielefeld University, Bielefeld, Germany
Nicolas Wieseke: Faculty of Mathematics and Computer Science, University of Leipzig, Leipzig, Germany
Annelyse Thévenin: Genome Informatics, Faculty of Technology, Bielefeld University, Bielefeld, Germany; Institute for Bioinformatics, Center for Biotechnology, Bielefeld University, Bielefeld, Germany
Jens Stoye: Genome Informatics, Faculty of Technology, Bielefeld University, Bielefeld, Germany; Institute for Bioinformatics, Center for Biotechnology, Bielefeld University, Bielefeld, Germany
Roland K. Hartmann: Institut für Pharmazeutische Chemie, Philipps-Universität Marburg, Marburg, Germany
Sonja J. Prohaska: Computational EvoDevo Group, Department of Computer Science, Universität Leipzig, Leipzig, Germany
Peter F. Stadler: Bioinformatics Group, Department of Computer Science, Universität Leipzig, Leipzig, Germany; Interdisciplinary Center for Bioinformatics, Universität Leipzig, Leipzig, Germany; Max Planck Institute for Mathematics in the Sciences, Leipzig, Germany; Institute for Theoretical Chemistry, University of Vienna, Vienna, Austria; Center for non-coding RNA in Technology and Health, University of Copenhagen, Frederiksberg, Denmark; The Santa Fe Institute, Santa Fe, New Mexico, United States of America; RNomics Group, Fraunhofer Institut for Cell Therapy and Immunology, Leipzig, Germany

Lucas JMEX, Muffato M, Roest Crollius H. PhylDiag: identifying complex synteny blocks that include tandem duplications using phylogenetic gene trees. BMC Bioinformatics 2014;15:268. [PMID: 25103980 PMCID: PMC4155083 DOI: 10.1186/1471-2105-15-268] [Citation(s) in RCA: 9] [Impact Index Per Article: 0.9] [Received: 03/21/2014] [Accepted: 07/17/2014] [Indexed: 11/21/2022]

Abstract

Background

Extant genomes share regions where genes have the same order and orientation, which are thought to arise from the conservation of an ancestral order of genes during evolution. Such regions of so-called conserved synteny, or synteny blocks, must be precisely identified and quantified, as a prerequisite to better understand the evolutionary history of genomes.

Results

Here we describe PhylDiag, software that identifies statistically significant synteny blocks in pairwise comparisons of eukaryote genomes. Compared to previous methods, PhylDiag uses gene trees to define gene homologies, thus allowing gene deletions to be considered as events that may break the synteny. PhylDiag also accounts for gene orientations, blocks of tandem duplicates and lineage-specific de novo gene births. Starting from two genomes and the corresponding gene trees, PhylDiag returns synteny blocks with gaps less than or equal to the maximum gap parameter gap_max. This parameter is theoretically estimated and, together with a utility to graphically display results, contributes to making PhylDiag a user-friendly method. In addition, putative synteny blocks are subject to a statistical validation to verify that they are unlikely to be due to a random combination of genes.

Conclusions

We benchmark several known metrics to measure 2D-distances in a matrix of homologies and we compare PhylDiag to i-ADHoRe 3.0 on real and simulated data. We show that PhylDiag correctly identifies small synteny blocks even with insertions, deletions, incorrect annotations or micro-inversions. Finally, PhylDiag allowed us to identify the most relevant distance metric for 2D-distance calculation between homologies.
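The Conclusions mention 2D distances between homologies in a homology matrix without naming the metrics. As a hedged illustration only, the sketch below computes three standard candidates between two matrix cells, each cell being a pair (gene index in genome 1, gene index in genome 2). The coordinates and function names are assumptions; the abstract does not say which metric PhylDiag retains or how a distance is compared against gap_max.

```python
import math

def manhattan(a, b):
    """Sum of the row and column offsets between two matrix cells."""
    return abs(a[0] - b[0]) + abs(a[1] - b[1])

def euclidean(a, b):
    """Straight-line distance between two matrix cells."""
    return math.hypot(a[0] - b[0], a[1] - b[1])

def chebyshev(a, b):
    """Largest of the row and column offsets between two matrix cells."""
    return max(abs(a[0] - b[0]), abs(a[1] - b[1]))

# Two hypothetical homologies in the homology matrix.
h1, h2 = (2, 3), (5, 7)
print(manhattan(h1, h2), euclidean(h1, h2), chebyshev(h1, h2))  # 7 5.0 4
```

The three metrics disagree on the same pair of cells, so whether two homologies fall within a given gap threshold depends on which metric is chosen.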

Electronic supplementary material

The online version of this article (doi:10.1186/1471-2105-15-268) contains supplementary material, which is available to authorized users.

Jahn K. Efficient computation of approximate gene clusters based on reference occurrences. J Comput Biol 2011;18:1255-74. [PMID: 21899430 DOI: 10.1089/cmb.2011.0132] [Citation(s) in RCA: 20] [Impact Index Per Article: 1.7] [Indexed: 11/12/2022]

El-Mabrouk N, Sankoff D. Analysis of gene order evolution beyond single-copy genes. Methods Mol Biol 2012;855:397-429. [PMID: 22407718 DOI: 10.1007/978-1-61779-582-4_15] [Citation(s) in RCA: 15] [Impact Index Per Article: 1.3] [Indexed: 05/31/2023]

Grusea S. On the distribution of the number of cycles in the breakpoint graph of a random signed permutation. IEEE/ACM Trans Comput Biol Bioinform 2011;8:1411-1416. [PMID: 21116045 DOI: 10.1109/tcbb.2010.123] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 05/30/2023]

Grusea S, Pardoux E, Chabrol O, Pontarotti P. Compound Poisson Approximation and Testing for Gene Clusters with Multigene Families. J Comput Biol 2011;18:579-94. [DOI: 10.1089/cmb.2010.0043] [Citation(s) in RCA: 1] [Impact Index Per Article: 0.1] [Indexed: 10/18/2022]

Blin G, Faye D, Stoye J. Finding Nested Common Intervals Efficiently. J Comput Biol 2010;17:1183-94. [DOI: 10.1089/cmb.2010.0089] [Citation(s) in RCA: 10] [Impact Index Per Article: 0.7] [Indexed: 11/12/2022]

Yang Q, Yi G, Zhang F, Thon MR, Sze SH. Identifying gene clusters within localized regions in multiple genomes. J Comput Biol 2010;17:657-68. [PMID: 20500020 DOI: 10.1089/cmb.2009.0116] [Citation(s) in RCA: 3] [Impact Index Per Article: 0.2] [Indexed: 11/12/2022]

Approximative Gencluster und ihre Anwendung in der komparativen Genomik [Approximate gene clusters and their application in comparative genomics]. Informatik-Spektrum 2009. [DOI: 10.1007/s00287-009-0350-9] [Citation(s) in RCA: 1] [Impact Index Per Article: 0.1] [Indexed: 01/27/2023]

Ling X, He X, Xin D. Detecting gene clusters under evolutionary constraint in a large number of genomes. Bioinformatics 2009;25:571-7. [PMID: 19158161 DOI: 10.1093/bioinformatics/btp027] [Citation(s) in RCA: 41] [Impact Index Per Article: 2.7] [Indexed: 11/12/2022]

Schmidt T, Stoye J. Gecko and GhostFam: rigorous and efficient gene cluster detection in prokaryotic genomes. Methods Mol Biol 2008;396:165-82. [PMID: 18025693 DOI: 10.1007/978-1-59745-515-2_12] [Citation(s) in RCA: 5] [Impact Index Per Article: 0.3] [Indexed: 03/17/2023]

Domain team: synteny of domains is a new approach in comparative genomics. Methods Mol Biol 2007;396:17-29. [PMID: 18025683 DOI: 10.1007/978-1-59745-515-2_2] [Citation(s) in RCA: 1] [Impact Index Per Article: 0.1] [Indexed: 01/20/2023]

Durand D, Hoberman R. Diagnosing duplications – can it be done? Trends Genet 2006;22:156-64. [PMID: 16442663 DOI: 10.1016/j.tig.2006.01.002] [Citation(s) in RCA: 31] [Impact Index Per Article: 1.7] [Received: 09/16/2005] [Revised: 11/30/2005] [Accepted: 01/11/2006] [Indexed: 01/10/2023]

Gapped Permutation Patterns for Comparative Genomics. Algorithms in Bioinformatics 2006. [DOI: 10.1007/11851561_35] [Citation(s) in RCA: 1] [Impact Index Per Article: 0.1]

Rahmann S, Klau GW. Integer Linear Programs for Discovering Approximate Gene Clusters. Algorithms in Bioinformatics 2006. [DOI: 10.1007/11851561_28] [Citation(s) in RCA: 2] [Impact Index Per Article: 0.1] [Indexed: 02/20/2023]

Hoberman R, Sankoff D, Durand D. The statistical analysis of spatially clustered genes under the maximum gap criterion. J Comput Biol 2005;12:1083-102. [PMID: 16241899 DOI: 10.1089/cmb.2005.12.1083] [Citation(s) in RCA: 26] [Impact Index Per Article: 1.4] [Indexed: 11/13/2022]

Pasek S, Bergeron A, Risler JL, Louis A, Ollivier E, Raffinot M. Identification of genomic features using microsyntenies of domains: domain teams. Genome Res 2005;15:867-74. [PMID: 15899966 PMCID: PMC1142477 DOI: 10.1101/gr.3638405] [Citation(s) in RCA: 34] [Impact Index Per Article: 1.8] [Indexed: 11/24/2022]

Hoberman R, Durand D. The Incompatible Desiderata of Gene Cluster Properties. Comparative Genomics 2005. [DOI: 10.1007/11554714_7] [Citation(s) in RCA: 21] [Impact Index Per Article: 1.1] [Indexed: 01/15/2023]

Using PQ Trees for Comparative Genomics. Combinatorial Pattern Matching 2005. [DOI: 10.1007/11496656_12] [Citation(s) in RCA: 16] [Impact Index Per Article: 0.8] [Indexed: 01/11/2023]

Sankoff D. Rearrangements and chromosomal evolution. Curr Opin Genet Dev 2003;13:583-7. [PMID: 14638318 DOI: 10.1016/j.gde.2003.10.006] [Citation(s) in RCA: 28] [Impact Index Per Article: 1.3] [Indexed: 02/04/2023]