Reference Citation Analysis: Find an Article, Find a Category, Find a Journal, Find a Scholar

For: Frith MC, Park Y, Sheetlin SL, Spouge JL. The whole alignment and nothing but the alignment: the problem of spurious alignment flanks. Nucleic Acids Res 2008;36:5863-71. [PMID: 18796526 PMCID: PMC2566872 DOI: 10.1093/nar/gkn579] [Citation(s) in RCA: 19] [Impact Index Per Article: 1.2] [Reference Citation Analysis] [What about the content of this article? (0)] [Track Full Text] [Download PDF] [Figures] [Journal Information] [Subscribe] [Scholar Register] [Indexed: 11/13/2022] Open

For:	Frith MC, Park Y, Sheetlin SL, Spouge JL. The whole alignment and nothing but the alignment: the problem of spurious alignment flanks. Nucleic Acids Res 2008;36:5863-71. [PMID: 18796526 PMCID: PMC2566872 DOI: 10.1093/nar/gkn579] [Citation(s) in RCA: 19] [Impact Index Per Article: 1.2] [Reference Citation Analysis] [What about the content of this article? (0)] [Track Full Text] [Download PDF] [Figures] [Journal Information] [Subscribe] [Scholar Register] [Indexed: 11/13/2022] Open

Number

Cited by Other Article(s)

Krause GR, Shands W, Wheeler TJ. Sensitive and error-tolerant annotation of protein-coding DNA with BATH. BIOINFORMATICS ADVANCES 2024;4:vbae088. [PMID: 38966592 PMCID: PMC11223822 DOI: 10.1093/bioadv/vbae088] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Reference Citation Analysis] [Abstract] [Grants] [Track Full Text] [Figures] [Subscribe] [Scholar Register] [Received: 02/04/2024] [Revised: 05/03/2024] [Accepted: 06/10/2024] [Indexed: 07/06/2024]

Krause GR, Shands W, Wheeler TJ. Sensitive and error-tolerant annotation of protein-coding DNA with BATH. BIORXIV : THE PREPRINT SERVER FOR BIOLOGY 2024:2023.12.31.573773. [PMID: 38260252 PMCID: PMC10802276 DOI: 10.1101/2023.12.31.573773] [Citation(s) in RCA: 2] [Impact Index Per Article: 2.0] [Reference Citation Analysis] [Abstract] [Key Words] [Grants] [Track Full Text] [Subscribe] [Scholar Register] [Indexed: 01/24/2024]

Frith MC. Paleozoic Protein Fossils Illuminate the Evolution of Vertebrate Genomes and Transposable Elements. Mol Biol Evol 2022;39:6555113. [PMID: 35348724 PMCID: PMC9004415 DOI: 10.1093/molbev/msac068] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Reference Citation Analysis] [Abstract] [Key Words] [Track Full Text] [Download PDF] [Figures] [Journal Information] [Subscribe] [Scholar Register] [Indexed: 11/13/2022] Open

Kiryu H, Ichikawa Y, Kojima Y. TMRS: an algorithm for computing the time to the most recent substitution event from a multiple alignment column. Algorithms Mol Biol 2019;14:23. [PMID: 31832082 PMCID: PMC6859643 DOI: 10.1186/s13015-019-0158-3] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Reference Citation Analysis] [Abstract] [Key Words] [Track Full Text] [Download PDF] [Figures] [Journal Information] [Subscribe] [Scholar Register] [Received: 08/21/2018] [Accepted: 11/04/2019] [Indexed: 11/10/2022] Open

Abstract

Background

As the number of sequenced genomes grows, researchers have access to an increasingly rich source for discovering detailed evolutionary information. However, the computational technologies for inferring biologically important evolutionary events are not sufficiently developed.

Results

We present algorithms to estimate the evolutionary time (\documentclass[12pt]{minimal} \usepackage{amsmath} \usepackage{wasysym} \usepackage{amsfonts} \usepackage{amssymb} \usepackage{amsbsy} \usepackage{mathrsfs} \usepackage{upgreek} \setlength{\oddsidemargin}{-69pt} \begin{document}$$t_{\text {MRS}}$$\end{document}tMRS) to the most recent substitution event from a multiple alignment column by using a probabilistic model of sequence evolution. As the confidence in estimated \documentclass[12pt]{minimal} \usepackage{amsmath} \usepackage{wasysym} \usepackage{amsfonts} \usepackage{amssymb} \usepackage{amsbsy} \usepackage{mathrsfs} \usepackage{upgreek} \setlength{\oddsidemargin}{-69pt} \begin{document}$$t_{\text {MRS}}$$\end{document}tMRS values varies depending on gap fractions and nucleotide patterns of alignment columns, we also compute the standard deviation \documentclass[12pt]{minimal} \usepackage{amsmath} \usepackage{wasysym} \usepackage{amsfonts} \usepackage{amssymb} \usepackage{amsbsy} \usepackage{mathrsfs} \usepackage{upgreek} \setlength{\oddsidemargin}{-69pt} \begin{document}$$\sigma$$\end{document}σ of \documentclass[12pt]{minimal} \usepackage{amsmath} \usepackage{wasysym} \usepackage{amsfonts} \usepackage{amssymb} \usepackage{amsbsy} \usepackage{mathrsfs} \usepackage{upgreek} \setlength{\oddsidemargin}{-69pt} \begin{document}$$t_{\text {MRS}}$$\end{document}tMRS by using a dynamic programming algorithm. We identified a number of human genomic sites at which the last substitutions occurred between two speciation events in the human lineage with confidence. A large fraction of such sites have substitutions that occurred between the concestor nodes of Hominoidea and Euarchontoglires. We investigated the correlation between tissue-specific transcribed enhancers and the distribution of the sites with specific substitution time intervals, and found that brain-specific transcribed enhancers are threefold enriched in the density of substitutions in the human lineage relative to expectations.

Conclusions

We have presented algorithms to estimate the evolutionary time (\documentclass[12pt]{minimal} \usepackage{amsmath} \usepackage{wasysym} \usepackage{amsfonts} \usepackage{amssymb} \usepackage{amsbsy} \usepackage{mathrsfs} \usepackage{upgreek} \setlength{\oddsidemargin}{-69pt} \begin{document}$$t_{\text {MRS}}$$\end{document}tMRS) to the most recent substitution event from a multiple alignment column by using a probabilistic model of sequence evolution. Our algorithms will be useful for Evo-Devo studies, as they facilitate screening potential genomic sites that have played an important role in the acquisition of unique biological features by target species.

Electronic supplementary material

The online version of this article (10.1186/s13015-019-0158-3) contains supplementary material, which is available to authorized users.

Collapse

Mahlich Y, Steinegger M, Rost B, Bromberg Y. HFSP: high speed homology-driven function annotation of proteins. Bioinformatics 2019;34:i304-i312. [PMID: 29950013 PMCID: PMC6022561 DOI: 10.1093/bioinformatics/bty262] [Citation(s) in RCA: 13] [Impact Index Per Article: 2.6] [Reference Citation Analysis] [Abstract] [Track Full Text] [Download PDF] [Figures] [Journal Information] [Subscribe] [Scholar Register] [Indexed: 11/20/2022] Open

Dewey CN. Whole-Genome Alignment. Methods Mol Biol 2019;1910:121-147. [PMID: 31278663 DOI: 10.1007/978-1-4939-9074-0_4] [Citation(s) in RCA: 4] [Impact Index Per Article: 0.8] [Reference Citation Analysis] [Abstract] [Key Words] [MESH Headings] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Indexed: 06/09/2023]

MMseqs2 enables sensitive protein sequence searching for the analysis of massive data sets. Nat Biotechnol 2017;35:1026-1028. [PMID: 29035372 DOI: 10.1038/nbt.3988] [Citation(s) in RCA: 1427] [Impact Index Per Article: 203.9] [Reference Citation Analysis] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Indexed: 11/08/2022]

Lim K, Yamada KD, Frith MC, Tomii K. Protein sequence-similarity search acceleration using a heuristic algorithm with a sensitive matrix. ACTA ACUST UNITED AC 2017;17:147-154. [PMID: 28083762 PMCID: PMC5274646 DOI: 10.1007/s10969-016-9210-4] [Citation(s) in RCA: 1] [Impact Index Per Article: 0.1] [Reference Citation Analysis] [Abstract] [Key Words] [MESH Headings] [Grants] [Track Full Text] [Download PDF] [Figures] [Journal Information] [Subscribe] [Scholar Register] [Received: 12/31/2015] [Accepted: 12/05/2016] [Indexed: 12/28/2022]

Hubley R, Finn RD, Clements J, Eddy SR, Jones TA, Bao W, Smit AFA, Wheeler TJ. The Dfam database of repetitive DNA families. Nucleic Acids Res 2015;44:D81-9. [PMID: 26612867 PMCID: PMC4702899 DOI: 10.1093/nar/gkv1272] [Citation(s) in RCA: 409] [Impact Index Per Article: 45.4] [Reference Citation Analysis] [Abstract] [Track Full Text] [Download PDF] [Figures] [Journal Information] [Subscribe] [Scholar Register] [Received: 10/02/2015] [Accepted: 11/03/2015] [Indexed: 11/20/2022] Open

Hoen DR, Hickey G, Bourque G, Casacuberta J, Cordaux R, Feschotte C, Fiston-Lavier AS, Hua-Van A, Hubley R, Kapusta A, Lerat E, Maumus F, Pollock DD, Quesneville H, Smit A, Wheeler TJ, Bureau TE, Blanchette M. A call for benchmarking transposable element annotation methods. Mob DNA 2015;6:13. [PMID: 26244060 PMCID: PMC4524446 DOI: 10.1186/s13100-015-0044-6] [Citation(s) in RCA: 65] [Impact Index Per Article: 7.2] [Reference Citation Analysis] [Abstract] [Track Full Text] [Download PDF] [Figures] [Journal Information] [Subscribe] [Scholar Register] [Received: 06/25/2015] [Accepted: 07/22/2015] [Indexed: 12/31/2022] Open

Affiliation(s)

Douglas R Hoen School of Computer Science, McGill University, McConnell Engineering Bldg., Rm. 318, 3480 Rue University, Montréal, Québec H3A 0E9 Canada ; Department of Biology, McGill University, Stewart Biology Bldg., 1205 Ave. du Docteur-Penfield, Montréal, Québec H3A 1B1 Canada
Glenn Hickey School of Computer Science, McGill University, McConnell Engineering Bldg., Rm. 318, 3480 Rue University, Montréal, Québec H3A 0E9 Canada ; McGill Centre for Bioinformatics, McGill University, Montréal, Québec Canada
Guillaume Bourque Department of Human Genetics, McGill University, Montréal, Québec Canada ; McGill University and Génome Québec Innovation Center, Montréal, Québec Canada
Josep Casacuberta Centre for Research in Agricultural Genomics CSIC-IRTA-UAB-UB, 08193 Barcelona, Spain
Richard Cordaux Université de Poitiers, UMR CNRS 7267 Ecologie et Biologie des Interactions, Equipe Ecologie Evolution Symbiose, 5 Rue Albert Turpin, 86073 Poitiers Cedex 9, France
Cédric Feschotte Department of Human Genetics, University of Utah School of Medicine, Salt Lake City, UT 84112 USA
Anna-Sophie Fiston-Lavier Institut des Sciences de l'Evolution de Montpellier (ISE-M), Equipe Evolution, Vecteurs, Adaptation et Symbiose, UMR5554 CNRS-Université Montpellier, Montpellier, 34090 cedex 05 France
Aurélie Hua-Van Laboratoire Evolution, Génomes, Comportement Ecologie, CNRS-Université Paris-Sud (UMR 9191)-IRD (UMR 247)-Université Paris-Saclay, F-91198 Gif-sur-Yvette, France
Robert Hubley Institute for Systems Biology, 401 Terry Ave. N, Seattle, WA 98109 USA
Aurélie Kapusta Department of Human Genetics, University of Utah School of Medicine, Salt Lake City, UT 84112 USA
Emmanuelle Lerat Laboratoire Biometrie et Biologie Evolutive, Universite Claude Bernard-Lyon 1, UMR-CNRS 5558-Bat. Mendel, 43 bd du 11 novembre 1918, 69622 Villeurbanne cedex, France
Florian Maumus INRA, UR1164 URGI-Research Unit in Genomics-Info, INRA de Versailles-Grignon, Route de Saint-Cyr, Versailles, 78026 France
David D Pollock University of Colorado School of Medicine, Aurora, CO 80045 USA
Hadi Quesneville INRA, UR1164 URGI-Research Unit in Genomics-Info, INRA de Versailles-Grignon, Route de Saint-Cyr, Versailles, 78026 France
Arian Smit Institute for Systems Biology, 401 Terry Ave. N, Seattle, WA 98109 USA
Travis J Wheeler Department of Computer Science, University of Montana, Missoula, MT 59812 USA
Thomas E Bureau Department of Biology, McGill University, Stewart Biology Bldg., 1205 Ave. du Docteur-Penfield, Montréal, Québec H3A 1B1 Canada
Mathieu Blanchette School of Computer Science, McGill University, McConnell Engineering Bldg., Rm. 318, 3480 Rue University, Montréal, Québec H3A 0E9 Canada ; McGill Centre for Bioinformatics, McGill University, Montréal, Québec Canada

Collapse

Frith MC, Kawaguchi R. Split-alignment of genomes finds orthologies more accurately. Genome Biol 2015;16:106. [PMID: 25994148 PMCID: PMC4464727 DOI: 10.1186/s13059-015-0670-9] [Citation(s) in RCA: 65] [Impact Index Per Article: 7.2] [Reference Citation Analysis] [Abstract] [Track Full Text] [Download PDF] [Figures] [Journal Information] [Subscribe] [Scholar Register] [Received: 04/06/2015] [Accepted: 05/08/2015] [Indexed: 04/29/2023] Open

Tsuji J, Frith MC, Tomii K, Horton P. Mammalian NUMT insertion is non-random. Nucleic Acids Res 2012;40:9073-88. [PMID: 22761406 PMCID: PMC3467031 DOI: 10.1093/nar/gks424] [Citation(s) in RCA: 73] [Impact Index Per Article: 6.1] [Reference Citation Analysis] [Abstract] [Track Full Text] [Download PDF] [Figures] [Journal Information] [Subscribe] [Scholar Register] [Indexed: 12/11/2022] Open

Frith MC. Gentle masking of low-complexity sequences improves homology search. PLoS One 2011;6:e28819. [PMID: 22205972 PMCID: PMC3242753 DOI: 10.1371/journal.pone.0028819] [Citation(s) in RCA: 20] [Impact Index Per Article: 1.5] [Reference Citation Analysis] [Abstract] [MESH Headings] [Track Full Text] [Download PDF] [Figures] [Journal Information] [Subscribe] [Scholar Register] [Received: 10/17/2011] [Accepted: 11/15/2011] [Indexed: 11/19/2022] Open

Kiryu H. Sufficient statistics and expectation maximization algorithms in phylogenetic tree models. ACTA ACUST UNITED AC 2011;27:2346-53. [PMID: 21757463 DOI: 10.1093/bioinformatics/btr420] [Citation(s) in RCA: 8] [Impact Index Per Article: 0.6] [Reference Citation Analysis] [Abstract] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Indexed: 11/14/2022]

Hudek AK, Brown DG. FEAST: sensitive local alignment with multiple rates of evolution. IEEE/ACM TRANSACTIONS ON COMPUTATIONAL BIOLOGY AND BIOINFORMATICS 2011;8:698-709. [PMID: 20733242 DOI: 10.1109/tcbb.2010.76] [Citation(s) in RCA: 3] [Impact Index Per Article: 0.2] [Reference Citation Analysis] [Abstract] [MESH Headings] [Track Full Text] [Subscribe] [Scholar Register] [Indexed: 05/29/2023]

Söding J, Remmert M. Protein sequence comparison and fold recognition: progress and good-practice benchmarking. Curr Opin Struct Biol 2011;21:404-11. [PMID: 21458982 DOI: 10.1016/j.sbi.2011.03.005] [Citation(s) in RCA: 55] [Impact Index Per Article: 4.2] [Reference Citation Analysis] [Abstract] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Received: 01/31/2011] [Revised: 03/01/2011] [Accepted: 03/09/2011] [Indexed: 11/26/2022]

Selective loss of glycogen synthase kinase-3α in birds reveals distinct roles for GSK-3 isozymes in tau phosphorylation. FEBS Lett 2011;585:1158-62. [PMID: 21419127 DOI: 10.1016/j.febslet.2011.03.025] [Citation(s) in RCA: 37] [Impact Index Per Article: 2.8] [Reference Citation Analysis] [Abstract] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Received: 01/08/2011] [Revised: 03/06/2011] [Accepted: 03/11/2011] [Indexed: 01/05/2023]

Nakato R, Gotoh O. Cgaln: fast and space-efficient whole-genome alignment. BMC Bioinformatics 2010;11:224. [PMID: 20433723 PMCID: PMC2873541 DOI: 10.1186/1471-2105-11-224] [Citation(s) in RCA: 14] [Impact Index Per Article: 1.0] [Reference Citation Analysis] [Abstract] [Track Full Text] [Download PDF] [Figures] [Journal Information] [Subscribe] [Scholar Register] [Received: 01/02/2010] [Accepted: 04/30/2010] [Indexed: 11/10/2022] Open

Abstract

BACKGROUND

Whole-genome sequence alignment is an essential process for extracting valuable information about the functions, evolution, and peculiarities of genomes under investigation. As available genomic sequence data accumulate rapidly, there is great demand for tools that can compare whole-genome sequences within practical amounts of time and space. However, most existing genomic alignment tools can treat sequences that are only a few Mb long at once, and no state-of-the-art alignment program can align large sequences such as mammalian genomes directly on a conventional standalone computer.

RESULTS

We previously proposed the CGAT (Coarse-Grained AlignmenT) algorithm, which performs an alignment job in two steps: first at the block level and then at the nucleotide level. The former is "coarse-grained" alignment that can explore genomic rearrangements and reduce the sizes of the regions to be analyzed in the next step. The latter is detailed alignment within limited regions. In this paper, we present an update of the algorithm and the open-source program, Cgaln, that implements the algorithm. We compared the performance of Cgaln with those of other programs on whole genomic sequences of several bacteria and of some mammalian chromosome pairs. The results showed that Cgaln is several times faster and more memory-efficient than the best existing programs, while its sensitivity and accuracy are comparable to those of the best programs. Cgaln takes less than 13 hours to finish an alignment between the whole genomes of human and mouse in a single run on a conventional desktop computer with a single CPU and 2 GB memory.

CONCLUSIONS

Cgaln is not only fast and memory efficient but also effective in coping with genomic rearrangements. Our results show that Cgaln is very effective for comparison of large genomes, especially of intact chromosomal sequences. We believe that Cgaln provides novel viewpoint for reducing computational complexity and will contribute to various fields of genome science.

Collapse

Grabherr MG, Russell P, Meyer M, Mauceli E, Alföldi J, Di Palma F, Lindblad-Toh K. Genome-wide synteny through highly sensitive sequence alignment: Satsuma. Bioinformatics 2010;26:1145-51. [PMID: 20208069 DOI: 10.1093/bioinformatics/btq102] [Citation(s) in RCA: 188] [Impact Index Per Article: 13.4] [Reference Citation Analysis] [Abstract] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Indexed: 11/12/2022] Open

Frith MC, Hamada M, Horton P. Parameters for accurate genome alignment. BMC Bioinformatics 2010;11:80. [PMID: 20144198 PMCID: PMC2829014 DOI: 10.1186/1471-2105-11-80] [Citation(s) in RCA: 136] [Impact Index Per Article: 9.7] [Reference Citation Analysis] [Abstract] [Track Full Text] [Download PDF] [Figures] [Journal Information] [Subscribe] [Scholar Register] [Received: 10/16/2009] [Accepted: 02/09/2010] [Indexed: 11/25/2022] Open