Reference Citation Analysis

For: Waterman MS, Eggert M, Lander E. Parametric sequence comparisons. Proc Natl Acad Sci U S A 1992;89:6090-3. [PMID: 1631095 PMCID: PMC49443 DOI: 10.1073/pnas.89.13.6090] [Citation(s) in RCA: 66] [Impact Index Per Article: 2.0] [Reference Citation Analysis] [What about the content of this article? (0)] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Indexed: 12/28/2022] Open

Cited by Other Article(s)

Llinares-López F, Berthet Q, Blondel M, Teboul O, Vert JP. Deep embedding and alignment of protein sequences. Nat Methods 2023;20:104-111. [PMID: 36522501 DOI: 10.1038/s41592-022-01700-2] [Citation(s) in RCA: 16] [Impact Index Per Article: 8.0] [Reference Citation Analysis] [Abstract] [MESH Headings] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Received: 11/30/2021] [Accepted: 10/24/2022] [Indexed: 12/23/2022]

Pevzner P, Vingron M, Reidys C, Sun F, Istrail S. Michael Waterman's Contributions to Computational Biology and Bioinformatics. J Comput Biol 2022;29:601-615. [PMID: 35727100 DOI: 10.1089/cmb.2022.29066.pp] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Reference Citation Analysis] [Abstract] [Key Words] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Indexed: 11/13/2022] Open

Root-Bernstein R, Churchill B. Co-Evolution of Opioid and Adrenergic Ligands and Receptors: Shared, Complementary Modules Explain Evolution of Functional Interactions and Suggest Novel Engineering Possibilities. Life (Basel) 2021;11:1217. [PMID: 34833093 PMCID: PMC8623292 DOI: 10.3390/life11111217] [Citation(s) in RCA: 3] [Impact Index Per Article: 0.8] [Reference Citation Analysis] [Abstract] [Key Words] [Track Full Text] [Download PDF] [Figures] [Journal Information] [Subscribe] [Scholar Register] [Received: 10/08/2021] [Revised: 10/29/2021] [Accepted: 11/03/2021] [Indexed: 12/14/2022] Open

Xu Z, Yang Y, Huang B. A teaching approach from the exhaustive search method to the Needleman-Wunsch algorithm. Biochem Mol Biol Educ 2017;45:194-204. [PMID: 27740737 DOI: 10.1002/bmb.21027] [Citation(s) in RCA: 2] [Impact Index Per Article: 0.3] [Reference Citation Analysis] [Abstract] [Key Words] [MESH Headings] [Track Full Text] [Subscribe] [Scholar Register] [Received: 04/04/2016] [Revised: 08/12/2016] [Accepted: 09/06/2016] [Indexed: 06/06/2023]

Hamada M. Fighting against uncertainty: an essential issue in bioinformatics. Brief Bioinform 2013;15:748-67. [PMID: 23803300 DOI: 10.1093/bib/bbt038] [Citation(s) in RCA: 8] [Impact Index Per Article: 0.7] [Reference Citation Analysis] [Abstract] [Key Words] [MESH Headings] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Indexed: 11/14/2022] Open

Zhou H, Skolnick J. Template-based protein structure modeling using TASSER(VMT). Proteins 2011;80:352-61. [PMID: 22105797 DOI: 10.1002/prot.23183] [Citation(s) in RCA: 34] [Impact Index Per Article: 2.4] [Reference Citation Analysis] [Abstract] [Key Words] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Received: 06/28/2011] [Revised: 08/25/2011] [Accepted: 09/04/2011] [Indexed: 12/29/2022]

Abstract

Template-based protein structure modeling is commonly used for protein structure prediction. Based on the observation that multiple template-based methods often perform better than single template-based methods, we further explore the use of a variable number of multiple templates for a given target in the latest variant of TASSER, TASSER(VMT). We first develop an algorithm that improves the target-template alignment for a given template. The improved alignment, called the SP(3) alternative alignment, is generated by a parametric alignment method coupled with short TASSER refinement on models selected using knowledge-based scores. The refined top model is then structurally aligned to the template to produce the SP(3) alternative alignment. Templates identified using SP(3) threading are combined with the SP(3) alternative and HHSEARCH alignments to provide target alignments to each template. These template models are then grouped into sets containing a variable number of template/alignment combinations. For each set, we run short TASSER simulations to build full-length models. Then, the models from all sets of templates are pooled, and the top 20-50 models are selected using the FTCOM ranking method. These models are then subjected to a single longer TASSER refinement run for final prediction. We benchmarked our method by comparison with our previously developed approach, pro-sp(3)-TASSER, on a set with 874 easy and 318 hard targets. The average GDT-TS score improvements for the first model are 3.5 and 4.3% for easy and hard targets, respectively. When tested on the 112 CASP9 targets, our method improves the average GDT-TS scores as compared to pro-sp3-TASSER by 8.2 and 9.3% for the 80 easy and 32 hard targets, respectively. It also shows slightly better results than the top ranked CASP9 Zhang-Server, QUARK and HHpredA methods. The program is available for download at http://cssb.biology.gatech.edu/.

Hower V, Heitsch CE. Parametric analysis of RNA branching configurations. Bull Math Biol 2011;73:754-76. [PMID: 21207176 DOI: 10.1007/s11538-010-9607-3] [Citation(s) in RCA: 4] [Impact Index Per Article: 0.3] [Reference Citation Analysis] [Abstract] [MESH Headings] [Grants] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Received: 04/22/2009] [Accepted: 11/04/2010] [Indexed: 01/30/2023]

Parametric Analysis of Alignment and Phylogenetic Uncertainty. Bull Math Biol 2011;73:795-810. [DOI: 10.1007/s11538-010-9610-8] [Citation(s) in RCA: 1] [Impact Index Per Article: 0.1] [Reference Citation Analysis] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Received: 05/16/2009] [Accepted: 11/04/2010] [Indexed: 10/18/2022]

Yakovlev VV, Roytberg MA. Increasing the accuracy of global alignment of amino acid sequences by constructing a set of alignment candidates. Biophysics (Nagoya-shi) 2010. [DOI: 10.1134/s0006350910060011] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Reference Citation Analysis] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Indexed: 11/22/2022] Open

Didier G. Parametric Maximum Parsimonious Reconstruction on Trees. Bull Math Biol 2010;73:1477-502. [DOI: 10.1007/s11538-010-9574-8] [Citation(s) in RCA: 5] [Impact Index Per Article: 0.3] [Reference Citation Analysis] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Received: 12/04/2009] [Accepted: 07/05/2010] [Indexed: 11/30/2022]

Frith MC, Hamada M, Horton P. Parameters for accurate genome alignment. BMC Bioinformatics 2010;11:80. [PMID: 20144198 PMCID: PMC2829014 DOI: 10.1186/1471-2105-11-80] [Citation(s) in RCA: 150] [Impact Index Per Article: 10.0] [Reference Citation Analysis] [Abstract] [Track Full Text] [Download PDF] [Figures] [Journal Information] [Subscribe] [Scholar Register] [Received: 10/16/2009] [Accepted: 02/09/2010] [Indexed: 11/25/2022] Open

Edgar RC. Optimizing substitution matrix choice and gap parameters for sequence alignment. BMC Bioinformatics 2009;10:396. [PMID: 19954534 PMCID: PMC2791778 DOI: 10.1186/1471-2105-10-396] [Citation(s) in RCA: 30] [Impact Index Per Article: 1.9] [Reference Citation Analysis] [Abstract] [Track Full Text] [Download PDF] [Figures] [Journal Information] [Subscribe] [Scholar Register] [Received: 08/10/2009] [Accepted: 12/02/2009] [Indexed: 12/04/2022] Open

Zheng W, Friedman AM, Bailey-Kellogg C. Algorithms for joint optimization of stability and diversity in planning combinatorial libraries of chimeric proteins. J Comput Biol 2009;16:1151-68. [PMID: 19645597 DOI: 10.1089/cmb.2009.0090] [Citation(s) in RCA: 12] [Impact Index Per Article: 0.8] [Reference Citation Analysis] [Abstract] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Indexed: 11/12/2022] Open

Abstract

In engineering protein variants by constructing and screening combinatorial libraries of chimeric proteins, two complementary and competing goals are desired: the new proteins must be similar enough to the evolutionarily-selected wild-type proteins to be stably folded, and they must be different enough to display functional variation. We present here the first method, Staversity, to simultaneously optimize stability and diversity in selecting sets of breakpoint locations for site-directed recombination. Our goal is to uncover all "undominated" breakpoint sets, for which no other breakpoint set is better in both factors. Our first algorithm finds the undominated sets serving as the vertices of the lower envelope of the two-dimensional (stability and diversity) convex hull containing all possible breakpoint sets. Our second algorithm identifies additional breakpoint sets in the concavities that are either undominated or dominated only by undiscovered breakpoint sets within a distance bound computed by the algorithm. Both algorithms are efficient, requiring only time polynomial in the numbers of residues and breakpoints, while characterizing a space defined by an exponential number of possible breakpoint sets. We applied Staversity to identify 2-10 breakpoint plans for different sets of parent proteins taken from the purE family, as well as for parent proteins TEM-1 and PSE-4 from the beta-lactamase family. The average normalized distance between our plans and the lower bound for optimal plans is around 2%. Our plans dominate most (60-90% on average for each parent set) of the plans found by other possible approaches, random sampling or explicit optimization for stability with implicit optimization for diversity. The identified breakpoint sets provide a compact representation of good plans, enabling a protein engineer to understand and account for the trade-offs between two key considerations in combinatorial chimeragenesis.
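
The abstract above defines an "undominated" breakpoint set as one for which no other set is better in both factors. A minimal sketch of that filter follows. It is not the Staversity algorithm: it is a brute-force check over an explicit list, and it assumes each plan is scored by two hypothetical cost values where lower is better. The function name `undominated` and the sample scores are illustrative only.

```python
def undominated(points):
    """Keep the points that no other point beats in both coordinates.

    Assumption: each point is a (cost_a, cost_b) pair and lower is better
    in both coordinates. "Beats" follows the abstract's wording: strictly
    better in both factors.
    """
    return [
        p for p in points
        if not any(q[0] < p[0] and q[1] < p[1] for q in points)
    ]


# Hypothetical scores for five candidate plans (not data from the paper).
plans = [(1.0, 5.0), (2.0, 3.0), (3.0, 4.0), (4.0, 1.0), (5.0, 2.0)]
front = undominated(plans)
print(front)
```

This check compares every pair and so takes quadratic time in the number of listed plans. The paper's algorithms instead characterize an exponentially large space of breakpoint sets in polynomial time, which this sketch does not attempt.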

Zhou H, Skolnick J. Protein structure prediction by pro-Sp3-TASSER. Biophys J 2009;96:2119-27. [PMID: 19289038 DOI: 10.1016/j.bpj.2008.12.3898] [Citation(s) in RCA: 54] [Impact Index Per Article: 3.4] [Reference Citation Analysis] [Abstract] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Received: 09/12/2008] [Revised: 11/12/2008] [Accepted: 12/03/2008] [Indexed: 12/29/2022] Open

Abstract

An automated protein structure prediction algorithm, pro-sp3-Threading/ASSEmbly/Refinement (TASSER), is described and benchmarked. Structural templates are identified using five different scoring functions derived from the previously developed threading methods PROSPECTOR_3 and SP(3). Top templates identified by each scoring function are combined to derive contact and distance restraints for subsequent model refinement by short TASSER simulations. For Medium/Hard targets (those with moderate to poor quality templates and/or alignments), alternative template alignments are also generated by parametric alignment and the top models selected by TASSER-QA are included in the contact and distance restraint derivation. Then, multiple short TASSER simulations are used to generate an ensemble of full-length models. Subsequently, the top models are selected from the ensemble by TASSER-QA and used to derive TASSER contacts and distance restraints for another round of full TASSER refinement. The final models are selected from both rounds of TASSER simulations by TASSER-QA. We compare pro-sp3-TASSER with our previously developed MetaTASSER method (enhanced with chunk-TASSER for Medium/Hard targets) on a representative test data set of 723 proteins <250 residues in length. For the 348 proteins classified as easy targets (those templates with good alignments and global structure similarity to the target), the cumulative TM-score of the best of top five models by pro-sp3-TASSER shows a 2.1% improvement over MetaTASSER. For the 155/220 medium/hard targets, the improvements in TM-score are 2.8% and 2.2%, respectively. All improvements are statistically significant. More importantly, the number of foldable targets (those having models whose TM-score to native >0.4 in the top five clusters) increases from 472 to 497 for all targets, and the relative increases for medium and hard targets are 10% and 15%, respectively. A server that implements the above algorithm is available at http://cssb.biology.gatech.edu/skolnick/webservice/pro-sp3-TASSER/. The source code is also available upon request.

Kim E, Kececioglu J. Learning scoring schemes for sequence alignment from partial examples. IEEE/ACM Trans Comput Biol Bioinform 2008;5:546-556. [PMID: 18989042 DOI: 10.1109/tcbb.2008.57] [Citation(s) in RCA: 6] [Impact Index Per Article: 0.4] [Reference Citation Analysis] [Abstract] [MESH Headings] [Track Full Text] [Subscribe] [Scholar Register] [Indexed: 05/27/2023]

Mount DW. Using gaps and gap penalties to optimize pairwise sequence alignments. Cold Spring Harb Protoc 2008;2008:pdb.top40. [PMID: 21356856 DOI: 10.1101/pdb.top40] [Citation(s) in RCA: 12] [Impact Index Per Article: 0.7] [Reference Citation Analysis] [Abstract] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Indexed: 11/25/2022]

Choi S, Jeon J, Yang JS, Kim S. Common occurrence of internal repeat symmetry in membrane proteins. Proteins 2008;71:68-80. [PMID: 17932930 DOI: 10.1002/prot.21656] [Citation(s) in RCA: 24] [Impact Index Per Article: 1.4] [Reference Citation Analysis] [Abstract] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Indexed: 11/11/2022]

Huang X. Sequence alignment with an appropriate substitution matrix. J Comput Biol 2008;15:129-38. [PMID: 18312146 DOI: 10.1089/cmb.2007.0155] [Citation(s) in RCA: 4] [Impact Index Per Article: 0.2] [Reference Citation Analysis] [Abstract] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Indexed: 11/12/2022] Open

Do CB, Katoh K. Protein multiple sequence alignment. Methods Mol Biol 2008;484:379-413. [PMID: 18592193 DOI: 10.1007/978-1-59745-398-1_25] [Citation(s) in RCA: 30] [Impact Index Per Article: 1.8] [Reference Citation Analysis] [Abstract] [MESH Headings] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Indexed: 05/26/2023]

Frith MC, Valen E, Krogh A, Hayashizaki Y, Carninci P, Sandelin A. A code for transcription initiation in mammalian genomes. Genome Res 2008;18:1-12. [PMID: 18032727 PMCID: PMC2134772 DOI: 10.1101/gr.6831208] [Citation(s) in RCA: 189] [Impact Index Per Article: 11.1] [Reference Citation Analysis] [Abstract] [MESH Headings] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Received: 01/21/2007] [Accepted: 10/14/2007] [Indexed: 11/24/2022]

Lunter G, Rocco A, Mimouni N, Heger A, Caldeira A, Hein J. Uncertainty in homology inferences: assessing and improving genomic sequence alignment. Genome Res 2007;18:298-309. [PMID: 18073381 DOI: 10.1101/gr.6725608] [Citation(s) in RCA: 90] [Impact Index Per Article: 5.0] [Reference Citation Analysis] [Abstract] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Indexed: 11/24/2022]

Abstract

Sequence alignment underpins all of comparative genomics, yet it remains an incompletely solved problem. In particular, the statistical uncertainty within inferred alignments is often disregarded, while parametric or phylogenetic inferences are considered meaningless without confidence estimates. Here, we report on a theoretical and simulation study of pairwise alignments of genomic DNA at human-mouse divergence. We find that >15% of aligned bases are incorrect in existing whole-genome alignments, and we identify three types of alignment error, each leading to systematic biases in all algorithms considered. Careful modeling of the evolutionary process improves alignment quality; however, these improvements are modest compared with the remaining alignment errors, even with exact knowledge of the evolutionary model, emphasizing the need for statistical approaches to account for uncertainty. We develop a new algorithm, Marginalized Posterior Decoding (MPD), which explicitly accounts for uncertainties, is less biased and more accurate than other algorithms we consider, and reduces the proportion of misaligned bases by a third compared with the best existing algorithm. To our knowledge, this is the first nonheuristic algorithm for DNA sequence alignment to show robust improvements over the classic Needleman-Wunsch algorithm. Despite this, considerable uncertainty remains even in the improved alignments. We conclude that a probabilistic treatment is essential, both to improve alignment quality and to quantify the remaining uncertainty. This is becoming increasingly relevant with the growing appreciation of the importance of noncoding DNA, whose study relies heavily on alignments. Alignment errors are inevitable, and should be considered when drawing conclusions from alignments. Software and alignments to assist researchers in doing this are provided at http://genserv.anat.ox.ac.uk/grape/.

Chivian D, Baker D. Homology modeling using parametric alignment ensemble generation with consensus and energy-based model selection. Nucleic Acids Res 2006;34:e112. [PMID: 16971460 PMCID: PMC1635247 DOI: 10.1093/nar/gkl480] [Citation(s) in RCA: 89] [Impact Index Per Article: 4.7] [Reference Citation Analysis] [Abstract] [MESH Headings] [Grants] [Track Full Text] [Download PDF] [Figures] [Journal Information] [Subscribe] [Scholar Register] [Indexed: 11/12/2022] Open

Dewey CN, Huggins PM, Woods K, Sturmfels B, Pachter L. Parametric alignment of Drosophila genomes. PLoS Comput Biol 2006;2:e73. [PMID: 16789815 PMCID: PMC1480539 DOI: 10.1371/journal.pcbi.0020073] [Citation(s) in RCA: 28] [Impact Index Per Article: 1.5] [Reference Citation Analysis] [Abstract] [MESH Headings] [Grants] [Track Full Text] [Download PDF] [Figures] [Journal Information] [Subscribe] [Scholar Register] [Received: 12/07/2005] [Accepted: 05/10/2006] [Indexed: 12/29/2022] Open

Kececioglu J, Kim E. Simple and Fast Inverse Alignment. Lecture Notes in Computer Science 2006. [DOI: 10.1007/11732990_37] [Citation(s) in RCA: 16] [Impact Index Per Article: 0.8] [Reference Citation Analysis] [Track Full Text] [Subscribe] [Scholar Register] [Indexed: 12/23/2022]

Redelings BD, Suchard MA. Joint Bayesian estimation of alignment and phylogeny. Syst Biol 2005;54:401-18. [PMID: 16012107 DOI: 10.1080/10635150590947041] [Citation(s) in RCA: 165] [Impact Index Per Article: 8.3] [Reference Citation Analysis] [Abstract] [MESH Headings] [Grants] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Indexed: 10/25/2022] Open

Pachter L, Sturmfels B. Parametric inference for biological sequence analysis. Proc Natl Acad Sci U S A 2004;101:16138-43. [PMID: 15534223 PMCID: PMC528961 DOI: 10.1073/pnas.0406011101] [Citation(s) in RCA: 36] [Impact Index Per Article: 1.7] [Reference Citation Analysis] [Abstract] [MESH Headings] [Grants] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Indexed: 01/08/2023] Open

Pachter L, Sturmfels B. Tropical geometry of statistical models. Proc Natl Acad Sci U S A 2004;101:16132-7. [PMID: 15534224 PMCID: PMC528960 DOI: 10.1073/pnas.0406010101] [Citation(s) in RCA: 53] [Impact Index Per Article: 2.5] [Reference Citation Analysis] [Abstract] [MESH Headings] [Grants] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Indexed: 11/18/2022] Open

Sun F, Fernández-Baca D, Yu W. Inverse parametric sequence alignment. J Algorithms 2004. [DOI: 10.1016/j.jalgor.2004.04.008] [Citation(s) in RCA: 7] [Impact Index Per Article: 0.3] [Reference Citation Analysis] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Indexed: 11/30/2022]

Zhang K, Sun F, Waterman MS, Chen T. Haplotype block partition with limited resources and applications to human chromosome 21 haplotype data. Am J Hum Genet 2003;73:63-73. [PMID: 12802783 PMCID: PMC1180591 DOI: 10.1086/376437] [Citation(s) in RCA: 37] [Impact Index Per Article: 1.7] [Reference Citation Analysis] [Abstract] [MESH Headings] [Grants] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Received: 02/26/2003] [Accepted: 04/10/2003] [Indexed: 11/03/2022] Open

Hoofer SR, Van Den Bussche RA. Molecular Phylogenetics of the Chiropteran Family Vespertilionidae. Acta Chiropterologica 2003. [DOI: 10.3161/001.005.s101] [Citation(s) in RCA: 128] [Impact Index Per Article: 5.8] [Reference Citation Analysis] [Track Full Text] [Subscribe] [Scholar Register] [Indexed: 11/16/2022]

Jaroszewski L, Li W, Godzik A. In search for more accurate alignments in the twilight zone. Protein Sci 2002;11:1702-13. [PMID: 12070323 PMCID: PMC2373660 DOI: 10.1110/ps.4820102] [Citation(s) in RCA: 64] [Impact Index Per Article: 2.8] [Reference Citation Analysis] [Abstract] [Key Words] [MESH Headings] [Grants] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Indexed: 10/14/2022]

Inverse Parametric Sequence Alignment. Lecture Notes in Computer Science 2002. [DOI: 10.1007/3-540-45655-4_12] [Citation(s) in RCA: 4] [Impact Index Per Article: 0.2] [Reference Citation Analysis] [Track Full Text] [Subscribe] [Scholar Register] [Indexed: 12/23/2022]

Fang W, Roberts FS, Ma Z. A measure of discrepancy of multiple sequences. Inf Sci (N Y) 2001. [DOI: 10.1016/s0020-0255(01)00108-6] [Citation(s) in RCA: 12] [Impact Index Per Article: 0.5] [Reference Citation Analysis] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Indexed: 10/17/2022]

Drasdo D, Hwa T, Lässig M. Scaling laws and similarity detection in sequence alignment with gaps. J Comput Biol 2000;7:115-41. [PMID: 10890391 DOI: 10.1089/10665270050081414] [Citation(s) in RCA: 13] [Impact Index Per Article: 0.5] [Reference Citation Analysis] [Abstract] [MESH Headings] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Indexed: 11/12/2022] Open

Giribet G, Wheeler WC. On gaps. Mol Phylogenet Evol 1999;13:132-43. [PMID: 10508546 DOI: 10.1006/mpev.1999.0643] [Citation(s) in RCA: 225] [Impact Index Per Article: 8.7] [Reference Citation Analysis] [Abstract] [MESH Headings] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Indexed: 11/22/2022]

Altschul SF. Generalized affine gap costs for protein sequence alignment. Proteins 1998. [DOI: 10.1002/(sici)1097-0134(19980701)32:1<88::aid-prot10>3.0.co;2-j] [Citation(s) in RCA: 94] [Impact Index Per Article: 3.5] [Reference Citation Analysis] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Indexed: 11/10/2022]

Wheeler WC, Hayashi CY. The Phylogeny of the Extant Chelicerate Orders. Cladistics 1998;14:173-192. [DOI: 10.1111/j.1096-0031.1998.tb00331.x] [Citation(s) in RCA: 217] [Impact Index Per Article: 8.0] [Reference Citation Analysis] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Indexed: 12/01/2022] Open

Apostolico A, Giancarlo R. Sequence alignment in molecular biology. J Comput Biol 1998;5:173-96. [PMID: 9672827 DOI: 10.1089/cmb.1998.5.173] [Citation(s) in RCA: 45] [Impact Index Per Article: 1.7] [Reference Citation Analysis] [Abstract] [MESH Headings] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Indexed: 11/12/2022] Open

Shibuya T, Imai H. New flexible approaches for multiple sequence alignment. J Comput Biol 1997;4:385-413. [PMID: 9278067 DOI: 10.1089/cmb.1997.4.385] [Citation(s) in RCA: 5] [Impact Index Per Article: 0.2] [Reference Citation Analysis] [Abstract] [MESH Headings] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Indexed: 02/05/2023] Open

Komatsoulis GA, Waterman MS. A new computational method for detection of chimeric 16S rRNA artifacts generated by PCR amplification from mixed bacterial populations. Appl Environ Microbiol 1997;63:2338-46. [PMID: 9172353 PMCID: PMC168526 DOI: 10.1128/aem.63.6.2338-2346.1997] [Citation(s) in RCA: 36] [Impact Index Per Article: 1.3] [Reference Citation Analysis] [Abstract] [MESH Headings] [Grants] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Indexed: 02/04/2023] Open

Koch I, Lengauer T, Wanke E. An algorithm for finding maximal common subtopologies in a set of protein structures. J Comput Biol 1996;3:289-306. [PMID: 8811488 DOI: 10.1089/cmb.1996.3.289] [Citation(s) in RCA: 85] [Impact Index Per Article: 2.9] [Reference Citation Analysis] [Abstract] [MESH Headings] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Indexed: 02/02/2023] Open

Agarwal P, States DJ. A Bayesian evolutionary distance for parametrically aligned sequences. J Comput Biol 1996;3:1-17. [PMID: 8697232 DOI: 10.1089/cmb.1996.3.1] [Citation(s) in RCA: 8] [Impact Index Per Article: 0.3] [Reference Citation Analysis] [Abstract] [MESH Headings] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Indexed: 02/01/2023] Open

Abstract

There is an inherent relationship between the process of pairwise sequence alignment and the estimation of evolutionary distance. This relationship is explored and made explicit. Assuming an evolutionary model and given a specific pattern of observed base mismatches, the relative probabilities of evolution at each evolutionary distance are computed using a Bayesian framework. The mean or the median of this probability distribution provides a robust estimate of the central value. The evolutionary distance has traditionally been computed as zero for an observed homology of 20 bases with no mismatches; we prove that it is highly probable that the distance is greater than 0.01. The mean of the distribution is 0.047, which is a better estimate of the evolutionary distance. Bayesian estimates of the evolutionary distance incorporate arbitrary prior information about variable mutation rates both over time and along sequence position, thus requiring only a weak form of the molecular-clock hypothesis. The endpoints of the similarity between genomic DNA sequences are often ambiguous. The probability of evolution at each evolutionary distance can be estimated over the entire set of alignments by choosing the best alignment at each distance and the corresponding probability of duplication at that evolutionary distance. A central value of this distribution provides a robust evolutionary distance estimate. We provide an efficient algorithm for computing the parametric alignment, considering evolutionary distance as the only parameter. These techniques and estimates are used to infer the duplication history of the genomic sequence in C. elegans and in S. cerevisiae. Our results indicate that repeats discovered using a single scoring matrix show a considerable bias in subsequent evolutionary distance estimates.

Waterman MS. Parametric and ensemble sequence alignment algorithms. Bull Math Biol 1994;56:743-67. [PMID: 8054893 DOI: 10.1007/bf02460719] [Citation(s) in RCA: 16] [Impact Index Per Article: 0.5] [Reference Citation Analysis] [Abstract] [MESH Headings] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Indexed: 01/28/2023]

DeSalle R, Wray C, Absher R. Computational problems in molecular systematics. EXS 1994;69:353-70. [PMID: 7994115 DOI: 10.1007/978-3-0348-7527-1_21] [Citation(s) in RCA: 5] [Impact Index Per Article: 0.2] [Reference Citation Analysis] [Abstract] [MESH Headings] [Track Full Text] [Subscribe] [Scholar Register] [Indexed: 01/28/2023]

Wheeler WC. Sources of ambiguity in nucleic acid sequence alignment. EXS 1994;69:323-52. [PMID: 7994113 DOI: 10.1007/978-3-0348-7527-1_20] [Citation(s) in RCA: 15] [Impact Index Per Article: 0.5] [Reference Citation Analysis] [Abstract] [MESH Headings] [Track Full Text] [Subscribe] [Scholar Register] [Indexed: 01/28/2023]

Naor D, Brutlag DL. On near-optimal alignments of biological sequences. J Comput Biol 1994;1:349-66. [PMID: 8790476 DOI: 10.1089/cmb.1994.1.349] [Citation(s) in RCA: 22] [Impact Index Per Article: 0.7] [Reference Citation Analysis] [Abstract] [MESH Headings] [Grants] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Indexed: 02/02/2023] Open

Weir BS. Analysis of DNA sequences. Stat Methods Med Res 1993;2:225-39. [PMID: 8261259 DOI: 10.1177/096228029300200303] [Citation(s) in RCA: 1] [Impact Index Per Article: 0.0] [Reference Citation Analysis] [Abstract] [MESH Headings] [Grants] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Indexed: 01/29/2023]